\documentclass[12pt, a4paper]{article}
\usepackage{palatino, enumitem, parskip, xspace}
\usepackage[top=1.5cm, bottom=1.5cm, left=2cm, right=2cm]{geometry}
\usepackage[dvipsnames]{xcolor}
\newcommand{\eg}{\emph{e.g.,}\xspace}
\newcommand{\todo}[1]{\textcolor{Blue}{\textbf{TODO(#1)}}}
\newcommand{\etal}{\emph{et~al.}\xspace}
\begin{document}
\begin{center}
\Large My BSc Defence Script
\end{center}
\begin{enumerate}[label=\textbf{Slide \arabic*.}]
\item \textbf{Introduction}\\
Good morning everyone, my name is Mateusz and today I will present to you my project \emph{Sunfish: Enabling Predictive Analytics For Datacenters Through Digital Twinning}.
At a high level, my project aims to ease datacenter management by paving the way towards predicting unexpected events.
\item \textbf{Societal Impact}\\
As you know and as you will likely see in the upcoming presentations today, datacenters are important.
Still, I would like to briefly touch on this myself.
A single GPU is already very complex.
Within a Google Datacenter, there are hundreds of server racks, with tens of such GPUs.
This raises the question: how are we going to manage a datacenter this large, with so many \emph{layers of complexity}?
We cannot let these systems go down or experience big failures, because \eg in the Netherlands over 3 million professionals depend on the cloud daily.
\todo{Read the slide box.}
As such, we must do something to manage datacenters well.
\item \textbf{Problem Statement}\\
Digital twinning pairs a complex object (such as a datacenter) with its virtual representation via a two-way connection.
\todo{Give example about the airplane from aviation.}
\emph{It is a method to manage complex systems.}
However, in digital twinning, and specifically in datacenter digital twinning, many elements are still in flux.
There are many ways to create the virtual models, and there seems to be no fully functioning datacenter digital twin (DCDT) out there (\emph{that meets the official NASM definition}).
DCDTs lack mandatory features, one of which is predictive analytics.
Predictive analytics is a type of ODA that draws insights about the future from current data, \eg telling when a host failure might happen before it does (\emph{and yet it is NOT present in existing DCDTs}).
\item \textbf{Research Questions}\\
We wish to enable the development of predictive analytics components for DCDTs by designing a predictive DCDT.
We ask the following research questions. \todo{Read from the slide boxes and, for each question, (1) describe why it's important, (2) say why it's challenging, (3) say what makes it scientific.}
\item \textbf{Literature Survey}\\
This is the most exciting part of the thesis for me.
To answer \textbf{RQ1} we conduct a comprehensive literature survey.
We did not follow the systematic literature review method of Kitchenham \etal; instead, we relied heavily on snowballing and manual searches in Google Scholar and DBLP.
Google Scholar referred us to the ACM Digital Library, IEEE Xplore, ScienceDirect and others.
We used structured queries such as ``datacenters'' \texttt{AND} ``digital twinning'' or plainly ``datacenter digital twins''.
To select relevant work, we read the abstract, introduction and conclusion of each article and then decided whether to include it.
The results are in \textbf{Table 1.1}.
\todo{Read the slide box.}
\item \textbf{System Model}\\
We also created a holistic system model of DCDTs.
We decided to make a system model instead of a taxonomy, because we discuss the design of a set of systems, and there are not enough of them to warrant a full \emph{Linnaeus}-style tree and taxonomy.
The system model is in \textbf{Fig. 1.3}.
What I found most interesting while reading the literature was the lack of a connection between the two twins.
As such, what makes this design special is the \emph{Digital Thread}. \todo{Read the slide box.}
\item \textbf{Reference Architecture and Prototype}\\
From the literature survey, we gathered the potential use-cases of our system, which we omit for brevity.
From the use-cases we developed a set of functional and non-functional requirements, based on which we created the reference architecture.
The most innovative part of the data pipeline is that it combines in-band and out-of-band data paths, by including both a short-term cache and a long-term database.
The most interesting part, which I devised myself, is the predictive analytics component.
\todo{Go through the elements in the plot.}
Given this reference architecture, we created a prototype, called \emph{Sunfish}.
We evaluate this prototype in the following slides.
\item \textbf{Novel Evaluation Method}\\
Now we go to the most difficult part.
To evaluate the prototype, we propose a novel approach.
Many researchers do not have a real facility to experiment with.
We propose to use a second simulator to act as the real datacenter.
\todo{Say that, to avoid cramming content into the presentation, we omit the technical setup and include it in the extra slides.}
\item \textbf{Experiment 1: Red and Yellow Alarms}\\
For Experiment 1 we adopt the idea of Milojicic \etal of different ways in which a DCDT can notify the datacenter.
Imagine a scenario: a datacenter will soon run a workload.
We want to detect and differentiate between failures that are big and unexpected and failures we anticipated would occur.
To achieve this, the DCDT runs the workload in the simulator.
We cannot know in advance which failures will occur, so we use a statistical distribution to approximate what might happen in practice.
As a result, we get a picture of the kinds of problems we might expect.
We then use the real-time feedback loop to notify the datacenter operators when what is happening in reality deviates from the simulation.
If the number of failures reaches 80\% of the predicted threshold, we send a yellow alarm.
If it reaches 90\%, we send a red alarm.
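As a backup note, the alarm rule can be written compactly. This is only a sketch under assumptions: the symbols $F_{\mathrm{pred}}$ (the number of failures predicted by the simulation) and $F_{\mathrm{obs}}$ (the number of failures observed so far in the real datacenter) are introduced here for illustration, and the thresholds are read as ``the observed count has reached 80\% or 90\% of the predicted count''.
\[
\mathrm{alarm}(F_{\mathrm{obs}}) =
\left\{
\begin{array}{ll}
\mbox{red} & \mbox{if } F_{\mathrm{obs}} \geq 0.9\, F_{\mathrm{pred}},\\
\mbox{yellow} & \mbox{if } 0.8\, F_{\mathrm{pred}} \leq F_{\mathrm{obs}} < 0.9\, F_{\mathrm{pred}},\\
\mbox{none} & \mbox{otherwise.}
\end{array}
\right.
\]
For example, under this reading and with $F_{\mathrm{pred}} = 50$, a yellow alarm would be sent from the 40th failure and a red alarm from the 45th.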
\item \textbf{Experiment 2: Conceptual Experiment}\\
\item \textbf{Key Takeaways}\\
\todo{Read from the slide.}
\end{enumerate}
\end{document}