| author | mjkwiatkowski <mati.rewa@gmail.com> | 2026-06-23 13:29:28 +0200 |
|---|---|---|
| committer | mjkwiatkowski <mati.rewa@gmail.com> | 2026-06-23 13:29:28 +0200 |
| commit | aff0ce83458daebf0bdf783ca5144225a72b328a (patch) | |
| tree | 36e6e14bcd0b8cffce3db09dafe65f4f23130592 /main.tex | |
| parent | f062b0967b126c80af7968aebd7db3b0e5779f7a (diff) | |
feat: changed experiment I to be something completely different
Diffstat (limited to 'main.tex')
| -rw-r--r-- | main.tex | 62 |
1 files changed, 41 insertions, 21 deletions
@@ -73,8 +73,10 @@
 \begin{tcolorbox}[title=A holistic DCDT system model]
 We propose a generic model of datacenter digital twinning that can be mapped to each system from \textbf{Table 1.1}.
 To answer \textbf{RQ2}, we design a ref. arch. for \emph{Operations Model}.
+ We introduce the \emph{Digital Thread}: a bridge between software and reality.
 \end{tcolorbox}
 \begin{center}
+ \vspace{-0.1cm}
 \includegraphics[width=0.8\textwidth]{images/system_model2.pdf}
 \end{center}
 % The reason why the cooling system is in the graph is because of the fact that 40\% of total energy consumed in DCs comes from cooling~\cite{DBLP:conf/noms/ZhangZLZWC22}.
@@ -103,48 +105,66 @@
 \end{center}
 \vspace{-0.2cm}
 \tiny
- \textbf{Figure 1.4:} The predictive datacenter digital twin architecture. \end{minipage}
+ \textbf{Figure 1.4:} The predictive datacenter digital twin reference architecture. \end{minipage}
 % We decided to use discrete-event simulation, as opposed to computational fluid dynamics because of the high overheads of development time needed for CFD.
 % CFD simply takes too long to run, making it unfeasible for real-time analytics and simulation.
 % Citing ExaDigit: [CFD] they are also more computationally expensive, generally making real-time operation unfeasible.
 % Consider adding this minipage directly to the ``draw.io'' diagram
 \end{frame}
-
+% You should skip \hfill completely or in favour of \hspace very minimally.
 \begin{frame}\frametitle{\textbf{RQ3}: Experimental Setup}
+ % The software stack of \emph{Sunfish} includes state-of-the-art software.
+ %The time-series data flows first to the \texttt{Grafana} dashboard, \texttt{PostgreSQL} database and \texttt{Redis} cache, as advised in~\cite{DBLP:conf/sc/TaheriBPRHDEWPM24}.
+
 \begin{minipage}[b]{0.45\linewidth}
- \begin{center}
- \includegraphics[width=1.2\linewidth]{images/predictive_analyticsv2.pdf}
- \end{center}
- \vspace{-0.3cm}
- \tiny
- \textbf{Figure 1.5:} Evaluating DCDTs is difficult. To answer \textbf{RQ3} we provide a novel way to evaluate datacenter digital twins through discrete-event simulation.
+ \begin{tcolorbox}[title=Setup Recipe]
+ \scriptsize
+ \textbf{Step 1.} Ensure Redis and PostgreSQL servers are up and alive.\newline
+
+ \textbf{Step 2.} Run a Confluent Kafka setup: Kafka Connect, Schema Registry and a Kafka server.\newline
+
+ \textbf{Step 3.} Start the Python HTTP Server, and the Python Redis Monitor.\newline
+
+ \textbf{Step 4.} Run the (modified) OpenDC (physical twin) with example experiment.\newline
+
+ \textbf{Step 5.} \emph{Sunfish} will automatically start a second OpenDC instance, and start the data analysis.
+ \end{tcolorbox}
+ \vspace{0.5cm}
 \end{minipage}
- \hfill
+ \hspace{0.35cm}
 \begin{minipage}[b]{0.45\linewidth}
+ \vspace{-0.2cm}
 \begin{center}
- \includegraphics[width=0.7\linewidth]{images/scrs.jpg}
+ \includegraphics[width=1.2\linewidth]{images/predictive_analyticsv2.pdf}
 \end{center}
- \vspace{-0.2cm}
 \tiny
- \textbf{Figure 1.6:} The software stack used to implement \emph{Sunfish}.
- The time-series data flows initially to the \texttt{Grafana} dashboard, \texttt{PostgreSQL} database and \texttt{Redis} cache, as suggested in~\cite{DBLP:conf/sc/TaheriBPRHDEWPM24}.
+ \vspace{-0.4cm}
+ \textbf{Figure 1.5:} We can't just go and test digital twins on large systems as large systems often aren't at hand.
+ Answering \textbf{RQ3} we provide a novel way to evaluate datacenter digital twins through discrete-event simulation instead.
 \end{minipage}
 \end{frame}
 \begin{frame}\frametitle{\textbf{RQ3}: Experimental Results I}
- \begin{tcolorbox}[title=Main Finding I]
- On average, \emph{Sunfish} achieves 12.17\% less failures per task than baseline (OpenDC).
- Insights from predictive digital twinning yield noticeable performance difference.
+ % You have some model, and this can be based on multiple traces.
+ %Get insight from CINECA --> you get a probability of certain hosts failing.
+ % Anomaly detection --> CINECA, how good their detection is?
+ %If you incorporate that? If you can make the case that because of our new digital twin we can incorporate such models, anomaly/failure detection, from CINECA.
+ %If we had that in, we can reach these kinds of gains.
+ % @Mateusz there is really not a possibility to incorporate CINECA's models, so to address Dante's feedback, I created this experiment.
+
+ \begin{tcolorbox}[title=Failure Detection: Main Finding I]
+ On average, \emph{Sunfish} can detect 14.5\% of unexpected failures in the physical twin.
+ We show, that digital twinning \emph{can} be used for failure detection.
+ \end{tcolorbox}
 \begin{minipage}[b]{0.45\linewidth}
 \begin{center}
- \includegraphics[width=1.1\textwidth]{images/18_Jun_2026_201008.pdf}
+ \includegraphics[width=1.1\textwidth]{images/23_Jun_2026_102028.pdf}
 \end{center}
 \vspace{-0.3cm}
 \tiny
- \textbf{Figure 1.5:} Experiment 1 -- on the \emph{x}-axis are different community failure traces.
- On the \emph{y}-axis is the mean number of times a task has failed, during the entire workload.
- Vertical bars is standard deviation, measured over 5 repetitions.
+ \textbf{Figure 1.5:} Experiment 1 Setup: The Digital Twin estimates the failures based on the Normal Distribution \emph{N\textasciitilde($\mu$,$\sigma$)} with $\mu = 1.5$ and $\sigma = 0.5$.
+ ``Real'' OpenDC failures come from a WhatsApp user reports.
 \end{minipage}
 % Explain what the axis are in the figure caption.
 % Talk about the experimental setup in the figure.
@@ -153,7 +173,7 @@
 \begin{frame}\frametitle{\textbf{RQ3}: Experimental Results II}
- \begin{tcolorbox}[title=Main Finding II]
+ \begin{tcolorbox}[title=Failure Prediction: Main Finding II]
 Here explain what did you find.
 \end{tcolorbox}