From 7631e050a4fdacf123c1069d4776ac703ca96a93 Mon Sep 17 00:00:00 2001
From: mjkwiatkowski
Date: Wed, 3 Jun 2026 09:59:24 +0200
Subject: feat: fixed the first research question, added figure spacing

---
 content/background.tex | 76 +++++++++++++-------------------------------------
 content/intro.tex | 3 +-
 2 files changed, 21 insertions(+), 58 deletions(-)
 (limited to 'content')

diff --git a/content/background.tex b/content/background.tex
index c88367f..e0f0493 100644
--- a/content/background.tex
+++ b/content/background.tex
@@ -1,14 +1,11 @@
 \chapter{Background}\label{s:background}
-\section{Datacenters}\label{ss:datacenters}
-Explain the high risk phenomena that occur in datacenters, which includes failures.
-\subsection{Failures}\label{sss:failures}
-
 \section{Digital Twinning}\label{ss:digital-twinning}
-
-\gls{ed} is an open-source framework for developing digital twins of supercomputers.
+% To fix: remove the \gls commands for ExaDigiT.
+% This is getting silly.
+\gls{ed}~\cite{DBLP:conf/sc/BrewerMKWBHSGGW24} is an open-source framework for developing digital twins of supercomputers.
 It consists of 3 modules:
 \begin{enumerate*}[label=(\arabic*)]
 \item resource allocator and power simulator
@@ -32,7 +29,17 @@ In order to model the cooling system the authors use the Modelica software, and
 The authors provide a intuitive way to interact with the system using a visual dashboard, and an advanced augmented reality model.
 The authors posit that the best way to address the 3V's of data (velocity, volume and variety) is to use augmented reality coupled with dashboards.
+SmartDC~\cite{DBLP:conf/noms/ZhangZLZWC22} is a digital twin solution for optimization of power consumption in datacenters.
+Specifically, Zhang \etal propose that using \gls{ai}-enhanced modeling paired with digital twinning can help make dynamic adjustments to the datacenter cooling subsystem.
+SmartDC is reported to achieve an energy-saving rate of 41\% in a China Telecom datacenter.
+However, the main purpose of SmartDC is not to continuously interact with the facility, but to provide additional training data for a more accurate \gls{ml} solution.
+The digital twin is designed to provide extra datasets for training \gls{ai} models.
+% This digital twin together with ExaDigiT use computational fluid dynamics (CFD).
+% ExaDigiT uses open-source Modelica software and SmartDC uses proprietary 6SigmaDC.
+% At this point it would make sense to create the distinction between _structural_ digital twinning and _behavioural_ digital twinning.
+% Link to 6SigmaDC: https://www5.cadence.com/trial_datacenter_insights_lp.html
+DyTwin~\cite{DBLP:conf/sc/TaheriBPRHDEWPM24} is an adaptive digital twin with visualization and anomaly detection features.
 Predictive modelling uses statistics to predict outcomes.
 When deployed commercially, for example in datacenters, predictive modelling is often referred to as predictive analytics~\cite{Wikipedia:PredictiveModelling}.
@@ -52,60 +59,14 @@ The process of inference from data to provide the best explanation is called abd
 %See the article by Fei Tao
 \subsection{Datacenter simulation}\label{sss:simulation}
+\input{sources/dt_features_comparison.tex}
-\begin{table}[h]
- \centering
- \renewcommand{\arraystretch}{1.4}
- \begin{tabular}{m{0.7\linewidth}cc}
- \toprule
- Feature & \gls{ed} & \\
- \midrule
- Virtual Prototyping & & \\
- Scenario Exploration & & \\
- 3D Facility Modelling & & \\
- Predictive maintenance & & \\
- Predictive energy modelling & & \\
- Reliability and availability modeling & & \\
- Cooling modelling & & \\
- Network modelling & & \\
- Predictive modelling & & \\
- Power consumption modelling & & \\
- Visual analytics dashboard & & \\
- Forensic analysis and diagnostics & & \\
- Failure detection & & \\
- Operational optimization & & \\
- Resource allocation & & \\
- \midrule
- \end{tabular}
- \caption{Comparison of selected features of existing datacenter digital twins.}
-\end{table}
-
-
-\begin{table}[h]
- \centering
- \renewcommand{\arraystretch}{1.4}
- \begin{tabular}{cccm{0.3\linewidth}c}
- \toprule
- Project & Environment & Stakeholders & Highlighted Features & GUI \\
- \midrule
-
- CloudSim & Cloud, Fog, Edge & Research & VC\textsuperscript{$\star$}, N, S, E, WF, FD, EXP, CM, PI & \ding{51}\textsuperscript{$\dagger$} \\
- \midrule
- SimGrid & Grid, P3P, Cloud & Research, Edu. & VC\textsuperscript{$\star$}, N\textsuperscript{$\star$}, S, E\textsuperscript{$\star$}, WF\textsuperscript{$\star$} & \ding{51}\textsuperscript{$\dagger$} \\
- \midrule
- DGSim & Grid & Research & WF, F, EXP & \ding{55} \\
- \midrule
- GroudSim & Grid, Cloud & Research & WF, CM, F & \ding{55} \\
- \midrule
- iCanCloud & Cloud & Research & VC, N\textsuperscript{$\star$}, S, CM & \ding{51}\textsuperscript{$\star$} \\
- \midrule
- \textbf{OpenDC} & Cloud & Research, Edu. & VC\textsuperscript{$\star$}, N, S, E\textsuperscript{$\star$},, CM, FS\textsuperscript{$\star$}, ML, WF, F\textsuperscript{$\star$}, PI, EXP\textsuperscript{$\star$} & \ding{51}\textsuperscript{$\star$} \\
- \bottomrule
- \end{tabular}
- \caption{Comparison of selected datacenter simulators. \textbf{Models:} VC = VMs and containers; N = Network, S = Storage, E = Energy, CM = Cost Models, FS = FaaS, ML = Machine Learning, WF = Workflows, FD = Federation; \textbf{Phenomena:} F = Failures, PI = Performance interface; \textbf{Tools:} EXP = Experiment automation; \textbf{Support:} \ding{51} = Yes, \ding{55} = No; $\dagger$ = extension, not integrated; $\star$ = advanced, carefully calibrated feature. Adapted form Mastenbroek \etal}
-\end{table}
+\section{Datacenters}\label{ss:datacenters}
+Explain the high risk phenomena that occur in datacenters, which includes failures.
+% Ask Jesse if you can have both of such tables in this section
+\input{sources/simulator_comparison.tex}
 One of the key arguments that speak for a datacenter digital twin is that datacenters already connect hundreds of monitoring sensors and data coming from them.
 Monitoring of server racks, VM's, CPU profiling and all that give us lots of data.
@@ -124,6 +85,7 @@ A generic definition is needed.
+\subsection{Failures}\label{sss:failures}
 %Why predictive analytics? Why predictive behaviour?
 %What is below here is true, but nonetheless the argumentation should be slightly changed. And a citation is needed.
diff --git a/content/intro.tex b/content/intro.tex
index dd8efa7..6af41d7 100644
--- a/content/intro.tex
+++ b/content/intro.tex
@@ -87,7 +87,8 @@ We propose that digital twinning can be enhanced by integrating predictive analy
 \noindent We divide the problem of designing a predictive \gls{dcdt} into three research questions:
 \begin{enumerate}[label=\emph{RQ\textsubscript{\arabic*}}, align=left, itemsep=0pt]
- \item \emph{How to define a \gls{dcdt}?}\\
+ % First research question stolen from Capelin by Georgios Andreadis and adapted to my work.
+ \item \emph{How to capture and assess the current state-of-the-art of digital twinning for datacenters?}\\
 There is currently a lack of a unified definition of what constitutes a \gls{dcdt}, and the differences between a \gls{dcdt} and a generic \gls{dt}.
 It is necessary that we establish a common definition of a \gls{dcdt} in the research community.
 We must develop a holistic \gls{dcdt} model that factors in the necessary components of a \gls{dt}.
--
cgit v1.2.3