\chapter{Background}\label{s:background}
Predictive modelling uses statistics to predict outcomes.
When deployed commercially, for example in datacenters, predictive modelling is often referred to as predictive analytics~\cite{Wikipedia:PredictiveModelling}.
Almost any statistical model can be used for prediction, but nowadays predictive analytics is often treated as synonymous with machine learning.
A popular example of such an analysis is linear regression.
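As an illustrative sketch, with notation introduced here for this example rather than taken from a cited source, a linear regression model predicts an outcome $\hat{y}$ from a feature vector $\mathbf{x}$ as
\[
  \hat{y} = \mathbf{w}^{\top}\mathbf{x} + b ,
\]
where the weights $\mathbf{w}$ and the intercept $b$ are typically chosen to minimise the squared error over $n$ historical observations $(\mathbf{x}_i, y_i)$:
\[
  \min_{\mathbf{w},\, b} \; \sum_{i=1}^{n} \left( y_i - \mathbf{w}^{\top}\mathbf{x}_i - b \right)^{2} .
\]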
A major limitation of predictive analytics is that history cannot always predict the future.
Using historical data to predict outcomes works only under the assumption that the system exhibits certain long-lasting patterns.
Additionally, no matter how extensive the training data is, there is always the possibility of new variables that have not been considered or even defined, yet are critical to the outcome of the prediction~\cite{Wikipedia:PredictiveModelling}.
%Here you have to cite Deisenroth, 2024, chapter 8.1.4.
An inference function is a machine learning model which uses probabilistic parameter estimation~\cite{}.
A prime example of using probability to find a good machine learning model is Bayesian inference.
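As a brief sketch of the idea, with notation assumed here for illustration, Bayesian inference treats the model parameters $\theta$ as random variables and updates a prior belief $p(\theta)$ with observed data $\mathcal{D}$ through Bayes' theorem:
\[
  p(\theta \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid \theta)\, p(\theta)}{p(\mathcal{D})} ,
\]
where $p(\mathcal{D} \mid \theta)$ is the likelihood, $p(\mathcal{D})$ is the marginal likelihood, and $p(\theta \mid \mathcal{D})$ is the resulting posterior over the parameters.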
% Stanford Encyclopedia of Philosophy, Douven 2017
The process of inference from data to provide the best explanation is called abduction.
%Include something about data-preprocessing in the pipeline.
%See the article by Fei Tao
One of the key arguments for a datacenter digital twin is that datacenters are already instrumented with hundreds of monitoring sensors and already collect the data those sensors produce.
Monitoring of server racks and VMs, together with CPU profiling, yields large volumes of data.
Data analytics, such as ODA, can turn this data into meaningful insights into datacenter operation.
Moreover, advances in sensor and IoT technology have further increased the amount of information available.
ODA can predict failures, support equipment maintenance, lower bills, and cut costs.
Currently, however, one of the key challenges is connecting the physical and virtual spaces.
A digital twin addresses this challenge.
%[citation needed]
As of 2026, there is no consensus on what a digital twin is.
By extension, there is no consensus on the definition of a datacenter digital twin either.
A generic definition is needed.
%Why predictive analytics? Why predictive behaviour?
%What is below here is true, but nonetheless the argumentation should be slightly changed. And a citation is needed.
However, little effort has been made to integrate analytics that enable consistent and reliable prediction of datacenter behaviour into a holistic digital twin of a datacenter.
Nor has the fidelity of failure modeling inside a datacenter simulation increased.
The failure model is still a linear model.
% Since a datacenter simulator is quite different from a digital twin, we cannot use the same computation methods (not as they are right now, at least) -- we must adapt them.
The prediction models used for the digital twin are the same as those used for the datacenter simulator.
Since a digital twin is not a standalone simulator, both the way we predict failures and the way we model them must change.
The longer the DT runs, the more accurate its predictions become.
All the results are aggregated.
% Why has not anyone done this before?
Moreover, this is possible now only because of recent developments in high-performance computing (HPC).
Between 2003 and 2011, the compute needed to run a digital twin was simply not available.
The concept existed, but the hardware had not yet caught up.
In the last decade, however, multicore computing paradigms and the advent of GPU computing have finally enabled the computation needed to run a digital twin.
This change makes running a digital twin far more relevant today than it was 10 years ago.
It is also why nobody has built a digital twin of a datacenter before.
The current widespread availability of HPC makes this possible.
Because of judgement born out of experience, the evolution of existing datacenters is fairly successful; however, the development of new, modern datacenters is fraught with unexpected problems that result in weight growth, schedule delays, and cost overruns.
Optimal datacenter management is characterized by high service availability and low downtime.
Achieving this in a 21\textsuperscript{st} century datacenter requires revolutionary changes in the way datacenters are operated and maintained.
A concept that creates just such a revolutionary change is the \gls{dcdt}.