From 0f835d57b0e989e25aa0b71fe374a0fb1a94e86f Mon Sep 17 00:00:00 2001 From: Dante Niewenhuis Date: Tue, 5 Nov 2024 14:17:08 +0100 Subject: Documentation update (#261) * Updated a lot of documentation, added a new get-started tutorial. * Applied Spotless * Applied Spotless Java * Added bitbrains workload to site --- site/docs/documentation/Input/Experiment.md | 175 +++++++++++++++++ site/docs/documentation/Input/ExperimentSchema.md | 81 ++++++++ site/docs/documentation/Input/FailureModel.md | 218 ++++++++++++++++++++++ site/docs/documentation/Input/FailureModels.md | 202 -------------------- site/docs/documentation/Input/Scenario.md | 125 ------------- site/docs/documentation/Input/ScenarioSchema.md | 81 -------- site/docs/documentation/Input/Topology.md | 61 +++--- site/docs/documentation/Input/Traces.md | 26 --- site/docs/documentation/Input/Workload.md | 24 +++ 9 files changed, 531 insertions(+), 462 deletions(-) create mode 100644 site/docs/documentation/Input/Experiment.md create mode 100644 site/docs/documentation/Input/ExperimentSchema.md create mode 100644 site/docs/documentation/Input/FailureModel.md delete mode 100644 site/docs/documentation/Input/FailureModels.md delete mode 100644 site/docs/documentation/Input/Scenario.md delete mode 100644 site/docs/documentation/Input/ScenarioSchema.md delete mode 100644 site/docs/documentation/Input/Traces.md create mode 100644 site/docs/documentation/Input/Workload.md (limited to 'site/docs/documentation/Input') diff --git a/site/docs/documentation/Input/Experiment.md b/site/docs/documentation/Input/Experiment.md new file mode 100644 index 00000000..c8b96d1f --- /dev/null +++ b/site/docs/documentation/Input/Experiment.md @@ -0,0 +1,175 @@ +When using OpenDC, an experiment defines what should be run, and how. An experiment consists of one or more scenarios, +each defining a different simulation to run. 
Scenarios can differ in many aspects, such as the topology that is used,
the workload that is run, or the policies that are applied, to name a few. An experiment is defined using a JSON file.
On this page, we discuss how to properly define experiments for OpenDC.

:::info Code
All code related to reading and processing experiment files can be found [here](https://github.com/atlarge-research/opendc/tree/master/opendc-experiments/opendc-experiments-base/src/main/kotlin/org/opendc/experiments/base/experiment)

The code used to run a given experiment can be found [here](https://github.com/atlarge-research/opendc/tree/master/opendc-experiments/opendc-experiments-base/src/main/kotlin/org/opendc/experiments/base/runner)
:::

## Schema

The schema for the experiment file is provided in [schema](ExperimentSchema).
In the following section, we describe the different components of the schema.
Some components of an experiment are not single values, but lists. Lists are used to run multiple scenarios from
a single experiment file: OpenDC executes all permutations of the different values.
This means that if every list-based field contains a single value, only one scenario will be run.

| Variable            | Type                                         | Required? | Default  | Description                                                       |
|---------------------|----------------------------------------------|-----------|----------|-------------------------------------------------------------------|
| name                | string                                       | no        | ""       | Name of the scenario, used for identification and referencing.    |
| outputFolder        | string                                       | no        | "output" | Directory where the simulation outputs will be stored.            |
| initialSeed         | integer                                      | no        | 0        | Seed used for random number generation to ensure reproducibility. |
| runs                | integer                                      | no        | 1        | Number of times the scenario should be run.                       |
| exportModels        | List[[ExportModel](#exportmodel)]            | no        | Default  | Specifications for exporting data from the simulation.            |
| computeExportConfig | [ComputeExportConfig](#computeexportconfig)  | no        | Default  | The features that should be exported during the simulation.       |
| maxNumFailures      | List[integer]                                | no        | [10]     | The max number of times a task can fail before being terminated.  |
| topologies          | List[[Topology](#topology)]                  | yes       | N/A      | List of topologies used in the scenario.                          |
| workloads           | List[[Workload](#workload)]                  | yes       | N/A      | List of workloads to be executed within the scenario.             |
| allocationPolicies  | List[[AllocationPolicy](#allocation-policy)] | yes       | N/A      | Allocation policies used for resource management in the scenario. |
| failureModels       | List[[FailureModel](#failuremodel)]          | no        | Default  | List of failure models to simulate various types of failures.     |
| checkpointModels    | List[[CheckpointModel](#checkpointmodel)]    | no        | null     | Checkpoint models that determine when and how snapshots are made. |
| carbonTracePaths    | List[string]                                 | no        | null     | Paths to carbon footprint trace files.                            |


Many of the input fields of the experiment file are complex objects themselves. Next, we describe the required input
type of each of these fields.

### ExportModel

| Variable       | Type  | Required? | Default | Description                                 |
|----------------|-------|-----------|---------|---------------------------------------------|
| exportInterval | Int64 | no        | 300     | The duration between two exports in seconds |


### ComputeExportConfig
The features that should be exported by OpenDC.

| Variable                 | Type         | Required? | Default      | Description                                                           |
|--------------------------|--------------|-----------|--------------|-----------------------------------------------------------------------|
| hostExportColumns        | List[String] | no        | All features | The features that should be exported to the host output file.         |
| taskExportColumns        | List[String] | no        | All features | The features that should be exported to the task output file.         |
| powerSourceExportColumns | List[String] | no        | All features | The features that should be exported to the power source output file. |
| serviceExportColumns     | List[String] | no        | All features | The features that should be exported to the service output file.      |


### Topology
Defines the topology on which the workload will be run.

:::info
For more information about the topology format, go [here](Topology)
:::

| Variable   | Type   | Required? | Default | Description                                  |
|------------|--------|-----------|---------|----------------------------------------------|
| pathToFile | string | yes       | N/A     | Path to the JSON file defining the topology. |

### Workload
Defines the workload that needs to be executed.

:::info
For more information about workloads, go [here](Workload)
:::

| Variable   | Type   | Required? | Default | Description                                     |
|------------|--------|-----------|---------|-------------------------------------------------|
| pathToFile | string | yes       | N/A     | Path to the file containing the workload trace. |
| type       | string | yes       | N/A     | Type of the workload (e.g., "ComputeWorkload"). |

### Allocation Policy
Defines the allocation policy that is used to decide on which host each task should be executed.

:::info Code
The different allocation policies that can be used can be found [here](https://github.com/atlarge-research/opendc/blob/master/opendc-compute/opendc-compute-simulator/src/main/kotlin/org/opendc/compute/simulator/scheduler/ComputeSchedulers.kt)
:::

| Variable   | Type   | Required? | Default | Description                |
|------------|--------|-----------|---------|----------------------------|
| policyType | string | yes       | N/A     | Type of allocation policy. |

### FailureModel
The failure model that should be used during the simulation.
See [FailureModel](FailureModel) for detailed instructions.

### CheckpointModel
The checkpoint model that should be used to create snapshots.

| Variable                  | Type   | Required? | Default | Description                                                                                                          |
|---------------------------|--------|-----------|---------|----------------------------------------------------------------------------------------------------------------------|
| checkpointInterval        | Int64  | no        | 3600000 | The time between checkpoints in ms                                                                                   |
| checkpointDuration        | Int64  | no        | 300000  | The time to create a snapshot in ms                                                                                  |
| checkpointIntervalScaling | Double | no        | 1.0     | The scaling of the checkpointInterval after each successful checkpoint. The default of 1.0 means no scaling happens. |


## Examples
In the following section, we discuss several examples of experiment files. Any experiment file can be validated using the
JSON schema defined in [schema](ExperimentSchema).

### Simple

The simplest experiment that can be provided to OpenDC is shown below:
```json
{
  "topologies": [
    {
      "pathToFile": "topologies/topology1.json"
    }
  ],
  "workloads": [
    {
      "type": "ComputeWorkload",
      "pathToFile": "traces/bitbrains-small"
    }
  ],
  "allocationPolicies": [
    {
      "policyType": "Mem"
    }
  ]
}
```

This experiment creates a simulation from file topology1, located in the topologies folder, with a workload trace from the
bitbrains-small file, and an allocation policy of type Mem. The simulation is run once (by default), and the default
name is "".

### Complex
Following is an example of a more complex experiment:
```json
{
  "topologies": [
    {
      "pathToFile": "topologies/topology1.json"
    },
    {
      "pathToFile": "topologies/topology2.json"
    },
    {
      "pathToFile": "topologies/topology3.json"
    }
  ],
  "workloads": [
    {
      "pathToFile": "traces/bitbrains-small",
      "type": "ComputeWorkload"
    },
    {
      "pathToFile": "traces/bitbrains-large",
      "type": "ComputeWorkload"
    }
  ],
  "allocationPolicies": [
    {
      "policyType": "Mem"
    },
    {
      "policyType": "Mem-Inv"
    }
  ]
}
```

This experiment runs a total of 12 scenarios.
We have 3 topologies (3 datacenter configurations), each simulated with +2 distinct workloads, each using a different allocation policy (either Mem or Mem-Inv). diff --git a/site/docs/documentation/Input/ExperimentSchema.md b/site/docs/documentation/Input/ExperimentSchema.md new file mode 100644 index 00000000..78ec55f7 --- /dev/null +++ b/site/docs/documentation/Input/ExperimentSchema.md @@ -0,0 +1,81 @@ +Below is the schema for the Scenario JSON file. This schema can be used to validate a scenario file. +A scenario file can be validated using a JSON schema validator, such as https://www.jsonschemavalidator.net/. + +```json +{ + "$schema": "OpenDC/Scenario", + "$defs": { + "topology": { + "type": "object", + "properties": { + "pathToFile": { + "type": "string" + } + }, + "required": [ + "pathToFile" + ] + }, + "workload": { + "type": "object", + "properties": { + "pathToFile": { + "type": "string" + }, + "type": { + "type": "string" + } + }, + "required": [ + "pathToFile", + "type" + ] + }, + "allocationPolicy": { + "type": "object", + "properties": { + "policyType": { + "type": "string" + } + }, + "required": [ + "policyType" + ] + } + }, + "properties": { + "name": { + "type": "string" + }, + "topologies": { + "type": "array", + "items": { + "$ref": "#/$defs/topology" + }, + "minItems": 1 + }, + "workloads": { + "type": "array", + "items": { + "$ref": "#/$defs/workload" + }, + "minItems": 1 + }, + "allocationPolicies": { + "type": "array", + "items": { + "$ref": "#/$defs/allocationPolicy" + }, + "minItems": 1 + }, + "runs": { + "type": "integer" + } + }, + "required": [ + "topologies", + "workloads", + "allocationPolicies" + ] +} +``` diff --git a/site/docs/documentation/Input/FailureModel.md b/site/docs/documentation/Input/FailureModel.md new file mode 100644 index 00000000..ecaf7c03 --- /dev/null +++ b/site/docs/documentation/Input/FailureModel.md @@ -0,0 +1,218 @@ +OpenDC provides three types of failure models: [Trace-based](#trace-based-failure-models), 
[Sample-based](#sample-based-failure-models),
and [Prefab](#prefab-failure-models).

All failure models share a similar structure consisting of three components:

1. The _interval_ time determines the time between two failures.
2. The _duration_ time determines how long a single failure takes.
3. The _intensity_ determines how many hosts are affected by a failure.

:::info Code
The code that defines the Failure Models can be found [here](https://github.com/atlarge-research/opendc/blob/master/opendc-experiments/opendc-experiments-base/src/main/kotlin/org/opendc/experiments/base/experiment/specs/FailureModelSpec.kt).
:::

## Trace-based failure models
Trace-based failure models are defined by a parquet file. This file defines the interval, duration, and intensity of
several failures. The failures defined in the file are looped. A valid failure model file follows the format defined below:

| Metric            | Datatype | Unit         | Summary                                    |
|-------------------|----------|--------------|--------------------------------------------|
| failure_interval  | int64    | milliseconds | The duration since the last failure        |
| failure_duration  | int64    | milliseconds | The duration of the failure                |
| failure_intensity | float64  | ratio        | The ratio of hosts affected by the failure |

:::info Code
The code implementation of Trace Based Failure Models can be found [here](https://github.com/atlarge-research/opendc/blob/master/opendc-compute/opendc-compute-failure/src/main/kotlin/org/opendc/compute/failure/models/TraceBasedFailureModel.kt)
:::

### Example
A trace-based failure model is specified by setting "type" to "trace-based".
Afterward, the user can define the path to the failure trace using "pathToFile":
```json
{
  "type": "trace-based",
  "pathToFile": "path/to/your/failure_trace.parquet"
}
```

The "repeat" value can be set to false if the user does not want the failures to loop:
```json
{
  "type": "trace-based",
  "pathToFile": "path/to/your/failure_trace.parquet",
  "repeat": false
}
```

## Sample-based failure models
Sample-based failure models sample from three distributions to get the _interval_, _duration_, and _intensity_ of
each failure. Sample-based failure models are affected by randomness and will thus produce different results based
on the provided seed.

:::info Code
The code implementation for the Sample based failure models can be found [here](https://github.com/atlarge-research/opendc/blob/master/opendc-compute/opendc-compute-failure/src/main/kotlin/org/opendc/compute/failure/models/SampleBasedFailureModel.kt)
:::

### Distributions
OpenDC supports eight different distributions based on Java's [RealDistributions](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/RealDistribution.html).
Because the different distributions require different variables, they have to be specified with a specific "type".
Next, we show a correct specification of each of the available distributions in OpenDC.

#### [ConstantRealDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ConstantRealDistribution.html)

```json
{
  "type": "constant",
  "value": 10.0
}
```

#### [ExponentialDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ExponentialDistribution.html)
```json
{
  "type": "exponential",
  "mean": 1.5
}
```

#### [GammaDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/GammaDistribution.html)
```json
{
  "type": "gamma",
  "shape": 1.0,
  "scale": 0.5
}
```

#### [LogNormalDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/LogNormalDistribution.html)
```json
{
  "type": "log-normal",
  "scale": 1.0,
  "shape": 0.5
}
```

#### [NormalDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/NormalDistribution.html)
```json
{
  "type": "normal",
  "mean": 1.0,
  "std": 0.5
}
```

#### [ParetoDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ParetoDistribution.html)
```json
{
  "type": "pareto",
  "scale": 1.0,
  "shape": 0.6
}
```

#### [UniformRealDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/UniformRealDistribution.html)
```json
{
  "type": "uniform",
  "lower": 5.0,
  "upper": 10.0
}
```

#### [WeibullDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/WeibullDistribution.html)
```json
{
  "type": "weibull",
  "alpha": 0.5,
  "beta": 1.2
}
```

### Example
A sample-based failure model is defined using three distributions, one each for the _interval_, _duration_, and _intensity_.
Distributions can be mixed however the user wants. Note that sampled values for the _interval_ and _duration_ are clamped to be positive,
and the _intensity_ is clamped to the range [0.0, 1.0).
To specify a sample-based failure model, the type needs to be set to "custom".

Example:
```json
{
  "type": "custom",
  "iatSampler": {
    "type": "exponential",
    "mean": 1.5
  },
  "durationSampler": {
    "type": "weibull",
    "alpha": 0.5,
    "beta": 1.2
  },
  "nohSampler": {
    "type": "constant",
    "value": 0.5
  }
}
```

## Prefab failure models
The final type of failure model is the prefab model. Prefab models are predefined in OpenDC and are based on
research. Currently, OpenDC has 36 prefab models, covering nine systems analyzed in [The Failure Trace Archive: Enabling the comparison of failure measurements and models of distributed systems](https://www-sciencedirect-com.vu-nl.idm.oclc.org/science/article/pii/S0743731513000634).
The figure below shows the values used to define the failure models.
![img.png](img.png)

Each system's failure model is defined four times, one for each of four distributions (exponential, Weibull, log-normal, and gamma).
The final list of available prefabs is thus:

    G5k06Exp
    G5k06Wbl
    G5k06LogN
    G5k06Gam
    Lanl05Exp
    Lanl05Wbl
    Lanl05LogN
    Lanl05Gam
    Ldns04Exp
    Ldns04Wbl
    Ldns04LogN
    Ldns04Gam
    Microsoft99Exp
    Microsoft99Wbl
    Microsoft99LogN
    Microsoft99Gam
    Nd07cpuExp
    Nd07cpuWbl
    Nd07cpuLogN
    Nd07cpuGam
    Overnet03Exp
    Overnet03Wbl
    Overnet03LogN
    Overnet03Gam
    Pl05Exp
    Pl05Wbl
    Pl05LogN
    Pl05Gam
    Skype06Exp
    Skype06Wbl
    Skype06LogN
    Skype06Gam
    Websites02Exp
    Websites02Wbl
    Websites02LogN
    Websites02Gam

:::info Code
The different Prefab models can be found [here](https://github.com/atlarge-research/opendc/tree/master/opendc-compute/opendc-compute-failure/src/main/kotlin/org/opendc/compute/failure/prefab)
:::

### Example
To specify a prefab model, the "type" needs to be set to "prefab".
+After, the prefab can be defined with "prefabName": + +```json +{ + "type": "prefab", + "prefabName": "G5k06Exp" +} +``` + diff --git a/site/docs/documentation/Input/FailureModels.md b/site/docs/documentation/Input/FailureModels.md deleted file mode 100644 index d62767f6..00000000 --- a/site/docs/documentation/Input/FailureModels.md +++ /dev/null @@ -1,202 +0,0 @@ -OpenDC provides three types of failure models: [Trace-based](#trace-based-failure-models), [Sample-based](#sample-based-failure-models), -and [Prefab](#prefab-failure-models). - -All failure models have a similar structure containing three simple steps. - -1. The _interval_ time determines the time between two failures. -2. The _duration_ time determines how long a single failure takes. -3. The _intensity_ determines how many hosts are effected by a failure. - -# Trace based failure models -Trace-based failure models are defined by a parquet file. This file defines the interval, duration, and intensity of -several failures. The failures defined in the file are looped. A valid failure model file follows the format defined below: - -| Metric | Datatype | Unit | Summary | -|-------------------|------------|---------------|--------------------------------------------| -| failure_interval | int64 | milli seconds | The duration since the last failure | -| failure_duration | int64 | milli seconds | The duration of the failure | -| failure_intensity | float64 | ratio | The ratio of hosts effected by the failure | - -## Schema -A trace-based failure model is specified by setting "type" to "trace-based". 
-After, the user can define the path to the failure trace using "pathToFile": -```json -{ - "type": "trace-based", - "pathToFile": "path/to/your/failure_trace.parquet" -} -``` - -The "repeat" value can be set to false if the user does not want the failures to loop: -```json -{ - "type": "trace-based", - "pathToFile": "path/to/your/failure_trace.parquet", - "repeat": "false" -} -``` - -# Sample based failure models -Sample based failure models sample from three distributions to get the _interval_, _duration_, and _intensity_ of -each failure. Sample-based failure models are effected by randomness and will thus create different results based -on the provided seed. - -## Distributions -OpenDC supports eight different distributions based on java's [RealDistributions](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/RealDistribution.html). -Because the different distributions require different variables, they have to be specified with a specific "type". - -#### [ConstantRealDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ConstantRealDistribution.html) -A distribution that always returns the same value. 
- -```json -{ - "type": "constant", - "value": 10.0 -} -``` - -#### [ExponentialDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ExponentialDistribution.html) -```json -{ - "type": "exponential", - "mean": 1.5 -} -``` - -#### [GammaDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/GammaDistribution.html) -```json -{ - "type": "gamma", - "shape": 1.0, - "scale": 0.5 -} -``` - -#### [LogNormalDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/LogNormalDistribution.html) -```json -{ - "type": "log-normal", - "scale": 1.0, - "shape": 0.5 -} -``` - -#### [NormalDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/NormalDistribution.html) -```json -{ - "type": "normal", - "mean": 1.0, - "std": 0.5 -} -``` - -#### [ParetoDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ParetoDistribution.html) -```json -{ - "type": "constant", - "scale": 1.0, - "shape": 0.6 -} -``` - -#### [UniformRealDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/UniformRealDistribution.html) -```json -{ - "type": "constant", - "lower": 5.0, - "upper": 10.0 -} -``` - -#### [WeibullDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/WeibullDistribution.html) -```json -{ - "type": "constant", - "alpha": 0.5, - "beta": 1.2 -} -``` - -## Schema -A sample-based failure model is defined using three distributions for _intensity_, _duration_, and _intensity_. -Distributions can be mixed however the user wants. Note, values for _intensity_ and _duration_ are clamped to be positive. -The _intensity_ is clamped to the range [0.0, 1.0). 
-To specify a sample-based failure model, the type needs to be set to "custom". - -Example: -```json -{ - "type": "custom", - "iatSampler": { - "type": "exponential", - "mean": 1.5 - }, - "durationSampler": { - "type": "constant", - "alpha": 0.5, - "beta": 1.2 - }, - "nohSampler": { - "type": "constant", - "value": 0.5 - } -} -``` - -# Prefab failure models -The final type of failure models is the prefab models. These are models that are predefined in OpenDC and are based on -research. Currently, OpenDC has 9 prefab models based on [The Failure Trace Archive: Enabling the comparison of failure measurements and models of distributed systems](https://www-sciencedirect-com.vu-nl.idm.oclc.org/science/article/pii/S0743731513000634) -The figure below shows the values used to define the failure models. -![img.png](img.png) - -Each failure model is defined four times, on for each of the four distribution. -The final list of available prefabs is thus: - - G5k06Exp - G5k06Wbl - G5k06LogN - G5k06Gam - Lanl05Exp - Lanl05Wbl - Lanl05LogN - Lanl05Gam - Ldns04Exp - Ldns04Wbl - Ldns04LogN - Ldns04Gam - Microsoft99Exp - Microsoft99Wbl - Microsoft99LogN - Microsoft99Gam - Nd07cpuExp - Nd07cpuWbl - Nd07cpuLogN - Nd07cpuGam - Overnet03Exp - Overnet03Wbl - Overnet03LogN - Overnet03Gam - Pl05Exp - Pl05Wbl - Pl05LogN - Pl05Gam - Skype06Exp - Skype06Wbl - Skype06LogN - Skype06Gam - Websites02Exp - Websites02Wbl - Websites02LogN - Websites02Gam - -## Schema -To specify a prefab model, the "type" needs to be set to "prefab". -After, the prefab can be defined with "prefabName": - -```json -{ - "type": "prefab", - "prefabName": "G5k06Exp" -} -``` - diff --git a/site/docs/documentation/Input/Scenario.md b/site/docs/documentation/Input/Scenario.md deleted file mode 100644 index ff7b9ffb..00000000 --- a/site/docs/documentation/Input/Scenario.md +++ /dev/null @@ -1,125 +0,0 @@ -The scenario of a simulation is defined using a JSON file. 
A scenario consists of one or more topologies, one or more -workloads, one or more allocation policies, a name and a number of times the simulation is being run. - -## Schema - -The schema for the scenario file is provided in [schema](ScenarioSchema) -In the following section, we describe the different components of the schema. - -### General Structure - -| Variable | Type | Required? | Default | Description | -|----------------------|----------------------------------------------|-----------|-------|--------------------------------------------------------------------------| -| name | string | no | "" | Name of the scenario, used for identification and referencing. | -| topologies | List[[Topology](#topology)] | yes | N/A | List of topologies used in the scenario. | -| workloads | List[[Workload](#workload)] | yes | N/A | List of workloads to be executed within the scenario. | -| allocationPolicies | List[[AllocationPolicy](#allocation-policy)] | yes | N/A | Allocation policies used for resource management in the scenario. | -| failureModels | List[[FailureModel](#failuremodel)] | no | empty | List of failure models to simulate various types of failures. | -| exportModels | List[[ExportModel](#exportmodel)] | no | empty | Specifications for exporting data from the simulation. | -| carbonTracePaths | List[string] | no | null | Paths to carbon footprint trace files. | -| outputFolder | string | no | "output" | Directory where the simulation outputs will be stored. | -| initialSeed | integer | no | 0 | Seed used for random number generation to ensure reproducibility. | -| runs | integer | no | 1 | Number of times the scenario should be run. | - -### Topology - -| Variable | Type | Required? | Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| pathToFile | string | yes | N/A | Path to the JSON file defining the topology. | - -### Workload - -| Variable | Type | Required? 
| Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| pathToFile | string | yes | N/A | Path to the file containing the workload trace. | -| type | string | yes | N/A | Type of the workload (e.g., "ComputeWorkload"). | - -### Allocation Policy - -| Variable | Type | Required? | Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| policyType | string | yes | N/A | Type of allocation policy (e.g., "BestFit", "FirstFit"). | - -### FailureModel - -| Variable | Type | Required? | Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| modelType | string | yes | N/A | Type of failure model to simulate specific operational failures. | - -### ExportModel - -| Variable | Type | Required? | Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| exportType | string | yes | N/A | Specifies the type of data export model for simulation results. | - - -## Examples -In the following section, we discuss several examples of Scenario files. Any scenario file can be verified using the -JSON schema defined in [schema](TopologySchema). - -### Simple - -The simplest scneario that can be provided to OpenDC is shown below: -```json -{ - "topologies": [ - { - "pathToFile": "topologies/topology1.json" - } - ], - "workloads": [ - { - "pathToFile": "traces/bitbrains-small", - "type": "ComputeWorkload" - } - ], - "allocationPolicies": [ - { - "policyType": "Mem" - } - ] -} -``` - -This scenario creates a simulation from file topology1, located in the topologies folder, with a workload trace from the -bitbrains-small file, and an allocation policy of type Mem. The simulation is run once (by default), and the default -name is "". 
- -### Complex -Following is an example of a more complex topology: -```json -{ - "topologies": [ - { - "pathToFile": "topologies/topology1.json" - }, - { - "pathToFile": "topologies/topology2.json" - }, - { - "pathToFile": "topologies/topology3.json" - } - ], - "workloads": [ - { - "pathToFile": "traces/bitbrains-small", - "type": "ComputeWorkload" - }, - { - "pathToFile": "traces/bitbrains-large", - "type": "ComputeWorkload" - } - ], - "allocationPolicies": [ - { - "policyType": "Mem" - }, - { - "policyType": "Mem-Inv" - } - ] -} -``` - -This scenario runs a total of 12 experiments. We have 3 topologies (3 datacenter configurations), each simulated with -2 distinct workloads, each using a different allocation policy (either Mem or Mem-Inv). diff --git a/site/docs/documentation/Input/ScenarioSchema.md b/site/docs/documentation/Input/ScenarioSchema.md deleted file mode 100644 index 78ec55f7..00000000 --- a/site/docs/documentation/Input/ScenarioSchema.md +++ /dev/null @@ -1,81 +0,0 @@ -Below is the schema for the Scenario JSON file. This schema can be used to validate a scenario file. -A scenario file can be validated using a JSON schema validator, such as https://www.jsonschemavalidator.net/. 
- -```json -{ - "$schema": "OpenDC/Scenario", - "$defs": { - "topology": { - "type": "object", - "properties": { - "pathToFile": { - "type": "string" - } - }, - "required": [ - "pathToFile" - ] - }, - "workload": { - "type": "object", - "properties": { - "pathToFile": { - "type": "string" - }, - "type": { - "type": "string" - } - }, - "required": [ - "pathToFile", - "type" - ] - }, - "allocationPolicy": { - "type": "object", - "properties": { - "policyType": { - "type": "string" - } - }, - "required": [ - "policyType" - ] - } - }, - "properties": { - "name": { - "type": "string" - }, - "topologies": { - "type": "array", - "items": { - "$ref": "#/$defs/topology" - }, - "minItems": 1 - }, - "workloads": { - "type": "array", - "items": { - "$ref": "#/$defs/workload" - }, - "minItems": 1 - }, - "allocationPolicies": { - "type": "array", - "items": { - "$ref": "#/$defs/allocationPolicy" - }, - "minItems": 1 - }, - "runs": { - "type": "integer" - } - }, - "required": [ - "topologies", - "workloads", - "allocationPolicies" - ] -} -``` diff --git a/site/docs/documentation/Input/Topology.md b/site/docs/documentation/Input/Topology.md index cf726616..0d2479bd 100644 --- a/site/docs/documentation/Input/Topology.md +++ b/site/docs/documentation/Input/Topology.md @@ -2,6 +2,11 @@ The topology of a datacenter is defined using a JSON file. A topology consist of Each cluster consist of at least one host on which jobs can be executed. Each host consist of one or more CPUs, a memory unit and a power model. +:::info Code +The code related to reading and processing topology files can be found [here](https://github.com/atlarge-research/opendc/tree/master/opendc-compute/opendc-compute-topology/src/main/kotlin/org/opendc/compute/topology) +::: + + ## Schema The schema for the topology file is provided in [schema](TopologySchema). @@ -17,12 +22,12 @@ In the following section, we describe the different components of the schema. ### Host -| variable | type | required? 
| default | description |
-|------------|-----------------------|-----------|---------|--------------------------------------------------------------------------------|
-| name | string | no | Host | The name of the host. This is only important for debugging and post-processing |
-| count | integer | no | 1 | The amount of hosts of this type are in the cluster |
-| cpuModel | [CPU](#cpuModel) | yes | N/A | The CPUs in the host |
-| memory | [Memory](#memory) | yes | N/A | The memory used by the host |
+| variable    | type                        | required? | default | description                                                                    |
+|-------------|-----------------------------|-----------|---------|--------------------------------------------------------------------------------|
+| name        | string                      | no        | Host    | The name of the host. This is only important for debugging and post-processing |
+| count       | integer                     | no        | 1       | The number of hosts of this type in the cluster                                |
+| cpuModel    | [CPU](#cpu)                 | yes       | N/A     | The CPUs in the host                                                           |
+| memory      | [Memory](#memory)           | yes       | N/A     | The memory used by the host                                                    |
 | power model | [Power Model](#power-model) | yes | N/A | The power model used to determine the power draw of the host |

 ### CPU

@@ -49,12 +54,13 @@ In the following section, we describe the different components of the schema.

 ### Power Model

-| variable  | type    | Unit | required? | default | description                                                                |
-|-----------|---------|------|-----------|---------|----------------------------------------------------------------------------|
-| modelType | string  | N/A  | yes       | N/A     | The type of model used to determine power draw                             |
-| power     | string  | Watt | no        | 400     | The constant power draw when using the 'constant' power model type in Watt |
-| maxPower  | string  | Watt | yes       | N/A     | The power draw of a host when using max capacity in Watt                   |
-| idlePower | integer | Watt | yes       | N/A     | The power draw of a host when idle in Watt                                 |
+| variable        | type   | Unit | required? 
| default  | description                                                                   |
+|-----------------|--------|------|-----------|----------|-------------------------------------------------------------------------------|
+| vendor          | string | N/A  | yes       | N/A      | The vendor of the power source                                                |
+| modelName       | string | N/A  | yes       | N/A      | The model name of the power source                                            |
+| arch            | string | N/A  | yes       | N/A      | The architecture of the power source                                          |
+| totalPower      | Int64  | Watt | no        | max long | The maximum power the power source can deliver in Watt                        |
+| carbonTracePath | string | N/A  | no        | null     | Path to a carbon intensity trace. If not given, carbon intensity is always 0. |

 ## Examples

@@ -71,12 +77,11 @@ The simplest data center that can be provided to OpenDC is shown below:

 {
   "hosts": [
     {
-      "cpus": [
-        {
-          "coreCount": 16,
-          "coreSpeed": 1000
-        }
-      ],
+      "cpu":
+      {
+        "coreCount": 16,
+        "coreSpeed": 1000
+      },
       "memory": {
         "memorySize": 100000
       }
@@ -87,7 +92,7 @@ The simplest data center that can be provided to OpenDC is shown below:
 }
 }
 ```

-This is creates a data center with a single cluster containing a single host. This host consist of a single 16 core CPU
+This creates a data center with a single cluster containing a single host. This host consists of a single 16-core CPU
 with a speed of 1 GHz, and 100 MiB of RAM.
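As a quick sanity check, the single-host cluster object shown above can be parsed with any JSON library and its total CPU capacity computed. The sketch below is illustrative only (the `total_capacity_mhz` helper is not part of the OpenDC API) and assumes the cluster shape from the example:

```python
import json

# The single-host cluster from the simplest-topology example above.
cluster = json.loads("""
{
    "hosts": [
        {
            "cpu": { "coreCount": 16, "coreSpeed": 1000 },
            "memory": { "memorySize": 100000 }
        }
    ]
}
""")

def total_capacity_mhz(cluster):
    """Sum coreCount * coreSpeed over all hosts, honouring the optional "count" fields."""
    total = 0
    for host in cluster["hosts"]:
        cpu = host["cpu"]
        total += host.get("count", 1) * cpu.get("count", 1) * cpu["coreCount"] * cpu["coreSpeed"]
    return total

print(total_capacity_mhz(cluster))  # 16 cores * 1000 MHz = 16000
```

The same helper also accounts for the "count" keyword introduced in the next section, since duplicated hosts and CPUs simply multiply the capacity.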
### Count @@ -102,14 +107,14 @@ Duplicating clusters, hosts, or CPUs is easy using the "count" keyword: "hosts": [ { "count": 5, - "cpus": [ - { - "coreCount": 16, - "coreSpeed": 1000, - "count": 10 - } - ], - "memory": { + "cpu": + { + "coreCount": 16, + "coreSpeed": 1000, + "count": 10 + }, + "memory": + { "memorySize": 100000 } } @@ -205,7 +210,7 @@ Aside from using number to indicate values it is also possible to define values "modelType": "linear", "power": "400 Watts", "maxPower": "1 KW", - "idlePower": "0.4W" + "idlePower": "0.4 W" } } ] diff --git a/site/docs/documentation/Input/Traces.md b/site/docs/documentation/Input/Traces.md deleted file mode 100644 index ec5782cb..00000000 --- a/site/docs/documentation/Input/Traces.md +++ /dev/null @@ -1,26 +0,0 @@ -### Traces -OpenDC works with two types of traces that describe the servers that need to be run. Both traces have to be provided as -parquet files. - -#### Meta -The meta trace provides an overview of the servers: - -| Metric | Datatype | Unit | Summary | -|--------------|------------|----------|--------------------------------------------------| -| id | string | | The id of the server | -| start_time | datetime64 | datetime | The submission time of the server | -| stop_time | datetime64 | datetime | The finish time of the submission | -| cpu_count | int32 | count | The number of CPUs required to run this server | -| cpu_capacity | float64 | MHz | The amount of CPU required to run this server | -| mem_capacity | int64 | MB | The amount of memory required to run this server | - -#### Trace -The Trace file provides information about the computational demand of each server over time: - -| Metric | Datatype | Unit | Summary | -|-----------|------------|---------------|---------------------------------------------| -| id | string | | The id of the server | -| timestamp | datetime64 | datetime | The timestamp of the sample | -| duration | int64 | milli seconds | The duration since the last sample | -| cpu_count 
| int32 | count | The number of cpus required |
-| cpu_usage | float64 | MHz | The amount of computational power required. |
diff --git a/site/docs/documentation/Input/Workload.md b/site/docs/documentation/Input/Workload.md
new file mode 100644
index 00000000..5f2e61ae
--- /dev/null
+++ b/site/docs/documentation/Input/Workload.md
@@ -0,0 +1,24 @@
+OpenDC works with two types of traces that describe the tasks that need to be run. Both traces have to be provided as
+parquet files.
+
+#### Task
+The Task file provides an overview of the tasks:
+
+| Metric          | Datatype | Unit         | Summary                                        |
+|-----------------|----------|--------------|------------------------------------------------|
+| id              | string   |              | The id of the task                             |
+| submission_time | int64    | datetime     | The submission time of the task                |
+| duration        | int64    | milliseconds | The total duration of the task                 |
+| cpu_count       | int32    | count        | The number of CPUs required to run this task   |
+| cpu_capacity    | float64  | MHz          | The amount of CPU required to run this task    |
+| mem_capacity    | int64    | MB           | The amount of memory required to run this task |
+
+#### Fragment
+The Fragment file provides information about the computational demand of each task over time:
+
+| Metric    | Datatype | Unit         | Summary                                    |
+|-----------|----------|--------------|--------------------------------------------|
+| id        | string   |              | The id of the task                         |
+| duration  | int64    | milliseconds | The duration of this fragment              |
+| cpu_count | int32    | count        | The number of CPUs required                |
+| cpu_usage | float64  | MHz          | The amount of computational power required |
-- cgit v1.2.3
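To make the Task and Fragment schemas above concrete, the sketch below builds a tiny in-memory workload and checks that the fragments are consistent with their task. Real traces are parquet files (typically written with a library such as pyarrow, not shown here); plain Python dicts are used only to illustrate the columns, and the consistency rules in `validate` are illustrative assumptions rather than checks enforced by OpenDC:

```python
# One Task record and its Fragment records, mirroring the two schemas above.
task = {
    "id": "task-0",
    "submission_time": 0,       # int64 timestamp
    "duration": 300_000,        # int64, milliseconds
    "cpu_count": 2,             # int32
    "cpu_capacity": 2000.0,     # float64, MHz
    "mem_capacity": 4096,       # int64, MB
}

fragments = [
    {"id": "task-0", "duration": 150_000, "cpu_count": 2, "cpu_usage": 1500.0},
    {"id": "task-0", "duration": 150_000, "cpu_count": 1, "cpu_usage": 500.0},
]

def validate(task, fragments):
    """Illustrative consistency checks between a Task row and its Fragment rows."""
    assert all(f["id"] == task["id"] for f in fragments)
    # The fragments together should cover the task's total duration.
    assert sum(f["duration"] for f in fragments) == task["duration"]
    # No fragment should demand more CPUs or MHz than the task reserves.
    assert all(f["cpu_count"] <= task["cpu_count"] for f in fragments)
    assert all(f["cpu_usage"] <= task["cpu_capacity"] for f in fragments)
    return True

print(validate(task, fragments))  # True
```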