From 0f835d57b0e989e25aa0b71fe374a0fb1a94e86f Mon Sep 17 00:00:00 2001 From: Dante Niewenhuis Date: Tue, 5 Nov 2024 14:17:08 +0100 Subject: Documentation update (#261) * Updated a lot of documentation, added a new get-started tutorial. * Applied Spotless * Applied Spotless Java * Added bitbrains workload to site --- site/docs/documentation/Input/Experiment.md | 175 +++++++++++++++++ site/docs/documentation/Input/ExperimentSchema.md | 81 ++++++++ site/docs/documentation/Input/FailureModel.md | 218 ++++++++++++++++++++++ site/docs/documentation/Input/FailureModels.md | 202 -------------------- site/docs/documentation/Input/Scenario.md | 125 ------------- site/docs/documentation/Input/ScenarioSchema.md | 81 -------- site/docs/documentation/Input/Topology.md | 61 +++--- site/docs/documentation/Input/Traces.md | 26 --- site/docs/documentation/Input/Workload.md | 24 +++ 9 files changed, 531 insertions(+), 462 deletions(-) create mode 100644 site/docs/documentation/Input/Experiment.md create mode 100644 site/docs/documentation/Input/ExperimentSchema.md create mode 100644 site/docs/documentation/Input/FailureModel.md delete mode 100644 site/docs/documentation/Input/FailureModels.md delete mode 100644 site/docs/documentation/Input/Scenario.md delete mode 100644 site/docs/documentation/Input/ScenarioSchema.md delete mode 100644 site/docs/documentation/Input/Traces.md create mode 100644 site/docs/documentation/Input/Workload.md (limited to 'site/docs/documentation/Input') diff --git a/site/docs/documentation/Input/Experiment.md b/site/docs/documentation/Input/Experiment.md new file mode 100644 index 00000000..c8b96d1f --- /dev/null +++ b/site/docs/documentation/Input/Experiment.md @@ -0,0 +1,175 @@ +When using OpenDC, an experiment defines what should be run, and how. An experiment consists of one or more scenarios, +each defining a different simulation to run. 
Scenarios can differ in many aspects, such as the topology that is used,
the workload that is run, or the policies that are applied, to name a few. An experiment is defined using a JSON file.
On this page, we discuss how to properly define experiments for OpenDC.

:::info Code
All code related to reading and processing experiment files can be found [here](https://github.com/atlarge-research/opendc/tree/master/opendc-experiments/opendc-experiments-base/src/main/kotlin/org/opendc/experiments/base/experiment)

The code used to run a given experiment can be found [here](https://github.com/atlarge-research/opendc/tree/master/opendc-experiments/opendc-experiments-base/src/main/kotlin/org/opendc/experiments/base/runner)
:::

## Schema

The schema for the experiment file is provided in [schema](ExperimentSchema).
In the following section, we describe the different components of the schema.
Some components of an experiment are not single values, but lists. Lists are used to run multiple scenarios from
a single experiment file: OpenDC executes all permutations of the different values.
This means that if every list-based field contains a single value, only one scenario will be run.

| Variable            | Type                                         | Required? | Default  | Description                                                       |
|---------------------|----------------------------------------------|-----------|----------|-------------------------------------------------------------------|
| name                | string                                       | no        | ""       | Name of the scenario, used for identification and referencing.    |
| outputFolder        | string                                       | no        | "output" | Directory where the simulation outputs will be stored.            |
| initialSeed         | integer                                      | no        | 0        | Seed used for random number generation to ensure reproducibility. |
| runs                | integer                                      | no        | 1        | Number of times the scenario should be run.                       |
| exportModels        | List[[ExportModel](#exportmodel)]            | no        | Default  | Specifications for exporting data from the simulation.            |
| computeExportConfig | [ComputeExportConfig](#computeexportconfig)  | no        | Default  | The features that should be exported during the simulation.       |
| maxNumFailures      | List[integer]                                | no        | [10]     | The max number of times a task can fail before being terminated.  |
| topologies          | List[[Topology](#topology)]                  | yes       | N/A      | List of topologies used in the scenario.                          |
| workloads           | List[[Workload](#workload)]                  | yes       | N/A      | List of workloads to be executed within the scenario.             |
| allocationPolicies  | List[[AllocationPolicy](#allocation-policy)] | yes       | N/A      | Allocation policies used for resource management in the scenario. |
| failureModels       | List[[FailureModel](#failuremodel)]          | no        | Default  | List of failure models to simulate various types of failures.     |
| checkpointModels    | List[[CheckpointModel](#checkpointmodel)]    | no        | null     | Checkpoint models that determine when and how snapshots are made. |
| carbonTracePaths    | List[string]                                 | no        | null     | Paths to carbon footprint trace files.                            |


Many of the input fields of the experiment file are complex objects themselves. Next, we describe the required input
type of each of these fields.

### ExportModel

| Variable       | Type  | Required? | Default | Description                                 |
|----------------|-------|-----------|---------|---------------------------------------------|
| exportInterval | Int64 | no        | 300     | The duration between two exports in seconds |


### ComputeExportConfig
The features that should be exported by OpenDC.

| Variable                 | Type         | Required? | Default      | Description                                                           |
|--------------------------|--------------|-----------|--------------|-----------------------------------------------------------------------|
| hostExportColumns        | List[String] | no        | All features | The features that should be exported to the host output file.         |
| taskExportColumns        | List[String] | no        | All features | The features that should be exported to the task output file.         |
| powerSourceExportColumns | List[String] | no        | All features | The features that should be exported to the power source output file. |
| serviceExportColumns     | List[String] | no        | All features | The features that should be exported to the service output file.      |


### Topology
Defines the topology on which the workload will be run.

:::info
For more information about the topology format, go [here](Topology)
:::

| Variable   | Type   | Required? | Default | Description                                  |
|------------|--------|-----------|---------|----------------------------------------------|
| pathToFile | string | yes       | N/A     | Path to the JSON file defining the topology. |

### Workload
Defines the workload that needs to be executed.

:::info
For more information about workloads, go [here](Workload)
:::

| Variable   | Type   | Required? | Default | Description                                     |
|------------|--------|-----------|---------|-------------------------------------------------|
| pathToFile | string | yes       | N/A     | Path to the file containing the workload trace. |
| type       | string | yes       | N/A     | Type of the workload (e.g., "ComputeWorkload"). |

### Allocation Policy
Defines the allocation policy that is used to decide on which host each task should be executed.

:::info Code
The different allocation policies that can be used can be found [here](https://github.com/atlarge-research/opendc/blob/master/opendc-compute/opendc-compute-simulator/src/main/kotlin/org/opendc/compute/simulator/scheduler/ComputeSchedulers.kt)
:::

| Variable   | Type   | Required? | Default | Description                |
|------------|--------|-----------|---------|----------------------------|
| policyType | string | yes       | N/A     | Type of allocation policy. |

### FailureModel
The failure model that should be used during the simulation.
See [FailureModel](FailureModel) for detailed instructions.

### CheckpointModel
The checkpoint model that should be used to create snapshots.

| Variable                  | Type   | Required? | Default | Description                                                                                                          |
|---------------------------|--------|-----------|---------|----------------------------------------------------------------------------------------------------------------------|
| checkpointInterval        | Int64  | no        | 3600000 | The time between checkpoints in ms                                                                                   |
| checkpointDuration        | Int64  | no        | 300000  | The time to create a snapshot in ms                                                                                  |
| checkpointIntervalScaling | Double | no        | 1.0     | The scaling of the checkpointInterval after each successful checkpoint. The default of 1.0 means no scaling happens. |


## Examples
In the following section, we discuss several examples of experiment files. Any experiment file can be validated using the
JSON schema defined in [schema](ExperimentSchema).

### Simple

The simplest experiment that can be provided to OpenDC is shown below:
```json
{
  "topologies": [
    {
      "pathToFile": "topologies/topology1.json"
    }
  ],
  "workloads": [
    {
      "type": "ComputeWorkload",
      "pathToFile": "traces/bitbrains-small"
    }
  ],
  "allocationPolicies": [
    {
      "policyType": "Mem"
    }
  ]
}
```

This experiment creates a simulation from file topology1, located in the topologies folder, with a workload trace from the
bitbrains-small file, and an allocation policy of type Mem. The simulation is run once (by default), and the default
name is "".

### Complex
Following is an example of a more complex experiment:
```json
{
  "topologies": [
    {
      "pathToFile": "topologies/topology1.json"
    },
    {
      "pathToFile": "topologies/topology2.json"
    },
    {
      "pathToFile": "topologies/topology3.json"
    }
  ],
  "workloads": [
    {
      "pathToFile": "traces/bitbrains-small",
      "type": "ComputeWorkload"
    },
    {
      "pathToFile": "traces/bitbrains-large",
      "type": "ComputeWorkload"
    }
  ],
  "allocationPolicies": [
    {
      "policyType": "Mem"
    },
    {
      "policyType": "Mem-Inv"
    }
  ]
}
```

This experiment runs a total of 12 scenarios.
We have 3 topologies (3 datacenter configurations), each simulated with +2 distinct workloads, each using a different allocation policy (either Mem or Mem-Inv). diff --git a/site/docs/documentation/Input/ExperimentSchema.md b/site/docs/documentation/Input/ExperimentSchema.md new file mode 100644 index 00000000..78ec55f7 --- /dev/null +++ b/site/docs/documentation/Input/ExperimentSchema.md @@ -0,0 +1,81 @@ +Below is the schema for the Scenario JSON file. This schema can be used to validate a scenario file. +A scenario file can be validated using a JSON schema validator, such as https://www.jsonschemavalidator.net/. + +```json +{ + "$schema": "OpenDC/Scenario", + "$defs": { + "topology": { + "type": "object", + "properties": { + "pathToFile": { + "type": "string" + } + }, + "required": [ + "pathToFile" + ] + }, + "workload": { + "type": "object", + "properties": { + "pathToFile": { + "type": "string" + }, + "type": { + "type": "string" + } + }, + "required": [ + "pathToFile", + "type" + ] + }, + "allocationPolicy": { + "type": "object", + "properties": { + "policyType": { + "type": "string" + } + }, + "required": [ + "policyType" + ] + } + }, + "properties": { + "name": { + "type": "string" + }, + "topologies": { + "type": "array", + "items": { + "$ref": "#/$defs/topology" + }, + "minItems": 1 + }, + "workloads": { + "type": "array", + "items": { + "$ref": "#/$defs/workload" + }, + "minItems": 1 + }, + "allocationPolicies": { + "type": "array", + "items": { + "$ref": "#/$defs/allocationPolicy" + }, + "minItems": 1 + }, + "runs": { + "type": "integer" + } + }, + "required": [ + "topologies", + "workloads", + "allocationPolicies" + ] +} +``` diff --git a/site/docs/documentation/Input/FailureModel.md b/site/docs/documentation/Input/FailureModel.md new file mode 100644 index 00000000..ecaf7c03 --- /dev/null +++ b/site/docs/documentation/Input/FailureModel.md @@ -0,0 +1,218 @@ +OpenDC provides three types of failure models: [Trace-based](#trace-based-failure-models), 
[Sample-based](#sample-based-failure-models),
and [Prefab](#prefab-failure-models).

All failure models share a similar structure consisting of three components:

1. The _interval_ time determines the time between two failures.
2. The _duration_ time determines how long a single failure takes.
3. The _intensity_ determines how many hosts are affected by a failure.

:::info Code
The code that defines the Failure Models can be found [here](https://github.com/atlarge-research/opendc/blob/master/opendc-experiments/opendc-experiments-base/src/main/kotlin/org/opendc/experiments/base/experiment/specs/FailureModelSpec.kt).
:::

## Trace-based failure models
Trace-based failure models are defined by a parquet file. This file defines the interval, duration, and intensity of
several failures. The failures defined in the file are looped. A valid failure model file follows the format defined below:

| Metric            | Datatype | Unit         | Summary                                    |
|-------------------|----------|--------------|--------------------------------------------|
| failure_interval  | int64    | milliseconds | The duration since the last failure        |
| failure_duration  | int64    | milliseconds | The duration of the failure                |
| failure_intensity | float64  | ratio        | The ratio of hosts affected by the failure |

:::info Code
The code implementation of Trace Based Failure Models can be found [here](https://github.com/atlarge-research/opendc/blob/master/opendc-compute/opendc-compute-failure/src/main/kotlin/org/opendc/compute/failure/models/TraceBasedFailureModel.kt)
:::

### Example
A trace-based failure model is specified by setting "type" to "trace-based".
Afterward, the user can define the path to the failure trace using "pathToFile":
```json
{
  "type": "trace-based",
  "pathToFile": "path/to/your/failure_trace.parquet"
}
```

The "repeat" value can be set to false if the user does not want the failures to loop:
```json
{
  "type": "trace-based",
  "pathToFile": "path/to/your/failure_trace.parquet",
  "repeat": false
}
```

## Sample-based failure models
Sample-based failure models sample from three distributions to get the _interval_, _duration_, and _intensity_ of
each failure. Sample-based failure models are affected by randomness and will thus produce different results based
on the provided seed.

:::info Code
The code implementation for the Sample based failure models can be found [here](https://github.com/atlarge-research/opendc/blob/master/opendc-compute/opendc-compute-failure/src/main/kotlin/org/opendc/compute/failure/models/SampleBasedFailureModel.kt)
:::

### Distributions
OpenDC supports eight different distributions based on Java's [RealDistributions](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/RealDistribution.html).
Because the different distributions require different variables, they have to be specified with a specific "type".
Next, we show a correct specification of each of the available distributions in OpenDC.

#### [ConstantRealDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ConstantRealDistribution.html)

```json
{
  "type": "constant",
  "value": 10.0
}
```

#### [ExponentialDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ExponentialDistribution.html)
```json
{
  "type": "exponential",
  "mean": 1.5
}
```

#### [GammaDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/GammaDistribution.html)
```json
{
  "type": "gamma",
  "shape": 1.0,
  "scale": 0.5
}
```

#### [LogNormalDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/LogNormalDistribution.html)
```json
{
  "type": "log-normal",
  "scale": 1.0,
  "shape": 0.5
}
```

#### [NormalDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/NormalDistribution.html)
```json
{
  "type": "normal",
  "mean": 1.0,
  "std": 0.5
}
```

#### [ParetoDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ParetoDistribution.html)
```json
{
  "type": "pareto",
  "scale": 1.0,
  "shape": 0.6
}
```

#### [UniformRealDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/UniformRealDistribution.html)
```json
{
  "type": "uniform",
  "lower": 5.0,
  "upper": 10.0
}
```

#### [WeibullDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/WeibullDistribution.html)
```json
{
  "type": "weibull",
  "alpha": 0.5,
  "beta": 1.2
}
```

### Example
A sample-based failure model is defined using three distributions, one each for the _interval_, _duration_, and _intensity_.
Distributions can be mixed however the user wants. Note that sampled values for the _interval_ and _duration_ are clamped to be positive,
and the _intensity_ is clamped to the range [0.0, 1.0).
To specify a sample-based failure model, the type needs to be set to "custom".

Example:
```json
{
  "type": "custom",
  "iatSampler": {
    "type": "exponential",
    "mean": 1.5
  },
  "durationSampler": {
    "type": "weibull",
    "alpha": 0.5,
    "beta": 1.2
  },
  "nohSampler": {
    "type": "constant",
    "value": 0.5
  }
}
```

## Prefab failure models
The final type of failure model is the prefab model. Prefab models are predefined in OpenDC and are based on
research. Currently, OpenDC has 36 prefab models, covering nine systems analyzed in [The Failure Trace Archive: Enabling the comparison of failure measurements and models of distributed systems](https://www-sciencedirect-com.vu-nl.idm.oclc.org/science/article/pii/S0743731513000634).
The figure below shows the values used to define the failure models.
![img.png](img.png)

Each system's failure model is defined four times, one for each of four distributions (exponential, Weibull, log-normal, and gamma).
The final list of available prefabs is thus:

    G5k06Exp
    G5k06Wbl
    G5k06LogN
    G5k06Gam
    Lanl05Exp
    Lanl05Wbl
    Lanl05LogN
    Lanl05Gam
    Ldns04Exp
    Ldns04Wbl
    Ldns04LogN
    Ldns04Gam
    Microsoft99Exp
    Microsoft99Wbl
    Microsoft99LogN
    Microsoft99Gam
    Nd07cpuExp
    Nd07cpuWbl
    Nd07cpuLogN
    Nd07cpuGam
    Overnet03Exp
    Overnet03Wbl
    Overnet03LogN
    Overnet03Gam
    Pl05Exp
    Pl05Wbl
    Pl05LogN
    Pl05Gam
    Skype06Exp
    Skype06Wbl
    Skype06LogN
    Skype06Gam
    Websites02Exp
    Websites02Wbl
    Websites02LogN
    Websites02Gam

:::info Code
The different Prefab models can be found [here](https://github.com/atlarge-research/opendc/tree/master/opendc-compute/opendc-compute-failure/src/main/kotlin/org/opendc/compute/failure/prefab)
:::

### Example
To specify a prefab model, the "type" needs to be set to "prefab".
+After, the prefab can be defined with "prefabName": + +```json +{ + "type": "prefab", + "prefabName": "G5k06Exp" +} +``` + diff --git a/site/docs/documentation/Input/FailureModels.md b/site/docs/documentation/Input/FailureModels.md deleted file mode 100644 index d62767f6..00000000 --- a/site/docs/documentation/Input/FailureModels.md +++ /dev/null @@ -1,202 +0,0 @@ -OpenDC provides three types of failure models: [Trace-based](#trace-based-failure-models), [Sample-based](#sample-based-failure-models), -and [Prefab](#prefab-failure-models). - -All failure models have a similar structure containing three simple steps. - -1. The _interval_ time determines the time between two failures. -2. The _duration_ time determines how long a single failure takes. -3. The _intensity_ determines how many hosts are effected by a failure. - -# Trace based failure models -Trace-based failure models are defined by a parquet file. This file defines the interval, duration, and intensity of -several failures. The failures defined in the file are looped. A valid failure model file follows the format defined below: - -| Metric | Datatype | Unit | Summary | -|-------------------|------------|---------------|--------------------------------------------| -| failure_interval | int64 | milli seconds | The duration since the last failure | -| failure_duration | int64 | milli seconds | The duration of the failure | -| failure_intensity | float64 | ratio | The ratio of hosts effected by the failure | - -## Schema -A trace-based failure model is specified by setting "type" to "trace-based". 
-After, the user can define the path to the failure trace using "pathToFile": -```json -{ - "type": "trace-based", - "pathToFile": "path/to/your/failure_trace.parquet" -} -``` - -The "repeat" value can be set to false if the user does not want the failures to loop: -```json -{ - "type": "trace-based", - "pathToFile": "path/to/your/failure_trace.parquet", - "repeat": "false" -} -``` - -# Sample based failure models -Sample based failure models sample from three distributions to get the _interval_, _duration_, and _intensity_ of -each failure. Sample-based failure models are effected by randomness and will thus create different results based -on the provided seed. - -## Distributions -OpenDC supports eight different distributions based on java's [RealDistributions](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/RealDistribution.html). -Because the different distributions require different variables, they have to be specified with a specific "type". - -#### [ConstantRealDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ConstantRealDistribution.html) -A distribution that always returns the same value. 
- -```json -{ - "type": "constant", - "value": 10.0 -} -``` - -#### [ExponentialDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ExponentialDistribution.html) -```json -{ - "type": "exponential", - "mean": 1.5 -} -``` - -#### [GammaDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/GammaDistribution.html) -```json -{ - "type": "gamma", - "shape": 1.0, - "scale": 0.5 -} -``` - -#### [LogNormalDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/LogNormalDistribution.html) -```json -{ - "type": "log-normal", - "scale": 1.0, - "shape": 0.5 -} -``` - -#### [NormalDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/NormalDistribution.html) -```json -{ - "type": "normal", - "mean": 1.0, - "std": 0.5 -} -``` - -#### [ParetoDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/ParetoDistribution.html) -```json -{ - "type": "constant", - "scale": 1.0, - "shape": 0.6 -} -``` - -#### [UniformRealDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/UniformRealDistribution.html) -```json -{ - "type": "constant", - "lower": 5.0, - "upper": 10.0 -} -``` - -#### [WeibullDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/WeibullDistribution.html) -```json -{ - "type": "constant", - "alpha": 0.5, - "beta": 1.2 -} -``` - -## Schema -A sample-based failure model is defined using three distributions for _intensity_, _duration_, and _intensity_. -Distributions can be mixed however the user wants. Note, values for _intensity_ and _duration_ are clamped to be positive. -The _intensity_ is clamped to the range [0.0, 1.0). 
-To specify a sample-based failure model, the type needs to be set to "custom". - -Example: -```json -{ - "type": "custom", - "iatSampler": { - "type": "exponential", - "mean": 1.5 - }, - "durationSampler": { - "type": "constant", - "alpha": 0.5, - "beta": 1.2 - }, - "nohSampler": { - "type": "constant", - "value": 0.5 - } -} -``` - -# Prefab failure models -The final type of failure models is the prefab models. These are models that are predefined in OpenDC and are based on -research. Currently, OpenDC has 9 prefab models based on [The Failure Trace Archive: Enabling the comparison of failure measurements and models of distributed systems](https://www-sciencedirect-com.vu-nl.idm.oclc.org/science/article/pii/S0743731513000634) -The figure below shows the values used to define the failure models. -![img.png](img.png) - -Each failure model is defined four times, on for each of the four distribution. -The final list of available prefabs is thus: - - G5k06Exp - G5k06Wbl - G5k06LogN - G5k06Gam - Lanl05Exp - Lanl05Wbl - Lanl05LogN - Lanl05Gam - Ldns04Exp - Ldns04Wbl - Ldns04LogN - Ldns04Gam - Microsoft99Exp - Microsoft99Wbl - Microsoft99LogN - Microsoft99Gam - Nd07cpuExp - Nd07cpuWbl - Nd07cpuLogN - Nd07cpuGam - Overnet03Exp - Overnet03Wbl - Overnet03LogN - Overnet03Gam - Pl05Exp - Pl05Wbl - Pl05LogN - Pl05Gam - Skype06Exp - Skype06Wbl - Skype06LogN - Skype06Gam - Websites02Exp - Websites02Wbl - Websites02LogN - Websites02Gam - -## Schema -To specify a prefab model, the "type" needs to be set to "prefab". -After, the prefab can be defined with "prefabName": - -```json -{ - "type": "prefab", - "prefabName": "G5k06Exp" -} -``` - diff --git a/site/docs/documentation/Input/Scenario.md b/site/docs/documentation/Input/Scenario.md deleted file mode 100644 index ff7b9ffb..00000000 --- a/site/docs/documentation/Input/Scenario.md +++ /dev/null @@ -1,125 +0,0 @@ -The scenario of a simulation is defined using a JSON file. 
A scenario consists of one or more topologies, one or more -workloads, one or more allocation policies, a name and a number of times the simulation is being run. - -## Schema - -The schema for the scenario file is provided in [schema](ScenarioSchema) -In the following section, we describe the different components of the schema. - -### General Structure - -| Variable | Type | Required? | Default | Description | -|----------------------|----------------------------------------------|-----------|-------|--------------------------------------------------------------------------| -| name | string | no | "" | Name of the scenario, used for identification and referencing. | -| topologies | List[[Topology](#topology)] | yes | N/A | List of topologies used in the scenario. | -| workloads | List[[Workload](#workload)] | yes | N/A | List of workloads to be executed within the scenario. | -| allocationPolicies | List[[AllocationPolicy](#allocation-policy)] | yes | N/A | Allocation policies used for resource management in the scenario. | -| failureModels | List[[FailureModel](#failuremodel)] | no | empty | List of failure models to simulate various types of failures. | -| exportModels | List[[ExportModel](#exportmodel)] | no | empty | Specifications for exporting data from the simulation. | -| carbonTracePaths | List[string] | no | null | Paths to carbon footprint trace files. | -| outputFolder | string | no | "output" | Directory where the simulation outputs will be stored. | -| initialSeed | integer | no | 0 | Seed used for random number generation to ensure reproducibility. | -| runs | integer | no | 1 | Number of times the scenario should be run. | - -### Topology - -| Variable | Type | Required? | Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| pathToFile | string | yes | N/A | Path to the JSON file defining the topology. | - -### Workload - -| Variable | Type | Required? 
| Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| pathToFile | string | yes | N/A | Path to the file containing the workload trace. | -| type | string | yes | N/A | Type of the workload (e.g., "ComputeWorkload"). | - -### Allocation Policy - -| Variable | Type | Required? | Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| policyType | string | yes | N/A | Type of allocation policy (e.g., "BestFit", "FirstFit"). | - -### FailureModel - -| Variable | Type | Required? | Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| modelType | string | yes | N/A | Type of failure model to simulate specific operational failures. | - -### ExportModel - -| Variable | Type | Required? | Default | Description | -|-------------|--------|-----------|---------|---------------------------------------------------------------------| -| exportType | string | yes | N/A | Specifies the type of data export model for simulation results. | - - -## Examples -In the following section, we discuss several examples of Scenario files. Any scenario file can be verified using the -JSON schema defined in [schema](TopologySchema). - -### Simple - -The simplest scneario that can be provided to OpenDC is shown below: -```json -{ - "topologies": [ - { - "pathToFile": "topologies/topology1.json" - } - ], - "workloads": [ - { - "pathToFile": "traces/bitbrains-small", - "type": "ComputeWorkload" - } - ], - "allocationPolicies": [ - { - "policyType": "Mem" - } - ] -} -``` - -This scenario creates a simulation from file topology1, located in the topologies folder, with a workload trace from the -bitbrains-small file, and an allocation policy of type Mem. The simulation is run once (by default), and the default -name is "". 
- -### Complex -Following is an example of a more complex topology: -```json -{ - "topologies": [ - { - "pathToFile": "topologies/topology1.json" - }, - { - "pathToFile": "topologies/topology2.json" - }, - { - "pathToFile": "topologies/topology3.json" - } - ], - "workloads": [ - { - "pathToFile": "traces/bitbrains-small", - "type": "ComputeWorkload" - }, - { - "pathToFile": "traces/bitbrains-large", - "type": "ComputeWorkload" - } - ], - "allocationPolicies": [ - { - "policyType": "Mem" - }, - { - "policyType": "Mem-Inv" - } - ] -} -``` - -This scenario runs a total of 12 experiments. We have 3 topologies (3 datacenter configurations), each simulated with -2 distinct workloads, each using a different allocation policy (either Mem or Mem-Inv). diff --git a/site/docs/documentation/Input/ScenarioSchema.md b/site/docs/documentation/Input/ScenarioSchema.md deleted file mode 100644 index 78ec55f7..00000000 --- a/site/docs/documentation/Input/ScenarioSchema.md +++ /dev/null @@ -1,81 +0,0 @@ -Below is the schema for the Scenario JSON file. This schema can be used to validate a scenario file. -A scenario file can be validated using a JSON schema validator, such as https://www.jsonschemavalidator.net/. 
- -```json -{ - "$schema": "OpenDC/Scenario", - "$defs": { - "topology": { - "type": "object", - "properties": { - "pathToFile": { - "type": "string" - } - }, - "required": [ - "pathToFile" - ] - }, - "workload": { - "type": "object", - "properties": { - "pathToFile": { - "type": "string" - }, - "type": { - "type": "string" - } - }, - "required": [ - "pathToFile", - "type" - ] - }, - "allocationPolicy": { - "type": "object", - "properties": { - "policyType": { - "type": "string" - } - }, - "required": [ - "policyType" - ] - } - }, - "properties": { - "name": { - "type": "string" - }, - "topologies": { - "type": "array", - "items": { - "$ref": "#/$defs/topology" - }, - "minItems": 1 - }, - "workloads": { - "type": "array", - "items": { - "$ref": "#/$defs/workload" - }, - "minItems": 1 - }, - "allocationPolicies": { - "type": "array", - "items": { - "$ref": "#/$defs/allocationPolicy" - }, - "minItems": 1 - }, - "runs": { - "type": "integer" - } - }, - "required": [ - "topologies", - "workloads", - "allocationPolicies" - ] -} -``` diff --git a/site/docs/documentation/Input/Topology.md b/site/docs/documentation/Input/Topology.md index cf726616..0d2479bd 100644 --- a/site/docs/documentation/Input/Topology.md +++ b/site/docs/documentation/Input/Topology.md @@ -2,6 +2,11 @@ The topology of a datacenter is defined using a JSON file. A topology consist of Each cluster consist of at least one host on which jobs can be executed. Each host consist of one or more CPUs, a memory unit and a power model. +:::info Code +The code related to reading and processing topology files can be found [here](https://github.com/atlarge-research/opendc/tree/master/opendc-compute/opendc-compute-topology/src/main/kotlin/org/opendc/compute/topology) +::: + + ## Schema The schema for the topology file is provided in [schema](TopologySchema). @@ -17,12 +22,12 @@ In the following section, we describe the different components of the schema. ### Host -| variable | type | required? 
| default | description |
-|------------|-----------------------|-----------|---------|--------------------------------------------------------------------------------|
-| name | string | no | Host | The name of the host. This is only important for debugging and post-processing |
-| count | integer | no | 1 | The amount of hosts of this type are in the cluster |
-| cpuModel | [CPU](#cpuModel) | yes | N/A | The CPUs in the host |
-| memory | [Memory](#memory) | yes | N/A | The memory used by the host |
+| variable    | type                        | required? | default | description                                                                    |
+|-------------|-----------------------------|-----------|---------|--------------------------------------------------------------------------------|
+| name        | string                      | no        | Host    | The name of the host. This is only important for debugging and post-processing |
+| count       | integer                     | no        | 1       | The number of hosts of this type in the cluster                                |
+| cpuModel    | [CPU](#cpu)                 | yes       | N/A     | The CPUs in the host                                                           |
+| memory      | [Memory](#memory)           | yes       | N/A     | The memory used by the host                                                    |
 | power model | [Power Model](#power-model) | yes | N/A | The power model used to determine the power draw of the host |

 ### CPU

@@ -49,12 +54,13 @@ In the following section, we describe the different components of the schema.

 ### Power Model

-| variable  | type    | Unit | required? | default | description                                                                |
-|-----------|---------|------|-----------|---------|----------------------------------------------------------------------------|
-| modelType | string  | N/A  | yes       | N/A     | The type of model used to determine power draw                             |
-| power     | string  | Watt | no        | 400     | The constant power draw when using the 'constant' power model type in Watt |
-| maxPower  | string  | Watt | yes       | N/A     | The power draw of a host when using max capacity in Watt                   |
-| idlePower | integer | Watt | yes       | N/A     | The power draw of a host when idle in Watt                                 |
+| variable        | type   | Unit | required? 
| default  | description                                                                   |
+|-----------------|--------|------|-----------|----------|-------------------------------------------------------------------------------|
+| vendor          | string | N/A  | yes       | N/A      | The vendor of the power source                                                |
+| modelName       | string | N/A  | yes       | N/A      | The model name of the power source                                            |
+| arch            | string | N/A  | yes       | N/A      | The architecture of the power source                                          |
+| totalPower      | Int64  | Watt | no        | max long | The maximum power the power source can deliver in Watt                        |
+| carbonTracePath | string | N/A  | no        | null     | Path to a carbon intensity trace. If not given, carbon intensity is always 0. |

 ## Examples

@@ -71,12 +77,11 @@ The simplest data center that can be provided to OpenDC is shown below:

 {
   "hosts": [
     {
-      "cpus": [
-        {
-          "coreCount": 16,
-          "coreSpeed": 1000
-        }
-      ],
+      "cpu":
+      {
+        "coreCount": 16,
+        "coreSpeed": 1000
+      },
       "memory": {
         "memorySize": 100000
       }
@@ -87,7 +92,7 @@ The simplest data center that can be provided to OpenDC is shown below:
 }
 }
 ```

-This is creates a data center with a single cluster containing a single host. This host consist of a single 16 core CPU
+This creates a data center with a single cluster containing a single host. This host consists of a single 16-core CPU
 with a speed of 1 GHz, and 100 MiB of RAM.
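As a quick sanity check, the single-host cluster object shown above can be parsed with any JSON library and its total CPU capacity computed. The sketch below is illustrative only (the `total_capacity_mhz` helper is not part of the OpenDC API) and assumes the cluster shape from the example:

```python
import json

# The single-host cluster from the simplest-topology example above.
cluster = json.loads("""
{
    "hosts": [
        {
            "cpu": { "coreCount": 16, "coreSpeed": 1000 },
            "memory": { "memorySize": 100000 }
        }
    ]
}
""")

def total_capacity_mhz(cluster):
    """Sum coreCount * coreSpeed over all hosts, honouring the optional "count" fields."""
    total = 0
    for host in cluster["hosts"]:
        cpu = host["cpu"]
        total += host.get("count", 1) * cpu.get("count", 1) * cpu["coreCount"] * cpu["coreSpeed"]
    return total

print(total_capacity_mhz(cluster))  # 16 cores * 1000 MHz = 16000
```

The same helper also accounts for the "count" keyword introduced in the next section, since duplicated hosts and CPUs simply multiply the capacity.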
### Count @@ -102,14 +107,14 @@ Duplicating clusters, hosts, or CPUs is easy using the "count" keyword: "hosts": [ { "count": 5, - "cpus": [ - { - "coreCount": 16, - "coreSpeed": 1000, - "count": 10 - } - ], - "memory": { + "cpu": + { + "coreCount": 16, + "coreSpeed": 1000, + "count": 10 + }, + "memory": + { "memorySize": 100000 } } @@ -205,7 +210,7 @@ Aside from using number to indicate values it is also possible to define values "modelType": "linear", "power": "400 Watts", "maxPower": "1 KW", - "idlePower": "0.4W" + "idlePower": "0.4 W" } } ] diff --git a/site/docs/documentation/Input/Traces.md b/site/docs/documentation/Input/Traces.md deleted file mode 100644 index ec5782cb..00000000 --- a/site/docs/documentation/Input/Traces.md +++ /dev/null @@ -1,26 +0,0 @@ -### Traces -OpenDC works with two types of traces that describe the servers that need to be run. Both traces have to be provided as -parquet files. - -#### Meta -The meta trace provides an overview of the servers: - -| Metric | Datatype | Unit | Summary | -|--------------|------------|----------|--------------------------------------------------| -| id | string | | The id of the server | -| start_time | datetime64 | datetime | The submission time of the server | -| stop_time | datetime64 | datetime | The finish time of the submission | -| cpu_count | int32 | count | The number of CPUs required to run this server | -| cpu_capacity | float64 | MHz | The amount of CPU required to run this server | -| mem_capacity | int64 | MB | The amount of memory required to run this server | - -#### Trace -The Trace file provides information about the computational demand of each server over time: - -| Metric | Datatype | Unit | Summary | -|-----------|------------|---------------|---------------------------------------------| -| id | string | | The id of the server | -| timestamp | datetime64 | datetime | The timestamp of the sample | -| duration | int64 | milli seconds | The duration since the last sample | -| cpu_count 
| int32 | count | The number of cpus required |
-| cpu_usage | float64 | MHz | The amount of computational power required. |
diff --git a/site/docs/documentation/Input/Workload.md b/site/docs/documentation/Input/Workload.md
new file mode 100644
index 00000000..5f2e61ae
--- /dev/null
+++ b/site/docs/documentation/Input/Workload.md
@@ -0,0 +1,24 @@
+OpenDC works with two types of traces that describe the tasks that need to be run. Both traces have to be provided as
+parquet files.
+
+#### Task
+The Task file provides an overview of the tasks:
+
+| Metric          | Datatype | Unit         | Summary                                        |
+|-----------------|----------|--------------|------------------------------------------------|
+| id              | string   |              | The id of the task                             |
+| submission_time | int64    | datetime     | The submission time of the task                |
+| duration        | int64    | milliseconds | The total duration of the task                 |
+| cpu_count       | int32    | count        | The number of CPUs required to run this task   |
+| cpu_capacity    | float64  | MHz          | The amount of CPU required to run this task    |
+| mem_capacity    | int64    | MB           | The amount of memory required to run this task |
+
+#### Fragment
+The Fragment file provides information about the computational demand of each task over time:
+
+| Metric    | Datatype | Unit         | Summary                                    |
+|-----------|----------|--------------|--------------------------------------------|
+| id        | string   |              | The id of the task                         |
+| duration  | int64    | milliseconds | The duration of this fragment              |
+| cpu_count | int32    | count        | The number of CPUs required                |
+| cpu_usage | float64  | MHz          | The amount of computational power required |
-- cgit v1.2.3
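To make the Task and Fragment schemas above concrete, the sketch below builds a tiny in-memory workload and checks that the fragments are consistent with their task. Real traces are parquet files (typically written with a library such as pyarrow, not shown here); plain Python dicts are used only to illustrate the columns, and the consistency rules in `validate` are illustrative assumptions rather than checks enforced by OpenDC:

```python
# One Task record and its Fragment records, mirroring the two schemas above.
task = {
    "id": "task-0",
    "submission_time": 0,       # int64 timestamp
    "duration": 300_000,        # int64, milliseconds
    "cpu_count": 2,             # int32
    "cpu_capacity": 2000.0,     # float64, MHz
    "mem_capacity": 4096,       # int64, MB
}

fragments = [
    {"id": "task-0", "duration": 150_000, "cpu_count": 2, "cpu_usage": 1500.0},
    {"id": "task-0", "duration": 150_000, "cpu_count": 1, "cpu_usage": 500.0},
]

def validate(task, fragments):
    """Illustrative consistency checks between a Task row and its Fragment rows."""
    assert all(f["id"] == task["id"] for f in fragments)
    # The fragments together should cover the task's total duration.
    assert sum(f["duration"] for f in fragments) == task["duration"]
    # No fragment should demand more CPUs or MHz than the task reserves.
    assert all(f["cpu_count"] <= task["cpu_count"] for f in fragments)
    assert all(f["cpu_usage"] <= task["cpu_capacity"] for f in fragments)
    return True

print(validate(task, fragments))  # True
```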