From 960b3d8a13c67ac4b7f479d5764b0b618fc9ea09 Mon Sep 17 00:00:00 2001
From: Dante Niewenhuis <d.niewenhuis@hotmail.com>
Date: Tue, 5 Mar 2024 16:50:35 +0100
Subject: Cpu fix (#208)

* Updated the topology format to JSON. Updated TopologyReader.kt to handle JSON files. Added documentation for the new format.

* applied spotless kotlin

* small update

* Updated for spotless apply

* Updated for spotless apply
---
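Note on the topology format documented in Input/Topology.md and Input/TopologySchema.md:
the sketch below checks only the fields the schema lists as required (clusters, hosts,
cpus with coreCount and coreSpeed, memory with memorySize). The function name
check_topology is hypothetical and is not part of OpenDC or TopologyReader.kt; a full
JSON schema validator would cover more than this.

```python
def check_topology(topology):
    """Return a list of problems; an empty list means all required fields are present.

    Assumes the top-level value, clusters, hosts and CPUs are JSON objects (dicts).
    """
    clusters = topology.get("clusters")
    if not isinstance(clusters, list) or not clusters:
        return ["'clusters' must be a non-empty list"]
    errors = []
    for ci, cluster in enumerate(clusters):
        hosts = cluster.get("hosts")
        if not isinstance(hosts, list) or not hosts:
            errors.append(f"clusters[{ci}]: 'hosts' must be a non-empty list")
            continue
        for hi, host in enumerate(hosts):
            where = f"clusters[{ci}].hosts[{hi}]"
            cpus = host.get("cpus")
            if not isinstance(cpus, list) or not cpus:
                errors.append(f"{where}: 'cpus' must be a non-empty list")
            else:
                for pi, cpu in enumerate(cpus):
                    for key in ("coreCount", "coreSpeed"):
                        if key not in cpu:
                            errors.append(f"{where}.cpus[{pi}]: missing '{key}'")
            memory = host.get("memory")
            if not isinstance(memory, dict) or "memorySize" not in memory:
                errors.append(f"{where}: missing 'memory.memorySize'")
    return errors


# The "Simple" example from Topology.md, and a deliberately incomplete variant.
simple = {"clusters": [{"hosts": [{"cpus": [{"coreCount": 16, "coreSpeed": 1000}],
                                   "memory": {"memorySize": 100000}}]}]}
broken = {"clusters": [{"hosts": [{"cpus": [{"coreCount": 16}]}]}]}

print(check_topology(simple))
print(check_topology(broken))
```
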
 site/docs/documentation/Input.md                |  42 ------
 site/docs/documentation/Input/Topology.md       | 184 ++++++++++++++++++++++++
 site/docs/documentation/Input/TopologySchema.md | 164 +++++++++++++++++++++
 site/docs/documentation/Input/Traces.md         |  26 ++++
 site/docs/documentation/Input/_category_.json   |   7 +
 5 files changed, 381 insertions(+), 42 deletions(-)
 delete mode 100644 site/docs/documentation/Input.md
 create mode 100644 site/docs/documentation/Input/Topology.md
 create mode 100644 site/docs/documentation/Input/TopologySchema.md
 create mode 100644 site/docs/documentation/Input/Traces.md
 create mode 100644 site/docs/documentation/Input/_category_.json

(limited to 'site')
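
Note on the "count" keyword documented in Input/Topology.md: counts multiply down the
hierarchy (clusters, then hosts, then CPUs). The sketch below works through that
arithmetic for the "Count" example. The helper name summarize is hypothetical and is
not part of OpenDC; it assumes a missing "count" means 1, as the documented defaults state.

```python
import json


def summarize(topology):
    """Expand the "count" fields and return totals for the whole data center."""
    totals = {"clusters": 0, "hosts": 0, "cpus": 0, "cores": 0}
    for cluster in topology["clusters"]:
        n_clusters = cluster.get("count", 1)
        totals["clusters"] += n_clusters
        for host in cluster["hosts"]:
            # Every copy of the cluster contains every copy of the host.
            n_hosts = n_clusters * host.get("count", 1)
            totals["hosts"] += n_hosts
            for cpu in host["cpus"]:
                n_cpus = n_hosts * cpu.get("count", 1)
                totals["cpus"] += n_cpus
                totals["cores"] += n_cpus * cpu["coreCount"]
    return totals


# The "Count" example from Topology.md.
count_example = json.loads("""
{
    "clusters": [
        {
            "count": 2,
            "hosts": [
                {
                    "count": 5,
                    "cpus": [{"coreCount": 16, "coreSpeed": 1000, "count": 10}],
                    "memory": {"memorySize": 100000}
                }
            ]
        }
    ]
}
""")

summary = summarize(count_example)
print(summary)
```

For the "Count" example this gives 2 clusters, 10 hosts, 100 CPUs, and 1600 cores.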

diff --git a/site/docs/documentation/Input.md b/site/docs/documentation/Input.md
deleted file mode 100644
index 8ea89936..00000000
--- a/site/docs/documentation/Input.md
+++ /dev/null
@@ -1,42 +0,0 @@
-
-OpenDC requires three files to run an experiment. First is the topology of the data center that will be simulated. 
-Second, is a meta trace providing an overview of the servers that need to be executed. Third is the trace describing the 
-computational demand of each job over time. 
-
-### Topology
-The topology of a datacenter is described by a csv file. Each row in the csv is a cluster 
-of in the data center. Below is an example of a topology file consisting of three clusters:
-
-| ClusterID | ClusterName | Cores | Speed | Memory | numberOfHosts | memoryCapacityPerHost | coreCountPerHost |
-|-----------|-------------|-------|-------|--------|---------------|-----------------------|------------------|
-| A01       | A01         | 32    | 3.2   | 2048   | 1             | 256                   | 32               |
-| B01       | B01         | 48    | 2.93  | 1256   | 6             | 64                    | 8                |
-| C01       | C01         | 32    | 3.2   | 2048   | 2             | 128                   | 16               |
-
-
-### Traces
-OpenDC works with two types of traces that describe the servers that need to be run. Both traces have to be provided as 
-parquet files.
-
-#### Meta
-The meta trace provides an overview of the servers:
-
-| Metric       | Datatype   | Unit     | Summary                                          |
-|--------------|------------|----------|--------------------------------------------------|
-| id           | string     |          | The id of the server                             |
-| start_time   | datetime64 | datetime | The submission time of the server                |
-| stop_time    | datetime64 | datetime | The finish time of the submission                |
-| cpu_count    | int32      | count    | The number of CPUs required to run this server   |
-| cpu_capacity | float64    | MHz      | The amount of CPU required to run this server    |
-| mem_capacity | int64      | MB       | The amount of memory required to run this server |
-
-#### Trace
-The Trace file provides information about the computational demand of each server over time:
-
-| Metric    | Datatype   | Unit          | Summary                                     |
-|-----------|------------|---------------|---------------------------------------------|
-| id        | string     |               | The id of the server                        |
-| timestamp | datetime64 | datetime      | The timestamp of the sample                 |
-| duration  | int64      | milli seconds | The duration since the last sample          |
-| cpu_count | int32      | count         | The number of cpus required                 |
-| cpu_usage | float64    | MHz           | The amount of computational power required. |
diff --git a/site/docs/documentation/Input/Topology.md b/site/docs/documentation/Input/Topology.md
new file mode 100644
index 00000000..e5419078
--- /dev/null
+++ b/site/docs/documentation/Input/Topology.md
@@ -0,0 +1,184 @@
+The topology of a datacenter is defined using a JSON file. A topology consists of one or more clusters.
+Each cluster consists of at least one host on which jobs can be executed. Each host consists of one or more CPUs, a memory unit, and a power model.
+
+## Schema
+The schema for the topology file is provided in [schema](TopologySchema). 
+In the following section, we describe the different components of the schema.
+
+### Cluster
+
+| variable | type                | required? | default | description                                                                       |
+|----------|---------------------|-----------|---------|-----------------------------------------------------------------------------------|
+| name     | string              | no        | Cluster | The name of the cluster. This is only important for debugging and post-processing |
+| count    | integer             | no        | 1       | The number of clusters of this type in the data center                            |
+| hosts    | List[[Host](#host)] | yes       | N/A     | A list of the hosts in a cluster.                                                 |
+
+### Host
+
+| variable    | type                        | required? | default | description                                                                    |
+|-------------|-----------------------------|-----------|---------|--------------------------------------------------------------------------------|
+| name        | string                      | no        | Host    | The name of the host. This is only important for debugging and post-processing |
+| count       | integer                     | no        | 1       | The number of hosts of this type in the cluster                                |
+| cpus        | List[[CPU](#cpu)]           | yes       | N/A     | A list of the CPUs in the host.                                                |
+| memory      | [Memory](#memory)           | yes       | N/A     | The memory used by the host                                                    |
+| powerModel  | [Power Model](#power-model) | no        | N/A     | The power model used to determine the power draw of the host                   |
+
+### CPU
+
+| variable  | type    | Unit  | required? | default | description                                      |
+|-----------|---------|-------|-----------|---------|--------------------------------------------------|
+| modelName | string  | N/A   | no        | unknown | The name of the CPU.                             |
+| vendor    | string  | N/A   | no        | unknown | The vendor of the CPU                            |
+| arch      | string  | N/A   | no        | unknown | The micro-architecture of the CPU                |
+| count     | integer | N/A   | no        | 1       | The number of CPUs of this type used by the host |
+| coreCount | integer | count | yes       | N/A     | The number of cores in the CPU                   |
+| coreSpeed | Double  | MHz   | yes       | N/A     | The speed of each core in MHz                    |
+
+### Memory
+
+| variable    | type    | Unit | required? | default | description                                                               |
+|-------------|---------|------|-----------|---------|---------------------------------------------------------------------------|
+| modelName   | string  | N/A  | no        | unknown | The name of the memory unit                                               |
+| vendor      | string  | N/A  | no        | unknown | The vendor of the memory unit                                             |
+| arch        | string  | N/A  | no        | unknown | The architecture of the memory unit                                       |
+| count       | integer | N/A  | no        | 1       | The number of memory units of this type used by the host                  |
+| memorySize  | integer | Byte | yes       | N/A     | The size of the memory in bytes                                           |
+| memorySpeed | Double  | MHz  | no        | -1      | The speed of the memory in MHz. PLACEHOLDER: this currently does nothing. |
+
+### Power Model
+
+| variable  | type    | Unit | required? | default | description                                                                |
+|-----------|---------|------|-----------|---------|----------------------------------------------------------------------------|
+| modelType | string  | N/A  | yes       | N/A     | The type of model used to determine power draw                             |
+| power     | Double  | Watt | no        | 400     | The constant power draw when using the 'constant' power model type in Watt |
+| maxPower  | Double  | Watt | yes       | N/A     | The power draw of a host when using max capacity in Watt                   |
+| idlePower | Double  | Watt | yes       | N/A     | The power draw of a host when idle in Watt                                 |
+
+
+## Examples
+In the following section, we discuss several examples of topology files. Any topology file can be verified using the 
+JSON schema defined in [schema](TopologySchema).
+
+### Simple
+
+The simplest data center that can be provided to OpenDC is shown below:
+```json
+{
+    "clusters":
+    [
+        {
+            "hosts" :
+            [
+                {
+                    "cpus":
+                    [
+                        {
+                            "coreCount": 16,
+                            "coreSpeed": 1000
+                        }
+                    ],
+                    "memory": {
+                        "memorySize": 100000
+                    }
+                }
+            ]
+        }
+    ]
+}
+```
+
+This creates a data center with a single cluster containing a single host. This host consists of a single 16-core CPU
+with a speed of 1 GHz (1000 MHz) and 100000 bytes (100 kB) of memory.
+
+### Count
+Duplicating clusters, hosts, or CPUs is easy using the "count" keyword:
+```json
+{
+    "clusters":
+    [
+        {
+            "count": 2,
+            "hosts" : 
+            [
+                {
+                    "count": 5,
+                    "cpus":
+                    [
+                        {
+                            "coreCount": 16,
+                            "coreSpeed": 1000,
+                            "count": 10
+                        }
+                    ],
+                    "memory": {
+                        "memorySize": 100000
+                    }
+                }
+            ]
+        }
+    ]
+}
+```
+This topology creates a datacenter consisting of 2 clusters, each containing 5 hosts. Each host contains ten 16-core CPUs.
+Using "count" saves a lot of copying.
+
+### Complex
+The following is an example of a more complex topology:
+
+```json
+{
+    "clusters":
+    [
+        {
+            "name": "C01",
+            "count": 2,
+            "hosts" :
+            [
+                {
+                    "name": "H01",
+                    "count": 2,
+                    "cpus":
+                    [
+                        {
+                            "coreCount": 16,
+                            "coreSpeed": 1000
+                        }
+                    ],
+                    "memory": {
+                        "memorySize": 1000000
+                    },
+                    "powerModel":
+                    {
+                        "modelType": "linear",
+                        "idlePower": 200.0,
+                        "maxPower": 400.0
+                    }
+                },
+                {
+                    "name": "H02",
+                    "count": 2,
+                    "cpus":
+                    [
+                        {
+                            "coreCount": 8,
+                            "coreSpeed": 3000
+                        }
+                    ],
+                    "memory": {
+                        "memorySize": 100000
+                    },
+                    "powerModel":
+                    {
+                        "modelType": "square",
+                        "idlePower": 300.0,
+                        "maxPower": 500.0
+                    }
+                }
+            ]
+        }
+    ]
+}
+```
+
+This topology defines two types of hosts that differ in coreCount, coreSpeed, memorySize, and power model.
+Each host type is created twice per cluster, and the cluster itself is created twice.
diff --git a/site/docs/documentation/Input/TopologySchema.md b/site/docs/documentation/Input/TopologySchema.md
new file mode 100644
index 00000000..9f8f7575
--- /dev/null
+++ b/site/docs/documentation/Input/TopologySchema.md
@@ -0,0 +1,164 @@
+Below is the schema for the Topology JSON file. This schema can be used to validate a topology file.
+A topology file can be validated using a JSON schema validator, such as https://www.jsonschemavalidator.net/.
+
+```json
+{
+  "$schema": "OpenDC/Topology",
+  "$defs": {
+    "cpu": {
+      "description": "definition of a cpu",
+      "type": "object",
+      "properties": {
+        "vendor": {
+          "type": "string",
+          "default": "unknown"
+        },
+        "modelName": {
+          "type": "string",
+          "default": "unknown"
+        },
+        "arch": {
+          "type": "string",
+          "default": "unknown"
+        },
+        "coreCount": {
+          "type": "integer"
+        },
+        "coreSpeed": {
+          "description": "The core speed of the cpu in MHz",
+          "type": "number"
+        },
+        "count": {
+          "description": "The number of CPUs of this type present in the host",
+          "type": "integer"
+        }
+      },
+      "required": [
+        "coreCount",
+        "coreSpeed"
+      ]
+    },
+    "memory": {
+      "type": "object",
+      "properties": {
+        "vendor": {
+          "type": "string",
+          "default": "unknown"
+        },
+        "modelName": {
+          "type": "string",
+          "default": "unknown"
+        },
+        "arch": {
+          "type": "string",
+          "default": "unknown"
+        },
+        "memorySize": {
+          "description": "The size of the memory in bytes",
+          "type": "integer"
+        },
+        "memorySpeed": {
+          "description": "The speed of the memory in MHz. Note: currently, this does nothing",
+          "type": "number",
+          "default": -1
+        }
+      },
+      "required": [
+        "memorySize"
+      ]
+    },
+    "powerModel": {
+      "type": "object",
+      "properties": {
+        "modelType": {
+          "description": "The type of model used to determine power draw",
+          "type": "string"
+        },
+        "power": {
+          "description": "The constant power draw when using the 'constant' power model type in Watt",
+          "type": "number",
+          "default": 400
+        },
+        "maxPower": {
+          "description": "The power draw of a host when using max capacity in Watt",
+          "type": "number"
+        },
+        "idlePower": {
+          "description": "The power draw of a host when idle in Watt",
+          "type": "number"
+        }
+      },
+      "required": [
+        "modelType",
+        "maxPower",
+        "idlePower"
+      ]
+    },
+    "host": {
+      "type": "object",
+      "properties": {
+        "name": {
+          "type": "string",
+          "default": "Host"
+        },
+        "count": {
+          "description": "The number of hosts of this type present in the cluster",
+          "type": "integer",
+          "default": 1
+        },
+        "cpus": {
+          "type": "array",
+          "items": {
+            "$ref": "#/$defs/cpu"
+          },
+          "minItems": 1
+        },
+        "memory": {
+          "$ref": "#/$defs/memory"
+        }
+      },
+      "required": [
+        "cpus",
+        "memory"
+      ]
+    },
+    "cluster": {
+      "type": "object",
+      "properties": {
+        "name": {
+          "type": "string",
+          "default": "Cluster"
+        },
+        "count": {
+          "description": "The number of clusters of this type present in the data center",
+          "type": "integer",
+          "default": 1
+        },
+        "hosts": {
+          "type": "array",
+          "items": {
+            "$ref": "#/$defs/host"
+          },
+          "minItems": 1
+        }
+      },
+      "required": [
+        "hosts"
+      ]
+    }
+  },
+  "properties": {
+    "clusters": {
+      "description": "Clusters present in the data center",
+      "type": "array",
+      "items": {
+        "$ref": "#/$defs/cluster"
+      },
+      "minItems": 1
+    }
+  },
+  "required": [
+    "clusters"
+  ]
+}
+```
diff --git a/site/docs/documentation/Input/Traces.md b/site/docs/documentation/Input/Traces.md
new file mode 100644
index 00000000..ec5782cb
--- /dev/null
+++ b/site/docs/documentation/Input/Traces.md
@@ -0,0 +1,26 @@
+### Traces
+OpenDC works with two types of traces that describe the servers that need to be run. Both traces have to be provided as
+parquet files.
+
+#### Meta
+The meta trace provides an overview of the servers:
+
+| Metric       | Datatype   | Unit     | Summary                                          |
+|--------------|------------|----------|--------------------------------------------------|
+| id           | string     |          | The id of the server                             |
+| start_time   | datetime64 | datetime | The submission time of the server                |
+| stop_time    | datetime64 | datetime | The finish time of the server                    |
+| cpu_count    | int32      | count    | The number of CPUs required to run this server   |
+| cpu_capacity | float64    | MHz      | The amount of CPU required to run this server    |
+| mem_capacity | int64      | MB       | The amount of memory required to run this server |
+
+#### Trace
+The Trace file provides information about the computational demand of each server over time:
+
+| Metric    | Datatype   | Unit          | Summary                                     |
+|-----------|------------|---------------|---------------------------------------------|
+| id        | string     |               | The id of the server                        |
+| timestamp | datetime64 | datetime      | The timestamp of the sample                 |
+| duration  | int64      | milliseconds  | The duration since the last sample          |
+| cpu_count | int32      | count         | The number of CPUs required                 |
+| cpu_usage | float64    | MHz           | The amount of computational power required. |
diff --git a/site/docs/documentation/Input/_category_.json b/site/docs/documentation/Input/_category_.json
new file mode 100644
index 00000000..e433770c
--- /dev/null
+++ b/site/docs/documentation/Input/_category_.json
@@ -0,0 +1,7 @@
+{
+    "label": "Input",
+    "position": 1,
+    "link": {
+        "type": "generated-index"
+    }
+}
-- 
cgit v1.2.3