summaryrefslogtreecommitdiff
path: root/site/docs
diff options
context:
space:
mode:
authorFabian Mastenbroek <mail.fabianm@gmail.com>2022-09-08 12:10:06 +0200
committerFabian Mastenbroek <mail.fabianm@gmail.com>2022-09-13 16:15:59 +0200
commit7bec1a7ab414f45fda516dd5b5e48bbe623a50f4 (patch)
tree927eee97038a4bdda26bb9a89f673a51eae17b22 /site/docs
parente3abcf3ebc677a86fbd2c1d2da7241ac6a52595e (diff)
feat(site): Add tutorials to OpenDC website
This change adds a new section to the OpenDC website containing tutorials to experiment with the OpenDC web application.
Diffstat (limited to 'site/docs')
-rw-r--r--site/docs/tutorials/_category_.json9
-rw-r--r--site/docs/tutorials/cloud-capacity-planning.mdx276
-rw-r--r--site/docs/tutorials/img/cpu-usage.pngbin0 -> 126367 bytes
-rw-r--r--site/docs/tutorials/img/resource-distribution.pngbin0 -> 13884 bytes
4 files changed, 285 insertions, 0 deletions
diff --git a/site/docs/tutorials/_category_.json b/site/docs/tutorials/_category_.json
new file mode 100644
index 00000000..5d3c1ca0
--- /dev/null
+++ b/site/docs/tutorials/_category_.json
@@ -0,0 +1,9 @@
+{
+ "label": "Tutorials",
+ "position": 3,
+ "link": {
+ "type": "generated-index",
+ "description": "Tutorials demonstrating how to conduct experiments with OpenDC."
+
+ }
+}
diff --git a/site/docs/tutorials/cloud-capacity-planning.mdx b/site/docs/tutorials/cloud-capacity-planning.mdx
new file mode 100644
index 00000000..a55c6a20
--- /dev/null
+++ b/site/docs/tutorials/cloud-capacity-planning.mdx
@@ -0,0 +1,276 @@
+---
+sidebar_position: 1
+title: Cloud Capacity Planning
+hide_title: true
+sidebar_label: Cloud Capacity Planning
+---
+
+# Cloud Capacity Planning Tutorial
+
+Using OpenDC to plan and design cloud datacenters.
+
+:::info Learning goal
+
+By doing this assignment, you will learn more about the basic concepts of datacenters and cloud computing systems, as
+well as how such systems are planned and designed.
+:::
+
+## Preamble
+
+Datacenter infrastructure is important in today’s digital society. Stakeholders across industry, government, and
+academia employ a vast and diverse array of cloud services hosted by datacenter infrastructure, and expect services to
+be reliable, high speed, and low cost. In turn, datacenter operators must maintain efficient operation at unprecedented
+scale.
+
+To keep up with growing demand and increasing complexity, architects of datacenters must address complex challenges in
+distributed systems, software engineering and performance engineering. One of these challenges is efficient utilization
+of resources in datacenters, which is only 6-12% industry-wide despite the fact that it is inconvenient for datacenter
+operators to keep much of their infrastructure idle, due to resulting high energy consumption and thus unnecessary
+costs.
+
+It is often quite difficult to implement optimizations or other changes in datacenters. Datacenter operators tend to be
+conservative in adopting such changes in fear of failure or misbehaving systems. Furthermore, testing changes at the
+scale of modern datacenter infrastructure in a real-world setting is prohibitively expensive and hard to reproduce,
+notwithstanding environmental concerns.
+
+A more viable alternative is the use of datacenter simulators such as OpenDC or CloudSim. These tools model datacenter
+infrastructure at a good accuracy and allow us to test changes in a controllable and repeatable environment.
+
+In this tutorial, we will use the OpenDC datacenter simulator to experiment with datacenters and demonstrate the
+process of designing and optimizing datacenters using simulation.
+
+## What is OpenDC
+
+OpenDC is an open source platform for datacenter simulation developed by AtLarge Research. The purpose of OpenDC is
+twofold: we aim to both enable cloud computing education and support research into datacenters.
+An example of the former is this tutorial, and examples of the latter include the numerous BSc and MSc research projects
+that are using OpenDC to run experiments and perform research.
+
+:::caution
+
+OpenDC is still an experimental tool. Your data may get lost, overwritten, or otherwise become unavailable. Sorry for
+the inconvenience.
+
+:::
+
+If you are not familiar with the OpenDC web interface, please follow the [Getting Started](/docs/category/getting-started)
+guide to get an understanding of the main concepts of OpenDC and how to design a datacenter.
+
+:::note Action
+
+Set up a project on the OpenDC website.
+
+:::
+
+
+## Assignment
+
+Acme Inc. is a small datacenter operator in the Netherlands. They are currently in the process of acquiring a new client
+and closing a deal where Acme will migrate all of the client’s business-critical workloads from internal machines to
+Acme’s datacenters. With this deal, the client aims to outsource the maintenance of their digital infrastructure, but in
+turn expects reliable and efficient operation from Acme.
+
+To demonstrate that Acme is capable of this task, it has started a pilot project with the client where Acme will migrate
+already a small subset of the client’s workloads. You are an engineer at Acme. and have been tasked with the design and
+procurement of the datacenter infrastructure required for this pilot project.
+
+To guide your design, the client has provided a workload trace of their business-critical workloads, which consist of
+the historical runtime behavior of 50 virtual machines over time. These virtual machines differ in resource
+requirements (e.g. number of vCPUs or memory) and in resource consumption over time. We can use OpenDC to simulate this
+workload trace and validate your datacenter design.
+
+The assignment is divided into four parts:
+1. Analyzing the requirements to estimate what resources are needed.
+2. Building your design in OpenDC
+3. Validating your design in OpenDC
+4. Optimizing your design in OpenDC
+
+Make notes of your thoughts on the following assignments & questions and discuss with your partner(s).
+
+## Analyze the Requirements
+
+The first step of the assignment is to analyze the requirements of the client in order to come up with a reasonable
+estimation of the datacenter infrastructure needed. This estimation will become our initial design which we will build
+and validate in OpenDC.
+
+Since the client has provided a workload trace representative of the workload that will eventually be running in the
+datacenter, we can use it to guide our design. In [Figure 1](#resource-distribution), the requested memory and vCPUs are
+depicted for the virtual machines in the workload trace.
+
+:::note Action
+
+Determine the total amount of vCPUs and memory required in the trace.
+
+:::
+
+<figure className="figure" id="resource-distribution">
+ <img src={require("./img/resource-distribution.png").default} alt="Resource requirements for the workload" />
+ <figcaption>
+ Requested number of vCPUs and memory (in GB) by the
+ virtual machines in the workload. The left figure shows the number of virtual machines that have requested 1, 2, 4 or 8
+ vCPUs. The right figure shows the amount of memory requested compared to the number of vCPUs in the virtual machine.
+ </figcaption>
+</figure>
+
+Based on this information, we could choose to purchase a new machine for every virtual machine in the workload trace.
+Such a design will most certainly be able to handle the workload. At the same time, it is much more expensive and
+probably unnecessary.
+
+In [Figure 2](#cpu-usage), the CPU Usage (in MHz) of the virtual machines in the workload is depicted over time. Observe that the
+median CPU usage of the virtual machines over the whole trace is approximately 100 MHz. This means that a 2-core
+processor with a base clock 3500 MHz would have utilization of only 1.4% (`100 MHz / (3500 MHz x 2)`) for such a median
+workload.
+
+<figure className="figure" id="cpu-usage">
+ <img src={require("./img/cpu-usage.png").default} alt="CPU usage over time for the workload" />
+ <figcaption>CPU Usage of the virtual machines in the workload over time.</figcaption>
+</figure>
+
+Instead, we could try to fit multiple virtual machines onto a single machine. For instance, the 2-core processor
+mentioned before is able to handle 70 virtual machines, each running at 100 MHz (`(3500 MHz x 2) / 100 MHz`), ignoring
+virtualization overhead and memory requirements.
+
+:::note Action
+
+Make a rough estimate of the number of physical cores required to host the vCPUs in the workload trace.
+
+:::
+
+Now that we have an indication of the number of physical cores we need to have, we can start to compose the servers in
+our datacenter. See **Table 1 and 2** for the equipment list you can choose from. Don’t forget to put enough memory in your
+servers, or otherwise you risk that not all virtual machines will fit on the servers in your datacenter.
+
+| Processor | Intel® Xeon® E-2224G | Intel® Xeon® E-2244G | Intel® Xeon® E-2246G |
+|----------------------------------|----------------------|----------------------|----------------------|
+| Base clock (in MHz) | 3500 | 3800 | 3600 |
+| Core count | 4 | 8 | 12 |
+| Average power consumption (in W) | 71 | 71 | 80 |
+
+**Table 1:** Processor options for your datacenter
+
+
+| Memory module | Crucial MTA9ASF2G72PZ-3G2E | Crucial MTA18ASF4G72PDZ-3G2E1 |
+|----------------------------------|----------------------------|-------------------------------|
+| Size (in GB) | 16 | 32 |
+| Speed (in MHZ) | 3200 | 3200 |
+**Table 2:** Memory options for your datacenter
+
+
+:::note Action
+
+Create a plan detailing the servers you want to have in your datacenter and what resources (e.g. processor or memory)
+they should contain. For instance, such a plan could look like:
+
+1. 8x Server (2x Intel® Xeon® E-2244G, 4x Crucial MTA18ASF4G72PDZ-3G2E1)
+
+:::
+
+:::tip Hint
+
+Budget more capacity than your initial estimates to prevent your datacenter from running at a very high
+utilization. Think about how your datacenter would handle a machine failure, will you still have enough capacity left?
+
+:::
+
+## Build the datacenter
+
+Based on the plan we devised in the previous section, we will now construct a (virtual) datacenter in OpenDC. If you
+have not yet used the OpenDC web interface to design a datacenter, please read [Getting Started](/docs/category/getting-started)
+guide to get an understanding of the main concepts of OpenDC and how to design a datacenter.
+
+:::note Action
+
+Implement your plan in the OpenDC web interface.
+
+:::
+
+## Validate your design
+
+We are now at a stage where we can validate whether the datacenter we have just designed and built in OpenDC is suitable
+for the workload of the client. We will use OpenDC to simulate the workload in our datacenter and keep track of several
+metrics to ensure efficient and reliable operation.
+
+One of our concerns is that our datacenter does not have enough computing power to deal with the client’s
+business-critical workload, leading to degraded performance and consequently an unhappy client.
+
+A metric that gives us an insight in performance degradation is the Overcommitted CPU Cycles, which represents the
+number of CPU cycles that a virtual machine wanted to run, but could not due to the host machine not having enough
+computing capacity at that moment. To keep track of this metric during simulation, we create a new portfolio by clicking
+the ‘+’ next to “Portfolio” in the left sidebar and select the metrics of interest.
+
+:::note Action
+
+Add a new portfolio and select at least the following targets:
+1. Overcommitted CPU Cycles
+2. Granted CPU Cycles
+3. Requested CPU Cycles
+4. Maximum Number VMs Finished
+
+:::
+
+We will now try to simulate the client’s workload trace (called _Bitbrains (Sample)_ in OpenDC). By clicking on ‘New
+Scenario’ below your created portfolio, we can create a base scenario which will represent our baseline datacenter
+design which we will compare against future improvements.
+
+:::note Action
+
+Add a base scenario to your new portfolio and select as trace _Bitbrains (Sample)_.
+
+:::
+
+By creating a new scenario, you will schedule a simulation of your datacenter design that will run on one of the OpenDC
+simulation servers. Press the Play button next to your portfolio to see the results of the simulations. If you have
+chosen the _Bitbrains (Sample)_ trace, the results should usually appear within one minute or less depending on the queue
+size. In case they do not appear within a reasonable timeframe, please contact the instructors.
+
+You can now see how your design has performed. Check whether all virtual machines have finished and whether the
+_Overcommitted CPU Cycles_ metric is not too high. Try to aim for anything below 1 bn cycles. In the next section, we’ll
+try to further optimize our design. For now, think of an explanation for the performance of your design.
+
+## Optimize your design
+
+Finally, let’s try to optimize your design so that it meets the requirements of the client and is beneficial for your
+employer as well. In particular, your company is interested in the follow goals:
+
+1. Reducing _Overcommitted CPU Cycles_ to a minimum for reliability.
+2. Reducing _Total Power Consumption_ to a minimum to save energy costs.
+
+:::note Action
+
+Add a new portfolio and select at least the following targets:
+1. Overcommitted CPU Cycles
+2. Granted CPU Cycles
+3. Requested CPU Cycles
+4. Total Power Consumption
+
+Then, add a base scenario to your new portfolio and select as trace _Bitbrains (Sample)_.
+
+:::
+
+Try to think of ways in which you can reduce both _Overcommitted CPU Cycles_ and _Total Power Consumption_. Create a new
+topology based on your initial topology and apply the changes you have come up with. In this way, you can easily compare
+the performance of different topologies in different scenarios. Note that the choice of scheduler might also influence
+your results.
+
+:::tip Hint
+
+The choice of scheduler (and thus the placement of VMs) might also influence your results.
+
+:::
+
+
+:::note Action
+
+1. Create a new topology based on your existing topology.
+2. Add a new scenario to your created portfolio and select your newly created topology.
+3. Compare the results against the base scenario.
+
+:::
+
+Repeat this approach until you are satisfied with your design.
+
+## Epilogue
+
+In this tutorial, you should have learned briefly about what datacenters are, and the process of designing and
+optimizing a datacenter yourself. If you have any feedback (positive or negative) about your experience using OpenDC
+during this tutorial, please let us know!
diff --git a/site/docs/tutorials/img/cpu-usage.png b/site/docs/tutorials/img/cpu-usage.png
new file mode 100644
index 00000000..86955b6a
--- /dev/null
+++ b/site/docs/tutorials/img/cpu-usage.png
Binary files differ
diff --git a/site/docs/tutorials/img/resource-distribution.png b/site/docs/tutorials/img/resource-distribution.png
new file mode 100644
index 00000000..b371a07a
--- /dev/null
+++ b/site/docs/tutorials/img/resource-distribution.png
Binary files differ