feat: added some notes

author: mjkwiatkowski <mati.rewa@gmail.com> 2026-06-01 14:15:37 +0200
committer: mjkwiatkowski <mati.rewa@gmail.com> 2026-06-01 14:15:37 +0200
commit: f771af4e69db4b8937f64fbf4024eb518a7cc230 (patch)
tree: 320be41b46d66def4f27ad7611e1ebfd51aa3386 /notes
parent: 91ca6b9411a1675f5735a86ac658833dc78cc382 (diff)
10 files changed, 205 insertions, 0 deletions
diff --git a/notes/1.txt b/notes/1.txt
new file mode 100644
index 0000000..4a11803
--- /dev/null
+++ b/notes/1.txt
@@ -0,0 +1,69 @@
+==13-05-2026==
+Introduction:
+Make a compelling story about the adventure ahead of you and the problem you set out to solve.
+You can think of it as a classical novel.
+Create a protagonist: he encounters an issue and overcomes it.
+Except it's just more technical.
+
+Status update template:
+Since last meeting, I changed ___ in the thesis text.
+I changed ___ in the artifact / experiments.
+My next concrete deliverable is ___.
+My main blocker is ___.
+
+Try to find good references in the CompSys manifesto.
+This should contain ``good'' references.
+Paralysis analysis.
+You should write your thesis at the same time as you code.
+DO NOT leave the thesis as the last part, after all coding is done.
+Do both in parallel.
+
+The `gap` in the presentation slides is `...and this has not yet been done before.` or `we are the first to do...` etc.
+`Gap` => nobody did this before (or a knowledge gap).
+E.g., a system does not exist. The key is that it is missing worldwide from the scientific community.
+Nobody has done it yet => part of a `scientific` project.
+Why is my project scientific? Because nobody did it yet.
+
+How to back up the claim that something does not exist?
+You cannot cite work that does not exist.
+You cannot cite a non-existing thing, because it's not there.
+Typically, you instead show how existing work falls short.
+1/ State the problem, why it's important.
+2/ Refer to recent and impactful work on datacenter simulation or digital twinning.
+2a/ Write in a few sentences that this other work did `xyz` and say why `xyz` is NOT ENOUGH to advance the problem you introduced earlier.
+You can just say that `this is missing from their work, or I think that this is missing`.
+Now YOU need to make a claim, `this is true`, or `nobody did this`.
+Making such a claim is bold and YOU could be wrong, but make it anyway. Then cite the related work, make the claim `they don't do x`, and go to your supervisor and tell him `in my mind this is the key issue with my thesis, do you agree with this? Is there any work I should have cited? Am I misinterpreting anything those other projects did?` Bring this up in a conversation with your supervisor.
+
+The answer to each research question is one of your main contributions.
+They are the main way the reader can understand what you have done.
+System design => contribution to question on ``How to design a ...?''
+Each contribution => a section in your thesis. Core content of the thesis. 3-4 sections.
+Each section corresponds directly to a research question.
+
+Make a skeleton of the thesis first. Very important!
+This way you can plan your own work much better => do this.
+Map the thesis before writing text.
+Put the skeleton in the shared folder.
+
+Each RQ should be enumerated. You want every question to be not just a nice isolated question, but to come with a bit of context below: 1) describe why it's important 2) say why it's challenging 3) say what makes it scientific.
+
+You can take whatever structure you want from any report; this is not plagiarism and no declaration is needed.
+
+==20-05-2026==
+In your background section, you should explain datacenters and datacenter simulation.
+The background is NOT an extensive discussion of related work. It is NOT that.
+It gives the necessary context for the rest of the thesis.
+
+You can include a figure in the introduction from a different paper.
+You can adapt it from a different paper.
+Do not copy figures directly.
+
+Background:
+a) Concept A
+b) Concept B
+c) Merge A + B, why we need both
+
+2.1 Datacenter
+Part of 2.1, or separate as 2.2 if large: Failures
+Part 2.2/2.3 Digital Twins
diff --git a/notes/20260513_133444.png b/notes/20260513_133444.png
new file mode 100644
index 0000000..ed2124d
--- /dev/null
+++ b/notes/20260513_133444.png
diff --git a/notes/20260513_135756.png b/notes/20260513_135756.png
new file mode 100644
index 0000000..a0729d8
--- /dev/null
+++ b/notes/20260513_135756.png
diff --git a/notes/20260513_140254.png b/notes/20260513_140254.png
new file mode 100644
index 0000000..3ebc53a
--- /dev/null
+++ b/notes/20260513_140254.png
diff --git a/notes/20260513_140457.png b/notes/20260513_140457.png
new file mode 100644
index 0000000..15f29c1
--- /dev/null
+++ b/notes/20260513_140457.png
diff --git a/notes/auto-scaling.txt b/notes/auto-scaling.txt
new file mode 100644
index 0000000..a6b009f
--- /dev/null
+++ b/notes/auto-scaling.txt
@@ -0,0 +1,43 @@
+==AutoScaling== 
+To get it to run --> look at how failures are implemented. 
+The moment your pool of usable hosts becomes too tight, you activate new ones. 
+1) I want to implement a mechanism that allows me to make hosts inactive and activate them again. 
+2) During the workload the operator should be able to close and open the workloads.
+i) then you build a policy as to when you activate new hosts and when you want to stop hosts. 
+ii) have many policies to do this 
+iii) add booting time --> cold start. A host becomes available but it still takes more time to make it workable because you need to boot everything. 
+iv) cold (inactive) hosts use zero or very minimal power --> idle power of a node that is turned off. Inactive hosts have minimal power draw, but the downside is that nodes may not be available immediately.
+Inactive hosts take more time to boot, e.g., 5 minutes.
+The better you can predict how many nodes you will need, the more power you can save without impacting performance. 
+
+The bachelor focus -- the primary focus should be on the design of the digital twin.
+Be careful not to make the implementation the main topic of the bachelor thesis. 
+The main topic should still be digital twinning. 
+
+The digital twin is activated in 2 ways: either the operators prompt the digital twin when they run a workload or the digital twin sends a notification to the datacenter. 
+
+IBM Dublin := you have metrics being managed and policies can decide to do actions. 
+And these actions can be many different things. 
+Currently it does not matter what the action would be -- this could be auto-scaling, scheduling, routing.
+What is the difference between routing and scheduling? 
+
+Great success == we do not include AI inference in the bachelor thesis! 
+Policy decisions can be made using different heuristics --> AI inference. 
+Do not work too much on the policies. 
+
+The focus of the thesis is the digital twinning part. 
+
+Autoscaling vs. failures vs. scheduling. 
+MVP soon.
+
+Why can't we port a digital twin from other domains?
+Show that a digital twin can react. 
+
+Show the readers what the perfect experiment would look like. 
+
+What a digital twin is --> give the answer in the background somehow. 
+What do you think a digital twin is? Make it clear.
+Why is a specific datacenter digital twin different from what already exists? 
+
+Versen Thesis Awards: they are promoted at ICTO today and tomorrow. 
+
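Editorial sketch (not part of the committed notes): the auto-scaling notes above describe activating hosts when the usable pool becomes too tight, deactivating them again, and a cold-start delay before a newly activated host is usable. A minimal Python illustration of one such threshold policy might look as follows. Every name here (`HostState`, `Host`, `ThresholdAutoscaler`, `min_spare`) is hypothetical and is not OpenDC API; the 300-second boot time only mirrors the "e.g., 5 minutes" figure in the notes, and the sketch assumes the caller knows how many hosts are currently busy.

```python
from dataclasses import dataclass
from enum import Enum


class HostState(Enum):
    INACTIVE = "inactive"  # powered down, minimal power draw
    BOOTING = "booting"    # cold start in progress, not yet usable
    ACTIVE = "active"      # usable by the scheduler


@dataclass
class Host:
    name: str
    state: HostState = HostState.INACTIVE
    boot_remaining_s: int = 0


class ThresholdAutoscaler:
    """Keeps `min_spare` hosts free on top of the busy ones (hypothetical policy)."""

    def __init__(self, hosts, boot_time_s=300, min_spare=1):
        self.hosts = hosts
        self.boot_time_s = boot_time_s  # assumed cold-start duration
        self.min_spare = min_spare

    def count(self, state):
        return sum(1 for h in self.hosts if h.state is state)

    def step(self, busy_hosts, dt_s):
        # 1) Advance cold starts that are already in progress.
        for h in self.hosts:
            if h.state is HostState.BOOTING:
                h.boot_remaining_s -= dt_s
                if h.boot_remaining_s <= 0:
                    h.state = HostState.ACTIVE
                    h.boot_remaining_s = 0

        active = self.count(HostState.ACTIVE)
        provisioned = active + self.count(HostState.BOOTING)

        if provisioned - busy_hosts < self.min_spare:
            # 2) Pool too tight: start booting inactive hosts.
            needed = self.min_spare - (provisioned - busy_hosts)
            for h in self.hosts:
                if needed == 0:
                    break
                if h.state is HostState.INACTIVE:
                    h.state = HostState.BOOTING
                    h.boot_remaining_s = self.boot_time_s
                    needed -= 1
        elif active - busy_hosts > self.min_spare:
            # 3) Pool too loose: power down surplus hosts.
            # Assumption: hosts at the end of the list are the idle ones.
            surplus = active - busy_hosts - self.min_spare
            for h in reversed(self.hosts):
                if surplus == 0:
                    break
                if h.state is HostState.ACTIVE:
                    h.state = HostState.INACTIVE
                    surplus -= 1
```

Calling `step(busy_hosts, dt_s)` once per simulated time step makes the cold-start cost visible: a host requested now counts as capacity only after `boot_time_s` has elapsed, which is why better demand prediction lets the policy keep fewer hosts powered on.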
diff --git a/notes/dante.txt b/notes/dante.txt
new file mode 100644
index 0000000..6c55eb3
--- /dev/null
+++ b/notes/dante.txt
@@ -0,0 +1,20 @@
+Create a model in draw.io
+Look at OpenTelemetry (read up - is this a lot of work?)
+https://github.com/atlarge-research/opendc/tree/radice-paper
+Make sure the fields you specify in the schema itself are automatically exported.
+Measure Kafka latency of exporting. 
+Also check whether the user wants to export to a database or not.
+Add multiple export functions.
+Make sure you can specify the config files on the command line.
+The prediction should be about auto-scaling.
+BUT -> there is no auto-scaling.
+Do auto-scaling.
+Idle power takes a lot of energy. 
+Predicting when to turn nodes on and off would be nice.
+Datacenters are heavily underutilized.
+Predict when to turn the hosts on and when to turn them off.
+Look at the failure models and how they work in OpenDC - this is how I stop a host, and this is how I start a host back up. 
+With auto-scaling you can do it a bit smarter or not.
+Do auto-scaling in OpenDC.
+Rescheduling.
+
diff --git a/notes/meeting.txt b/notes/meeting.txt
new file mode 100644
index 0000000..9a544db
--- /dev/null
+++ b/notes/meeting.txt
@@ -0,0 +1,41 @@
+Finding experiments or standard operations that might utilize simulation a bit is not enough.
+Look at the idea of cascading failures.
+A single failure can propagate.
+It makes it difficult to simulate completely.
+Why is it difficult to use naive simulation with failures?
+And then your thesis proposal is that a digital twin would help with these failures.
+The use case that you are specifically looking at is failures.
+Then of course you need to introduce failures.
+You are not focusing enough on digital twinning.
+You should focus more on this than predictive analytics.
+Digital twinning is the key -- argumentation and whatnot, not yet faults or predictive analysis.
+
+
+Opendc-web-server.
+All the interesting endpoints are defined in `rests/resources`
+Do not use Javalin, use Quarkus, because we use Quarkus in the web module.
+To find the documentation and the web module, open `localhost:8080/q/dev-ui/extensions` or `localhost:8080/q/dev`.
+
+The API models everything that can be done.
+We need to bridge the experiment runner to the API.
+Some small fixes to the API have to be done.
+The API has some schemas defined in the schemas.
+
+Read the documentation of Quarkus to add your own endpoints.
+Add your endpoints here: `opendc-web/opendc-web-server/src/main/java/org/opendc/web/server/rest`
+What gets stored in the database:
+- aggregate results are stored there.
+- detailed results are in the development tree (the thing that you did with PostgreSQL).
+Have a look at the website graphs of OpenDC.
+
+Failure traces are needed to demonstrate failures.
+
+Have a look at Jure's experiments and the cost impact of failures as this might be nice to show in your own work.
+Failures tap well into what he is doing.
+The degradation model of CPU.
+Shows how much money is being lost due to failures.
+
+Daniel cannot model some experiments in the web module.
+He would not get results.
+There are two different runners for the web module that are different from the generic `ExperimentRunner`.
+
diff --git a/notes/notes.txt b/notes/notes.txt
new file mode 100644
index 0000000..d5394cf
--- /dev/null
+++ b/notes/notes.txt
@@ -0,0 +1,32 @@
+The presentation is in practice 30% of the grade. 
+The actual grading goes on during the presentation. 
+Alex will only ask questions during the presentation.
+You are getting a proof from the university that you are going to graduate. 
+The moment you get the information that your thesis went right, you get the document from the university saying that you are going to graduate.
+
+Alex assumes that you are going to do an 18EC honors project. 
+Text Alex again often.
+You do a continuation of the bachelor thesis as an honors project. 
+Ana did spatial shifting as an honours thesis, and then on top of that she combined spatial and temporal shifting on top of her bachelor thesis.
+
+Dante's vision on the structure:  
+1. Introduction, background -- do a literature survey. 
+    * you should add besides the model of digital twinning. 
+2. Design, how do you design 
+3. Experiments,
+    * add the methodology of running experiments. 
+    * using the Datacenter simulator as a "real" Datacenter is in itself really interesting. 
+4. For a normal bachelor thesis you show the effect -- the failures.
+5. Digital twin scheduling vs. Scheduling. 
+--BACHELOR thesis ends here--
+--HP Project here--
+Here we do the same thing with failures AND auto-scaling. 
+We add auto-scaling as the extra component. 
+We are looking at how this component interacts with all the other components. 
+Not just an experiment, but a bit more of an extra layer. 
+-- HP project adding -- 
+Adding an extra layer of complexity and checking how it acts together with the rest of the system.
+
+Write down: these are the base things we want to do for sure, and then make a list of the things we'd like as extras. 
+A use case can be specific metrics you want to measure or actions you want to take (akin to auto-scaling). 
+
diff --git a/notes/vu_thesis_template_advice.pdf b/notes/vu_thesis_template_advice.pdf
new file mode 100644
index 0000000..8eb63fe
--- /dev/null
+++ b/notes/vu_thesis_template_advice.pdf