From f771af4e69db4b8937f64fbf4024eb518a7cc230 Mon Sep 17 00:00:00 2001
From: mjkwiatkowski
Date: Mon, 1 Jun 2026 14:15:37 +0200
Subject: feat: added some notes

---
 notes/1.txt | 69 ++++++++++++++++++++++++++++++++++++
 notes/20260513_133444.png | Bin 0 -> 365862 bytes
 notes/20260513_135756.png | Bin 0 -> 371318 bytes
 notes/20260513_140254.png | Bin 0 -> 328776 bytes
 notes/20260513_140457.png | Bin 0 -> 142702 bytes
 notes/auto-scaling.txt | 43 ++++++++++++++++++++++
 notes/dante.txt | 20 +++++++++++
 notes/meeting.txt | 41 +++++++++++++++++++++
 notes/notes.txt | 32 +++++++++++++++++
 notes/vu_thesis_template_advice.pdf | Bin 0 -> 384126 bytes
 10 files changed, 205 insertions(+)
 create mode 100644 notes/1.txt
 create mode 100644 notes/20260513_133444.png
 create mode 100644 notes/20260513_135756.png
 create mode 100644 notes/20260513_140254.png
 create mode 100644 notes/20260513_140457.png
 create mode 100644 notes/auto-scaling.txt
 create mode 100644 notes/dante.txt
 create mode 100644 notes/meeting.txt
 create mode 100644 notes/notes.txt
 create mode 100644 notes/vu_thesis_template_advice.pdf

(limited to 'notes')

diff --git a/notes/1.txt b/notes/1.txt
new file mode 100644
index 0000000..4a11803
--- /dev/null
+++ b/notes/1.txt
@@ -0,0 +1,69 @@
+==13-05-2026==
+Introduction:
+Tell a compelling story about the adventure ahead of you and the problem you set out to solve.
+You can think of it as a classical novel.
+There is a protagonist who encounters an issue and overcomes it.
+Except it's just more technical.
+
+Status update template:
+Since last meeting, I changed ___ in the thesis text.
+I changed ___ in the artifact / experiments.
+My next concrete deliverable is ___.
+My main blocker is ___.
+
+Try to find good references in the CompSys manifesto.
+This should contain ``good'' references.
+Paralysis analysis.
+You should write your thesis at the same time as your coding.
+DO NOT leave the thesis as the last part, after all coding is done.
+Do both in parallel.
+
+The `gap` in the presentation slides is `...and this has not yet been done before.` or `we are the first to do...` etc.
+`Gap` => nobody did this before (or a knowledge gap).
+E.g., a system does not exist. The key is that it is missing worldwide from the scientific community.
+Nobody has done it yet => part of a `scientific` project.
+Why is my project scientific? Because nobody did it yet.
+
+How to back up the claim that something does not exist?
+You cannot cite work that does not exist.
+You cannot cite a non-existent thing, because it's not there.
+Typically, you show instead how existing work falls short.
+1/ State the problem, why it's important.
+2/ Refer to recent and impactful work on datacenter simulation or digital twinning.
+2a/ Write in a few sentences that this other work did `xyz` and say why `xyz` is NOT ENOUGH to advance the problem you introduced earlier.
+You can just say that `this is missing from their work, or I think that this is missing`.
+Now YOU need to make a claim, `this is true`, or `nobody did this`.
+Making such a claim is bold and YOU could be wrong, but STILL make that claim: cite the related work, make the claim `they don't do x`, and then go to your supervisor and tell him `in my mind this is the key issue with my thesis, do you agree with this? Is there any work I should have cited? Am I misinterpreting anything those other projects did?` Bring this up in a conversation with your supervisor.
+
+The answer to each research question is one of your main contributions.
+They are the main way the reader can understand what you have done.
+System design => contribution to question on ``How to design a ...?''
+Each contribution => a section in your thesis. Core content of the thesis. 3-4 sections.
+Each section corresponds directly to a research question.
+
+Make a skeleton of the thesis first. Very important!
+This way you can plan your own work much better => do this.
+Map the thesis before writing text.
+Put the skeleton in the shared folder.
+
+Each RQ should be enumerated. You want every question to be not just a nice isolated question, but add a bit of context below: 1) describe why it's important 2) say why it's challenging 3) say what makes it scientific.
+
+You can take whatever structure you want from any report, no plagiarism nor declarations needed.
+
+==20-05-2026==
+Your background section should explain datacenters and datacenter simulations.
+The background is NOT an extensive discussion of related work. It is NOT that.
+It gives the necessary context for the rest of the thesis.
+
+You can include a figure in the introduction from a different paper.
+You can adapt it from a different paper.
+Do not copy figures directly.
+
+Background:
+a) Concept A
+b) Concept B
+c) Merge A + B, why we need both
+
+2.1 Datacenter
+Part of 2.1, or separate as 2.2 if large: Failures
+Part 2.2/2.3 Digital Twins
diff --git a/notes/20260513_133444.png b/notes/20260513_133444.png
new file mode 100644
index 0000000..ed2124d
Binary files /dev/null and b/notes/20260513_133444.png differ
diff --git a/notes/20260513_135756.png b/notes/20260513_135756.png
new file mode 100644
index 0000000..a0729d8
Binary files /dev/null and b/notes/20260513_135756.png differ
diff --git a/notes/20260513_140254.png b/notes/20260513_140254.png
new file mode 100644
index 0000000..3ebc53a
Binary files /dev/null and b/notes/20260513_140254.png differ
diff --git a/notes/20260513_140457.png b/notes/20260513_140457.png
new file mode 100644
index 0000000..15f29c1
Binary files /dev/null and b/notes/20260513_140457.png differ
diff --git a/notes/auto-scaling.txt b/notes/auto-scaling.txt
new file mode 100644
index 0000000..a6b009f
--- /dev/null
+++ b/notes/auto-scaling.txt
@@ -0,0 +1,43 @@
+==AutoScaling==
+To get it to run --> look at how failures are implemented.
+The moment your pool of usable hosts becomes too tight, you activate new ones.
+1) I want to implement a mechanism that allows me to make hosts inactive and activate them again.
+2) During the workload the operator should be able to close and open the workloads.
+i) then you build a policy as to when you activate new hosts and when you want to stop hosts.
+ii) have many policies to do this
+iii) add booting time --> cold start. A host becomes available, but it still takes more time to make it workable because you need to boot everything.
+iv) cold starts use zero power or very minimal --> idle power of a node that is turned off. Inactive hosts have minimal power draw, but the downside is that nodes may not be available immediately.
+Inactive hosts take more time to boot, e.g., 5 minutes.
+The better you can predict how many nodes you will need, the more power you can save without impacting performance.
+
+The bachelor focus -- the primary focus should be on the design of the digital twin.
+Be careful that you do not make implementation the main topic of the bachelor thesis.
+The main topic should still be digital twinning.
+
+The digital twin is activated in 2 ways: either the operators prompt the digital twin when they run a workload or the digital twin sends a notification to the datacenter.
+
+IBM Dublin := you have metrics being managed and policies can decide to do actions.
+And these actions can be many different things.
+Currently it does not matter what it would be -- this could be auto-scaling, scheduling, routing.
+What is the difference between routing and scheduling?
+
+Great success == we do not include AI inference in the bachelor thesis!
+Policy decisions can be made using different heuristics --> AI inference.
+Do not work too much on the policies.
+
+The focus of the thesis is the digital twinning part.
+
+Autoscaling vs. failures vs. scheduling.
+MVP soon.
+
+Why can't we port a digital twin from other domains?
+Show that a digital twin can react.
+
+Show the readers what the perfect experiment would look like in the perfect.
+
+What a digital twin is --> give the answer in the background somehow.
+What do you think a digital twin is? Make it clear.
+Why is a specific datacenter digital twin different from what there already is?
+
+Versen Thesis Awards: they are promoted at ICTO today and tomorrow.
+
diff --git a/notes/dante.txt b/notes/dante.txt
new file mode 100644
index 0000000..6c55eb3
--- /dev/null
+++ b/notes/dante.txt
@@ -0,0 +1,20 @@
+Create a model in draw.io
+Look at OpenTelemetry (read up - is this a lot of work?)
+https://github.com/atlarge-research/opendc/tree/radice-paper
+Make sure the fields you specify in the schema itself are automatically exported.
+Measure Kafka latency of exporting.
+Also check whether the user wants to export to a database or not.
+Add multiple export functions.
+Make sure to specify the config files on the command line.
+The prediction should be about auto-scaling.
+BUT -> there is no auto-scaling.
+Do auto-scaling.
+Idle power takes a lot of energy.
+Predicting when to turn nodes on and off would be nice.
+Datacenters are heavily underutilized.
+Predict when to turn the hosts on and when to turn them off.
+Look at the failure models and how they work in OpenDC - this is how I stop a host, and this is how I start a host back up.
+With auto-scaling you can do it a bit smarter or not.
+Do auto-scaling in OpenDC.
+Rescheduling.
+
diff --git a/notes/meeting.txt b/notes/meeting.txt
new file mode 100644
index 0000000..9a544db
--- /dev/null
+++ b/notes/meeting.txt
@@ -0,0 +1,41 @@
+Finding experiments or standard operations that might utilize simulation a bit is not enough.
+Look at the idea of cascading failures.
+A single failure can propagate.
+It makes it difficult to simulate completely.
+Why is it difficult to use naive simulation with failures?
+And then your thesis proposal is that a digital twin would help with these failures.
+The use case that you are specifically looking at is failures.
+Then of course you need to introduce failures.
+You are not focusing enough on digital twinning.
+You should focus more on this than predictive analytics.
+Digital twinning is the key -- argumentation and whatnot, not yet faults or predictive analysis.
+
+
+Opendc-web-server.
+All the interesting endpoints are defined in `rests/resources`
+Do not use Javalin, use Quarkus, because we use Quarkus in the web module.
+To find the documentation and the web module, go to `localhost:8080/q/dev-ui/extensions` or `localhost:8080/q/dev`.
+
+The API models the API everything can do.
+We need to bridge the experiment runner to the API.
+Some small fixes to the API have to be done.
+The API has some schemas defined in the schemas.
+
+Read the documentation of Quarkus to add your own endpoints.
+Add your endpoints here: `opendc-web/opendc-web-server/src/main/java/org/opendc/web/server/rest`
+What gets stored in the database:
+- aggregate results are stored there.
+- detailed results are in the development tree (the thing that you did with PostgreSQL).
+Have a look at the website graphs of OpenDC.
+
+Failure traces are needed to demonstrate failures.
+
+Have a look at Jure's experiments and the cost impact of failures as this might be nice to show in your own work.
+Failures tap well into what he is doing.
+The degradation model of CPU.
+Shows how much money is being lost due to failures.
+
+Daniel cannot model some experiments in the web module.
+He would not get results.
+There are two different runners for the web module that are different from the generic `ExperimentRunner`.
+
diff --git a/notes/notes.txt b/notes/notes.txt
new file mode 100644
index 0000000..d5394cf
--- /dev/null
+++ b/notes/notes.txt
@@ -0,0 +1,32 @@
+The presentation is in practice 30% of the grade.
+The actual grading goes on during the presentation.
+Alex will only ask questions during the presentation.
+You get proof from the university that you are going to graduate.
+The moment you get the information that your thesis went right, you get the document from the university saying that you are going to graduate.
+
+Alex assumes that you are going to do an 18EC honors project.
+Text Alex again often.
+You do a continuation of the bachelor thesis as an honors project.
+Ana did spatial shifting as an honours thesis, and then on top of that she combined spatial and temporal shifting on top of her bachelor thesis.
+
+Dante's vision on the structure:
+1. Introduction, background -- do a literature survey.
+ * you should add besides the model of digital twinning.
+2. Design, how do you design
+3. Experiments,
+ * add the methodology of running experiments.
+ * using the Datacenter simulator as a "real" Datacenter is in itself really interesting.
+4. For a normal bachelor thesis you show the effect -- the failures.
+5. Digital twin scheduling vs. Scheduling.
+--BACHELOR thesis ends here--
+--HP Project here--
+Here we do the same thing with failures AND auto-scaling.
+We add auto-scaling as the extra component.
+We are looking at how this component interacts with all the other components.
+Not just an experiment but a bit more of an extra layer.
+-- HP project adding --
+Adding an extra layer of complexity and checking how it acts together with the rest of the systems.
+
+Write down: these are the base things we want to do for sure, and then make a list of the extras we'd like.
+A use case can be specific metrics you want to measure or actions you want to take (akin to auto-scaling).
+
diff --git a/notes/vu_thesis_template_advice.pdf b/notes/vu_thesis_template_advice.pdf
new file mode 100644
index 0000000..8eb63fe
Binary files /dev/null and b/notes/vu_thesis_template_advice.pdf differ
--
cgit v1.2.3