| Age | Commit message | Author |
|
(#342)
* renamed performance counter to distinguish different resource types
* added GPU, modelled similar to CPU
* added GPUs to machine model
* list of GPUs instead of single instance
* renamed memory speed to bandwidth
* enabled parsing of GPU resources
* split powermodel into cpu and GPU powermodel
* added gpu parsing tests
* added idea of host level scheduling
* added tests for multi gpu parsing
* renamed powermodel to cpupowermodel
* clarified naming of cpu and gpu components
* added resource type to flow supplier and edge
* added resourcetype
* added GPU components and resource type to fragments
* added GPU to workload and updated resource usage retrieval
* implemented first version of multi resource
* added name to workload
* renamed performance counters
* removed commented out code
* removed deprecated comments
* included demand and supply into calculations
* resolving rebase mismatches
* moved resource type from flowedge class to common package
* added available resources to machines
* cleaner separation of whether a workload is started on a SimMachine or a VM
* Replaced exception with dedicated enum
* Only looping over resources that are actually used
* using hashmaps keyed by resource type instead of arrays for readability (see the sketch after this list)
* fixed condition
* tracking finished workloads per resource type
* removed resource type from flowedge
* made supply and demand distribution resource specific
* added power model for GPU
* removed unused test setup
* removed deprecated comments
* removed unused parameter
* added ID for GPU
* added GPUs and GPU performance counters (naively)
* implemented capturing of GPU statistics
* added reminders for future implementations
* renamed properties for better identification
* added capturing GPU statistics
* implemented first tests for GPUs
* unified access to performance counters
* added interface for general compute resource handling
* implemented multi resource support in simmachine
* added individual edge to VM per resource
* extended compute resource interface
* implemented multi-resource support in PSU
* implemented generic retrieval of computeresources
* implemented multi-resource support in VM
* made method use more resource specific
* implemented simple GPU tests
* rolled back frequency and demand use
* made naming independent of used resource
* using the workload's resources instead of the VM's to determine available resources
* implemented determination of used resources in workload
* removed logging statements
* implemented reading from workload
* fixed naming for host-level allocation
* fixed next deadline calculation
* fixed forwarding supply
* reduced memory footprint
* made GPU powermodel nullable
* made GPU powermodel configurable in topology
* implemented tests for basic gpu scheduler
* added gpu properties
* implemented weights, filter and simple cpu-gpu scheduler
* spotless apply
* spotless apply pt. 2
* fixed capitalization
* spotless kotlin run
* implemented column export
* todo update
* removed code comments
* Merged PerformanceCounter classes into one & removed interface
* removed GPU specific powermodel
* Rebase master: kept both versions of TopologyFactories
* renamed CpuPowermodel to resource independent Powermodel
Moved it from Cpu package to power package
* implemented default of getResourceType & removed overrides where possible
* split getResourceType into Consumer and Supplier
* added power as resource type
* reduced supply demand from arrayList to single value
* combining GPUs into one large GPU, until full multi-gpu support
* merged distribution policy enum with corresponding factory
* added comment
* post-rebase fixes
* aligned naming
* Added GPU metrics to task output
* Updates power resource type to uppercase.
Standardizes the `ResourceType.Power` enum to `ResourceType.POWER`
for consistency with other resource types and improved readability.
* Removes deprecated test assertions
Removes commented-out assertions in GPU tests.
These assertions are no longer needed and clutter the test code.
* Renames MaxMinFairnessStrategy to MaxMinFairnessPolicy
Renames MaxMinFairnessStrategy to MaxMinFairnessPolicy for
clarity and consistency with naming conventions. This change
affects the factory and distributor to use the updated name.
* applies spotless
* nulls GPUs as they are not used
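For illustration, a minimal Kotlin sketch of the per-resource bookkeeping described above: resource types as an enum, supply and demand tracked in hashmaps keyed by resource type. The enum values and class shape are assumptions, not the actual OpenDC types.

```kotlin
// Hypothetical sketch: resource types as an enum, bookkeeping keyed by type.
enum class ResourceType { CPU, GPU, MEMORY, POWER }

class ResourceBookkeeping {
    // HashMaps keyed by resource type instead of positional arrays, for readability.
    private val supply = HashMap<ResourceType, Double>()
    private val demand = HashMap<ResourceType, Double>()

    fun setSupply(type: ResourceType, value: Double) { supply[type] = value }
    fun setDemand(type: ResourceType, value: Double) { demand[type] = value }

    // Only the resource types that are actually used show up here.
    fun usedResourceTypes(): Set<ResourceType> = demand.keys

    fun isSatisfied(type: ResourceType): Boolean =
        (supply[type] ?: 0.0) >= (demand[type] ?: 0.0)
}
```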
|
|
* Remove task from scheduler bookkeeping after failure
* Support carbon forecasting in timeshift
* Register scheduler and carbonmodel in context
* Preliminary working task stopping; carbon intensity bug
* Working carbon-based stop with two timeshift thresholds (see the sketch after this list)
* Add a pause state task and guest
* Move task stopper to allocation spec
* Start tracking num pauses
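A rough sketch of the two-threshold, carbon-based stop described above: pause above an upper carbon-intensity threshold, resume below a lower one. The class and parameter names are hypothetical.

```kotlin
// Hypothetical sketch: hysteresis between two carbon-intensity thresholds.
class CarbonStopPolicy(
    private val pauseAbove: Double,   // gCO2/kWh: pause running tasks above this
    private val resumeBelow: Double,  // gCO2/kWh: resume paused tasks below this
) {
    fun shouldBePaused(carbonIntensity: Double, currentlyPaused: Boolean): Boolean =
        if (currentlyPaused) carbonIntensity >= resumeBelow
        else carbonIntensity > pauseAbove
}
```

The gap between the two thresholds avoids rapid pause/resume flapping when the carbon intensity hovers around a single cut-off.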
|
|
|
|
* Start time shifting
* Existing experiments work with new columns
* Remove unused traces dir
* Update java to 21 LTS and jacoco to be compatible
* Minimal working timeshifting
* Timeshift scheduler linked as carbon receiver
* Add basic tests for timeshift scheduler
* Run spotless apply
* Modify trace format tests to support new fields
* Change all mentions of java 19 to 21
* Add a deferAll option to workload to make all tasks deferrable (see the sketch after this list)
* Run spotless apply
* Copy traces from resources in web dockerfile
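A sketch of what the deferAll option might do: a single workload-level flag marks every task as deferrable for the timeshift scheduler. Class and field names are illustrative only.

```kotlin
// Hypothetical sketch: a workload-level flag that makes every task deferrable.
data class TaskSpec(val name: String, val deferrable: Boolean = false)

data class WorkloadSketch(val tasks: List<TaskSpec>, val deferAll: Boolean = false) {
    fun effectiveTasks(): List<TaskSpec> =
        if (deferAll) tasks.map { it.copy(deferrable = true) } else tasks
}
```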
|
|
|
|
* Updated logging
* removed DoubleThresholdBatteryPolicy.java
|
|
* Added sampleFraction and submissionTime to the workloadSpec
* Removed commented code
|
|
new workload types are added (#294)
|
|
|
|
* Added power sources to OpenDC.
In the current form each Cluster has a single power source that is connected to all hosts in that cluster
* Ran spotless Kotlin and Java
|
|
* Updated tests
Changed all floats into doubles for consistency across the whole framework
Made a small update to the multiplexer to better push through supply and demand
Fixed small typo
Updated M3SA paths.
fixed merge conflicts
Removed unused components. Updated tests.
Improved checkpointing model
Improved model, started with SimPowerSource
implemented FailureModels and Checkpointing
First working version
midway commit
first update
All simulations are now run with a single CPU and a single MemoryUnit. Multiple CPUs are combined into one. This is for performance and explainability.
* Updated test memory
|
|
* Removed unused components. Updated tests.
Improved checkpointing model
Improved model, started with SimPowerSource
implemented FailureModels and Checkpointing
First working version
midway commit
first update
All simulations are now run with a single CPU and a single MemoryUnit. Multiple CPUs are combined into one. This is for performance and explainability.
* fixed merge conflicts
* Updated M3SA paths.
* Fixed small typo
|
|
CPUs are combined into one. This is for performance and explainability. (#255)
|
|
* Updated SimTrace to use a single ArrayDeque instead of three separate lists for deadline, cpuUsage, and coreCount (see the sketch after this list)
* Renamed input files to tasks.parquet and fragments.parquet. Renamed server to task. OpenDC now exports tasks.parquet instead of server.parquet
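An illustrative sketch of the single-deque idea: one queue of fragment objects replaces three parallel lists for deadline, cpuUsage, and coreCount. The class shape is an assumption.

```kotlin
import java.util.ArrayDeque

// Hypothetical sketch: one deque of fragments instead of three parallel lists.
data class TraceFragment(val deadline: Long, val cpuUsage: Double, val coreCount: Int)

class SimTraceSketch {
    private val fragments = ArrayDeque<TraceFragment>()

    fun add(deadline: Long, cpuUsage: Double, coreCount: Int) {
        fragments.add(TraceFragment(deadline, cpuUsage, coreCount))
    }

    // Deadline, usage, and core count stay together as fragments are consumed in order.
    fun next(): TraceFragment? = fragments.poll()
}
```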
|
|
|
|
into objects when the scenario is being executed by ScenarioRunner.kt (#227)
|
|
* Started with the carbon trace implementation
* Moved the carbon trace system to the proper folders
|
|
* Revamped the trace system. All TraceFormat files are now in the api module. This fixes some problems with certain types of traces not being usable
* applied spotless
|
|
* Initial commit
* Implemented a new system of defining and running scenarios / portfolios. Scenarios and Portfolios can now be defined using JSON files similar to topologies. This allows users to define experiments without changing any Kotlin code (see the sketch after this list).
* Ran spotlessApply
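As a hedged illustration of the JSON-defined scenarios, a scenario file could be mapped onto a small Kotlin model like the one below; the field names and the use of Jackson are assumptions, not the actual OpenDC schema or reader.

```kotlin
import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import com.fasterxml.jackson.module.kotlin.readValue
import java.io.File

// Hypothetical scenario model; the real OpenDC format likely differs.
data class ScenarioSpec(
    val name: String,
    val topology: String,      // path to a topology JSON file
    val workload: String,      // path to a workload trace
    val repetitions: Int = 1,
)

fun readScenario(path: String): ScenarioSpec =
    jacksonObjectMapper().readValue(File(path))
```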
|
|
* Updated the topology format to JSON. Updated TopologyReader.kt to handle JSON files. Added documentation for the new format.
* applied spotless kotlin
* small update
* Updated for spotless apply
* Updated for spotless apply
|
|
* Updated all package versions including kotlin. Updated all web-server tests to run.
* Changed the java version of the tests. OpenDC now only supports java 19.
* small update
* test update
* new update
* updated docker version to 19
* updated docker version to 19
|
|
* Updated metrics and parquet output
* fixed typos
|
|
* removed experiment-compute and integrated all components into opendc-compute
* updated workflow gradle file
* removed unneeded code
|
|
This change updates the CI pipeline so that Java 20 is being tested with
the latest Gradle RC, since Gradle 8.0 does not support it yet.
|
|
Docker Inc is sunsetting free team organizations for the Docker registry,
which our organization is one of. Instead, a paid subscription is now required
to maintain the organization.
Given our relatively small usage of the account, it makes more sense to start
publishing the container images on the GitHub Container Registry, since it is
free for open source projects and integrates well with GitHub Actions.
Fixes #141
|
|
This change replaces the use of `CoroutineContext` for passing the
`SimulationDispatcher` across the different modules of OpenDC by the
lightweight `Dispatcher` interface of the OpenDC common module.
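A minimal sketch of the pattern this describes: the dispatcher is passed explicitly as a small interface instead of being threaded through a CoroutineContext. The interface shown here is illustrative, not the actual OpenDC Dispatcher API.

```kotlin
// Hypothetical, simplified stand-in for a dispatcher abstraction.
interface SimpleDispatcher {
    val currentTime: Long
    fun schedule(delayMs: Long, action: () -> Unit)
}

// Modules receive the dispatcher as an explicit dependency rather than
// extracting it from a CoroutineContext.
class MetricsReporter(private val dispatcher: SimpleDispatcher) {
    fun start() {
        dispatcher.schedule(1_000) {
            println("tick at ${dispatcher.currentTime}")
        }
    }
}
```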
|
|
This change updates the `SimulationScheduler` class to implement the
`Dispatcher` interface from the OpenDC Common module, so that OpenDC
modules only need to depend on the common module for dispatching future
tasks (possibly in simulation).
|
|
This change adds the log4j-core dependency to various modules of OpenDC
using log4j2, to ensure logging keeps working. The upgrade to SLF4J 2.0 broke
the Log4j2 functionality, since the log4j-core artifact is not
automatically shipped with the SLF4J implementation.
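For illustration, the dependency declaration in a Gradle Kotlin DSL build file might look like this; the version numbers are placeholders, not the ones used by OpenDC.

```kotlin
// build.gradle.kts (sketch): keep log4j-core on the classpath explicitly,
// since the SLF4J 2.0 upgrade no longer brings it in transitively.
dependencies {
    runtimeOnly("org.apache.logging.log4j:log4j-core:2.20.0")        // placeholder version
    runtimeOnly("org.apache.logging.log4j:log4j-slf4j2-impl:2.20.0") // SLF4J 2 binding, placeholder
}
```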
|
|
This change re-implements the OpenDC compute simulator framework using
the new flow2 framework for modelling multi-edge flow networks. The
re-implementation is written in Java and focuses on performance and a
clean API surface.
|
|
This change fixes an issue with the OpenDC web runner where the default
job timeout was set to 10 ms instead of 10 minutes. For longer
simulations, this would cause the job to be terminated.
|
|
This change resolves an issue in the web runner where the finished VMs
would always be reported as zero.
|
|
This change updates the Quarkus-based web server to add support for
tracking and limiting the simulation minutes used by the user in order
to prevent misuse of shared resources.
|
|
This change updates the build configuration to use Spotless for code
formatting of both Kotlin and Java.
|
|
This change updates the repository to remove the use of wildcard imports
everywhere. Wildcard imports are not allowed by default by Ktlint as
well as Google's Java style guide.
|
|
This change renames the method `runBlockingSimulation` to
`runSimulation` to put more emphasis on the simulation part of the
method. The blocking part is not that important, but this behavior is
still described in the method documentation.
|
|
This change updates the implementation of `SimulationDispatcher` to use
a (possibly user-provided) `SimulationScheduler` for managing the
execution of the simulation and future tasks.
|
|
This change removes the Topology interface from the
`opendc-experiments-compute` module, which was meant for provisioning
the experimental topology. However, with the stateless `HostSpec`
class, it is no longer necessary to resolve the topology every time.
|
|
This change integrates the classes from the old
`opendc-compute-workload` module into the `opendc-experiments-compute`
module. This new module contains helper classes for setting up
experiments with the OpenDC compute service.
|
|
This change updates the OpenDC web runner to use the new
`opendc-experiments-base` module for setting up the experimental
environment and simulating the workload.
|
|
This change updates the interface of `ComputeService` to provide access
to the instances (servers) that have been registered with the compute
service. This allows metric collectors to query the metrics of the
servers that are currently running.
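A sketch of the kind of access this enables; the interface and property names are hypothetical, not the actual ComputeService API.

```kotlin
// Hypothetical sketch: the service exposes its registered instances so that
// metric collectors can poll them while the simulation runs.
interface ComputeServiceView {
    val instances: List<InstanceView>
}

interface InstanceView {
    val name: String
    val cpuUsage: Double
}

fun collectCpuUsage(service: ComputeServiceView): Map<String, Double> =
    service.instances.associate { it.name to it.cpuUsage }
```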
|
|
This change updates the `ComputeServiceHelper` class to provide the
failure model via a parameter to the `run` method instead of a
constructor parameter. This separates the construction of the topology from the
simulation of the workload.
|
|
This change updates the virtual machine performance interference model
so that the interference domain can be constructed independently of the
interference profile. As a consequence, the construction of the topology
now does not depend anymore on the interference profile.
|
|
This change moves the Random dependency outside the interference model,
to allow the interference model to be completely immutable and passable
between different simulations.
|
|
This change introduces a new interface `JobManager` that is responsible
for communicating with the backend about the available jobs and updating
their status when the runner is simulating a job. This manager can be
injected into the `OpenDCRunner` class and allows users to provide
different sources for the jobs, not only the current REST API.
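A hedged sketch of the shape such a manager could take; the method and type names below are illustrative, not the actual JobManager interface.

```kotlin
// Hypothetical sketch of a job-manager abstraction and how a runner might use it.
data class JobSketch(val id: String, val scenario: String)

interface JobManagerSketch {
    /** Claim the next available job, or null when there is nothing to simulate. */
    fun claimNext(): JobSketch?

    /** Report a status change for a job back to its source. */
    fun updateStatus(job: JobSketch, state: String)
}

// The runner depends only on the interface, so jobs can come from the REST API,
// a database, or an in-memory queue in tests.
class RunnerSketch(private val jobs: JobManagerSketch) {
    fun step() {
        val job = jobs.claimNext() ?: return
        jobs.updateStatus(job, "RUNNING")
        // ... simulate the job ...
        jobs.updateStatus(job, "FINISHED")
    }
}
```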
|
|
This change fixes an issue with the OpenDC web runner where it would
report NaN values for some of the metrics due to the topology being
empty. This in turn causes issues in the frontend.
|
|
This change updates the web runner implementation to gracefully exit the
current thread when interrupted.
|
|
This change updates the OpenDC web runner implementation to use the
correct context ClassLoader for simulation jobs running inside a
ForkJoinPool. By default, the ForkJoinPool will use the system class
loader which does not have access to the services needed by the web
runner.
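A sketch of the general technique involved: install the class loader that can see the runner's services on the worker thread for the duration of the job, then restore the previous one. The helper name is made up for illustration.

```kotlin
import java.util.concurrent.ForkJoinPool

// Sketch: run a job on the pool with a specific context class loader installed.
fun runWithClassLoader(pool: ForkJoinPool, loader: ClassLoader, job: () -> Unit) {
    pool.execute {
        val worker = Thread.currentThread()
        val previous = worker.contextClassLoader
        worker.contextClassLoader = loader
        try {
            job()
        } finally {
            worker.contextClassLoader = previous
        }
    }
}
```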
|
|
This change splits the command line interface from the OpenDC web runner
into a separate configuration. We plan to re-use the runner code for a Quarkus
extension that integrates the runner in development mode.
|
|
This change updates the Dockerfile for the web runner to reduce the
number of build steps necessary to build the web runner. Previously, the
build would also include/build the web API which is not used in the
image.
|
|
This change removes the OpenTelemetry integration from the OpenDC
Compute modules. Previously, we chose to integrate OpenTelemetry to
provide a unified way to report metrics to the users.
Although this worked as expected, the overhead of OpenTelemetry when
collecting metrics during simulation was considerable and left few
optimization opportunities (other than providing a separate API
implementation). Furthermore, since we were tied to OpenTelemetry's SDK
implementation, we experienced issues with throttling and registering
multiple instruments.
We will instead use another approach, where we expose the core metrics
in OpenDC via specialized interfaces (see the commits before) such that
access is fast and can be done without having to interface with
OpenTelemetry. In addition, we will provide an adapter that is able
to forward these metrics to OpenTelemetry implementations, so we can
still integrate with the wider ecosystem.
|