path: root/opendc-compute/opendc-compute-service/src
Age  Commit message  Author
2024-10-25  Rewrote the FlowEngine (#256)  Dante Niewenhuis
* Removed unused components. Updated tests. Improved checkpointing model. Improved model, started with SimPowerSource. Implemented FailureModels and Checkpointing. First working version. Midway commit. First update. All simulations are now run with a single CPU and single MemoryUnit. Multiple CPUs are combined into one. This is for performance and explainability.
* fixed merge conflicts
* Updated M3SA paths.
* Fixed small typo
2024-09-16  All simulations are now run with a single CPU and single MemoryUnit. Multiple CPUs are combined into one. This is for performance and explainability. (#255)  Dante Niewenhuis
2024-09-12  Added max number of failures (#254)  Dante Niewenhuis
* Added a maximum number of failures for tasks. If a task fails more times than this maximum, it is cancelled
* Added maxNumFailures to the frontend
* Updated tests
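The failure limit described in this commit can be pictured with a minimal sketch. The class `FailureLimitedTask`, its `State` enum, and the method `onFailure` are hypothetical names for illustration and are not taken from OpenDC. Only `maxNumFailures` appears in the commit message, and the assumption that a task is cancelled once its failure count exceeds (rather than reaches) the limit is ours.

```java
/** Hypothetical illustration of a per-task failure limit; not OpenDC's implementation. */
final class FailureLimitedTask {
    enum State { RUNNABLE, CANCELLED }

    private final int maxNumFailures;
    private int numFailures = 0;
    private State state = State.RUNNABLE;

    FailureLimitedTask(int maxNumFailures) {
        if (maxNumFailures < 0) {
            throw new IllegalArgumentException("maxNumFailures must be non-negative");
        }
        this.maxNumFailures = maxNumFailures;
    }

    /** Records one failure and cancels the task once the count exceeds the limit (assumed semantics). */
    State onFailure() {
        if (state == State.CANCELLED) {
            return state; // already cancelled, nothing left to count
        }
        numFailures++;
        if (numFailures > maxNumFailures) {
            state = State.CANCELLED;
        }
        return state;
    }

    public static void main(String[] args) {
        FailureLimitedTask task = new FailureLimitedTask(2);
        System.out.println(task.onFailure()); // RUNNABLE
        System.out.println(task.onFailure()); // RUNNABLE
        System.out.println(task.onFailure()); // CANCELLED
    }
}
```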
2024-08-27  Renamed input files and internally server is changed to task (#246)  Dante Niewenhuis
* Updated SimTrace to use a single ArrayDeque instead of three separate lists for deadline, cpuUsage, and coreCount
* Renamed input files to tasks.parquet and fragments.parquet. Renamed server to task. OpenDC now exports tasks.parquet instead of server.parquet
2024-05-07  Revamped failure models (#228)  Dante Niewenhuis
2024-04-29  Fixed several cpu related bugs, changed input topology (#226)  Dante Niewenhuis
2024-03-19  Scenario and Portfolio update (#209)  Dante Niewenhuis
* Initial commit
* Implemented a new system for defining and running scenarios / portfolios. Scenarios and Portfolios can now be defined using JSON files similar to topologies. This allows users to define experiments without changing any Kotlin code.
* Ran spotlessApply
2024-03-05  Cpu fix (#208)  Dante Niewenhuis
* Updated the topology format to JSON. Updated TopologyReader.kt to handle JSON files. Added documentation for the new format.
* applied spotless kotlin
* small update
* Updated for spotless apply
* Updated for spotless apply
2024-03-05  Updated package versions, updated web server tests. (#207)  Dante Niewenhuis
* Updated all package versions including Kotlin. Updated all web-server tests to run.
* Changed the Java version of the tests. OpenDC now only supports Java 19.
* small update
* test update
* new update
* updated docker version to 19
* updated docker version to 19
2024-02-14  Updated metrics and parquet output (#195)  Dante Niewenhuis
* Updated metrics and parquet output
* fixed typos
2024-01-08  refactored opendc-experiment-compute (#190)  Dante Niewenhuis
* removed experiment-compute and integrated all components into opendc-compute
* updated workflow gradle file
* removed unneeded code
2022-11-27  refactor(compute/service): Do not split interface and implementation  Fabian Mastenbroek
This change inlines the implementation of the compute service into the `ComputeService` interface. We do not intend to provide multiple implementations of the service. In addition, this approach makes more sense for a Java implementation.
2022-11-27  refactor(compute/service): Expose state directly to clients  Fabian Mastenbroek
This change updates the implementation of the compute service to expose state to clients created by the compute service.
2022-11-27  refactor(compute/api): Do not suspend in compute API  Fabian Mastenbroek
This change updates the API interface of the OpenDC Compute service to not suspend execution using Kotlin Coroutines. The suspending modifiers were introduced in case the ComputeClient would communicate with the service over a network connection. However, the main use-case has been together with the ComputeService, where the suspending modifiers only frustrate the user experience when writing experiments. Furthermore, with the advent of Project Loom, it is not necessarily a problem to block the (virtual) thread during network communications.
2022-11-13  refactor: Replace use of CoroutineContext by Dispatcher  Fabian Mastenbroek
This change replaces the use of `CoroutineContext` for passing the `SimulationDispatcher` across the different modules of OpenDC by the lightweight `Dispatcher` interface of the OpenDC common module.
2022-11-13  refactor(sim/core): Re-implement SimulationScheduler as Dispatcher  Fabian Mastenbroek
This change updates the `SimulationScheduler` class to implement the `Dispatcher` interface from the OpenDC Common module, so that OpenDC modules only need to depend on the common module for dispatching future tasks (possibly in simulation).
2022-11-13  refactor: Use InstantSource as time source  Fabian Mastenbroek
This change updates the modules of OpenDC to always accept the `InstantSource` interface as source of time. Previously we used `java.time.Clock`, but this class is bound to a time zone which does not make sense for our use-cases. Since `java.time.Clock` implements `java.time.InstantSource`, it can be used in places that require an `InstantSource` as parameter. Conversion from `InstantSource` to `Clock` is also possible by invoking `InstantSource#withZone`.
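The relationship between the two types described above can be shown with a small sketch using only the JDK (Java 17 or later). The class `InstantSourceExample` and the helper `deadline` are hypothetical names for illustration and do not come from OpenDC.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.InstantSource;
import java.time.ZoneOffset;

/** Illustration only: a component that needs "now" but no time zone. */
final class InstantSourceExample {
    /** Computes a deadline from any time source, simulated or real. */
    static Instant deadline(InstantSource source, Duration timeout) {
        return source.instant().plus(timeout);
    }

    public static void main(String[] args) {
        // A fixed source stands in for a simulated clock here.
        InstantSource simulated = InstantSource.fixed(Instant.EPOCH);
        System.out.println(deadline(simulated, Duration.ofSeconds(30))); // 1970-01-01T00:00:30Z

        // Clock implements InstantSource, so it can be passed unchanged.
        Clock wallClock = Clock.systemUTC();
        System.out.println(deadline(wallClock, Duration.ofSeconds(30)));

        // Going the other way requires choosing a zone explicitly.
        Clock zoned = simulated.withZone(ZoneOffset.UTC);
        System.out.println(zoned.instant()); // 1970-01-01T00:00:00Z
    }
}
```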
2022-11-04  refactor: Use RandomGenerator as randomness source  Fabian Mastenbroek
This change updates the modules of OpenDC to always accept the `RandomGenerator` interface as source of randomness. This interface is implemented by the slower `java.util.Random` class, but also by the faster `java.util.SplittableRandom` class.
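As a brief sketch of what accepting `RandomGenerator` allows, the method below works with either JDK implementation named in the commit (Java 17 or later). The class `RandomGeneratorExample` and the method `sampleSeconds` are hypothetical names, not OpenDC code.

```java
import java.util.Random;
import java.util.SplittableRandom;
import java.util.random.RandomGenerator;

/** Illustration only: code written against the RandomGenerator interface. */
final class RandomGeneratorExample {
    /** Draws a duration in seconds uniformly from [min, max). */
    static long sampleSeconds(RandomGenerator random, long min, long max) {
        return random.nextLong(min, max);
    }

    public static void main(String[] args) {
        // Both classes implement RandomGenerator, so the caller picks the trade-off.
        RandomGenerator legacy = new Random(42);
        RandomGenerator fast = new SplittableRandom(42);
        System.out.println(sampleSeconds(legacy, 10, 20));
        System.out.println(sampleSeconds(fast, 10, 20));
    }
}
```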
2022-10-28  refactor(compute/service): Do not suspend on guest start  Fabian Mastenbroek
This change updates the `Host` interface to remove the suspend modifiers from the start, stop, spawn, and delete methods of this interface. We now assume that the host immediately launches the guest on invocation of these methods.
2022-10-10  fix(compute/service): Expose number of registered servers  Fabian Mastenbroek
This change updates the compute service telemetry to also expose the number of servers that are registered with the service.
2022-10-06  build: Switch to Spotless for formatting  Fabian Mastenbroek
This change updates the build configuration to use Spotless for code formatting of both Kotlin and Java.
2022-10-06  style: Eliminate use of wildcard imports  Fabian Mastenbroek
This change updates the repository to remove the use of wildcard imports everywhere. Wildcard imports are not allowed by default by Ktlint as well as Google's Java style guide.
2022-10-05  refactor(sim/core): Rename runBlockingSimulation to runSimulation  Fabian Mastenbroek
This change renames the method `runBlockingSimulation` to `runSimulation` to put more emphasis on the simulation part of the method. The blocking part is not that important, but this behavior is still described in the method documentation.
2022-10-05  refactor(sim/core): Use SimulationScheduler in coroutine dispatcher  Fabian Mastenbroek
This change updates the implementation of `SimulationDispatcher` to use a (possibly user-provided) `SimulationScheduler` for managing the execution of the simulation and future tasks.
2022-09-23  refactor(compute): Provide access to instances in compute service  Fabian Mastenbroek
This change updates the interface of `ComputeService` to provide access to the instances (servers) that have been registered with the compute service. This allows metric collectors to query the metrics of the servers that are currently running.
2022-09-22  refactor(compute): Add separate error host state  Fabian Mastenbroek
This change adds a new HostState to indicate that the host is in an error state as opposed to being purposefully unavailable.
2022-09-21  feat(compute): Add support for affinity scheduling (#101)  Fabian Mastenbroek
This change adds support for (anti-)affinity scheduling of servers onto hosts, which happens at the compute service level. In the future, we might add support for server groups, which also enables soft (anti-)affinity scheduling.
Implements #26
## Implementation Notes :hammer_and_pick:
* Add `DifferentHostFilter` to schedule instances on different hosts from a set of instances.
* Add `SameHostFilter` to schedule instances on the same hosts as a set of instances.
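The two filters named above can be pictured with a simplified sketch. Everything here is hypothetical: the `Host` record, the `HostFilter` interface, and the `candidates` helper are stand-ins, and the real `DifferentHostFilter` and `SameHostFilter` in OpenDC operate on its own host and server types. The treatment of an empty instance set (the same-host filter then passes every host) is an assumption.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified picture of (anti-)affinity host filtering. */
final class AffinitySketch {
    /** A host name plus the ids of the instances it currently runs. */
    record Host(String name, Set<String> instances) {}

    interface HostFilter {
        boolean test(Host host, Set<String> group);
    }

    /** Anti-affinity: pass hosts that run none of the instances in the group. */
    static final HostFilter DIFFERENT_HOST =
        (host, group) -> Collections.disjoint(host.instances(), group);

    /** Affinity: pass hosts that run at least one instance in the group (all hosts if the group is empty). */
    static final HostFilter SAME_HOST =
        (host, group) -> group.isEmpty() || !Collections.disjoint(host.instances(), group);

    /** Returns the names of the hosts that pass the filter, in input order. */
    static List<String> candidates(List<Host> hosts, HostFilter filter, Set<String> group) {
        return hosts.stream().filter(h -> filter.test(h, group)).map(Host::name).toList();
    }

    public static void main(String[] args) {
        List<Host> hosts = List.of(
            new Host("h1", Set.of("a")),
            new Host("h2", Set.of("b")),
            new Host("h3", Set.of()));
        System.out.println(candidates(hosts, DIFFERENT_HOST, Set.of("a"))); // [h2, h3]
        System.out.println(candidates(hosts, SAME_HOST, Set.of("a")));      // [h1]
    }
}
```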
2022-05-06  refactor(compute/service): Remove OpenTelemetry from "compute" modules  Fabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC Compute modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. Although this worked as expected, the overhead of OpenTelemetry when collecting metrics during simulation was considerable and lacked more optimization opportunities (other than providing a separate API implementation). Furthermore, since we were tied to OpenTelemetry's SDK implementation, we experienced issues with throttling and registering multiple instruments. We will instead use another approach, where we expose the core metrics in OpenDC via specialized interfaces (see the commits before) such that access is fast and can be done without having to interface with OpenTelemetry. In addition, we will provide an adapter that is able to forward these metrics to OpenTelemetry implementations, so we can still integrate with the wider ecosystem.
2022-05-04  refactor(compute): Directly expose scheduler stats to user  Fabian Mastenbroek
This change updates the `ComputeService` interface to directly expose statistics about the scheduler to the user, such that they do not necessarily have to interact with OpenTelemetry to obtain these values.
2022-05-04  feat(compute): Add support for looking up hosts  Fabian Mastenbroek
This change adds the ability for users to look up the `Host` on which a `Server` is hosted (if any). This allows the user to potentially interact with the `Host` directly, e.g., in order to obtain advanced metrics.
2022-05-03  refactor(compute): Expose CPU and system stats via Host interface  Fabian Mastenbroek
This change updates the `Host` interface to directly expose CPU and system stats to be used by components that interface with the `Host` interface. Previously, this would require the user to interact with the OpenTelemetry SDK. Although that is still possible for more advanced usage cases, users can use the following methods to easily access common host and guest statistics.
2022-02-18  refactor(utils): Rename utils module to common module  Fabian Mastenbroek
This change adds a new module, opendc-common, that contains functionality that is shared across OpenDC's modules. We move the existing utils module into this new module.
2022-02-18  feat(utils): Add Pacer to pace scheduling cycles  Fabian Mastenbroek
This change adds a new Pacer class that can pace the incoming scheduling requests into scheduling cycles by allowing the user to specify a scheduling quantum.
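One way to picture pacing with a scheduling quantum is the sketch below. It is not the `Pacer` class from OpenDC: the class `PacerSketch` and its methods are hypothetical, and aligning each cycle to the next multiple of the quantum is an assumption made for illustration. Requests that arrive while a cycle is already pending are coalesced into that cycle.

```java
/** Hypothetical sketch of pacing requests into quantum-aligned cycles; not OpenDC's Pacer. */
final class PacerSketch {
    private final long quantumMs;
    private long pendingCycleAt = -1L; // -1 means no cycle is scheduled

    PacerSketch(long quantumMs) {
        if (quantumMs <= 0) {
            throw new IllegalArgumentException("quantum must be positive");
        }
        this.quantumMs = quantumMs;
    }

    /**
     * Registers a scheduling request at time nowMs (assumed non-negative) and
     * returns the time at which the cycle serving it will run.
     */
    long enqueue(long nowMs) {
        if (pendingCycleAt < 0) {
            // Align the cycle to the next multiple of the quantum after now.
            pendingCycleAt = (nowMs / quantumMs + 1) * quantumMs;
        }
        return pendingCycleAt;
    }

    /** Marks the pending cycle as executed so the next request starts a new one. */
    void cycleCompleted() {
        pendingCycleAt = -1L;
    }

    public static void main(String[] args) {
        PacerSketch pacer = new PacerSketch(100);
        System.out.println(pacer.enqueue(130)); // 200
        System.out.println(pacer.enqueue(170)); // 200, coalesced into the same cycle
        pacer.cycleCompleted();
        System.out.println(pacer.enqueue(200)); // 300
    }
}
```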
2022-02-15  refactor: Update OpenTelemetry to version 1.11  Fabian Mastenbroek
This change updates the OpenDC codebase to use OpenTelemetry v1.11, which stabilizes the metrics API. This stabilization brings quite a few breaking changes, so significant changes are necessary inside the OpenDC codebase.
2021-10-25  feat(telemetry): Report provisioning time of virtual machines  Fabian Mastenbroek
This change adds support for collecting the provisioning time of virtual machines in addition to their boot time.
2021-10-25  feat(compute): Support filtering hosts based on CPU capacity  Fabian Mastenbroek
This change allows users to create servers with a smaller CPU capacity than the host, by specifying the CPU capacity via metadata. This also allows filtering hosts based on their available CPU capacity.
2021-09-17  refactor(telemetry): Standardize compute scheduler metrics  Fabian Mastenbroek
This change updates the OpenDC compute service implementation with multiple meters that follow the OpenTelemetry conventions.
2021-09-17  refactor(telemetry): Create separate MeterProvider per service/host  Fabian Mastenbroek
This change refactors the telemetry implementation by creating a separate MeterProvider per service or host. This means we have to keep track of multiple metric producers, but we can attach resource information to each of the MeterProviders like we would in a real-world scenario.
2021-09-10  docs(compute): Clarify terminology in compute service  Fabian Mastenbroek
2021-09-07  fix(compute): Mark unschedulable server as terminated  Fabian Mastenbroek
This change updates the compute service to mark servers that cannot be scheduled as terminated instead of error. Error is instead reserved for cases where the server is in an error state while running.
2021-09-07  feat(compute): Track provisioning response time  Fabian Mastenbroek
This change adds a metric for the provisioning time of virtual machines by the compute service.
2021-08-25  build: Upgrade to OpenTelemetry 1.5  Fabian Mastenbroek
This change upgrades the OpenTelemetry dependency to version 1.5, which contains various breaking changes in the metrics API.
2021-08-22  refactor(compute): Update FilterScheduler to follow OpenStack's Nova  Fabian Mastenbroek
This change updates the FilterScheduler implementation to follow more closely the scheduler implementation in OpenStack's Nova. We now normalize the weights, support many of the filters and weights in OpenStack and support overcommitting resources.
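The filter-then-weigh approach with normalized weights can be sketched as follows. This is a simplified illustration, not the `FilterScheduler` from OpenDC or Nova: the `Host` record, the two capacity filters, the min-max normalization, and the multiplier parameters are all assumptions chosen for the example, and overcommitting is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical filter-and-weigh host selection with min-max normalized weights. */
final class FilterSchedulerSketch {
    record Host(String name, int freeCores, long freeMemoryMb) {}

    /** Scales raw weights into [0, 1]; if all values are equal, every weight becomes 0. */
    static double[] normalize(double[] raw) {
        double min = Arrays.stream(raw).min().orElse(0.0);
        double max = Arrays.stream(raw).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (raw[i] - min) / range;
        }
        return out;
    }

    /** Filters hosts by capacity, then picks the one with the highest weighted score. */
    static Optional<Host> select(List<Host> hosts, int cores, long memoryMb,
                                 double coreMultiplier, double memoryMultiplier) {
        // Filtering phase: drop hosts that cannot fit the request.
        List<Host> passed = hosts.stream()
            .filter(h -> h.freeCores() >= cores && h.freeMemoryMb() >= memoryMb)
            .toList();
        if (passed.isEmpty()) {
            return Optional.empty();
        }
        // Weighing phase: normalize each weigher's output before combining.
        double[] coreWeights = normalize(passed.stream().mapToDouble(Host::freeCores).toArray());
        double[] memWeights = normalize(passed.stream().mapToDouble(Host::freeMemoryMb).toArray());
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < passed.size(); i++) {
            double score = coreMultiplier * coreWeights[i] + memoryMultiplier * memWeights[i];
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return Optional.of(passed.get(best));
    }

    public static void main(String[] args) {
        List<Host> hosts = List.of(
            new Host("a", 2, 4096), new Host("b", 8, 8192), new Host("c", 16, 2048));
        // Host "a" is filtered out; memory is weighted twice as heavily as cores.
        System.out.println(select(hosts, 4, 1024, 1.0, 2.0).map(Host::name).orElse("none")); // b
    }
}
```

Normalizing before applying multipliers keeps weighers with different units (cores versus megabytes) comparable, so the multipliers alone express their relative importance.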
2021-08-22  fix(compute): Track failed servers with counters correctly  Fabian Mastenbroek
2021-08-13  build: Update Kotlin dependencies  Fabian Mastenbroek
This change updates the Kotlin dependencies used by OpenDC to their latest version.
2021-05-18  chore: Address deprecations due to Kotlin 1.5  Fabian Mastenbroek
This change addresses the deprecations that were caused by the migration to Kotlin 1.5.
2021-04-25  build: Migrate to flat project structure  Fabian Mastenbroek
This change updates the project structure to become flattened. Previously, the simulator, frontend and API each lived in their own directory. With this change, all modules of the project live in the top-level directory of the repository. This should improve discoverability of modules of the project.