summaryrefslogtreecommitdiff
path: root/opendc-compute/opendc-compute-simulator
AgeCommit message (Collapse)Author
2022-11-27refactor(compute/service): Do not split interface and implementationFabian Mastenbroek
This change inlines the implementation of the compute service into the `ComputeService` interface. We do not intend to provide multiple implementations of the service. In addition, this approach makes more sense for a Java implementation.
2022-11-27refactor(compute/api): Do not suspend in compute APIFabian Mastenbroek
This change updates the API interface of the OpenDC Compute service to not suspend execution using Kotlin Coroutines. The suspending modifiers were introduced in case the ComputeClient would communicate with the service over a network connection. However, the main use-case has been together with the ComputeService, where the suspending modifiers only frustrate the user experience when writing experiments. Furthermore, with the advent of Project Loom, it is not necessarily a problem to block the (virtual) thread during network communications.
2022-11-13refactor: Replace use of CoroutineContext by DispatcherFabian Mastenbroek
This change replaces the use of `CoroutineContext` for passing the `SimulationDispatcher` across the different modules of OpenDC by the lightweight `Dispatcher` interface of the OpenDC common module.
2022-11-13refactor(sim/core): Re-implement SimulationScheduler as DispatcherFabian Mastenbroek
This change updates the `SimulationScheduler` class to implement the `Dispatcher` interface from the OpenDC Common module, so that OpenDC modules only need to depend on the common module for dispatching future task (possibly in simulation).
2022-11-13refactor: Use InstantSource as time sourceFabian Mastenbroek
This change updates the modules of OpenDC to always accept the `InstantSource` interface as source of time. Previously we used `java.time.Clock`, but this class is bound to a time zone which does not make sense for our use-cases. Since `java.time.Clock` implements `java.time.InstantSource`, it can be used in places that require an `InstantSource` as parameter. Conversion from `InstantSource` to `Clock` is also possible by invoking `InstantSource#withZone`.
2022-11-04refactor: Use RandomGenerator as randomness sourceFabian Mastenbroek
This change updates the modules of OpenDC to always accept the `RandomGenerator` interface as source of randomness. This interface is implemented by the slower `java.util.Random` class, but also by the faster `java.util.SplittableRandom` class
2022-10-31feat(sim/compute): Add support for snapshotting workloadsFabian Mastenbroek
This change updates the interface of `SimWorkload` to support snapshotting workloads. We introduce a new method `snapshot()` to this interface which returns a new `SimWorkload` that can be started at a later point in time and on another `SimMachine`, which continues progress from the moment the workload was snapshotted.
2022-10-28perf(compute/sim): Use static logger fieldFabian Mastenbroek
This change updates the `Guest` class implementation to use a static logger field instead of allocation a new logger for every guest.
2022-10-28refactor(compute/service): Do not suspend on guest startFabian Mastenbroek
This change updates the `Host` interface to remove the suspend modifiers to the start, stop, spawn, and delete methods of this interface. We now assume that the host immediately launches the guest on invocation of this method.
2022-10-28feat(compute/sim): Model host boot timeFabian Mastenbroek
This change updates `SimHost` to support modeling the time and resource consumption it takes to boot the host. The boot procedure is modeled as a `SimWorkload`.
2022-10-28refactor(compute/sim): Use workload chaining for boot delayFabian Mastenbroek
This change updates the implementation of `SimHost` to use workload chaining for modelling boot delays. Previously, this was implemented by sleeping 1 millisecond using Kotlin coroutines. With this change, we remove the need for coroutines and instead use the `SimDurationWorkload` to model the boot delay. In the future, we envision a user-supplied stochastic boot model to model the boot delay for VM instances.
2022-10-28feat(sim/compute): Add completion parameter to startWorkloadFabian Mastenbroek
This change updates the interface of `SimMachine#startWorkload` to introduce a parameter `completion` that is invoked when the workload completes either succesfully or due to failure. This functionality has often been implemented by wrapping a `SimWorkload` and catching its exceptions. However, since this functionality is used in all usages of `SimMachine#startWorkload` we instead embed it into `SimMachine` itself.
2022-10-21refactor(sim/compute): Re-implement using flow2Fabian Mastenbroek
This change re-implements the OpenDC compute simulator framework using the new flow2 framework for modelling multi-edge flow networks. The re-implementation is written in Java and focusses on performance and clean API surface.
2022-10-06build: Switch to Spotless for formattingFabian Mastenbroek
This change updates the build configuration to use Spotless for code formating of both Kotlin and Java.
2022-10-06style: Eliminate use of wildcard importsFabian Mastenbroek
This change updates the repository to remove the use of wildcard imports everywhere. Wildcard imports are not allowed by default by Ktlint as well as Google's Java style guide.
2022-10-05refactor(sim/core): Rename runBlockingSimulation to runSimulationFabian Mastenbroek
This change renames the method `runBlockingSimulation` to `runSimulation` to put more emphasis on the simulation part of the method. The blocking part is not that important, but this behavior is still described in the method documentation.
2022-10-05refactor(sim/core): Use SimulationScheduler in coroutine dispatcherFabian Mastenbroek
This change updates the implementation of `SimulationDispatcher` to use a (possibly user-provided) `SimulationScheduler` for managing the execution of the simulation and future tasks.
2022-09-22refactor(sim/compute): Simplify SimHypervisor classFabian Mastenbroek
This change simplifies the SimHypervisor class into a single implementation. Previously, it was implemented as an abstract class with multiple implementations for each multiplexer type. We now pass the multiplexer type as parameter to the SimHypervisor constructor.
2022-09-22refactor(sim/compute): Make interference domain independent of profileFabian Mastenbroek
This change updates the virtual machine performance interference model so that the interference domain can be constructed independently of the interference profile. As a consequence, the construction of the topology now does not depend anymore on the interference profile.
2022-09-22refactor(compute): Simplify constructor of SimHostFabian Mastenbroek
This change updates the constructor of SimHost to receive a `SimBareMetalMachine` and `SimHypervisor` directly instead of construction these objects itself. This ensures better testability and also simplifies the constructor of this class, especially when future changes to `SimBareMetalMachine` or `SimHypervisor` change their constructors.
2022-09-22refactor(compute): Add separate error host stateFabian Mastenbroek
This change adds a new HostState to indicate that the host is in an error state as opposed to being purposefully unavailable.
2022-09-22refactor(sim/compute): Extract Random dependency from interference modelFabian Mastenbroek
This change moves the Random dependency outside the interference model, to allow the interference model to be completely immutable and passable between different simulations.
2022-09-21refactor(sim/compute): Move interference logic into VmInterferenceMemberFabian Mastenbroek
This change updates the design of the VM interference model, where we move more of the logic into the `VmInterferenceMember` interface. This removes the dependency on the `VmInterferenceModel` for the hypervisor interface.
2022-09-21refactor(sim/compute): Pass interference key via parameterFabian Mastenbroek
This change updates the signature of the `SimHypervisor` interface to accept a `VmInterferenceKey` when creating a new virtual machine, instead of providing a string identifier. This is in preparation for removing the dependency on the `VmInterferenceModel` in the `SimAbstractHypervisor` class.
2022-05-06refactor(compute/service): Remove OpenTelemetry from "compute" modulesFabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC Compute modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. Although this worked as expected, the overhead of the OpenTelemetry when collecting metrics during simulation was considerable and lacked more optimization opportunities (other than providing a separate API implementation). Furthermore, since we were tied to OpenTelemetry's SDK implementation, we experienced issues with throttling and registering multiple instruments. We will instead use another approach, where we expose the core metrics in OpenDC via specialized interfaces (see the commits before) such that access is fast and can be done without having to interface with OpenTelemetry. In addition, we will provide an adapter to that is able to forward these metrics to OpenTelemetry implementations, so we can still integrate with the wider ecosystem.
2022-05-04refactor(compute): Directly expose scheduler stats to userFabian Mastenbroek
This change updates the `ComputeService` interface to directly expose statistics about the scheduler to the user, such that they do not necessarily have to interact with OpenTelemetry to obtain these values.
2022-05-03refactor(compute): Expose CPU and system stats via Host interfaceFabian Mastenbroek
This change updates the `Host` interface to directly expose CPU and system stats to be used by components that interface with the `Host` interface. Previously, this would require the user to interact with the OpenTelemetry SDK. Although that is still possible for more advanced usage cases, users can use the following methods to easily access common host and guest statistics.
2022-04-23build: Enable testing for all library modulesFabian Mastenbroek
This change updates the Gradle build configuration to ensure that all library modules (that will be published) use testing and are included in coverage reports. This should ensure the public modules remain well tested.
2022-02-18fix(simulator): Flush results before accessing countersFabian Mastenbroek
This change updates the simulator implementation to flush the active progress when accessing the hypervisor counters. Previously, if the counters were accessed, while the mux or consumer was in progress, its counter values were not accurate.
2022-02-18fix(compute): Disallow duplicate UIDs for SimHostFabian Mastenbroek
This change fixes an issue with the ComputeServiceHelper where it allowed users to register multiple SimHost objects with the same UID. See this issue for more information: https://github.com/atlarge-research/opendc/issues/51
2022-02-18build: Remove opendc-platform moduleFabian Mastenbroek
This change removes the opendc-platform module from the project. This module represented a Java platform which was previously used for sharing a set of dependency versions between subprojects. However, with the version catalogue that was added by Gradle, we currently do not use the platform anymore.
2022-02-18refactor(utils): Rename utils module to common moduleFabian Mastenbroek
This change adds a new module, opendc-common, that contains functionality that is shared across OpenDC's modules. We move the existing utils module into this new module.
2022-02-15refactor: Update OpenTelemetry to version 1.11Fabian Mastenbroek
This change updates the OpenDC codebase to use OpenTelemetry v1.11, which stabilizes the metrics API. This stabilization brings quite a few breaking changes, so significant changes are necessary inside the OpenDC codebase.
2021-10-25refactor(simulator): Support running workloads without coroutinesFabian Mastenbroek
This change updates the SimMachine interface to drop the coroutine requirement for running a workload on a machines. Users can now asynchronously start a workload and receive notifications via the workload callbacks. Users still have the possibility to suspend execution during workload execution by using the new `runWorkload` method, which is implemented on top of the new `startWorkload` primitive.
2021-10-25perf(telemetry): Prevent allocations during collection cycleFabian Mastenbroek
This change redesigns the ComputeMonitor interface to reduce the number of memory allocations necessary during a collection cycle.
2021-10-25feat(telemetry): Report provisioning time of virtual machinesFabian Mastenbroek
This change adds support for collecting the provisioning time of virtual machines in addition to their boot time.
2021-10-25feat(compute): Support filtering hosts based on CPU capacityFabian Mastenbroek
This change allows users to create servers with a smaller CPU capacity than the host, by specifying the CPU capacity via metadata. This also allows filtering hosts based on their available CPU capacity.
2021-10-08perf(simulator): Optimize SimTraceWorkloadFabian Mastenbroek
This change improves the performance of the SimTraceWorkload class by changing the way trace fragments are read and processed by the CPU consumers.
2021-10-03perf(compute): Optimize telemetry collectionFabian Mastenbroek
This change optimizes the telemetry collection in the SimHost class. Previously, there was significant overhead in collecting the metrics of this and associated classes due large `Attributes` object that did not cache accesses to `hashCode()`. We now wrap this object and manually cache the hash code.
2021-10-03feat(simulator): Expose CPU time counters directly on hypervisorFabian Mastenbroek
This change adds a new interface to the SimHypervisor interface that exposes the CPU time counters directly. These are derived from the flow counters and will be used by SimHost to expose them via telemetry.
2021-10-03refactor(simulator): Migrate to flow-based simulationFabian Mastenbroek
This change renames the `opendc-simulator-resources` module into the `opendc-simulator-flow` module to indicate that the core simulation model of OpenDC is based around modelling and simulating flows. Previously, the distinction between resource consumer and provider, and input and output caused some confusion. By switching to a flow-based model, this distinction is now clear (as in, the water flows from source to consumer/sink).
2021-10-03refactor(simulator): Merge distributor and aggregator into switchFabian Mastenbroek
This change removes the distributor and aggregator interfaces in favour of a single switch interface. Since the switch interface is as powerful as both the distributor and aggregator, we don't need the latter two.
2021-09-28fix(compute): Do not recover guests in non-error stateFabian Mastenbroek
2021-09-28refactor(telemetry): Do not require clock for ComputeMetricExporterFabian Mastenbroek
This change drops the requirement for a clock parameter when constructing a ComputeMetricExporter, since it will now derive the timestamp from the recorded metrics.
2021-09-19refactor(capelin): Make workload sampling model extensibleFabian Mastenbroek
This change updates the workload sampling implementation to be more flexible in the way the workload is constructed. Users can now sample multiple workloads at the same time using multiple samplers and use them as a single workload to simulate.
2021-09-19perf(compute): Add option for optimizing SimHost simulationFabian Mastenbroek
This change adds an option for optimizing SimHost simulation by combining all the CPUs of a machine into a single large CPU. For most workloads, this does not significantly affect the simulation results, but does improve the simulation time by a lot.
2021-09-17refactor(telemetry): Standardize SimHost metricsFabian Mastenbroek
This change standardizes the metrics emitted by SimHost instances and their guests based on the OpenTelemetry semantic conventions. We now also report CPU time as opposed to CPU work as this metric is more commonly used.
2021-09-17refactor(telemetry): Create separate MeterProvider per service/hostFabian Mastenbroek
This change refactors the telemetry implementation by creating a separate MeterProvider per service or host. This means we have to keep track of multiple metric producers, but that we can attach resource information to each of the MeterProviders like we would in a real world scenario.
2021-09-17refactor(telemetry): Simplify CoroutineMetricReaderFabian Mastenbroek
This change simplifies the CoroutineMetricReader implementation by removing the seperation of reader and exporter jobs.
2021-09-10test(compute): Add test suite for fault injectorFabian Mastenbroek