summaryrefslogtreecommitdiff
path: root/opendc-compute/opendc-compute-simulator
AgeCommit message (Collapse)Author
2022-10-05refactor(sim/core): Use SimulationScheduler in coroutine dispatcherFabian Mastenbroek
This change updates the implementation of `SimulationDispatcher` to use a (possibly user-provided) `SimulationScheduler` for managing the execution of the simulation and future tasks.
2022-09-22refactor(sim/compute): Simplify SimHypervisor classFabian Mastenbroek
This change simplifies the SimHypervisor class into a single implementation. Previously, it was implemented as an abstract class with multiple implementations for each multiplexer type. We now pass the multiplexer type as parameter to the SimHypervisor constructor.
2022-09-22refactor(sim/compute): Make interference domain independent of profileFabian Mastenbroek
This change updates the virtual machine performance interference model so that the interference domain can be constructed independently of the interference profile. As a consequence, the construction of the topology now does not depend anymore on the interference profile.
2022-09-22refactor(compute): Simplify constructor of SimHostFabian Mastenbroek
This change updates the constructor of SimHost to receive a `SimBareMetalMachine` and `SimHypervisor` directly instead of construction these objects itself. This ensures better testability and also simplifies the constructor of this class, especially when future changes to `SimBareMetalMachine` or `SimHypervisor` change their constructors.
2022-09-22refactor(compute): Add separate error host stateFabian Mastenbroek
This change adds a new HostState to indicate that the host is in an error state as opposed to being purposefully unavailable.
2022-09-22refactor(sim/compute): Extract Random dependency from interference modelFabian Mastenbroek
This change moves the Random dependency outside the interference model, to allow the interference model to be completely immutable and passable between different simulations.
2022-09-21refactor(sim/compute): Move interference logic into VmInterferenceMemberFabian Mastenbroek
This change updates the design of the VM interference model, where we move more of the logic into the `VmInterferenceMember` interface. This removes the dependency on the `VmInterferenceModel` for the hypervisor interface.
2022-09-21refactor(sim/compute): Pass interference key via parameterFabian Mastenbroek
This change updates the signature of the `SimHypervisor` interface to accept a `VmInterferenceKey` when creating a new virtual machine, instead of providing a string identifier. This is in preparation for removing the dependency on the `VmInterferenceModel` in the `SimAbstractHypervisor` class.
2022-05-06refactor(compute/service): Remove OpenTelemetry from "compute" modulesFabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC Compute modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. Although this worked as expected, the overhead of the OpenTelemetry when collecting metrics during simulation was considerable and lacked more optimization opportunities (other than providing a separate API implementation). Furthermore, since we were tied to OpenTelemetry's SDK implementation, we experienced issues with throttling and registering multiple instruments. We will instead use another approach, where we expose the core metrics in OpenDC via specialized interfaces (see the commits before) such that access is fast and can be done without having to interface with OpenTelemetry. In addition, we will provide an adapter to that is able to forward these metrics to OpenTelemetry implementations, so we can still integrate with the wider ecosystem.
2022-05-04refactor(compute): Directly expose scheduler stats to userFabian Mastenbroek
This change updates the `ComputeService` interface to directly expose statistics about the scheduler to the user, such that they do not necessarily have to interact with OpenTelemetry to obtain these values.
2022-05-03refactor(compute): Expose CPU and system stats via Host interfaceFabian Mastenbroek
This change updates the `Host` interface to directly expose CPU and system stats to be used by components that interface with the `Host` interface. Previously, this would require the user to interact with the OpenTelemetry SDK. Although that is still possible for more advanced usage cases, users can use the following methods to easily access common host and guest statistics.
2022-04-23build: Enable testing for all library modulesFabian Mastenbroek
This change updates the Gradle build configuration to ensure that all library modules (that will be published) use testing and are included in coverage reports. This should ensure the public modules remain well tested.
2022-02-18fix(simulator): Flush results before accessing countersFabian Mastenbroek
This change updates the simulator implementation to flush the active progress when accessing the hypervisor counters. Previously, if the counters were accessed, while the mux or consumer was in progress, its counter values were not accurate.
2022-02-18fix(compute): Disallow duplicate UIDs for SimHostFabian Mastenbroek
This change fixes an issue with the ComputeServiceHelper where it allowed users to register multiple SimHost objects with the same UID. See this issue for more information: https://github.com/atlarge-research/opendc/issues/51
2022-02-18build: Remove opendc-platform moduleFabian Mastenbroek
This change removes the opendc-platform module from the project. This module represented a Java platform which was previously used for sharing a set of dependency versions between subprojects. However, with the version catalogue that was added by Gradle, we currently do not use the platform anymore.
2022-02-18refactor(utils): Rename utils module to common moduleFabian Mastenbroek
This change adds a new module, opendc-common, that contains functionality that is shared across OpenDC's modules. We move the existing utils module into this new module.
2022-02-15refactor: Update OpenTelemetry to version 1.11Fabian Mastenbroek
This change updates the OpenDC codebase to use OpenTelemetry v1.11, which stabilizes the metrics API. This stabilization brings quite a few breaking changes, so significant changes are necessary inside the OpenDC codebase.
2021-10-25refactor(simulator): Support running workloads without coroutinesFabian Mastenbroek
This change updates the SimMachine interface to drop the coroutine requirement for running a workload on a machines. Users can now asynchronously start a workload and receive notifications via the workload callbacks. Users still have the possibility to suspend execution during workload execution by using the new `runWorkload` method, which is implemented on top of the new `startWorkload` primitive.
2021-10-25perf(telemetry): Prevent allocations during collection cycleFabian Mastenbroek
This change redesigns the ComputeMonitor interface to reduce the number of memory allocations necessary during a collection cycle.
2021-10-25feat(telemetry): Report provisioning time of virtual machinesFabian Mastenbroek
This change adds support for collecting the provisioning time of virtual machines in addition to their boot time.
2021-10-25feat(compute): Support filtering hosts based on CPU capacityFabian Mastenbroek
This change allows users to create servers with a smaller CPU capacity than the host, by specifying the CPU capacity via metadata. This also allows filtering hosts based on their available CPU capacity.
2021-10-08perf(simulator): Optimize SimTraceWorkloadFabian Mastenbroek
This change improves the performance of the SimTraceWorkload class by changing the way trace fragments are read and processed by the CPU consumers.
2021-10-03perf(compute): Optimize telemetry collectionFabian Mastenbroek
This change optimizes the telemetry collection in the SimHost class. Previously, there was significant overhead in collecting the metrics of this and associated classes due large `Attributes` object that did not cache accesses to `hashCode()`. We now wrap this object and manually cache the hash code.
2021-10-03feat(simulator): Expose CPU time counters directly on hypervisorFabian Mastenbroek
This change adds a new interface to the SimHypervisor interface that exposes the CPU time counters directly. These are derived from the flow counters and will be used by SimHost to expose them via telemetry.
2021-10-03refactor(simulator): Migrate to flow-based simulationFabian Mastenbroek
This change renames the `opendc-simulator-resources` module into the `opendc-simulator-flow` module to indicate that the core simulation model of OpenDC is based around modelling and simulating flows. Previously, the distinction between resource consumer and provider, and input and output caused some confusion. By switching to a flow-based model, this distinction is now clear (as in, the water flows from source to consumer/sink).
2021-10-03refactor(simulator): Merge distributor and aggregator into switchFabian Mastenbroek
This change removes the distributor and aggregator interfaces in favour of a single switch interface. Since the switch interface is as powerful as both the distributor and aggregator, we don't need the latter two.
2021-09-28fix(compute): Do not recover guests in non-error stateFabian Mastenbroek
2021-09-28refactor(telemetry): Do not require clock for ComputeMetricExporterFabian Mastenbroek
This change drops the requirement for a clock parameter when constructing a ComputeMetricExporter, since it will now derive the timestamp from the recorded metrics.
2021-09-19refactor(capelin): Make workload sampling model extensibleFabian Mastenbroek
This change updates the workload sampling implementation to be more flexible in the way the workload is constructed. Users can now sample multiple workloads at the same time using multiple samplers and use them as a single workload to simulate.
2021-09-19perf(compute): Add option for optimizing SimHost simulationFabian Mastenbroek
This change adds an option for optimizing SimHost simulation by combining all the CPUs of a machine into a single large CPU. For most workloads, this does not significantly affect the simulation results, but does improve the simulation time by a lot.
2021-09-17refactor(telemetry): Standardize SimHost metricsFabian Mastenbroek
This change standardizes the metrics emitted by SimHost instances and their guests based on the OpenTelemetry semantic conventions. We now also report CPU time as opposed to CPU work as this metric is more commonly used.
2021-09-17refactor(telemetry): Create separate MeterProvider per service/hostFabian Mastenbroek
This change refactors the telemetry implementation by creating a separate MeterProvider per service or host. This means we have to keep track of multiple metric producers, but that we can attach resource information to each of the MeterProviders like we would in a real world scenario.
2021-09-17refactor(telemetry): Simplify CoroutineMetricReaderFabian Mastenbroek
This change simplifies the CoroutineMetricReader implementation by removing the seperation of reader and exporter jobs.
2021-09-10test(compute): Add test suite for fault injectorFabian Mastenbroek
2021-09-10refactor(compute): Integrate fault injection into compute simulatorFabian Mastenbroek
This change moves the fault injection logic directly into the opendc-compute-simulator module, so that it can operate at a higher abstraction. In the future, we might again split the module if we can re-use some of its logic.
2021-09-07fix(compute): Do not allow failure of inactive guestsFabian Mastenbroek
This change fixes an issue in SimHost where guests that where inactive were also failed, causing an IllegalStateException.
2021-09-07fix(compute): Use correct memory size for host memoryFabian Mastenbroek
This change fixes an issue where all servers could not be scheduled due to the memory size of the host being computed incorrectly.
2021-09-07feat(compute): Track guest up/down timeFabian Mastenbroek
This change updates the SimHost implementation to track the up and downtime of hypervisor guests.
2021-09-07fix(compute): Start host even if it already exists on hostFabian Mastenbroek
2021-09-07fix(compute): Support overcommitted memory in SimHostFabian Mastenbroek
This change enables host to overcommit their memory when testing whether new servers can fit on the host.
2021-09-07feat(compute): Track host up/down timeFabian Mastenbroek
This change adds new metrics for tracking the up and downtime of hosts due to failures. In addition, this change adds a test to verify whether the metrics are collected correctly.
2021-08-25refactor(compute): Measure power draw without PSU overheadFabian Mastenbroek
This change updates the SimHost implementation to measure the power draw of the machine without PSU overhead to make the results more realistic.
2021-08-25fix(simulator): Eliminate unnecessary double to long conversionsFabian Mastenbroek
This change eliminates unnecessary double to long conversions in the simulator. Previously, we used longs to denote the amount of work. However, in the mean time we have switched to doubles in the lower stack.
2021-08-25build: Upgrade to OpenTelemetry 1.5Fabian Mastenbroek
This change upgrades the OpenTelemetry dependency to version 1.5, which contains various breaking changes in the metrics API.
2021-08-24feat(compute): Add support for SimHost failureFabian Mastenbroek
This change adds support for failures in the SimHost implementation. Failing a host will now cause the virtual machine to enter an error state.
2021-08-24fix(capelin): Update Bitbrains trace testsFabian Mastenbroek
This change updates the Bitbrains trace tests with the updated trace that does not hardcode the duration of the trace fragments.
2021-08-24refactor(simulator): Execute traces based on timestampsFabian Mastenbroek
This change refactors the trace workload in the OpenDC simulator to track execute a fragment based on the fragment's timestamp. This makes sure that the trace is replayed identically to the original execution.
2021-06-24simulator: Re-implement performance interference modelFabian Mastenbroek
This change updates reimplements the performance interference model to work on top of the universal resource model in `opendc-simulator-resources`. This enables us to model interference and performance variability of other resources such as disk or network in the future.
2021-06-21simulator: Re-organize compute simulator moduleFabian Mastenbroek
This change re-organizes the classes of the compute simulator module to make a clearer distinction between the hardware, firmware and software interfaces in this module.
2021-06-21simulator: Remove concept of resource lifecycleFabian Mastenbroek
This change removes the AutoCloseable interface from the SimResourceProvider and removes the concept of a resource lifecycle. Instead, resource providers are now either active (running a resource consumer) or in-active (being idle), which simplifies implementation.