path: root/opendc-experiments/opendc-experiments-capelin/src/test
Age | Commit message | Author
2022-06-15 | fix(sim/compute): Always recompute power usage | Fabian Mastenbroek
This change fixes an issue in the `SimBareMetalMachine` implementation where the power usage was only updated after a non-zero duration. As a result, OpenDC could report incorrect power usage values when multiple convergence calls occurred at the same timestamp.
2022-05-06 | refactor(compute/service): Remove OpenTelemetry from "compute" modules | Fabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC Compute modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to users. Although this worked as expected, the overhead of OpenTelemetry when collecting metrics during simulation was considerable and left few optimization opportunities (other than providing a separate API implementation). Furthermore, since we were tied to OpenTelemetry's SDK implementation, we experienced issues with throttling and registering multiple instruments. We will instead expose the core metrics in OpenDC via specialized interfaces (see the preceding commits) so that access is fast and does not require interfacing with OpenTelemetry. In addition, we will provide an adapter that can forward these metrics to OpenTelemetry implementations, so we can still integrate with the wider ecosystem.
2022-05-06 | refactor(telemetry/compute): Support direct metric access | Fabian Mastenbroek
This change introduces a `ComputeMetricReader` class that can be used as a replacement for the `CoroutineMetricReader` class when reading metrics from the Compute service. This implementation operates directly on a `ComputeService` instance, providing better performance.
2022-04-22 | refactor(compute): Load interference model via trace library | Fabian Mastenbroek
This change updates the compute support library to load the VM interference model via the OpenDC trace library, which provides a generic interface for reading interference models associated with workload traces.
2022-02-18 | fix(simulator): Flush results before accessing counters | Fabian Mastenbroek
This change updates the simulator implementation to flush the active progress when accessing the hypervisor counters. Previously, if the counters were accessed while the mux or consumer was in progress, the counter values were not accurate.
2022-02-15 | refactor: Update OpenTelemetry to version 1.11 | Fabian Mastenbroek
This change updates the OpenDC codebase to use OpenTelemetry v1.11, which stabilizes the metrics API. This stabilization brings quite a few breaking changes, so significant changes are necessary inside the OpenDC codebase.
2021-11-16 | feat(workflow): Add helper tools for workflow simulations | Fabian Mastenbroek
This change adds a new module, `opendc-workflow-workload`, that contains helper code for constructing workflow simulations using OpenDC.
2021-11-02 | refactor(trace): Support gaps in trace data | Fabian Mastenbroek
This change updates the trace converter and the SimTrace implementation to support cases where there is a gap between samples in the trace data. Users can now specify what to do when samples are missing from the trace. The available options are specified in `SimTrace.FillMode`. Currently, we support either carrying the previous value forward or setting the usage to zero.
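The two fill behaviours named in this commit can be illustrated with a small, language-neutral sketch. It is not OpenDC's `SimTrace` code: the `fill_gaps` function, the mode names `PREVIOUS` and `ZERO`, and the assumption of a fixed sampling interval are all hypothetical and chosen only for illustration.

```python
from enum import Enum


class FillMode(Enum):
    """Hypothetical stand-in for the options the commit calls `SimTrace.FillMode`."""
    PREVIOUS = "previous"  # carry the last known usage forward
    ZERO = "zero"          # treat missing samples as zero usage


def fill_gaps(samples, interval, mode):
    """Return samples on a regular grid, filling gaps according to `mode`.

    `samples` is a list of (timestamp, usage) pairs sorted by timestamp.
    Gaps are assumed to be whole multiples of `interval`.
    """
    filled = []
    for timestamp, usage in samples:
        if filled:
            last_time, last_usage = filled[-1]
            gap_value = last_usage if mode is FillMode.PREVIOUS else 0.0
            t = last_time + interval
            # Insert one synthetic sample per missing interval.
            while t < timestamp:
                filled.append((t, gap_value))
                t += interval
        filled.append((timestamp, usage))
    return filled
```

With samples at timestamps 0, 300 and 1200 and an interval of 300, both modes insert samples at 600 and 900; they differ only in the usage value assigned to those inserted samples.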
2021-10-25 | perf(telemetry): Prevent allocations during collection cycle | Fabian Mastenbroek
This change redesigns the ComputeMonitor interface to reduce the number of memory allocations necessary during a collection cycle.
2021-10-25 | perf(compute): Redesign VM interference algorithm | Fabian Mastenbroek
This change redesigns the virtual machine interference algorithm to have a fixed memory usage per `VmInterferenceModel` instance. Previously, for every interference domain, a copy of the model would be created, leading to OutOfMemory errors when running multiple experiments at the same time.
2021-10-25 | fix(simulator): Compute energy usage in absence of convergence | Fabian Mastenbroek
This change addresses an issue where energy usage was not computed correctly if the machine performed no work in between collection cycles.
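One way to avoid this class of bug is to accumulate energy whenever the counter is read, not only when the machine state changes. The sketch below illustrates that idea under stated assumptions; the `EnergyMeter` class and its method names are hypothetical and do not reflect the actual OpenDC implementation.

```python
class EnergyMeter:
    """Hypothetical energy counter that integrates power over time."""

    def __init__(self, now, power):
        self._last_update = now  # timestamp of the last accumulation (seconds)
        self._power = power      # current power draw (watts)
        self._energy = 0.0       # accumulated energy (joules)

    def _advance(self, now):
        # Energy is power multiplied by the time elapsed since the last update.
        self._energy += self._power * (now - self._last_update)
        self._last_update = now

    def on_power_change(self, now, power):
        """Called when the machine's power draw changes."""
        self._advance(now)
        self._power = power

    def energy(self, now):
        """Read the counter; accumulates up to `now` even if nothing changed."""
        self._advance(now)
        return self._energy
```

Because `energy` advances the counter itself, an idle machine that triggers no state changes between two collection cycles still reports the energy it consumed in that interval.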
2021-10-08 | perf(simulator): Skip fair-share algorithm if capacity remaining | Fabian Mastenbroek
This change updates the MaxMinFlowMultiplexer implementation to skip the fair-share algorithm in case the total demand is lower than the available capacity. In this case, no re-division of capacity is necessary.
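The shortcut described in this commit can be sketched with a generic max-min fair-share routine. This is a textbook formulation written for illustration, not the `MaxMinFlowMultiplexer` code; the function name `max_min_share` and its list-based interface are assumptions.

```python
def max_min_share(demands, capacity):
    """Divide `capacity` over `demands` using max-min fairness."""
    # Fast path: every demand can be served in full, so no re-division
    # of capacity is needed.
    if sum(demands) <= capacity:
        return list(demands)

    allocation = [0.0] * len(demands)
    remaining = capacity
    # Serve the smallest demands first; each input receives at most an
    # equal share of whatever capacity is still left.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for position, index in enumerate(order):
        fair_share = remaining / (len(order) - position)
        granted = min(demands[index], fair_share)
        allocation[index] = granted
        remaining -= granted
    return allocation
```

For demands of 2, 8 and 10 with a capacity of 12, the slow path grants 2, 5 and 5. When the capacity is 25, the fast path returns the demands unchanged without sorting.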
2021-10-05 | perf(simulator): Manage deadlines centrally in max min mux | Fabian Mastenbroek
This change updates the MaxMinFlowMultiplexer implementation to centrally manage the deadlines of the `FlowSource`s as opposed to each source using its own timers. For large numbers of inputs, this is much faster, as the multiplexer already needs to traverse every input whenever the timer of one input expires.
2021-10-03 | feat(simulator): Expose CPU time counters directly on hypervisor | Fabian Mastenbroek
This change adds a new interface to the SimHypervisor interface that exposes the CPU time counters directly. These are derived from the flow counters and will be used by SimHost to expose them via telemetry.
2021-10-03 | refactor(simulator): Separate push and pull flags | Fabian Mastenbroek
This change separates the push and pull flags in FlowConsumerContextImpl, meaning that sources can now push directly without pulling and vice versa.
2021-10-03 | refactor(simulator): Merge distributor and aggregator into switch | Fabian Mastenbroek
This change removes the distributor and aggregator interfaces in favour of a single switch interface. Since the switch interface is as powerful as both the distributor and aggregator, we don't need the latter two.
2021-10-03 | refactor(simulator): Remove onUpdate callback | Fabian Mastenbroek
This change removes the `onUpdate` callback from the `SimResourceProviderLogic` interface. Instead, users should now update counters using either `onConsume` or `onConverge`.
2021-10-03 | refactor(simulator): Invoke consumer callback on every invalidation | Fabian Mastenbroek
This change updates the simulator implementation to always invoke the `SimResourceConsumer.onNext` callback when the resource context is invalidated. This allows users to update the resource counter or do some other work if the context has changed.
2021-10-03 | perf(simulator): Reduce memory allocations in SimResourceInterpreter | Fabian Mastenbroek
This change removes unnecessary allocations in the SimResourceInterpreter caused by the way timers were allocated for the resource context.
2021-10-03 | refactor(simulator): Combine work and deadline to duration | Fabian Mastenbroek
This change removes the work and deadline properties from the SimResourceCommand.Consume class and introduces a new property, duration. This property is now used in conjunction with the limit to compute the amount of work processed by a resource provider. Previously, we used both work and deadline to compute the duration and the amount of remaining work at the end of a consumption. With this change, we ensure that a resource consumption always runs at the same speed once established, drastically simplifying the computation of the amount of work processed during the consumption.
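The simplification can be seen in a short arithmetic sketch. Assuming a consumption runs at a constant rate (the limit) for its whole duration, the work processed is simply the rate multiplied by the elapsed time. The function name `work_processed`, the `elapsed` parameter, and the units are assumptions made for illustration, not part of the OpenDC API.

```python
def work_processed(limit, duration, elapsed=None):
    """Work done by a consumption that runs at the constant rate `limit`.

    `limit` is in units of work per second and `duration` in seconds.
    `elapsed` lets a caller ask for partial progress when the consumption
    is interrupted; it is capped at `duration`.
    """
    if elapsed is None:
        elapsed = duration
    return limit * min(elapsed, duration)
```

For example, a consumption with a limit of 2000 units per second and a duration of 5 seconds processes 10000 units in total, and 4000 units if it is interrupted after 2 seconds.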
2021-09-28 | refactor(telemetry): Do not require clock for ComputeMetricExporter | Fabian Mastenbroek
This change drops the requirement for a clock parameter when constructing a ComputeMetricExporter, since it will now derive the timestamp from the recorded metrics.
2021-09-19 | feat(trace): Update OpenDC VM trace format | Fabian Mastenbroek
This change optimizes the OpenDC VM trace format by removing unnecessary columns as well as optimizing the writer settings. The new implementation still supports reading the old trace format in case users run OpenDC with older workload traces.
2021-09-19 | refactor(capelin): Make workload sampling model extensible | Fabian Mastenbroek
This change updates the workload sampling implementation to be more flexible in the way the workload is constructed. Users can now sample multiple workloads at the same time using multiple samplers and use them as a single workload to simulate.
2021-09-19 | refactor(capelin): Support flexible topology creation | Fabian Mastenbroek
This change adds support for creating flexible topologies by creating a TopologyFactory interface that is responsible for configuring the hosts of a compute service.
2021-09-19 | refactor(capelin): Extract common code out of Capelin experiments | Fabian Mastenbroek
This change creates a new module for doing simulations with virtual machine workloads. We have found that a lot of code in the Capelin experiments code is being re-used by non-experiment modules.
2021-09-17 | refactor(telemetry): Standardize SimHost metrics | Fabian Mastenbroek
This change standardizes the metrics emitted by SimHost instances and their guests based on the OpenTelemetry semantic conventions. We now also report CPU time as opposed to CPU work as this metric is more commonly used.
2021-09-17 | refactor(telemetry): Standardize compute scheduler metrics | Fabian Mastenbroek
This change updates the OpenDC compute service implementation with multiple meters that follow the OpenTelemetry conventions.
2021-09-17 | refactor(telemetry): Create separate MeterProvider per service/host | Fabian Mastenbroek
This change refactors the telemetry implementation by creating a separate MeterProvider per service or host. This means we have to keep track of multiple metric producers, but we can attach resource information to each of the MeterProviders as we would in a real-world scenario.
2021-09-17 | build(telemetry): Update to OpenTelemetry 1.6.0 | Fabian Mastenbroek
This change updates the opentelemetry-java library to version 1.6.0.
2021-09-10 | refactor(compute): Integrate fault injection into compute simulator | Fabian Mastenbroek
This change moves the fault injection logic directly into the opendc-compute-simulator module, so that it can operate at a higher abstraction. In the future, we might again split the module if we can re-use some of its logic.
2021-09-10 | refactor(capelin): Terminate servers after reaching deadline | Fabian Mastenbroek
This change updates the Capelin experiment helpers to terminate a server when it has reached its end-date.
2021-09-07 | refactor(capelin): Restructure input reading classes | Fabian Mastenbroek
2021-09-07 | refactor(capelin): Move metric collection outside Capelin code | Fabian Mastenbroek
This change moves the metric collection out of the Capelin codebase into a separate module so other modules can also benefit from the compute metric collection code.
2021-09-07 | feat(capelin): Report up/downtime metrics in experiment monitor | Fabian Mastenbroek
2021-09-02 | refactor(trace): Implement trace API for WTF reader | Fabian Mastenbroek
This change updates the WTF trace reader to support the new streaming trace API.
2021-09-02 | refactor(format): Remove environment reader from format library | Fabian Mastenbroek
This change removes the environment reader from the format library since it is highly specific to the particular experiment. In the future, we hope to have a single format to set up the entire datacenter (perhaps similar to the format used by the web runner).
2021-08-25 | refactor(compute): Measure power draw without PSU overhead | Fabian Mastenbroek
This change updates the SimHost implementation to measure the power draw of the machine without PSU overhead to make the results more realistic.
2021-08-25 | fix(capelin): Eliminate unnecessary double to long conversions | Fabian Mastenbroek
This change eliminates the unnecessary conversions from double to long in the Capelin metric processing code.
2021-08-25 | build: Upgrade to OpenTelemetry 1.5 | Fabian Mastenbroek
This change upgrades the OpenTelemetry dependency to version 1.5, which contains various breaking changes in the metrics API.
2021-08-24 | test(capelin): Add tests for interference and failures | Fabian Mastenbroek
This change adds tests to the Capelin integration suite for performance interference as well as failures. These tests exercise the experiment setup more accurately.
2021-08-24 | fix(capelin): Update Bitbrains trace tests | Fabian Mastenbroek
This change updates the Bitbrains trace tests with the updated trace that does not hardcode the duration of the trace fragments.
2021-08-24 | fix(simulator): Support unaligned trace fragments | Fabian Mastenbroek
2021-08-24 | fix(simulator): Record overcommit only after deadline | Fabian Mastenbroek
This change fixes an issue with the simulator where it would record overcommitted work if the output was updated before the deadline was reached.
2021-08-24 | refactor(simulator): Execute traces based on timestamps | Fabian Mastenbroek
This change refactors the trace workload in the OpenDC simulator to execute each fragment based on the fragment's timestamp. This makes sure that the trace is replayed identically to the original execution.
2021-08-22 | refactor(compute): Update FilterScheduler to follow OpenStack's Nova | Fabian Mastenbroek
This change updates the FilterScheduler implementation to follow more closely the scheduler implementation in OpenStack's Nova. We now normalize the weights, support many of the filters and weights in OpenStack and support overcommitting resources.
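The filter-then-weigh pattern mentioned in this commit can be sketched as follows. This is a simplified illustration of the general idea (filter hosts with an overcommit ratio, min-max normalize each weigher's output, then sum the weighted scores), not the OpenDC `FilterScheduler` or Nova code. The function names, the dictionary-based host representation, and the `cpu_allocation_ratio` parameter are assumptions.

```python
def normalize(values):
    """Min-max normalize `values` to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All hosts score the same, so this weigher has no influence.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_host(hosts, requested_cpus, cpu_allocation_ratio, weighers):
    """Pick a host name, or None if no host passes the filter.

    `weighers` is a list of (function, multiplier) pairs; each function
    maps a host to a raw weight where higher is better.
    """
    # Filter phase: allow overcommit up to `cpu_allocation_ratio` times
    # the physical CPU count.
    candidates = [
        h for h in hosts
        if h["used_cpus"] + requested_cpus <= h["cpus"] * cpu_allocation_ratio
    ]
    if not candidates:
        return None

    # Weigh phase: normalize each weigher so multipliers are comparable.
    scores = [0.0] * len(candidates)
    for weigher, multiplier in weighers:
        raw = [weigher(h) for h in candidates]
        for i, weight in enumerate(normalize(raw)):
            scores[i] += multiplier * weight

    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]["name"]
```

Normalizing before applying multipliers keeps a weigher with a large raw scale (for example, memory in megabytes) from dominating one with a small scale (for example, CPU count).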
2021-06-24 | simulator: Re-implement performance interference model | Fabian Mastenbroek
This change reimplements the performance interference model to work on top of the universal resource model in `opendc-simulator-resources`. This enables us to model interference and performance variability of other resources such as disk or network in the future.
2021-06-24 | format: Remove performance interference from trace readers | Fabian Mastenbroek
This change updates the trace reader implementation to remove their dependency on the performance interference model. In a future commit, we will instead pass the performance interference model via the host/hypervisor.
2021-06-02 | simulator: Add uniform interface for resource metrics | Fabian Mastenbroek
This change adds a new interface to the resources library for accessing metrics of resources such as work, demand and overcommitted work. With this change, we no longer need an implementation-specific listener interface in SimResourceSwitchMaxMin. Another benefit of this approach is that updates will be scheduled more efficiently and progress will only be reported once the system has reached a steady state for that timestamp.
2021-06-01 | simulator: Centralize resource logic in SimResourceInterpreter | Fabian Mastenbroek
This change introduces the SimResourceInterpreter, which centralizes the logic for scheduling and interpreting the communication between resource consumer and provider. This approach offers better performance because it avoids invalidating the state of the resource context when not necessary. Benchmarks show a 5x performance improvement in the best case and a 2x improvement in the worst case.
2021-05-04 | exp: Fix aggregation of power draw | Fabian Mastenbroek
This change fixes the aggregation of the power draw metric. Previously, if the power draw did not change between collection cycles, the power draw would be reported as zero. This change uses OpenTelemetry Views to collect the latest value of the power draw each cycle.