opendc.git - The OpenDC repository.

Age	Commit message	Author
2022-09-28	feat(exp/base): Add service registry for cloud services	Fabian Mastenbroek
	This change adds a new module called opendc-experiments-base which will provide a base for doing experiments with OpenDC. The initial feature we introduce is the service registry which acts as DNS for services to register during experimentation.
2022-09-23	refactor(compute): Provide access to instances in compute service	Fabian Mastenbroek
	This change updates the interface of `ComputeService` to provide access to the instances (servers) that have been registered with the compute service. This allows metric collectors to query the metrics of the servers that are currently running.
2022-09-22	refactor(compute): Pass failure model during workload evaluation	Fabian Mastenbroek
	This change updates the `ComputeServiceHelper` class to provide the failure model via a parameter to the `run` method instead of constructor parameter. This separates the construction of the topology from the simulation of the workload.
2022-09-22	refactor(sim/compute): Make interference domain independent of profile	Fabian Mastenbroek
	This change updates the virtual machine performance interference model so that the interference domain can be constructed independently of the interference profile. As a consequence, the construction of the topology no longer depends on the interference profile.
2022-09-22	refactor(sim/compute): Extract Random dependency from interference model	Fabian Mastenbroek
	This change moves the Random dependency outside the interference model, to allow the interference model to be completely immutable and passable between different simulations.
2022-09-21	refactor(sim/compute): Move VM interference model into compute simulator	Fabian Mastenbroek
	This change moves the core of the VM interference model from the flow module into the compute simulator. This logic can be contained in the compute simulator and does not need to leak into the flow-level simulator.
2022-06-15	fix(sim/compute): Always recompute power usage	Fabian Mastenbroek
	This change fixes an issue in the `SimBareMetalMachine` implementation where the power usage was only updated after a non-zero duration. As a result, OpenDC could report incorrect power usage values when multiple convergence calls occurred at the same timestamp.
2022-06-15	fix(exp/tf20): Derive device statistics directly from SimMachine	Fabian Mastenbroek
	This change updates the implementation of SimTFDevice to directly use the metrics provided by the `SimBareMetalMachine` class, instead of computing these metrics itself.
2022-05-06	refactor(exp/capelin): Add independent Capelin distribution	Fabian Mastenbroek
	This change updates the Capelin experiments so they can be distributed and executed independently of the main OpenDC distribution. We provide a new command line interface for users to directly run the experiments. Alternatively, the `CapelinRunner` class encapsulates the logic for running the experiments and can be used programmatically.
2022-05-06	refactor(exp/tf20): Convert experiment into integration test	Fabian Mastenbroek
	This change removes the `TensorFlowExperiment` in favour of an integration test that can be run during CI invocations. Given that the experiment was not very sophisticated (in terms of data collection), we believe it is better suited as an integration test.
2022-05-06	fix(exp/tf20): Fix infinite loop due to invalid rounding	Fabian Mastenbroek
	This change fixes an issue with the `SimTFDevice` implementation where very small amounts of FLOPs would cause the device to enter an infinite loop. We now round the value up to ensure that the device always consumes FLOPs.
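The rounding problem described above can be illustrated with a minimal sketch. This is not the `SimTFDevice` code: the functions `work_units` and `drain`, and the idea of consuming whole work units per step, are hypothetical and only show why rounding down can stall while rounding up always makes progress.

```python
import math


def work_units(flops: float, round_up: bool) -> int:
    """Convert a fractional amount of FLOPs into whole work units (hypothetical helper)."""
    return math.ceil(flops) if round_up else math.floor(flops)


def drain(remaining: float, per_step: float, round_up: bool) -> int:
    """Consume `remaining` FLOPs in steps of at most `per_step`.

    Returns the number of steps taken, or -1 if a step would consume
    nothing (the situation that would otherwise loop forever).
    """
    steps = 0
    while remaining > 0:
        consumed = work_units(min(remaining, per_step), round_up)
        if consumed == 0:
            return -1  # no progress: without this guard the loop never ends
        remaining -= consumed
        steps += 1
    return steps
```

With rounding down, a request for 0.4 FLOPs consumes zero units per step and never finishes; with rounding up, it completes in a single step.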
2022-05-06	feat(faas): Add helper tools for FaaS simulations	Fabian Mastenbroek
	This change adds a new module, opendc-faas-workload, that contains helper code for constructing simulations of FaaS-based workloads using OpenDC. In addition, we add an integration test that demonstrates the capabilities of the helper tool and the FaaS platform of OpenDC.
2022-05-06	refactor(exp/tf20): Remove OpenTelemetry from TF20 experiment	Fabian Mastenbroek
	This change removes the OpenTelemetry integration from the OpenDC TensorFlow 2020 experiments. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. See the previous commit removing it from the "Compute" modules for the reasoning behind this change.
2022-05-06	refactor(workflow/service): Remove OpenTelemetry from "FaaS" modules	Fabian Mastenbroek
	This change removes the OpenTelemetry integration from the OpenDC FaaS modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. See the previous commit removing it from the "Compute" modules for the reasoning behind this change.
2022-05-06	refactor(compute/service): Remove OpenTelemetry from "compute" modules	Fabian Mastenbroek
	This change removes the OpenTelemetry integration from the OpenDC Compute modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. Although this worked as expected, the overhead of OpenTelemetry when collecting metrics during simulation was considerable and left few optimization opportunities (other than providing a separate API implementation). Furthermore, since we were tied to OpenTelemetry's SDK implementation, we experienced issues with throttling and registering multiple instruments. We will instead use another approach, where we expose the core metrics in OpenDC via specialized interfaces (see the commits before) such that access is fast and can be done without having to interface with OpenTelemetry. In addition, we will provide an adapter that is able to forward these metrics to OpenTelemetry implementations, so we can still integrate with the wider ecosystem.
2022-05-06	refactor(exp/tf20): Directly expose device stats to user	Fabian Mastenbroek
	This change updates the `TFDevice` interface to directly expose statistics about the accelerator device to the user. Previously, the user had to access these values through OpenTelemetry, which required substantial extra work.
2022-05-06	refactor(telemetry/compute): Support direct metric access	Fabian Mastenbroek
	This change introduces a `ComputeMetricReader` class that can be used as a replacement for the `CoroutineMetricReader` class when reading metrics from the Compute service. This implementation operates directly on a `ComputeService` instance, providing better performance.
2022-04-24	build: Move modules into subgroups	Fabian Mastenbroek
	This change updates the Gradle build configuration of the project to publish the different types of modules (e.g., opendc-compute, opendc-simulator) into their own groups.
2022-04-23	build: Enable testing for all library modules	Fabian Mastenbroek
	This change updates the Gradle build configuration to ensure that all library modules (that will be published) use testing and are included in coverage reports. This should ensure the public modules remain well tested.
2022-04-22	refactor(compute): Load interference model via trace library	Fabian Mastenbroek
	This change updates the compute support library to load the VM interference model via the OpenDC trace library, which provides a generic interface for reading interference models associated with workload traces.
2022-02-18	refactor(simulator): Remove delta parameter from flow callbacks	Fabian Mastenbroek
	This change removes the delta parameter from the callbacks of the flow framework. This parameter was used to indicate the duration in time between the last call and the current call. However, its usefulness was limited since the actual delta values needed by implementors of this method had to be bridged across different flow callbacks.
2022-02-18	fix(simulator): Flush results before accessing counters	Fabian Mastenbroek
	This change updates the simulator implementation to flush the active progress when accessing the hypervisor counters. Previously, if the counters were accessed while the mux or consumer was in progress, the counter values were not accurate.
2022-02-18	build: Remove opendc-platform module	Fabian Mastenbroek
	This change removes the opendc-platform module from the project. This module represented a Java platform which was previously used for sharing a set of dependency versions between subprojects. However, with the version catalogue that was added by Gradle, we no longer use the platform.
2022-02-18	perf(common): Optimize TimerScheduler	Fabian Mastenbroek
	This change updates the TimerScheduler implementation to directly use the Delay object instead of running the timers inside a coroutine. Constructing the coroutine is more expensive, so we prefer running in a Runnable.
2022-02-18	refactor(utils): Rename utils module to common module	Fabian Mastenbroek
	This change adds a new module, opendc-common, that contains functionality that is shared across OpenDC's modules. We move the existing utils module into this new module.
2022-02-15	refactor: Update OpenTelemetry to version 1.11	Fabian Mastenbroek
	This change updates the OpenDC codebase to use OpenTelemetry v1.11, which stabilizes the metrics API. This stabilization brings quite a few breaking changes, so significant changes are necessary inside the OpenDC codebase.
2021-11-16	feat(workflow): Add helper tools for workflow simulations	Fabian Mastenbroek
	This change adds a new module, opendc-workflow-workload, that contains helper code for constructing workflow simulations using OpenDC.
2021-11-02	refactor(trace): Support gaps in trace data	Fabian Mastenbroek
	This change updates the implementation of the trace converter and SimTrace implementation to support cases where there is a gap between samples in the trace data. This change allows users to specify what to do in case samples are missing in the trace. The available options are specified in `SimTrace.FillMode`. Currently, we support either carrying the previous value forward or setting the usage to zero.
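The two gap-handling options can be sketched as follows. This is an illustration only, not the OpenDC implementation: the enum member names `PREVIOUS` and `ZERO`, the function `fill_gaps`, and the fixed sampling interval are assumptions made for the example.

```python
from enum import Enum


class FillMode(Enum):
    """Hypothetical stand-in for the options described by `SimTrace.FillMode`."""
    PREVIOUS = "previous"  # carry the last observed usage forward
    ZERO = "zero"          # treat missing samples as zero usage


def fill_gaps(samples, interval, mode):
    """Insert synthetic samples wherever consecutive timestamps are more than `interval` apart.

    `samples` is a list of (timestamp, usage) pairs sorted by timestamp.
    """
    if not samples:
        return []
    filled = [samples[0]]
    for ts, usage in samples[1:]:
        # filled[-1] is always a real sample at this point.
        last_ts, last_usage = filled[-1]
        fill_value = last_usage if mode is FillMode.PREVIOUS else 0.0
        t = last_ts + interval
        while t < ts:
            filled.append((t, fill_value))
            t += interval
        filled.append((ts, usage))
    return filled
```

For a trace with samples at timestamps 0 and 3 and an interval of 1, the two missing samples at 1 and 2 either repeat the usage observed at 0 or are set to zero, depending on the mode.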
2021-10-25	refactor(simulator): Support running workloads without coroutines	Fabian Mastenbroek
	This change updates the SimMachine interface to drop the coroutine requirement for running a workload on a machine. Users can now asynchronously start a workload and receive notifications via the workload callbacks. Users still have the possibility to suspend execution during workload execution by using the new `runWorkload` method, which is implemented on top of the new `startWorkload` primitive.
2021-10-25	perf(telemetry): Prevent allocations during collection cycle	Fabian Mastenbroek
	This change redesigns the ComputeMonitor interface to reduce the number of memory allocations necessary during a collection cycle.
2021-10-25	perf(compute): Redesign VM interference algorithm	Fabian Mastenbroek
	This change redesigns the virtual machine interference algorithm to have a fixed memory usage per `VmInterferenceModel` instance. Previously, for every interference domain, a copy of the model would be created, leading to OutOfMemory errors when running multiple experiments at the same time.
2021-10-25	fix(simulator): Compute energy usage in absence of convergence	Fabian Mastenbroek
	This change addresses an issue where energy usage was not computed correctly if the machine performed no work in between collection cycles.
2021-10-08	perf(simulator): Skip fair-share algorithm if capacity remaining	Fabian Mastenbroek
	This change updates the MaxMinFlowMultiplexer implementation to skip the fair-share algorithm in case the total demand is lower than the available capacity. In this case, no re-division of capacity is necessary.
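The shortcut can be sketched with a generic max-min fair-share routine. This is not the `MaxMinFlowMultiplexer` code: the function `max_min_share` and its list-based inputs are hypothetical and only illustrate why the division step is unnecessary when the total demand fits within the capacity.

```python
def max_min_share(demands, capacity):
    """Return the rate granted to each demand under max-min fairness."""
    # Fast path: every demand fits, so no re-division of capacity is needed.
    if sum(demands) < capacity:
        return list(demands)

    rates = [0.0] * len(demands)
    remaining = capacity
    # Serve the smallest demands first; each one gets at most an equal
    # share of whatever capacity is still left.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for pos, i in enumerate(order):
        fair = remaining / (len(order) - pos)
        rates[i] = min(demands[i], fair)
        remaining -= rates[i]
    return rates
```

With demands of 1, 4 and 4 and a capacity of 6, the smallest demand is satisfied in full and the other two split the remaining 5 equally; with a capacity of 10, the fast path returns the demands unchanged.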
2021-10-08	perf(simulator): Optimize SimTraceWorkload	Fabian Mastenbroek
	This change improves the performance of the SimTraceWorkload class by changing the way trace fragments are read and processed by the CPU consumers.
2021-10-05	perf(simulator): Manage deadlines centrally in max min mux	Fabian Mastenbroek
	This change updates the MaxMinFlowMultiplexer implementation to centrally manage the deadlines of the `FlowSource`s as opposed to each source using its own timers. For large numbers of inputs, this is much faster as the multiplexer already needs to traverse each input on a timer expiration of an input.
2021-10-05	perf(experiments): Add benchmark for Capelin experiment	Fabian Mastenbroek

2021-10-03	feat(simulator): Expose CPU time counters directly on hypervisor	Fabian Mastenbroek
	This change adds a new interface to the SimHypervisor interface that exposes the CPU time counters directly. These are derived from the flow counters and will be used by SimHost to expose them via telemetry.
2021-10-03	perf(simulator): Make convergence callback optional	Fabian Mastenbroek
	This change adds two new properties for controlling whether the convergence callbacks of the source and consumer respectively should be invoked. This saves a lot of unnecessary calls for stages that do not have any implementation of the `onConvergence` method.
2021-10-03	refactor(simulator): Create separate callbacks for remaining events	Fabian Mastenbroek
	This change creates separate callbacks for the remaining events: onStart, onStop and onConverge.
2021-10-03	refactor(simulator): Remove capacity event	Fabian Mastenbroek
	This change removes the Capacity entry from FlowEvent. Since the source is always pulled on a capacity change, we do not need a separate event for this.
2021-10-03	refactor(simulator): Separate push and pull flags	Fabian Mastenbroek
	This change separates the push and pull flags in FlowConsumerContextImpl, meaning that sources can now push directly without pulling and vice versa.
2021-10-03	refactor(simulator): Migrate to flow-based simulation	Fabian Mastenbroek
	This change renames the `opendc-simulator-resources` module into the `opendc-simulator-flow` module to indicate that the core simulation model of OpenDC is based around modelling and simulating flows. Previously, the distinction between resource consumer and provider, and input and output caused some confusion. By switching to a flow-based model, this distinction is now clear (as in, the water flows from source to consumer/sink).
2021-10-03	refactor(simulator): Merge distributor and aggregator into switch	Fabian Mastenbroek
	This change removes the distributor and aggregator interfaces in favour of a single switch interface. Since the switch interface is as powerful as both the distributor and aggregator, we don't need the latter two.
2021-10-03	refactor(simulator): Remove onUpdate callback	Fabian Mastenbroek
	This change removes the `onUpdate` callback from the `SimResourceProviderLogic` interface. Instead, users should now update counters using either `onConsume` or `onConverge`.
2021-10-03	refactor(simulator): Invoke consumer callback on every invalidation	Fabian Mastenbroek
	This change updates the simulator implementation to always invoke the `SimResourceConsumer.onNext` callback when the resource context is invalidated. This allows users to update the resource counter or do some other work if the context has changed.
2021-10-03	perf(simulator): Reduce memory allocations in SimResourceInterpreter	Fabian Mastenbroek
	This change removes unnecessary allocations in the SimResourceInterpreter caused by the way timers were allocated for the resource context.
2021-10-03	refactor(simulator): Add support for pushing flow from context	Fabian Mastenbroek
	This change adds a new method to `SimResourceContext` called `push` which allows users to change the requested flow rate directly without having to interrupt the consumer.
2021-10-03	refactor(simulator): Combine work and deadline to duration	Fabian Mastenbroek
	This change removes the work and deadline properties from the SimResourceCommand.Consume class and introduces a new property duration. This property is now used in conjunction with the limit to compute the amount of work processed by a resource provider. Previously, we used both work and deadline to compute the duration and the amount of remaining work at the end of a consumption. However, with this change, we ensure that a resource consumption always runs at the same speed once established, drastically simplifying the computation for the amount of work processed during the consumption.
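The simplification can be sketched as a single multiplication. This is an illustration under assumptions, not the OpenDC code: the function `work_processed` and the `elapsed` parameter are hypothetical, and units are left abstract.

```python
def work_processed(limit: float, duration: float, elapsed: float) -> float:
    """Work done by a consumption that runs at the constant rate `limit`.

    Because the rate does not change once the consumption is established,
    the work is the rate multiplied by the time actually spent, which is
    capped at the requested `duration`.
    """
    return limit * min(elapsed, duration)
```

A consumption with a limit of 2 and a duration of 10 that is interrupted after 4 time units has processed 8 units of work; if it runs to completion it has processed 20.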
2021-09-28	refactor(telemetry): Do not require clock for ComputeMetricExporter	Fabian Mastenbroek
	This change drops the requirement for a clock parameter when constructing a ComputeMetricExporter, since it will now derive the timestamp from the recorded metrics.
2021-09-21	feat(trace): Add support for writing traces	Fabian Mastenbroek
	This change adds a new API for writing traces in a trace format. Currently, writing is only supported by the OpenDC VM format, but over time the other formats will also have support for writing added.