| Age | Commit message (Collapse) | Author |
|
This change adds a new module `opendc-experiments-compute` that provides
provisioner implementations for experiments to use for setting up the
compute service of OpenDC and provisioning (simulated) hosts.
|
|
This change updates the interface of `ComputeService` to provide access
to the instances (servers) that have been registered with the compute
service. This allows metric collectors to query the metrics of the
servers that are currently running.
|
|
This change updates the `ComputeServiceHelper` class to provide the
failure model via a parameter to the `run` method instead of constructor
parameter. This separates the construction of the topology from the
simulation of the workload.
|
|
This change simplifies the SimHypervisor class into a single
implementation. Previously, it was implemented as an abstract class with
multiple implementations for each multiplexer type. We now pass the
multiplexer type as parameter to the SimHypervisor constructor.
|
|
This change updates the virtual machine performance interference model
so that the interference domain can be constructed independently of the
interference profile. As a consequence, the construction of the topology
now does not depend anymore on the interference profile.
|
|
This change updates the constructor of SimHost to receive a
`SimBareMetalMachine` and `SimHypervisor` directly instead of
construction these objects itself. This ensures better testability and
also simplifies the constructor of this class, especially when future
changes to `SimBareMetalMachine` or `SimHypervisor` change their
constructors.
|
|
This change adds a new HostState to indicate that the host is in an
error state as opposed to being purposefully unavailable.
|
|
This change moves the Random dependency outside the interference model,
to allow the interference model to be completely immutable and passable
between different simulations.
|
|
This change updates the design of the VM interference model, where we
move more of the logic into the `VmInterferenceMember` interface. This
removes the dependency on the `VmInterferenceModel` for the hypervisor
interface.
|
|
This change updates the signature of the `SimHypervisor` interface to
accept a `VmInterferenceKey` when creating a new virtual machine,
instead of providing a string identifier. This is in preparation for
removing the dependency on the `VmInterferenceModel` in the
`SimAbstractHypervisor` class.
|
|
This change removes the timestamp parameter from `SimTrace`. Instead, it
is now assumed that the trace is continuous and the end of a fragment
starts a new fragment, in order to simplify replaying of the trace.
|
|
This change adds support for (anti-)affinity scheduling of servers onto hosts,
which happens at the compute service level.
In the future, we might add support for server groups, which also enables
soft (anti-)affinity scheduling.
Implements #26
## Implementation Notes :hammer_and_pick:
* Add `DifferentHostFilter` to schedule instances on different hosts from a set of instances.
* Add `SameHostFilter` to schedule instances on the same hosts as a set of instances.
|
|
This change fixes an issue with the metric exporting code in OpenDC
where a UUID is not converted correctly into a `Binary` object that is
consumed by the Apache Parquet library.
|
|
This change updates the trace API by introducing a limited type system
for the table columns. Previously, the table columns could have any
possible type representable by the JVM. With this change, we limit the
available types to a small type system.
|
|
This change removes the dependency on the OpenTelemetry SDK. Instead,
we'll only expose metrics via the OpenTelemetry API in the future via
adapter classes.
|
|
This change removes the OpenTelemetry integration from the OpenDC
Compute modules. Previously, we chose to integrate OpenTelemetry to
provide a unified way to report metrics to the users.
Although this worked as expected, the overhead of the OpenTelemetry when
collecting metrics during simulation was considerable and lacked more
optimization opportunities (other than providing a separate API
implementation). Furthermore, since we were tied to OpenTelemetry's SDK
implementation, we experienced issues with throttling and registering
multiple instruments.
We will instead use another approach, where we expose the core metrics
in OpenDC via specialized interfaces (see the commits before) such that
access is fast and can be done without having to interface with
OpenTelemetry. In addition, we will provide an adapter to that is able
to forward these metrics to OpenTelemetry implementations, so we can
still integrate with the wider ecosystem.
|
|
This change introduces a `ComputeMetricReader` class that can be used as
a replacement for the `CoroutineMetricReader` class when reading metrics
from the Compute service. This implementation operates directly on a
`ComputeService` instance, providing better performance.
|
|
This change updates the `ComputeService` interface to directly expose
statistics about the scheduler to the user, such that they do not
necessarily have to interact with OpenTelemetry to obtain these values.
|
|
This change adds the ability for users to lookup the `Host` on which a
`Server` is hosted (if any). This allows the user to potentially
interact with the `Host` directly, e.g., in order to obtain advanced
metrics.
|
|
This change updates the `Host` interface to directly expose CPU and
system stats to be used by components that interface with the `Host`
interface.
Previously, this would require the user to interact with the OpenTelemetry SDK.
Although that is still possible for more advanced usage cases, users can
use the following methods to easily access common host and guest
statistics.
|
|
This change updates the `ParquetDataWriter` class to not use the
`parquet-avro` library for exporting experiment data, but instead to use
the low-level APIs to directly write the data in Parquet format.
|
|
This change updates the `LocalParquetReader` implementation to support
custom `ReadSupport` implementations, so we do not have to rely on the
Avro implementation necessarily.
|
|
This change updates the Gradle build configuration of the project to
publish the different type of modules (e.g., opendc-compute,
opendc-simulator) into their own groups.
|
|
This change updates the Gradle build configuration to ensure that all
library modules (that will be published) use testing and are included in
coverage reports. This should ensure the public modules remain well
tested.
|
|
This change moves the trace conventions (such as table and column names)
in a separate conv package, so that it is separated from the main API.
This also allows for a potential move into a separate module in the
future.
|
|
This change updates the compute support library to load the VM
interference model via the OpenDC trace library, which provides a
generic interface for reading interference models associated with
workload traces.
|
|
This change updates the simulator implementation to flush the active
progress when accessing the hypervisor counters. Previously, if the
counters were accessed, while the mux or consumer was in progress, its
counter values were not accurate.
|
|
This change fixes an issue with the ComputeServiceHelper where it
allowed users to register multiple SimHost objects with the same UID.
See this issue for more information:
https://github.com/atlarge-research/opendc/issues/51
|
|
This change removes the opendc-platform module from the project. This
module represented a Java platform which was previously used for sharing
a set of dependency versions between subprojects. However, with the
version catalogue that was added by Gradle, we currently do not use the
platform anymore.
|
|
This change adds a new module, opendc-common, that contains
functionality that is shared across OpenDC's modules.
We move the existing utils module into this new module.
|
|
This change adds a new Pacer class that can pace the incoming scheduling
requests into scheduling cycles by allowing the user to specify a
scheduling quantum.
|
|
This change updates the OpenDC codebase to use OpenTelemetry v1.11,
which stabilizes the metrics API. This stabilization brings quite a few
breaking changes, so significant changes are necessary inside the OpenDC
codebase.
|
|
This change adds a new module, opendc-workflow-workload that contains
helper code for constructing workflow simulations using OpenDC.
|
|
This change updates the implementation of the trace converter and
SimTrace implementation to support cases where there is a gap between
samples in the trace data.
This change allows users to specify what to do in case samples are
missing in the trace. The available options are specified in
`SimTrace.FillMode`. Currently, we support either carrying the previous
value forward or set the usage to zero.
|
|
This change updates the SimMachine interface to drop the coroutine
requirement for running a workload on a machines. Users can now
asynchronously start a workload and receive notifications via the
workload callbacks.
Users still have the possibility to suspend execution during workload
execution by using the new `runWorkload` method, which is implemented on
top of the new `startWorkload` primitive.
|
|
This change redesigns the ComputeMonitor interface to reduce the number
of memory allocations necessary during a collection cycle.
|
|
This change adds support for collecting the provisioning time of virtual
machines in addition to their boot time.
|
|
This change redesigns the virtual machine interference algorithm to have
a fixed memory usage per `VmInterferenceModel` instance. Previously, for
every interference domain, a copy of the model would be created, leading
to OutOfMemory errors when running multiple experiments at the same
time.
|
|
This change allows users to create servers with a smaller CPU capacity
than the host, by specifying the CPU capacity via metadata. This also
allows filtering hosts based on their available CPU capacity.
|
|
This change improves the performance of the SimTraceWorkload class by
changing the way trace fragments are read and processed by the CPU
consumers.
|
|
This change optimizes the telemetry collection in the SimHost class.
Previously, there was significant overhead in collecting the metrics of
this and associated classes due large `Attributes` object that did not
cache accesses to `hashCode()`. We now wrap this object and manually
cache the hash code.
|
|
This change adds a new interface to the SimHypervisor interface that
exposes the CPU time counters directly. These are derived from the flow
counters and will be used by SimHost to expose them via telemetry.
|
|
This change renames the `opendc-simulator-resources` module into the
`opendc-simulator-flow` module to indicate that the core simulation
model of OpenDC is based around modelling and simulating flows.
Previously, the distinction between resource consumer and provider, and
input and output caused some confusion. By switching to a flow-based
model, this distinction is now clear (as in, the water flows from source
to consumer/sink).
|
|
This change removes the distributor and aggregator interfaces in favour
of a single switch interface. Since the switch interface is as powerful
as both the distributor and aggregator, we don't need the latter two.
|
|
|
|
|
|
This change drops the requirement for a clock parameter when
constructing a ComputeMetricExporter, since it will now derive the
timestamp from the recorded metrics.
|
|
This change adds a new API for writing traces in a trace format.
Currently, writing is only supported by the OpenDC VM format, but over
time the other formats will also have support for writing added.
|
|
This change simplifies the TraceFormat SPI interface by reducing the
number of interfaces that implementors need to implement to only
TraceFormat.
|
|
This change updates the ComputeWorkloadLoader to use index column
lookups in order to prevent having to lookup the index for every row.
|