| Age | Commit message (Collapse) | Author |
|
This change updates the Dockerfile for the web runner to reduce the
number of build steps necessary to build the web runner. Previously, the
build would also include/build the web API which is not used in the
image.
|
|
This change removes the OpenTelemetry integration from the OpenDC
Compute modules. Previously, we chose to integrate OpenTelemetry to
provide a unified way to report metrics to the users.
Although this worked as expected, the overhead of the OpenTelemetry when
collecting metrics during simulation was considerable and lacked more
optimization opportunities (other than providing a separate API
implementation). Furthermore, since we were tied to OpenTelemetry's SDK
implementation, we experienced issues with throttling and registering
multiple instruments.
We will instead use another approach, where we expose the core metrics
in OpenDC via specialized interfaces (see the commits before) such that
access is fast and can be done without having to interface with
OpenTelemetry. In addition, we will provide an adapter to that is able
to forward these metrics to OpenTelemetry implementations, so we can
still integrate with the wider ecosystem.
|
|
This change introduces a `ComputeMetricReader` class that can be used as
a replacement for the `CoroutineMetricReader` class when reading metrics
from the Compute service. This implementation operates directly on a
`ComputeService` instance, providing better performance.
|
|
This change updates the compute support library to load the VM
interference model via the OpenDC trace library, which provides a
generic interface for reading interference models associated with
workload traces.
|
|
This change contains a rewrite of the OpenDC web runner implementation,
which now supports terminating simulations when exceeding a deadline, as
well as executing multiple simulation jobs at the same time.
Furthermore, we have extracted the runner from the command line
interface, so that we can offer this functionality as a library in the
future.
|
|
This change updates the Docker deployment configuration for the new web
API implemented in Kotlin. The new API migrates to Postgres.
Furthermore, with this change, we move the Dockerfiles to their
corresponding module.
|
|
This change updates the web runner implementation to use the new API
client introduced in the previous commit.
|
|
This change removes the opendc-platform module from the project. This
module represented a Java platform which was previously used for sharing
a set of dependency versions between subprojects. However, with the
version catalogue that was added by Gradle, we currently do not use the
platform anymore.
|
|
This change adds support for custom audience values in the web runner.
If the audience used by the user is different from the default value
(https://api.opendc.org/v2/), then the runner fails to obtain a valid
access token for the API.
|
|
This change updates the OpenDC codebase to use OpenTelemetry v1.11,
which stabilizes the metrics API. This stabilization brings quite a few
breaking changes, so significant changes are necessary inside the OpenDC
codebase.
|
|
This change adds a new module, opendc-workflow-workload that contains
helper code for constructing workflow simulations using OpenDC.
|
|
This change redesigns the ComputeMonitor interface to reduce the number
of memory allocations necessary during a collection cycle.
|
|
This change redesigns the virtual machine interference algorithm to have
a fixed memory usage per `VmInterferenceModel` instance. Previously, for
every interference domain, a copy of the model would be created, leading
to OutOfMemory errors when running multiple experiments at the same
time.
|
|
This change renames the `opendc-simulator-resources` module into the
`opendc-simulator-flow` module to indicate that the core simulation
model of OpenDC is based around modelling and simulating flows.
Previously, the distinction between resource consumer and provider, and
input and output caused some confusion. By switching to a flow-based
model, this distinction is now clear (as in, the water flows from source
to consumer/sink).
|
|
This change drops the requirement for a clock parameter when
constructing a ComputeMetricExporter, since it will now derive the
timestamp from the recorded metrics.
|
|
This change adds a new API for writing traces in a trace format.
Currently, writing is only supported by the OpenDC VM format, but over
time the other formats will also have support for writing added.
|
|
This change updates the workload sampling implementation to be more
flexible in the way the workload is constructed. Users can now sample
multiple workloads at the same time using multiple samplers and use them
as a single workload to simulate.
|
|
This change adds support for creating flexible topologies by creating a
TopologyFactory interface that is responsible for configuring the hosts
of a compute service.
|
|
This change creates a new module for doing simulations with virtual
machine workloads. We have found that a lot of code in the Capelin
experiments code is being re-used by non-experiment modules.
|
|
This change standardizes the metrics emitted by SimHost instances and
their guests based on the OpenTelemetry semantic conventions. We now
also report CPU time as opposed to CPU work as this metric is more
commonly used.
|
|
This change updates the OpenDC compute service implementation with
multiple meters that follow the OpenTelemetry conventions.
|
|
This change refactors the telemetry implementation by creating a
separate MeterProvider per service or host. This means we have to keep
track of multiple metric producers, but that we can attach resource
information to each of the MeterProviders like we would in a real world
scenario.
|
|
This change moves the fault injection logic directly into the
opendc-compute-simulator module, so that it can operate at a higher
abstraction. In the future, we might again split the module if we can
re-use some of its logic.
|
|
|
|
This change addresses an incompatibility issue with the kotlin-reflect
transitive dependency in the opendc-web-runner module.
|
|
This change moves the metric collection outside the Capelin codebase in
a separate module so other modules can also benefit from the compute
metric collection code.
|
|
|
|
This change updates the WTF trace reader to support the new streaming
trace API.
|
|
This change removes the environment reader from the format library since
they are highly specific for the particular experiment. In the future,
we hope to have a single format to setup the entire datacenter (perhaps
similar to the format used by the web runner).
|
|
This change eliminates the unnecessary conversions from double to long
in the Capelin metric processing code.
|
|
This change upgrades the OpenTelemetry dependency to version 1.5, which
contains various breaking changes in the metrics API.
|
|
This change updates the FilterScheduler implementation to follow more
closely the scheduler implementation in OpenStack's Nova. We now
normalize the weights, support many of the filters and weights in
OpenStack and support overcommitting resources.
|
|
This change fixes an issue where the topology generated by the frontend
was not accepted by the API server.
|
|
This change updates the web runner to not require direct database access
for scheduling simulation jobs. Instead, the runner polls the public
REST API for available jobs and reports its results through there.
|
|
This change updates reimplements the performance interference model to
work on top of the universal resource model in
`opendc-simulator-resources`. This enables us to model interference and
performance variability of other resources such as disk or network in
the future.
|
|
This change updates the trace reader implementation to remove their
dependency on the performance interference model. In a future commit, we
will instead pass the performance interference model via the
host/hypervisor.
|
|
This change re-organizes the classes of the compute simulator module to
make a clearer distinction between the hardware, firmware and software
interfaces in this module.
|
|
This change addresses the deprecations that were caused by the migration
to Kotlin 1.5.
|
|
|
|
This change adds support for the Gradle version catalog feature in our
build configuration. This allows us to have a single file,
gradle/libs.versions.toml, which contains all the dependency versions
used in this project.
|
|
This change updates the Gradle build configuration so that the
application modules (as opposed the libraries) are not published onto
Maven Central.
|
|
This change updates the build scripts to use type-safe project accessors
when specifying build dependencies between modules.
|
|
This change updates the project structure to become flattened.
Previously, the simulator, frontend and API each lived into their own
directory.
With this change, all modules of the project live in the top-level
directory of the repository. This should improve discoverability of
modules of the project.
|