| Age | Commit message | Author |
|
This change adds a new API for writing traces in a trace format.
Currently, writing is only supported by the OpenDC VM format, but over
time the other formats will also have support for writing added.
|
|
This change simplifies the TraceFormat SPI interface by reducing the
number of interfaces that implementors need to implement to only
TraceFormat.
|
|
|
|
This change updates the ComputeWorkloadLoader to look up columns by
index, which avoids resolving the column index for every row.
|
|
This change adds support for looking up the column value through the
column index. This enables faster lookup when processing very large
traces.
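As a rough illustration of why index-based lookup helps, the sketch below resolves a column name to its position once and then reads every row by position. `RowReader`, `resolve`, and `sumColumn` are hypothetical names invented for this example and are not the OpenDC trace API.

```kotlin
// Hypothetical, simplified stand-in for a trace table reader; not the OpenDC API.
class RowReader(private val columns: List<String>, private val rows: List<List<Any>>) {
    private var current = -1

    /** Advance to the next row; returns false when the table is exhausted. */
    fun nextRow(): Boolean = ++current < rows.size

    /** Resolve a column name to its position once, up front. Returns -1 if absent. */
    fun resolve(name: String): Int = columns.indexOf(name)

    /** Read a value by position, so no name lookup happens on the hot path. */
    fun get(index: Int): Any = rows[current][index]
}

/** Sum a Long-valued column, resolving its index a single time before the loop. */
fun sumColumn(reader: RowReader, column: String): Long {
    val index = reader.resolve(column)
    require(index >= 0) { "Unknown column: $column" }
    var total = 0L
    while (reader.nextRow()) {
        total += reader.get(index) as Long
    }
    return total
}
```

The per-row cost is then a single list access, regardless of how many columns the table defines.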
|
|
This change unifies columns of different tables used by trace formats.
Concretely, instead of defining columns per table (e.g., RESOURCE_ID
and RESOURCE_STATE_ID), these columns are now shared between the tables
through a single definition (RESOURCE_ID).
|
|
This pull request enables re-use of virtual machine workload helpers by extracting the helpers into a
separate module which may be used by other experiments.
- Support workload/machine CPU count mismatch
- Extract common code out of Capelin experiments
- Support flexible topology creation
- Add option for optimizing SimHost simulation
- Support creating CPU-optimized topology
- Make workload sampling model extensible
- Add support for extended Bitbrains trace format
- Add support for Azure VM trace format
- Add support for internal OpenDC VM trace format
- Optimize OpenDC VM trace format
- Add tool for converting workload traces
- Remove dependency on SnakeYaml
**Breaking API Changes**
- `RESOURCE_NCPU` and `RESOURCE_STATE_NCPU` are renamed to `RESOURCE_CPU_COUNT` and `RESOURCE_STATE_CPU_COUNT` respectively.
|
|
This change removes the dependency on SnakeYaml for the simulator. It
was only required for a very small component of the simulator and
therefore does not justify bringing in such a dependency.
|
|
This change adds an initial implementation to the trace library for
converting between workload trace formats. Currently the tool supports
only converting to the OpenDC VM trace format. However, in the future,
we will add support for converting between other formats as well.
|
|
This change optimizes the OpenDC VM trace format by removing
unnecessary columns as well as optimizing the writer settings.
The new implementation still supports reading the old trace format in
case users run OpenDC with older workload traces.
|
|
This change adds official support to the trace library for the internal
VM trace format used by OpenDC for its experiments. This is a compact
format that uses Parquet to store the virtual machine trace data in two
Parquet files.
|
|
This change adds support in the trace library for the Azure VM trace
format.
|
|
This change adds support in the trace library for the extended Bitbrains
format. This format differs slightly from the CSV format used by
the original Bitbrains traces and contains more fields.
|
|
This change updates the workload sampling implementation to be more
flexible in the way the workload is constructed. Users can now sample
multiple workloads at the same time using multiple samplers and use them
as a single workload to simulate.
|
|
This change adds support for creating a topology that is CPU-optimized
for simulation. This means that all the CPU resources of a machine are
merged into a single large CPU in order to reduce simulation time.
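A minimal sketch of what such a merge could look like is shown below. It assumes that the merged CPU keeps the machine's total capacity (the sum of core count times frequency over all CPUs); the `Cpu` type and `mergeCpus` function are hypothetical and not taken from OpenDC.

```kotlin
// Hypothetical CPU model for illustration; the names are not taken from OpenDC.
data class Cpu(val coreCount: Int, val frequencyMHz: Double)

/**
 * Collapse all CPUs of a machine into one single-core CPU whose frequency
 * equals the machine's total capacity (sum of cores * frequency).
 * Assumption: total capacity is the quantity that must be preserved.
 */
fun mergeCpus(cpus: List<Cpu>): Cpu {
    require(cpus.isNotEmpty()) { "A machine needs at least one CPU" }
    val totalCapacity = cpus.sumOf { it.coreCount * it.frequencyMHz }
    return Cpu(coreCount = 1, frequencyMHz = totalCapacity)
}
```

With fewer CPU objects per machine, the simulator has fewer resources to schedule and update, which is where the reduction in simulation time would come from.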
|
|
This change adds an option for optimizing SimHost simulation by
combining all the CPUs of a machine into a single large CPU. For most
workloads, this does not significantly affect the simulation results,
but it substantially reduces the simulation time.
|
|
This change adds support for creating flexible topologies by creating a
TopologyFactory interface that is responsible for configuring the hosts
of a compute service.
|
|
This change creates a new module for doing simulations with virtual
machine workloads. We have found that a lot of code from the Capelin
experiments is being re-used by non-experiment modules.
|
|
This change allows workloads that require more CPUs than available on
the machine to still function properly.
|
|
This pull request standardizes the metrics emitted by the simulator based on OpenTelemetry conventions.
From now on, all metrics exposed by the simulator are exported through OpenTelemetry
following the recommended practices for naming, collection, etc.
**Implementation Notes**
- Improve ParquetDataWriter implementation
- Simplify CoroutineMetricReader
- Create separate MeterProvider per service/host
- Standardize compute scheduler metrics
- Standardize SimHost metrics
- Use logical types for Parquet output columns
**External Dependencies**
- Update to OpenTelemetry 1.6.0
**Breaking API Changes**
- Instead of receiving a `Meter` instance, key classes are now responsible for constructing
one from the supplied `MeterProvider`.
- Export format has been changed to suit the outputted metrics
- Energy experiments shell has been removed
|
|
This change updates the output schema for the experiment data to use
logical types where possible. This adds additional context for the
writer and the reader on how to process the column (efficiently).
|
|
This change standardizes the metrics emitted by SimHost instances and
their guests based on the OpenTelemetry semantic conventions. We now
also report CPU time as opposed to CPU work as this metric is more
commonly used.
|
|
This change updates the OpenDC compute service implementation with
multiple meters that follow the OpenTelemetry conventions.
|
|
This change refactors the telemetry implementation by creating a
separate MeterProvider per service or host. This means we have to keep
track of multiple metric producers, but that we can attach resource
information to each of the MeterProviders like we would in a real world
scenario.
|
|
This change simplifies the CoroutineMetricReader implementation by
removing the separation of reader and exporter jobs.
|
|
This change improves the ParquetDataWriter class to support more complex
use-cases. It now allows subclasses to modify the writer options.
In addition to this, a subclass for writing server data is added.
|
|
This change removes the energy experiments. They only provided a setup
for the original experiments and cannot reproduce the results without
further work.
|
|
This change updates the opentelemetry-java library to version 1.6.0.
|
|
This pull request updates the trace API with the addition of several new trace formats.
- Add support for Materna traces from GWA
- Keep reader state in own class
- Parse last column in Solvinity trace format
- Add support for Azure VM traces
- Add support for WfCommons (WorkflowHub) traces
- Add API for accessing available table columns
- Add synthetic resource table for Bitbrains format
- Support dynamic resolving of trace formats
**Breaking API Changes**
- Replace `isSupported` by a list of `TableColumns`
|
|
This change enables users to open traces of various trace formats by
dynamically specifying the format name. The trace API will use the
service loader to resolve the available trace formats on the classpath.
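The lookup described here can be sketched with `java.util.ServiceLoader` as follows. The `NamedFormat` interface and the `findFormat` and `openFormat` functions are hypothetical placeholders for this example; the actual TraceFormat SPI may be shaped differently.

```kotlin
import java.util.ServiceLoader

// Hypothetical SPI for illustration; the real TraceFormat interface may differ.
interface NamedFormat {
    val name: String
}

/** Pick the first provider whose name matches, or null if none is available. */
fun findFormat(providers: Iterable<NamedFormat>, name: String): NamedFormat? =
    providers.firstOrNull { it.name == name }

/** Resolve a format by name from the providers registered on the classpath. */
fun openFormat(name: String): NamedFormat =
    findFormat(ServiceLoader.load(NamedFormat::class.java), name)
        ?: throw IllegalArgumentException("Unknown trace format: $name")
```

Providers would be registered through a `META-INF/services` entry named after the interface, so adding a format only requires putting its module on the classpath.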
|
|
This change adds a synthetic resource table for the Bitbrains format,
which can be used to list the available partitions in the trace.
|
|
This change adds a new API to the Table interface for accessing the
table columns that the table supports. This does not necessarily mean
that the column will have a value for every row, but that the table
format has defined this particular column.
|
|
This change adds support for reading WfCommons workflow traces in
OpenDC. This functionality is available in the new
`opendc-trace-wfformat` module.
|
|
This change adds a trace API implementation for the Azure VM traces.
|
|
This change fixes an issue where the last column in the Solvinity traces
is not parsed correctly, due to the last column having no whitespace at
the end to seek to.
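The class of bug described here can be illustrated with a small whitespace-delimited parser that treats the end of the line as a column boundary too. This is a simplified sketch written for this example (`splitColumns` is a hypothetical helper), not the actual Solvinity reader.

```kotlin
/**
 * Split a whitespace-separated line into columns by seeking to the next
 * whitespace character. Reaching the end of the line also ends a column,
 * so a final column without trailing whitespace is not dropped.
 */
fun splitColumns(line: String): List<String> {
    val columns = mutableListOf<String>()
    var start = 0
    while (start < line.length) {
        // Skip the whitespace run that separates columns.
        while (start < line.length && line[start].isWhitespace()) start++
        if (start == line.length) break
        // Seek to the next whitespace, or to the end of the line if there is none.
        var end = start
        while (end < line.length && !line[end].isWhitespace()) end++
        columns += line.substring(start, end)
        start = end
    }
    return columns
}
```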
|
|
This change removes the external class that holds the state of the
reader and instead puts the state in the reader implementation.
Maintaining a separate class for the state increases the complexity and
has worse performance characteristics due to the bytecode produced by
Kotlin for property accesses.
|
|
This change adds support for the Materna traces from the Grid Workload
Trace Archive (GWA). These traces are very similar to the Bitbrains
traces, so they share the same base implementation.
|
|
This pull request integrates the fault injector into the `opendc-compute-simulator` module,
where it is now specialized to inject faults into `SimHost` instances.
The previous fault injector implementation supported generic targets,
but this caused the implementation to be more complex.
Since the fault injector was only used for `SimHost` instances,
we have decided to specialize it to `SimHost` for now.
- Support generic distribution in fault injector
- Terminate servers after reaching deadline
- Integrate fault injection into compute simulator
- Clarify terminology in compute service
**External Dependencies**
- Apache commons-math3
**Breaking API Changes**
- Removal of `opendc-simulator-failures` and its corresponding interfaces/classes.
|
|
|
|
|
|
This change moves the fault injection logic directly into the
opendc-compute-simulator module, so that it can operate at a higher
abstraction. In the future, we might again split the module if we can
re-use some of its logic.
|
|
This change updates the Capelin experiment helpers to terminate a server
when it has reached its end-date.
|
|
This change adds support for specifying the distribution of the
failures, group size and duration for the fault injector.
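One way to picture pluggable distributions is the sketch below, where the time between failures, the group size, and the duration are each drawn from an independently chosen distribution. All types and functions here (`Distribution`, `exponential`, `constant`, `Fault`, `nextFault`) are hypothetical and use only the Kotlin standard library; the actual fault injector may rely on Apache commons-math3 instead.

```kotlin
import kotlin.math.ln
import kotlin.random.Random

// Hypothetical types for illustration; they do not mirror the OpenDC fault injector API.
fun interface Distribution {
    fun sample(random: Random): Double
}

/** Exponential distribution sampled by inverse transform: -mean * ln(1 - U). */
fun exponential(mean: Double) = Distribution { random -> -mean * ln(1.0 - random.nextDouble()) }

/** Degenerate distribution that always yields the same value. */
fun constant(value: Double) = Distribution { value }

data class Fault(val delay: Double, val groupSize: Int, val duration: Double)

/** Draw the next fault from three independently configured distributions. */
fun nextFault(
    random: Random,
    interArrival: Distribution,
    groupSize: Distribution,
    duration: Distribution,
): Fault = Fault(
    delay = interArrival.sample(random),
    // A fault always affects at least one host.
    groupSize = groupSize.sample(random).toInt().coerceAtLeast(1),
    duration = duration.sample(random),
)
```

For example, `nextFault(Random(0), exponential(10.0), constant(3.0), constant(60.0))` would model failures that arrive at random intervals but always hit three hosts for a fixed duration.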
|
|
This pull request addresses a couple of deployment issues that have been reported.
|
|
This change updates the Docker Compose configuration to default to the
available public images for OpenDC, in order to remove the requirement
for building OpenDC locally.
|
|
This change updates the Docker Compose configuration to not warn the
user when they have not specified any Sentry configuration. Since Sentry
is optional, the user should not be presented with warnings.
|
|
This change fixes an issue where local traces are not correctly detected
due to Docker mounting the local traces in the incorrect directory.
|
|
This change updates the dependencies of the OpenDC frontend module.
|
|
This pull request adds the necessary code in preparation for the risk analysis experiments:
- Track provisioning time
- Track host up/down time
- Track guest up/down time
- Support overcommitted memory
- Do not fail inactive guests
- Mark unschedulable server as terminated
- Make ExperimentMonitor optional for trace processing
- Report up/downtime metrics in experiment monitor
- Move metric collection outside Capelin code
- Resolve kotlin-reflect incompatibility
- Restructure input reading classes
**Breaking API Changes**
- `ExperimentMonitor` is replaced by `ComputeMonitor`
|
|
|