Age Commit message Author
2021-09-19 refactor(capelin): Extract common code out of Capelin experiments Fabian Mastenbroek
This change creates a new module for doing simulations with virtual machine workloads. We have found that a lot of code in the Capelin experiments code is being re-used by non-experiment modules.
2021-09-17 fix(simulator): Support workload/machine CPU count mismatch Fabian Mastenbroek
This change allows workloads that require more CPUs than available on the machine to still function properly.
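One way such a mismatch could be handled is to fold the workload's per-CPU demand onto the CPUs the machine actually has. The sketch below illustrates that idea only; it is not taken from the OpenDC code. The function `fold_demand`, its round-robin assignment and the per-CPU cap are all assumptions made for illustration.

```python
def fold_demand(vcpu_demand, machine_cpus, capacity):
    """Fold per-vCPU demand (in MHz) onto `machine_cpus` physical CPUs.

    Hypothetical policy: vCPU i is assigned to physical CPU i % machine_cpus,
    and the load of each physical CPU is capped at `capacity` MHz.
    """
    if machine_cpus <= 0:
        raise ValueError("machine must have at least one CPU")
    load = [0.0] * machine_cpus
    for i, demand in enumerate(vcpu_demand):
        load[i % machine_cpus] += demand
    # Demand beyond the capacity of a CPU cannot be served and is clipped.
    return [min(cpu_load, capacity) for cpu_load in load]


# A workload with four vCPUs running on a machine with only two CPUs.
print(fold_demand([1000.0, 1000.0, 500.0, 500.0], machine_cpus=2, capacity=2400.0))
```

With this policy the workload keeps running on the smaller machine, at the cost of clipped demand once a physical CPU saturates.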
2021-09-17 merge: Standardize simulator metrics Fabian Mastenbroek
This pull request standardizes the metrics emitted by the simulator based on OpenTelemetry conventions. From now on, all metrics exposed by the simulator are exported through OpenTelemetry following the recommended practices for naming, collection, etc.

**Implementation Notes**
- Improve ParquetDataWriter implementation
- Simplify CoroutineMetricReader
- Create separate MeterProvider per service/host
- Standardize compute scheduler metrics
- Standardize SimHost metrics
- Use logical types for Parquet output columns

**External Dependencies**
- Update to OpenTelemetry 1.6.0

**Breaking API Changes**
- Instead of being supplied with a `Meter` instance, key classes are now responsible for constructing a `Meter` instance from the supplied `MeterProvider`.
- Export format has been changed to suit the outputted metrics
- Energy experiments shell has been removed
2021-09-17 feat(capelin): Use logical types for Parquet output columns Fabian Mastenbroek
This change updates the output schema for the experiment data to use logical types where possible. This adds additional context for the writer and the reader on how to process the column (efficiently).
2021-09-17 refactor(telemetry): Standardize SimHost metrics Fabian Mastenbroek
This change standardizes the metrics emitted by SimHost instances and their guests based on the OpenTelemetry semantic conventions. We now also report CPU time as opposed to CPU work as this metric is more commonly used.
2021-09-17 refactor(telemetry): Standardize compute scheduler metrics Fabian Mastenbroek
This change updates the OpenDC compute service implementation with multiple meters that follow the OpenTelemetry conventions.
2021-09-17 refactor(telemetry): Create separate MeterProvider per service/host Fabian Mastenbroek
This change refactors the telemetry implementation by creating a separate MeterProvider per service or host. This means we have to keep track of multiple metric producers, but in return we can attach resource information to each MeterProvider, as we would in a real-world scenario.
2021-09-17 refactor(telemetry): Simplify CoroutineMetricReader Fabian Mastenbroek
This change simplifies the CoroutineMetricReader implementation by removing the separation of reader and exporter jobs.
2021-09-17 refactor(capelin): Improve ParquetDataWriter implementation Fabian Mastenbroek
This change improves the ParquetDataWriter class to support more complex use-cases. It now allows subclasses to modify the writer options. In addition to this, a subclass for writing server data is added.
2021-09-17 refactor(experiments): Remove energy experiments shell Fabian Mastenbroek
This change removes the energy experiments. The experiments only provided a setup for the original experiments and are not able to reproduce the results without further work.
2021-09-17 build(telemetry): Update to OpenTelemetry 1.6.0 Fabian Mastenbroek
This change updates the opentelemetry-java library to version 1.6.0.
2021-09-12 merge: Add support for new trace formats Fabian Mastenbroek
This pull request updates the trace API with the addition of several new trace formats.
- Add support for Materna traces from GWA
- Keep reader state in own class
- Parse last column in Solvinity trace format
- Add support for Azure VM traces
- Add support for WfCommons (WorkflowHub) traces
- Add API for accessing available table columns
- Add synthetic resource table for Bitbrains format
- Support dynamic resolving of trace formats

**Breaking API Changes**
- Replace `isSupported` by a list of `TableColumns`
2021-09-12 feat(trace): Support dynamic resolving of trace formats Fabian Mastenbroek
This change enables users to open traces of various trace formats by dynamically specifying the format name. The trace API will use the service loader to resolve the available trace formats on the classpath.
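The lookup-by-name idea can be sketched as a small registry. The commit states that the trace API resolves formats through the service loader on the classpath; the Python sketch below only imitates that with an in-process dictionary. All names in it (`TraceFormat`, `register`, `open_trace`, `BitbrainsFormat`) are hypothetical and are not the OpenDC API.

```python
class TraceFormat:
    """Hypothetical base class for a trace format."""

    name = ""

    def open(self, path):
        raise NotImplementedError


# Stand-in for the set of providers a service loader would discover.
_FORMATS = {}


def register(format_cls):
    """Class decorator that makes a format resolvable by its name."""
    _FORMATS[format_cls.name] = format_cls
    return format_cls


@register
class BitbrainsFormat(TraceFormat):
    name = "bitbrains"

    def open(self, path):
        return f"bitbrains trace at {path}"


def open_trace(path, format_name):
    """Resolve the format by name at run time and open the trace with it."""
    try:
        trace_format = _FORMATS[format_name]()
    except KeyError:
        raise ValueError(f"unknown trace format: {format_name}") from None
    return trace_format.open(path)


print(open_trace("traces/sample", "bitbrains"))
```

Callers depend only on the format name, so new formats can be added without changing the code that opens traces.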
2021-09-12 feat(trace): Add synthetic resource table for Bitbrains format Fabian Mastenbroek
This change adds a synthetic resource table for the Bitbrains format, which can be used to list the available partitions in the trace.
2021-09-12 refactor(trace): Add API for accessing available table columns Fabian Mastenbroek
This change adds a new API to the Table interface for accessing the table columns that the table supports. This does not necessarily mean that the column will have a value for every row, but that the table format has defined this particular column.
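The distinction between a column being defined by the format and a column having a value in every row can be illustrated with a small stand-in. The `Table` class and the column names below are hypothetical and do not mirror the real interface.

```python
class Table:
    """Hypothetical stand-in for a trace table that declares its columns."""

    def __init__(self, name, columns, rows):
        self.name = name
        self.columns = tuple(columns)  # columns the table format defines
        self._rows = list(rows)

    def read(self):
        # Every row exposes every declared column; missing values become None.
        for row in self._rows:
            yield {column: row.get(column) for column in self.columns}


resources = Table(
    "resources",
    columns=["id", "cpu_count", "mem_capacity"],
    rows=[{"id": "vm-1", "cpu_count": 4}, {"id": "vm-2", "mem_capacity": 2048}],
)

# The column is declared by the format, yet the second row has no value for it.
print("cpu_count" in resources.columns)
print([row["cpu_count"] for row in resources.read()])
```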
2021-09-11 feat(trace): Add support for WfCommons (WorkflowHub) traces Fabian Mastenbroek
This change adds support for reading WfCommons workflow traces in OpenDC. This functionality is available in the new `opendc-trace-wfformat` module.
2021-09-11 feat(capelin): Implement trace API for Azure VM trace format Fabian Mastenbroek
This change adds a trace API implementation for the Azure VM traces.
2021-09-11 fix(capelin): Parse last column in Solvinity trace format Fabian Mastenbroek
This change fixes an issue where the last column in the Solvinity traces is not parsed correctly, due to the last column having no whitespace at the end to seek to.
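This class of bug can be shown with a hand-written splitter for whitespace-delimited records. The sketch below is an illustration, not the Solvinity reader itself; `split_columns` is a hypothetical helper. The key point is that reaching the end of the line must also terminate a column.

```python
def split_columns(line):
    """Split a whitespace-delimited record by seeking to the next whitespace."""
    columns = []
    pos, length = 0, len(line)
    while pos < length:
        # Skip the whitespace that precedes the next column.
        while pos < length and line[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Seek to the next whitespace character OR the end of the line.
        end = pos
        while end < length and not line[end].isspace():
            end += 1
        # When end == length the last column has no trailing whitespace;
        # it is still a complete column and must be emitted.
        columns.append(line[pos:end])
        pos = end
    return columns


print(split_columns("1376314846 vm-1  2500"))
```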
2021-09-11 perf(trace): Keep reader state in own class Fabian Mastenbroek
This change removes the external class that holds the state of the reader and instead puts the state in the reader implementation. Maintaining a separate class for the state increases the complexity and has worse performance characteristics due to the bytecode produced by Kotlin for property accesses.
2021-09-10 feat(trace): Support Materna traces from GWA Fabian Mastenbroek
This change adds support for the Materna traces from the Grid Workload Trace Archive (GWA). These traces are very similar to the Bitbrains traces, so they share the same base implementation.
2021-09-10 merge: Integrate fault injector into compute simulator Fabian Mastenbroek
This pull request integrates the fault injector into the `opendc-compute-simulator` module, where it is now specialized to inject faults into `SimHost` instances. The previous fault injector implementation supported generic targets, but this caused the implementation to be more complex. Since the fault injector was only used for `SimHost` instances, we have decided to specialize it to `SimHost` for now.
- Support generic distribution in fault injector
- Terminate servers after reaching deadline
- Integrate fault injection into compute simulator
- Clarify terminology in compute service

**External Dependencies**
- Apache commons-math3

**Breaking API Changes**
- Removal of `opendc-simulator-failures` and its corresponding interfaces/classes.
2021-09-10 test(compute): Add test suite for fault injector Fabian Mastenbroek
2021-09-10 docs(compute): Clarify terminology in compute service Fabian Mastenbroek
2021-09-10 refactor(compute): Integrate fault injection into compute simulator Fabian Mastenbroek
This change moves the fault injection logic directly into the opendc-compute-simulator module, so that it can operate at a higher abstraction. In the future, we might again split the module if we can re-use some of its logic.
2021-09-10 refactor(capelin): Terminate servers after reaching deadline Fabian Mastenbroek
This change updates the Capelin experiment helpers to terminate a server when it has reached its end-date.
2021-09-10 feat(simulator): Support generic distribution in fault injector Fabian Mastenbroek
This change adds support for specifying the distribution of the failures, group size and duration for the fault injector.
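A fault schedule driven by three pluggable distributions might look like the sketch below. It is illustrative only: `generate_faults` and its parameters are hypothetical, and the distributions are plain callables on Python's `random.Random` rather than the distribution classes used by the simulator.

```python
import random


def generate_faults(rng, inter_arrival, group_size, duration, horizon):
    """Return (start, hosts_affected, duration) tuples up to `horizon`.

    Each of `inter_arrival`, `group_size` and `duration` is a callable that
    draws one sample from its distribution using the supplied `rng`.
    `inter_arrival` is assumed to return positive values.
    """
    now = 0.0
    events = []
    while True:
        now += inter_arrival(rng)
        if now >= horizon:
            break
        size = max(1, round(group_size(rng)))  # at least one host fails
        events.append((now, size, duration(rng)))
    return events


rng = random.Random(42)  # fixed seed so the schedule is reproducible
faults = generate_faults(
    rng,
    inter_arrival=lambda r: r.expovariate(1 / 3600.0),  # mean of one hour
    group_size=lambda r: r.lognormvariate(0.5, 0.5),
    duration=lambda r: r.lognormvariate(6.0, 1.0),
    horizon=24 * 3600.0,
)
print(len(faults))
```

Because the distributions are passed in as arguments, swapping an exponential inter-arrival time for any other distribution does not require touching the generator.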
2021-09-10 merge: Address deployment issues Fabian Mastenbroek
This pull request addresses a couple of deployment issues that have been reported.
2021-09-10 fix(docker): Default to public images for deployment Fabian Mastenbroek
This change updates the Docker Compose configuration to default to the available public images for OpenDC, in order to remove the requirement for building OpenDC locally.
2021-09-10 fix(docker): Do not warn when Sentry is not configured Fabian Mastenbroek
This change updates the Docker Compose configuration to not warn the user when they have not specified any Sentry configuration. Since Sentry is optional, the user should not be presented warnings.
2021-09-10 fix(docker): Mount local traces to correct container directory Fabian Mastenbroek
This change fixes an issue where local traces are not correctly detected due to Docker mounting the local traces in the incorrect directory.
2021-09-10 build(ui): Update dependencies Fabian Mastenbroek
This change updates the dependencies of the OpenDC frontend module.
2021-09-07 merge: Prepare for risk analysis experiments Fabian Mastenbroek
This pull request adds the necessary code in preparation for the risk analysis experiments:
- Track provisioning time
- Track host up/down time
- Track guest up/down time
- Support overcommitted memory
- Do not fail inactive guests
- Mark unschedulable server as terminated
- Make ExperimentMonitor optional for trace processing
- Report up/downtime metrics in experiment monitor
- Move metric collection outside Capelin code
- Resolve kotlin-reflect incompatibility
- Restructure input reading classes

**Breaking API Changes**
- `ExperimentMonitor` replaced in favour of `ComputeMonitor`
2021-09-07 refactor(capelin): Restructure input reading classes Fabian Mastenbroek
2021-09-07 build(web): Resolve kotlin-reflect incompatibility Fabian Mastenbroek
This change addresses an incompatibility issue with the kotlin-reflect transitive dependency in the opendc-web-runner module.
2021-09-07 refactor(capelin): Move metric collection outside Capelin code Fabian Mastenbroek
This change moves the metric collection out of the Capelin codebase into a separate module, so that other modules can also benefit from the compute metric collection code.
2021-09-07 feat(capelin): Report up/downtime metrics in experiment monitor Fabian Mastenbroek
2021-09-07 refactor(capelin): Make ExperimentMonitor optional for trace processing Fabian Mastenbroek
2021-09-07 fix(compute): Mark unschedulable server as terminated Fabian Mastenbroek
This change updates the compute service to mark servers that cannot be scheduled as terminated instead of error. Error is instead reserved for cases where the server is in an error state while running.
2021-09-07 fix(compute): Do not allow failure of inactive guests Fabian Mastenbroek
This change fixes an issue in SimHost where guests that were inactive were also failed, causing an IllegalStateException.
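The guard described here can be sketched as a state check before failing each guest. The classes below (`GuestState`, `Guest`, `fail_host`) are hypothetical stand-ins, and Python's `RuntimeError` plays the role of the `IllegalStateException` mentioned in the commit.

```python
from enum import Enum


class GuestState(Enum):
    RUNNING = "running"
    TERMINATED = "terminated"
    ERROR = "error"


class Guest:
    def __init__(self, name, state):
        self.name = name
        self.state = state

    def fail(self):
        # Failing a guest that is not running is an illegal transition.
        if self.state is not GuestState.RUNNING:
            raise RuntimeError(f"guest {self.name} is not running")
        self.state = GuestState.ERROR


def fail_host(guests):
    """Fail only the active guests; inactive guests are left untouched."""
    failed = []
    for guest in guests:
        if guest.state is GuestState.RUNNING:
            guest.fail()
            failed.append(guest.name)
    return failed


guests = [Guest("a", GuestState.RUNNING), Guest("b", GuestState.TERMINATED)]
print(fail_host(guests))
```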
2021-09-07 fix(compute): Use correct memory size for host memory Fabian Mastenbroek
This change fixes an issue where all servers could not be scheduled due to the memory size of the host being computed incorrectly.
2021-09-07 feat(compute): Track guest up/down time Fabian Mastenbroek
This change updates the SimHost implementation to track the up and downtime of hypervisor guests.
2021-09-07 fix(compute): Start host even if it already exists on host Fabian Mastenbroek
2021-09-07 fix(compute): Support overcommitted memory in SimHost Fabian Mastenbroek
This change enables hosts to overcommit their memory when testing whether new servers can fit on the host.
2021-09-07 feat(compute): Track host up/down time Fabian Mastenbroek
This change adds new metrics for tracking the up and downtime of hosts due to failures. In addition, this change adds a test to verify whether the metrics are collected correctly.
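Up and downtime can be accumulated by attributing the time elapsed since the last state change to whichever state was active. The sketch below shows that bookkeeping under the assumption of a simulated clock passed in by the caller; `UptimeTracker` is a hypothetical name, not a class from the simulator.

```python
class UptimeTracker:
    """Accumulate up and downtime from state changes on a simulated clock."""

    def __init__(self, now, up=True):
        self._last = now
        self._up = up
        self.uptime = 0
        self.downtime = 0

    def _flush(self, now):
        # Attribute the time since the last update to the current state.
        elapsed = now - self._last
        if self._up:
            self.uptime += elapsed
        else:
            self.downtime += elapsed
        self._last = now

    def set_state(self, now, up):
        self._flush(now)
        self._up = up

    def collect(self, now):
        """Return (uptime, downtime) as of `now`."""
        self._flush(now)
        return self.uptime, self.downtime


tracker = UptimeTracker(now=0)
tracker.set_state(now=100, up=False)  # host fails at t=100
tracker.set_state(now=130, up=True)   # host recovers at t=130
print(tracker.collect(now=200))
```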
2021-09-07 feat(compute): Track provisioning response time Fabian Mastenbroek
This change adds a metric for the provisioning time of virtual machines by the compute service.
2021-09-05 build(ui): Bump immer from 9.0.5 to 9.0.6 dependabot[bot]
Bumps [immer](https://github.com/immerjs/immer) from 9.0.5 to 9.0.6. - [Release notes](https://github.com/immerjs/immer/releases) - [Commits](https://github.com/immerjs/immer/compare/v9.0.5...v9.0.6) --- updated-dependencies: - dependency-name: immer dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-02 build: Update to Gradle 7.2 Fabian Mastenbroek
This change updates the Gradle version of the supplied Gradle wrapper to version 7.2.
* Update Gradle to version 7.2
* Address incompatibilities with version catalog
* Remove opendc-format module.
2021-09-02 merge: Add generic trace reading library Fabian Mastenbroek
This pull request adds a generic trace reading library to OpenDC. The library has been designed to support a wide range of trace formats and uses a streaming approach to improve performance of reading large traces.
* Add trace reading API
* Implement API for GWF format
* Implement API for SWF format
* Implement API for WTF format
* Implement API for Bitbrains format
* Implement API for Bitbrains Parquet format

**Breaking API Changes**
* `opendc-format` has been removed in favour of `opendc-trace-*`
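The streaming approach mentioned above can be sketched as a reader that yields one row at a time instead of loading the whole trace into memory. This is a generic illustration, not the library's reader; `read_rows` and the semicolon-delimited layout are assumptions.

```python
import io


def read_rows(stream, delimiter=";"):
    """Lazily yield one row (as a dict) per line of a delimited text stream."""
    header_line = next(stream, None)
    if header_line is None:
        return  # empty trace: nothing to yield
    header = [name.strip() for name in header_line.split(delimiter)]
    for line in stream:
        if not line.strip():
            continue  # skip blank lines
        values = [value.strip() for value in line.split(delimiter)]
        yield dict(zip(header, values))


trace = io.StringIO("id;cpu_count\nvm-1;4\nvm-2;8\n")
for row in read_rows(trace):
    # Only the current row is held in memory at any point.
    print(row["id"], row["cpu_count"])
```

Because rows are produced on demand, memory use stays constant regardless of trace size, which is the property that matters for large traces.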
2021-09-02 perf(trace): Improve performance of column lookup Fabian Mastenbroek
2021-09-02 refactor(capelin): Migrate trace reader to new trace API Fabian Mastenbroek
This change updates the trace reading classes in the Capelin experiment to use the new trace API in order to re-use many of the trace reading parts.