This change removes several dependencies from the `opendc-trace-parquet`
helper module that are part of Hadoop Common but are not actually
used by the Parquet project.
|
|
This change removes the OpenDC Harness modules from the main repository.
We have made the decision to take a different direction regarding the
specification and execution of experiments. The design of the current
harness does not integrate well with the specification of experiments in
the web interface. The new version focuses on proper integration with
the web interface, as well as with the command line interface.
|
|
This change updates the Capelin experiments so they can be distributed and
executed independently of the main OpenDC distribution. We provide a new
command line interface for users to directly run the experiments.
Alternatively, the `CapelinRunner` class encapsulates the logic for
running the experiments and can be used programmatically.
|
|
This change removes the `TensorFlowExperiment` in favour of an
integration test that can be run during CI invocations. Given that the
experiment was not very sophisticated (in terms of data collection), we
believe it is better suited as an integration test.
|
|
This change fixes an issue with the `SimTFDevice` implementation where
very small amounts of FLOPs would cause the device to enter an infinite
loop. We now round the value up to ensure that the device always
consumes FLOPs.
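As a minimal illustration of the failure mode (not the actual `SimTFDevice` code), the sketch below shows how truncating the per-step work can stall progress for tiny workloads, and how rounding up guarantees termination:
```kotlin
import kotlin.math.ceil

// Illustrative only: drain `totalFlops` in chunks of `flopsPerStep`.
// Rounding the chunk down (e.g. with floor) would yield 0.0 whenever
// flopsPerStep < 1, so the loop would never terminate; ceil guarantees
// that at least one FLOP is consumed per step.
fun stepsToFinish(totalFlops: Double, flopsPerStep: Double): Long {
    var remaining = totalFlops
    var steps = 0L
    while (remaining > 0) {
        remaining -= ceil(flopsPerStep)
        steps++
    }
    return steps
}

fun main() {
    println(stepsToFinish(totalFlops = 5.0, flopsPerStep = 0.25)) // prints 5 and terminates
}
```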
|
|
This change adds a new module, `opendc-faas-workload`, which contains
helper code for constructing simulations of FaaS-based workloads
using OpenDC. In addition, we add an integration test that demonstrates
the capabilities of the helper tool and the FaaS platform of OpenDC.
|
|
This change removes the OpenTelemetry integration from the OpenDC modules.
Previously, we chose to integrate OpenTelemetry to provide a unified way to
report metrics to the users.
Although this worked as expected, the overhead of OpenTelemetry when
collecting metrics during simulation was considerable and offered few
optimization opportunities (other than providing a separate API
implementation). Furthermore, since we were tied to OpenTelemetry's SDK
implementation, we experienced issues with throttling and registering
multiple instruments.
We will instead use another approach, where we expose the core metrics
in OpenDC via specialized interfaces (see #80) such that
access is fast and can be done without having to interface with
OpenTelemetry. In addition, we will provide an adapter that is able
to forward these metrics to OpenTelemetry implementations, so we can
still integrate with the wider ecosystem.
## Implementation Notes :hammer_and_pick:
* Remove OpenTelemetry from "compute" modules
* Remove OpenTelemetry from "workflow" modules
* Remove OpenTelemetry from "FaaS" modules
* Remove OpenTelemetry from TF20 experiment
* Remove dependency on OpenTelemetry SDK
## External Dependencies :four_leaf_clover:
* N/A
## Breaking API Changes :warning:
* Metrics are no longer directly exposed via OpenTelemetry. Instead, an adapter needs to be used to access the data via OpenTelemetry.
|
|
This change removes the dependency on the OpenTelemetry SDK. In the
future, we will only expose metrics via the OpenTelemetry API through
adapter classes.
|
|
This change removes the OpenTelemetry integration from the OpenDC
TensorFlow 2020 experiments. Previously, we chose to integrate
OpenTelemetry to provide a unified way to report metrics to the users.
See the previous commit removing it from the "Compute" modules for the
reasoning behind this change.
|
|
This change removes the OpenTelemetry integration from the OpenDC
FaaS modules. Previously, we chose to integrate OpenTelemetry to
provide a unified way to report metrics to the users.
See the previous commit removing it from the "Compute" modules for the
reasoning behind this change.
|
|
This change removes the OpenTelemetry integration from the OpenDC
Workflow modules. Previously, we chose to integrate OpenTelemetry to
provide a unified way to report metrics to the users.
See the previous commit removing it from the "Compute" modules for the
reasoning behind this change.
|
|
This change removes the OpenTelemetry integration from the OpenDC
Compute modules. Previously, we chose to integrate OpenTelemetry to
provide a unified way to report metrics to the users.
Although this worked as expected, the overhead of OpenTelemetry when
collecting metrics during simulation was considerable and offered few
optimization opportunities (other than providing a separate API
implementation). Furthermore, since we were tied to OpenTelemetry's SDK
implementation, we experienced issues with throttling and registering
multiple instruments.
We will instead use another approach, where we expose the core metrics
in OpenDC via specialized interfaces (see the preceding commits) such that
access is fast and can be done without having to interface with
OpenTelemetry. In addition, we will provide an adapter that is able
to forward these metrics to OpenTelemetry implementations, so we can
still integrate with the wider ecosystem.
|
|
This pull request adds the ability to access the metrics of resources modeled
by the OpenDC Compute, Workflow, FaaS, and TensorFlow services directly from
their corresponding interfaces. Previously, users had to interact with
OpenTelemetry to obtain these values, which is complex and introduces
significant overhead.
With this pull request, users can access the metrics of all cloud resources
modeled by OpenDC via methods such as `getSchedulerStats()`, as sketched below.
**Breaking Changes**
- `ComputeService.hostCount` removed in favour of `ComputeService.hosts.size`
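A minimal sketch of this direct access, assuming a `ComputeService` instance is available from the simulation setup (the package path and the shape of the returned statistics object are assumptions):
```kotlin
import org.opendc.compute.service.ComputeService // package path assumed

// Hypothetical usage: query scheduler statistics straight from the service
// instead of going through OpenTelemetry instruments.
fun printSchedulerSummary(service: ComputeService) {
    val stats = service.getSchedulerStats()            // method named in this pull request
    println("Hosts registered: ${service.hosts.size}") // replaces ComputeService.hostCount
    println("Scheduler stats: $stats")
}
```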
|
|
This change updates the `TFDevice` interface to directly expose
statistics about the accelerator device to the user. Previously, the
user had to access these values through OpenTelemetry, which required
substantial extra work.
|
|
This change updates the `FaaSService` interface to directly expose
statistics about the scheduler and individual functions to the user, such
that they do not necessarily have to interact with OpenTelemetry to obtain
these values.
|
|
This change updates the `WorkflowService` interface to directly expose
statistics about the scheduler to the user, such that they do not
necessarily have to interact with OpenTelemetry to obtain these values
|
|
This change introduces a `ComputeMetricReader` class that can be used as
a replacement for the `CoroutineMetricReader` class when reading metrics
from the Compute service. This implementation operates directly on a
`ComputeService` instance, providing better performance.
|
|
This change updates the `ComputeService` interface to directly expose
statistics about the scheduler to the user, such that they do not
necessarily have to interact with OpenTelemetry to obtain these values.
|
|
This change adds the ability for users to look up the `Host` on which a
`Server` is hosted (if any). This allows the user to potentially
interact with the `Host` directly, e.g., in order to obtain advanced
metrics.
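A minimal sketch of such a lookup, assuming it is exposed on the compute service; the package paths, member name, and nullability are assumptions:
```kotlin
import org.opendc.compute.api.Server             // package path assumed
import org.opendc.compute.service.ComputeService // package path assumed
import org.opendc.compute.service.driver.Host    // package path assumed

// Hypothetical usage: resolve the Host a Server is currently placed on,
// or null when the server has not been scheduled onto a host yet.
fun hostOf(service: ComputeService, server: Server): Host? =
    service.lookupHost(server) // member name assumed
```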
|
|
Bumps [mikepenz/action-junit-report](https://github.com/mikepenz/action-junit-report) from 3.0.2 to 3.0.3.
- [Release notes](https://github.com/mikepenz/action-junit-report/releases)
- [Commits](https://github.com/mikepenz/action-junit-report/compare/v3.0.2...v3.0.3)
---
updated-dependencies:
- dependency-name: mikepenz/action-junit-report
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
|
This change updates the `Host` interface to directly expose CPU and
system statistics to components that interact with the `Host` interface.
Previously, this required the user to interact with the OpenTelemetry SDK.
Although that is still possible for more advanced use cases, users can now
use dedicated accessor methods to easily access common host and guest
statistics.
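A minimal sketch of what this direct access could look like; the package path and the accessor names are assumptions for illustration:
```kotlin
import org.opendc.compute.service.driver.Host // package path assumed

// Hypothetical usage: read common host statistics directly from the Host
// interface instead of wiring up the OpenTelemetry SDK.
fun printHostSummary(host: Host) {
    println("CPU stats: ${host.getCpuStats()}")       // accessor name assumed
    println("System stats: ${host.getSystemStats()}") // accessor name assumed
}
```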
|
|
This pull request integrates initial support for SQL queries via Apache Calcite into the OpenDC codebase.
Our vision is that users of OpenDC should be able to use SQL queries to access and process most
of the experiment data generated by simulations.
This pull request moves towards this goal by adding the ability to query workload traces supported
by OpenDC using SQL. We also provide a CLI for querying the data in workload traces via `opendc-trace-tools`:
```bash
opendc-trace-tools query -i data/bitbrains-small -f opendc-vm "SELECT MAX(cpu_count) FROM resource_states"
```
## Implementation Notes :hammer_and_pick:
* Add Calcite (SQL) integration
* Add support for writing via SQL
* Add CLI for querying workload traces via SQL
* Support custom Parquet ReadSupport implementations
* Read records using low-level Parquet API
* Do not use Avro when exporting experiment data
* Do not use Avro when reading WTF trace
* Drop dependency on Avro
* Add support for projections
## External Dependencies :four_leaf_clover:
* Apache Calcite
## Breaking API Changes :warning:
* The existing code for reading Parquet traces using Apache Avro has been removed.
* `TraceFormat.newReader` now accepts a nullable `projection` parameter
|
|
This change adds support for projections in the Apache Calcite
integration with OpenDC. This enables faster queries when only a subset
of the table columns is selected.
|
|
This change adds support for projecting certain columns of a table. This
enables faster reading for tables with a high number of columns.
Currently, we support projection in the Parquet-based workload formats.
Other formats are text-based and will probably not benefit much from
projection.
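A minimal sketch of how a projection might be passed when opening a table reader, based on the nullable `projection` parameter of `TraceFormat.newReader` mentioned in the pull request above; the package path, parameter order, table name, and column names are assumptions:
```kotlin
import java.nio.file.Path
import org.opendc.trace.spi.TraceFormat // package path assumed

// Hypothetical usage: only materialize the listed columns while reading;
// passing null instead would read all columns of the table.
fun openProjectedReader(format: TraceFormat, path: Path) =
    format.newReader(path, "resource_states", listOf("id", "cpu_count"))
```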
|
|
This change updates the Parquet support library in OpenDC to no longer rely
on Avro, but instead interface directly with Parquet's reading and writing
functionality, which reduces overhead.
|
|
This change updates the Workflow Trace Format implementation in OpenDC to
not use the `parquet-avro` library for reading trace data, but instead to
use the low-level APIs to directly read the data from Parquet.
This reduces the number of conversions necessary before reaching the
OpenDC trace API.
|
|
This change updates the `ParquetDataWriter` class to not use the
`parquet-avro` library for exporting experiment data, but instead to use
the low-level APIs to directly write the data in Parquet format.
|
|
This change updates the OpenDC VM format reader implementation to use
the low-level record reading APIs provided by the `parquet-mr` library
for improved performance. Previously, we used the `parquet-avro` library
to read/write Avro records in Parquet format, but that library carries
considerable overhead.
|
|
This change updates the `LocalParquetReader` implementation to support
custom `ReadSupport` implementations, so we do not necessarily have to
rely on the Avro implementation.
|
|
This change adds a command line interface for querying workload traces
using SQL. We provide a new command for the trace tools that can query a
workload trace.
|
|
This change updates the Apache Calcite integration to support writing
workload traces via SQL. This enables custom conversion scripts between
different workload traces.
|
|
This change adds support for querying workload trace formats implemented
using the OpenDC API through Apache Calcite. This allows users to write
SQL queries to explore the workload traces.
|
|
This pull request moves the different modules of OpenDC into different groups.
For instance, all submodules of `opendc-compute` are moved into `org.opendc.compute`.
This provides a better separation of the artifacts.
## Implementation Notes :hammer_and_pick:
* Enable testing for all library modules
* Move modules into subgroups
* Update to Jandex Gradle 0.12.0
## External Dependencies :four_leaf_clover:
* N/A
## Breaking API Changes :warning:
* Each module has now been assigned its own group (e.g., `org.opendc.compute` or `org.opendc.simulator`)
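For consumers, the move mainly changes the Maven coordinates under which the artifacts are resolved. A hypothetical Gradle (Kotlin DSL) snippet, with artifact names and version chosen purely for illustration:
```kotlin
dependencies {
    // Groups now reflect the module family; artifact names and versions are assumptions.
    implementation("org.opendc.compute:opendc-compute-service:2.0")
    implementation("org.opendc.simulator:opendc-simulator-core:2.0")
}
```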
|
|
This change updates the Jandex Gradle plugin to version 0.12.0. This
version addresses some deprecation warnings reported by Gradle for the
future 8.0 release.
|
|
This change updates the Gradle build configuration of the project to
publish the different types of modules (e.g., opendc-compute,
opendc-simulator) into their own groups.
|
|
This change updates the Gradle build configuration to ensure that all
library modules (that will be published) have testing enabled and are included
in coverage reports. This should ensure the public modules remain well
tested.
|
|
This pull request intends to improve discovery of interference models. Previously, interference models were not tied to the workload trace, meaning they had to be resolved separately from the workload trace. In reality, the interference model is always tied to the particular workload trace.
With this pull request, we integrate the interference model into the `odcvm` trace format and make it available through the `opendc-trace` library. This has the additional benefit that we can support different interference formats in the future using the same API.
Furthermore, this change allows us to ship the interference model with the workload traces and resolve them automatically in the future using some form of package manager.
## Implementation Notes :hammer_and_pick:
* Incorporate interference model in trace format
* Load interference model via trace library
* Move conventions into separate package
## External Dependencies :four_leaf_clover:
* N/A
## Breaking API Changes :warning:
* `VmInterferenceModelReader` has been removed from `opendc-compute-workload`
* Table and column conventions have been moved into the `org.opendc.trace.conv` package
|
|
This change moves the trace conventions (such as table and column names)
into a separate `conv` package, so that they are separated from the main API.
This also allows for a potential move into a separate module in the
future.
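As a small illustration, call sites now import the shared constants from the dedicated conventions package; the constant names below are assumptions:
```kotlin
// Hypothetical imports after the move; only the import paths change for callers.
import org.opendc.trace.conv.TABLE_RESOURCE_STATES // constant name assumed
import org.opendc.trace.conv.RESOURCE_CPU_COUNT    // constant name assumed
```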
|
|
This change updates the compute support library to load the VM
interference model via the OpenDC trace library, which provides a
generic interface for reading interference models associated with
workload traces.
|
|
This change updates the OpenDC VM trace format to incorporate the VM
interference model in the trace format itself. This makes sense since
the model is tightly coupled to the actual trace that is being
simulated.
This approach has the benefit that we can directly load the
interference model from the workload trace, without having to resolve
the model separately (as we did before).
|
|
This change contains a rewrite of the OpenDC web runner implementation,
which now supports terminating simulations when exceeding a deadline, as
well as executing multiple simulation jobs at the same time.
Furthermore, we have extracted the runner from the command line
interface, so that we can offer this functionality as a library in the
future.
|
|
This pull request brings several updates to the build process as well as new dependency versions. This should resolve several issues that occur during the build process (such as Quarkus or JaCoCo complaining).
## Implementation Notes :hammer_and_pick:
* Remove use of lint-staged
* Migrate from Yarn to NPM
* Update to Kotlin 1.6.21
* Update to Quarkus 2.8.1.Final
* Use JaCoCo 0.8.8
* Test Java 18
## External Dependencies :four_leaf_clover:
* Kotlin, Quarkus, JaCoCo, NPM
## Breaking API Changes :warning:
* N/A
|
|
This change moves most of the Quarkus build configuration into buildSrc
so it can possibly be re-used for other modules.
|
|
This change fixes an issue where the results of the Quarkus tests were
not included in the aggregated JaCoCo test report, due to it not using
the official Gradle JaCoCo plugin.
This change defines a new configuration that exposes the execution data
generated by Quarkus to the aggregation plugin.
|
|
This change updates the CI pipelines to test with the latest version of
Java (version 18).
|
|
This change updates the build script to use JaCoCo 0.8.8 for code
coverage instrumentation. This version adds support for Java 18 classes.
|
|
This change updates the web API to use Quarkus 2.8.1.Final. This release
fixes an issue we had with local extensions failing to build due to some
build directories missing.
|
|
This change updates the Kotlin version used by the build process of our
project to version 1.6.21.
|
|
This change updates the Node package manager used by the OpenDC web UI
build from Yarn to NPM, which is included by default in the Node
distributions that are used by node-gradle.
|
|
This change removes the use of lint-staged from the project. These steps
should be managed by the Gradle build logic, while we keep the Next.js
build logic as minimal as possible.
|