Age  Commit message  Author
2022-05-15  build(web/api): Ensure Node.js is downloaded  Fabian Mastenbroek
This change updates the build configuration in order to ensure that Node.js is downloaded onto the build system. This drops an explicit dependency on a system installation of Node.js and allows us to ensure that the project is built against the correct Node.js version.
2022-05-15  build(web/runner): Reduce build steps for Docker image  Fabian Mastenbroek
This change updates the Dockerfile for the web runner to reduce the number of build steps necessary to build the web runner. Previously, the build would also include/build the web API which is not used in the image.
2022-05-15  ci: Build Docker images for build pipeline  Fabian Mastenbroek
This change updates the CI build pipeline to also build the Docker images in order to catch any regressions in the deployment process via Docker.
2022-05-15  ci: Bump docker/login-action from 1 to 2 (#84)  dependabot[bot]
Bumps [docker/login-action](https://github.com/docker/login-action) from 1 to 2.
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](https://github.com/docker/login-action/compare/v1...v2)

---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-15  ci: Bump docker/build-push-action from 2 to 3 (#83)  dependabot[bot]
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 2 to 3.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-06  merge: Restructure experiments and remove legacy harness (#82)  Fabian Mastenbroek
This pull request restructures the experiments present in the `opendc-experiments` directory and removes the legacy OpenDC Harness.

Previously, the experiments were written against the OpenDC Harness, which facilitates generation and execution of scenarios. However, the OpenDC Harness does not integrate well into the web-based workflow of OpenDC, where users should be able to submit scenarios in the web interface and have them simulated automatically in the cloud, since the harness relied on a special Kotlin DSL to specify experiments.

In a future pull request, we'll attempt to introduce a similar approach for specifying and running experiments as we have done for the Radice experiments, where the entire experiment is described in a serializable (JSON/YAML) format.

## Implementation Notes :hammer_and_pick:
* Add helper tools for FaaS simulations
* Fix infinite loop due to invalid rounding
* Convert experiment into integration test
* Add independent Capelin distribution
* Remove OpenDC Harness modules
* Remove unnecessary dependencies

## Breaking API Changes :warning:
* Removal of the OpenDC Harness modules. Instead, we now package each experiment individually. We'll focus in the future on extracting common code from the Capelin and Radice experiments so they can be re-used by other experiments as well.
2022-05-06  build(trace/parquet): Remove unnecessary dependencies  Fabian Mastenbroek
This change removes several dependencies from the `opendc-trace-parquet` helper module, which are part of Hadoop Common, but are not actually used by the Parquet project.
2022-05-06  refactor(harness): Remove OpenDC Harness modules  Fabian Mastenbroek
This change removes the OpenDC Harness modules from the main repository. We have made the decision to take a different direction regarding the specification and execution of experiments. The design of the current harness does not integrate well with the specification of experiments in the web interface. The new version focuses on proper integration with both the web interface and the command line interface.
2022-05-06  refactor(exp/capelin): Add independent Capelin distribution  Fabian Mastenbroek
This change updates the Capelin experiments so they can be distributed and executed independently of the main OpenDC distribution. We provide a new command line interface for users to directly run the experiments. Alternatively, the `CapelinRunner` class encapsulates the logic for running the experiments and can be used programmatically.
2022-05-06  refactor(exp/tf20): Convert experiment into integration test  Fabian Mastenbroek
This change removes the `TensorFlowExperiment` in favour of an integration test that can be run during CI invocations. Given that the experiment was not very sophisticated (in terms of data collection), we believe it is better suited as an integration test.
2022-05-06  fix(exp/tf20): Fix infinite loop due to invalid rounding  Fabian Mastenbroek
This change fixes an issue with the `SimTFDevice` implementation where very small amounts of FLOPs would cause the device to enter an infinite loop. We now round the value up to ensure that the device always consumes FLOPs.
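The failure mode described above can be sketched as follows. This is a hypothetical illustration in Python, not the actual `SimTFDevice` code: the function names, the per-step capacity, and the whole-FLOP accounting are all assumptions made for the example.

```python
import math


def flops_to_consume(remaining: float, capacity_per_step: float) -> int:
    """Return the whole number of FLOPs to consume in one step.

    Hypothetical helper, not the SimTFDevice implementation. Rounding up
    guarantees that at least 1 FLOP is consumed whenever work remains;
    rounding down could yield 0 for tiny amounts and stall the loop.
    """
    if remaining <= 0:
        return 0
    return math.ceil(min(remaining, capacity_per_step))


def run(total_flops: float, capacity_per_step: float) -> int:
    """Consume `total_flops` and return the number of steps taken."""
    remaining = total_flops
    steps = 0
    while remaining > 0:
        # Always strictly positive here, so the loop terminates.
        remaining -= flops_to_consume(remaining, capacity_per_step)
        steps += 1
    return steps
```

With `math.floor` in place of `math.ceil`, a call such as `run(0.25, 100.0)` would never terminate, because each step would consume zero FLOPs.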
2022-05-06  feat(faas): Add helper tools for FaaS simulations  Fabian Mastenbroek
This change adds a new module, `opendc-faas-workload`, that contains helper code for constructing simulations of FaaS-based workloads using OpenDC. In addition, we add an integration test that demonstrates the capabilities of the helper tool and the FaaS platform of OpenDC.
2022-05-06  merge: Move OpenTelemetry integration outside core modules (#81)  Fabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. Although this worked as expected, the overhead of OpenTelemetry when collecting metrics during simulation was considerable, and there were few opportunities for further optimization (other than providing a separate API implementation). Furthermore, since we were tied to OpenTelemetry's SDK implementation, we experienced issues with throttling and registering multiple instruments.

We will instead use another approach, where we expose the core metrics in OpenDC via specialized interfaces (see #80) such that access is fast and can be done without having to interface with OpenTelemetry. In addition, we will provide an adapter that is able to forward these metrics to OpenTelemetry implementations, so we can still integrate with the wider ecosystem.

## Implementation Notes :hammer_and_pick:
* Remove OpenTelemetry from "compute" modules
* Remove OpenTelemetry from "workflow" modules
* Remove OpenTelemetry from "FaaS" modules
* Remove OpenTelemetry from TF20 experiment
* Remove dependency on OpenTelemetry SDK

## External Dependencies :four_leaf_clover:
* N/A

## Breaking API Changes :warning:
* Metrics are no longer directly exposed via OpenTelemetry. Instead, an adapter needs to be used to access the data via OpenTelemetry.
2022-05-06  refactor(telemetry): Remove dependency on OpenTelemetry SDK  Fabian Mastenbroek
This change removes the dependency on the OpenTelemetry SDK. Instead, in the future we'll expose metrics to the OpenTelemetry API only through adapter classes.
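The adapter idea can be illustrated with a small sketch. Everything here is hypothetical and written in Python for brevity: `MetricSink`, `StatsAdapter`, and `ListSink` are illustrative names, not OpenDC or OpenTelemetry types. The point is only that the simulator exposes plain getters, and a separate adapter forwards readings to a telemetry backend on demand.

```python
from typing import Callable, Dict, List, Protocol, Tuple


class MetricSink(Protocol):
    """Hypothetical target, e.g. a thin wrapper around a telemetry API."""

    def record(self, name: str, value: float) -> None: ...


class StatsAdapter:
    """Polls plain getters on demand and forwards the readings to a sink."""

    def __init__(self, getters: Dict[str, Callable[[], float]], sink: MetricSink) -> None:
        self._getters = getters
        self._sink = sink

    def collect(self) -> None:
        # The simulator core never touches the sink; only the adapter does.
        for name, getter in self._getters.items():
            self._sink.record(name, float(getter()))


class ListSink:
    """In-memory sink used here in place of a real telemetry backend."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, float]] = []

    def record(self, name: str, value: float) -> None:
        self.records.append((name, value))
```

In this arrangement the cost of telemetry is paid only when `collect()` is called, rather than on every simulated event.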
2022-05-06  refactor(exp/tf20): Remove OpenTelemetry from TF20 experiment  Fabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC TensorFlow 2020 experiments. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. See the previous commit removing it from the "Compute" modules for the reasoning behind this change.
2022-05-06  refactor(workflow/service): Remove OpenTelemetry from "FaaS" modules  Fabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC FaaS modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. See the previous commit removing it from the "Compute" modules for the reasoning behind this change.
2022-05-06  refactor(workflow/service): Remove OpenTelemetry from "workflow" modules  Fabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC Workflow modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. See the previous commit removing it from the "Compute" modules for the reasoning behind this change.
2022-05-06  refactor(compute/service): Remove OpenTelemetry from "compute" modules  Fabian Mastenbroek
This change removes the OpenTelemetry integration from the OpenDC Compute modules. Previously, we chose to integrate OpenTelemetry to provide a unified way to report metrics to the users. Although this worked as expected, the overhead of OpenTelemetry when collecting metrics during simulation was considerable, and there were few opportunities for further optimization (other than providing a separate API implementation). Furthermore, since we were tied to OpenTelemetry's SDK implementation, we experienced issues with throttling and registering multiple instruments. We will instead use another approach, where we expose the core metrics in OpenDC via specialized interfaces (see the commits before) such that access is fast and can be done without having to interface with OpenTelemetry. In addition, we will provide an adapter that is able to forward these metrics to OpenTelemetry implementations, so we can still integrate with the wider ecosystem.
2022-05-06  merge: Expose metrics directly to user (#80)  Fabian Mastenbroek
This pull request adds the ability to access the metrics of resources modeled by the OpenDC Compute, Workflow, FaaS, and TensorFlow services directly from their corresponding interfaces. Previously, users would have to interact with OpenTelemetry to obtain these values, which is complex and adds significant overhead. With this pull request, users can access the metrics of all cloud resources modeled by OpenDC via methods such as `getSchedulerStats()`, etc.

**Breaking Changes**
- `ComputeService.hostCount` removed in favour of `ComputeService.hosts.size`
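As a rough illustration of the direct-access pattern, the sketch below shows a service that keeps its own counters and hands out an immutable snapshot on request. It is written in Python and is not the OpenDC API: `ToyComputeService`, `SchedulerStats`, and every field name are assumptions; only the idea of a `getSchedulerStats()`-style accessor comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchedulerStats:
    """Immutable snapshot; the field names are illustrative assumptions."""

    attempts_success: int
    attempts_failure: int
    servers_pending: int


class ToyComputeService:
    """Stand-in for a simulated service that tracks its own counters."""

    def __init__(self) -> None:
        self._success = 0
        self._failure = 0
        self._pending = 0

    def submit(self) -> None:
        """Queue one server for scheduling."""
        self._pending += 1

    def schedule(self, fits: bool) -> None:
        """Try to place one pending server and record the outcome."""
        if self._pending == 0:
            return
        if fits:
            self._success += 1
            self._pending -= 1
        else:
            self._failure += 1

    def get_scheduler_stats(self) -> SchedulerStats:
        """Return a point-in-time copy of the counters (no telemetry layer)."""
        return SchedulerStats(self._success, self._failure, self._pending)
```

Reading the stats is a plain method call that copies three integers, which is the kind of cheap access the pull request aims for.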
2022-05-06  refactor(exp/tf20): Directly expose device stats to user  Fabian Mastenbroek
This change updates the `TFDevice` interface to directly expose statistics about the accelerator device to the user. Previously, the user had to access these values through OpenTelemetry, which required substantial extra work.
2022-05-06  refactor(faas/service): Directly expose scheduler/function stats to user  Fabian Mastenbroek
This change updates the `FaaSService` interface to directly expose statistics about the scheduler and individual functions to the user, such that they do not necessarily have to interact with OpenTelemetry to obtain these values.
2022-05-06  refactor(workflow/service): Directly expose scheduler stats to user  Fabian Mastenbroek
This change updates the `WorkflowService` interface to directly expose statistics about the scheduler to the user, such that they do not necessarily have to interact with OpenTelemetry to obtain these values.
2022-05-06  refactor(telemetry/compute): Support direct metric access  Fabian Mastenbroek
This change introduces a `ComputeMetricReader` class that can be used as a replacement for the `CoroutineMetricReader` class when reading metrics from the Compute service. This implementation operates directly on a `ComputeService` instance, providing better performance.
2022-05-04  refactor(compute): Directly expose scheduler stats to user  Fabian Mastenbroek
This change updates the `ComputeService` interface to directly expose statistics about the scheduler to the user, such that they do not necessarily have to interact with OpenTelemetry to obtain these values.
2022-05-04  feat(compute): Add support for looking up hosts  Fabian Mastenbroek
This change adds the ability for users to look up the `Host` on which a `Server` is hosted (if any). This allows the user to potentially interact with the `Host` directly, e.g., in order to obtain advanced metrics.
2022-05-04  ci: Bump mikepenz/action-junit-report from 3.0.2 to 3.0.3 (#79)  dependabot[bot]
Bumps [mikepenz/action-junit-report](https://github.com/mikepenz/action-junit-report) from 3.0.2 to 3.0.3.
- [Release notes](https://github.com/mikepenz/action-junit-report/releases)
- [Commits](https://github.com/mikepenz/action-junit-report/compare/v3.0.2...v3.0.3)

---
updated-dependencies:
- dependency-name: mikepenz/action-junit-report
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-03  refactor(compute): Expose CPU and system stats via Host interface  Fabian Mastenbroek
This change updates the `Host` interface to directly expose CPU and system stats to be used by components that interface with the `Host` interface. Previously, this would require the user to interact with the OpenTelemetry SDK. Although that is still possible for more advanced usage cases, users can use the following methods to easily access common host and guest statistics.
2022-05-02  merge: Add support for SQL via Apache Calcite (#78)  Fabian Mastenbroek
This pull request integrates initial support for SQL queries via Apache Calcite into the OpenDC codebase. Our vision is that users of OpenDC should be able to use SQL queries to access and process most of the experiment data generated by simulations. This pull request moves towards this goal by adding the ability to query workload traces supported by OpenDC using SQL. We also provide a CLI for querying the data in workload traces via `opendc-trace-tools`:

```bash
opendc-trace-tools query -i data/bitbrains-small -f opendc-vm "SELECT MAX(cpu_count) FROM resource_states"
```

## Implementation Notes :hammer_and_pick:
* Add Calcite (SQL) integration
* Add support for writing via SQL
* Add support for querying traces using SQL
* Support custom Parquet ReadSupport implementations
* Read records using low-level Parquet API
* Do not use Avro when exporting experiment data
* Do not use Avro when reading WTF trace
* Drop dependency on Avro
* Add support for projections

## External Dependencies :four_leaf_clover:
* Apache Calcite

## Breaking API Changes :warning:
* The existing code for reading Parquet traces using Apache Avro has been removed.
* `TraceFormat.newReader` now accepts a nullable `projection` parameter
2022-05-02  perf(trace/calcite): Add support for projections  Fabian Mastenbroek
This change adds support for projections in the Apache Calcite integration with OpenDC. This enables faster queries when only a subset of the table columns is selected.
2022-05-02  feat(trace/api): Add support for projecting tables  Fabian Mastenbroek
This change adds support for projecting certain columns of a table. This enables faster reading for tables with a high number of columns. Currently, we support projection in the Parquet-based workload formats. Other formats are text-based and will probably not benefit much from projection.
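Why projection helps columnar formats can be shown with a toy reader. The Python sketch below is an assumption-laden illustration, not the OpenDC trace API: `new_reader`, the `Table` alias, and the column names used in the tests are hypothetical. It assumes data stored column by column, so unselected columns are never touched.

```python
from typing import Dict, Iterator, List, Optional, Sequence

# Columnar storage: each column is stored, and can be read, independently.
Table = Dict[str, List[object]]


def new_reader(
    table: Table, projection: Optional[Sequence[str]] = None
) -> Iterator[Dict[str, object]]:
    """Yield rows containing only the projected columns.

    `projection=None` means "all columns", mirroring a nullable parameter.
    """
    columns = list(table) if projection is None else list(projection)
    missing = [c for c in columns if c not in table]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    n_rows = len(next(iter(table.values()), []))
    for i in range(n_rows):
        # Only the selected columns are accessed; the rest are never read.
        yield {c: table[c][i] for c in columns}
```

In a row-oriented text format every line still has to be parsed in full before columns can be dropped, which is consistent with the remark that text-based formats gain little from projection.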
2022-05-02  refactor(trace/parquet): Drop dependency on Avro  Fabian Mastenbroek
This change updates the Parquet support library in OpenDC to not rely on Avro, but instead interface directly with Parquet's reading and writing functionality, providing less overhead.
2022-05-02  refactor(trace/wtf): Do not use Avro when reading WTF trace  Fabian Mastenbroek
This change updates the Workflow Trace format implementation in OpenDC to not use the `parquet-avro` library for reading traces, but instead to use the low-level APIs to directly read the data from Parquet. This reduces the number of conversions necessary before reaching the OpenDC trace API.
2022-05-02  refactor(compute): Do not use Avro when exporting experiment data  Fabian Mastenbroek
This change updates the `ParquetDataWriter` class to not use the `parquet-avro` library for exporting experiment data, but instead to use the low-level APIs to directly write the data in Parquet format.
2022-05-02  perf(trace/opendc): Read records using low-level API  Fabian Mastenbroek
This change updates the OpenDC VM format reader implementation to use the low-level record reading APIs provided by the `parquet-mr` library for improved performance. Previously, we used the `parquet-avro` library to read/write Avro records in Parquet format, but that library carries considerable overhead.
2022-05-01  refactor(trace/parquet): Support custom ReadSupport implementations  Fabian Mastenbroek
This change updates the `LocalParquetReader` implementation to support custom `ReadSupport` implementations, so we do not necessarily have to rely on the Avro implementation.
2022-04-30  feat(trace/tools): Add support for querying traces using SQL  Fabian Mastenbroek
This change adds a command line interface for querying workload traces using SQL. We provide a new command for the trace tools that can query a workload trace.
2022-04-30  feat(trace/calcite): Add support for writing via SQL  Fabian Mastenbroek
This change updates the Apache Calcite integration to support writing workload traces via SQL. This enables custom conversion scripts between different workload traces.
2022-04-30  feat(trace/calcite): Add Calcite (SQL) integration  Fabian Mastenbroek
This change adds support for querying workload trace formats implemented using the OpenDC API through Apache Calcite. This allows users to write SQL queries to explore the workload traces.
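To give a feel for the kind of exploration this enables, the sketch below runs an aggregate query over a tiny in-memory table. It deliberately swaps in Python's built-in `sqlite3` module for Apache Calcite, so it says nothing about how the integration itself is implemented. The `resource_states` table and `cpu_count` column follow the example query in the #78 pull request description; the other columns and the `max_cpu_count` helper are assumptions.

```python
import sqlite3
from typing import Iterable, Optional, Tuple


def max_cpu_count(rows: Iterable[Tuple[str, int, int]]) -> Optional[int]:
    """Load (id, timestamp, cpu_count) rows and return the largest cpu_count.

    Uses SQLite as a stand-in SQL engine; returns None for an empty table.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE resource_states (id TEXT, timestamp INTEGER, cpu_count INTEGER)"
        )
        conn.executemany("INSERT INTO resource_states VALUES (?, ?, ?)", rows)
        (result,) = conn.execute(
            "SELECT MAX(cpu_count) FROM resource_states"
        ).fetchone()
        return result
    finally:
        conn.close()
```

The difference in the real integration is that the trace files themselves are exposed as tables, so no loading step like the `INSERT` above is needed.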
2022-04-24  merge: Move modules into different groups (#77)  Fabian Mastenbroek
This pull request moves the different modules of OpenDC into different groups. For instance, all submodules of `opendc-compute` are moved into `org.opendc.compute`. This provides a better separation of the artifacts.

## Implementation Notes :hammer_and_pick:
* Enable testing for all library modules
* Move modules into subgroups
* Update to Jandex Gradle 0.12.0

## External Dependencies :four_leaf_clover:
* N/A

## Breaking API Changes :warning:
* Each module has now been assigned its own group (e.g., `org.opendc.compute` or `org.opendc.simulator`)
2022-04-24  build: Update to Jandex Gradle 0.12.0  Fabian Mastenbroek
This change updates the Jandex Gradle plugin to version 0.12.0. This version addresses some deprecation warnings reported by Gradle for the future 8.0 release.
2022-04-24  build: Move modules into subgroups  Fabian Mastenbroek
This change updates the Gradle build configuration of the project to publish the different types of modules (e.g., opendc-compute, opendc-simulator) into their own groups.
2022-04-23  build: Enable testing for all library modules  Fabian Mastenbroek
This change updates the Gradle build configuration to ensure that all library modules (that will be published) use testing and are included in coverage reports. This should ensure the public modules remain well tested.
2022-04-22  merge: Improve discovery of interference models (#76)  Fabian Mastenbroek
This pull request intends to improve discovery of interference models. Previously, interference models were not tied to the workload trace, meaning they had to be resolved separately from the workload trace. In reality, the interference model is always tied to the particular workload trace. With this pull request, we integrate the interference model into the `odcvm` trace format and make it available through the `opendc-trace` library. This has the additional benefit that we can support different interference formats in the future using the same API. Furthermore, this change allows us to ship the interference model with the workload traces and resolve them automatically in the future using some form of package manager.

## Implementation Notes :hammer_and_pick:
* Incorporate interference model in trace format
* Load interference model via trace library
* Move conventions into separate package

## External Dependencies :four_leaf_clover:
* N/A

## Breaking API Changes :warning:
* `VmInterferenceModelReader` has been removed from `opendc-compute-workload`
* Table and column conventions have been moved into the `org.opendc.trace.conv` package
2022-04-22  refactor(trace/api): Move conventions into separate package  Fabian Mastenbroek
This change moves the trace conventions (such as table and column names) into a separate `conv` package, so that they are separated from the main API. This also allows for a potential move into a separate module in the future.
2022-04-22  refactor(compute): Load interference model via trace library  Fabian Mastenbroek
This change updates the compute support library to load the VM interference model via the OpenDC trace library, which provides a generic interface for reading interference models associated with workload traces.
2022-04-22  feat(trace/opendc): Incorporate interference model in trace format  Fabian Mastenbroek
This change updates the OpenDC VM trace format to incorporate the VM interference model in the trace format itself. This makes sense since the model is tightly coupled to the actual trace that is being simulated. The benefit of this approach is that we can directly load the interference model from the workload trace, without having to resolve the model separately (as we did before).
2022-04-22  refactor(web/runner): Improve OpenDC web runner implementation  Fabian Mastenbroek
This change contains a rewrite of the OpenDC web runner implementation, which now supports terminating simulations when exceeding a deadline, as well as executing multiple simulation jobs at the same time. Furthermore, we have extracted the runner from the command line interface, so that we can offer this functionality as a library in the future.
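The two behaviours mentioned here, running several jobs at once and stopping a job that exceeds its deadline, can be sketched with cooperative cancellation. The Python example below is only an illustration of the technique using `asyncio`; it is not how the web runner is implemented, and `simulate`, `run_job`, `run_jobs`, and the state strings are hypothetical.

```python
import asyncio
from typing import Dict, Iterable, Tuple

Job = Tuple[str, int, float]  # (job id, number of steps, seconds per step)


async def simulate(steps: int, step_seconds: float) -> None:
    """Toy simulation: every step yields control, so it can be cancelled."""
    for _ in range(steps):
        await asyncio.sleep(step_seconds)


async def run_job(job: Job, deadline_seconds: float) -> Tuple[str, str]:
    """Run one job; cancel it and mark it failed if the deadline passes."""
    job_id, steps, step_seconds = job
    try:
        # wait_for cancels the inner coroutine when the timeout expires.
        await asyncio.wait_for(simulate(steps, step_seconds), timeout=deadline_seconds)
        return job_id, "finished"
    except asyncio.TimeoutError:
        return job_id, "failed: deadline exceeded"


async def _run_all(jobs: Iterable[Job], deadline_seconds: float) -> Dict[str, str]:
    # gather schedules all jobs concurrently on the same event loop.
    results = await asyncio.gather(*(run_job(j, deadline_seconds) for j in jobs))
    return dict(results)


def run_jobs(jobs: Iterable[Job], deadline_seconds: float) -> Dict[str, str]:
    """Synchronous entry point returning the final state of every job."""
    return asyncio.run(_run_all(jobs, deadline_seconds))
```

For example, `run_jobs([("fast", 2, 0.01), ("slow", 100, 0.05)], 0.5)` lets the short job finish while the long one is cancelled once half a second has elapsed.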
2022-04-22  merge: Update build process (#74)  Fabian Mastenbroek
This pull request brings several updates to the build process as well as new dependency versions. This should resolve several issues that occur during the build process (such as Quarkus or JaCoCo complaining).

## Implementation Notes :hammer_and_pick:
* Remove use of lint-staged
* Migrate from Yarn to NPM
* Update to Kotlin 1.6.21
* Update to Quarkus 2.8.1.Final
* Use JaCoCo 0.8.8
* Test Java 18

## External Dependencies :four_leaf_clover:
* Kotlin, Quarkus, JaCoCo, NPM

## Breaking API Changes :warning:
* N/A
2022-04-22  build(web/api): Move Quarkus build configuration into buildSrc  Fabian Mastenbroek
This change moves most of the Quarkus build configuration into buildSrc so it can possibly be re-used for other modules.
2022-04-22  build: Include Quarkus tests in aggregated JaCoCo test report  Fabian Mastenbroek
This change fixes an issue where the results of the Quarkus tests were not included in the aggregated JaCoCo test report, because Quarkus does not use the official Gradle JaCoCo plugin. This change defines a new configuration that exposes the execution data generated by Quarkus to the aggregation plugin.