summaryrefslogtreecommitdiff
path: root/opendc-trace/opendc-trace-opendc
AgeCommit message (Collapse)Author
2022-05-02feat(trace/api): Add support for projecting tablesFabian Mastenbroek
This change adds support for projecting certain columns of a table. This enables faster reading for tables with high number of columns. Currently, we support projection in the Parquet-based workload formats. Other formats are text-based and will probably not benefit much from projection.
2022-05-02refactor(trace/parquet): Drop dependency on AvroFabian Mastenbroek
This change updates the Parquet support library in OpenDC to not rely on Avro, but instead interface directly with Parquet's reading and writing functionality, providing less overhead.
2022-05-02perf(trace/opendc): Read records using low-level APIFabian Mastenbroek
This change updates the OpenDC VM format reader implementation to use the low-level record reading APIs provided by the `parquet-mr` library for improved performance. Previously, we used the `parquet-avro` library to read/write Avro records in Parquet format, but that library carries considerable overhead.
2022-05-01refactor(trace/parquet): Support custom ReadSupport implementationsFabian Mastenbroek
This change updates the `LocalParquetReader` implementation to support custom `ReadSupport` implementations, so we do not have to rely on the Avro implementation necessarily.
2022-04-23build: Enable testing for all library modulesFabian Mastenbroek
This change updates the Gradle build configuration to ensure that all library modules (that will be published) use testing and are included in coverage reports. This should ensure the public modules remain well tested.
2022-04-22refactor(trace/api): Move conventions into separate packageFabian Mastenbroek
This change moves the trace conventions (such as table and column names) in a separate conv package, so that it is separated from the main API. This also allows for a potential move into a separate module in the future.
2022-04-22feat(trace/opendc): Incorporate interference model in trace formatFabian Mastenbroek
This change updates the OpenDC VM trace format to incorporate the VM interference model in the trace format itself. This makes sense since the model is tightly coupled to the actual trace that is being simulated. This approach has as benefit that we can directly load the interference model from the workload trace, without having to resolve the model seperately (as we did before).
2022-02-18build: Remove opendc-platform moduleFabian Mastenbroek
This change removes the opendc-platform module from the project. This module represented a Java platform which was previously used for sharing a set of dependency versions between subprojects. However, with the version catalogue that was added by Gradle, we currently do not use the platform anymore.
2021-10-25feat(trace): Add column for CPU capacity in OpenDC formatFabian Mastenbroek
This change adds a new column to resource table of the OpenDC trace format for the CPU capacity provisioned for a virtual machine, so that this capacity can be assigned to the virtual machine during simulation.
2021-09-21feat(trace): Add support for writing tracesFabian Mastenbroek
This change adds a new API for writing traces in a trace format. Currently, writing is only supported by the OpenDC VM format, but over time the other formats will also have support for writing added.
2021-09-20refactor(trace): Simplify TraceFormat SPI interfaceFabian Mastenbroek
This change simplifies the TraceFormat SPI interface by reducing the number of interfaces that implementors need to implement to only TraceFormat.
2021-09-20feat(trace): Add property for describing partition keysFabian Mastenbroek
2021-09-20feat(trace): Support column lookup via indexFabian Mastenbroek
This change adds support for looking up the column value through the column index. This enables faster lookup when processing very large traces.
2021-09-20refactor(trace): Unify columns of different tablesFabian Mastenbroek
This change unifies columns of different tables used by trace formats. This concretely means that instead of having columns specific per table (e.g., RESOURCE_ID and RESOURCE_STATE_ID), with this changes these columns are shared between the tables with a single definition (RESOURCE_ID).
2021-09-19feat(trace): Update OpenDC VM trace formatFabian Mastenbroek
This change optimizes the OpenDC VM trace format by removing unnecessary columns as well as optimizing the writer settings. The new implementation still supports reading the old trace format in case users run OpenDC with older workload traces.
2021-09-19feat(trace): Add support for internal OpenDC VM trace formatFabian Mastenbroek
This change adds official support to the trace library for the internal VM trace format used by OpenDC for its experiments. This is a compact format that uses Parquet to store the virtual machine trace data in two Parquet files.