summaryrefslogtreecommitdiff
path: root/opendc-trace/opendc-trace-parquet
AgeCommit message (Collapse)Author
2025-03-18Unit System v2 (#243)Alessio Leonardo Tomei
* Separated `Time` unit into `TimeDelta` and `TimeStamp` + small fixes Addition and subtruction between `Timestamp`s is not allowed, but any other `Unit` operation/comparison is. `TimeDelta`s can be added/subtructed to/form `Timestamp`s. Deserialization of `Timestamp`: - `Number` -> interpreted as millis from Epoch - `Instant` (string representation) -> converted to Timestamp - `Duration` (string representation) -> interpreted as duration since Epoch (warn msg is logged) Deserialization of `TimeDelta` is the same as `Time` was before, with the diference that when an `Instant` is converted to an timedelta since Epoch a warning message is logged. * Unit System v2 - Merged `BoundedPercentage` and `UnboundedPercentage` - Overrided all operation defined in `Unit` in all subclasses to avoid as much as possible value classes being boxed in bytecode. If units are used as generics (hence also functions defined in Unit<T>) they are boxed (as double would if used as generic). - All units companions now subclass `UnitId`, and can be used as keys (e.g `Map<UnitId, idk>`), while offering `max` `min` and `zero` methods. - Division between the same unit now returns `Percentage` - Added `Iterable<T>.averageOfUnitOrNull(selector (T) -> <specific unit>)` - `ifNeg0ThenPos0()` now is optional and not invoked on every constructor - Now methods in `Unit<T>` are all abstract, forcing override and avoid boxing in some cases - Added `@UnintendedOperation` and `UnitOperationException` for methods that must be defined but are not intended for use (e.g. `Timestamp` + `Timestamp`)
2024-08-22Refactored exporters. Allows output column selection in scenario (#241) (#241)Alessio Leonardo Tomei
2024-03-05Updated package versions, updated web server tests. (#207)Dante Niewenhuis
* Updated all package versions including kotlin. Updated all web-server tests to run. * Changed the java version of the tests. OpenDC now only supports java 19. * small update * test update * new update * updated docker version to 19 * updated docker version to 19
2022-12-14fix(trace/wtf): Disable Parquet strict typingFabian Mastenbroek
This change fixes an issue where some of the traces from the Workflow Trace Archive would fail to load with the trace format in OpenDC. This was caused by one of the fields being stored as a double, while the formats expects it to be a long. Parquet does not support unioning primitive types. Therefore, we have to disable strict type checking when reading the file. Furthermore, we need to support double entries for storing the workflow ids.
2022-10-06build: Switch to Spotless for formattingFabian Mastenbroek
This change updates the build configuration to use Spotless for code formating of both Kotlin and Java.
2022-10-06style: Eliminate use of wildcard importsFabian Mastenbroek
This change updates the repository to remove the use of wildcard imports everywhere. Wildcard imports are not allowed by default by Ktlint as well as Google's Java style guide.
2022-07-07build(trace/parquet): Ignore reload4j dependencyFabian Mastenbroek
This change updates the build configuration to ignore the reload4j dependency that was recently added to the hadoop-common module. Reload4j replaces the old unmaintained log4j1 module. However, since we expose this module as a library, we do not want to include a logging implementation in the dependencies. Currently, there are already instances where this new dependency leads to duplicate logging implementations on the classpath.
2022-06-08test(trace): Add conformance suite for OpenDC trace APIFabian Mastenbroek
This change adds a re-usable test suite for the interface of the OpenDC trace API, so implementors can verify whether they match the specification of the interfaces.
2022-05-18refactor(web/runner): Move runner CLI into separate configurationFabian Mastenbroek
This change splits the command line interface from the OpenDC web runner into a separate configuration. We plan to re-use the runner code for a Quarkus extension that integrates the runner in development mode.
2022-05-06build(trace/parquet): Remove unnecessary dependenciesFabian Mastenbroek
This change removes several dependencies from the `opendc-trace-parquet` helper module, which are part of Hadoop Common, but are not actually used by the Parquet project.
2022-05-02refactor(trace/parquet): Drop dependency on AvroFabian Mastenbroek
This change updates the Parquet support library in OpenDC to not rely on Avro, but instead interface directly with Parquet's reading and writing functionality, providing less overhead.
2022-05-02perf(trace/opendc): Read records using low-level APIFabian Mastenbroek
This change updates the OpenDC VM format reader implementation to use the low-level record reading APIs provided by the `parquet-mr` library for improved performance. Previously, we used the `parquet-avro` library to read/write Avro records in Parquet format, but that library carries considerable overhead.
2022-05-01refactor(trace/parquet): Support custom ReadSupport implementationsFabian Mastenbroek
This change updates the `LocalParquetReader` implementation to support custom `ReadSupport` implementations, so we do not have to rely on the Avro implementation necessarily.
2022-04-23build: Enable testing for all library modulesFabian Mastenbroek
This change updates the Gradle build configuration to ensure that all library modules (that will be published) use testing and are included in coverage reports. This should ensure the public modules remain well tested.
2022-02-18build: Remove opendc-platform moduleFabian Mastenbroek
This change removes the opendc-platform module from the project. This module represented a Java platform which was previously used for sharing a set of dependency versions between subprojects. However, with the version catalogue that was added by Gradle, we currently do not use the platform anymore.
2021-09-19refactor(capelin): Extract common code out of Capelin experimentsFabian Mastenbroek
This change creates a new module for doing simulations with virtual machine workloads. We have found that a lot of code in the Capelin experiments code is being re-used by non-experiment modules.
2021-09-02refactor(trace): Extract Parquet helpers into separate moduleFabian Mastenbroek
This change extracts the Parquet helpers outside format module into a new module, in order to improve re-usability of these helpers.