| Age | Commit message | Author |
|
* Implemented Workflows for OpenDC
|
|
|
|
(#342)
* renamed performance counter to distinguish different resource types
* added GPU, modelled similar to CPU
* added GPUs to machine model
* list of GPUs instead of single instance
* renamed memory speed to bandwidth
* enabled parsing of GPU resources
* split powermodel into cpu and GPU powermodel
* added gpu parsing tests
* added idea of host level scheduling
* added tests for multi gpu parsing
* renamed powermodel to cpupowermodel
* clarified naming of cpu and gpu components
* added resource type to flow supplier and edge
* added resourcetype
* added GPU components and resource type to fragments
* added GPU to workload and updated resource usage retrieval
* implemented first version of multi resource
* added name to workload
* renamed performance counters
* removed commented out code
* removed deprecated comments
* included demand and supply into calculations
* resolving rebase mismatches
* moved resource type from flowedge class to common package
* added available resources to machines
* cleaner separation of whether a workload is started by simmachine or vm
* Replaced exception with dedicated enum
* Only looping over resources that are actually used
* using hashmaps to handle resourcetype instead of arrays for readability
* fixed condition
* tracking finished workloads per resource type
* removed resource type from flowedge
* made supply and demand distribution resource specific
* added power model for GPU
* removed unused test setup
* removed deprecated comments
* removed unused parameter
* added ID for GPU
* added GPUs and GPU performance counters (naively)
* implemented capturing of GPU statistics
* added reminders for future implementations
* renamed properties for better identification
* added capturing GPU statistics
* implemented first tests for GPUs
* unified access to performance counters
* added interface for general compute resource handling
* implemented multi resource support in simmachine
* added individual edge to VM per resource
* extended compute resource interface
* implemented multi-resource support in PSU
* implemented generic retrieval of computeresources
* implemented multi-resource support in vm
* made method use more resource specific
* implemented simple GPU tests
* rolled back frequency and demand use
* made naming independent of used resource
* using the workload's resources instead of the VM's to determine available resources
* implemented determination of used resources in workload
* removed logging statements
* implemented reading from workload
* fixed naming for host-level allocation
* fixed next deadline calculation
* fixed forwarding supply
* reduced memory footprint
* made GPU powermodel nullable
* made GPU powermodel configurable in topology
* implemented tests for basic gpu scheduler
* added gpu properties
* implemented weights, filter and simple cpu-gpu scheduler
* spotless apply
* spotless apply pt. 2
* fixed capitalization
* spotless kotlin run
* implemented column export
* todo update
* removed code comments
* Merged PerformanceCounter classes into one & removed interface
* removed GPU specific powermodel
* Rebase master: kept both versions of TopologyFactories
* renamed CpuPowermodel to resource independent Powermodel
Moved it from Cpu package to power package
* implemented default of getResourceType & removed overrides if possible
* split getResourceType into Consumer and Supplier
* added power as resource type
* reduced supply demand from arrayList to single value
* combining GPUs into one large GPU, until full multi-gpu support
* merged distribution policy enum with corresponding factory
* added comment
* post-rebase fixes
* aligned naming
* Added GPU metrics to task output
* Updates power resource type to uppercase.
Standardizes the `ResourceType.Power` enum to `ResourceType.POWER`
for consistency with other resource types and improved readability.
* Removes deprecated test assertions
Removes commented-out assertions in GPU tests.
These assertions are no longer needed and clutter the test code.
* Renames MaxMinFairnessStrategy to Policy
Renames MaxMinFairnessStrategy to MaxMinFairnessPolicy for
clarity and consistency with naming conventions. This change
affects the factory and distributor to use the updated name.
* applies spotless
* nulls GPUs as it is not used
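The entries above mention a `MaxMinFairnessPolicy` and resource-specific supply and demand distribution. As a rough illustration only, the textbook max-min fair (water-filling) allocation can be sketched as follows. The function name and signature are hypothetical and are not taken from the OpenDC code base.

```python
def max_min_fair_share(capacity, demands):
    """Textbook max-min fair allocation (hypothetical helper, not OpenDC's API).

    Consumers are visited from smallest to largest demand. Each one receives
    either its full demand or an equal share of what is still available,
    whichever is smaller.
    """
    allocation = [0.0] * len(demands)
    remaining = float(capacity)
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for position, index in enumerate(order):
        consumers_left = len(order) - position
        fair_share = remaining / consumers_left
        granted = min(float(demands[index]), fair_share)
        allocation[index] = granted
        remaining -= granted
    return allocation
```

With a capacity of 10 and demands of 2, 8 and 8, the small consumer is fully satisfied and the two large consumers split the remainder equally.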
|
|
* Start time shifting
* Existing experiments work with new columns
* Remove unused traces dir
* Update java to 21 LTS and jacoco to be compatible
* Minimal working timeshifting
* Timeshift scheduler linked as carbon receiver
* Add basic tests for timeshift scheduler
* Run spotless apply
* Modify trace format tests to support new fields
* Change all mentions of java 19 to 21
* Add a deferAll option to workload to make all tasks deferrable
* Run spotless apply
* Copy traces from resources in web dockerfile
|
|
* Separated `Time` unit into `TimeDelta` and `TimeStamp` + small fixes
Addition and subtraction between `Timestamp`s is not allowed, but any
other `Unit` operation/comparison is. `TimeDelta`s can be
added to/subtracted from `Timestamp`s.
Deserialization of `Timestamp`:
- `Number` -> interpreted as millis from Epoch
- `Instant` (string representation) -> converted to Timestamp
- `Duration` (string representation) -> interpreted as duration since
Epoch (warn msg is logged)
Deserialization of `TimeDelta` is the same as `Time` was before, with the
difference that when an `Instant` is converted to a `TimeDelta` since Epoch
a warning message is logged.
* Unit System v2
- Merged `BoundedPercentage` and `UnboundedPercentage`
- Overrode all operations defined in `Unit` in all subclasses to avoid
as much as possible value classes being boxed in bytecode. If units are used as generics
(hence also functions defined in Unit<T>) they are boxed (as double would if used as generic).
- All units companions now subclass `UnitId`, and can be used as keys
(e.g `Map<UnitId, idk>`), while offering `max` `min` and `zero`
methods.
- Division between the same unit now returns `Percentage`
- Added `Iterable<T>.averageOfUnitOrNull(selector (T) -> <specific unit>)`
- `ifNeg0ThenPos0()` now is optional and not invoked on every
constructor
- Now methods in `Unit<T>` are all abstract, forcing override and avoid
boxing in some cases
- Added `@UnintendedOperation` and `UnitOperationException` for methods
that must be defined but are not intended for use (e.g. `Timestamp` +
`Timestamp`)
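The `Timestamp`/`TimeDelta` rules described above can be illustrated with a small sketch. This is not the Kotlin implementation from the commit. The class layout, the `UnitOperationError` name and the `parse_timestamp` helper are assumptions made for illustration, and parsing of `Duration` strings is left out.

```python
from dataclasses import dataclass
from datetime import datetime


class UnitOperationError(TypeError):
    """Raised for operations that exist but are not intended for use."""


@dataclass(frozen=True)
class TimeDelta:
    millis: float

    def __add__(self, other):
        if isinstance(other, TimeDelta):
            return TimeDelta(self.millis + other.millis)
        return NotImplemented


@dataclass(frozen=True)
class Timestamp:
    epoch_millis: float

    def __add__(self, other):
        if isinstance(other, TimeDelta):
            return Timestamp(self.epoch_millis + other.millis)
        if isinstance(other, Timestamp):
            raise UnitOperationError("Timestamp + Timestamp is not allowed")
        return NotImplemented

    def __sub__(self, other):
        if isinstance(other, TimeDelta):
            return Timestamp(self.epoch_millis - other.millis)
        if isinstance(other, Timestamp):
            raise UnitOperationError("Timestamp - Timestamp is not allowed")
        return NotImplemented


def parse_timestamp(value):
    """Numbers are millis since Epoch; strings are ISO-8601 instants."""
    if isinstance(value, (int, float)):
        return Timestamp(float(value))
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    instant = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return Timestamp(instant.timestamp() * 1000)
```

For example, `Timestamp(1000) + TimeDelta(500)` yields `Timestamp(1500)`, while `Timestamp(1) + Timestamp(2)` raises `UnitOperationError`.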
|
|
* Started on reimplementing the SimTrace implementation
* updated trace format. Fragments now do not have a deadline, but a duration. The Fragments are executed in order.
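To make the duration-based format concrete, the sketch below lays out fragments back to back, so that each fragment starts when the previous one ends. The field names (`duration_ms`, `cpu_usage`, `core_count`) and the `build_timeline` helper are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    # Hypothetical field names, chosen for illustration only.
    duration_ms: int
    cpu_usage: float
    core_count: int


def build_timeline(fragments, start_ms=0):
    """Return (start, end, fragment) tuples for fragments executed in order."""
    timeline = []
    now = start_ms
    for fragment in fragments:
        end = now + fragment.duration_ms
        timeline.append((now, end, fragment))
        now = end
    return timeline
```

Because no absolute deadline is stored, shifting the start time of a task moves all of its fragments with it.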
|
|
* Updated SimTrace to use a single ArrayDeque instead of three separate lists for deadline, cpuUsage, and coreCount
* Renamed input files to tasks.parquet and fragments.parquet. Renamed server to task. OpenDC now exports tasks.parquet instead of server.parquet
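The move from three parallel lists to a single queue of records can be sketched roughly as below. This uses Python's `collections.deque` as a stand-in for `ArrayDeque`. The record type and helper name are assumptions, not the actual SimTrace code.

```python
from collections import deque, namedtuple

# One record per fragment instead of three index-aligned lists.
FragmentRecord = namedtuple("FragmentRecord", ["deadline", "cpu_usage", "core_count"])


def build_queue(deadlines, cpu_usages, core_counts):
    """Zip the three parallel lists into one queue of records."""
    return deque(
        FragmentRecord(d, u, c)
        for d, u, c in zip(deadlines, cpu_usages, core_counts)
    )
```

A consumer can then call `popleft()` to take the next fragment, with no risk of the three lists drifting out of sync.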
|
|
|
|
|
|
* sync with the master branch
* rebase
* multimodel - simulation is currently run as many times as you can see a model
* factory method - handles models without given params
* removed redundant flags
* modelType
* flags removed
* implemented output into a folder
* multimodel ipynb setup - to be implemented and also ran as a python script, when the simulation occurs
* towards a multimodel python implementation - issue observed - the saved files have same data?
* json parsing now handles lists for topology, workloads, allocationPolicies, powerModels
* scenarioFile inputs lists, and creates multiple combinations of scenarios
* multi-model prediction repaired, now we predict using multiple models
* commit before removing powerModel from scenario
* commit after removing powerModel from scenario
* commit after removing powerModel from scenario (and actually running)
* powermodels now can output their name and full name (with min and max)
* now we can select where to output (seed or output folder)
* input files - clear naming + output naming improved
* minimal changes
* all tests passing + json files from tests updated to the new json format
* json files from topology now accept only one power model (instead of list)
* multi and single input from tests updated to match the format
* tests passed locally
* spotless applies
* demo folder removed
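The entries above describe a scenario file whose fields may be lists, from which multiple scenario combinations are generated. A minimal sketch of that expansion as a Cartesian product is shown below. The helper name and the example file names are hypothetical, and the actual OpenDC scenario format may differ.

```python
from itertools import product


def expand_scenarios(spec):
    """Expand a spec whose values may be lists into one dict per combination."""
    keys = list(spec)
    # Treat a single value as a one-element list.
    value_lists = [v if isinstance(v, list) else [v] for v in spec.values()]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


# Hypothetical input: two topologies x one workload x two allocation policies.
scenarios = expand_scenarios({
    "topologies": ["small.json", "large.json"],
    "workloads": "trace-a",
    "allocationPolicies": ["mem", "random"],
})
```

Here `scenarios` contains four entries, one for each topology and allocation policy pair.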
|
|
* Started with the carbon trace implementation
* Moved the carbon trace system to the proper folders
|
|
* Revamped the trace system. All TraceFormat files are now in the api module. This fixes some problems with not being able to use types of traces
* applied spotless
|
|
* Updated all package versions including kotlin. Updated all web-server tests to run.
* Changed the java version of the tests. OpenDC now only supports java 19.
* small update
* test update
* new update
* updated docker version to 19
|
|
This change fixes an issue where some of the traces from the Workflow
Trace Archive would fail to load with the trace format in OpenDC. This
was caused by one of the fields being stored as a double, while the
format expects it to be a long.
Parquet does not support unioning primitive types. Therefore, we have to
disable strict type checking when reading the file. Furthermore, we need
to support double entries for storing the workflow ids.
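As a rough sketch of what accepting double-typed workflow ids could look like, the helper below converts a floating-point id to an integer only when it has no fractional part. How the actual reader performs this conversion is not stated in the commit message, so both the function name and the strictness of the check are assumptions.

```python
def to_workflow_id(value):
    """Coerce an id stored as int or double into an int (hypothetical helper)."""
    if isinstance(value, bool):
        raise TypeError("workflow id must be numeric")
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        # Rejects fractional values as well as NaN and infinity.
        if not value.is_integer():
            raise ValueError(f"workflow id {value!r} is not a whole number")
        return int(value)
    raise TypeError("workflow id must be numeric")
```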
|
|
This change adds the log4j-core dependency to various modules of OpenDC
using log4j2, to ensure logging keeps working. The upgrade to SLF4J 2.0 broke
the Log4j2 functionality, since the log4j-core artifact is not
automatically shipped with the SLF4J implementation.
|
|
This change updates the build configuration to use Spotless for code
formatting of both Kotlin and Java.
|
|
This change updates the repository to remove the use of wildcard imports
everywhere. Wildcard imports are not allowed by default by Ktlint as
well as Google's Java style guide.
|
|
This change updates the TraceFormat lookup algorithm to prevent caching
the available trace format on first access. Since the result of
ServiceLoader depends on the Thread's context ClassLoader, they may
differ between different threads.
Furthermore, ServiceLoader maintains its own thread-local cache, so we
can instead utilize that cache and always use the results returned by
it.
|
|
This change updates the build configuration to ignore the reload4j
dependency that was recently added to the hadoop-common module. Reload4j
replaces the old unmaintained log4j1 module.
However, since we expose this module as a library, we do not want to
include a logging implementation in the dependencies. Currently, there
are already instances where this new dependency leads to duplicate
logging implementations on the classpath.
|
|
This change updates the simulator dependencies to the latest available
version where possible.
|
|
This change adds a re-usable test suite for the interface of the OpenDC
trace API, so implementors can verify whether they match the
specification of the interfaces.
|
|
This change adds JMH benchmarks for the parsing logic of the Azure VM
trace format in order to catch performance regressions.
|
|
This change adds JMH benchmarks for the parsing logic of the OpenDC VM
trace format in order to catch performance regressions.
|
|
This change updates the trace API by introducing a limited type system
for the table columns. Previously, the table columns could have any
possible type representable by the JVM. With this change, we limit the
available types to a small type system.
|
|
This change splits the command line interface from the OpenDC web runner
into a separate configuration. We plan to re-use the runner code for a Quarkus
extension that integrates the runner in development mode.
|
|
This change removes several dependencies from the `opendc-trace-parquet`
helper module, which are part of Hadoop Common, but are not actually
used by the Parquet project.
|
|
This change adds support for projections in the Apache Calcite
integration with OpenDC. This enables faster queries when only a subset
of the table columns is selected.
|
|
This change adds support for projecting certain columns of a table. This
enables faster reading for tables with a high number of columns.
Currently, we support projection in the Parquet-based workload formats.
Other formats are text-based and will probably not benefit much from
projection.
|
|
This change updates the Parquet support library in OpenDC to not rely on
Avro, but instead interface directly with Parquet's reading and writing
functionality, providing less overhead.
|
|
This change updates the Workflow Trace format implementation in OpenDC to
not use the `parquet-avro` library for reading trace data, but
instead to use the low-level APIs to directly read the data from Parquet.
This reduces the amount of conversions necessary before reaching the
OpenDC trace API.
|
|
This change updates the OpenDC VM format reader implementation to use
the low-level record reading APIs provided by the `parquet-mr` library
for improved performance. Previously, we used the `parquet-avro` library
to read/write Avro records in Parquet format, but that library carries
considerable overhead.
|
|
This change updates the `LocalParquetReader` implementation to support
custom `ReadSupport` implementations, so we do not have to rely on the
Avro implementation necessarily.
|
|
This change adds a command line interface for querying workload traces
using SQL. We provide a new command for the trace tools that can query a
workload trace.
|
|
This change updates the Apache Calcite integration to support writing
workload traces via SQL. This enables custom conversion scripts between
different workload traces.
|
|
This change adds support for querying workload trace formats implemented
using the OpenDC API through Apache Calcite. This allows users to write
SQL queries to explore the workload traces.
|
|
This change updates the Gradle build configuration of the project to
publish the different type of modules (e.g., opendc-compute,
opendc-simulator) into their own groups.
|
|
This change updates the Gradle build configuration to ensure that all
library modules (that will be published) use testing and are included in
coverage reports. This should ensure the public modules remain well
tested.
|
|
This change moves the trace conventions (such as table and column names)
into a separate conv package, so that it is separated from the main API.
This also allows for a potential move into a separate module in the
future.
|
|
This change updates the OpenDC VM trace format to incorporate the VM
interference model in the trace format itself. This makes sense since
the model is tightly coupled to the actual trace that is being
simulated.
The benefit of this approach is that we can directly load the
interference model from the workload trace, without having to resolve
the model separately (as we did before).
|
|
This change fixes an issue where the number of vCPUs was not taken into
account when converting from CPU Usage percentage to MHz.
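The corrected conversion can be illustrated with a short worked example. It assumes that the usage percentage is relative to the full capacity of the VM and that the frequency is given per core. Neither assumption is stated in the commit message, and the function name is hypothetical.

```python
def cpu_usage_mhz(usage_percent, vcpu_count, core_frequency_mhz):
    """Convert a CPU usage percentage into MHz, scaled by the number of vCPUs."""
    return (usage_percent / 100.0) * vcpu_count * core_frequency_mhz
```

Under these assumptions, a VM with 4 vCPUs at 2000 MHz each that reports 50% usage consumes 4000 MHz. Leaving out the vCPU count would give 1000 MHz instead.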
|
|
This change removes the opendc-platform module from the project. This
module represented a Java platform which was previously used for sharing
a set of dependency versions between subprojects. However, with the
version catalogue that was added by Gradle, we currently do not use the
platform anymore.
|
|
Tasks from a .gwf trace file did not have dependencies because this
property was not assigned after being read in the GwfTaskTableReader.
I removed the conversion from String to Long in parseParents because it
seems like other readers (the Parquet reader in particular) return
Strings as well, which is why they are converted to Long in line 75 of
TraceHelpers.kt.
Co-authored-by: Fabian Mastenbroek <mail.fabianm@gmail.com>
|
|
This change updates the implementation of the trace converter and
SimTrace implementation to support cases where there is a gap between
samples in the trace data.
This change allows users to specify what to do in case samples are
missing in the trace. The available options are specified in
`SimTrace.FillMode`. Currently, we support either carrying the previous
value forward or setting the usage to zero.
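The two fill behaviours can be sketched as follows. This is not the `SimTrace.FillMode` implementation. The enum member names, the fixed sampling interval and the `fill_gaps` helper are assumptions used to illustrate the idea.

```python
from enum import Enum


class FillMode(Enum):
    # Hypothetical member names; the real enum may differ.
    PREVIOUS = "previous"
    ZERO = "zero"


def fill_gaps(samples, interval, mode):
    """Insert samples wherever consecutive timestamps are more than one interval apart.

    `samples` is a list of (timestamp, usage) tuples sorted by timestamp.
    """
    filled = []
    previous = None
    for timestamp, usage in samples:
        if previous is not None:
            prev_ts, prev_usage = previous
            missing_ts = prev_ts + interval
            while missing_ts < timestamp:
                value = prev_usage if mode is FillMode.PREVIOUS else 0.0
                filled.append((missing_ts, value))
                missing_ts += interval
        filled.append((timestamp, usage))
        previous = (timestamp, usage)
    return filled
```

Carrying the previous value forward suits traces where missing samples mean "unchanged", while zero filling suits traces where missing samples mean the machine was idle.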
|
|
This change adds support for converting the Azure VM traces into the
OpenDC trace format.
|
|
This change adds a new column to the resource table of the OpenDC trace format for
the CPU capacity provisioned for a virtual machine, so that this
capacity can be assigned to the virtual machine during simulation.
|
|
This change addresses an issue where the timestamps in the Azure trace
were not retrieved correctly from the files.
|
|
This change updates the Azure VM trace format implementation to directly
support loading a trace in GZIP format in order to prevent users having
to decompress the trace files so they can be opened by OpenDC.
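One way to read a trace file regardless of whether it is compressed is to check for the GZIP magic bytes before opening it. The sketch below shows this in Python. It is not how the OpenDC reader is implemented, and the helper name is hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # First two bytes of every GZIP stream.


def open_trace_text(path):
    """Open a trace file as text, decompressing on the fly if it is GZIP."""
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == GZIP_MAGIC
    if is_gzip:
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")
```

Checking the file content rather than the extension means a renamed `.csv.gz` file is still handled correctly.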
|
|
This change adds a new API for writing traces in a trace format.
Currently, writing is only supported by the OpenDC VM format, but over
time the other formats will also have support for writing added.
|
|
This change simplifies the TraceFormat SPI interface by reducing the
number of interfaces that implementors need to implement to only
TraceFormat.
|
|
|