| Age | Commit message (Collapse) | Author |
|
(#342)
* renamed performance counter to distinguish different resource types
* added GPU, modelled similar to CPU
* added GPUs to machine model
* list of GPUs instead of single instance
* renamed memory speed to bandwidth
* enabled parsing of GPU resources
* split powermodel into cpu and GPU powermodel
* added gpu parsing tests
* added idea of host level scheduling
* added tests for multi gpu parsing
* renamed powermodel to cpupowermodel
* clarified naming of cpu and gpu components
* added resource type to flow suplier and edge
* added resourcetype
* added GPU components and resource type to fragments
* added GPU to workload and updated resource usage retrieval
* implemented first version of multi resource
* added name to workload
* renamed perfomance counters
* removed commented out code
* removed deprecated comments
* included demand and supply into calculations
* resolving rebase mismatches
* moved resource type from flowedge class to common package
* added available resources to machinees
* cleaner separation if workload is started of simmachine or vm
* Replaced exception with dedicated enum
* Only looping over resources that are actually used
* using hashmaps to handle resourcetype instead of arrays for readability
* fixed condition
* tracking finished workloads per resource type
* removed resource type from flowedge
* made supply and demand distribution resource specific
* added power model for GPU
* removed unused test setup
* removed depracated comments
* removed unused parameter
* added ID for GPU
* added GPUs and GPU performance counters (naively)
* implemented capturing of GPU statistics
* added reminders for future implementations
* renamed properties for better identification
* added capturing GPU statistics
* implemented first tests for GPUs
* unified access to performance counters
* added interface for general compute resource handling
* implemented multi resource support in simmachine
* added individual edge to VM per resource
* extended compute resource interface
* implemented multi-resource support in PSU
* implemented generic retrieval of computeresources
* implemented mult-resource suppport in vm
* made method use more resource specific
* implemented simple GPU tests
* rolled back frquency and demand use
* made naming independent of used resource
* using workloads resources instead of VMs to determine available resource
* implemented determination of used resources in workload
* removed logging statements
* implemented reading from workload
* fixed naming for host-level allocation
* fixed next deadline calculation
* fixed forwarding supply
* reduced memory footprint
* made GPU powermodel nullable
* maded Gpu powermodel configurable in topology
* implemented tests for basic gpu scheduler
* added gpu properties
* implemented weights, filter and simple cpu-gpu scheduler
* spotless apply
* spotless apply pt. 2
* fixed capitalization
* spotless kotlin run
* implemented coloumn export
* todo update
* removed code comments
* Merged PerformanceCounter classes into one & removed interface
* removed GPU specific powermodel
* Rebase master: kept both versions of TopologyFactories
* renamed CpuPowermodel to resource independent Powermodel
Moved it from Cpu package to power package
* implementated default of getResourceType & removed overrides if possible
* split getResourceType into Consumer and Supplier
* added power as resource type
* reduced supply demand from arrayList to single value
* combining GPUs into one large GPU, until full multi-gpu support
* merged distribution policy enum with corresponding factory
* added comment
* post-rebase fixes
* aligned naming
* Added GPU metrics to task output
* Updates power resource type to uppercase.
Standardizes the `ResourceType.Power` enum to `ResourceType.POWER`
for consistency with other resource types and improved readability.
* Removes deprecated test assertions
Removes commented-out assertions in GPU tests.
These assertions are no longer needed and clutter the test code.
* Renames MaxMinFairnessStrategy to Policy
Renames MaxMinFairnessStrategy to MaxMinFairnessPolicy for
clarity and consistency with naming conventions. This change
affects the factory and distributor to use the updated name.
* applies spotless
* nulls GPUs as it is not used
|
|
|
|
|
|
* Start time shifting
* Existing experiments work with new columns
* Remove unused traces dir
* Update java to 21 LTS and jacoco to be compatible
* Minimal working timeshifting
* Timeshift scheduler linked as carbon receiver
* Add basic tests for timeshift scheduler
* Run spotless apply
* Modify tarce format tests to support new fields
* Change all mentions of java 19 to 21
* Add a deferAll option to workload to make all tasks deferrable
* Run spotless apply
* Copy traces from resources in web dockerfile
|
|
|
|
* some updates
* Updates
* Added comments and renamed variables
* Ran Spotless
|
|
* Added maxCpuDemand to TraceWorkload, don't know if this will be needed so might remove later.
Updated SimTraceWorkload to properly handle creating checkpoints
Fixed a bug with the updatedConsumers in the FlowDistributor
Implemented a first version of scaling the runtime of fragments.
* small update
* updated tests to reflect the changes in the checkpointing model
* Updated the checkpointing tests to reflect the changes made
* updated wrapper-validation-action
* Applied spotless
|
|
* Added sampleFraction and submissionTime to the workloadSpec
* Removed commented code
|
|
new workload types are added (#294)
|
|
* Updated tests
Changed all floats into doubles to have consistency over the whole framework
Made a small update to the multiplexer to better push through supply and demand
Fixed small typo
Updated M3SA paths.
fixed merge conflicts
Removed unused components. Updated tests.
Improved checkpointing model
Improved model, started with SimPowerSource
implemented FailureModels and Checkpointing
First working version
midway commit
first update
All simulation are now run with a single CPU and single MemoryUnit. multi CPUs are combined into one. This is for performance and explainability.
* Updated test memory
|
|
* Removed unused components. Updated tests.
Improved checkpointing model
Improved model, started with SimPowerSource
implemented FailureModels and Checkpointing
First working version
midway commit
first update
All simulation are now run with a single CPU and single MemoryUnit. multi CPUs are combined into one. This is for performance and explainability.
* fixed merge conflicts
* Updated M3SA paths.
* Fixed small typo
|
|
* Started on reimplementing the SimTrace implementation
* updated trace format. Fragments now do not have a deadline, but a duration. The Fragments are executed in order.
|
|
* Updated SimTrace to use a single ArrayDeque instead of three separate lists for deadline, cpuUsage, and coreCount
* Renamed input files to tasks.parquet and fragments.parquet. Renamed server to task. OpenDC nows exports tasks.parquet instead of server.parquet
|
|
|
|
* Initial commit
* Implemented a new systems of defining and running scenarios / portfolios. Scenarios and Portfolios can now be defined using JSON files similar to topologies. This allows user to define experiments without changing any KotLin code.
* Ran spotlessApply
|
|
* Updated all package versions including kotlin. Updated all web-server tests to run.
* Changed the java version of the tests. OpenDC now only supports java 19.
* small update
* test update
* new update
* updated docker version to 19
* updated docker version to 19
|
|
* removed experiment-compute and integrated all components into opendc-compute
* updated workflow gradle file
* removed unneeded code
|
|
This change integrates the classes from the old
`opendc-compute-workload` module into the `opendc-experiments-compute`
module. This new module contains helper classes for setting up
experiments with the OpenDC compute service.
|
|
This change adds a new module `opendc-experiments-compute` that provides
provisioner implementations for experiments to use for setting up the
compute service of OpenDC and provisioning (simulated) hosts.
|
|
This change updates the interface of `ComputeService` to provide access
to the instances (servers) that have been registered with the compute
service. This allows metric collectors to query the metrics of the
servers that are currently running.
|
|
This change updates the `ComputeServiceHelper` class to provide the
failure model via a parameter to the `run` method instead of constructor
parameter. This separates the construction of the topology from the
simulation of the workload.
|
|
This change simplifies the SimHypervisor class into a single
implementation. Previously, it was implemented as an abstract class with
multiple implementations for each multiplexer type. We now pass the
multiplexer type as parameter to the SimHypervisor constructor.
|
|
This change updates the virtual machine performance interference model
so that the interference domain can be constructed independently of the
interference profile. As a consequence, the construction of the topology
now does not depend anymore on the interference profile.
|
|
This change updates the constructor of SimHost to receive a
`SimBareMetalMachine` and `SimHypervisor` directly instead of
construction these objects itself. This ensures better testability and
also simplifies the constructor of this class, especially when future
changes to `SimBareMetalMachine` or `SimHypervisor` change their
constructors.
|
|
This change moves the Random dependency outside the interference model,
to allow the interference model to be completely immutable and passable
between different simulations.
|
|
This change removes the timestamp parameter from `SimTrace`. Instead, it
is now assumed that the trace is continuous and the end of a fragment
starts a new fragment, in order to simplify replaying of the trace.
|
|
This change fixes an issue with the metric exporting code in OpenDC
where a UUID is not converted correctly into a `Binary` object that is
consumed by the Apache Parquet library.
|
|
This change updates the trace API by introducing a limited type system
for the table columns. Previously, the table columns could have any
possible type representable by the JVM. With this change, we limit the
available types to a small type system.
|
|
This change removes the OpenTelemetry integration from the OpenDC
Compute modules. Previously, we chose to integrate OpenTelemetry to
provide a unified way to report metrics to the users.
Although this worked as expected, the overhead of the OpenTelemetry when
collecting metrics during simulation was considerable and lacked more
optimization opportunities (other than providing a separate API
implementation). Furthermore, since we were tied to OpenTelemetry's SDK
implementation, we experienced issues with throttling and registering
multiple instruments.
We will instead use another approach, where we expose the core metrics
in OpenDC via specialized interfaces (see the commits before) such that
access is fast and can be done without having to interface with
OpenTelemetry. In addition, we will provide an adapter to that is able
to forward these metrics to OpenTelemetry implementations, so we can
still integrate with the wider ecosystem.
|
|
This change introduces a `ComputeMetricReader` class that can be used as
a replacement for the `CoroutineMetricReader` class when reading metrics
from the Compute service. This implementation operates directly on a
`ComputeService` instance, providing better performance.
|
|
This change updates the `ParquetDataWriter` class to not use the
`parquet-avro` library for exporting experiment data, but instead to use
the low-level APIs to directly write the data in Parquet format.
|
|
This change updates the `LocalParquetReader` implementation to support
custom `ReadSupport` implementations, so we do not have to rely on the
Avro implementation necessarily.
|
|
This change moves the trace conventions (such as table and column names)
in a separate conv package, so that it is separated from the main API.
This also allows for a potential move into a separate module in the
future.
|
|
This change updates the compute support library to load the VM
interference model via the OpenDC trace library, which provides a
generic interface for reading interference models associated with
workload traces.
|
|
This change fixes an issue with the ComputeServiceHelper where it
allowed users to register multiple SimHost objects with the same UID.
See this issue for more information:
https://github.com/atlarge-research/opendc/issues/51
|
|
This change updates the OpenDC codebase to use OpenTelemetry v1.11,
which stabilizes the metrics API. This stabilization brings quite a few
breaking changes, so significant changes are necessary inside the OpenDC
codebase.
|
|
This change adds a new module, opendc-workflow-workload that contains
helper code for constructing workflow simulations using OpenDC.
|
|
This change updates the implementation of the trace converter and
SimTrace implementation to support cases where there is a gap between
samples in the trace data.
This change allows users to specify what to do in case samples are
missing in the trace. The available options are specified in
`SimTrace.FillMode`. Currently, we support either carrying the previous
value forward or set the usage to zero.
|
|
This change redesigns the ComputeMonitor interface to reduce the number
of memory allocations necessary during a collection cycle.
|
|
This change adds support for collecting the provisioning time of virtual
machines in addition to their boot time.
|
|
This change redesigns the virtual machine interference algorithm to have
a fixed memory usage per `VmInterferenceModel` instance. Previously, for
every interference domain, a copy of the model would be created, leading
to OutOfMemory errors when running multiple experiments at the same
time.
|
|
This change allows users to create servers with a smaller CPU capacity
than the host, by specifying the CPU capacity via metadata. This also
allows filtering hosts based on their available CPU capacity.
|
|
This change improves the performance of the SimTraceWorkload class by
changing the way trace fragments are read and processed by the CPU
consumers.
|
|
This change renames the `opendc-simulator-resources` module into the
`opendc-simulator-flow` module to indicate that the core simulation
model of OpenDC is based around modelling and simulating flows.
Previously, the distinction between resource consumer and provider, and
input and output caused some confusion. By switching to a flow-based
model, this distinction is now clear (as in, the water flows from source
to consumer/sink).
|
|
|
|
This change drops the requirement for a clock parameter when
constructing a ComputeMetricExporter, since it will now derive the
timestamp from the recorded metrics.
|
|
This change adds a new API for writing traces in a trace format.
Currently, writing is only supported by the OpenDC VM format, but over
time the other formats will also have support for writing added.
|
|
This change simplifies the TraceFormat SPI interface by reducing the
number of interfaces that implementors need to implement to only
TraceFormat.
|
|
This change updates the ComputeWorkloadLoader to use index column
lookups in order to prevent having to lookup the index for every row.
|
|
This change unifies columns of different tables used by trace formats.
This concretely means that instead of having columns specific per table
(e.g., RESOURCE_ID and RESOURCE_STATE_ID), with this changes these
columns are shared between the tables with a single definition
(RESOURCE_ID).
|