summaryrefslogtreecommitdiff
path: root/opendc-compute
AgeCommit message (Collapse)Author
2021-10-08perf(simulator): Optimize SimTraceWorkloadFabian Mastenbroek
This change improves the performance of the SimTraceWorkload class by changing the way trace fragments are read and processed by the CPU consumers.
2021-10-03perf(compute): Optimize telemetry collectionFabian Mastenbroek
This change optimizes the telemetry collection in the SimHost class. Previously, there was significant overhead in collecting the metrics of this and associated classes due large `Attributes` object that did not cache accesses to `hashCode()`. We now wrap this object and manually cache the hash code.
2021-10-03feat(simulator): Expose CPU time counters directly on hypervisorFabian Mastenbroek
This change adds a new interface to the SimHypervisor interface that exposes the CPU time counters directly. These are derived from the flow counters and will be used by SimHost to expose them via telemetry.
2021-10-03refactor(simulator): Migrate to flow-based simulationFabian Mastenbroek
This change renames the `opendc-simulator-resources` module into the `opendc-simulator-flow` module to indicate that the core simulation model of OpenDC is based around modelling and simulating flows. Previously, the distinction between resource consumer and provider, and input and output caused some confusion. By switching to a flow-based model, this distinction is now clear (as in, the water flows from source to consumer/sink).
2021-10-03refactor(simulator): Merge distributor and aggregator into switchFabian Mastenbroek
This change removes the distributor and aggregator interfaces in favour of a single switch interface. Since the switch interface is as powerful as both the distributor and aggregator, we don't need the latter two.
2021-09-28fix(compute): Write null values explicitly in Parquet exporterFabian Mastenbroek
2021-09-28fix(compute): Do not recover guests in non-error stateFabian Mastenbroek
2021-09-28refactor(telemetry): Do not require clock for ComputeMetricExporterFabian Mastenbroek
This change drops the requirement for a clock parameter when constructing a ComputeMetricExporter, since it will now derive the timestamp from the recorded metrics.
2021-09-21feat(trace): Add support for writing tracesFabian Mastenbroek
This change adds a new API for writing traces in a trace format. Currently, writing is only supported by the OpenDC VM format, but over time the other formats will also have support for writing added.
2021-09-20refactor(trace): Simplify TraceFormat SPI interfaceFabian Mastenbroek
This change simplifies the TraceFormat SPI interface by reducing the number of interfaces that implementors need to implement to only TraceFormat.
2021-09-20perf(compute): Use index lookup in trace loaderFabian Mastenbroek
This change updates the ComputeWorkloadLoader to use index column lookups in order to prevent having to lookup the index for every row.
2021-09-20refactor(trace): Unify columns of different tablesFabian Mastenbroek
This change unifies columns of different tables used by trace formats. This concretely means that instead of having columns specific per table (e.g., RESOURCE_ID and RESOURCE_STATE_ID), with this changes these columns are shared between the tables with a single definition (RESOURCE_ID).
2021-09-19feat(trace): Add tool for converting workload tracesFabian Mastenbroek
This change adds an initial implementation to the trace library for converting between workload trace formats. Currently the tool supports only converting to the OpenDC VM trace format. However, in the future, we will add support for converting between other formats as well.
2021-09-19feat(trace): Update OpenDC VM trace formatFabian Mastenbroek
This change optimizes the OpenDC VM trace format by removing unnecessary columns as well as optimizing the writer settings. The new implementation still supports reading the old trace format in case users run OpenDC with older workload traces.
2021-09-19feat(trace): Add support for internal OpenDC VM trace formatFabian Mastenbroek
This change adds official support to the trace library for the internal VM trace format used by OpenDC for its experiments. This is a compact format that uses Parquet to store the virtual machine trace data in two Parquet files.
2021-09-19feat(trace): Add support for Azure VM trace formatFabian Mastenbroek
This change adds support in the trace library for the Azure VM trace format.
2021-09-19feat(trace): Add support for extended Bitbrains trace formatFabian Mastenbroek
This change adds support in the trace library for the extended Bitbrains format. This format is slightly different than the CSV format used by the original Bitbrains traces and contains more fields.
2021-09-19refactor(capelin): Make workload sampling model extensibleFabian Mastenbroek
This change updates the workload sampling implementation to be more flexible in the way the workload is constructed. Users can now sample multiple workloads at the same time using multiple samplers and use them as a single workload to simulate.
2021-09-19feat(capelin): Support creating CPU-optimized topologyFabian Mastenbroek
This change adds support for creating a topology that is CPU-optimized for simulation. This means that all the CPU resources of a machine are merged into a single large CPU in order to reduce simulation time.
2021-09-19perf(compute): Add option for optimizing SimHost simulationFabian Mastenbroek
This change adds an option for optimizing SimHost simulation by combining all the CPUs of a machine into a single large CPU. For most workloads, this does not significantly affect the simulation results, but does improve the simulation time by a lot.
2021-09-19refactor(capelin): Support flexible topology creationFabian Mastenbroek
This change adds support for creating flexible topologies by creating a TopologyFactory interface that is responsible for configuring the hosts of a compute service.
2021-09-19refactor(capelin): Extract common code out of Capelin experimentsFabian Mastenbroek
This change creates a new module for doing simulations with virtual machine workloads. We have found that a lot of code in the Capelin experiments code is being re-used by non-experiment modules.
2021-09-17refactor(telemetry): Standardize SimHost metricsFabian Mastenbroek
This change standardizes the metrics emitted by SimHost instances and their guests based on the OpenTelemetry semantic conventions. We now also report CPU time as opposed to CPU work as this metric is more commonly used.
2021-09-17refactor(telemetry): Standardize compute scheduler metricsFabian Mastenbroek
This change updates the OpenDC compute service implementation with multiple meters that follow the OpenTelemetry conventions.
2021-09-17refactor(telemetry): Create separate MeterProvider per service/hostFabian Mastenbroek
This change refactors the telemetry implementation by creating a separate MeterProvider per service or host. This means we have to keep track of multiple metric producers, but that we can attach resource information to each of the MeterProviders like we would in a real world scenario.
2021-09-17refactor(telemetry): Simplify CoroutineMetricReaderFabian Mastenbroek
This change simplifies the CoroutineMetricReader implementation by removing the seperation of reader and exporter jobs.
2021-09-10test(compute): Add test suite for fault injectorFabian Mastenbroek
2021-09-10docs(compute): Clarify terminology in compute serviceFabian Mastenbroek
2021-09-10refactor(compute): Integrate fault injection into compute simulatorFabian Mastenbroek
This change moves the fault injection logic directly into the opendc-compute-simulator module, so that it can operate at a higher abstraction. In the future, we might again split the module if we can re-use some of its logic.
2021-09-07fix(compute): Mark unschedulable server as terminatedFabian Mastenbroek
This change updates the compute service to mark servers that cannot be scheduled as terminated instead of error. Error is instead reserved for cases where the server is in an error state while running.
2021-09-07fix(compute): Do not allow failure of inactive guestsFabian Mastenbroek
This change fixes an issue in SimHost where guests that where inactive were also failed, causing an IllegalStateException.
2021-09-07fix(compute): Use correct memory size for host memoryFabian Mastenbroek
This change fixes an issue where all servers could not be scheduled due to the memory size of the host being computed incorrectly.
2021-09-07feat(compute): Track guest up/down timeFabian Mastenbroek
This change updates the SimHost implementation to track the up and downtime of hypervisor guests.
2021-09-07fix(compute): Start host even if it already exists on hostFabian Mastenbroek
2021-09-07fix(compute): Support overcommitted memory in SimHostFabian Mastenbroek
This change enables host to overcommit their memory when testing whether new servers can fit on the host.
2021-09-07feat(compute): Track host up/down timeFabian Mastenbroek
This change adds new metrics for tracking the up and downtime of hosts due to failures. In addition, this change adds a test to verify whether the metrics are collected correctly.
2021-09-07feat(compute): Track provisioning response timeFabian Mastenbroek
This change adds a metric for the provisioning time of virtual machines by the compute service.
2021-08-25refactor(compute): Measure power draw without PSU overheadFabian Mastenbroek
This change updates the SimHost implementation to measure the power draw of the machine without PSU overhead to make the results more realistic.
2021-08-25fix(simulator): Eliminate unnecessary double to long conversionsFabian Mastenbroek
This change eliminates unnecessary double to long conversions in the simulator. Previously, we used longs to denote the amount of work. However, in the mean time we have switched to doubles in the lower stack.
2021-08-25build: Upgrade to OpenTelemetry 1.5Fabian Mastenbroek
This change upgrades the OpenTelemetry dependency to version 1.5, which contains various breaking changes in the metrics API.
2021-08-24feat(compute): Add support for SimHost failureFabian Mastenbroek
This change adds support for failures in the SimHost implementation. Failing a host will now cause the virtual machine to enter an error state.
2021-08-24fix(capelin): Update Bitbrains trace testsFabian Mastenbroek
This change updates the Bitbrains trace tests with the updated trace that does not hardcode the duration of the trace fragments.
2021-08-24refactor(simulator): Execute traces based on timestampsFabian Mastenbroek
This change refactors the trace workload in the OpenDC simulator to track execute a fragment based on the fragment's timestamp. This makes sure that the trace is replayed identically to the original execution.
2021-08-22refactor(compute): Update FilterScheduler to follow OpenStack's NovaFabian Mastenbroek
This change updates the FilterScheduler implementation to follow more closely the scheduler implementation in OpenStack's Nova. We now normalize the weights, support many of the filters and weights in OpenStack and support overcommitting resources.
2021-08-22fix(compute): Track failed servers with counters correctlyFabian Mastenbroek
2021-08-13build: Update Kotlin dependenciesFabian Mastenbroek
This change updates the Kotlin dependencies used by OpenDC to their latest version.
2021-06-24simulator: Re-implement performance interference modelFabian Mastenbroek
This change updates reimplements the performance interference model to work on top of the universal resource model in `opendc-simulator-resources`. This enables us to model interference and performance variability of other resources such as disk or network in the future.
2021-06-21simulator: Re-organize compute simulator moduleFabian Mastenbroek
This change re-organizes the classes of the compute simulator module to make a clearer distinction between the hardware, firmware and software interfaces in this module.
2021-06-21simulator: Remove concept of resource lifecycleFabian Mastenbroek
This change removes the AutoCloseable interface from the SimResourceProvider and removes the concept of a resource lifecycle. Instead, resource providers are now either active (running a resource consumer) or in-active (being idle), which simplifies implementation.
2021-06-21exp: Enable interpreter sharing across hostsFabian Mastenbroek
This change enables the experiments to share the SimResourceInterpreter across multiple hosts, which allows updates to be scheduled efficiently for all machines at the same time. This is especially beneficial if the machines operate on the same time slices.