Releases: ml-energy/zeus

v0.4.0: `ZeusMonitor`

21 Jun 01:35

What's New

v0.3.0: `ZeusMonitorContext` for in-training-loop profiling

05 Dec 20:54

What's New

  • ZeusMonitorContext allows users to profile their per-iteration energy and time consumption.
    • It is aimed at those who want to get a feel for the energy consumption of their DNN training job by adding a couple of lines (as opposed to modifying existing lines).
    • Documentation and integration example: here
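The idea of wrapping each training iteration to record its time and energy can be sketched as follows. This is a minimal illustration and not the `ZeusMonitorContext` API: the names `profile_iteration` and `FakeEnergyCounter` are hypothetical, and the fake counter stands in for a real GPU energy reading.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator, List, Tuple


@contextmanager
def profile_iteration(
    read_energy_joules: Callable[[], float],
    records: List[Tuple[float, float]],
) -> Iterator[None]:
    """Append (elapsed seconds, consumed joules) for the wrapped block."""
    start_time = time.perf_counter()
    start_energy = read_energy_joules()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start_time
        consumed = read_energy_joules() - start_energy
        records.append((elapsed, consumed))


class FakeEnergyCounter:
    """Hypothetical stand-in for a cumulative GPU energy counter.

    Every read advances the counter by a fixed amount, so the example
    runs without a GPU.
    """

    def __init__(self, joules_per_read: float) -> None:
        self.total = 0.0
        self.joules_per_read = joules_per_read

    def __call__(self) -> float:
        self.total += self.joules_per_read
        return self.total


counter = FakeEnergyCounter(joules_per_read=5.0)
records: List[Tuple[float, float]] = []

for _ in range(3):
    # The existing loop body stays unmodified; only the `with` line is added.
    with profile_iteration(counter, records):
        sum(range(1000))  # placeholder for one training step

for i, (seconds, joules) in enumerate(records):
    print(f"iteration {i}: {seconds:.6f} s, {joules:.1f} J")
```

In a real job the energy reader would query the GPU instead of a fake counter; the point of the sketch is only that profiling is added around the loop body rather than by rewriting it.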

v0.2.2

04 Dec 18:38

Bug Fix

  • Fixed a bug that made all Zeus monitors monitor the same GPU (index 0) in data parallel (DP) mode (#10).

v0.2.1

15 Oct 00:39

Bug Fix

  • Fixed a bug where power limit profiling did not carry over to the next epoch when the dataset has a small number of batches (#7).

v0.2.0: Single-Node Data Parallel Support

08 Oct 19:53

New Features

  • Single-node multi-GPU data parallel training support added (#2)
  • zeus_monitor is built at Docker image build time and baked into the image (#6)

Breaking Changes

  • ZeusDataLoader's profile window for each power limit is now based on the number of iterations, not time. (#2)
    • This was done to ease synchronization between GPUs while profiling power limits.
    • The ZEUS_PROFILE_PARAMS environment variable is now parsed as a comma-separated string of the number of warmup and measure iterations.
    • ZeusMaster's constructor now takes arguments profile_warmup_iters and profile_measure_iters.
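As a rough illustration of the new ZEUS_PROFILE_PARAMS format, the sketch below parses a comma-separated pair of iteration counts. It is not the parser Zeus uses: the function name `parse_profile_params`, the warmup-then-measure ordering, the validation rules, and the example value `10,40` are all assumptions made for this example.

```python
import os
from typing import Mapping, Tuple


def parse_profile_params(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Parse ZEUS_PROFILE_PARAMS into (warmup_iters, measure_iters).

    Assumed format: "<warmup>,<measure>", e.g. "10,40".
    Raises KeyError if the variable is unset and ValueError if it is malformed.
    """
    raw = env["ZEUS_PROFILE_PARAMS"]
    parts = [part.strip() for part in raw.split(",")]
    if len(parts) != 2:
        raise ValueError(
            f"expected two comma-separated integers, got {raw!r}"
        )
    # int() itself raises ValueError on non-numeric input.
    warmup_iters, measure_iters = int(parts[0]), int(parts[1])
    if warmup_iters < 0 or measure_iters <= 0:
        raise ValueError(
            "warmup must be non-negative and measure must be positive"
        )
    return warmup_iters, measure_iters


# Pass an explicit mapping so the example does not depend on the real environment.
warmup, measure = parse_profile_params({"ZEUS_PROFILE_PARAMS": "10,40"})
print(f"warmup={warmup} iterations, measure={measure} iterations")
```

The two parsed values correspond to what the release notes call profile_warmup_iters and profile_measure_iters on ZeusMaster's constructor.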

v0.1.0

27 Aug 16:52

First official release of Zeus!

  • Support for single-GPU training is stable.