Zeus v0.10.0: Broader support
What's New
CPU and DRAM energy measurement
We implemented support for Intel RAPL, which allows CPU and DRAM energy measurement on supported CPUs.
Generally speaking, most Intel CPUs support both CPU and DRAM measurement, and some AMD CPUs support RAPL as well, albeit only for CPU measurement.
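To give a feel for what a RAPL reading involves, here is a minimal sketch that reads the Linux powercap counters directly. It is not Zeus's implementation. The sysfs path assumes package 0 is exposed at `intel-rapl:0`, the helper names are hypothetical, and reading `energy_uj` may require elevated permissions on recent kernels. The counter is cumulative and wraps around at `max_energy_range_uj`, so the delta has to account for that.

```python
from pathlib import Path
from typing import Callable

# Assumed location of the package-0 RAPL domain in the Linux powercap sysfs tree.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(path: Path) -> int:
    """Read a microjoule value from a powercap sysfs file."""
    return int(path.read_text().strip())


def energy_delta_uj(start_uj: int, end_uj: int, max_range_uj: int) -> int:
    """Energy between two counter readings, allowing for at most one wraparound."""
    if end_uj >= start_uj:
        return end_uj - start_uj
    # The counter wrapped past max_range_uj and restarted near zero.
    return (max_range_uj - start_uj) + end_uj


def measure_package_energy_uj(work: Callable[[], None], rapl_dir: Path = RAPL_DIR) -> int:
    """Run work() and return the package energy consumed while it ran, in microjoules."""
    max_range_uj = read_uj(rapl_dir / "max_energy_range_uj")
    start_uj = read_uj(rapl_dir / "energy_uj")
    work()
    end_uj = read_uj(rapl_dir / "energy_uj")
    return energy_delta_uj(start_uj, end_uj, max_range_uj)
```

On a machine that exposes RAPL, calling `measure_package_energy_uj(lambda: sum(range(10_000_000)))` would return the package energy over that computation. A window long enough for the counter to wrap more than once cannot be recovered from two readings alone, which is why long-running measurements need periodic polling.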
JAX support
We added preliminary JAX support. Check out our full example here.
API usage is mostly identical:
from zeus.monitor import ZeusMonitor
monitor = ZeusMonitor(sync_execution_with="jax")  # JAX!
monitor.begin_window("computations")
# Run computation
measurement = monitor.end_window("computations")
Zeus Daemon
Our energy optimizers require changing settings on the GPU, including the power limit and frequency, which requires admin privileges. More details in our docs.
Zeus Daemon lets you avoid this: it runs as a standalone daemon process on the node and performs privileged operations on your behalf, so you don't have to give the entire Zeus-integrated application admin privileges.
We wrote the Zeus Daemon in Rust. Check out the source code and crates.io for details.
Breaking Changes
The second parameter of ZeusMonitor.begin_window and ZeusMonitor.end_window, sync_cuda, was renamed to sync_execution.
This is because JAX runs CPU code asynchronously as well, and we would like to synchronize both CUDA and CPU computations. This created the need to generalize sync_cuda to sync_execution.
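For codebases that still pass the old keyword in many places, a small shim can ease the transition. The sketch below is not part of Zeus; `migrate_window_kwargs` is a hypothetical helper that rewrites a keyword-argument dict from the old name to the new one before it is forwarded to `begin_window` or `end_window`.

```python
def migrate_window_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs with the pre-0.10.0 name sync_cuda renamed to sync_execution."""
    migrated = dict(kwargs)
    if "sync_cuda" in migrated:
        if "sync_execution" in migrated:
            # Both names present is ambiguous, so refuse rather than guess.
            raise TypeError("pass only one of sync_cuda and sync_execution")
        migrated["sync_execution"] = migrated.pop("sync_cuda")
    return migrated
```

A call site would then look like `monitor.begin_window("step", **migrate_window_kwargs(old_kwargs))`. Renaming the keyword directly at each call site is the simpler fix when there are only a few of them.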
Changelog
- Docs: Add warnings about instantiating ZeusMonitor as a global variable. by @jaywonchung in #68
- Docs: Fix typo by @Sunt-ing in #69
- Docs: Improve the GPU energy monitoring demo by @Sunt-ing in #70
- Feat: Detect and reject unofficial pynvml bindings by @jaywonchung in #71
- Fix: Pandas warnings from PowerMonitor by @jaywonchung in #75
- Feat: Zeus daemon by @jaywonchung in #81
- Test: Allow zeusd dev and testing on MacOS by @jaywonchung in #82
- Refactor: Reorg zeus.device.gpu by @jaywonchung in #83
- Feat: Integrate zeusd into zeus.device.gpu by @jaywonchung in #85
- Chore: Fix typo in GitHub Actions by @jaywonchung in #86
- Chore: zeusd debug outputs and doc comments by @jaywonchung in #87
- Feat: Add CPU measurement (via Intel RAPL) to ZeusMonitor by @wbjin in #90
- Fix: RAPL DRAM measurements not to be included in package measurements by @wbjin in #92
- Chore: Run checks in PRs from forks by @jaywonchung in #95
- Docs: Fix attribute name in ZeusMonitor example by @HGangloff in #96
- Feat: Add zero energy warning in ZeusMonitor by @sharonsyh in #93
- Feat: Add jax support in CUDA sync by @HGangloff in #97
- Docs: Refine JAX integration and example by @jaywonchung in #99
- Feat: Multi arch docker build by @sharonsyh in #104
- News: Add Perseus news and write Perseus blog by @jaywonchung in #107
- Feat: Multi-Arch Docker Build - Pushing to symbioticlab/zeus and mlenergy/zeus by @sharonsyh in #106
- Feat: RAPL Monitor for monitoring wraparounds for a rapl file by @wbjin in #105
- Test: Tests for CPU monitoring on ZeusMonitor by @wbjin in #100
- Chore: Fix lint warnings from ruff by @wbjin in #108
New Contributors 🎉
- @Sunt-ing made their first contribution in #69
- @wbjin made their first contribution in #90
- @HGangloff made their first contribution in #96
- @sharonsyh made their first contribution in #93
Full Changelog: v0.9.1...zeus-v0.10.0