[V1.0] Development Work #60

luigifcruz · 2023-07-10T02:57:10Z

The scope of this pull request is to remove unnecessary complexity from BLADE. The goal is to simplify maintenance and facilitate integration with third-party frameworks. The end result should retain the current performance while reducing the boilerplate code necessary to create pipelines and schedule work. The Python interface will also receive an overhaul to match the C++ API in both performance and features. Higher-level applications such as the CLI will also be rewritten using the Python bindings instead of the C++ API.

It isn't in the scope of this PR to modify the code of existing modules. That work is deferred to future versions of BLADE and includes features such as support for multiple ArrayTensor data layouts like ATFP and AFTP, half-precision computation support, and in-place array operations.

Remove CUDA Graphs. [Unnecessary Complexity]
Remove the CLI. [To Be Rewritten in Python]
Remove support for nested Pipelines. [In Favor of Variable Rate Pipelines]
Replace pre-made Pipelines with Bundles. [Better Single Pipeline Module Support]
Refactor Runner with buffer management.
Modernize the Python API, possibly with nanobind. [Too Low-level Currently]
Add variable rate pipeline modules. [Gather, Copy, Permutation]
Add an open-source license.
Write updated documentation. [Readme]

Closes #65 and #61.
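
As a rough illustration of what "variable rate" means in the list above, the sketch below models a gather-style stage that consumes several input blocks before producing one output block, so downstream stages run less often than upstream ones. This is plain Python and not BLADE's API: the `GatherStage` class, its `push` method, the `multiplier` parameter, and `run_pipeline` are hypothetical names chosen for this example, and the real Gather module may behave differently.

```python
from typing import List, Optional


class GatherStage:
    """Hypothetical stage: accumulates `multiplier` input blocks into one output block."""

    def __init__(self, multiplier: int) -> None:
        if multiplier < 1:
            raise ValueError("multiplier must be at least 1")
        self.multiplier = multiplier
        self._pending: List[List[float]] = []

    def push(self, block: List[float]) -> Optional[List[float]]:
        """Accept one input block; return a gathered block once enough have arrived."""
        self._pending.append(block)
        if len(self._pending) < self.multiplier:
            return None  # Not enough input yet: downstream does not run this step.
        gathered = [sample for pending in self._pending for sample in pending]
        self._pending = []
        return gathered


def run_pipeline(blocks: List[List[float]], multiplier: int) -> List[List[float]]:
    """Feed blocks through the stage and collect only the steps that produced output."""
    stage = GatherStage(multiplier)
    outputs = []
    for block in blocks:
        result = stage.push(block)
        if result is not None:
            outputs.append(result)
    return outputs
```

With `multiplier=2`, four input blocks yield two output blocks. A trailing block that never completes a group stays pending and produces no output.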

luigifcruz · 2023-07-13T21:13:04Z

It will be necessary to refactor the Blade::Runner class so that it handles asynchronous host-to-device memory copies locally instead of deferring them to the pipeline execution. This will result in less VRAM usage with no expected impact on performance.

luigifcruz · 2023-07-17T20:58:58Z

To ensure a level of stability for BLADE users, it is essential to keep track of performance regressions. In previous versions, benchmark results were manually saved to a file inside the repository tree. This PR adds the ability to automatically run benchmarks and tests using a standard server. This makes it possible to graph historical benchmark results for each component and see how they have changed over time.

Automated benchmarks and tests will be run before any pull request is merged into the main branch to ensure quality. Approved work-in-progress pull requests will also have the ability to run these automated tasks to keep track of regressions as they occur.

For consistent results, a self-hosted machine dedicated to this task will serve as the standard server. Its hardware configuration will be representative of the hardware currently used in production. By using the results produced by this server, it will be possible to extrapolate the performance for other hardware configurations. This capability will help in understanding how to deploy the pipeline across heterogeneous server hardware.
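
The regression tracking described above could be reduced to a comparison like the one sketched here. This is only an illustration and not the CI logic added by this PR: the `find_regressions` function, the per-benchmark timing dictionaries in milliseconds, and the 5% default tolerance are all assumptions made for the example.

```python
from typing import Dict, List


def find_regressions(
    baseline_ms: Dict[str, float],
    current_ms: Dict[str, float],
    tolerance: float = 0.05,
) -> List[str]:
    """Return names of benchmarks whose runtime grew by more than `tolerance`."""
    regressions = []
    for name, baseline in sorted(baseline_ms.items()):
        current = current_ms.get(name)
        if current is None:
            continue  # Benchmark was not run this time; nothing to compare.
        if current > baseline * (1.0 + tolerance):
            regressions.append(name)
    return regressions
```

Running both the baseline and the current measurement on the same dedicated machine is what makes a fixed tolerance like this meaningful. On shared or varying hardware the noise would likely exceed the threshold.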

luigifcruz · 2023-12-20T22:37:10Z

All tests are passing with no actionable TODO for this PR. Merging!

luigifcruz added 22 commits July 9, 2023 23:36

Remove CUDA graph and CLI.

667fc61

Cleanup Result and MemoryTaint enums.

54479fe

Merge memory helper and utils.

fc8a2b7

Remove memory collection.

aec9a6c

Move device headers to memory/device.

5df2fbb

Port ModeH from Pipeline to Bundle.

9587848

Flag areas of improvement with TODO.

a570281

Add bypass capabilities to Cast module.

790d5f7

Modernize Channelize module.

b1fabb7

Update Bundle casting behavior.

5bf370e

Further add modernization flags.

4b83f34

Remove unnecessary dependency (CLI11).

759f90e

Remove unnecessary dependency.

bfdb2b1

Organize test folder structure.

f22d1d0

Create examples structure.

fd44d2b

Organize src build structure.

17dbf07

Organize meson structure.

2ac08d8

Organize benchmark structure.

8fc0913

Temporary module tests fix.

db40204

Add Pipeline logic for unified mode.

5fc0699

Improve JIT build script.

645499b

Fix JIT build logic.

fc699c2

luigifcruz added 5 commits July 13, 2023 18:26

Fix Module::Taint type.

ba54f08

Add Gather module.

8db433c

Fix misplaced Gather kernel run.

2726701

Add Copy module.

09d5a49

Add Permutation module.

d02f6fe

Aggregate module benchmarks.

30ec618

luigifcruz removed a link to an issue Dec 18, 2023

[V1.1] Write up-to-date documentation. #63

Open

luigifcruz and others added 26 commits December 18, 2023 15:45

Add Gather module to the Mode BH test.

ccbdeec

Patch pipeline compute counter.

94e5e83

Patch build script for Python.

82c1cba

Add banner.

9b28188

Update README.md

acd5dbc

Update README.md

40b2ff7

Update README.md

89d889b

Update README.md

c1eab9b

Update Docker file.

b9f01a0

Make fmtlib static.

c7b0053

Add Python build test to Docker CI.

90b0716

Fix license heading in README.md.

eb0d3e1

Add cuda_args to benchmark and test files

e507cb5

Replace cuFFT callback with kernel.

9ad47b6

Add usage instructions and examples for the library.

84a3a34

Add to README usage.

42703be

Update Python version in Docker run command

5ccb8d9

Remove unnecessary comment in gather module meson.build.

b6bb0bf

Add F32 module to ext_duplicate.cc.

948c34d

Add Duplicate Module benchmarks.

edb9684

Add Gather Module benchmarks.

cc76dbc

Add Permutation Module benchmarks.

4ea3586

Add memory benchmarking to test-docker.yml workflow

bb30619

Fix memory benchmark command in test-docker.yml

82fa059

Bork on purpose to check CI.

b752047

Add CI for all benchmarks.

10e2ce1

luigifcruz merged commit 066482a into main Dec 20, 2023
1 check passed

luigifcruz deleted the dev branch December 20, 2023 22:37
