Update doc according npm workspace #636

Merged
merged 30 commits into from
Feb 28, 2024
ac08f2f
DEV.md: Fix html formatting
JulienVig Feb 22, 2024
b57755d
Delete empty docs/web_example directory
JulienVig Feb 22, 2024
0eae2f0
ONBOARDING.md: remove outdated VSCode tip
JulienVig Feb 22, 2024
787b6fd
Update ONBOARDING.md
JulienVig Feb 22, 2024
cf77583
CONTRIBUTING.md: update commands with workspaces and remove appendix
JulienVig Feb 22, 2024
df0315c
ONBOARDING.md: Update watch instructions
JulienVig Feb 22, 2024
90b5055
cli/README.md: Update instructions
JulienVig Feb 22, 2024
329864e
Clean the CLI module
JulienVig Feb 22, 2024
9643964
docs/node_example: Integrate server startup to example
JulienVig Feb 22, 2024
b5e4785
node_example: update dependencies
JulienVig Feb 22, 2024
7e921e8
node_example: update README.md instructions and explanations
JulienVig Feb 22, 2024
3412165
web-client: Update README.md
JulienVig Feb 22, 2024
afb6407
Rename node_example main script
JulienVig Feb 22, 2024
7719b1d
Merge onboarding.md into contributing.md
JulienVig Feb 27, 2024
d1bb792
node_examples: renamed to examples and add a new example for custom t…
JulienVig Feb 27, 2024
746c6d2
Merge each module start build test instructions in CONTRIBUTING.md
JulienVig Feb 27, 2024
3d887fe
Typo CONTRIBUTING.md
JulienVig Feb 27, 2024
18d5b85
Merge with develop
JulienVig Feb 27, 2024
8f6ba0f
Fix typos
JulienVig Feb 27, 2024
ecc63ce
examples/README: Fix markdown
JulienVig Feb 27, 2024
e371f0b
DEV.md: change links to examples folder
JulienVig Feb 27, 2024
706ecbf
CONTRIBUTING.md: fix broken link
JulienVig Feb 27, 2024
0cebb4a
Fix linting errors
JulienVig Feb 27, 2024
f793de3
Fix cli/package.json main
JulienVig Feb 28, 2024
06d51c4
Simply CLI start example
JulienVig Feb 28, 2024
0a63cb4
Fix confusion in docs/CONTRIBUTING.md
JulienVig Feb 28, 2024
0031ea2
Update docs/examples/README.md
JulienVig Feb 28, 2024
ef6c031
Update docs/examples/package.json
JulienVig Feb 28, 2024
c8c2fd2
Update docs/examples/README.md
JulienVig Feb 28, 2024
3a2a07a
Fix inaccuracy in DEV.md
JulienVig Feb 28, 2024
80 changes: 58 additions & 22 deletions DEV.md
@@ -1,4 +1,4 @@
<div align="center">
<div align="center">
<h1>DISCO <code>developer guide</code></h1>
<p>
<a href="https://github.com/epfml/disco/actions/workflows/lint-test-build.yml"><img src="https://github.com/epfml/disco/actions/workflows/lint-test-build.yml/badge.svg" alt="build status" /></a>
@@ -16,10 +16,9 @@ Here you will have a first overview of the project, how to install and run an in
The DISCO project is composed of multiple parts. At the root level, there are four main folders: `discojs`, `server`, `web-client` and `cli`.

- `discojs`, or Disco.js, is the TypeScript library that contains federated and decentralized learning logic. The library makes it possible to train and use machine learning models in a distributed fashion. The library itself is composed of the `disco-node` and `disco-web` modules, both of them extending the platform-agnostic code in `disco-core`. In other words, `disco-core` contains most of the implementation but can't be used by itself, while `disco-web` and `disco-node` allow using `disco-core` via different architectures. To some extent, you can think of `disco-core` as an abstract class extended by `disco-web` and `disco-node`.
- `disco-node` lets you use Disco.js with Node.js. For example, the `server` and the `cli` rely on `disco-node`. A user can also directly import the `disco-node` package in their Node.js programs.
- `disco-web` allows using Disco.js through a browser. The `web-client`, discussed below, relies on `disco-web` to implement a browser UI.

The main difference between the two is how they handle storage: a browser doesn't have access to the file system (for security reasons) while a Node.js application does.
- `disco-node` lets you use Disco.js with Node.js. For example, the `server` and the `cli` rely on `disco-node`. A user can also directly import the `disco-node` package in their Node.js programs.
- `disco-web` allows using Disco.js through a browser. The `web-client`, discussed below, relies on `disco-web` to implement a browser UI.
The main difference between the two is how they handle storage: a browser doesn't have access to the file system (for security reasons) while a Node.js application does.
- `server` contains the server implementation necessary to use Disco.js. Indeed, while the federated and decentralized learning logic is implemented by Disco.js, we still need a server to orchestrate users in both paradigms. In decentralized learning, the server exposes an API for users to query the necessary information to train models in a decentralized fashion, such as the list of other peers. Thus, the server never receives training data or model parameters. In federated learning, the server receives model updates but never training data. It keeps track of participants and updates the model weights. A `server` instance is **always** necessary to use DISCO, whether one is using a browser UI, the CLI or directly programming with `disco-node`.
- `web-client` implements a browser User Interface. In other words, it implements a website allowing users to use DISCO without coding. Via the browser, a user can create and participate in federated and decentralized training sessions, evaluate models, etc.
- `cli` contains the Command Line Interface for Disco.js. For example, the CLI allows a user to create and join training sessions from the command line, benchmark performance by emulating multiple clients, etc.
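
To make the server's federated role more concrete, here is a minimal sketch of what "updates the model weights" could look like, assuming plain unweighted averaging of equally sized weight vectors. `Weights` and `averageUpdates` are hypothetical names used for illustration only; they are not part of Disco.js, whose actual aggregation may differ.

```ts
// Hypothetical types and names; not taken from the DISCO codebase.
type Weights = number[]

// Average the weight vectors sent by participants, element by element.
// Assumes every update has the same length and the same importance.
function averageUpdates (updates: Weights[]): Weights {
  if (updates.length === 0) {
    throw new Error('at least one update is required')
  }
  const size = updates[0].length
  const sum: number[] = []
  for (let i = 0; i < size; i++) sum.push(0)
  for (const update of updates) {
    if (update.length !== size) {
      throw new Error('all updates must have the same length')
    }
    update.forEach((w, i) => { sum[i] += w })
  }
  return sum.map((s) => s / updates.length)
}

// Three participants send updates; the server only sees weights, never training data.
const aggregated = averageUpdates([[1, 2], [3, 4], [5, 6]])
console.log(aggregated) // [ 3, 4 ]
```

In the decentralized paradigm described above, no such step would run on the server, since it never receives model parameters.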
@@ -125,15 +124,58 @@ npm -w web-client start # from another terminal

The web client should be running on `http://localhost:8081`. If it is not, first restart the server and then the web client.

> [!IMPORTANT]
> Make sure to first start the server to ensure that it is listening to port 8080.

**You can now access DISCO at http://localhost:8081/**

## How to use DISCO

There are multiple ways to use and interact with DISCO, depending on your objective:

- A non-technical user that wants to train models in a distributed manner without coding would want to use DISCO through the `web-client`. To do so, starting a local `server` instance is also needed as a backend to the `web-client`. Similarly, a contributor aiming to implement new UI features would certainly want to run the same setup.
- A technical user may find it more flexible to use DISCO from a Node.js script, which gives finer control over the process. The `discojs-node` module is tailored to Node.js scripts: it lets you load data, helps start a server and runs distributed machine learning training tasks.
- Finally, the `cli` (command line interface) can also be used to quickly start distributed model training. The `cli` is more restrictive than `discojs-node` but can start training with multiple users in a single command, which is useful for benchmarking, for example.

**Training on your own datasets:** DISCO provides pre-defined training tasks, such as CIFAR10, Titanic, etc. The [Tasks document guide](./docs/TASK.md) describes how to add custom tasks from the web-client UI, a `discojs-node` script or how to add support for a new pre-defined task.

### `web-client` and `server`

The last step of the installation instructions describes how to start a web interface along with a helper server. The server is used to provide some predefined machine learning tasks and orchestrate distributed training.

From the root level, launch a `server` instance:

```
npm -w server start
```

The server should be listening on `http://localhost:8080/`.

Start the web-client:

```
npm -w web-client start # from another terminal
```

The web client should be running on `http://localhost:8081`. Running the last command should also output a Network address at which devices on the same network can access the UI. You can find more information in the [Contributing to the `web-client`](./docs/CONTRIBUTING.md#contributing-to-web-client) Section as well as the [server README](./server/README.md).

### Importing `discojs-node` with Node.js

Using `discojs-node` is illustrated in [the `examples` folder](./docs/examples). Using `discojs-node` implies starting a server (or having access to one), loading local training data and configuring the model training.

### `cli`

Training a model with the `cli` on pre-defined tasks is straightforward:

```
# From the root folder
npm -w cli start -- --task cifar10 --numberOfUsers 4 --epochs 15 --roundDuration 5
npm -w cli start -- --help # for all options
```

Adding CLI support for another task is described in the [CLI README](./cli/README.md).

## Further documentation

- Next you may want to read our [onboarding guide](./docs/ONBOARDING.md) which lists the following steps to onboard DISCO.
- If you are only planning to use DISCO in your own scripts, you can find a stand-alone example relying on `discojs-node` [here](./docs/node_example). The example runs with Node.js outside any browser, using the `@epfml/discojs-node` NPM package and the `server` module. A DISCO server is launched by the script itself and the data is already available in the repo.
- To contribute to or modify the codebase, have a look at the [contributing guide](./docs/CONTRIBUTING.md), which lists the next steps for onboarding to DISCO.
- If you are only planning to use DISCO in your own scripts, you can find a standalone example relying on `discojs-node` in [the `examples` folder](./docs/examples). The example runs with Node.js outside any browser, with `discojs-node` and a `server` instance. A DISCO server is launched by the script which then loads data and emulates multiple users training a model in a federated manner.

#### Table of contents

@@ -142,19 +184,13 @@ As there are many guides in the project, here is a table of contents referencing
- [DISCO README](./README.md)
- [Developer guide](./DEV.md)
- The `docs` folder contains in-depth documentation on the project:
- [Onboarding guide](./docs/ONBOARDING.md)
- [Contributing guide](./docs/CONTRIBUTING.md)
- [TASK.md: training on your own dataset](./docs/TASK.md)
- [Disco.js under the hood](./docs/DISCOJS.md)
- [FAQ](./docs/FAQ.md)
- [Example: using `discojs-node` in a script](./docs/node_example/README.md)
- [`examples` folder: using `discojs-node`, adding a custom task](./docs/examples)
- [Privacy in DISCO](./docs/PRIVACY.md)
- [How to create a DISCO Task](./docs/TASK.md)
- [Vue.js architecture](./docs/VUEJS.md)
- Respective `README` files contain installation and packaging instructions relevant to the module
- [`discojs` README](./discojs/README.md)
- [`discojs-core` README](./discojs/discojs-core/README.md)
- [`discojs-node` README](./discojs/discojs-node/README.md)
- [`discojs-web` README](./discojs/discojs-web/README.md)
- [`server` README](./server/README.md)
- [`web-client` README](./web-client/README.md)
- [Vue.js in DISCO](./docs/VUEJS.md)
- [FAQ](./docs/FAQ.md)
- `README` files contain information relevant to their respective module:
- [`server` README](./server/README.md), with API and deployment information
- [`cli` README](./cli/README.md)
133 changes: 20 additions & 113 deletions cli/README.md
@@ -1,130 +1,37 @@
# CLI benchmark and node client
# DISCO Command Line Interface

Welcome to the Disco🔮 command line interface (CLI). This shows how to easily use Disco🔮 even without a browser, as a Node.js client, to join any federated or decentralized learning task. Also, the standalone scripts and CLI here allow to conveniently simulate multiple clients and log metrics such as training and validation accuracy of each client. Integration of Disco🔮 into other js apps can follow the same code principles (no browser needed).
The CLI lets one use DISCO in a standalone manner (i.e. without manually starting a server or a browser). The CLI makes it convenient to simulate multiple clients and log metrics such as the training and validation accuracy of each client. Integration of DISCO into other apps can follow the same principles (no browser needed). Currently, the CLI only supports running federated tasks. Since the CLI relies on Node.js, it uses DISCO through `discojs-node`.

The Disco CLI allows one to benchmark or simply play around with `discojs` in order to see the performance
of distributed learning. It is possible to pass key arguments such as the number of users, round duration (how
frequently the clients communicate with each other), ...
For example, the following command trains a model on CIFAR10, using 4 federated clients for 15 epochs with a round duration of 5 batches (see [DISCOJS.md](../docs/DISCOJS.md#rounds) for more information on rounds):

To train cifar10, using 4 federated clients for 15 epochs with a round duration of 5 batches, all you have to do is type
> [!NOTE]
> Make sure you first ran `./get_training_data.sh` (in the root folder) to download training data.

```
# From the root folder
npm -w cli start -- --task cifar10 --numberOfUsers 4 --epochs 15 --roundDuration 5
# Or from the cli folder directly
npm start -- --task cifar10 --numberOfUsers 4 --epochs 15 --roundDuration 5
```

or also using the shorter alias notation

or using the shorter alias notation:
```
npm start -- -t cifar10 -u 4 -e 15 -r 5
```

## Quick-install guide

- install node 16 and ensure it is activated on opening any new terminal (e.g. `nvm use 16`)
- `git clone git@github.com:epfml/disco.git`
- download the `example_training_data.tar.gz` file and extract it into the root of the repository
- simply execute [get_training_data.sh](../get_training_data.sh)
- `npm ci` within `discojs`, `server` and `cli`
- `cd discojs/discojs-node && npm run build`

## Running the CLI

- `cd cli`
- `npm start` to run the benchmark with the default setting, to see the available flags run
- `npm start -- --help`

## Custom Tasks

DiscoJS currently provides several pre-define popular tasks such as titanic, simple-face and cifar10. In order
to understand how to add your own custom task, we will go over how we added simple-face to disojs [here](../information/TASK.md).

## Dataset

The only thing missing is loading the data, we use our own data class that is a wrapper for the `tfjs` dataset class.
We do the data loading in [data.ts](./src/data.ts).

The key class (for images) is the [ImageLoader](../discojs/src/dataset/data_loader/image_loader.ts), this needs to be extended since files are loaded differently depending
on the environment (node vs browser). Note this is also where we can add pre-processing, here we simply normalise the
images, but more complex operations can also be added.

Once we have built this object we can load the dataset by giving as an input the `files: string[]` and `labels: number[]`.
We give an example of what this looks like down bellow.
```js
import fs from 'fs'
import { tf, dataset, Task } from '@epfml/discojs'

class NodeImageLoader extends dataset.ImageLoader<string> {
async readImageFrom(source: string): Promise<tf.Tensor3D> {
const imageBuffer = fs.readFileSync(source)
let tensor = tf.node.decodeImage(imageBuffer)
// <---- Add pre processing here!
// e.g: If resize needed uncomment the following
// tensor = tf.image.resizeBilinear(tensor, [
// 32, 32
// ])
tensor = tensor.div(tf.scalar(255))
return tensor as tf.Tensor3D
}
}

...

export async function simplefaceData(task: Task): Promise<dataset.DataTuple> {

...

return await new NodeImageLoader(task).loadAll(files, {labels: labels})
}


npm -w cli start -- -t cifar10 -u 4 -e 15 -r 5
```

Example of `files` and `labels` content.

```js
{ labels: [ 0, 0, 0, 1, 1, 1 ] }
{
files: [
'../example_training_data/simple_face/child/12.png',
'../example_training_data/simple_face/child/141.png',
'../example_training_data/simple_face/child/143.png',
'../example_training_data/simple_face/adult/9417.png',
'../example_training_data/simple_face/adult/9429.png',
'../example_training_data/simple_face/adult/9462.png'
]
}
You can find all the command arguments with:
```

Once you add a new data loader, add it to the `getTaskData` function
in the same file.

```js
export async function getTaskData(task: Task) {
if (task.taskID === 'simple_face') {
return simplefaceData(task)
}
if (task.taskID === 'titanic') {
return titanicData(task)
}
if (task.taskID === 'cifar10') {
return cifar10Data(task)
}
throw Error(`Data loader for ${task.taskID} not implemented.`)
}
npm -w cli start -- --help # or -h
```
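
The short alias notation can be read as a one-to-one renaming of flags. The sketch below illustrates that mapping with the alias table declared in `cli/src/args.ts`; `expandAliases` is a hypothetical helper written for this explanation and is not how `ts-command-line-args` parses arguments internally.

```ts
// Alias table copied from the options declared in cli/src/args.ts.
const aliases: Record<string, string> = {
  t: 'task',
  u: 'numberOfUsers',
  e: 'epochs',
  r: 'roundDuration',
  b: 'batchSize',
  s: 'save',
  h: 'help'
}

// Hypothetical helper: rewrite short flags (-t) into their long form (--task).
// Every other token (values, long flags) passes through unchanged.
function expandAliases (argv: string[]): string[] {
  return argv.map((token) => {
    const match = /^-([a-zA-Z])$/.exec(token)
    if (match === null) return token
    const long = aliases[match[1]]
    if (long === undefined) throw new Error(`Unknown alias: ${token}`)
    return `--${long}`
  })
}

console.log(expandAliases(['-t', 'cifar10', '-u', '4', '-e', '15', '-r', '5']).join(' '))
// --task cifar10 --numberOfUsers 4 --epochs 15 --roundDuration 5
```

Both notations therefore describe the same run; the long form is easier to read in documentation, while the short form is quicker to type.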

## Adding new tasks

### CLI

The last thing to add is to add the task in the [args.ts](./src/args.ts) as follows

```js
let supportedTasks: Map<string, Task> = Map()
supportedTasks = supportedTasks.set(tasks.simple_face.task.taskID, tasks.simple_face.task) // <------
```
The CLI can be used on several pre-defined tasks: `titanic`, `simple_face` and `cifar10`. To learn how to add a new task, have a look at [TASK.md](../docs/TASK.md).

Now you are done and you should be able to run your task as follows
Once a new task has been defined in `discojs`, its data can be loaded in [data.ts](./src/data.ts) in the same way as for the current tasks. There are currently [multiple classes](../discojs/discojs-node/src/dataset/data_loader) you can use to load and preprocess data with Node.js: ImageLoader, TabularLoader and TextLoader.
Once a function to load data has been added, make sure to extend `getTaskData` in `data.ts`, which matches each task with its respective data loading function.

Finally, add the task as a CLI argument in [args.ts](./src/args.ts) by adding it to the `supportedTasks` Map.
You should now be able to run your task as follows:
```
npm run benchmark -- --task simple_face --numberOfUsers 4 --epochs 15 --roundDuration 5
npm -w cli start -- --task your_task --numberOfUsers 4 --epochs 15 --roundDuration 5
```
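
As a rough illustration of the dispatch described above, the sketch below matches task IDs with loader functions through a lookup table. It is a simplified, synchronous stand-in and makes several assumptions: `LoadedData`, `Loader` and the sample counts are placeholders invented for this example, whereas the real loaders in `data.ts` are asynchronous and return DISCO datasets.

```ts
// Hypothetical stand-ins; the real loaders live in cli/src/data.ts.
interface LoadedData { taskID: string, samples: number }
type Loader = () => LoadedData

// One entry per supported task: supporting a new task means adding one line here.
// The sample counts are placeholder values, not real dataset sizes.
const loaders: Record<string, Loader> = {
  simple_face: () => ({ taskID: 'simple_face', samples: 6 }),
  titanic: () => ({ taskID: 'titanic', samples: 3 }),
  cifar10: () => ({ taskID: 'cifar10', samples: 4 })
}

function getTaskData (taskID: string): LoadedData {
  // hasOwnProperty avoids matching inherited keys such as 'toString'.
  if (!Object.prototype.hasOwnProperty.call(loaders, taskID)) {
    throw new Error(`Data loader for ${taskID} not implemented.`)
  }
  return loaders[taskID]()
}

console.log(getTaskData('titanic').taskID) // titanic
```

Failing loudly on an unknown task ID makes a missing loader visible as soon as the CLI starts, before any training begins.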
4 changes: 2 additions & 2 deletions cli/package.json
@@ -1,10 +1,10 @@
{
"name": "@epfml/disco-cli",
"private": true,
"main": "dist/benchmark.ts",
"main": "dist/cli.js",
"scripts": {
"watch": "nodemon --ext ts --ignore dist --watch ../discojs/discojs-node/dist --watch ../server/dist --watch . --exec npm run",
"start": "npm run build && node dist/benchmark.js",
"start": "npm run build && node dist/cli.js",
"build": "tsc",
"lint": "npx eslint --ext ts --max-warnings 0 .",
"test": ": nothing"
9 changes: 4 additions & 5 deletions cli/src/args.ts
@@ -1,6 +1,5 @@
import { parse } from 'ts-command-line-args'
import { Map } from 'immutable'

import { defaultTasks, type Task } from '@epfml/discojs-node'

interface BenchmarkArguments {
@@ -19,21 +18,21 @@ type BenchmarkUnsafeArguments = {
help?: boolean
}

const argExample = 'e.g. npm run benchmark -- -u 2 -e 3, runs 2 users for 3 epochs'
const argExample = 'e.g. npm start -- -u 2 -e 3 # runs 2 users for 3 epochs'

const unsafeArgs = parse<BenchmarkUnsafeArguments>(
{
task: { type: String, alias: 't', description: 'Task', defaultValue: 'simple_face' },
task: { type: String, alias: 't', description: 'Task: titanic, simple_face or cifar10', defaultValue: 'simple_face' },
numberOfUsers: { type: Number, alias: 'u', description: 'Number of users', defaultValue: 1 },
epochs: { type: Number, alias: 'e', description: 'Number of epochs', defaultValue: 10 },
roundDuration: { type: Number, alias: 'r', description: 'Round duration', defaultValue: 10 },
batchSize: { type: Number, alias: 'b', description: 'Round duration', defaultValue: 10 },
batchSize: { type: Number, alias: 'b', description: 'Training batch size', defaultValue: 10 },
save: { type: Boolean, alias: 's', description: 'Save logs of benchmark', defaultValue: false },
help: { type: Boolean, optional: true, alias: 'h', description: 'Prints this usage guide' }
},
{
helpArg: 'help',
headerContentSections: [{ header: 'Disco benchmark', content: 'npm run benchmark -- [Options]\n' + argExample }]
headerContentSections: [{ header: 'DISCO CLI', content: 'npm start -- [Options]\n' + argExample }]
}
)

6 changes: 2 additions & 4 deletions cli/src/benchmark.ts → cli/src/cli.ts
@@ -9,7 +9,7 @@ import { args } from './args'
const NUMBER_OF_USERS = args.numberOfUsers
const TASK = args.task

const infoText = `\nRunning federated benchmark of ${TASK.taskID}`
const infoText = `\nStarted federated training of ${TASK.taskID}`
console.log(infoText)

console.log({ args })
@@ -19,9 +19,7 @@ async function runUser (task: Task, url: URL, data: data.DataSplit): Promise<Tra
const scheme = TrainingSchemes.FEDERATED
const disco = new Disco(task, { scheme, url })

console.log('runUser>>>>')
await disco.fit(data)
console.log('runUser<<<<')
await disco.close()
return await disco.logs()
}
@@ -39,7 +37,7 @@ async function main (): Promise<void> {
const fileName = `${TASK.taskID}_${NUMBER_OF_USERS}users.csv`
saveLog(logs, fileName)
}

console.log('Shutting down the server...')
await new Promise((resolve, reject) => {
server.once('close', resolve)
server.close(reject)