diff --git a/examples/hello-world/hello-cross-val/README.md b/examples/hello-world/hello-cross-val/README.md deleted file mode 100644 index a649abcec6..0000000000 --- a/examples/hello-world/hello-cross-val/README.md +++ /dev/null @@ -1,25 +0,0 @@ -# Hello Cross-Site Validation - -The cross-site model evaluation workflow uses the data from clients to run evaluation with the models of other clients. Data is not shared. Rather the collection of models is distributed to each client site to run local validation. The server collects the results of local validation to construct an all-to-all matrix of model performance vs. client dataset. It uses the [CrossSiteEval](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.app_common.workflows.cross_site_eval.html) controller workflow. - -### 1. Install NVIDIA FLARE - -Follow the [Installation](../../getting_started/README.md) instructions. - -### 2. Run the experiment - -Use nvflare simulator to run the example: - -``` -nvflare simulator -w /tmp/nvflare/ -n 2 -t 2 hello-cross-val/jobs/hello-cross-val -``` - -### 3. Access the logs and results - -You can find the running logs and results inside the simulator's workspace/simulate_job - -```bash -$ ls /tmp/nvflare/simulate_job/ -app_server app_site-1 app_site-2 log.txt - -``` diff --git a/examples/hello-world/hello-cross-val/requirements.txt b/examples/hello-world/hello-cross-val/requirements.txt deleted file mode 100644 index ec750098b4..0000000000 --- a/examples/hello-world/hello-cross-val/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -nvflare~=2.5.0rc diff --git a/examples/hello-world/hello-numpy-cross-val/README.md b/examples/hello-world/hello-numpy-cross-val/README.md index 352d1fdde8..bba69d07db 100644 --- a/examples/hello-world/hello-numpy-cross-val/README.md +++ b/examples/hello-world/hello-numpy-cross-val/README.md @@ -2,53 +2,87 @@ The cross-site model evaluation workflow uses the data from clients to run evaluation with the models of other clients. 
Data is not shared. Rather the collection of models is distributed to each client site to run local validation. The server collects the results of local validation to construct an all-to-all matrix of model performance vs. client dataset. It uses the [CrossSiteModelEval](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.app_common.workflows.cross_site_model_eval.html) controller workflow. -> **_NOTE:_** This example uses a Numpy-based trainer and will generate its data within the code. -You can follow the [hello_world notebook](../hello_world.ipynb) or the following: - -### 1. Install NVIDIA FLARE +## Installation Follow the [Installation](../../getting_started/README.md) instructions. -### 2. Run the experiment +# Run training and cross-site validation right after training -Use nvflare simulator to run the hello-examples: +This example uses a NumPy-based trainer to simulate the training +steps. -``` -nvflare simulator -w /tmp/nvflare/ -n 2 -t 2 hello-numpy-cross-val/jobs/hello-numpy-cross-val +We first perform FedAvg training and then conduct cross-site validation. + +So you will see that two workflows (ScatterAndGather and CrossSiteModelEval) are configured. + +## 1. Prepare the job and run the experiment using simulator + +We use the Job API to generate the job and run it using the simulator: + +```bash +python3 job_train_and_cse.py ``` -### 3. Access the logs and results +## 2.
Access the logs and results -You can find the running logs and results inside the simulator's workspace/simulate_job +You can find the running logs and results inside the simulator's workspace: ```bash -$ ls /tmp/nvflare/simulate_job/ -app_server app_site-1 app_site-2 log.txt +$ ls /tmp/nvflare/jobs/workdir/ +server/ site-1/ site-2/ startup/ +``` + +The cross-site validation results: +```bash +$ cat /tmp/nvflare/jobs/workdir/server/simulate_job/cross_site_val/cross_val_results.json ``` -# Run cross site validation using the previous trained results +# Run cross-site evaluation using previously trained results -## Introduction +We can also run cross-site evaluation without the training workflow, either to reuse the results of a previous run or to evaluate existing pretrained models. -The "hello-numpy-cross-val-only" and "hello-numpy-cross-val-only-list-models" jobs show how to run the NVFlare cross-site validation without the training workflow, making use of the previous run results. The first one uses the default single server model. The second enables a list of server models. You can provide / use your own previous trained models for the cross-validation. +You can provide your own pretrained models for the cross-site evaluation. -### Generate the previous run best global model and local best model +## 1. Generate the pretrained model -Run the following command to generate the pre-trained models: +In practice, users would obtain these pretrained models from any training workflow. +To mimic that, run the following command to generate the pretrained models: + +```bash +python3 generate_pretrain_models.py ``` -python pre_train_models.py -``` -### How to run the Job +## 2.
Prepare the job and run the experiment using simulator + +Note that the pretrained models are generated under: + +```python +SERVER_MODEL_DIR = "/tmp/nvflare/server_pretrain_models" +CLIENT_MODEL_DIR = "/tmp/nvflare/client_pretrain_models" +``` -Define two OS system variable "SERVER_MODEL_DIR" and "CLIENT_MODEL_DIR" to point to the absolute path of the server best model and local best model location respectively. Then use the NVFlare admin command "submit_job" to submit and run the cross-validation job. +The same directories are specified in `job_cse.py`. -For example, define the system variable "SERVER_MODEL_DIR" like this: +Then we can use the Job API to generate the job and run it using the simulator: +```bash +python3 job_cse.py ``` -export SERVER_MODEL_DIR="/path/to/model/location/at/server-side" + +## 3. Access the logs and results + +You can find the running logs and results inside the simulator's workspace: + +```bash +$ ls /tmp/nvflare/jobs/workdir/ +server/ site-1/ site-2/ startup/ ``` +The cross-site validation results: + +```bash +$ cat /tmp/nvflare/jobs/workdir/server/simulate_job/cross_site_val/cross_val_results.json +``` \ No newline at end of file diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/pre_train_models.py b/examples/hello-world/hello-numpy-cross-val/generate_pretrain_models.py similarity index 51% rename from examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/pre_train_models.py rename to examples/hello-world/hello-numpy-cross-val/generate_pretrain_models.py index a65e962197..408f91ce90 100644 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/pre_train_models.py +++ b/examples/hello-world/hello-numpy-cross-val/generate_pretrain_models.py @@ -1,4 +1,4 @@ -# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. +# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
# # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -16,11 +16,16 @@ import numpy as np -from nvflare.app_common.abstract.model import ModelLearnableKey, make_model_learnable -from nvflare.app_common.np.constants import NPConstants +SERVER_MODEL_DIR = "/tmp/nvflare/server_pretrain_models" +CLIENT_MODEL_DIR = "/tmp/nvflare/client_pretrain_models" + + +def _save_model(model_data, model_dir: str, model_file: str): + if not os.path.exists(model_dir): + os.makedirs(model_dir) + model_path = os.path.join(model_dir, model_file) + np.save(model_path, model_data) -SERVER_MODEL_DIR = "models/server" -CLIENT_MODEL_DIR = "models/client" if __name__ == "__main__": """ @@ -28,17 +33,7 @@ """ model_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32) - model_learnable = make_model_learnable(weights={NPConstants.NUMPY_KEY: model_data}, meta_props={}) - working_dir = os.getcwd() - model_dir = os.path.join(working_dir, SERVER_MODEL_DIR) - if not os.path.exists(model_dir): - os.makedirs(model_dir) - model_path = os.path.join(model_dir, "server.npy") - np.save(model_path, model_learnable[ModelLearnableKey.WEIGHTS][NPConstants.NUMPY_KEY]) - - model_dir = os.path.join(working_dir, CLIENT_MODEL_DIR) - if not os.path.exists(model_dir): - os.makedirs(model_dir) - model_save_path = os.path.join(model_dir, "best_numpy.npy") - np.save(model_save_path, model_data) + _save_model(model_data=model_data, model_dir=SERVER_MODEL_DIR, model_file="server_1.npy") + _save_model(model_data=model_data, model_dir=SERVER_MODEL_DIR, model_file="server_2.npy") + _save_model(model_data=model_data, model_dir=CLIENT_MODEL_DIR, model_file="best_numpy.npy") diff --git a/examples/hello-world/hello-numpy-cross-val/job_cse.py b/examples/hello-world/hello-numpy-cross-val/job_cse.py new file mode 100644 index 0000000000..f9d2b7a27a --- /dev/null +++ b/examples/hello-world/hello-numpy-cross-val/job_cse.py @@ -0,0 +1,63 
@@ +# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +from nvflare import FedJob +from nvflare.app_common.app_constant import AppConstants +from nvflare.app_common.np.np_formatter import NPFormatter +from nvflare.app_common.np.np_model_locator import NPModelLocator +from nvflare.app_common.np.np_trainer import NPTrainer +from nvflare.app_common.np.np_validator import NPValidator +from nvflare.app_common.widgets.validation_json_generator import ValidationJsonGenerator +from nvflare.app_common.workflows.cross_site_model_eval import CrossSiteModelEval + +SERVER_MODEL_DIR = "/tmp/nvflare/server_pretrain_models" +CLIENT_MODEL_DIR = "/tmp/nvflare/client_pretrain_models" + + +if __name__ == "__main__": + n_clients = 2 + + job = FedJob(name="hello-numpy-cse", min_clients=n_clients) + + model_locator_id = job.to_server( + NPModelLocator( + model_dir="/tmp/nvflare/server_pretrain_models", + model_name={"server_model_1": "server_1.npy", "server_model_2": "server_2.npy"}, + ) + ) + formatter_id = job.to_server(NPFormatter()) + job.to_server(ValidationJsonGenerator()) + + # Define the controller workflow and send to server + controller = CrossSiteModelEval( + model_locator_id=model_locator_id, + formatter_id=formatter_id, + ) + job.to_server(controller) + + # Add clients + trainer = NPTrainer( + train_task_name=AppConstants.TASK_TRAIN, + submit_model_task_name=AppConstants.TASK_SUBMIT_MODEL, + 
model_dir="/tmp/nvflare/client_pretrain_models", + ) + job.to_clients(trainer, tasks=[AppConstants.TASK_SUBMIT_MODEL]) + validator = NPValidator( + validate_task_name=AppConstants.TASK_VALIDATION, + ) + job.to_clients(validator, tasks=[AppConstants.TASK_VALIDATION]) + + job.export_job("/tmp/nvflare/jobs") + job.simulator_run("/tmp/nvflare/jobs/workdir", gpu="0", n_clients=n_clients) diff --git a/examples/hello-world/hello-numpy-cross-val/job_train_and_cse.py b/examples/hello-world/hello-numpy-cross-val/job_train_and_cse.py new file mode 100644 index 0000000000..40f9c0d467 --- /dev/null +++ b/examples/hello-world/hello-numpy-cross-val/job_train_and_cse.py @@ -0,0 +1,69 @@ +# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +from nvflare import FedJob +from nvflare.apis.dxo import DataKind +from nvflare.app_common.aggregators.intime_accumulate_model_aggregator import InTimeAccumulateWeightedAggregator +from nvflare.app_common.app_constant import AppConstants +from nvflare.app_common.np.np_formatter import NPFormatter +from nvflare.app_common.np.np_model_locator import NPModelLocator +from nvflare.app_common.np.np_model_persistor import NPModelPersistor +from nvflare.app_common.np.np_trainer import NPTrainer +from nvflare.app_common.np.np_validator import NPValidator +from nvflare.app_common.shareablegenerators.full_model_shareable_generator import FullModelShareableGenerator +from nvflare.app_common.widgets.validation_json_generator import ValidationJsonGenerator +from nvflare.app_common.workflows.cross_site_model_eval import CrossSiteModelEval +from nvflare.app_common.workflows.scatter_and_gather import ScatterAndGather + +if __name__ == "__main__": + n_clients = 2 + num_rounds = 1 + + job = FedJob(name="hello-numpy-cse", min_clients=n_clients) + + persistor_id = job.to_server(NPModelPersistor()) + aggregator_id = job.to_server(InTimeAccumulateWeightedAggregator(expected_data_kind=DataKind.WEIGHTS)) + shareable_generator_id = job.to_server(FullModelShareableGenerator()) + model_locator_id = job.to_server(NPModelLocator()) + formatter_id = job.to_server(NPFormatter()) + job.to_server(ValidationJsonGenerator()) + + # Define the controller workflow and send to server + controller = ScatterAndGather( + min_clients=n_clients, + num_rounds=num_rounds, + persistor_id=persistor_id, + aggregator_id=aggregator_id, + shareable_generator_id=shareable_generator_id, + ) + job.to_server(controller) + + # Define the controller workflow and send to server + controller = CrossSiteModelEval(model_locator_id=model_locator_id, formatter_id=formatter_id) + job.to_server(controller) + + # Add clients + trainer = NPTrainer( + train_task_name=AppConstants.TASK_TRAIN, + 
submit_model_task_name=AppConstants.TASK_SUBMIT_MODEL, + ) + job.to_clients(trainer, tasks=[AppConstants.TASK_TRAIN, AppConstants.TASK_SUBMIT_MODEL]) + validator = NPValidator( + validate_task_name=AppConstants.TASK_VALIDATION, + ) + job.to_clients(validator, tasks=[AppConstants.TASK_VALIDATION]) + + job.export_job("/tmp/nvflare/jobs") + job.simulator_run("/tmp/nvflare/jobs/workdir", gpu="0", n_clients=n_clients) diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/app/config/config_fed_client.json b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/app/config/config_fed_client.json deleted file mode 100755 index 788939cba1..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/app/config/config_fed_client.json +++ /dev/null @@ -1,29 +0,0 @@ -{ - "format_version": 2, - "model_dir": "{$CLIENT_MODEL_DIR}", - "executors": [ - { - "tasks": [ - "train", - "submit_model" - ], - "executor": { - "path": "nvflare.app_common.np.np_trainer.NPTrainer", - "args": { - "model_dir": "{model_dir}" - } - } - }, - { - "tasks": [ - "validate" - ], - "executor": { - "path": "nvflare.app_common.np.np_validator.NPValidator" - } - } - ], - "task_result_filters": [], - "task_data_filters": [], - "components": [] -} \ No newline at end of file diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/app/config/config_fed_server.json b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/app/config/config_fed_server.json deleted file mode 100755 index 024373c1ff..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/app/config/config_fed_server.json +++ /dev/null @@ -1,39 +0,0 @@ -{ - "format_version": 2, - "model_dir": "{$SERVER_MODEL_DIR}", - "server": { - "heart_beat_timeout": 600 - }, - "task_data_filters": [], - 
"task_result_filters": [], - "components": [ - { - "id": "model_locator", - "path": "nvflare.app_common.np.np_model_locator.NPModelLocator", - "args": { - "model_dir": "{model_dir}", - "model_names": { - "server_model_1": "server_1.npy", - "server_model_2": "server_2.npy" - } - } - }, - { - "id": "json_generator", - "path": "nvflare.app_common.widgets.validation_json_generator.ValidationJsonGenerator", - "args": {} - } - ], - "workflows": [ - { - "id": "cross_site_model_eval", - "path": "nvflare.app_common.workflows.cross_site_model_eval.CrossSiteModelEval", - "args": { - "model_locator_id": "model_locator", - "submit_model_timeout": 600, - "validation_timeout": 6000, - "cleanup_models": false - } - } - ] -} \ No newline at end of file diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/meta.json b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/meta.json deleted file mode 100644 index 202f437746..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/meta.json +++ /dev/null @@ -1,10 +0,0 @@ -{ - "name": "hello-numpy-cross-val", - "resource_spec": {}, - "min_clients" : 2, - "deploy_map": { - "app": [ - "@ALL" - ] - } -} diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/pre_train_models.py b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/pre_train_models.py deleted file mode 100644 index 0191508fe7..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only-list-models/pre_train_models.py +++ /dev/null @@ -1,46 +0,0 @@ -# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import os - -import numpy as np - -from nvflare.app_common.abstract.model import ModelLearnableKey, make_model_learnable -from nvflare.app_common.np.constants import NPConstants - -SERVER_MODEL_DIR = "models/server" -CLIENT_MODEL_DIR = "models/client" - -if __name__ == "__main__": - """ - This is the tool to generate the pre-trained models for demonstrating the cross-validation without training. - """ - - model_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32) - model_learnable = make_model_learnable(weights={NPConstants.NUMPY_KEY: model_data}, meta_props={}) - - working_dir = os.getcwd() - model_dir = os.path.join(working_dir, SERVER_MODEL_DIR) - if not os.path.exists(model_dir): - os.makedirs(model_dir) - model_path = os.path.join(model_dir, "server_1.npy") - np.save(model_path, model_learnable[ModelLearnableKey.WEIGHTS][NPConstants.NUMPY_KEY]) - model_path = os.path.join(model_dir, "server_2.npy") - np.save(model_path, model_learnable[ModelLearnableKey.WEIGHTS][NPConstants.NUMPY_KEY]) - - model_dir = os.path.join(working_dir, CLIENT_MODEL_DIR) - if not os.path.exists(model_dir): - os.makedirs(model_dir) - model_save_path = os.path.join(model_dir, "best_numpy.npy") - np.save(model_save_path, model_data) diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/app/config/config_fed_client.json b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/app/config/config_fed_client.json deleted file mode 100755 index 788939cba1..0000000000 --- 
a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/app/config/config_fed_client.json +++ /dev/null @@ -1,29 +0,0 @@ -{ - "format_version": 2, - "model_dir": "{$CLIENT_MODEL_DIR}", - "executors": [ - { - "tasks": [ - "train", - "submit_model" - ], - "executor": { - "path": "nvflare.app_common.np.np_trainer.NPTrainer", - "args": { - "model_dir": "{model_dir}" - } - } - }, - { - "tasks": [ - "validate" - ], - "executor": { - "path": "nvflare.app_common.np.np_validator.NPValidator" - } - } - ], - "task_result_filters": [], - "task_data_filters": [], - "components": [] -} \ No newline at end of file diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/app/config/config_fed_server.json b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/app/config/config_fed_server.json deleted file mode 100755 index bf2880977e..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/app/config/config_fed_server.json +++ /dev/null @@ -1,35 +0,0 @@ -{ - "format_version": 2, - "model_dir": "{$SERVER_MODEL_DIR}", - "server": { - "heart_beat_timeout": 600 - }, - "task_data_filters": [], - "task_result_filters": [], - "components": [ - { - "id": "model_locator", - "path": "nvflare.app_common.np.np_model_locator.NPModelLocator", - "args": { - "model_dir": "{model_dir}" - } - }, - { - "id": "json_generator", - "path": "nvflare.app_common.widgets.validation_json_generator.ValidationJsonGenerator", - "args": {} - } - ], - "workflows": [ - { - "id": "cross_site_model_eval", - "path": "nvflare.app_common.workflows.cross_site_model_eval.CrossSiteModelEval", - "args": { - "model_locator_id": "model_locator", - "submit_model_timeout": 600, - "validation_timeout": 6000, - "cleanup_models": false - } - } - ] -} \ No newline at end of file diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/meta.json 
b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/meta.json deleted file mode 100644 index 202f437746..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val-only/meta.json +++ /dev/null @@ -1,10 +0,0 @@ -{ - "name": "hello-numpy-cross-val", - "resource_spec": {}, - "min_clients" : 2, - "deploy_map": { - "app": [ - "@ALL" - ] - } -} diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/app/config/config_fed_client.json b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/app/config/config_fed_client.json deleted file mode 100755 index 21c98c3291..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/app/config/config_fed_client.json +++ /dev/null @@ -1,26 +0,0 @@ -{ - "format_version": 2, - "executors": [ - { - "tasks": [ - "train", - "submit_model" - ], - "executor": { - "path": "nvflare.app_common.np.np_trainer.NPTrainer", - "args": {} - } - }, - { - "tasks": [ - "validate" - ], - "executor": { - "path": "nvflare.app_common.np.np_validator.NPValidator" - } - } - ], - "task_result_filters": [], - "task_data_filters": [], - "components": [] -} \ No newline at end of file diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/app/config/config_fed_server.json b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/app/config/config_fed_server.json deleted file mode 100755 index 8e978fa8b2..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/app/config/config_fed_server.json +++ /dev/null @@ -1,73 +0,0 @@ -{ - "format_version": 2, - "server": { - "heart_beat_timeout": 600 - }, - "task_data_filters": [], - "task_result_filters": [], - "components": [ - { - "id": "persistor", - "path": "nvflare.app_common.np.np_model_persistor.NPModelPersistor", - "args": {} - }, - { - "id": "shareable_generator", - "path": 
"nvflare.app_common.shareablegenerators.full_model_shareable_generator.FullModelShareableGenerator", - "args": {} - }, - { - "id": "aggregator", - "path": "nvflare.app_common.aggregators.intime_accumulate_model_aggregator.InTimeAccumulateWeightedAggregator", - "args": { - "expected_data_kind": "WEIGHTS", - "aggregation_weights": { - "site-1": 1.0, - "site-2": 1.0 - } - } - }, - { - "id": "model_locator", - "path": "nvflare.app_common.np.np_model_locator.NPModelLocator", - "args": {} - }, - { - "id": "formatter", - "path": "nvflare.app_common.np.np_formatter.NPFormatter", - "args": {} - }, - { - "id": "json_generator", - "path": "nvflare.app_common.widgets.validation_json_generator.ValidationJsonGenerator", - "args": {} - } - ], - "workflows": [ - { - "id": "scatter_and_gather", - "path": "nvflare.app_common.workflows.scatter_and_gather.ScatterAndGather", - "args": { - "min_clients": 2, - "num_rounds": 3, - "start_round": 0, - "wait_time_after_min_received": 10, - "aggregator_id": "aggregator", - "persistor_id": "persistor", - "shareable_generator_id": "shareable_generator", - "train_task_name": "train", - "train_timeout": 6000 - } - }, - { - "id": "cross_site_model_eval", - "path": "nvflare.app_common.workflows.cross_site_model_eval.CrossSiteModelEval", - "args": { - "model_locator_id": "model_locator", - "submit_model_timeout": 600, - "validation_timeout": 6000, - "cleanup_models": false - } - } - ] -} \ No newline at end of file diff --git a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/meta.json b/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/meta.json deleted file mode 100644 index 202f437746..0000000000 --- a/examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/meta.json +++ /dev/null @@ -1,10 +0,0 @@ -{ - "name": "hello-numpy-cross-val", - "resource_spec": {}, - "min_clients" : 2, - "deploy_map": { - "app": [ - "@ALL" - ] - } -} diff --git a/nvflare/app_common/np/np_model_locator.py 
b/nvflare/app_common/np/np_model_locator.py index 1d5e71ecd1..7ed074c76d 100755 --- a/nvflare/app_common/np/np_model_locator.py +++ b/nvflare/app_common/np/np_model_locator.py @@ -42,11 +42,11 @@ def __init__(self, model_dir="models", model_name: Union[str, Dict[str, str]] = self.model_dir = model_dir if model_name is None: - self.model_file_name = {NPModelLocator.SERVER_MODEL_NAME: "server.npy"} + self.model_name = {NPModelLocator.SERVER_MODEL_NAME: "server.npy"} elif isinstance(model_name, str): - self.model_file_name = {NPModelLocator.SERVER_MODEL_NAME: model_name} + self.model_name = {NPModelLocator.SERVER_MODEL_NAME: model_name} elif isinstance(model_name, dict): - self.model_file_name = model_name + self.model_name = model_name else: raise ValueError(f"model_name must be a str, or a Dict[str, str]. But got: {type(model_name)}") @@ -59,19 +59,19 @@ def get_model_names(self, fl_ctx: FLContext) -> List[str]: Returns: List[str]: List of model names. """ - return list(self.model_file_name.keys()) + return list(self.model_name.keys()) def locate_model(self, model_name, fl_ctx: FLContext) -> DXO: dxo = None engine = fl_ctx.get_engine() - if model_name in list(self.model_file_name.keys()): + if model_name in list(self.model_name.keys()): try: job_id = fl_ctx.get_prop(FLContextKey.CURRENT_RUN) run_dir = engine.get_workspace().get_run_dir(job_id) model_path = os.path.join(run_dir, self.model_dir) - model_load_path = os.path.join(model_path, self.model_file_name[model_name]) + model_load_path = os.path.join(model_path, self.model_name[model_name]) np_data = None try: np_data = np.load(model_load_path, allow_pickle=False) diff --git a/nvflare/app_common/np/np_model_persistor.py b/nvflare/app_common/np/np_model_persistor.py index 9245c60077..c6db2060ee 100755 --- a/nvflare/app_common/np/np_model_persistor.py +++ b/nvflare/app_common/np/np_model_persistor.py @@ -38,6 +38,16 @@ def _get_run_dir(fl_ctx: FLContext): class NPModelPersistor(ModelPersistor): def __init__(self, 
model_dir="models", model_name="server.npy"): + """Model persistor for numpy arrays. + + Note: + If the specified model can't be found using "model_dir"/"model_name" + Then default array of [[1, 2, 3], [4, 5, 6], [7, 8, 9]] is used. + + Args: + model_dir (str, optional): model directory. Defaults to "models". + model_name (str, optional): model name. Defaults to "server.npy". + """ super().__init__() self.model_dir = model_dir diff --git a/nvflare/app_common/np/np_trainer.py b/nvflare/app_common/np/np_trainer.py index 4a33ed98c7..ae655eb325 100755 --- a/nvflare/app_common/np/np_trainer.py +++ b/nvflare/app_common/np/np_trainer.py @@ -93,7 +93,7 @@ def _train(self, shareable: Shareable, fl_ctx: FLContext, abort_signal: Signal): if abort_signal.triggered: return make_reply(ReturnCode.TASK_ABORTED) - # Doing some dummy training. + # Doing some mock training. if np_data: if NPConstants.NUMPY_KEY in np_data: np_data[NPConstants.NUMPY_KEY] += self._delta @@ -124,7 +124,11 @@ def _train(self, shareable: Shareable, fl_ctx: FLContext, abort_signal: Signal): return make_reply(ReturnCode.TASK_ABORTED) # Prepare a DXO for our updated model. Create shareable and return - outgoing_dxo = DXO(data_kind=incoming_dxo.data_kind, data=np_data, meta={MetaKey.NUM_STEPS_CURRENT_ROUND: 1}) + outgoing_dxo = DXO( + data_kind=incoming_dxo.data_kind, + data=np_data, + meta={MetaKey.NUM_STEPS_CURRENT_ROUND: 1}, + ) return outgoing_dxo.to_shareable() def _submit_model(self, fl_ctx: FLContext, abort_signal: Signal): diff --git a/nvflare/app_common/workflows/cross_site_model_eval.py b/nvflare/app_common/workflows/cross_site_model_eval.py index 43bcb998ae..bf17f65a2f 100644 --- a/nvflare/app_common/workflows/cross_site_model_eval.py +++ b/nvflare/app_common/workflows/cross_site_model_eval.py @@ -60,8 +60,8 @@ def __init__( Defaults to "cross_site_val". submit_model_timeout (int, optional): Timeout of submit_model_task. Defaults to 600 secs. 
validation_timeout (int, optional): Timeout for validate_model task. Defaults to 6000 secs. - model_locator_id (str, optional): ID for model_locator component. Defaults to "". - formatter_id (str, optional): ID for formatter component. Defaults to "". + model_locator_id (str, optional): ID for `ModelLocator` component. Defaults to "". + formatter_id (str, optional): ID for `Formatter` component. Defaults to "". submit_model_task_name (str, optional): Name of submit_model task. Defaults to "". validation_task_name (str, optional): Name of validate_model task. Defaults to "validate". cleanup_models (bool, optional): Whether or not models should be deleted after run. Defaults to False. diff --git a/examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/config/config_fed_client.conf b/tests/integration_test/data/jobs/hello-pt-cse/app/config/config_fed_client.conf similarity index 100% rename from examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/config/config_fed_client.conf rename to tests/integration_test/data/jobs/hello-pt-cse/app/config/config_fed_client.conf diff --git a/examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/config/config_fed_server.conf b/tests/integration_test/data/jobs/hello-pt-cse/app/config/config_fed_server.conf similarity index 100% rename from examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/config/config_fed_server.conf rename to tests/integration_test/data/jobs/hello-pt-cse/app/config/config_fed_server.conf diff --git a/examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/custom/net.py b/tests/integration_test/data/jobs/hello-pt-cse/app/custom/net.py similarity index 100% rename from examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/custom/net.py rename to tests/integration_test/data/jobs/hello-pt-cse/app/custom/net.py diff --git a/examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/custom/train.py b/tests/integration_test/data/jobs/hello-pt-cse/app/custom/train.py 
similarity index 99%
rename from examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/custom/train.py
rename to tests/integration_test/data/jobs/hello-pt-cse/app/custom/train.py
index 3bc1ebc679..6f04eebf85 100644
--- a/examples/hello-world/hello-cross-val/jobs/hello-cross-val/app/custom/train.py
+++ b/tests/integration_test/data/jobs/hello-pt-cse/app/custom/train.py
@@ -26,7 +26,7 @@ from nvflare.app_common.app_constant import ModelName
 # (optional) set a fixed location so we don't need to download everytime
-CIFAR10_ROOT = "/tmp/nvflare/data/cifar10"
+CIFAR10_ROOT = "/tmp/nvflare/data"
 MODEL_SAVE_PATH_ROOT = "/tmp/nvflare/workdir/cifar10"
diff --git a/examples/hello-world/hello-cross-val/jobs/hello-cross-val/meta.conf b/tests/integration_test/data/jobs/hello-pt-cse/meta.conf
similarity index 77%
rename from examples/hello-world/hello-cross-val/jobs/hello-cross-val/meta.conf
rename to tests/integration_test/data/jobs/hello-pt-cse/meta.conf
index df04ef4ed7..39f64f94d3 100644
--- a/examples/hello-world/hello-cross-val/jobs/hello-cross-val/meta.conf
+++ b/tests/integration_test/data/jobs/hello-pt-cse/meta.conf
@@ -1,5 +1,5 @@
 {
-  name = "hello-cross-val"
+  name = "hello-pt-cse"
   resource_spec {}
   min_clients = 2
   deploy_map {
diff --git a/tests/integration_test/data/test_configs/standalone_job/client_api.yml b/tests/integration_test/data/test_configs/standalone_job/client_api.yml
index 611135a07c..9eba11a1ac 100644
--- a/tests/integration_test/data/test_configs/standalone_job/client_api.yml
+++ b/tests/integration_test/data/test_configs/standalone_job/client_api.yml
@@ -67,8 +67,6 @@ tests:
       "data": { "run_finished": True }
   setup:
   - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='/tmp/nvflare/data', train=True, download=True)"
-  teardown:
-  - rm -rf /tmp/nvflare/data
 - test_name: "run pt-client-api-launch-once"
   event_sequence:
   - "trigger":
@@ -86,8 +84,6 @@ tests:
       "data": { "run_finished": True }
   setup:
   - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='/tmp/nvflare/data', train=True, download=True)"
-  teardown:
-  - rm -rf /tmp/nvflare/data
 - test_name: "run pt-client-api-cyclic"
   event_sequence:
   - "trigger":
@@ -105,8 +101,6 @@ tests:
       "data": { "run_finished": True }
   setup:
   - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='/tmp/nvflare/data', train=True, download=True)"
-  teardown:
-  - rm -rf /tmp/nvflare/data
 - test_name: "run decorator"
   event_sequence:
   - "trigger":
@@ -124,8 +118,6 @@ tests:
       "data": { "run_finished": True }
   setup:
   - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='/tmp/nvflare/data', train=True, download=True)"
-  teardown:
-  - rm -rf /tmp/nvflare/data
 - test_name: "run lightning-client-api"
   event_sequence:
   - "trigger":
@@ -144,8 +136,6 @@ tests:
   setup:
   - python -m pip install pytorch_lightning
   - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='/tmp/nvflare/data', train=True, download=True)"
-  teardown:
-  - rm -rf /tmp/nvflare/data
 - test_name: "run pt-client-api-in-process"
   event_sequence:
   - "trigger":
@@ -163,8 +153,6 @@ tests:
       "data": { "run_finished": True }
   setup:
   - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='/tmp/nvflare/data', train=True, download=True)"
-  teardown:
-  - rm -rf /tmp/nvflare/data
 - test_name: "run decorator-in-process"
   event_sequence:
   - "trigger":
@@ -182,8 +170,6 @@ tests:
       "data": { "run_finished": True }
   setup:
   - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='/tmp/nvflare/data', train=True, download=True)"
-  teardown:
-  - rm -rf /tmp/nvflare/data
 - test_name: "run lightning-client-api-in-process"
   event_sequence:
   - "trigger":
diff --git a/tests/integration_test/data/test_configs/standalone_job/pt_job.yml b/tests/integration_test/data/test_configs/standalone_job/pt_job.yml
index 33ecba9566..cb582e2946 100644
--- a/tests/integration_test/data/test_configs/standalone_job/pt_job.yml
+++ b/tests/integration_test/data/test_configs/standalone_job/pt_job.yml
@@ -28,3 +28,26 @@ tests:
   - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='~/data', download=True)"
   teardown:
   - rm -rf ~/data
+- test_name: "run hello-pt-cross-val"
+  event_sequence:
+  - "trigger":
+      "type": "server_log"
+      "data": "Server started"
+    "actions": [ "submit_job hello-pt-cse" ]
+    "result":
+      "type": "job_submit_success"
+  - "trigger":
+      "type": "run_state"
+      "data": { "run_finished": True }
+    "actions": [ "ensure_current_job_done" ]
+    "result":
+      "type": "run_state"
+      "data": { "run_finished": True }
+  validators:
+  - path: tests.integration_test.src.validators.PTModelValidator
+  - path: tests.integration_test.src.validators.CrossValResultValidator
+    args: { server_model_names: [ "server" ] }
+  setup:
+  - python -c "from torchvision.datasets import CIFAR10; CIFAR10(root='/tmp/nvflare/data/', download=True)"
+  teardown:
+  - rm -rf ~/data