Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hierarchical statistics support #2608

Merged
merged 1 commit into from
Jun 6, 2024

Conversation

apatole
Copy link
Contributor

@apatole apatole commented Jun 3, 2024

This change adds hierarchical federated statistics support.

The existing StatisticsController outputs global statistics in a flat hierarchy. There are usecases where global statistics are required to be generated as per the given hierarchy configuration where NVFlare clients can be specified to belong to a particular hierarchy.

The new class HierarchicalStatisticsController is added to support hierarchical statistics. It is derived from StatisticsController and takes additional argument hierarchy_config for hierarchy specification file providing details about all the clients and their hierarchy.

Example hierarchy config file contents:

{
    "Manufacturers": [
        {
            "Name": "Manufacturer-1",
            "Orgs": [
                 {
                     "Name": "Org-1",
                     "Sites": ["Site-1"]
                 },
                 {
                     "Name": "Org-2",
                     "Sites": ["Site-2"]
                 }
            ],
        },
        {
            "Name": "Manufacturer-2",
            "Orgs": [
                {
                    "Name": "Org-3",
                    "Sites": ["Site-3", "Site-4"]
                }
            ]
        },
    ]
}

The above hierarchy config specifies three level hierarchy for four NVFlare clients named 'Site-1', 'Site-2', 'Site-3' and 'Site-4'. And the generate global statistics output will in hierarchical format like below. At each hierarchical level, global statistics are caulculated whereas local statistics are reported at the last hierarchical level.

{
    "Global": {
        <Global stats>
    },
    "Manufacturers": [
        {
            "Name": "Manufacturer-1",
            "Global": {
                <Manufacturer level stats>
            },
            "Orgs": [
                {
                    "Name": "Org-1",
                    "Global": {
                        <Org level stats>
                    },
                    "Sites": [
                        {
                            "Name": "Site-1",
                            "Local": {
                                <Local stats>
                            },
                        },
                        {
                            "Name": "Site-2",
                            "Local": {
                                <Local stats>
                            },
                        }
                    ],
                },
            ]
        },
    ],
}

The number of hierarchical levels and the hierarchical level names are automatically calculated from the given hierarchy config. Any number of hierarchical levels are supported.

This change also adds unit test to test hierarchical global statistics up to four hierarchical levels.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • New tests added to cover the changes.
  • Quick tests passed locally by running ./runtest.sh.
  • In-line docstrings updated.

Copy link
Collaborator

@yanchengnv yanchengnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I added some comments and requested changes to be more consistent with FLARE code style.

@apatole apatole force-pushed the apatole/add_hierarchical_stats branch 3 times, most recently from 0be8b16 to a5aabae Compare June 5, 2024 15:00
@apatole
Copy link
Contributor Author

apatole commented Jun 5, 2024

Thanks @chesterxgchen and @yanchengnv for the review, I have incorporated all the review comments. Also removed few redundant source lines in the latest change, thanks

Copy link
Collaborator

@yanchengnv yanchengnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better. There seems to be a logical gap about hierarchy_config_json handling (see my comments). Also added some minor style suggestions.

@apatole apatole force-pushed the apatole/add_hierarchical_stats branch from a5aabae to d105fb7 Compare June 5, 2024 16:17
@apatole apatole requested a review from yanchengnv June 5, 2024 16:20
@apatole
Copy link
Contributor Author

apatole commented Jun 5, 2024

Thanks @yanchengnv again for the review. I have incorporated review comments, hoping I addressed comments properly, let me know otherwise, thanks.

@apatole apatole force-pushed the apatole/add_hierarchical_stats branch from d105fb7 to 410aab3 Compare June 5, 2024 16:26
yanchengnv
yanchengnv previously approved these changes Jun 5, 2024
Copy link
Collaborator

@yanchengnv yanchengnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@apatole
Copy link
Contributor Author

apatole commented Jun 5, 2024

Hi @yanchengnv @chesterxgchen, Looks like someone needs to also approve workflow runs to be able to pass workflow checks and merge PR once all checks passed.

@chesterxgchen
Copy link
Collaborator

Hi @yanchengnv @chesterxgchen, Looks like someone needs to also approve workflow runs to be able to pass workflow checks and merge PR once all checks passed.

I am at conference, will take another look after that,

@yanchengnv
Copy link
Collaborator

Hi @yanchengnv @chesterxgchen, Looks like someone needs to also approve workflow runs to be able to pass workflow checks and merge PR once all checks passed.

I am at conference, will take another look after that,

I already did it.

There are some format issues.
@apatole, please run "python -m isort" and "python -m black" against your code to fix these issues.

@apatole
Copy link
Contributor Author

apatole commented Jun 5, 2024

@yanchengnv no worries, thanks. Fixed one typo in copyright section in latest patch, it was introduced in earlier patch, thanks.

@apatole apatole requested a review from yanchengnv June 5, 2024 18:35
Copy link
Collaborator

@yanchengnv yanchengnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@apatole
Copy link
Contributor Author

apatole commented Jun 5, 2024

Looks like blossom-ci is still waiting to be triggered.

@apatole apatole force-pushed the apatole/add_hierarchical_stats branch from bd575cd to fd74021 Compare June 5, 2024 20:54
@apatole
Copy link
Contributor Author

apatole commented Jun 6, 2024

Hi @yanchengnv, please help getting the workflows triggered once and in merging the PR. Had to rebase the PR and it seems blossom-ci also is not getting triggered. Thanks.

This change adds hierarchical federated statistics support.

The existing `StatisticsController` outputs global statistics in a flat
hierarchy. There are usecases where global statistics are required to be
generated as per the given hierarchy configuration where NVFlare clients
can be specified to belong to a particular hierarchy.

The new class `HierarchicalStatisticsController` is added to support
hierarchical statistics. It is derived from `StatisticsController` and
takes additional argument `hierarchy_config` for hierarchy specification
file providing details about all the clients and their hierarchy.

Example hierarchy config file contents:

    {
        "Manufacturers": [
            {
                "Name": "Manufacturer-1",
                "Orgs": [
                     {
                         "Name": "Org-1",
                         "Sites": ["Site-1"]
                     },
                     {
                         "Name": "Org-2",
                         "Sites": ["Site-2"]
                     }
                ],
            },
            {
                "Name": "Manufacturer-2",
                "Orgs": [
                    {
                        "Name": "Org-3",
                        "Sites": ["Site-3", "Site-4"]
                    }
                ]
            },
        ]
    }

The above hierarchy config specifies three level hierarchy for four NVFlare
clients named 'Site-1', 'Site-2', 'Site-3' and 'Site-4'. And the
generate global statistics output will in hierarchical format like below. At
each hierarchical level, global statistics are caulculated whereas local
statistics are reported at the last hierarchical level.

{
    "Global": {
        <Global stats>
    },
    "Manufacturers": [
        {
            "Name": "Manufacturer-1",
            "Global": {
                <Manufacturer level stats>
            },
            "Orgs": [
                {
                    "Name": "Org-1",
                    "Global": {
                        <Org level stats>
                    },
                    "Sites": [
                        {
                            "Name": "Site-1",
                            "Local": {
                                <Local stats>
                            },
                        },
                        {
                            "Name": "Site-2",
                            "Local": {
                                <Local stats>
                            },
                        }
                    ],
                },
            ]
        },
    ],
}

The number of hierarchical levels and the hierarchical level names are
automatically calculated from the given hierarchy config. Any number of
hierarchical levels are supported.

This change also adds unit test to test hierarchical global statistics
upto four hierarchical levels.
@apatole apatole force-pushed the apatole/add_hierarchical_stats branch from fd74021 to c022a6b Compare June 6, 2024 13:27
@yanchengnv yanchengnv merged commit cc741be into NVIDIA:main Jun 6, 2024
15 checks passed
@apatole apatole deleted the apatole/add_hierarchical_stats branch June 6, 2024 14:45
nvidianz pushed a commit to nvidianz/NVFlare that referenced this pull request Jul 8, 2024
This change adds hierarchical federated statistics support.

The existing `StatisticsController` outputs global statistics in a flat
hierarchy. There are usecases where global statistics are required to be
generated as per the given hierarchy configuration where NVFlare clients
can be specified to belong to a particular hierarchy.

The new class `HierarchicalStatisticsController` is added to support
hierarchical statistics. It is derived from `StatisticsController` and
takes additional argument `hierarchy_config` for hierarchy specification
file providing details about all the clients and their hierarchy.

Example hierarchy config file contents:

    {
        "Manufacturers": [
            {
                "Name": "Manufacturer-1",
                "Orgs": [
                     {
                         "Name": "Org-1",
                         "Sites": ["Site-1"]
                     },
                     {
                         "Name": "Org-2",
                         "Sites": ["Site-2"]
                     }
                ],
            },
            {
                "Name": "Manufacturer-2",
                "Orgs": [
                    {
                        "Name": "Org-3",
                        "Sites": ["Site-3", "Site-4"]
                    }
                ]
            },
        ]
    }

The above hierarchy config specifies three level hierarchy for four NVFlare
clients named 'Site-1', 'Site-2', 'Site-3' and 'Site-4'. And the
generate global statistics output will in hierarchical format like below. At
each hierarchical level, global statistics are caulculated whereas local
statistics are reported at the last hierarchical level.

{
    "Global": {
        <Global stats>
    },
    "Manufacturers": [
        {
            "Name": "Manufacturer-1",
            "Global": {
                <Manufacturer level stats>
            },
            "Orgs": [
                {
                    "Name": "Org-1",
                    "Global": {
                        <Org level stats>
                    },
                    "Sites": [
                        {
                            "Name": "Site-1",
                            "Local": {
                                <Local stats>
                            },
                        },
                        {
                            "Name": "Site-2",
                            "Local": {
                                <Local stats>
                            },
                        }
                    ],
                },
            ]
        },
    ],
}

The number of hierarchical levels and the hierarchical level names are
automatically calculated from the given hierarchy config. Any number of
hierarchical levels are supported.

This change also adds unit test to test hierarchical global statistics
upto four hierarchical levels.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants