Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SONiC PAC Support HLD #1315

Merged
merged 13 commits into from
Jan 25, 2024
Merged

SONiC PAC Support HLD #1315

merged 13 commits into from
Jan 25, 2024

Conversation

samitabh
Copy link
Contributor

@samitabh samitabh commented Apr 5, 2023

SONiC PAC Support HLD

The PAC functionality requires the changes to the below Repos. The associated PRs are listed below in merge sequence

Repo Title PR
sonic-swss-common Changes to Handle PAC Mgr updates sonic-net/sonic-swss-common#871
sonic-swss Vlanmgr and Fdborch changes for PAC sonic-net/sonic-swss#3143
sonic-buildimage PAC infra sonic interface files sonic-net/sonic-buildimage#18638
sonic-buildimage PAC infra header files sonic-net/sonic-buildimage#18639
sonic-buildimage PAC infra files sonic-net/sonic-buildimage#18640
sonic-buildimage PAC infra util changes for logging sonic-net/sonic-buildimage#18641
sonic-buildimage PAC infra utils changes for sim sonic-net/sonic-buildimage#18642
sonic-buildimage PAC infra utilities sonic-net/sonic-buildimage#18643
sonic-buildimage PAC libinfra tool sonic-net/sonic-buildimage#18644
sonic-buildimage PAC Infra OS abstraction files sonic-net/sonic-buildimage#18645
sonic-buildimage PAC Infra sysapi files sonic-net/sonic-buildimage#18646
sonic-buildimage PAC infra Makefile changes sonic-net/sonic-buildimage#18637
sonic-buildimage Changes to handle PAC operational info sonic-net/sonic-buildimage#18618
sonic-buildimage Changes to Handle PAC Mgr updates sonic-net/sonic-buildimage#18619
sonic-buildimage PAC changes to receive config updates sonic-net/sonic-buildimage#18620
sonic-buildimage JSON lib changes to support PAC sonic-net/sonic-buildimage#18622
sonic-buildimage MAB mgr changes for PAC sonic-net/sonic-buildimage#18623
sonic-buildimage Hostapd mgr changes for PAC sonic-net/sonic-buildimage#18621
sonic-wpa-supplicant sonic-wpasupplicant changes for PAC sonic-net/sonic-wpa-supplicant#88
sonic-wpa-supplicant Changes to support PAC and 802.1X interaction sonic-net/sonic-wpa-supplicant#89
sonic-wpa-supplicant Changes in HOSTAPD to Support PAC sonic-net/sonic-wpa-supplicant#90
sonic-wpa-supplicant HOSTPAD driver changes for PAC sonic-net/sonic-wpa-supplicant#91
sonic-utilities CLI support for PAC sonic-net/sonic-utilities#3265
sonic-buildimage MAB makefile and common header files sonic-net/sonic-buildimage#18625
sonic-buildimage MAB common header files sonic-net/sonic-buildimage#18626
sonic-buildimage MAB generic files sonic-net/sonic-buildimage#18627
sonic-buildimage MAB control function changes sonic-net/sonic-buildimage#18628
sonic-buildimage MAB protocol related header files sonic-net/sonic-buildimage#18629
sonic-buildimage MAB protocol related changes sonic-net/sonic-buildimage#18630
sonic-buildimage Auth mgr Makefile and common header files sonic-net/sonic-buildimage#18631
sonic-buildimage Auth mgr generic header files sonic-net/sonic-buildimage#18632
sonic-buildimage Authmgr event handling and other functionality sonic-net/sonic-buildimage#18633
sonic-buildimage Auth mgr API interface functions sonic-net/sonic-buildimage#18634
sonic-buildimage Authmgr include files for authentication functionality sonic-net/sonic-buildimage#18635
sonic-buildimage Auth mgr functionality changes sonic-net/sonic-buildimage#18636
sonic-buildimage Makefile changes for PAC sonic-net/sonic-buildimage#18624
sonic-buildimage Docker and Makefile changes for PAC sonic-net/sonic-buildimage#18616

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Apr 5, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@samitabh samitabh changed the title Initial version of PNAC in SONiC Initial version of PAC in SONiC Apr 10, 2023
@samitabh samitabh changed the title Initial version of PAC in SONiC Initial version of Port Access Control in SONiC Apr 10, 2023
@samitabh samitabh changed the title Initial version of Port Access Control in SONiC SONiC PAC Support HLD Apr 11, 2023
doc/pac/Port Access Control.md Show resolved Hide resolved
doc/pac/Port Access Control.md Outdated Show resolved Hide resolved
doc/pac/Port Access Control.md Show resolved Hide resolved
doc/pac/Port Access Control.md Show resolved Hide resolved
doc/pac/Port Access Control.md Show resolved Hide resolved
@zhangyanzhao
Copy link
Collaborator

doc/pac/Port Access Control.md Show resolved Hide resolved
doc/pac/Port Access Control.md Show resolved Hide resolved
doc/pac/Port Access Control.md Show resolved Hide resolved
doc/pac/Port Access Control.md Outdated Show resolved Hide resolved
doc/pac/Port Access Control.md Show resolved Hide resolved
@rck-innovium
Copy link
Collaborator

  1. It would help to list the SAI APIs and attributes used to implement each mode. 
  2. I am assuming in multiple host mode, before any client gets authenticated, the port will have its learning mode as SAI_BRIDGE_PORT_FDB_LEARNING_MODE_DROP. The expectation is that EAPOL packets sent by the client will be trapped to CPU via the EAPOL host-if trap.  Given this expectation, other Switch Traps like STP, LACP etc. will also get trapped to CPU and get processed.  However, these packets should be dropped and not be sent to the daemons.

@thovikeerthi
Copy link
Contributor

As per the received update during HLD review meeting, whenever RADIUS server connectivity with Switch (i.e. Authenticator) is down (due to any issues), then no new supplicant's shall be allowed to get authenticated i.e. until connectivity with RADIUS server is restored, which results in a down-time w.r.t authentication functionality for new supplicants.
Please check, can we avoid this error case by supporting 'Local Authentication' mechanism as well along with Remote Authentication via RADIUS.
On supporting both mechanisms, we can avoid authentication down-time for new supplicants.
Note:

  • As known to all, Hostapd has support for both local & remote authentication.

@thovikeerthi
Copy link
Contributor

IEE802.1X defines the encapsulation of EAP over Wired networks and Wireless networks. Please confirm, what is the support provided/extended in your design w.r.t this note ?

@thovikeerthi
Copy link
Contributor

thovikeerthi commented May 24, 2023

As per the received update during HLD review meeting i.e. 'On session-timeout', the static FDB MAC entry aging happens and entry will be removed. Please confirm, who manages (i.e. start/stop/run etc) this timer ? what is the role of SWITCH here ?
Also, please explain these Timer's state-machine or transactions handling in the HLD and their impact on the switch (i.e. under use-cases like reboot etc).

@kishorgovind
Copy link
Collaborator

As per the received update during HLD review meeting, whenever RADIUS server connectivity with Switch (i.e. Authenticator) is down (due to any issues), then no new supplicant's shall be allowed to get authenticated i.e. until connectivity with RADIUS server is restored, which results in a down-time w.r.t authentication functionality for new supplicants. Please check, can we avoid this error case by supporting 'Local Authentication' mechanism as well along with Remote Authentication via RADIUS. On supporting both mechanisms, we can avoid authentication down-time for new supplicants. Note:

  • As known to all, Hostapd has support for both local & remote authentication.

Local authentication is not scalable and managable. All the Access Switches need to be in sync with the User Profile configurations in the RADIUS servers.

For port Authentication option, we need to provide local authentication as an option

@kishorgovind
Copy link
Collaborator

Since there is MAC limit of 128 entries in FDB, I see challenge with the design that the legitimate MAC client may not get access/authenticated as explained below. If there are burst of unauthenticated MAC addresses are received say due to DOS attack. At the same time Radius server is not reachable or for some reason Radius server is not providing quick response and current design does not support local authentication. In this scenario unauthenticated clients may occupy 128 FDB entries (even though traffic is not forwarded). Now since 128 entries are full and when next client (which is a legitimate client) tries to authenticate it will be denied the service since FDB entries reached the limit.
Pl. address this scenario in the design. According to me the solution to this problem is FDB entry should not be created for incoming client till this client is authenticated.

DoS actually happens as PAC does not attempt authentication beyond 128 clients. It is not an FDB limit that causes the DoS. Local Authentication is not scalable and managable. PAC needs to have the logic to not even attempt authentication for MACs classified as "malicious". This will be taken up in a future release.

Here when I say DOS attack, once client might be generating series of MAC address and fills FDB table. I am not saying MAC address as malicious or not. My point is that since you are creating FDB entry for MAC before client gets authenticated and hence FDB may get filled up. Instead logic should be you add MAC entry only after authentication.

@samitabh
Copy link
Contributor Author

The expectation is that EAPOL packets sent by the client will be trapped to CPU via the EAPOL host-if trap. Given this expectation, other Switch Traps like STP, LACP etc. will also get trapped to CPU and get processed. However, these packets should be dropped and not be sent to the daemons

[Amitabha] Generally these clients are laptops, printers etc. and will not be sending STP and LACP packets. We will however make a note and come back on this in a further release.

@samitabh
Copy link
Contributor Author

samitabh commented Aug 2, 2023

@thovikeerthi
Can you please review the updated HLD and my comments and approve the HLD.

@kishorgovind
Copy link
Collaborator

As per the received update during HLD review meeting, whenever RADIUS server connectivity with Switch (i.e. Authenticator) is down (due to any issues), then no new supplicant's shall be allowed to get authenticated i.e. until connectivity with RADIUS server is restored, which results in a down-time w.r.t authentication functionality for new supplicants. Please check, can we avoid this error case by supporting 'Local Authentication' mechanism as well along with Remote Authentication via RADIUS. On supporting both mechanisms, we can avoid authentication down-time for new supplicants. Note:

  • As known to all, Hostapd has support for both local & remote authentication.

Local authentication is not scalable and managable. All the Access Switches need to be in sync with the User Profile configurations in the RADIUS servers.

As discussed during community discussion we are going to have single HLD. Hence we need to update HLD with local authentication. Pl let me know how CG can do this?

@samitabh
Copy link
Contributor Author

samitabh commented Aug 3, 2023

As per the received update during HLD review meeting, whenever RADIUS server connectivity with Switch (i.e. Authenticator) is down (due to any issues), then no new supplicant's shall be allowed to get authenticated i.e. until connectivity with RADIUS server is restored, which results in a down-time w.r.t authentication functionality for new supplicants. Please check, can we avoid this error case by supporting 'Local Authentication' mechanism as well along with Remote Authentication via RADIUS. On supporting both mechanisms, we can avoid authentication down-time for new supplicants. Note:

  • As known to all, Hostapd has support for both local & remote authentication.

Local authentication is not scalable and managable. All the Access Switches need to be in sync with the User Profile configurations in the RADIUS servers.

As discussed during community discussion we are going to have single HLD. Hence we need to update HLD with local authentication. Pl let me know how CG can do this?

Let's have a separate HLD for Local Authentication with this HLD as a base reference HLD.

@msosyak
Copy link

msosyak commented Aug 7, 2023

Are theare any other code PRs for the feature opened? If not, is there any prediction when they will be opened?

Copy link
Collaborator

@kishorgovind kishorgovind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved

@adyeung
Copy link
Collaborator

adyeung commented Jan 25, 2024

@msosyak feature contribution is planned for 202405, expect code PR posting in the next few weeks

@adyeung adyeung self-requested a review January 25, 2024 22:06
@adyeung adyeung merged commit 0850550 into sonic-net:master Jan 25, 2024
@zhangyanzhao
Copy link
Collaborator

@adyeung please help to update the HLD with the code PRs, so that we can track the feature progress. Thanks.

@adyeung
Copy link
Collaborator

adyeung commented Apr 16, 2024

@adyeung please help to update the HLD with the code PRs, so that we can track the feature progress. Thanks.

Done

@adyeung adyeung self-assigned this Apr 17, 2024
Once a client is authenticated, authorization parameters from RADIUS can be sent for the client. The Authenticator switch processes these RADIUS attributes to apply to the client session. Following attributes are supported.

- *VLAN Id*: This is the VLAN ID sent by a RADIUS server for the authenticated client. This VLAN should be a pre-created VLAN on the switch.
- *Session Timeout*: This is the timeout attribute of the authenticated client session.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have default session for un-authenticated clients as well?

For instance lets say if a MAB auth-request is rejected by Radius server and if we dont have a session timeout and immediately remove the client. Then the next packet from the client would trigger another Radius request, incase if the client is generating too many packets, the Radius connection might be overloaded.

One way to maintain a session timer for auth-failed clients as well. Please let me know if this was considered during the design


### 2.2.5 VLAN
1. PAC associates authenticated clients to a VLAN on the port.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change involve writing to config_db?

What if the user does 'config save' while the PAC has authenticated the client and added a port to Radius specified VLAN, would this make the configuration persistent?

I'm assuming we only do 'Vlan member add' and not 'VLAN create' based on the radius response, what if the radius specified VLAN is not present in the system? Do we error out?

2. The pacd, mabd and hostapdmgrd gets notified about their respective configurations.
3. hostapd being a standard Linux application gets its configuration from a hostapd.conf file. hostapdmgrd generates the hostapd.conf file based on the relevant CONFIG_DB tables. hostapdmgrd informs hostapd about the list of ports it needs to run on. This port list is dynamic as it depends of port link/admin state, port configuration etc. hostapdmgrd keeps hostapd updated about these changes.
4. These modules communicate amongst themselves via socket messages.
5. hostapd listens to EAPoL PDUs on the provided interface list. When it receives a PDU, it consults pacd and proceeds to authenticate the client. pacd also listens to "unknown src MAC" and triggers MAB, if configured on the port, to authenticate the client.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is client sends a EAPOL pkt followed by regular ETH packet, im assuming there is a possibility that both dot1x (triggered by hostapd) & MAB authentication can happen concurrently.

How does pacd behave during this? Also how do we make we give higher priority to a specific auth-method lets say 'dot1x'?

"port_control_mode": "auto",
"host_control_mode": "multi_auth",
"reauth_period": 60,
"reauth_enable": "true",
Copy link

@praveenraja1 praveenraja1 May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion, maybe consider removing this field and indicting disabled by setting reauth_period as 0. This would be inline with how its done in hostapd.conf

EAP reauthentication period in seconds (default: 3600 seconds; 0 = disable
reauthentication).
eap_reauth_period=3600

Copy link

@praveenraja1 praveenraja1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, left some minor comments/questions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: DeferredForNextRelease
Development

Successfully merging this pull request may close these issues.