From e3ef0455a4387f5a7e931d14db1b3b1aeb9b362e Mon Sep 17 00:00:00 2001
From: Kebo Liu
Date: Fri, 20 Aug 2021 15:27:21 +0800
Subject: [PATCH 1/5] Update dynamic buffer calculation HLD

change to calculate peer_response at different operating speeds, according to the definition in IEEE 802.3 31B.3.7, instead of taking a fixed number.
---
 doc/qos/dynamically-headroom-calculation.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/qos/dynamically-headroom-calculation.md b/doc/qos/dynamically-headroom-calculation.md
index 3bd4c33f330..3c484a6dcd1 100644
--- a/doc/qos/dynamically-headroom-calculation.md
+++ b/doc/qos/dynamically-headroom-calculation.md
@@ -746,7 +746,7 @@ Let's imagine what will happen after a XOFF frame has been sent for a priority.
 1. MAC/PHY delay, which is the bytes held in the SWITCH CHIP's egress pipeline and PHY when XOFF has been generated.
 2. Gearbox delay, which is the latency caused by the Gearbox, if there is one.
 3. KB on cable, which is the bytes held in the cable, which is equals the time required for packet to travel from one end of the cable to the other multiplies the port's speed. Obviously, the time is equal to cable length divided by speed of the light in the media.
-4. Peer response time, which is the bytes that are held in the peer switch's pipeline and will be send out when the XOFF packet is received.
+4. Peer response time, which is the bytes that are held in the peer switch's pipeline and will be send out when the XOFF packet is received. IEEE 802.3 31B.3.7 defines how many pause_quanta shall wait when a switch receives a pause frame at different operating speed. For example, at 40 Gb/s it shall wait for 118 pause_quanta while 394 pause_quanta shall be taken at 100 Gb/s. A pause_quanta equal to 512 bit times(see IEEE 802.3 31B.2).

 Let's consider the flow of XOFF packet generating and handling:

@@ -773,6 +773,7 @@ Therefore, headroom is calculated as the following:
 - `cell occupancy` = (100 - `small packet percentage` + `small packet percentage` * `worst case factor`) / 100
 - `kb on cable` = `cable length` / `speed of light in media` * `port speed`
 - `kb on gearbox` = `port speed` * `gearbox delay` / 8 / 1024
+- `peer response` = (`number of pause_quanta` * 512) / 8 / 1024
 - `propagation delay` = `port mtu` + 2 * (`kb on cable` + `kb on gearbox`) + `mac/phy delay` + `peer response`
 - `Xon` = `pipeline latency`
 - `Xoff` = `lossless mtu` + `propagation delay` * `cell occupancy`
@@ -787,7 +788,6 @@ The values used in the above procedure are fetched from the following table:
 - `port mtu`: PORT|\|mtu, default value is `9100`
 - `gearbox delay`: PERIPHERIAL_TABLE|\|gearbox_delay
 - `mac/phy delay`: ASIC_TABLE|\|mac_phy_delay
-- `peer response`: ASIC_TABLE|\|peer_response_time
 - `cell`: ASIC_TABLE|\|cell_size
 - `small packet percentage`: LOSSLESS_TRAFFIC_PATTERN|\|small_packet_percentage
 - `lossless mtu`: LOSSLESS_TRAFFIC_PATTERN|\|mtu

From 085ff2367c65c5afc63ccb820bf1d138e88179d8 Mon Sep 17 00:00:00 2001
From: Kebo Liu
Date: Tue, 31 Aug 2021 16:29:50 +0800
Subject: [PATCH 2/5] rephrase the description

---
 doc/qos/dynamically-headroom-calculation.md | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/doc/qos/dynamically-headroom-calculation.md b/doc/qos/dynamically-headroom-calculation.md
index 3c484a6dcd1..abf8ade364c 100644
--- a/doc/qos/dynamically-headroom-calculation.md
+++ b/doc/qos/dynamically-headroom-calculation.md
@@ -746,7 +746,19 @@ Let's imagine what will happen after a XOFF frame has been sent for a priority.
 1. MAC/PHY delay, which is the bytes held in the SWITCH CHIP's egress pipeline and PHY when XOFF has been generated.
 2. Gearbox delay, which is the latency caused by the Gearbox, if there is one.
 3. KB on cable, which is the bytes held in the cable, which is equals the time required for packet to travel from one end of the cable to the other multiplies the port's speed. Obviously, the time is equal to cable length divided by speed of the light in the media.
-4. Peer response time, which is the bytes that are held in the peer switch's pipeline and will be send out when the XOFF packet is received. IEEE 802.3 31B.3.7 defines how many pause_quanta shall wait when a switch receives a pause frame at different operating speed. For example, at 40 Gb/s it shall wait for 118 pause_quanta while 394 pause_quanta shall be taken at 100 Gb/s. A pause_quanta equal to 512 bit times(see IEEE 802.3 31B.2).
+4. Peer response time, when a switch receives a pause frame, it will not stop the packet transmission immediately, because it need to drain the frames which already submitted to MAC layer. So extra buffer shall be considered to handle the peer delay response. IEEE 802.3 31B.3.7 defines how many pause_quanta shall wait upon an XOFF. A pause_quanta equal to the time required to transmit 512 bits of a frame at the data rate of the MAC. At different operating speed the number of pause_quanta shall be taken are also different. Following table shows the number of pause_quanta shall be taken for each speed.
+
+ | Operating speed | Number of pause_quanta |
+ |:--------:|:-----------------------------:|
+ | 100 Mb/s | 1 |
+ | 1 Gb/s | 2 |
+ | 10 Gb/s | 67 |
+ | 25 Gb/s | 80 |
+ | 40 Gb/s | 118 |
+ | 50 Gb/s | 147 |
+ | 100 Gb/s | 394 |
+ | 200 Gb/s | 453 |
+ | 400 Gb/s | 905 |

 Let's consider the flow of XOFF packet generating and handling:


From 2bb22a2e98937574d850790fcdf981bba2c34483 Mon Sep 17 00:00:00 2001
From: Kebo Liu
Date: Mon, 13 Sep 2021 14:38:56 +0800
Subject: [PATCH 3/5] Update doc/qos/dynamically-headroom-calculation.md

Co-authored-by: Stephen Sun <5379172+stephenxs@users.noreply.github.com>
---
 doc/qos/dynamically-headroom-calculation.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/qos/dynamically-headroom-calculation.md b/doc/qos/dynamically-headroom-calculation.md
index abf8ade364c..5c188529624 100644
--- a/doc/qos/dynamically-headroom-calculation.md
+++ b/doc/qos/dynamically-headroom-calculation.md
@@ -746,7 +746,8 @@ Let's imagine what will happen after a XOFF frame has been sent for a priority.
 1. MAC/PHY delay, which is the bytes held in the SWITCH CHIP's egress pipeline and PHY when XOFF has been generated.
 2. Gearbox delay, which is the latency caused by the Gearbox, if there is one.
 3. KB on cable, which is the bytes held in the cable, which is equals the time required for packet to travel from one end of the cable to the other multiplies the port's speed. Obviously, the time is equal to cable length divided by speed of the light in the media.
-4. Peer response time, when a switch receives a pause frame, it will not stop the packet transmission immediately, because it need to drain the frames which already submitted to MAC layer. So extra buffer shall be considered to handle the peer delay response. IEEE 802.3 31B.3.7 defines how many pause_quanta shall wait upon an XOFF. A pause_quanta equal to the time required to transmit 512 bits of a frame at the data rate of the MAC. At different operating speed the number of pause_quanta shall be taken are also different. Following table shows the number of pause_quanta shall be taken for each speed.
+4. Peer response time. When a switch receives a pause frame, it will not stop the packet transmission immediately, because it needs to drain the frames which already been submitted to the MAC layer. So extra buffer shall be considered to handle the peer delay response. IEEE 802.3 31B.3.7 defines how many pause_quanta shall wait upon an XOFF. A pause_quanta is equal to the time required to transmit 512 bits of a frame at the data rate of the MAC. At different operating speeds, the number of pause_quanta shall be taken are also different. Following table shows the number of pause_quanta that shall be taken for each speed.
+

  | Operating speed | Number of pause_quanta |
  |:--------:|:-----------------------------:|

From 11fed3d32226efae5923c627df1131164d96ec9b Mon Sep 17 00:00:00 2001
From: Kebo Liu
Date: Mon, 13 Sep 2021 14:40:55 +0800
Subject: [PATCH 4/5] Update dynamically-headroom-calculation.md

---
 doc/qos/dynamically-headroom-calculation.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/qos/dynamically-headroom-calculation.md b/doc/qos/dynamically-headroom-calculation.md
index 5c188529624..1376e02ab96 100644
--- a/doc/qos/dynamically-headroom-calculation.md
+++ b/doc/qos/dynamically-headroom-calculation.md
@@ -786,7 +786,7 @@ Therefore, headroom is calculated as the following:
 - `cell occupancy` = (100 - `small packet percentage` + `small packet percentage` * `worst case factor`) / 100
 - `kb on cable` = `cable length` / `speed of light in media` * `port speed`
 - `kb on gearbox` = `port speed` * `gearbox delay` / 8 / 1024
-- `peer response` = (`number of pause_quanta` * 512) / 8 / 1024
+- `peer response` = (`number of pause_quanta` * 512) / 8
 - `propagation delay` = `port mtu` + 2 * (`kb on cable` + `kb on gearbox`) + `mac/phy delay` + `peer response`
 - `Xon` = `pipeline latency`
 - `Xoff` = `lossless mtu` + `propagation delay` * `cell occupancy`

From 079df5d5454a48808df17b0256339921ca866424 Mon Sep 17 00:00:00 2001
From: Kebo Liu
Date: Fri, 12 Nov 2021 09:50:57 +0800
Subject: [PATCH 5/5] fix review comments

---
 doc/qos/dynamically-headroom-calculation.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/doc/qos/dynamically-headroom-calculation.md b/doc/qos/dynamically-headroom-calculation.md
index 1376e02ab96..7145c0d4a76 100644
--- a/doc/qos/dynamically-headroom-calculation.md
+++ b/doc/qos/dynamically-headroom-calculation.md
@@ -786,7 +786,9 @@ Therefore, headroom is calculated as the following:
 - `cell occupancy` = (100 - `small packet percentage` + `small packet percentage` * `worst case factor`) / 100
 - `kb on cable` = `cable length` / `speed of light in media` * `port speed`
 - `kb on gearbox` = `port speed` * `gearbox delay` / 8 / 1024
-- `peer response` = (`number of pause_quanta` * 512) / 8
+- `peer response` =
+ - if can get a valid pause quanta, `peer response` = (`number of pause_quanta` * 512) / 8
+ - otherwise, use the default value, `peer response`: ASIC_TABLE|\|peer_response_time
 - `propagation delay` = `port mtu` + 2 * (`kb on cable` + `kb on gearbox`) + `mac/phy delay` + `peer response`
 - `Xon` = `pipeline latency`
 - `Xoff` = `lossless mtu` + `propagation delay` * `cell occupancy`
@@ -801,6 +803,7 @@ The values used in the above procedure are fetched from the following table:
 - `port mtu`: PORT|\|mtu, default value is `9100`
 - `gearbox delay`: PERIPHERIAL_TABLE|\|gearbox_delay
 - `mac/phy delay`: ASIC_TABLE|\|mac_phy_delay
+- `peer response`: ASIC_TABLE|\|peer_response_time
 - `cell`: ASIC_TABLE|\|cell_size
 - `small packet percentage`: LOSSLESS_TRAFFIC_PATTERN|\|small_packet_percentage
 - `lossless mtu`: LOSSLESS_TRAFFIC_PATTERN|\|mtu
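
Reviewer note: the `peer response` rule as it stands after PATCH 5/5 can be illustrated with a minimal sketch. It assumes the pause_quanta values from the table added in PATCH 2/5 and the final formula (`number of pause_quanta` * 512) / 8, which gives a result in bytes. The names `PAUSE_QUANTA_BY_SPEED_MBPS`, `peer_response_bytes`, and `default_peer_response` are hypothetical and not taken from the SONiC code base; the fallback stands in for ASIC_TABLE `peer_response_time` and is assumed to be supplied by the caller in the same unit.

```python
# Pause quanta per operating speed (in Mb/s), copied from the table in PATCH 2/5.
PAUSE_QUANTA_BY_SPEED_MBPS = {
    100: 1,
    1_000: 2,
    10_000: 67,
    25_000: 80,
    40_000: 118,
    50_000: 147,
    100_000: 394,
    200_000: 453,
    400_000: 905,
}

# One pause_quanta is the time needed to transmit 512 bits at the MAC data rate.
BITS_PER_PAUSE_QUANTUM = 512


def peer_response_bytes(speed_mbps, default_peer_response):
    """Return the peer response in bytes for a port speed.

    Hypothetical helper: if the speed has a known pause quanta, apply
    (number of pause_quanta * 512) / 8; otherwise fall back to the
    caller-supplied default (standing in for ASIC_TABLE peer_response_time).
    """
    quanta = PAUSE_QUANTA_BY_SPEED_MBPS.get(speed_mbps)
    if quanta is None:
        return default_peer_response
    return quanta * BITS_PER_PAUSE_QUANTUM // 8
```

For example, `peer_response_bytes(100_000, 0)` yields 394 * 512 / 8 = 25216 bytes, while a speed that is not in the table returns the supplied default unchanged.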