Skip to content

Commit

Permalink
Change orchagent pop batch size from 8192 to 1024 (#16125)
Browse files Browse the repository at this point in the history
### Why I did it
Background running lua script may cause redis-server quite busy if batch size is 8192.
If handling time exceeded default 5s, the redis-server will not response to other process and will cause syncd crash.

```
Aug  9 07:46:29.512326 str-s6100-acs-5 INFO database#supervisord: redis 68:M 09 Aug 2023 07:46:29.511 # Lua slow script detected: still in execution after 5186 milliseconds. You can try killing the script using the SCRIPT KILL command. Script SHA1 is: 88270a7c5c90583e56425aca8af8a4b8c39fe757
Aug  9 07:46:29.523716 str-s6100-acs-5 ERR syncd#syncd: :- checkReplyType: Expected to get redis type 5 got type 6, err: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.
Aug  9 07:46:29.524818 str-s6100-acs-5 INFO syncd#supervisord: syncd terminate called after throwing an instance of '
Aug  9 07:46:29.525268 str-s6100-acs-5 ERR pmon#CCmisApi: :- checkReplyType: Expected to get redis type 5 got type 6, err: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.
Aug  9 07:46:29.526148 str-s6100-acs-5 INFO syncd#supervisord: syncd std::system_error'
Aug  9 07:46:29.528308 str-s6100-acs-5 ERR pmon#psud[32]: :- checkReplyType: Expected to get redis type 5 got type 6, err: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.
Aug  9 07:46:29.529048 str-s6100-acs-5 ERR lldp#python3: :- guard: RedisReply catches system_error: command: *2#015#012$3#015#012DEL#015#012$27#015#012LLDP_ENTRY_TABLE:Ethernet37#015#012, reason: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.: Input/output error
Aug  9 07:46:29.529720 str-s6100-acs-5 ERR snmp#python3: :- guard: RedisReply catches system_error: command: *2#015#012$7#015#012HGETALL#015#012$28#015#012COUNTERS:oid:0x100000000000a#015#012, reason: BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE.: Input/output error
```

88270a7c5c90583e56425aca8af8a4b8c39fe757 is /usr/share/swss/consumer_state_table_pops.lua
##### Work item tracking
- Microsoft ADO **24741990**:

#### How I did it
Change batch size from 8192 to1024.
#### How to verify it
Run all test cases in sonic-mgmt to verify the system stability.

### Tested branch (Please provide the tested image version)

- [x] 20220531.36
  • Loading branch information
ZhaohuiS authored Aug 15, 2023
1 parent 1acafa4 commit 286ec3e
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions dockers/docker-orchagent/orchagent.sh
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,8 @@ fi
mkdir -p /var/log/swss
ORCHAGENT_ARGS="-d /var/log/swss "

# Set orchagent pop batch size to 8192
ORCHAGENT_ARGS+="-b 8192 "
# Set orchagent pop batch size to 1024
ORCHAGENT_ARGS+="-b 1024 "

# Set synchronous mode if it is enabled in CONFIG_DB
SYNC_MODE=$(echo $SWSS_VARS | jq -r '.synchronous_mode')
Expand Down

0 comments on commit 286ec3e

Please sign in to comment.