Block diagram for stack and callflow thermalctld
mprabhu-nokia authored Aug 13, 2020
1 parent d74ccb0 commit 4133586
Showing 2 changed files with 0 additions and 0 deletions.

7 comments on commit 4133586

@shyam77git

In pmon-chassis-distributed-thermalctld.png, three CLIs are mentioned at the bottom. Seeking clarification on them:
show environment
show platform temperature
show platform fan

If all temperature sensors across all cards (Supervisor, LCs) are covered by the 'show platform temperature' CLI, then what does the 'show environment' CLI cover?
Does 'show environment' cover voltage and current sensors?
Ideally, all environmental sensors (i.e. temperature, voltage and current) should be under the 'show environment' CLI.
So, what's the rationale behind moving temperature sensors to the 'show platform' CLI?

@shyam77git

The following two CLIs are available on both cards (Supervisor and LC):
show environment
show platform temperature

Can you please confirm the following:

  1. An LC can cater only to its local card info and local sensors, so that is what's expected from these CLIs on an LC.

  2. On the Supervisor (CC), these CLIs are expected to cater to self (local) + FCs.
    a) So, what's the display output format? Is it like the following?

    sensor A - info
    ...
    sensor Z - info

    <FC0>
      sensor A - info
      ...
      sensor Z - info

    b) Would these CLIs be extended to support a location option, like show platform loc <> ; show environment loc <>?

    c) Would these CLIs also display LC sensor info (show environment) and LC card/platform info (show platform), to give a holistic view on the Supervisor (CC)?
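
To make the layout in 2(a) and the location filter in 2(b) concrete, here is a minimal sketch. It is only an illustration of the question, not SONiC code: the `render` function, the `readings` structure and the card names are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical data shape: card name -> {sensor name -> info string}.
Readings = Dict[str, Dict[str, str]]


def render(readings: Readings, local: str, loc: Optional[str] = None) -> str:
    """Format sensors grouped by card.

    The local card is printed first without a header; every other card
    gets a "<CARD>" header with indented sensors, as in 2(a).
    If `loc` is given, only that card is shown (the "loc <>" idea in 2(b)).
    """
    lines: List[str] = []
    cards = [local] + sorted(c for c in readings if c != local)
    for card in cards:
        if card not in readings:
            continue
        if loc is not None and card != loc:
            continue
        indent = ""
        if card != local:
            lines.append(f"<{card}>")
            indent = "  "
        for name, info in readings[card].items():
            lines.append(f"{indent}{name} - {info}")
    return "\n".join(lines)


# Made-up readings for a Supervisor ("SUP") and one fabric card ("FC0").
sample = {"SUP": {"sensor A": "info"}, "FC0": {"sensor A": "info"}}
print(render(sample, local="SUP"))
print(render(sample, local="SUP", loc="FC0"))
```

With a filter like `loc`, the same command could serve both the holistic view asked about in 2(c) and a per-card view, but whether the CLIs should work this way is exactly the open question.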

@shyam77git commented on 4133586 Aug 26, 2020

In doc/pmon/pmon-chassis-images/pmon-chassis-distributed-db.png, which option is being planned?
In option 1, is the Supervisor going to pull data from the LCs at regular intervals?

In that case, I'd suggest option 2, since the LC would notify the Supervisor of any runtime change at the LC's end, whereas in option 1 the Supervisor has to come and pull after a certain interval. Sensor data / thermal conditions on a board mostly vary only when something affects that board, e.g. bandwidth, packet size, ASIC usage. So it is better for the remote end (LC) to notify the Supervisor of such changes, which saves CPU cycles on the Supervisor.

Another point: option 2 populates the Global Redis-DB on the Supervisor/CC.
a) This may help 'show environment' (on CC) provide a holistic view of all remote card sensor data.
b) With this option, would thermal sensor data from remote LCs go to the CC's local DB used by thermalctld, or to this GlobalDB (on CC)? I'd think GlobalDB would be preferred.
Can you please share your thoughts?
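
The pull-versus-push trade-off can be sketched roughly as below. This is a toy model, not the design in the diagram: `GlobalDB` is a plain dictionary standing in for the Global Redis-DB, and `LineCard`, `poll`, the 1.0-degree threshold and the sample temperatures are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple


class GlobalDB:
    """Stand-in for the Global Redis-DB on the Supervisor (a dict here)."""

    def __init__(self) -> None:
        self.table: Dict[str, float] = {}
        self.writes = 0  # counts Supervisor-side updates

    def update(self, key: str, value: float) -> None:
        self.table[key] = value
        self.writes += 1


class LineCard:
    """Hypothetical LC holding a single temperature reading."""

    def __init__(self, name: str, temp: float) -> None:
        self.name = name
        self.temp = temp
        self._last_sent: Optional[float] = None
        self._listeners: List[Callable[[str, float], None]] = []

    def subscribe(self, callback: Callable[[str, float], None]) -> None:
        self._listeners.append(callback)

    def set_temp(self, temp: float, threshold: float = 1.0) -> None:
        """Option 2 (push): notify only if the value moved by >= threshold."""
        self.temp = temp
        if self._last_sent is None or abs(temp - self._last_sent) >= threshold:
            self._last_sent = temp
            for callback in self._listeners:
                callback(f"{self.name}:temp", temp)


def poll(db: GlobalDB, cards: List[LineCard]) -> None:
    """Option 1 (pull): the Supervisor reads every LC on every interval."""
    for card in cards:
        db.update(f"{card.name}:temp", card.temp)


def demo() -> Tuple[int, int]:
    """Return (pull writes, push writes) for one made-up temperature trace."""
    pull_db, push_db = GlobalDB(), GlobalDB()
    lc = LineCard("LC0", 40.0)
    lc.subscribe(push_db.update)
    for temp in [40.0, 40.2, 40.1, 45.0, 45.1]:  # one sample per interval
        lc.set_temp(temp)
        poll(pull_db, [lc])
    return pull_db.writes, push_db.writes


print(demo())
```

In this toy trace the pull model updates the Supervisor on every interval while the push model updates it only on significant changes. The price of pushing is that the Supervisor's copy can lag the true value by up to the threshold, so the threshold would need to be chosen with the thermal policy in mind.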

@shyam77git

In doc/pmon/pmon-chassis-images/pmon-chassis-layout.png, is Device Manager a platform-owned process, i.e. can each platform design and implement it as it sees fit?
And does it always interface with PMON's monitoring process via IPC?
In that case, is the platform plugin path from the SONiC layer towards the platform a separate/independent path, used primarily for get/set operations from PMON's psud, syseepromd etc. towards the platform?

@shyam77git

In doc/pmon/pmon-chassis-images/pmon-chassis-psu.png, computing the power budget requires the power consumption of the remote cards (LCs).
How is power_per_LC determined? This info has to come from the LC.
I'm assuming this variable in the diagram refers to the power consumption at the LC.
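
For reference, the kind of calculation being asked about might look like the sketch below. The formula (total PSU supply minus total card consumption) is an assumption, since the diagram itself is not reproduced here, and the function name, card names and wattages are hypothetical. The per-LC figures are the values that would have to be reported by each LC.

```python
from typing import Dict


def power_budget(psu_supplied_watts: Dict[str, float],
                 card_consumed_watts: Dict[str, float]) -> float:
    """Return supplied minus consumed power in watts.

    A negative result means the cards draw more than the PSUs supply.
    """
    supplied = sum(psu_supplied_watts.values())
    consumed = sum(card_consumed_watts.values())
    return supplied - consumed


# Made-up numbers: two PSUs, one Supervisor and two LCs.
psus = {"PSU0": 3000.0, "PSU1": 3000.0}
cards = {"SUP": 200.0, "LC0": 800.0, "LC1": 800.0}
print(power_budget(psus, cards))
```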

@shyam77git

Maybe I'm missing it, but I don't see info on the following three sub-infrastructures under Platform management:
FPD
LED
OBFL
Can you please share a link/pointer? Or is there a plan to add/discuss them?

@Junchao-Mellanox

> In pmon-chassis-distributed-thermalctld.png, three CLIs are mentioned at the bottom. Seeking clarification on them:
> show environment
> show platform temperature
> show platform fan
>
> If all temperature sensors across all cards (Supervisor, LCs) are covered by the 'show platform temperature' CLI, then what does the 'show environment' CLI cover?
> Does 'show environment' cover voltage and current sensors?
> Ideally, all environmental sensors (i.e. temperature, voltage and current) should be under the 'show environment' CLI.
> So, what's the rationale behind moving temperature sensors to the 'show platform' CLI?

Hi, thanks for your comment. 'show environment' is a wrapper around the 'sensors' command: it reads sensors.conf and prints sensor status to the screen. If a voltage/current sensor is configured in sensors.conf, the 'show environment' output will cover it. The 'show platform fan' command displays detailed fan info in a user-friendly way; it covers not only fan speed but also fan status, fan direction and so on. Compared to 'show environment', 'show platform fan' makes it easier for the user to understand fan status. The same applies to 'show platform temperature'.
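
As a rough illustration of why the 'sensors' output can cover several sensor types at once, the sketch below groups lines of sensors-style text by their unit. It is not the implementation of 'show environment': the `group_by_kind` helper, the unit-to-kind mapping and the sample text are assumptions made for this example, and real output depends on the platform's sensors.conf.

```python
import re
from typing import Dict, List

# Unit suffix -> sensor kind. This mapping is an assumption for the sketch.
UNIT_KINDS = {"°C": "temperature", "RPM": "fan", "V": "voltage", "A": "current"}

# Matches lines shaped like "label:   +12.3 UNIT ..." and ignores the rest.
LINE_RE = re.compile(r"^([^:]+):\s+([+-]?\d+(?:\.\d+)?)\s*(°C|RPM|V|A)")


def group_by_kind(text: str) -> Dict[str, List[str]]:
    """Group sensor labels by kind, based on the unit that follows the value."""
    groups: Dict[str, List[str]] = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # header lines, blank lines, unknown units
        label, _value, unit = match.groups()
        groups.setdefault(UNIT_KINDS[unit], []).append(label.strip())
    return groups


# Hypothetical sample text in the general style of 'sensors' output.
SAMPLE = """\
temp1:        +45.0°C
fan1:         5000 RPM
in0:          +1.05 V
curr1:        +2.50 A
"""
print(group_by_kind(SAMPLE))
```

A dedicated command such as 'show platform fan' can go further than this kind of generic listing because it knows what a fan is and can report status and direction, which is the point made above.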
