[CentOS] cpu_freq OSError: [Errno 16] Device or resource busy #2261

Open
IgorPelevanyuk opened this issue Jun 1, 2023 · 4 comments

Comments

@IgorPelevanyuk

IgorPelevanyuk commented Jun 1, 2023

Summary

  • OS: CentOS Linux release 7.9.2009
  • Architecture: 64bit
  • Psutil version: 5.9.4
  • Python version: 3.9.15
  • Type: core

Description

Hello! First of all, thank you for the great tool you are developing.
I have a very strange issue with the psutil library. It breaks when the server is under high load, always at the line max_ = int(bcat(pjoin(path, "scaling_max_freq"))) / 1000
The error is the following:

Traceback (most recent call last):
  File "/zfs/tmp/dirac/DIRAC_q06kKSpilot/job/Wrapper/Wrapper_3029060", line 179, in execute
    result = job.execute()
  File "/zfs/tmp/dirac/DIRAC_q06kKSpilot/diracos/lib/python3.9/site-packages/DIRAC/WorkloadManagementSystem/JobWrapper/JobWrapper.py", line 435, in execute
    watchdog.calibrate()
  File "/zfs/tmp/dirac/DIRAC_q06kKSpilot/diracos/lib/python3.9/site-packages/DIRAC/WorkloadManagementSystem/JobWrapper/Watchdog.py", line 791, in calibrate
    result = self.getNodeInformation()
  File "/zfs/tmp/dirac/DIRAC_q06kKSpilot/diracos/lib/python3.9/site-packages/DIRAC/WorkloadManagementSystem/JobWrapper/Watchdog.py", line 973, in getNodeInformation
    result["CPU(MHz)"] = psutil.cpu_freq()[0]
  File "/zfs/tmp/dirac/DIRAC_q06kKSpilot/diracos/lib/python3.9/site-packages/psutil/__init__.py", line 1864, in cpu_freq
    ret = _psplatform.cpu_freq()
  File "/zfs/tmp/dirac/DIRAC_q06kKSpilot/diracos/lib/python3.9/site-packages/psutil/_pslinux.py", line 745, in cpu_freq
    max_ = int(bcat(pjoin(path, "scaling_max_freq"))) / 1000
  File "/zfs/tmp/dirac/DIRAC_q06kKSpilot/diracos/lib/python3.9/site-packages/psutil/_common.py", line 776, in bcat
    return cat(fname, fallback=fallback, _open=open_binary)
  File "/zfs/tmp/dirac/DIRAC_q06kKSpilot/diracos/lib/python3.9/site-packages/psutil/_common.py", line 765, in cat
    return f.read()
OSError: [Errno 16] Device or resource busy

CPU frequency information exists at the following path:
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

Caching the result does not help, because different processes call the cpu_freq function. The error does not happen every time. Do you know what could cause it?

@giampaolo
Owner

Mmm.. are there some network partitions / folders involved?

@IgorPelevanyuk
Author

Yes, there are some network folders on the server. But the code reads data from /sys/..., which is 100% local. The ZFS folder from which the psutil code runs is also located on local disks.

@giampaolo
Owner

I guess you have no way to reliably reproduce this, correct? This is similar to #2250 (comment): we may retry read() on EBUSY for a certain number of times (say 10), then give up, even though it's not really a proper solution.
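
For illustration, the retry idea described above could look roughly like the sketch below. This is not psutil code: read_with_retry, retries, delay and _open are hypothetical names chosen for the example, and the default of 10 attempts simply mirrors the number mentioned in the comment.

```python
import errno
import time


def read_with_retry(path, retries=10, delay=0.01, _open=open):
    """Read a file as bytes, retrying when the read fails with EBUSY.

    Hypothetical helper for illustration only; not part of psutil.
    """
    if retries < 1:
        raise ValueError("retries must be >= 1")
    for attempt in range(retries):
        try:
            with _open(path, "rb") as f:
                return f.read()
        except OSError as err:
            # Retry only on EBUSY, and re-raise once the attempts run out.
            if err.errno != errno.EBUSY or attempt == retries - 1:
                raise
            time.sleep(delay)
```

Any error other than EBUSY is raised immediately, and EBUSY is raised after the last attempt, so a persistently busy file still surfaces as an error rather than being hidden.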

@IgorPelevanyuk
Author

Well, we spent some more time playing with tests and different hypotheses. So far, it looks like the problem is not related to the number of open files. The issue could be the following:
When the load changes from idle to high, the Linux kernel starts to update the value of /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq more frequently, and while it is being updated the file is not available for reading by psutil. Right now we are looking for a way to make scaling_max_freq static to test this hypothesis.
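
One way to gather evidence for the hypothesis above is to read the file in a tight loop while the load changes and count how often each error appears. The sketch below is only a probe under that assumption; probe_reads, attempts and delay are hypothetical names, and it does not prove what the kernel is doing.

```python
import collections
import errno
import time


def probe_reads(path, attempts=1000, delay=0.001):
    """Read `path` repeatedly and count outcomes.

    Successful reads are counted under "ok"; failures are counted under
    the symbolic errno name (for example "EBUSY").
    """
    outcomes = collections.Counter()
    for _ in range(attempts):
        try:
            with open(path, "rb") as f:
                f.read()
            outcomes["ok"] += 1
        except OSError as err:
            name = errno.errorcode.get(err.errno, str(err.errno))
            outcomes[name] += 1
        time.sleep(delay)
    return outcomes
```

Running probe_reads("/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq") once on an idle machine and once while load is ramping up would show whether EBUSY counts rise with the load.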

If I understand correctly, your proposal about handling EBUSY could land in a future version of psutil, and right now we cannot "activate" it?

If you know of a hack to keep scaling_max_freq fixed, we would gladly use it. The CPU frequency value is not critical for the system that uses psutil.
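
Since the frequency value is described as non-critical, one caller-side workaround that needs no change in psutil is to catch the OSError and carry on without the value. A minimal sketch follows; cpu_mhz_or_none is a hypothetical helper name, and it takes the frequency function as an argument so that psutil.cpu_freq can be passed in.

```python
def cpu_mhz_or_none(get_freq):
    """Return the current CPU frequency in MHz, or None if unavailable.

    `get_freq` is a callable such as psutil.cpu_freq, whose result has the
    current frequency as its first field. Hypothetical helper, not part of
    psutil or DIRAC.
    """
    try:
        freq = get_freq()
    except OSError:
        # Covers EBUSY as well as any other OS-level read failure.
        return None
    # psutil.cpu_freq() may itself return None when no frequency is found.
    return freq[0] if freq is not None else None
```

In the calling code from the traceback this would replace the direct call, for example result["CPU(MHz)"] = cpu_mhz_or_none(psutil.cpu_freq), with the caller deciding how to record a missing value.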
