exe() on OSX may incorrectly raise ZombieProcess #1044

Closed
giampaolo opened this issue May 2, 2017 · 2 comments · Fixed by #1100

Comments

@giampaolo
Owner

giampaolo commented May 2, 2017

======================================================================
ERROR: psutil.tests.test_process.TestProcess.test_prog_w_funky_name
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/vagrant/psutil/psutil/tests/test_process.py", line 762, in test_prog_w_funky_name
    self.assertEqual(os.path.normcase(p.exe()),
  File "/vagrant/psutil/psutil/__init__.py", line 684, in exe
    exe = self._proc.exe()
  File "/vagrant/psutil/psutil/_psosx.py", line 304, in wrapper
    raise ZombieProcess(self.pid, self._name, self._ppid)
ZombieProcess: psutil.ZombieProcess process still exists but it's a zombie (pid=60931, name='$testfnfoo bar )')

On Python 3:

======================================================================
ERROR: psutil.tests.test_unicode.TestFSAPIs.test_proc_exe
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/vagrant/psutil/psutil/_psosx.py", line 293, in wrapper
    return fun(self, *args, **kwargs)
  File "/vagrant/psutil/psutil/_psosx.py", line 350, in exe
    return cext.proc_exe(self.pid)
ProcessLookupError: [Errno 3] No such process

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/vagrant/psutil/psutil/tests/test_unicode.py", line 145, in test_proc_exe
    exe = p.exe()
  File "/vagrant/psutil/psutil/__init__.py", line 685, in exe
    exe = self._proc.exe()
  File "/vagrant/psutil/psutil/_psosx.py", line 304, in wrapper
    raise ZombieProcess(self.pid, self._name, self._ppid)
psutil.ZombieProcess: psutil.ZombieProcess process still exists but it's a zombie (pid=62276)

Note: the underlying C call, proc_pidpath, which is authentic crap, incorrectly fails with errno set to ESRCH. It is so crappy that it does so only sometimes. The wrap_exceptions() decorator catches ESRCH, sees that the PID is still alive and so assumes the process is a zombie. As such, wrap_exceptions() should be smarter at detecting zombies (i.e. do something other than pid_exists(pid)). With that said, I suppose the best we can do is raise AccessDenied, which sucks but I see no alternatives.
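
To make the failure mode above concrete, here is a minimal sketch of the translation logic as described in this comment. It is not the actual code in _psosx.py: the exception classes are simplified stand-ins for psutil's, flaky_proc_exe is a hypothetical replacement for cext.proc_exe, and pid_exists is injected as a parameter so the example runs without a real process.

```python
import errno
import functools


class NoSuchProcess(Exception):
    """Simplified stand-in for psutil.NoSuchProcess."""


class ZombieProcess(NoSuchProcess):
    """Simplified stand-in for psutil.ZombieProcess."""


def wrap_exceptions(pid_exists):
    """Build a decorator that translates ESRCH as the comment describes.

    pid_exists is passed in (an assumption of this sketch) so the
    behaviour can be exercised without a live process.
    """
    def decorator(fun):
        @functools.wraps(fun)
        def wrapper(pid):
            try:
                return fun(pid)
            except OSError as err:
                if err.errno == errno.ESRCH:
                    # The PID is still listed, so the decorator guesses
                    # "zombie" even if the ESRCH was spurious.
                    if pid_exists(pid):
                        raise ZombieProcess(pid)
                    raise NoSuchProcess(pid)
                raise
        return wrapper
    return decorator


def flaky_proc_exe(pid):
    """Hypothetical stand-in for a C call that spuriously fails with ESRCH."""
    raise OSError(errno.ESRCH, "No such process")


# The process is alive (pid_exists returns True), yet the caller is told
# it is a zombie.
exe = wrap_exceptions(pid_exists=lambda pid: True)(flaky_proc_exe)

try:
    exe(60931)
    outcome = "no error"
except ZombieProcess:
    outcome = "ZombieProcess"
```

The sketch shows why pid_exists() alone cannot tell a zombie from a live process whose query failed for some unrelated reason.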

@giampaolo
Owner Author

giampaolo commented May 2, 2017

OK, this is very weird. To reproduce:

    def test_prog_w_funky_name(self):
        path = "/vagrant/psutil/xxx"
        create_exe(path)
        self.addCleanup(safe_rmpath, path)
        cmdline = [path, "-c",
                   "import time; [time.sleep(0.01) for x in range(3000)];"
                   "arg1", "arg2", "", "arg3", ""]
        sproc = get_test_subprocess(cmdline)
        p = psutil.Process(sproc.pid)
        for x in range(10000):
            print(x, p.exe())
            p._exe = None  # invalidate cache

It will print:

...
(60, '/vagrant/psutil/xxx')
(61, '/vagrant/psutil/xxx')
(62, '/vagrant/psutil/xxx')
(63, '/vagrant/psutil/xxx')
(64, '/vagrant/psutil/xxx')
(65, '/vagrant/psutil/xxx')
(66, '/vagrant/psutil/xxx')
(67, '/vagrant/psutil/xxx')
(68, '/vagrant/psutil/xxx')
(69, '/vagrant/psutil/xxx')
(70, '/vagrant/psutil/xxx')
(71, '/vagrant/psutil/xxx')
(72, '/vagrant/psutil/xxx')
(73, '/vagrant/psutil/xxx')
xxx: posix_spawn: /vagrant/psutil/xxx2.7: No such file or directory
(74, '/vagrant/psutil/xxx')
ERROR

The errno after proc_pidpath call is set to 3 (no such process).
It's interesting that those strings (posix_spawn, no such file or directory) are printed on screen (it's not psutil printing them so it must be some C syscall).
It must be noted that the failure has nothing to do with the funky path name.

@giampaolo
Owner Author

giampaolo commented May 2, 2017

This is a bigger problem than I thought, and it requires a substantial refactoring. Several C functions rely on poor, undocumented OSX APIs such as proc_pidinfo, which do not return a meaningful error on failure. As such we are forced to guess what happened, either from Python by using pid_exists() or from C itself via os.kill.

The wrap_exceptions decorator implements this logic (e.g. to guess zombies), but some functions such as cext.proc_kinfo_oneshot are not "poor" (they gracefully set errno) and so don't need such guessing. The status() method is one of these. To push this even further, in case of a zombie, proc_kinfo_oneshot succeeds (and correctly reports status == ZOMBIE) whereas proc_pidtaskinfo_oneshot fails with ESRCH. So the whole logic needs to be carefully checked and possibly reorganized (not easy). I'm still not sure, but what we may want to do is delegate the whole "guessing" logic to Python, not to C, and from C raise RuntimeError or (better) a custom exception. From Python, that'll mean we're supposed to guess whether to raise NoSuchProcess, AccessDenied or ZombieProcess.
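
One possible shape for the idea above, sketched under stated assumptions rather than taken from the eventual fix: the C layer raises a generic RuntimeError, and a Python context manager decides which exception to surface by asking a reliable status source (the role proc_kinfo_oneshot plays in the comment). The names catch_zombie, get_status and STATUS_ZOMBIE, as well as the simplified exception classes, are illustrative only.

```python
import contextlib


class NoSuchProcess(Exception):
    """Simplified stand-in for psutil.NoSuchProcess."""


class ZombieProcess(NoSuchProcess):
    """Simplified stand-in for psutil.ZombieProcess."""


class AccessDenied(Exception):
    """Simplified stand-in for psutil.AccessDenied."""


STATUS_ZOMBIE = "zombie"  # illustrative constant for this sketch


@contextlib.contextmanager
def catch_zombie(pid, get_status):
    """Translate a generic failure from the C layer into a psutil-style error.

    get_status(pid) is assumed to be reliable: it returns the process
    status, or raises NoSuchProcess if the PID is gone.
    """
    try:
        yield
    except RuntimeError:
        # If get_status raises NoSuchProcess, that simply propagates.
        if get_status(pid) == STATUS_ZOMBIE:
            raise ZombieProcess(pid)
        # Alive and not a zombie: the cause is unknown, so fall back
        # to AccessDenied.
        raise AccessDenied(pid)


def gone(pid):
    """Status source for a PID that no longer exists."""
    raise NoSuchProcess(pid)


def outcome(get_status):
    """Run a failing 'C call' under catch_zombie and name what is raised."""
    try:
        with catch_zombie(1, get_status):
            raise RuntimeError("generic failure from the C layer")
    except ZombieProcess:
        return "ZombieProcess"
    except NoSuchProcess:
        return "NoSuchProcess"
    except AccessDenied:
        return "AccessDenied"


print(outcome(lambda pid: STATUS_ZOMBIE))
print(outcome(lambda pid: "running"))
print(outcome(gone))
```

Keeping the decision in one Python helper means the C functions only report that something failed, and the zombie check relies on a call that is known to set errno properly.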

giampaolo added a commit that referenced this issue Jun 7, 2017
* small refactoring

* #1044: define a separate ctx manager which handles zombie processes

* add create_zombie_proc utility function

* disable test on windows

* #1044: cmdline() was incorrectly raising AD instead of ZombieProcess

* #1044: environ() was incorrectly raising AD instead of ZombieProcess

* #1044: memory_maps() was incorrectly raising AD instead of ZombieProcess

* #1044: threads() was incorrectly raising AD instead of ZombieProcess

* enhance test

* fix threads()
giampaolo added a commit that referenced this issue Jan 8, 2021
This was done in order to solve
#1044 and
#1100
...but its logic duplicates the one in wrap_exceptions()
decorator.

Also, this should solve #1901 as it erroneously translated NSP into
AD.

Signed-off-by: Giampaolo Rodola <g.rodola@gmail.com>