Finish implementing REPL server/client #56

Technologicat · 2019-12-20T23:57:38Z

12 February 2020: An updated version of this text is now in doc/repl.md.

Hot-patch a running Python process! With macros in the REPL! Inspired by Swank in Common Lisp.

As of now, a complete implementation is in place, waiting for a few important final touches.

To try it right now, with the latest code from git:

In one terminal, run python3 -m unpythonic.net.server.
- This runs a demo server, intended for development and debugging of this feature itself.
- Multiple clients may be connected simultaneously to the same server. Each client gets an independent REPL session, except that the top-level namespace is shared.
- The actual REPL console you get depends on what you have installed in the environment where the server runs. The following will be tried in order. The first successfully imported one wins:
  - If you have imacropy installed, you will get imacropy.console.MacroConsole. (Recommended.)
  - If you have MacroPy installed, you will get macropy.core.console.MacroConsole.
  - As a fallback that is always available, you will get code.InteractiveConsole, and macro support will not be enabled.
- In the upcoming release version, the idea is to allow starting a REPL server in your own Python app with from unpythonic.net import server; server.start(locals=globals()). This is all the preparation your app code needs to do to provide a REPL server that has access to the running process.
  - Note it's strictly opt-in; the REPL server must be imported and started explicitly. There's no way to turn it on in a running process that didn't opt in when it started.
  - There's no need to server.stop() manually; this is automatically registered as an atexit handler.
  - The argument to start specifies the top-level namespace of REPL sessions served by the server. If this is one of your modules' global namespace, you can directly write to that namespace in the REPL simply by assigning to variables. E.g. x = 42 will actually do mymod.x = 42.
    - If you want a namespace that's only accessible from (and shared by) REPL sessions, use an empty dictionary: server.start(locals={}).
    - For write access to module-level globals in other modules, access them as module attributes, like in Manhole. For example, import sys; sys.modules['myothermod'].x.
- When you want to shut down the demo server, press Ctrl+C in the server's terminal window.
  - This will also work in the release version. The server runs in a daemon thread; so if you shut down your app in any way (or if your app crashes), the server will also shut down immediately (forcibly disconnecting clients, if any remain).
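The namespace semantics described above can be tried out with the standard library alone. This is a minimal sketch, not the server's code: a plain code.InteractiveConsole stands in for a REPL session, and mymod and myothermod are hypothetical modules created on the fly for the demonstration.

```python
import code
import sys
import types

# Hypothetical stand-ins for two modules of an app.
mymod = types.ModuleType("mymod")
mymod.x = 1
myothermod = types.ModuleType("myothermod")
myothermod.x = 1
sys.modules["myothermod"] = myothermod

# A console whose top-level namespace is mymod's global namespace,
# analogous to passing that namespace as `locals` when starting the server.
console = code.InteractiveConsole(locals=mymod.__dict__)

# Plain assignment in the session writes straight into mymod.
console.push("x = 42")

# Globals of any other module are reached as module attributes.
console.push("import sys; sys.modules['myothermod'].x = 23")

print(mymod.x, myothermod.x)
```

Passing an empty dictionary instead of mymod.__dict__ would give a scratch namespace that the app's own code never sees.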
In another terminal, python3 -m unpythonic.net.client 127.0.0.1:1337. This opens a REPL session, where:
- Line editing (GNU readline) is available, with history and remote tab completion (when you use tab completion, the client queries for completions from the server).
- Pressing Ctrl+D at the prompt politely asks to disconnect. If the server fails to respond for whatever reason, following that with Ctrl+C forces a client-side disconnect.
  - The server is smart enough to clean up resources after a client that disappeared.
- At any other time, pressing Ctrl+C in a REPL session sends a KeyboardInterrupt to the remote.
  - This works by injecting a KeyboardInterrupt asynchronous exception into the thread running that particular session. Any other threads in the process running the server are unaffected.
    - This mechanism is documented in the CPython C API docs, so it is public. But it's a bit hard to find, and was never intended to be called from Python code (without writing a custom C extension). It just happens that ctypes.pythonapi makes that possible.
  - Due to technical reasons, remote Ctrl+C currently only works on CPython. Support for PyPy3 would be nice, but currently not possible. See unpythonic.misc.async_raise and Push PyPy3 compatibility to 100% #58 for details.
  - Be sure to press Ctrl+C just once. Hammering the key combo may raise a KeyboardInterrupt locally in the code that is trying to send the remote KeyboardInterrupt (or in code waiting for the server's response), thus forcibly terminating the client. Remote Ctrl+C becomes available again as soon as the server has responded. (The server indicates this by sending the text KeyboardInterrupt, possibly with a stack trace, and then giving a new prompt, just like a standard interactive Python session does.)
- print() is available as usual, but output is properly redirected to the client only in the REPL session's main thread.
  - If you must, look at the value of sys.stdout in the REPL session's main thread. After the REPL server has been started, it's actually a Shim that holds the underlying stream in a ThreadLocalBox, so you can get the stream from there if you really need to. For any thread that hasn't sent a value into that box, the box will return the default, which is the original stdin/stdout/stderr of the server process.
- help(obj) does not work; it hangs the client. This is a known issue. Use the custom doc(obj) instead. It just prints the docstring without paging, while emulating help's dedenting. It's not a perfect solution, but should work well enough to view docstrings of live objects in a live Python process.
  - If you want to look at docstrings for the definition currently on disk instead, just use a regular IPython session or similar.
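The per-thread output redirection mentioned above (a stream proxy that looks up its target per thread, with a fallback default) can be sketched as follows. This is an illustration of the idea only, not the actual Shim or ThreadLocalBox; the class name ThreadLocalStream and its redirect method are made up for the example, and a StringIO stands in for the server process's original stdout.

```python
import io
import threading

class ThreadLocalStream:
    """Hypothetical proxy: forwards writes to a per-thread stream, else to a default."""
    def __init__(self, default):
        self._default = default
        self._local = threading.local()

    def redirect(self, stream):
        """Set the target stream for the calling thread only."""
        self._local.stream = stream

    def _target(self):
        # Threads that never called redirect() fall back to the default.
        return getattr(self._local, "stream", self._default)

    def write(self, text):
        return self._target().write(text)

    def flush(self):
        return self._target().flush()

default_output = io.StringIO()   # stands in for the server's original stdout
session_output = io.StringIO()   # stands in for the stream going to the client
proxy = ThreadLocalStream(default_output)

def session_main():
    proxy.redirect(session_output)
    print("hello from the session thread", file=proxy)

def other_thread():
    print("hello from some other thread", file=proxy)

for target in (session_main, other_thread):
    t = threading.Thread(target=target)
    t.start()
    t.join()

print(repr(session_output.getvalue()))
print(repr(default_output.getvalue()))
```

This also shows why print() from a thread spawned inside a session does not reach the client by itself: that thread never registered a stream of its own, so it gets the default.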
IPv4 only for now. IPv6 would be nice, but something for a later release.
Tested only on Linux (with CPython 3.6 and PyPy3).
- At least the PTY stuff on the server side is *nix-specific.
- Also, I make no guarantees that select.select is not called on an fd that is not a socket.
- Probably possible to make this work in Windows, but I don't need that. PRs are welcome, though.

DANGER:

A REPL server is essentially an opt-in back door. While the intended use is for allowing hot-patching in your app, by its very nature, the server gives access not only to your app, but also to anything that can be imported, including os and sys. It is trivial to use it as a shell that just happens to use Python as the command language, or to obtain traditional shell access (e.g. bash) via it.

This particular REPL server has no authentication support whatsoever. Any user logged in to the local machine can connect. There is no encryption for network traffic, either. Therefore, to remain secure:

Only bind the server to the loopback interface (this is the default). This ensures connections only come from users who can log in to the machine running your app. (Physical local access or an SSH session are both fine.)
Only enable the server if you trust every logged-in user enough to allow them REPL access. The two most common scenarios are:
- The app runs on your local machine, which has no untrusted human users.
- The app runs on a dedicated virtual server, which runs only your app.

In both cases, access control and encrypted connections (SSH) are then provided by the OS itself. Note this is exactly the same level of security (i.e. none whatsoever) as provided by the Python REPL itself. If you have access to python, you have access to the system (with the privileges the python process itself runs under).
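To make "bind to the loopback interface" concrete, here is a minimal standard-library sketch (not the server's own code). Port 0 is used only so that the example runs anywhere; it asks the OS for any free port.

```python
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    # 127.0.0.1 accepts connections from the local machine only.
    # Binding to "0.0.0.0" instead would listen on all interfaces,
    # which is exactly what the advice above says to avoid.
    sock.bind(("127.0.0.1", 0))
    sock.listen()
    host, port = sock.getsockname()
    print(f"listening on {host}:{port}")
```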

Why a custom REPL server/client

Macro support, right there in the console of a REPL-in-a-live-Python-process. This is why this feature is included in unpythonic, instead of just recommending Manhole, socketserverREPL, or similar existing solutions.

Furthermore, the focus is different from most similar projects; this server is primarily intended for hot-patching, not so much for debugging. So we don't care about debugger hooks, or instantly embedding a REPL into a particular local scope (to give the full Python user experience for examining program state), pausing the thread that spawned the REPL. We care about running the REPL server in the background (listening for connections as part of normal operation of your app), and making write access to module globals easy.

A hot-patching REPL server is also useful in oldschool style scientific scripts that run directly via python3 mysolver.py or python3 -m mysolver (no Jupyter notebook there), because it reduces the burden of planning ahead. Seeing the first plots from a new study often raises new questions. Experience has shown it would often be useful to re-plot the same data (that took two hours to compute) in alternative ways... while the script doesn't yet have the code to save anything to disk, because the current run was supposed to be just for testing. You know that when you close that last figure window, the process will terminate, and all that delicious data will be gone. But provided the data can be accessed from module scope, an embedded REPL server can still save the day. You just open a REPL session to your live process, and save what it turns out you needed, before closing that last figure and letting the process terminate. It's all about having a different kind of conversation with your scientific problem. (Cf. Paul Graham on software development in On Lisp; original quotation.)

Future directions

Authentication and encryption

SSH with key-based authentication is the primary future direction of interest. It would enable security, making actual remote access feasible.

This may be added in an eventual v2.0 (using Paramiko), but right now it's not on the immediate roadmap. This would allow a client to be sure the server is who it claims to be, as well as letting users log in based on an authorized_keys file. It would also make it possible to audit who has connected and when.

There are a lot of Paramiko client examples on the internet (oddly, with a focus mainly on security testing), but demo_server.py in the distribution seems to be the only server example, and it leaves important issues unclear, such as how to set up a session and a shell. Reading paramiko/server.py as well as paramiko/transport.py didn't make me much wiser.

(What we want is to essentially treat our Python REPL as the shell for the SSH session.) So for this first version, right now I'm not going to bother with SSH support.

What we needed to get macro support

Drop-in replacing code.InteractiveConsole in unpythonic.net.server with macropy.core.console.MacroConsole gave rudimentary macro support.

However, to have the same semantics as in the imacropy IPython extension, a custom console was needed. This was added to imacropy as imacropy.console.MacroConsole.

For historical interest, refer to and compare imacropy/iconsole.py and macropy/core/console.py. The result is the new imacropy/console.py.
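The "first successfully imported one wins" selection described earlier in this thread might look roughly like the sketch below. The module and class names come from the text above; the function name pick_console_class and the CANDIDATES list are assumptions made for the example, and the real server may well be organized differently.

```python
import code
import importlib

# (module name, class name) pairs, most preferred first.
CANDIDATES = [
    ("imacropy.console", "MacroConsole"),
    ("macropy.core.console", "MacroConsole"),
]

def pick_console_class():
    """Return the first console class that imports cleanly, else the stdlib one."""
    for module_name, class_name in CANDIDATES:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, class_name)
        except Exception:
            # Not installed, or the installation is broken; try the next one.
            continue
    # Always available, but without macro support.
    return code.InteractiveConsole

console_class = pick_console_class()
print(console_class.__module__, console_class.__name__)
```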

DONE:

~~Robustify socket data handling. Do it properly, no optimistic single reads and writes, since TCP doesn't do datagrams.~~
- Now we have a simplistic message protocol (see unpythonic.net.msg) that runs over TCP, so we can use that for the control channel. But to remain simple and netcat compatible, the primary channel cannot be message-based. So we still need a prompt detector.
- Done in 2658ede.
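For readers wondering what "no optimistic single reads" means in practice: a single recv may return only part of what the peer sent, so a message-based channel has to loop until a whole message has arrived. Below is a minimal length-prefixed sketch of that idea. It is not the wire format of unpythonic.net.msg; the 4-byte header and all function names are assumptions for illustration.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length (assumed format)

def send_message(sock, payload: bytes) -> None:
    # sendall loops internally until every byte has been handed to the OS.
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exactly(sock, n: int) -> bytes:
    """Keep reading until exactly n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_message(sock) -> bytes:
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)

# Demonstrate on a connected pair of local sockets.
a, b = socket.socketpair()
with a, b:
    send_message(a, b"first")
    send_message(a, b"second message")
    received = [recv_message(b), recv_message(b)]
print(received)
```

Both messages are written back to back before anything is read, yet the receiver still gets them as two separate units, which plain TCP reads do not guarantee.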
~~Improve presentation of line editing (needs prompt detection at the client side; refer to repl_tool.py by Ivor Wanders).~~ Done in 2658ede.
~~Add remote Ctrl+C support. Requires a control channel, like in IPython. (There is already a rudimentary control channel for tab completion requests; just generalize this.)~~ First cut of remote Ctrl+C support added in 9b68f95.
The comments suggest the server is going to inject itself into the calling module's global namespace, but perhaps it's more pythonic to let the user specify the namespace to run in. You can easily pass globals() as the namespace in the call to unpythonic.net.server.start if that's what you want.
- ~~Done, start() now takes a mandatory locals argument; now just need to update the comments.~~
PyPy3 doesn't support remote Ctrl+C due to lack of PyThreadState_SetAsyncExc in cpyext (PyPy's partial emulation of CPython's ctypes.pythonapi). Disable this feature when running on PyPy, for now, to get this thing out of the door. Done.
- Maybe worry later if we can add that to PyPy itself. Tracked in Push PyPy3 compatibility to 100% #58.
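For reference, the CPython-only mechanism in question can be sketched in a few lines. This is a simplified illustration in the spirit of unpythonic.misc.async_raise, not a copy of it; the demo thread and the event names are made up, and the sketch assumes it runs on CPython.

```python
import ctypes
import threading
import time

def async_raise(thread, exc_type):
    """Ask CPython to raise exc_type inside the given running thread."""
    if not thread.is_alive():
        raise ValueError("thread is not running")
    affected = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type))
    if affected == 0:
        raise ValueError("no thread with that id")
    if affected > 1:
        # Defensive: passing NULL clears the pending request again.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_ulong(thread.ident), None)
        raise SystemError("async exception affected more than one thread")

started = threading.Event()
interrupted = threading.Event()

def busy_session():
    # Stands in for a REPL session thread stuck in a long computation.
    try:
        started.set()
        while True:
            time.sleep(0.01)
    except KeyboardInterrupt:
        interrupted.set()

t = threading.Thread(target=busy_session, daemon=True)
t.start()
started.wait()
async_raise(t, KeyboardInterrupt)
t.join(timeout=5)
print(interrupted.is_set())
```

The exception is only delivered once the target thread gets back to executing Python bytecode, so a thread blocked in a long C call will not notice it until that call returns.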
~~Figure out and fix bug with remote Ctrl+C: output appears one prompt late after Ctrl+C'ing a computation.~~ Hacked around in c3f67c2. A real fix seems more difficult, and I want to release 0.14.2 sooner rather than later.
- To reproduce:
  - Start server. Start client.
  - In the REPL: for _ in range(int(1e9)): pass.
  - Hit Ctrl+C while the loop is running. Observe KeyboardInterrupt and a new prompt.
  - Try something that should print something, e.g. doc(doc). Observe no output.
  - Hit enter again (blank input is fine). Observe output appears now, one prompt late. This delay remains in effect for the rest of the session.
~~Not sure if we should wrap the REPL session in a PTY or not?~~ Keeping the PTY wrapper for now.
- As things stand, the client/server pair acts mostly like a TTY to the code.InteractiveConsole running within it (e.g. ANSI color codes should work - test this!), but some things are not available, because the client runs a second input prompt (that holds the local TTY), separate from the one running on the server.
- ANSI color codes work as expected. print("\033[32m") in a remote REPL turns the foreground color dark green, and print("\033[39m") resets it to the default. Compulsory links for the curious: [1], [2].
~~Improve/update comments/docstrings.~~ 0ed68f5.
- E.g. it's clear why the built-in help() can't work in the current implementation.
Add support for invoking syntactic macros right there in the REPL console. 90461c3.
- Enable macros if MacroPy is installed.
- This will allow pasting in a replacement definition for a function that uses macros.
- Defining macros in the REPL will not be supported. For this, modify the macro definition on disk, reload the module, and re-import the macro, as usual. Note any already loaded code still uses the old definition, as macros are a compile-time thing. Reload affected modules if you need to.
  - If you really need to change a macro definition, it's much safer to just normally restart the process instead of trying to hot-patch it.
- Progress of the advanced macro-enabled console is tracked in Add InteractiveMacroConsole imacropy#4.
Resolve any small but important TODOs in the unpythonic.net code.
- Clean up the stuff to be injected into the server locals namespace. Actually perform the injection.
- Configurable control port. Maybe client connect syntax like localhost 1337 8128 (address and two ports; omitting ports uses default values).
- Maybe alias doc to help (in the REPL sessions)? Then again, least surprise; maybe not. Yes, let's not do that. Better export doc as-is.
Write documentation. unpythonic.net is almost a separate project, so this should have its own doc/repl.md. The material for the documentation is practically already here and in the docstrings.
Make sure to set timeouts for all I/O operations in unpythonic.net. We don't want the leaves to fall while an operation is pending...
- Maybe actually look at this later. This is hard to get right. Yes, look at this later.
Add a usage example to the documentation. Hmm, maybe what we have is already good enough.


Technologicat · 2020-02-09T21:37:22Z

The macro-enabled REPL console belongs in imacropy, so it will be added there. See Technologicat/imacropy#4.

imacropy will be added as a soft dependency for unpythonic. This is coming anyway, because the macropy3 bootstrapper already lives there, and its local copy in unpythonic is already deprecated.

Technologicat · 2020-02-11T00:18:30Z

As of today, imacropy 0.2.0 sports an imacropy.console.MacroConsole, which we can use here.

EDIT: ...aaand done.

Technologicat · 2020-02-11T21:57:30Z

A possible future direction: Detachable sessions

Like screen. But there's no "inject asynchronous exception and resume in background", so we can't do much with SIGTSTP (Ctrl+Z) even if we caught it...

Right now, injecting new background computations into a live Python process is not the primary goal of this feature.

But if needed, this is already possible in the REPL:

import threading
import queue
q = queue.Queue()  # for results
def worker():
    ...
    q.put(...)
t = threading.Thread(target=worker, daemon=True)
t.start()

and then just disconnect the session normally. The results should appear in q some time later, and be available in a future session. The thread object will remain available as t. It would be rather easy to wrap something like this into a convenience utility at the server end. For example:

import threading
from ..misc import namelambda
out = {}
def bg(thunk):
    @namelambda(thunk.__name__)
    def worker():
        try:
            result = thunk()
        except Exception as err:
            out[t.ident] = err
        else:
            out[t.ident] = result
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

and then just expose that to the client side. Usage example:

from unpythonic import primes, islice
t = bg(lambda: islice(primes())[:100])  # --> thread object
# later, to read the result
print(out.get(t.ident, None))
# clean up if you care about that sort of thing
t.join()  # should return immediately if the result is already available
del t

This is like & in bash; to start a background computation, you have to declare it when you start it. (Shells additionally support Ctrl+Z when the computation is already running, but that's outside the scope of what's easily achievable here.)

We have bg/fg to do this in the initial release.
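To show how a matching fg could fit with the bg sketch above, here is a self-contained, standard-library-only variant. The semantics of fg here (join the thread, return the result, re-raise a stored exception) are an assumption made for the example and not necessarily how the released bg/fg behave.

```python
import threading

out = {}  # thread ident -> result, or the exception instance that was raised

def bg(thunk):
    """Run thunk in a daemon thread; return the thread object."""
    def worker():
        try:
            result = thunk()
        except Exception as err:
            out[threading.get_ident()] = err
        else:
            out[threading.get_ident()] = result
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

def fg(t, timeout=None):
    """Wait for the thread; return its result, or re-raise its exception."""
    t.join(timeout)
    if t.is_alive():
        raise TimeoutError("computation still running")
    result = out.pop(t.ident)
    if isinstance(result, Exception):
        raise result
    return result

t = bg(lambda: sum(range(1000)))
print(fg(t))
```

Popping the entry in fg keeps out from growing, and avoids stale results if the OS later reuses a thread ident.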

Technologicat · 2020-02-12T13:09:25Z

With a208e78, everything necessary for a first release of the REPL feature should be complete.

Technologicat added the enhancement New feature or request label Dec 20, 2019

Technologicat added this to the 0.14.2 milestone Dec 20, 2019

Technologicat self-assigned this Dec 20, 2019

Technologicat mentioned this issue Dec 21, 2019

Update docs for 0.14.2 #8

Closed

Technologicat mentioned this issue Feb 9, 2020

Add InteractiveMacroConsole Technologicat/imacropy#4

Closed

Technologicat closed this as completed Feb 12, 2020
