High level API for accessing getsockname() / getpeername() #280

Open
njsmith opened this issue Aug 11, 2017 · 8 comments

Comments

@njsmith
Member

njsmith commented Aug 11, 2017

There needs to be some way to get at the information in getsockname and getpeername from the high-level Stream and Listener interfaces.
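For reference, here is a minimal sketch of what the low-level calls return today, using only the standard-library socket module on a loopback TCP connection. The helper name loopback_addresses is made up for illustration; it is not part of any existing API.

```python
import socket


def loopback_addresses():
    """Connect a TCP client to a loopback listener and return the raw address tuples."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())
            server_side, _ = listener.accept()
            with server_side:
                return {
                    "client_local": client.getsockname(),
                    "client_peer": client.getpeername(),
                    "server_local": server_side.getsockname(),
                    "server_peer": server_side.getpeername(),
                }


if __name__ == "__main__":
    for name, addr in loopback_addresses().items():
        print(name, addr)
```

Each side's local name is the other side's peer name, and for AF_INET both are plain (host, port) tuples. Other address families return differently shaped values, which is part of what a high-level API would have to paper over.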

It should involve an await, at least on streams, to support the PROXY protocol.
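To illustrate why an await is needed: with the PROXY protocol, the real peer address arrives as a text header at the start of the byte stream, so it cannot be known until some data has been read. Below is a rough sketch for the human-readable version 1 header only. The names parse_proxy_v1, read_proxy_v1 and ChunkedStream are hypothetical, and the only thing assumed about the stream is an async receive_some(max_bytes) method that returns b"" at end of stream.

```python
import asyncio

MAX_V1_HEADER = 107  # maximum v1 header length in bytes, including the trailing CRLF


def parse_proxy_v1(line: bytes):
    """Parse a v1 header line (without CRLF) into ((src, sport), (dst, dport)) or None."""
    parts = line.decode("ascii").split(" ")
    if len(parts) < 2 or parts[0] != "PROXY":
        raise ValueError("not a PROXY v1 header")
    if parts[1] == "UNKNOWN":
        return None  # the proxy could not or would not report addresses
    if parts[1] not in ("TCP4", "TCP6") or len(parts) != 6:
        raise ValueError("malformed PROXY v1 header")
    _, _, src, dst, sport, dport = parts
    return (src, int(sport)), (dst, int(dport))


async def read_proxy_v1(stream):
    """Read one header from the stream; return (addresses, leftover payload bytes)."""
    buf = bytearray()
    while b"\r\n" not in buf:
        if len(buf) >= MAX_V1_HEADER:
            raise ValueError("PROXY v1 header too long")
        chunk = await stream.receive_some(4096)
        if not chunk:
            raise ValueError("stream ended before the PROXY header was complete")
        buf += chunk
    line, _, rest = bytes(buf).partition(b"\r\n")
    if len(line) + 2 > MAX_V1_HEADER:
        raise ValueError("PROXY v1 header too long")
    return parse_proxy_v1(line), rest


class ChunkedStream:
    """Hypothetical in-memory stream used only to exercise the sketch."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    async def receive_some(self, max_bytes=None):
        return self._chunks.pop(0) if self._chunks else b""


if __name__ == "__main__":
    demo = ChunkedStream([b"PROXY TCP4 192.0.2.1 ", b"198.51.100.7 56324 443\r\nhello"])
    print(asyncio.run(read_proxy_v1(demo)))
```

The sketch is event-loop agnostic; asyncio.run is used only to drive the demo. A real implementation would also need to handle the binary version 2 header and hand the leftover bytes back to whoever reads the stream next.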

What should we do with SSLStream/SSLListener? There's a conceptual problem: the stream they represent doesn't exactly have an address, it's "connect to the transport's address AND THEN wrap the thing in an SSLStream". But then, it's not like getsockname/getpeername has any real meaning for streams anyway. If you have a connected socket it's useful to be able to ask it what the two halves of the TCP 5-tuple look like, but you can't actually do anything with the answer, and it doesn't satisfy any invariants beyond "this might be useful for forensics".

So maybe pragmatics beats purity and SSLStream should just proxy this through to the underlying transport, and we'll leave the user to figure out how to decipher it.

Maybe a SocketAddress object that has family and sockaddr fields, plus possibly host, port, path, etc., as appropriate for the particular address?

@njsmith
Member Author

njsmith commented Aug 14, 2017

Small wrinkle to watch out for with AF_UNIX sockets: #279 (comment)

@njsmith
Member Author

njsmith commented Jul 1, 2018

We'll also want to think about this: https://bugs.python.org/issue32221

tl;dr: in some versions of Python, getsockname (etc.) on IPv6 addresses resolves any scopeid and appends it to the first element of the tuple. This turned out to be super slow (and it's on a code path that gets called in recvfrom for UDP packets), so they switched to not doing it. The scopeid information is still available at the end of the 4-tuple, but at a minimum this means cross-version differences in how to interpret the 0th element of the tuple.
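One way to hide that difference is to normalize the tuple before exposing it. The sketch below assumes the appended form looks like "address%zone" and that the numeric scope id in the last element of the 4-tuple is the authoritative value; the function name split_ipv6_sockaddr is hypothetical.

```python
def split_ipv6_sockaddr(sockaddr):
    """Return (host, port, flowinfo, scope_id) with any '%zone' suffix removed from host.

    Works whether or not the running Python appended the zone to the host
    string; the numeric scope id from the 4-tuple is passed through unchanged.
    """
    host, port, flowinfo, scope_id = sockaddr
    bare_host, _, _zone = host.partition("%")
    return bare_host, port, flowinfo, scope_id


if __name__ == "__main__":
    print(split_ipv6_sockaddr(("fe80::1%eth0", 8080, 0, 2)))
    print(split_ipv6_sockaddr(("fe80::1", 8080, 0, 2)))
```

Both calls yield the same normalized tuple, so code that compares or logs addresses behaves the same across Python versions.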

This also caused issues for twisted: https://twistedmatrix.com/trac/ticket/9449#9449

@njsmith
Member Author

njsmith commented Dec 8, 2019

Maybe something like:

import socket
from abc import ABC
from socket import AF_INET, AF_INET6, AF_UNIX
from typing import Any, Optional

import attr

# Empty type for docs and to indicate intention
class Address(ABC):
    pass

# auto_attribs=True is needed for attr.s to pick up the annotated fields
@attr.s(frozen=True, slots=True, auto_attribs=True)
class SocketAddress(Address):
    socket_family: socket.AddressFamily
    socket_address: Any  # the same types as returned by getsockname etc.

    @property
    def ip(self) -> str:
        if self.socket_family in [AF_INET, AF_INET6]:
            return self.socket_address[0]
        raise AttributeError("ip")

    @property
    def port(self) -> int:
        if self.socket_family in [AF_INET, AF_INET6]:
            return self.socket_address[1]
        raise AttributeError("port")

    @property
    def path(self) -> Optional[str]:  # or is it bytes?
        if self.socket_family == AF_UNIX:
            return self.socket_address
        raise AttributeError("path")

    def __str__(self):
        if self.socket_family == AF_INET:
            return f"{self.ip}:{self.port}"
        if self.socket_family == AF_INET6:
            return f"[{self.ip}]:{self.port}"  # FIXME: what about scopeid and the other thing?
        if self.socket_family == AF_UNIX:
            return f"unix://{self.path}"
        ...

class Stream:
    async def get_local_name(self) -> Optional[Address]: ...
    async def get_remote_name(self) -> Optional[Address]: ...
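The earlier "pragmatics beats purity" idea of having SSLStream proxy the call to its transport could be combined with these method signatures roughly as follows. This is only a sketch: FakeTransportStream and WrappingStream are hypothetical stand-ins, plain (host, port) tuples stand in for Address objects, and get_local_name/get_remote_name are the proposed names from this thread, not an existing API.

```python
import asyncio
from typing import Optional, Tuple

AddressTuple = Tuple[str, int]  # stand-in for the proposed Address type


class FakeTransportStream:
    """Hypothetical transport that already knows its two endpoint addresses."""

    def __init__(self, local: AddressTuple, remote: AddressTuple):
        self._local = local
        self._remote = remote

    async def get_local_name(self) -> Optional[AddressTuple]:
        return self._local

    async def get_remote_name(self) -> Optional[AddressTuple]:
        return self._remote


class WrappingStream:
    """Hypothetical stream layered over a transport, in the role SSLStream would play."""

    def __init__(self, transport_stream):
        self.transport_stream = transport_stream

    async def get_local_name(self) -> Optional[AddressTuple]:
        # No address of its own: report whatever the underlying transport reports.
        return await self.transport_stream.get_local_name()

    async def get_remote_name(self) -> Optional[AddressTuple]:
        return await self.transport_stream.get_remote_name()


async def demo():
    transport = FakeTransportStream(("127.0.0.1", 50000), ("192.0.2.10", 443))
    wrapped = WrappingStream(transport)
    return await wrapped.get_local_name(), await wrapped.get_remote_name()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the wrapper only forwards the call, the same pattern would stack through any number of layers, and a transport that learns its address late (for example from a PROXY header) still works since every hop is awaited.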

@Tronic
Contributor

Tronic commented Mar 10, 2020

I'd love to see this implemented. Paths are definitely str in Python. Just be sure to use errors="surrogateescape" whenever you encode or decode them.

@altendky
Member

Just passing by because of the AddressFamily reference (deciding how to placate pylint) but "paths" aren't always strings and there are tools that I think are for handling the decoding/encoding. https://docs.python.org/3.10/library/os.html#os.fsencode

@Tronic
Contributor

Tronic commented Oct 27, 2021

@altendky I gather that paths are always str in Python, even if they may contain arbitrary bytes rather than valid Unicode on the filesystem, and that fsencode just handles this with the surrogateescape mode I mentioned in the previous message. But agreed, using the helper function is still more appropriate for readable code.

@altendky
Member

I presume there are functions that always return paths as str but "paths" (a fairly general concept, admittedly) are not exclusively str nor even str based. It seems that surrogateescape is only the default on non-Windows platforms.

os.PathLike protocol

The method should only return a str or bytes object, with the preference being for str.

open() accepts path-likes which includes bytes.

file is a path-like object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped.

os.fsencode() and os.fsdecode() link to "filesystem encoding and error handler", which in turn links to filesystem_errors:

On Windows: use "surrogatepass" by default, or "replace" if legacy_windows_fs_encoding of PyPreConfig is non-zero.

On other platforms: use "surrogateescape" by default.
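A small sketch of what relying on the helpers looks like in practice. The function name normalize_unix_path is hypothetical, and the byte round trip shown in the demo assumes a POSIX platform where the error handler is "surrogateescape".

```python
import os


def normalize_unix_path(path) -> str:
    """Accept str, bytes, or an os.PathLike and return the str form of the path."""
    return os.fsdecode(path)


if __name__ == "__main__":
    raw = b"/tmp/sock-\xff"  # not valid UTF-8
    text = normalize_unix_path(raw)
    # On POSIX the undecodable byte becomes a lone surrogate and encodes back losslessly.
    print(repr(text), os.fsencode(text) == raw)
```

Using os.fsdecode and os.fsencode instead of hard-coding errors="surrogateescape" leaves the choice of error handler to the platform, which matters given the Windows defaults quoted above.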

@Tronic
Contributor

Tronic commented Oct 27, 2021

Ok, I stand corrected.


3 participants