Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address the issues uncovered by running Npgsql tests #788

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

drdrsh
Copy link
Collaborator

@drdrsh drdrsh commented Sep 1, 2024

In #593, Jelte notified us that running Npgsql test suite against PgCat's prepared statements results in some failures.

Running the tests uncovered the following issues

Handling DISCARD ALL / DEALLOCATE ALL

The current implementation of Prepared statements in transactions does not handle these two commands correctly. When they arrive, we pass them down to the underlying server connection and that results in a desynchronization between the prepared statements that PgCat thinks are on the server connection and what is actually there. So when it gets a request to execute a prepared statement that it thinks is on the server, it passes it down and that results in query failure with error prepared statement XX doesn't exist

The fix is to reset PgCat prepared statement cache for the connection when we get these calls.

Handling cases where client expects a Bind Complete response but gets a Parse Complete

When the client sends a Parse statement to create a named prepared statement, it expects that it can send Bind messages and they would work.

Under transaction mode, the client is likely to checkout server connections that don't have the prepared statement. The way PgCat handles this is to send a Parse statement under-the-hood and then pass the Bind with some changes to the naming scheme.

When we send Parse then Bind, we get back Parse Complete and Bind Complete from the server. Now, we currently forward both messages to the client. This is not the accurate response because the client only sent a Bind so it expects to get a Bind Complete not Parse Complete followed by Bind Complete

The Npgsql test suite is pretty strict and it catches the edge case and fails the test. Most clients seem to be okay with this.

Fixing this will require some big refactor to keep track of what messages we shouldn't pass down to the client. I won't pursue a fix for this issue in this PR

Misinterpreting the the number of parameters on Parse for very large parameter list

This one has to do with Parse message

Parse (F)
    Byte1('P')
        Identifies the message as a Parse command.
    Int32
        Length of message contents in bytes, including self.
    String
        The name of the destination prepared statement (an empty string selects the unnamed prepared statement).
    String
        The query string to be parsed.
    Int16
        The number of parameter data types specified (can be zero). Note that this is not an indication of the number of parameters that might appear in the query string, only the number that the frontend wants to prespecify types for.
    Then, for each parameter, there is the following:
    Int32
        Specifies the object ID of the parameter data type. Placing a zero here is equivalent to leaving the type unspecified.

We use the Int16 value to know how big of a vector we need to allocate, however, we read that value in an i16 not u16 so while the limit on the number of parameters should be 65k, we would get a negative number if we get more than 32767 parameters and plugging that parameter into an allocation operation results in a panic. Kudos to Npgsql team for testing this edge case.

@drdrsh
Copy link
Collaborator Author

drdrsh commented Sep 1, 2024

@zainkabani @levkk I would appreciate your eyes on this one.

@drdrsh
Copy link
Collaborator Author

drdrsh commented Sep 7, 2024

I have been at this for some time. Documenting my progress here.
The failure I am getting now on Npgsql test suite is

Npgsql.NpgsqlException : Received backend message BindComplete while expecting ParseCompleteMessage. Please file a bug.

I ran a packet sniffer and did a side-by-side comparison between PgCat and Pgbouncer
Screenshot from 2024-09-07 17-25-51

PgCat prepares the statement in an earlier query so to ensure the statement exists on the server, it then sends a Bind message to server instead of Parse => Bind sequence (as we have already sent a Parse to the server and discarded the ParseComplete response). This meant the client would get a BindComplete message without a ParseComplete.

@drdrsh
Copy link
Collaborator Author

drdrsh commented Sep 7, 2024

Pgbouncer doesn't do anything special in this test suite because the test suite uses unnamed prepared statements for the initial segment of the test. The test is not doing anything exotic like sending Parse then Bind then Sync and then attempting to use the unnamed statement in a subsequent call that may end on another server connection and break.

The tests pass when I disable caching unnamed statements in PgCat.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant