Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS1.3 WIP #2

Closed
wants to merge 9 commits into from
Closed

TLS1.3 WIP #2

wants to merge 9 commits into from

Conversation

sam-github
Copy link
Owner

No description provided.

@sam-github sam-github force-pushed the tls1.3-wip branch 10 times, most recently from 2487c32 to 393f932 Compare February 14, 2019 23:02
Allow enabling/disabling 1.3, as well as disabling 1.2.
What appears to be happening is that `SSL_write()` is writing the 5
bytes of `hello`, but it isn't being flushed out as encrypted text into
the BIO. This might be because its in the middle of SSL code where the
info callback is called, but its about to send the NewTickets, so the
data is just buffered briefly before encrypting it out in an appdata
record. Just a guess.

Since there is nothing to flush, the code wanders down to EncOut(),
where it sees there is nothing to write, so it calls InvokeQueued, which
does a sync w->Done(), in violation of the DoWrite() contract.

XXX There is an open question of exactly when to call it. Sync is
illegal, SetImmediate() works, but it seems to me it shouldn't be called
until the data that went into SSL_write() comes out of EncOut() and into
the next stream, and the next stream callbacks, that Done() should
happen.  Tried to make the latter change, but it makes no difference to
the unit tests, so I've no idea if its right or not.
Code passes with the work-around, but it's a bug.  Not sure how to fix.

Calling SSL_read after after a close_notify is a bit dicy, but might
have to do that, but make sure that only 'session' events occur (ie.,
discard data and anything else). Seems subtly wrong.

Have asked openssl list how this is supposed to work.

This only happens with servers that .end() sync with no data in the
secureConnection event, and maybe those servers don't deserve ticket
support? Not sure.
Sometimes it does, sometimes it doesn't, I'm not sure if this expected
streams behaviour, or not.

Compare this test with test/parallel/test-tls-invoke-queued.js.
This is a bug, it isn't supposed to happen, but how to prevent it?

When TLS finishes the DoWrite the underlying stream is closed, so the
stream_resource()->Write() in EncOut() fails with write-after-destroy.

See test/parallel/test-tls-destroy-stream.js, and its inline comments.
@addaleax
Copy link

test/parallel/test-tls-net-socket-keepalive.js is a good test to look at, I think, Its short and pretty self-contained, and exhibits the kind of problem that are breaking multiple tests. If this was fixed, I think a lot of things would go green.

@sam-github Is that still the current state? I have time to look into this now and this test passes for me :)

I am seeing other failures, though, and I’ll look at the source code changes here and try to understand them.

void SecureContext::SetCiphers(const FunctionCallbackInfo<Value>& args) {
#if 1
#define fprintf(...)
#endif

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: Debug() from debug_utils.h might have something – it works out-of-the-box for async wraps, since they already have a limited set of names, but it should be easy enough to add other categories?

@@ -214,6 +214,8 @@ class StreamResource {
// Return 0 for success and a libuv error code for failures.
virtual int DoTryWrite(uv_buf_t** bufs, size_t* count);
// Perform a write of data, and either call req_wrap->Done() when finished
// call when? can we call sync? "call Done() and return" suggests
// first Done, then return, but probably not right

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We’re not strict about this – basically, your options are:

  • return 0. This means you can call Done() at any point, which is almost always asynchronous but might not need to be.
  • return an error code.

@sam-github
Copy link
Owner Author

current state is described in the last few commits, the XXX ones, I tried to comment the test cases themselves, the commits might or might not be informative

I will check, but I'm pretty convinced that returning calling ->Done() then returning 0 from DoWrite() is invalid. The request.cb = callback hasn't happened at that point in internal/stream_base_(I forget).js.

You can confirm by removing the SetImmediate wrapper (its commented) in EncOut().

I've some docs locally, just about half an hour away from being done them at which point I'll pr it to node, though I won't have time to look at it until this evening.

The last outstanding problems are pretty well defined:

  • when to call Done()? current code working, though I suspect it should not call Done() until data has been written to underlying_stream(), AND underlying_stream() has callbacked;
  • the write(); destroy() triggering write-after-delete error (baffled, see comments)
  • close_notify coming on TLS before the two session tickets for servers that end() with no data in the handshake complete callback, which I think I understand, not sure I care about (happens a lot in tests, but real TLS server's don't do this), but am considering ways to fix

@addaleax
Copy link

I will check, but I'm pretty convinced that returning calling ->Done() then returning 0 from DoWrite() is invalid. The request.cb = callback hasn't happened at that point in internal/stream_base_(I forget).js.

The .oncomplete callback should be set as part of createWriteWrap(), so it should always be at least set. That doesn’t have to mean that it’s a good idea to call it synchronously.

You can confirm by removing the SetImmediate wrapper (its commented) in EncOut().

I’ll try that.

I've some docs locally, just about half an hour away from being done them at which point I'll pr it to node, though I won't have time to look at it until this evening.

Sounds good!

  • when to call Done()? current code working, though I suspect it should not call Done() until data has been written to underlying_stream(), AND underlying_stream() has callbacked;

Yup, you’re right. The TLS code has always been weird about this, though, I think.

  • the write(); destroy() triggering write-after-delete error (baffled, see comments)

Yeah, I’ll take a look at that.

  • close_notify coming on TLS before the two session tickets for servers that end() with no data in the handshake complete callback, which I think I understand, not sure I care about (happens a lot in tests, but real TLS server's don't do this), but am considering ways to fix

I think this is something you understand better than me :)

@addaleax
Copy link

What I’m getting for the write(); destroy(); thing is this:

assert.js:86
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
+ actual - expected

+ ''
- 'hello world! gosh'
    at TLSSocket.<anonymous> (/home/sqrt/src/node/master/test/parallel/test-tls-invoke-queued-multi.js:64:14)
    at TLSSocket.<anonymous> (/home/sqrt/src/node/master/test/common/index.js:367:15)
    at TLSSocket.emit (events.js:202:15)
    at endReadableNT (_stream_readable.js:1130:12)
    at processTicksAndRejections (internal/process/next_tick.js:76:17)

That seems kind of expected to me?

@sam-github
Copy link
Owner Author

sam-github commented Feb 18, 2019

Compare that one to this one:

% ./node test/parallel/test-tls-destroy-stream.js
net server connection
net conn data
net conn unshift data
server on secureConnection TLSv1.3
server socket.write()
server socket.destroy()
server on error: Error [ERR_STREAM_DESTROYED]: Cannot call write after a stream was destroyed                                                                                
    at doWrite (_stream_writable.js:411:19)
    at clearBuffer (_stream_writable.js:545:7)
    at DuplexSocket.Writable.uncork (_stream_writable.js:319:7)
    at JSStreamSocket.doWrite (internal/js_stream_socket.js:162:17)
    at JSStream.onwrite (internal/js_stream_socket.js:20:57)
    at DuplexSocket.ondata (internal/js_stream_socket.js:64:22)
    at DuplexSocket.emit (events.js:197:13)
    at addChunk (_stream_readable.js:289:12)
    at readableAddChunk (_stream_readable.js:270:11)
    at DuplexSocket.Readable.push (_stream_readable.js:225:10)
server write cb, err: undefined
server on close: hadError? false
Mismatched <anonymous> function calls. Expected exactly 1, actual 0.

EDIT: ----^ this is one of the failures that leaves me stumped. I cannot find any sign that tlsSocket.destroy() calls ANY C++ method of TLSWrap. I believe it is either bypassing it, and closing the underlying stream, or that .destroy() is calling some base class method that I haven't found yet, and that is closing the underlying stream. Either way, its broken, and until I can find the code, I cannot even guess as to why its different for TLS1.3 when it works for TLS1.2.

@sam-github
Copy link
Owner Author

Btw, its a holiday weekend here, and I'm alone with my kids, so apologies in advance, my responsiveness will be a bit limited, though I'm trying to keep an eye on email. Back at my desk Tuesday.

@sam-github
Copy link
Owner Author

Edit, sorry. The above is a problem, but in f978966, I suggest comparing it to test/parallel/test-tls-invoke-queued.js. Note that in that particular configuration:

  c.write('hello ', null, common.mustCall(function() {
    c.write('world!', null, common.mustCall(function() {
      c.destroy();
    }));
    // Data on next _write() will be written, and the cb will still be invoked
    c.write(' gosh', null, common.mustCall());
  }));

ALL three writes get to the client, but with the -multi variant:

  c.write('hello ', null, common.mustCall());
  c.write('world!', null, common.mustCall());
  c.write(' gosh', null, common.mustCall());
  c.destroy();

NONE of the writes get to the client. I dunno, maybe that makes sense? The ordering difference between the last write and the destroy is different, but its very subtly different. As I say in the commit/comment for the -multi.js test, maybe it makes sense, maybe not, I'm not sure.

I am sure that absolutely no amount of reading of the streams docs would give anyone a to predict the behaviour of either test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants