Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buffer: auto random fill Buffer(num) and new Buffer(num) #11808

Closed
wants to merge 1 commit into from

Conversation

jasnell
Copy link
Member

@jasnell jasnell commented Mar 12, 2017

@addaleax @nodejs/ctc

This is an alternate approach for #11806 that uses fill() to fill the new Buffer(num)/Buffer(num) after allocation. The changes are less disruptive but occur in two steps. The performance profile is a bit better.

This also includes the same --pending-deprecation addition that is in #11806 and the conditionally-emitted deprecationwarning that is off by default. The --pending-deprecation stuff has been separated out into a separate PR: #11968

Again, this is a work in progress

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

buffer

@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. labels Mar 12, 2017
@jasnell jasnell changed the title Edit buffer: auto random fill Buffer(num) and new Buffer(num) *alternate buffer: auto random fill Buffer(num) and new Buffer(num) *alternate Mar 12, 2017
@jasnell jasnell added wip Issues and PRs that are still a work in progress. semver-major PRs that contain breaking changes and should be released in the next major version. labels Mar 12, 2017
@ChALkeR
Copy link
Member

ChALkeR commented Mar 12, 2017

@jasnell, note that #7152 had some prototype changes to avoid breaking valid usecases. Is that concern still applicable? /cc @seishun

Copy link
Member

@ChALkeR ChALkeR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently prints a warning on Buffer.alloc(10).map(x => x + 1).

It shouldn't do that. (Perhaps include that as a testcase?)

@seishun had some work-around in #7152, but I didn't check that one.

@jasnell jasnell changed the title buffer: auto random fill Buffer(num) and new Buffer(num) *alternate buffer: auto random fill Buffer(num) and new Buffer(num) Mar 12, 2017
@seishun
Copy link
Contributor

seishun commented Mar 12, 2017

@jasnell Could you clarify what makes this less disruptive than the other one?

@jasnell
Copy link
Member Author

jasnell commented Mar 12, 2017

@ChALkeR ... thanks for the pointer on that, I had remembered seeing it before but couldn't remember where. I've incorporated the workaround and the test. PTAL

@seishun rightfully mentioned in the other alternate PR (which I've closed in favor of this one) that is is a breaking change for performance critical apps that are still using the Buffer(num) constructor. Accordingly, we may need to delay landing.

@seishun ... it's less disruptive in terms of the code changes for the zero-fill (only touches buffer.js, should only give perf impact to Buffer(num) case, etc). The other PR modified the Allocator instance inside node.cc which had an impact on all allocation methods (it measurably slowed down Buffer.allocUnsafe() for instance).

@seishun
Copy link
Contributor

seishun commented Mar 12, 2017

Benchmark results so far:

                                                             improvement confidence      p.value
 buffers\\buffer-creation.js n=1024 len=10 type="buffer()"      -52.16 %        *** 2.748737e-36
 buffers\\buffer-creation.js n=1024 len=1024 type="buffer()"    -55.28 %        *** 2.549842e-32
 buffers\\buffer-creation.js n=1024 len=2048 type="buffer()"    -53.65 %        *** 5.754643e-34
 buffers\\buffer-creation.js n=1024 len=4096 type="buffer()"    -17.88 %        *** 1.982145e-22
 buffers\\buffer-creation.js n=1024 len=8192 type="buffer()"    -48.78 %        *** 2.144261e-44

compare-plot

(Command line: node benchmark/compare.js --old node-master.exe --new node-jasnell.exe --set type=buffer() --filter buffer-creation buffers > compare-jasnell.csv)

IMO unless we can make the results significantly better (and on every supported platform), this warrants a full deprecation cycle.


process.on('warning', common.mustNotCall((warning) => {}));

Buffer.alloc(10).map((i) => {});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: it won't hurt to test other methods here, e.g. Buffer.alloc(10).filter(() => true), which also would have been affected if not for the fix above.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 14, 2017

@jasnell Could I ask you to split this into two separate PRs for a separate review?

Opt-in --pending-deprecation is certainly needed and looks uncontroversial (without scheduling a Buffer(arg) runtime-deprecation), but random-fill has at least two different concerns that might need discussion.

@jasnell
Copy link
Member Author

jasnell commented Mar 14, 2017

Yes, splitting it into two separate PRs is fine.

Just to make sure I'm clear, that's:

  1. One PR that introduces only the --pending-deprecation flag
  2. One PR that adds the random-fill and the pending deprecation for Buffer().

Correct?

@ChALkeR
Copy link
Member

ChALkeR commented Mar 14, 2017

@jasnell, No, I meant the other way around:

  1. One that introduces the flag and does the pending-deprecation for Buffer(num) (and perhaps some other things that we have good reasons to warn about, if needed).
  2. One that does the random-fill.

The opt-in flag for Buffer(arg) should be pretty much uncontroversial.

Selects a random byte and fills Buffer(num) and new Buffer(num)
automatically using the randomly selected value.

Why random-fill vs. zero-fill?

Prior to this commit, the uninitialized Buffer allocation
would be potentially filled with non-zeroed data. By filling
with a randomly selected value, we eliminate the possibility
of leaking uninitialized data but we preserve the need for
users to completely over-write the Buffer contents in order
for it to be useful. Zero-filling by default would
*potentially* put users at risk if module developers assume
that the Buffer instance will always be zero-filled (which
will not be the case on older versions of Node.js).

The cost, however, is that filling with a randomly selected
value is a bit less performant than zero-filling on allocation.

Note that Buffer.allocUnsafe() and Buffer.alloc() are *not*
affected by this change. Buffer.allocUnsafe() will returns
a Buffer with uninitialized data and Buffer.alloc() returns
*zero-filled* data.
@jasnell
Copy link
Member Author

jasnell commented Mar 21, 2017

The PRs have been split

@jasnell
Copy link
Member Author

jasnell commented Mar 23, 2017

For the record, I will say that I am definitely not yet convinced that random-fill is the right approach to take here. It may be easier and better just to zero-fill from 8.0.0 and on.

@ChALkeR
Copy link
Member

ChALkeR commented Mar 25, 2017

I'm not convinced about this for reasons that I already mentioned.

Currently, I'm -1 on this in its default form (just landing this without any additional actions).

I'm +0 on this if this would be paired with a strong commitment to runtime-deprecating without a flag in the next version after the random/zero-fill lands. That is for the case if this won't get backported.

I'm also -1 to a partial backport, without landing a runtime deprecation at the same time.

I'm +1 to landing zero/randomfill at the time when a runtime deprecation is landed, with or without partial or full backport at the time the version that will contain the deprecation will be released.

Given the current situation, I propose to just postpone the zero/randomfill decision to 9.0 (because we probably won't land a runtime deprecation by default in 9.0) and observe the ecosystem usage.


To explain: I am afraid that this could make things worse if we come to a situaton when most current releases that people would care about (e.g. N.0.0 and N-1.x.0) will have the zero/random-fill, and still supported or recently released N-2.y.z or N-1.x-1.q won't, while still being popular, and no actual runtime deprecation will be scheduled — so nothing actually would stop module authors from using Buffer(arg) in new code (the doc deprecation doesn't work, judging by usage stats).

@jasnell
Copy link
Member Author

jasnell commented Mar 25, 2017

@nodejs/ctc ... Given that we have not been able to reach consensus on this, I'm going to call for an official vote at the next CTC meeting.

The vote options are:

  1. Zero-fill new Buffer(num) / Buffer(num) by default starting in 8.0.0
  2. Random-fill new Buffer(num) / Buffer(num) by default starting in 8.0.0
  3. (1) with Backport to 7.x, 6.x and 4.x
  4. (2) with Backport to 7.x, 6.x and 4.x
  5. Do nothing at this time, revisit in 9.0.0 timeframe

Please weigh in here in advance of the CTC meeting if possible. I'd like to have a vote on the next call.

@jasnell
Copy link
Member Author

jasnell commented Mar 25, 2017

My first choice is 1. Second choice would be 3.

@mscdex
Copy link
Contributor

mscdex commented Mar 25, 2017

Why no runtime deprecation vote option?

@jasnell
Copy link
Member Author

jasnell commented Mar 25, 2017

@mscdex ... handling runtime deprecation separately: #11968

@Trott
Copy link
Member

Trott commented Mar 25, 2017

@mscdex ... handling runtime deprecation separately: #11968

@jasnell I'm generally onboard with that approach but in this case @ChALkeR (and maybe others?) on CTC have made it explicit that they can only support zero-filling if there is a commitment to deprecation.

Maybe a good approach is to resolve the "are we going to commit to a deprecation timeline or not" issue first, and then based on that outcome, people can make an informed vote on this?

@seishun
Copy link
Contributor

seishun commented Mar 26, 2017

My humble proposal: start the vote from the question whether a ~50% performance hit to an API should be preceded by a full deprecation cycle.

(I think it should be because that way practically everyone is warned about the impending change and has time to migrate, rather than having their application suddenly get slower after upgrading Node.js)

const { compare: compare_, compareOffset } = binding;
const { isArrayBuffer, isSharedArrayBuffer, isUint8Array } =
process.binding('util');
const bindingObj = {};
const internalUtil = require('internal/util');
const zeroFillBuffers = !!config.zeroFillBuffers;
const randomFill = zeroFillBuffers ? 0 : Math.floor(Math.random() * 256);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not gather crypto.randomBytes() if crypto is available?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we only need a single byte.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and there's no need for it to be cryptographically safe)

@@ -102,7 +105,7 @@ function Buffer(arg, encodingOrOffset, length) {
'If encoding is specified then the first argument must be a string'
);
}
return Buffer.allocUnsafe(arg);
return Buffer.allocUnsafe(arg).fill(randomFill);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing this seems... slow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it is. I went with this approach here because it requires the least amount of code. It is more performant to do a memset in the Allocator inside node.cc, but doing so is a bit more intrusive and introduces a branch that impacts all Allocations (albeit, an extremely minor one).

I maintain that the best thing to do would be to just use calloc and zero-fill instead.

@sam-github
Copy link
Contributor

I don't have an opinion on whether doing this at all is a good idea, but I've had nightmarish experiences debugging systems where someone had the clever idea of randomizing uninitialized memory "so no one would rely on its value". It made bugs manifest randomly. When something is buggy, it should ideally bug out all the time, every run, same place. Not randomly. Please don't make it random.

@jasnell
Copy link
Member Author

jasnell commented Mar 31, 2017

The @nodejs/ctc has voted against the random fill approach in nodejs/CTC#90,
and is currently leaning towards zero-fill in 8.0.0 and forward: nodejs/CTC#89

@jasnell jasnell closed this Mar 31, 2017
@Trott Trott removed the ctc-agenda label Apr 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. semver-major PRs that contain breaking changes and should be released in the next major version. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants