-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Binary Xml (XmlDictionaryWriter) performance #71478
Conversation
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlStreamNodeWriter.cs
Outdated
Show resolved
Hide resolved
@@ -624,19 +604,19 @@ public unsafe override void WriteText(string value) | |||
} | |||
} | |||
|
|||
public unsafe override void WriteText(char[] chars, int offset, int count) | |||
private unsafe void WriteText(Span<char> chars) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try using ReadOnlySpan but then it looked like to many methods called this overload instead of the string version so I used span instead to avoid extra string allocations.
However it might be possibly to merge this implementation with the string based method if using ReadOnlyMemory instead since it allows trying to retrieve the original string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
many methods called this overload instead of the string version
That does not sound right. If you have a string and there is a string overload, the compiler should pick the string overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect might have been a VisualStudio bug, but I did not want to take any chances in case it had anything to do with one method being an override or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the code, will have to decompile the code and have a second look. Visual studio (17.2) still reports many references to the ReadOnlySpan version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this surprises me:
SharpLab
For some reason the WriteText(string) overload being an override is causing it to lose out to the span-based overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I solved this by creating a separate "implementation metod" which is not an override (I called it WriteTextImpl
to follow what seems to be the current naming convention for implementation methods)
@@ -72,6 +75,12 @@ private void WriteTextNode(XmlBinaryNodeType nodeType) | |||
_textNodeOffset = this.BufferOffset - 1; | |||
} | |||
|
|||
private ref byte GetBufferRef(int size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code was using safe C# with proper boundschecks. You are switching this to use unsafe code without boundschecks. It is increasing security risk profile of this code.
Can this stay with safe C# and Spans? You should be still able to get nice performance gains by doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes there was quite much unsafe code and points present in the file so I assumed it was ok to be unsafe for the best performance sine GetBuffer does the bounds check.
I've refactored the code so that all new "unsafe" code is now only present in 2 methods to make it much easier to reason about and updated the benchmark with timings for both Unsafe and Span.
The span version does .AsSpan(offset, size)
on the buffer and then uses MemoryMarshal.Write<T>
which I just foud out about.. I hope that method is considered safe enough and works well with "unaligned writes".
Please have a look and let me know based on the new code and the benchmark results which version you want.
I can push the updated span version instead if that version is the prefered one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: A side effect of moving to the span version is that it will probably be necessary to keep separate implementationa for writing 8bit (and maybe 16bit) nodes in order to not regress performance for them.
Some benchmarks would be required to determine the performance numbers for smaller elements first
4d56543
to
d1e850b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there sufficient test coverage to fully exercise all of these changes and root out any corner-case issues?
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Stephen Toub <stoub@microsoft.com>
src/libraries/System.Private.DataContractSerialization/src/System/Xml/XmlBinaryWriter.cs
Outdated
Show resolved
Hide resolved
I think I fixed all the review comments for now so I think next up it to determine what to do with tests.
@stephentoub I am a bit unsure about that.
I can look into adding a new tests I found some tests for the text version of dictionary writer I could write some tests for the binary writer (write primitive types and some arrays ?). update: I've added a couple of testcases so I think relevant scenarios are covered. |
5da3b9e
to
963315d
Compare
I've added a couple of tests that I think cover relevant cases. I was a bit unsure how to handle any potential big endian platform since it is only handled for single integers so the array test as it is now will fail on big endian platforms. it can easily be made to pass but it will not make the binary xml the same for different endiannes (this pr does not make any changes to the representation apart from the fix for floats described in the description). |
...zation.Xml/tests/ReflectionOnly/System.Runtime.Serialization.Xml.ReflectionOnly.Tests.csproj
Outdated
Show resolved
Hide resolved
…nOnly/System.Runtime.Serialization.Xml.ReflectionOnly.Tests.csproj
The test will give problem on big-endian architectures just as @uweigand found out #73332 (comment) I can take a look and add a commit to this PR (maybe even this week) to try to fix the implementation for big endian architectures if you think it is a good idea? @jkotas @mconnew If you prefer this PR as is without any behavioral changes is there a recommended way of skipping the "problematic" test on big endian architectures? |
@StephenMolloy and @mconnew can you please take a look? |
Test failure appears to be unrelated. #64227 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Improves performance the binary XmlDictionaryWriter mainly by using Span and Unsafe
Other:
Benchmarks
"Before" is the official net7 preview.5 bits (with r2r) while After is local release build of System.Xml and "System.Private.DataContract..* dll.
*Benchmarks Code:*
Remarks:
*
The span and unsafe version ofWriteGuid
andWrite...Array
are identical so I do not know why there is a timing differecnce.