Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefer FMT_COMPILE for string formatting in VtRenderer #10426

Merged
13 commits merged into from
Jun 22, 2021

Conversation

skyline75489
Copy link
Collaborator

@skyline75489 skyline75489 commented Jun 15, 2021

Kill WriteFormattedString and replace it with fmt::format_to to avoid expensive string operations in VtRenderer.

This saves ~8% of the CPU time.

Inspired by #10362 (comment)

Co-authored-by: Leonard Hecker lhecker@microsoft.com

@skyline75489
Copy link
Collaborator Author

Before:

image

After:

image

src/inc/til/format.h Outdated Show resolved Hide resolved
@github-actions
Copy link

github-actions bot commented Jun 15, 2021

@check-spelling-bot Report

Unrecognized words, please review:

  • stdint
Previously acknowledged words that are now absent hpcon serializers testnetv Tlg Xpath
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the git@github.com:skyline75489/terminal.git repository
on the chesterliu/dev/dirty-formatting branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spelling/expect/alphabet.txt
.github/actions/spelling/expect/expect.txt
.github/actions/spelling/expect/web.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect/813f385c08ae3b10ae782d3b0116ba352ca6fda5.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
  --header "Content-Type: application/json" \
  "https://api.github.com/repos/microsoft/terminal/issues/comments/861148011" > "$comment_json"
comment_body=$(mktemp)
jq -r .body < "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")
  

patch_add=$(perl -e '$/=undef;
$_=<>;
s{<details>.*}{}s;
s{^#.*}{};
s{\n##.*}{};
s{(?:^|\n)\s*\*}{}g;
s{\s+}{ }g;
print' < "$comment_body")
  
update_files
rm $comment_body
git add -u
✏️ Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

⚠️ The command is written for posix shells. You can copy the contents of each perl command excluding the outer ' marks and dropping any '"/"' quotation mark pairs into a file and then run perl file.pl from the root of the repository to run the code. Alternatively, you can manually insert the items...

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

🔬 You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. 😉

🗜️ If you see a bunch of garbage

If it relates to a ...

well-formed pattern

See if there's a pattern that would match it.

If not, try writing one and adding it to a patterns/{file}.txt.

Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

Note that patterns can't match multiline strings.

binary-ish string

Please add a file path to the excludes.txt file instead of just accepting the garbage.

File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

@skyline75489
Copy link
Collaborator Author

Breakdown traces shows that the string formatting method StringCchVPrintfExA is quite expensive, considering we only need it for simple integer formatting:

image

@miniksa
Copy link
Member

miniksa commented Jun 15, 2021

I like the idea because you're right, this is a pain point. But I thought that fmt::format was already capable of something like this that should be significantly faster. Can we just use that?

@skyline75489
Copy link
Collaborator Author

Dustin suggested FMT_COMPILE so I tried that. It’s still can be seen as more of a general purpose formatting method. comparing to the original code (10% CPU time), FMT_COMPLIE is faster(~5%), but not the fastest. This PR I believe is the fastest(< 2%), because it is the dirtiest and more specialized.

@skyline75489
Copy link
Collaborator Author

One thing I learned while writing this PR is that even the construction of std::string is reasonably expensive when used very frequently. This is why (I believe) FMT_COMPILE is dragged behind in performance.

In this PR I managed to do zero heap allocation & zero objection construction to minimize the memory footprint & CPU cycles.

@DHowett
Copy link
Member

DHowett commented Jun 15, 2021

fmt::format still uses a dynamically-allocated buffer.

Can you try...

	char buf[16];
	auto end = fmt::format_to(std::begin(buf), FMT_COMPILE("\x1b[38;2;{};{};{}m"), arg1, arg2, arg1);

? This lets us format into a buffer. :P

@skyline75489
Copy link
Collaborator Author

Thanks @DHowett for informing me of fmt::format_to. It looks amazing:

image

@skyline75489 skyline75489 changed the title [DRAFT] Prefer dirty & fast formatting in VtRenderer Prefer FMT_COMPILE for string formatting in VtRenderer Jun 16, 2021
@skyline75489
Copy link
Collaborator Author

I'm marking this "ready for review". I have two lingering questions:

  1. Is FMT_COMPILE safe for more types of formatting? In this PR it only applies to strings that require integer arguments.
  2. Moreover, is there any place else I can apply the same technique to boost the performance?

@skyline75489 skyline75489 marked this pull request as ready for review June 16, 2021 04:21
Copy link
Member

@lhecker lhecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_WriteFormattedString takes a std::string as its first argument.
While this doesn't allocate anything thanks to SSO, it feels weird to me that we do it that way when you can just... you know... pass the char pointer directly lol.

So if you feel like it, it would be nice if the first argument of _WriteFormattedString
was changed to _Printf_format_string_ STRSAFE_LPCSTR IMO. (Just IMO! 😄)

src/renderer/vt/VtSequences.cpp Outdated Show resolved Hide resolved
src/renderer/vt/VtSequences.cpp Outdated Show resolved Hide resolved
@ghost ghost added Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something and removed Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something labels Jun 16, 2021
@github-actions
Copy link

github-actions bot commented Jun 17, 2021

@check-spelling-bot Report

Unrecognized words, please review:

  • MSVC
Previously acknowledged words that are now absent ADDB hpcon serializers testnetv Tlg Xpath
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the git@github.com:skyline75489/terminal.git repository
on the chesterliu/dev/dirty-formatting branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spelling/expect/alphabet.txt
.github/actions/spelling/expect/expect.txt
.github/actions/spelling/expect/web.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect/b3b648496e92d14e88bbede1b2c0b63e9d56b599.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
  --header "Content-Type: application/json" \
  "https://api.github.com/repos/microsoft/terminal/issues/comments/863039195" > "$comment_json"
comment_body=$(mktemp)
jq -r .body < "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")
  

patch_add=$(perl -e '$/=undef;
$_=<>;
s{<details>.*}{}s;
s{^#.*}{};
s{\n##.*}{};
s{(?:^|\n)\s*\*}{}g;
s{\s+}{ }g;
print' < "$comment_body")
  
update_files
rm $comment_body
git add -u
✏️ Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

⚠️ The command is written for posix shells. You can copy the contents of each perl command excluding the outer ' marks and dropping any '"/"' quotation mark pairs into a file and then run perl file.pl from the root of the repository to run the code. Alternatively, you can manually insert the items...

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

🔬 You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. 😉

🗜️ If you see a bunch of garbage

If it relates to a ...

well-formed pattern

See if there's a pattern that would match it.

If not, try writing one and adding it to a patterns/{file}.txt.

Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

Note that patterns can't match multiline strings.

binary-ish string

Please add a file path to the excludes.txt file instead of just accepting the garbage.

File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

@DHowett
Copy link
Member

DHowett commented Jun 18, 2021

It would fit fmt's design a bit better, true.

@lhecker
Copy link
Member

lhecker commented Jun 20, 2021

You know I was curious whether MSVC properly optimizes std::string_view and I tried:

#include <string_view>

extern int foo(std::string_view);

int test() {
    return foo("aaaaaaaa");
}

This is GCC (-O2):

.LC0:
        .string "aaaaaaaa"
test():
        mov     edi, 8
        mov     esi, OFFSET FLAT:.LC0
        jmp     foo(std::basic_string_view<char, std::char_traits<char> >)

This is MSVC (/O2 /Oi /std:c++17):

`string' DB 'aaaaaaaa', 00H         ; `string'

$T1 = 32
$T2 = 32
int test(void) PROC                                 ; test, COMDAT
$LN9:
        sub     rsp, 56                             ; 00000038H
        mov     QWORD PTR $T2[rsp+8], 8
        lea     rax, OFFSET FLAT:`string'
        mov     QWORD PTR $T2[rsp], rax
        lea     rcx, QWORD PTR $T1[rsp]
        movaps  xmm0, XMMWORD PTR $T2[rsp]
        movdqa  XMMWORD PTR $T1[rsp], xmm0
        call    int foo(std::basic_string_view<char,std::char_traits<char> >) ; foo
        add     rsp, 56                             ; 00000038H
        ret     0
int test(void) ENDP                                 ; test

Why does MSVC allocate 56 bytes on the stack for, to call a function that takes two parameters as a struct. And am I seeing this correctly? movaps? The instruction for moving floating point numbers? I'm doing something wrong here, right? Right? ._.

@lhecker
Copy link
Member

lhecker commented Jun 20, 2021

Windows x64 calling convention passes structs by reference if they're larger than 8 bytes. 😑
See here: https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160#parameter-passing

But I just can't get over this in particular:

        movaps  xmm0, XMMWORD PTR $T2[rsp]
        movdqa  XMMWORD PTR $T1[rsp], xmm0

The compiler copies 16 bytes from rsp+32 (the beginning of the stack + 32 bytes) to xmm0, and then copies 16 bytes back to rsp+32, because $T1 and $T2 are the exact same. This operation is entirely pointless! And has a high latency! Wat. Why would you do that MSVC? 😐

This bit of assembly is not generated if you pass the std::string_view by reference rather than by value
Honestly I cannot actually believe that it generates more efficient code then. This goes against any expectation I had for how the Windows x64 calling convention potentially works, coming from UNIX.
Should we replace all places that use (std::string_view) with (const std::string_view&)? The answer is probably: Yeah, unfortunately.

Edit: Yep, I benchmarked it and passing a std::string_view by reference is significantly faster (1.9ns vs 1.2ns).

@skyline75489 skyline75489 added Area-Performance Performance-related issue Area-VT Virtual Terminal sequence support labels Jun 21, 2021
Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm impressed with how minimally-invasive the changes are for the impact it gives us. I'm just blocking over the stateMachine changes 😄 since they seem like they belong in #10471.

@ghost ghost added Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something and removed Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something labels Jun 21, 2021
@lhecker
Copy link
Member

lhecker commented Jun 22, 2021

FYI @skyline75489: fmt v8 was released today and contains the append optimizations I mentioned earlier.
I guess we should wait a bit see if any bugs show up and then migrate to it. 🙂
Even if performance shouldn't change (which is relatively likely), we might see a reduction in binary size.

@skyline75489
Copy link
Collaborator Author

Oh yeah now that you mentioned it, there's slight binary size increase caused by this. I thin it's worth is, considering the performance boost. Hope it does not cause trouble regarding inbox requirement.

@lhecker
Copy link
Member

lhecker commented Jun 22, 2021

@skyline75489 How are you measuring the binary size? I'm building Host.EXE and I'm seeing no size changes between main and your branch. In fact it's almost 1kB smaller, but that isn't really anything significant, when OpenConsole is 1.104MB.

@skyline75489
Copy link
Collaborator Author

Huh, strange. I'm seeing main (1102 KB) and my branch (1107KB). Release x64 Build of OpenConsole.exe

@skyline75489
Copy link
Collaborator Author

Maybe you want to pull this branch again. I rebased master just minutes ago,

@DHowett DHowett added the AutoMerge Marked for automatic merge by the bot when requirements are met label Jun 22, 2021
@ghost
Copy link

ghost commented Jun 22, 2021

Hello @DHowett!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@DHowett
Copy link
Member

DHowett commented Jun 22, 2021

Thank you, @skyline75489!

@ghost ghost merged commit 4f0b57e into microsoft:main Jun 22, 2021
@skyline75489
Copy link
Collaborator Author

Thanks @mmozeiko for the original inspiration of this PR. I'm sorry that this does not give us 3X performance boost (as @jfhs
described in #10362 (comment)), because the original code is flawed. But nevertheless, ~8% CPU usage is still impressive. I've had PRs that saves ~3% of CPU and I was excited about it.

This change should be shipped in the next version of Windows Terminal (OpenConsole). But unfortunately it won't be shipped soon in Windows itself (as in conhost), because of the shipping schedule of Windows is much more complicated. I don't like this, either. But I can live with it.

And I'd like to point out that I am NOT part of the terminal dev team. I'm a MS employee but I work for Azure now. I'm just contributing to this project as an outside contributor since the year 2019 (we call it "moonlighting" inside MS). And I joined MS after that at the year 2020. Being inside MS gave me a better channel to talk to the members of the official dev team, and helped me establishing (dare I say) a nice friendship with them.

I'd also like to point out that me being an outside contributor actually helps me creating PRs like this one, because being an official member means that you'll have all the OKRs, the managerial metrics, the milestones and everything. You can't really just do whatever you want with the project. That's why community contributors play a vital rule in the development process. I do believe the team welcome all kinds of contributions, because that's the idea of open source, right? I'm sure the team did not try convincing the higher management to open source this project, just to show what a group of good programmers they are (or did they? 👀)

@skyline75489 skyline75489 deleted the chesterliu/dev/dirty-formatting branch June 23, 2021 04:17
DHowett pushed a commit that referenced this pull request Jul 7, 2021
Kill `WriteFormattedString` and replace it with `fmt::format_to` to avoid expensive string operations in VtRenderer.

This saves ~8% of the CPU time.

Inspired by #10362 (comment)

Co-authored-by: Leonard Hecker <lhecker@microsoft.com>
(cherry picked from commit 4f0b57e)
@ghost
Copy link

ghost commented Jul 14, 2021

🎉Windows Terminal v1.9.1942.0 has been released which incorporates this pull request.:tada:

Handy links:

@ghost
Copy link

ghost commented Jul 14, 2021

🎉Windows Terminal Preview v1.10.1933.0 has been released which incorporates this pull request.:tada:

Handy links:

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Performance Performance-related issue Area-VT Virtual Terminal sequence support AutoMerge Marked for automatic merge by the bot when requirements are met
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants