Use estimated formatted lengths to optimize performance of VT rendering #4890

miniksa · 2020-03-11T19:26:58Z

Summary of the Pull Request

Allows VT engine methods that print formatted strings (cursor movements, color changes, etc.) to provide a guess at the max buffer size required eliminating the double-call for formatting in the common case.

PR Checklist

Found while working on Restore differential drawing to DirectWrite renderer #778
CLA signed. If not, go over here and sign the CLA
Tests added/passed
Am core contributor.

Detailed Description of the Pull Request / Additional comments

The most common case for VT rendering is formatting a few numbers into a sequence. For the most part, we can already tell the maximum length that the string could be based on the number of substitutions and the size of the parameters.
The existing formatting method would always double-call. It would first call for how long the string was going to be post-formatting, allocate that memory, then call again and fill it up. This cost two full times of running through the string to find a length we probably already knew for the most part.
Now if a size is provided, we allocate that first and attempt the "second pass" of formatting directly into the buffer. This saves the count step in the common case.
If this fails, we fall back and do the two-pass method (which theoretically means the bad case is now 3 passes.)
The next biggest waste of time in this method was allocating and freeing strings for every format pass. Due to the nature of the VT renderer, many things need to be formatted this way. I've now instead moved the format method to hold a static string that really only grows over the course of the session for all of these format operations. I expect a majority of the time, it will only be consuming approximately 5-15 length of a std::string of memory space. I cannot currently see a circumstance where it would use more than that, but I'm consciously trading memory usage when running as a PTY for overall runtime performance here.

Validation Steps Performed

Ran the thing manually and checked it out with wsl and cmatrix and Powershell and such attached to Terminal
Wrote and ran automated tests on formatting method

…ed buffer to save performance of alloc/freeing for lots of formatted output.

miniksa · 2020-03-11T19:29:34Z

A sample before/after for cmatrix:

(Note this session also includes as-of-yet not PR'd bitmap improvements. You're observing the relative drop off of the _scprintf as well as the new, memset, and free calls from Before to After.)

Before:

After:

src/renderer/vt/state.cpp

src/renderer/vt/vtrenderer.hpp

DHowett-MSFT · 2020-03-11T19:38:45Z

How bad would it be if you just made it start at 16 or 32 and ditched the estimations completely? In the fullness of time, we'll definitely need at least 16 characters for the wide variety of outgoing sequences, and we will grow the first time we miss. Then we won't miss again until we grow again. I dunno. Not worth doing?

miniksa · 2020-03-11T20:57:11Z

How bad would it be if you just made it start at 16 or 32 and ditched the estimations completely? In the fullness of time, we'll definitely need at least 16 characters for the wide variety of outgoing sequences, and we will grow the first time we miss. Then we won't miss again until we grow again. I dunno. Not worth doing?

Probably about the same. I can ditch the estimates.

…miss and keep it.

ghost · 2020-03-11T21:42:45Z

Hello @DHowett-MSFT!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

Do note that I've been instructed to only help merge pull requests of this repository that have been opened for at least 8 hours, a condition that will be fulfilled in about 5 hours 44 minutes. No worries though, I will be back when the time is right! 😉

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (`@msftbot`) and give me an instruction to get started! Learn more here.

DHowett-MSFT · 2020-03-11T22:10:53Z

@msftbot merge this in 1 minute

ghost · 2020-03-11T22:10:56Z

Hello @DHowett-MSFT!

Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:

I won't merge this pull request until after the UTC date Wed, 11 Mar 2020 22:11:55 GMT, which is in 1 minute

If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".

…ng (microsoft#4890) ## Summary of the Pull Request Allows VT engine methods that print formatted strings (cursor movements, color changes, etc.) to provide a guess at the max buffer size required eliminating the double-call for formatting in the common case. ## PR Checklist * [x] Found while working on microsoft#778 * [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA * [x] Tests added/passed * [x] Am core contributor. ## Detailed Description of the Pull Request / Additional comments - The most common case for VT rendering is formatting a few numbers into a sequence. For the most part, we can already tell the maximum length that the string could be based on the number of substitutions and the size of the parameters. - The existing formatting method would always double-call. It would first call for how long the string was going to be post-formatting, allocate that memory, then call again and fill it up. This cost two full times of running through the string to find a length we probably already knew for the most part. - Now if a size is provided, we allocate that first and attempt the "second pass" of formatting directly into the buffer. This saves the count step in the common case. - If this fails, we fall back and do the two-pass method (which theoretically means the bad case is now 3 passes.) - The next biggest waste of time in this method was allocating and freeing strings for every format pass. Due to the nature of the VT renderer, many things need to be formatted this way. I've now instead moved the format method to hold a static string that really only grows over the course of the session for all of these format operations. I expect a majority of the time, it will only be consuming approximately 5-15 length of a std::string of memory space. I cannot currently see a circumstance where it would use more than that, but I'm consciously trading memory usage when running as a PTY for overall runtime performance here. ## Validation Steps Performed - Ran the thing manually and checked it out with wsl and cmatrix and Powershell and such attached to Terminal - Wrote and ran automated tests on formatting method

Update VT renderer format string to use proposed guess sizes and cach…

c8c650b

…ed buffer to save performance of alloc/freeing for lots of formatted output.

miniksa added Area-Performance Performance-related issue Product-Conpty For console issues specifically related to conpty Issue-Task It's a feature request, but it doesn't really need a major design. labels Mar 11, 2020

miniksa requested review from carlos-zamora, DHowett-MSFT, zadjii-msft and leonMSFT March 11, 2020 19:26

miniksa self-assigned this Mar 11, 2020

DHowett-MSFT approved these changes Mar 11, 2020

View reviewed changes

src/renderer/vt/state.cpp Outdated Show resolved Hide resolved

src/renderer/vt/vtrenderer.hpp Outdated Show resolved Hide resolved

miniksa added 2 commits March 11, 2020 14:10

Take the magic static out.

bee400b

Collapse back to one function, forget the estimates, just grow if we …

726710d

…miss and keep it.

zadjii-msft approved these changes Mar 11, 2020

View reviewed changes

DHowett-MSFT approved these changes Mar 11, 2020

View reviewed changes

DHowett-MSFT added the AutoMerge Marked for automatic merge by the bot when requirements are met label Mar 11, 2020

ghost merged commit cd87db6 into master Mar 11, 2020

ghost deleted the dev/miniksa/vtformat_perf branch March 11, 2020 22:12

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use estimated formatted lengths to optimize performance of VT rendering #4890

Use estimated formatted lengths to optimize performance of VT rendering #4890

miniksa commented Mar 11, 2020

miniksa commented Mar 11, 2020

DHowett-MSFT commented Mar 11, 2020

miniksa commented Mar 11, 2020

ghost commented Mar 11, 2020

DHowett-MSFT commented Mar 11, 2020

ghost commented Mar 11, 2020

Use estimated formatted lengths to optimize performance of VT rendering #4890

Use estimated formatted lengths to optimize performance of VT rendering #4890

Conversation

miniksa commented Mar 11, 2020

Summary of the Pull Request

PR Checklist

Detailed Description of the Pull Request / Additional comments

Validation Steps Performed

miniksa commented Mar 11, 2020

DHowett-MSFT commented Mar 11, 2020

miniksa commented Mar 11, 2020

ghost commented Mar 11, 2020

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

DHowett-MSFT commented Mar 11, 2020

ghost commented Mar 11, 2020

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (`@msftbot`) and give me an instruction to get started! Learn more here.