Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use estimated formatted lengths to optimize performance of VT rendering #4890

Merged
3 commits merged into from
Mar 11, 2020

Conversation

miniksa
Copy link
Member

@miniksa miniksa commented Mar 11, 2020

Summary of the Pull Request

Allows VT engine methods that print formatted strings (cursor movements, color changes, etc.) to provide a guess at the max buffer size required eliminating the double-call for formatting in the common case.

PR Checklist

Detailed Description of the Pull Request / Additional comments

  • The most common case for VT rendering is formatting a few numbers into a sequence. For the most part, we can already tell the maximum length that the string could be based on the number of substitutions and the size of the parameters.
  • The existing formatting method would always double-call. It would first call for how long the string was going to be post-formatting, allocate that memory, then call again and fill it up. This cost two full times of running through the string to find a length we probably already knew for the most part.
  • Now if a size is provided, we allocate that first and attempt the "second pass" of formatting directly into the buffer. This saves the count step in the common case.
  • If this fails, we fall back and do the two-pass method (which theoretically means the bad case is now 3 passes.)
  • The next biggest waste of time in this method was allocating and freeing strings for every format pass. Due to the nature of the VT renderer, many things need to be formatted this way. I've now instead moved the format method to hold a static string that really only grows over the course of the session for all of these format operations. I expect a majority of the time, it will only be consuming approximately 5-15 length of a std::string of memory space. I cannot currently see a circumstance where it would use more than that, but I'm consciously trading memory usage when running as a PTY for overall runtime performance here.

Validation Steps Performed

  • Ran the thing manually and checked it out with wsl and cmatrix and Powershell and such attached to Terminal
  • Wrote and ran automated tests on formatting method

…ed buffer to save performance of alloc/freeing for lots of formatted output.
@miniksa miniksa added Area-Performance Performance-related issue Product-Conpty For console issues specifically related to conpty Issue-Task It's a feature request, but it doesn't really need a major design. labels Mar 11, 2020
@miniksa miniksa self-assigned this Mar 11, 2020
@miniksa
Copy link
Member Author

miniksa commented Mar 11, 2020

A sample before/after for cmatrix:

(Note this session also includes as-of-yet not PR'd bitmap improvements. You're observing the relative drop off of the _scprintf as well as the new, memset, and free calls from Before to After.)

Before:
image

After:
image

src/renderer/vt/state.cpp Outdated Show resolved Hide resolved
src/renderer/vt/vtrenderer.hpp Outdated Show resolved Hide resolved
@DHowett-MSFT
Copy link
Contributor

How bad would it be if you just made it start at 16 or 32 and ditched the estimations completely? In the fullness of time, we'll definitely need at least 16 characters for the wide variety of outgoing sequences, and we will grow the first time we miss. Then we won't miss again until we grow again. I dunno. Not worth doing?

@miniksa
Copy link
Member Author

miniksa commented Mar 11, 2020

How bad would it be if you just made it start at 16 or 32 and ditched the estimations completely? In the fullness of time, we'll definitely need at least 16 characters for the wide variety of outgoing sequences, and we will grow the first time we miss. Then we won't miss again until we grow again. I dunno. Not worth doing?

Probably about the same. I can ditch the estimates.

@DHowett-MSFT DHowett-MSFT added the AutoMerge Marked for automatic merge by the bot when requirements are met label Mar 11, 2020
@ghost
Copy link

ghost commented Mar 11, 2020

Hello @DHowett-MSFT!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

Do note that I've been instructed to only help merge pull requests of this repository that have been opened for at least 8 hours, a condition that will be fulfilled in about 5 hours 44 minutes. No worries though, I will be back when the time is right! 😉

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@DHowett-MSFT
Copy link
Contributor

@msftbot merge this in 1 minute

@ghost
Copy link

ghost commented Mar 11, 2020

Hello @DHowett-MSFT!

Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:

  • I won't merge this pull request until after the UTC date Wed, 11 Mar 2020 22:11:55 GMT, which is in 1 minute

If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".

@ghost ghost merged commit cd87db6 into master Mar 11, 2020
@ghost ghost deleted the dev/miniksa/vtformat_perf branch March 11, 2020 22:12
abhijeetviswam pushed a commit to abhijeetviswam/terminal that referenced this pull request Mar 12, 2020
…ng (microsoft#4890)

## Summary of the Pull Request
Allows VT engine methods that print formatted strings (cursor movements, color changes, etc.) to provide a guess at the max buffer size required eliminating the double-call for formatting in the common case.

## PR Checklist
* [x] Found while working on microsoft#778
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA
* [x] Tests added/passed
* [x] Am core contributor. 

## Detailed Description of the Pull Request / Additional comments
- The most common case for VT rendering is formatting a few numbers into a sequence. For the most part, we can already tell the maximum length that the string could be based on the number of substitutions and the size of the parameters.
- The existing formatting method would always double-call. It would first call for how long the string was going to be post-formatting, allocate that memory, then call again and fill it up. This cost two full times of running through the string to find a length we probably already knew for the most part.
- Now if a size is provided, we allocate that first and attempt the "second pass" of formatting directly into the buffer. This saves the count step in the common case.
- If this fails, we fall back and do the two-pass method (which theoretically means the bad case is now 3 passes.)
- The next biggest waste of time in this method was allocating and freeing strings for every format pass. Due to the nature of the VT renderer, many things need to be formatted this way. I've now instead moved the format method to hold a static string that really only grows over the course of the session for all of these format operations. I expect a majority of the time, it will only be consuming approximately 5-15 length of a std::string of memory space. I cannot currently see a circumstance where it would use more than that, but I'm consciously trading memory usage when running as a PTY for overall runtime performance here.

## Validation Steps Performed
- Ran the thing manually and checked it out with wsl and cmatrix and Powershell and such attached to Terminal
- Wrote and ran automated tests on formatting method
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Performance Performance-related issue AutoMerge Marked for automatic merge by the bot when requirements are met Issue-Task It's a feature request, but it doesn't really need a major design. Product-Conpty For console issues specifically related to conpty
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants