Performance improvement by refactoring pixel styles #704

Merged


clement-roblot
Contributor

Hi,

I described the idea in #703, but it turns out I was looking at version 4.0.0 rather than main, so a few details differ from what I described there. As for the performance gains: I am using a lightly modified Downloader example and measuring with perf. The share of processing time spent in UpdatePixelStyle alone went from 12.03% to 6.97% (measured over 3 runs each time).

A few points of interest:

  • I kept automerge outside of the style bitfield because it is used independently from the other style properties.
  • I didn't implement any tests, though I would like to. To do that, I would want to move the function UpdatePixelStyle into a method of the Screen class. If that's OK, I will do that right away.

Closes #703

@ArthurSonzogni ArthurSonzogni self-requested a review July 28, 2023 17:55
@ArthurSonzogni
Owner

ArthurSonzogni commented Jul 28, 2023

Hello! Thanks for this patch!

I added a benchmark with different styles.
Here are some results:

RUNNING: ../../ftxui-benchmark-old --benchmark_out=/tmp/tmpjpbrac91
2023-07-28T21:38:29+02:00
Running ../../ftxui-benchmark-old
Run on (16 X 4784.38 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 4.79, 3.03, 1.31
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BencharkBasic/0              661 ns          661 ns      1059576
BencharkBasic/16            5283 ns         5283 ns       132525
BencharkBasic/32            9234 ns         9234 ns        75796
BencharkBasic/48           21391 ns        21392 ns        32743
BencharkBasic/64           33235 ns        33236 ns        21059
BencharkBasic/80           44826 ns        44828 ns        15606
BencharkBasic/96           56374 ns        56376 ns        12379
BencharkBasic/112          68201 ns        68204 ns        10210
BencharkBasic/128          79828 ns        79832 ns         8761
BencharkBasic/144          91744 ns        91747 ns         7630
BencharkBasic/160         103259 ns       103264 ns         6785
BencharkBasic/176         115293 ns       115298 ns         6049
BencharkBasic/192         127754 ns       127762 ns         5477
BencharkBasic/208         139056 ns       139065 ns         5028
BencharkBasic/224         150824 ns       150830 ns         4642
BencharkBasic/240         161952 ns       161962 ns         4322
BencharkBasic/256         173596 ns       173605 ns         4029
BencharkText/0            340392 ns       340414 ns         2048
BencharkText/1            341077 ns       341091 ns         2053
BencharkText/2            343266 ns       343286 ns         2040
BencharkText/3            345713 ns       345733 ns         2022
BencharkText/4            359522 ns       359549 ns         1955
BencharkText/5            371079 ns       371103 ns         1886
BencharkText/6            397320 ns       397343 ns         1760
BencharkText/7            455762 ns       455796 ns         1535
BencharkText/8            567428 ns       567470 ns         1229
BencharkText/9            566031 ns       566067 ns         1235
BencharkText/10          1056071 ns      1056151 ns          662
BenchmarkStyle/1/10         3901 ns         3901 ns       179410
BenchmarkStyle/4/10        12331 ns        12332 ns        56736
BenchmarkStyle/7/10        20032 ns        20033 ns        34924
BenchmarkStyle/10/10       27584 ns        27586 ns        25404
BenchmarkStyle/1/30         8360 ns         8361 ns        83664
BenchmarkStyle/4/30        22897 ns        22899 ns        30725
BenchmarkStyle/7/30        34817 ns        34819 ns        20097
BenchmarkStyle/10/30       47650 ns        47654 ns        14686
BenchmarkStyle/1/50        15338 ns        15339 ns        45580
BenchmarkStyle/4/50        33976 ns        33979 ns        20587
BenchmarkStyle/7/50        52564 ns        52567 ns        13311
BenchmarkStyle/10/50       71080 ns        71086 ns         9837
BenchmarkStyle/1/70        25265 ns        25267 ns        27677
BenchmarkStyle/4/70        47836 ns        47840 ns        14623
BenchmarkStyle/7/70        70667 ns        70673 ns         9903
BenchmarkStyle/10/70       93177 ns        93186 ns         7510
BenchmarkStyle/1/90        37016 ns        37019 ns        18878
BenchmarkStyle/4/90        64023 ns        64027 ns        10944
BenchmarkStyle/7/90        90241 ns        90249 ns         7758
BenchmarkStyle/10/90      116566 ns       116575 ns         6000
BenchmarkStyle/1/110       51207 ns        51211 ns        13628
BenchmarkStyle/4/110       83160 ns        83168 ns         8429
BenchmarkStyle/7/110      114381 ns       114390 ns         6101
BenchmarkStyle/10/110     145643 ns       145656 ns         4806
BenchmarkStyle/1/130       66883 ns        66888 ns        10460
BenchmarkStyle/4/130      103054 ns       103063 ns         6800
BenchmarkStyle/7/130      138655 ns       138668 ns         5041
BenchmarkStyle/10/130     174342 ns       174358 ns         4025
BenchmarkStyle/1/150       85025 ns        85032 ns         8273
BenchmarkStyle/4/150      125898 ns       125909 ns         5562
BenchmarkStyle/7/150      165542 ns       165555 ns         4228
BenchmarkStyle/10/150     205404 ns       205421 ns         3411
BenchmarkStyle/1/170      104656 ns       104666 ns         6683
BenchmarkStyle/4/170      147999 ns       148012 ns         4726
BenchmarkStyle/7/170      190749 ns       190765 ns         3664
BenchmarkStyle/10/170     234539 ns       234562 ns         2983
BenchmarkStyle/1/190      129073 ns       129084 ns         5424
BenchmarkStyle/4/190      178223 ns       178241 ns         3927
BenchmarkStyle/7/190      226333 ns       226354 ns         3094
BenchmarkStyle/10/190     274675 ns       274702 ns         2548
RUNNING: ../../ftxui-benchmark-new --benchmark_out=/tmp/tmp27s1kjxh
2023-07-28T21:39:31+02:00
Running ../../ftxui-benchmark-new
Run on (16 X 4784.38 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 2.57, 2.71, 1.31
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BencharkBasic/0              670 ns          670 ns      1043570
BencharkBasic/16            5263 ns         5263 ns       133004
BencharkBasic/32            9120 ns         9121 ns        76723
BencharkBasic/48           21340 ns        21342 ns        32827
BencharkBasic/64           33367 ns        33370 ns        20961
BencharkBasic/80           45063 ns        44963 ns        15556
BencharkBasic/96           56675 ns        56554 ns        12362
BencharkBasic/112          68471 ns        68327 ns        10202
BencharkBasic/128          80235 ns        80071 ns         8723
BencharkBasic/144          91922 ns        91736 ns         7579
BencharkBasic/160         103497 ns       103294 ns         6789
BencharkBasic/176         115824 ns       115604 ns         6050
BencharkBasic/192         127998 ns       127757 ns         5475
BencharkBasic/208         140637 ns       140381 ns         4986
BencharkBasic/224         152680 ns       152407 ns         4596
BencharkBasic/240         164970 ns       164685 ns         4250
BencharkBasic/256         176504 ns       176204 ns         3974
BencharkText/0            346516 ns       345937 ns         2026
BencharkText/1            347060 ns       346504 ns         2017
BencharkText/2            348518 ns       347984 ns         2014
BencharkText/3            351807 ns       351284 ns         1996
BencharkText/4            360974 ns       360454 ns         1942
BencharkText/5            374090 ns       373567 ns         1874
BencharkText/6            400412 ns       399872 ns         1752
BencharkText/7            460188 ns       459583 ns         1520
BencharkText/8            576579 ns       575846 ns         1218
BencharkText/9            569757 ns       569039 ns         1222
BencharkText/10          1059942 ns      1058643 ns          660
BenchmarkStyle/1/10         3951 ns         3946 ns       177184
BenchmarkStyle/4/10        12595 ns        12581 ns        55769
BenchmarkStyle/7/10        20099 ns        20076 ns        34666
BenchmarkStyle/10/10       27706 ns        27676 ns        25268
BenchmarkStyle/1/30         8683 ns         8674 ns        80758
BenchmarkStyle/4/30        23349 ns        23324 ns        30026
BenchmarkStyle/7/30        35890 ns        35853 ns        19545
BenchmarkStyle/10/30       48805 ns        48758 ns        14335
BenchmarkStyle/1/50        15588 ns        15573 ns        44937
BenchmarkStyle/4/50        34236 ns        34204 ns        20454
BenchmarkStyle/7/50        53039 ns        52991 ns        13217
BenchmarkStyle/10/50       71543 ns        71479 ns         9786
BenchmarkStyle/1/70        25081 ns        25059 ns        27932
BenchmarkStyle/4/70        47818 ns        47777 ns        14643
BenchmarkStyle/7/70        70660 ns        70601 ns         9912
BenchmarkStyle/10/70       93699 ns        93623 ns         7474
BenchmarkStyle/1/90        36713 ns        36684 ns        19083
BenchmarkStyle/4/90        63686 ns        63636 ns        11006
BenchmarkStyle/7/90        89538 ns        89469 ns         7836
BenchmarkStyle/10/90      116117 ns       116030 ns         6027
BenchmarkStyle/1/110       50926 ns        50888 ns        13755
BenchmarkStyle/4/110       83170 ns        83111 ns         8414
BenchmarkStyle/7/110      114988 ns       114906 ns         6093
BenchmarkStyle/10/110     146711 ns       146611 ns         4782
BenchmarkStyle/1/130       67311 ns        67265 ns        10418
BenchmarkStyle/4/130      103852 ns       103783 ns         6739
BenchmarkStyle/7/130      139412 ns       139322 ns         5005
BenchmarkStyle/10/130     175586 ns       175472 ns         3993
BenchmarkStyle/1/150       84942 ns        84889 ns         8257
BenchmarkStyle/4/150      125590 ns       125512 ns         5592
BenchmarkStyle/7/150      165706 ns       165607 ns         4227
BenchmarkStyle/10/150     205691 ns       205568 ns         3407
BenchmarkStyle/1/170      103381 ns       103321 ns         6774
BenchmarkStyle/4/170      147033 ns       146950 ns         4763
BenchmarkStyle/7/170      190839 ns       190734 ns         3669
BenchmarkStyle/10/170     234973 ns       234848 ns         2980
BenchmarkStyle/1/190      126939 ns       126872 ns         5520
BenchmarkStyle/4/190      176441 ns       176352 ns         3971
BenchmarkStyle/7/190      223759 ns       223642 ns         3124
BenchmarkStyle/10/190     274235 ns       274099 ns         2556
Comparing ../../ftxui-benchmark-old to ../../ftxui-benchmark-new
Benchmark                               Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------
BencharkBasic/0                      +0.0143         +0.0143           661           670           661           670
BencharkBasic/16                     -0.0039         -0.0038          5283          5263          5283          5263
BencharkBasic/32                     -0.0123         -0.0123          9234          9120          9234          9121
BencharkBasic/48                     -0.0024         -0.0024         21391         21340         21392         21342
BencharkBasic/64                     +0.0040         +0.0040         33235         33367         33236         33370
BencharkBasic/80                     +0.0053         +0.0030         44826         45063         44828         44963
BencharkBasic/96                     +0.0053         +0.0032         56374         56675         56376         56554
BencharkBasic/112                    +0.0040         +0.0018         68201         68471         68204         68327
BencharkBasic/128                    +0.0051         +0.0030         79828         80235         79832         80071
BencharkBasic/144                    +0.0019         -0.0001         91744         91922         91747         91736
BencharkBasic/160                    +0.0023         +0.0003        103259        103497        103264        103294
BencharkBasic/176                    +0.0046         +0.0027        115293        115824        115298        115604
BencharkBasic/192                    +0.0019         -0.0000        127754        127998        127762        127757
BencharkBasic/208                    +0.0114         +0.0095        139056        140637        139065        140381
BencharkBasic/224                    +0.0123         +0.0105        150824        152680        150830        152407
BencharkBasic/240                    +0.0186         +0.0168        161952        164970        161962        164685
BencharkBasic/256                    +0.0168         +0.0150        173596        176504        173605        176204
BencharkText/0                       +0.0180         +0.0162        340392        346516        340414        345937
BencharkText/1                       +0.0175         +0.0159        341077        347060        341091        346504
BencharkText/2                       +0.0153         +0.0137        343266        348518        343286        347984
BencharkText/3                       +0.0176         +0.0161        345713        351807        345733        351284
BencharkText/4                       +0.0040         +0.0025        359522        360974        359549        360454
BencharkText/5                       +0.0081         +0.0066        371079        374090        371103        373567
BencharkText/6                       +0.0078         +0.0064        397320        400412        397343        399872
BencharkText/7                       +0.0097         +0.0083        455762        460188        455796        459583
BencharkText/8                       +0.0161         +0.0148        567428        576579        567470        575846
BencharkText/9                       +0.0066         +0.0052        566031        569757        566067        569039
BencharkText/10                      +0.0037         +0.0024       1056071       1059942       1056151       1058643
BenchmarkStyle/1/10                  +0.0129         +0.0116          3901          3951          3901          3946
BenchmarkStyle/4/10                  +0.0215         +0.0202         12331         12595         12332         12581
BenchmarkStyle/7/10                  +0.0033         +0.0021         20032         20099         20033         20076
BenchmarkStyle/10/10                 +0.0045         +0.0033         27584         27706         27586         27676
BenchmarkStyle/1/30                  +0.0386         +0.0374          8360          8683          8361          8674
BenchmarkStyle/4/30                  +0.0197         +0.0185         22897         23349         22899         23324
BenchmarkStyle/7/30                  +0.0308         +0.0297         34817         35890         34819         35853
BenchmarkStyle/10/30                 +0.0242         +0.0232         47650         48805         47654         48758
BenchmarkStyle/1/50                  +0.0163         +0.0152         15338         15588         15339         15573
BenchmarkStyle/4/50                  +0.0077         +0.0066         33976         34236         33979         34204
BenchmarkStyle/7/50                  +0.0090         +0.0081         52564         53039         52567         52991
BenchmarkStyle/10/50                 +0.0065         +0.0055         71080         71543         71086         71479
BenchmarkStyle/1/70                  -0.0073         -0.0082         25265         25081         25267         25059
BenchmarkStyle/4/70                  -0.0004         -0.0013         47836         47818         47840         47777
BenchmarkStyle/7/70                  -0.0001         -0.0010         70667         70660         70673         70601
BenchmarkStyle/10/70                 +0.0056         +0.0047         93177         93699         93186         93623
BenchmarkStyle/1/90                  -0.0082         -0.0090         37016         36713         37019         36684
BenchmarkStyle/4/90                  -0.0053         -0.0061         64023         63686         64027         63636
BenchmarkStyle/7/90                  -0.0078         -0.0086         90241         89538         90249         89469
BenchmarkStyle/10/90                 -0.0039         -0.0047        116566        116117        116575        116030
BenchmarkStyle/1/110                 -0.0055         -0.0063         51207         50926         51211         50888
BenchmarkStyle/4/110                 +0.0001         -0.0007         83160         83170         83168         83111
BenchmarkStyle/7/110                 +0.0053         +0.0045        114381        114988        114390        114906
BenchmarkStyle/10/110                +0.0073         +0.0066        145643        146711        145656        146611
BenchmarkStyle/1/130                 +0.0064         +0.0056         66883         67311         66888         67265
BenchmarkStyle/4/130                 +0.0077         +0.0070        103054        103852        103063        103783
BenchmarkStyle/7/130                 +0.0055         +0.0047        138655        139412        138668        139322
BenchmarkStyle/10/130                +0.0071         +0.0064        174342        175586        174358        175472
BenchmarkStyle/1/150                 -0.0010         -0.0017         85025         84942         85032         84889
BenchmarkStyle/4/150                 -0.0024         -0.0032        125898        125590        125909        125512
BenchmarkStyle/7/150                 +0.0010         +0.0003        165542        165706        165555        165607
BenchmarkStyle/10/150                +0.0014         +0.0007        205404        205691        205421        205568
BenchmarkStyle/1/170                 -0.0122         -0.0128        104656        103381        104666        103321
BenchmarkStyle/4/170                 -0.0065         -0.0072        147999        147033        148012        146950
BenchmarkStyle/7/170                 +0.0005         -0.0002        190749        190839        190765        190734
BenchmarkStyle/10/170                +0.0019         +0.0012        234539        234973        234562        234848
BenchmarkStyle/1/190                 -0.0165         -0.0171        129073        126939        129084        126872
BenchmarkStyle/4/190                 -0.0100         -0.0106        178223        176441        178241        176352
BenchmarkStyle/7/190                 -0.0114         -0.0120        226333        223759        226354        223642
BenchmarkStyle/10/190                -0.0016         -0.0022        274675        274235        274702        274099
OVERALL_GEOMEAN                      +0.0052         +0.0041             0             0             0             0

Overall, I don't see any meaningful impact: the geometric mean shows a 0.52% regression. I ran multiple experiments and this was consistent.

Potential explanations:

  • My compiler does a good job at optimizing the operator==. Yours doesn't ;-) Depending on the architecture / compiler, we might not see the potential performance improvement. Are you compiling in Release mode? I can try again in debug mode in case we want to optimize for this specifically.
  • UpdatePixelStyle might not represent a sufficiently large hot path for performance improvements here to be visible overall.
  • What the benchmark exercises is not close to the workload you measured.

What do you think?

@clement-roblot
Contributor Author

Whoa, I didn't know about the Google Benchmark tool, it is great!

The reason you don't see a difference before and after my changes is that the benchmarks were only calling the render method and not the print one. I added this call in this MR: #708.

Here are my results with the modified benchmarks:

RUNNING: ./ftxui-benchmark_main --benchmark_out=/tmp/tmpma2b34qe
2023-07-29T15:39:18+07:00
Running ./ftxui-benchmark_main
Run on (20 X 4900 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x10)
  L1 Instruction 32 KiB (x10)
  L2 Unified 1280 KiB (x10)
  L3 Unified 24576 KiB (x1)
Load Average: 5.70, 3.09, 2.45
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BencharkBasic/0             7465 ns         7465 ns        96008
BencharkBasic/16          184769 ns       184764 ns         3749
BencharkBasic/32          360656 ns       360642 ns         1946
BencharkBasic/48          543427 ns       543398 ns         1243
BencharkBasic/64          717633 ns       717579 ns          985
BencharkBasic/80          885555 ns       885485 ns          770
BencharkBasic/96         1064472 ns      1064412 ns          660
BencharkBasic/112        1261441 ns      1261401 ns          565
BencharkBasic/128        1426094 ns      1426021 ns          495
BencharkBasic/144        1599100 ns      1599047 ns          437
BencharkBasic/160        1742708 ns      1742634 ns          391
BencharkBasic/176        1857266 ns      1857179 ns          368
BencharkBasic/192        2016650 ns      2016636 ns          335
BencharkBasic/208        2221263 ns      2221229 ns          316
BencharkBasic/224        2377635 ns      2377602 ns          293
BencharkBasic/240        2626565 ns      2626407 ns          270
BencharkBasic/256        2786363 ns      2786121 ns          247
BencharkText/0           5557972 ns      5557406 ns          130
BencharkText/1           5393001 ns      5392132 ns          126
BencharkText/2           5376993 ns      5376711 ns          129
BencharkText/3           5407915 ns      5407453 ns          131
BencharkText/4           5478744 ns      5478520 ns          127
BencharkText/5           5600976 ns      5600407 ns          122
BencharkText/6           5702823 ns      5702336 ns          118
BencharkText/7           5901232 ns      5900936 ns          122
BencharkText/8           6398739 ns      6398403 ns          111
BencharkText/9           7139393 ns      7139055 ns          100
BencharkText/10          9550684 ns      9550348 ns           75
BenchmarkStyle/1/10        41746 ns        41742 ns        16654
BenchmarkStyle/4/10       102404 ns       102396 ns         6804
BenchmarkStyle/7/10       162840 ns       162827 ns         4355
BenchmarkStyle/10/10      221576 ns       221573 ns         3179
BenchmarkStyle/1/30       163278 ns       163271 ns         4316
BenchmarkStyle/4/30       246746 ns       246738 ns         2840
BenchmarkStyle/7/30       322300 ns       322273 ns         2201
BenchmarkStyle/10/30      396824 ns       396809 ns         1776
BenchmarkStyle/1/50       382279 ns       382262 ns         1843
BenchmarkStyle/4/50       484457 ns       484405 ns         1449
BenchmarkStyle/7/50       583734 ns       583699 ns         1203
BenchmarkStyle/10/50      676326 ns       676288 ns         1031
BenchmarkStyle/1/70       714395 ns       714217 ns          973
BenchmarkStyle/4/70       831070 ns       830820 ns          845
BenchmarkStyle/7/70       939932 ns       939598 ns          747
BenchmarkStyle/10/70     1047830 ns      1047421 ns          674
BenchmarkStyle/1/90      1135729 ns      1135361 ns          607
BenchmarkStyle/4/90      1273106 ns      1272596 ns          541
BenchmarkStyle/7/90      1404297 ns      1403768 ns          505
BenchmarkStyle/10/90     1524838 ns      1524268 ns          463
BenchmarkStyle/1/110     1677757 ns      1677132 ns          420
BenchmarkStyle/4/110     1839403 ns      1838551 ns          385
BenchmarkStyle/7/110     1959455 ns      1958655 ns          353
BenchmarkStyle/10/110    2098355 ns      2097208 ns          335
BenchmarkStyle/1/130     2334769 ns      2333774 ns          299
BenchmarkStyle/4/130     2489973 ns      2488894 ns          279
BenchmarkStyle/7/130     2621486 ns      2620501 ns          264
BenchmarkStyle/10/130    2797303 ns      2795893 ns          252
BenchmarkStyle/1/150     3054922 ns      3053607 ns          229
BenchmarkStyle/4/150     3246166 ns      3244771 ns          215
BenchmarkStyle/7/150     3394809 ns      3393317 ns          206
BenchmarkStyle/10/150    3579819 ns      3578060 ns          196
BenchmarkStyle/1/170     3912660 ns      3910985 ns          177
BenchmarkStyle/4/170     4059272 ns      4057580 ns          170
BenchmarkStyle/7/170     4313366 ns      4311269 ns          163
BenchmarkStyle/10/170    4466248 ns      4464285 ns          158
BenchmarkStyle/1/190     4656814 ns      4656497 ns          146
BenchmarkStyle/4/190     4869453 ns      4868974 ns          146
BenchmarkStyle/7/190     5075422 ns      5075096 ns          136
BenchmarkStyle/10/190    5109552 ns      5109322 ns          137
RUNNING: ./ftxui-benchmark_new --benchmark_out=/tmp/tmpcmz33yjq
2023-07-29T15:40:20+07:00
Running ./ftxui-benchmark_new
Run on (20 X 4900 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x10)
  L1 Instruction 32 KiB (x10)
  L2 Unified 1280 KiB (x10)
  L3 Unified 24576 KiB (x1)
Load Average: 3.39, 2.91, 2.43
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BencharkBasic/0             7318 ns         7318 ns        89848
BencharkBasic/16          172303 ns       172285 ns         4119
BencharkBasic/32          330195 ns       330176 ns         2165
BencharkBasic/48          482383 ns       482329 ns         1432
BencharkBasic/64          651671 ns       651654 ns         1066
BencharkBasic/80          815455 ns       815433 ns          862
BencharkBasic/96          976000 ns       975877 ns          721
BencharkBasic/112        1133033 ns      1132916 ns          617
BencharkBasic/128        1298220 ns      1297982 ns          523
BencharkBasic/144        1478084 ns      1478013 ns          481
BencharkBasic/160        1689523 ns      1689454 ns          417
BencharkBasic/176        1889036 ns      1889004 ns          384
BencharkBasic/192        2084001 ns      2083875 ns          327
BencharkBasic/208        2280531 ns      2280047 ns          304
BencharkBasic/224        2379306 ns      2379202 ns          296
BencharkBasic/240        2510841 ns      2510733 ns          279
BencharkBasic/256        2676281 ns      2676200 ns          259
BencharkText/0           5206081 ns      5205903 ns          127
BencharkText/1           5200560 ns      5200421 ns          135
BencharkText/2           5183302 ns      5182898 ns          135
BencharkText/3           5194144 ns      5193513 ns          135
BencharkText/4           5280199 ns      5279550 ns          129
BencharkText/5           5600647 ns      5600329 ns          132
BencharkText/6           5483730 ns      5483412 ns          125
BencharkText/7           5691876 ns      5691543 ns          125
BencharkText/8           6226383 ns      6225548 ns          110
BencharkText/9           6873366 ns      6873159 ns          102
BencharkText/10          8980743 ns      8978179 ns           76
BenchmarkStyle/1/10        39272 ns        39271 ns        18000
BenchmarkStyle/4/10        97000 ns        96994 ns         7229
BenchmarkStyle/7/10       148218 ns       148214 ns         4554
BenchmarkStyle/10/10      205387 ns       205367 ns         3477
BenchmarkStyle/1/30       150921 ns       150879 ns         4690
BenchmarkStyle/4/30       230935 ns       230923 ns         3059
BenchmarkStyle/7/30       302386 ns       302362 ns         2338
BenchmarkStyle/10/30      381807 ns       381787 ns         1847
BenchmarkStyle/1/50       361157 ns       361129 ns         1943
BenchmarkStyle/4/50       451193 ns       451131 ns         1515
BenchmarkStyle/7/50       530508 ns       530497 ns         1333
BenchmarkStyle/10/50      629591 ns       629550 ns         1152
BenchmarkStyle/1/70       650914 ns       650882 ns         1082
BenchmarkStyle/4/70       746891 ns       746842 ns          938
BenchmarkStyle/7/70       854111 ns       854069 ns          815
BenchmarkStyle/10/70      947701 ns       947653 ns          718
BenchmarkStyle/1/90      1022163 ns      1022104 ns          689
BenchmarkStyle/4/90      1158309 ns      1158236 ns          623
BenchmarkStyle/7/90      1253361 ns      1253337 ns          540
BenchmarkStyle/10/90     1371411 ns      1371338 ns          509
BenchmarkStyle/1/110     1503839 ns      1503808 ns          464
BenchmarkStyle/4/110     1643051 ns      1642952 ns          424
BenchmarkStyle/7/110     1766694 ns      1766256 ns          398
BenchmarkStyle/10/110    1885139 ns      1885053 ns          372
BenchmarkStyle/1/130     2204306 ns      2204161 ns          340
BenchmarkStyle/4/130     2411446 ns      2411099 ns          287
BenchmarkStyle/7/130     2584320 ns      2584051 ns          272
BenchmarkStyle/10/130    2722934 ns      2722457 ns          256
BenchmarkStyle/1/150     2974006 ns      2973857 ns          234
BenchmarkStyle/4/150     3127931 ns      3127602 ns          222
BenchmarkStyle/7/150     3312059 ns      3311734 ns          211
BenchmarkStyle/10/150    3471608 ns      3471391 ns          200
BenchmarkStyle/1/170     3767043 ns      3766893 ns          184
BenchmarkStyle/4/170     3973963 ns      3973672 ns          177
BenchmarkStyle/7/170     4132973 ns      4132868 ns          171
BenchmarkStyle/10/170    4320306 ns      4320098 ns          163
BenchmarkStyle/1/190     4671311 ns      4670867 ns          151
BenchmarkStyle/4/190     4858326 ns      4858196 ns          144
BenchmarkStyle/7/190     5035643 ns      5035129 ns          138
BenchmarkStyle/10/190    5225560 ns      5225351 ns          136
Comparing ./ftxui-benchmark_main to ./ftxui-benchmark_new
Benchmark                               Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------
BencharkBasic/0                      -0.0197         -0.0197          7465          7318          7465          7318
BencharkBasic/16                     -0.0675         -0.0675        184769        172303        184764        172285
BencharkBasic/32                     -0.0845         -0.0845        360656        330195        360642        330176
BencharkBasic/48                     -0.1123         -0.1124        543427        482383        543398        482329
BencharkBasic/64                     -0.0919         -0.0919        717633        651671        717579        651654
BencharkBasic/80                     -0.0792         -0.0791        885555        815455        885485        815433
BencharkBasic/96                     -0.0831         -0.0832       1064472        976000       1064412        975877
BencharkBasic/112                    -0.1018         -0.1019       1261441       1133033       1261401       1132916
BencharkBasic/128                    -0.0897         -0.0898       1426094       1298220       1426021       1297982
BencharkBasic/144                    -0.0757         -0.0757       1599100       1478084       1599047       1478013
BencharkBasic/160                    -0.0305         -0.0305       1742708       1689523       1742634       1689454
BencharkBasic/176                    +0.0171         +0.0171       1857266       1889036       1857179       1889004
BencharkBasic/192                    +0.0334         +0.0333       2016650       2084001       2016636       2083875
BencharkBasic/208                    +0.0267         +0.0265       2221263       2280531       2221229       2280047
BencharkBasic/224                    +0.0007         +0.0007       2377635       2379306       2377602       2379202
BencharkBasic/240                    -0.0441         -0.0440       2626565       2510841       2626407       2510733
BencharkBasic/256                    -0.0395         -0.0395       2786363       2676281       2786121       2676200
BencharkText/0                       -0.0633         -0.0632       5557972       5206081       5557406       5205903
BencharkText/1                       -0.0357         -0.0356       5393001       5200560       5392132       5200421
BencharkText/2                       -0.0360         -0.0360       5376993       5183302       5376711       5182898
BencharkText/3                       -0.0395         -0.0396       5407915       5194144       5407453       5193513
BencharkText/4                       -0.0362         -0.0363       5478744       5280199       5478520       5279550
BencharkText/5                       -0.0001         -0.0000       5600976       5600647       5600407       5600329
BencharkText/6                       -0.0384         -0.0384       5702823       5483730       5702336       5483412
BencharkText/7                       -0.0355         -0.0355       5901232       5691876       5900936       5691543
BencharkText/8                       -0.0269         -0.0270       6398739       6226383       6398403       6225548
BencharkText/9                       -0.0373         -0.0372       7139393       6873366       7139055       6873159
BencharkText/10                      -0.0597         -0.0599       9550684       8980743       9550348       8978179
BenchmarkStyle/1/10                  -0.0592         -0.0592         41746         39272         41742         39271
BenchmarkStyle/4/10                  -0.0528         -0.0528        102404         97000        102396         96994
BenchmarkStyle/7/10                  -0.0898         -0.0897        162840        148218        162827        148214
BenchmarkStyle/10/10                 -0.0731         -0.0731        221576        205387        221573        205367
BenchmarkStyle/1/30                  -0.0757         -0.0759        163278        150921        163271        150879
BenchmarkStyle/4/30                  -0.0641         -0.0641        246746        230935        246738        230923
BenchmarkStyle/7/30                  -0.0618         -0.0618        322300        302386        322273        302362
BenchmarkStyle/10/30                 -0.0378         -0.0379        396824        381807        396809        381787
BenchmarkStyle/1/50                  -0.0553         -0.0553        382279        361157        382262        361129
BenchmarkStyle/4/50                  -0.0687         -0.0687        484457        451193        484405        451131
BenchmarkStyle/7/50                  -0.0912         -0.0911        583734        530508        583699        530497
BenchmarkStyle/10/50                 -0.0691         -0.0691        676326        629591        676288        629550
BenchmarkStyle/1/70                  -0.0889         -0.0887        714395        650914        714217        650882
BenchmarkStyle/4/70                  -0.1013         -0.1011        831070        746891        830820        746842
BenchmarkStyle/7/70                  -0.0913         -0.0910        939932        854111        939598        854069
BenchmarkStyle/10/70                 -0.0956         -0.0953       1047830        947701       1047421        947653
BenchmarkStyle/1/90                  -0.1000         -0.0998       1135729       1022163       1135361       1022104
BenchmarkStyle/4/90                  -0.0902         -0.0899       1273106       1158309       1272596       1158236
BenchmarkStyle/7/90                  -0.1075         -0.1072       1404297       1253361       1403768       1253337
BenchmarkStyle/10/90                 -0.1006         -0.1003       1524838       1371411       1524268       1371338
BenchmarkStyle/1/110                 -0.1037         -0.1033       1677757       1503839       1677132       1503808
BenchmarkStyle/4/110                 -0.1067         -0.1064       1839403       1643051       1838551       1642952
BenchmarkStyle/7/110                 -0.0984         -0.0982       1959455       1766694       1958655       1766256
BenchmarkStyle/10/110                -0.1016         -0.1012       2098355       1885139       2097208       1885053
BenchmarkStyle/1/130                 -0.0559         -0.0555       2334769       2204306       2333774       2204161
BenchmarkStyle/4/130                 -0.0315         -0.0313       2489973       2411446       2488894       2411099
BenchmarkStyle/7/130                 -0.0142         -0.0139       2621486       2584320       2620501       2584051
BenchmarkStyle/10/130                -0.0266         -0.0263       2797303       2722934       2795893       2722457
BenchmarkStyle/1/150                 -0.0265         -0.0261       3054922       2974006       3053607       2973857
BenchmarkStyle/4/150                 -0.0364         -0.0361       3246166       3127931       3244771       3127602
BenchmarkStyle/7/150                 -0.0244         -0.0240       3394809       3312059       3393317       3311734
BenchmarkStyle/10/150                -0.0302         -0.0298       3579819       3471608       3578060       3471391
BenchmarkStyle/1/170                 -0.0372         -0.0368       3912660       3767043       3910985       3766893
BenchmarkStyle/4/170                 -0.0210         -0.0207       4059272       3973963       4057580       3973672
BenchmarkStyle/7/170                 -0.0418         -0.0414       4313366       4132973       4311269       4132868
BenchmarkStyle/10/170                -0.0327         -0.0323       4466248       4320306       4464285       4320098
BenchmarkStyle/1/190                 +0.0031         +0.0031       4656814       4671311       4656497       4670867
BenchmarkStyle/4/190                 -0.0023         -0.0022       4869453       4858326       4868974       4858196
BenchmarkStyle/7/190                 -0.0078         -0.0079       5075422       5035643       5075096       5035129
BenchmarkStyle/10/190                +0.0227         +0.0227       5109552       5225560       5109322       5225351
OVERALL_GEOMEAN                      -0.0536         -0.0535             0             0             0             0

So, roughly a 5% improvement overall. The three benchmark runs I did yielded improvements of 4.33%, 4.65%, and 5.36% (the run shown above).
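For anyone who wants to sanity-check the summary row by hand, here is a minimal sketch. It assumes the `OVERALL_GEOMEAN` figure is the geometric mean of the per-benchmark new/old time ratios, minus one; that is a reading of the comparison output, not something stated in this thread. `TimingPair`, `RelativeChange`, and `GeomeanChange` are hypothetical names used only for illustration.

```cpp
#include <cmath>
#include <vector>

// One row of the comparison table: old and new wall time in nanoseconds.
// (Hypothetical helper type, not part of FTXUI or Google Benchmark.)
struct TimingPair {
  double old_ns;
  double new_ns;
};

// Relative change as shown in the "Time" column: negative means faster.
inline double RelativeChange(const TimingPair& row) {
  return (row.new_ns - row.old_ns) / row.old_ns;
}

// Geometric mean of the new/old ratios, minus one.
// Computed in log space so long tables do not overflow or underflow.
// Assumes a non-empty table with strictly positive timings.
inline double GeomeanChange(const std::vector<TimingPair>& rows) {
  double log_sum = 0.0;
  for (const TimingPair& row : rows) {
    log_sum += std::log(row.new_ns / row.old_ns);
  }
  return std::exp(log_sum / static_cast<double>(rows.size())) - 1.0;
}
```

Feeding `RelativeChange` the `BencharkBasic/0` row (7465 ns old, 7318 ns new) should land close to the -0.0197 printed in the table. A geometric mean is used rather than an arithmetic one so that a 2x slowdown and a 2x speedup cancel out.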

@ArthurSonzogni
Owner

Awesome!

This PR will definitely be merged. I am a bit busy this week, and I would like to explore some alternative ways to express the same idea with the same performance benefits. It will take me some time. Thank you for your patience.

@StefanRvO
Contributor

FYI, after rebasing on top of #713, I measure that the two changes together improve the benchmark results by ~36%.

@ArthurSonzogni
Owner

I pushed some changes.

I wanted to get the same performance improvement, but without causing a large breaking change.

I managed to get an additional 2.5% on top of the previous performance improvement. See details:

RUNNING: ../../../ftxui-benchmark-4.0.0new --benchmark_out=/tmp/tmp7xa1pfhu
2023-08-05T16:52:41+02:00
Running ../../../ftxui-benchmark-4.0.0new
Run on (16 X 4784.38 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 0.62, 1.47, 0.93
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BencharkBasic/0              896 ns          895 ns       779008
BencharkBasic/16           31396 ns        31338 ns        22412
BencharkBasic/32           58456 ns        58403 ns        11748
BencharkBasic/48           98401 ns        98323 ns         7105
BencharkBasic/64          136861 ns       136762 ns         5120
BencharkBasic/80          173237 ns       173118 ns         4046
BencharkBasic/96          209190 ns       209052 ns         3349
BencharkBasic/112         246510 ns       246360 ns         2842
BencharkBasic/128         282502 ns       282334 ns         2478
BencharkBasic/144         322656 ns       322480 ns         2177
BencharkBasic/160         359378 ns       359197 ns         1955
BencharkBasic/176         395286 ns       395099 ns         1758
BencharkBasic/192         426039 ns       425571 ns         1644
BencharkBasic/208         474770 ns       474077 ns         1477
BencharkBasic/224         507986 ns       507498 ns         1370
BencharkBasic/240         534813 ns       534326 ns         1306
BencharkBasic/256         573306 ns       573029 ns         1205
BencharkText/0           1135108 ns      1134690 ns          618
BencharkText/1           1137683 ns      1136998 ns          615
BencharkText/2           1133314 ns      1132970 ns          615
BencharkText/3           1138201 ns      1137893 ns          615
BencharkText/4           1145515 ns      1145198 ns          611
BencharkText/5           1154433 ns      1154098 ns          607
BencharkText/6           1150365 ns      1150071 ns          609
BencharkText/7           1222571 ns      1222222 ns          573
BencharkText/8           1309002 ns      1308680 ns          535
BencharkText/9           1245396 ns      1245136 ns          563
BencharkText/10          1654822 ns      1654483 ns          423
BenchmarkStyle/1/10         7340 ns         7338 ns        95284
BenchmarkStyle/4/10        17663 ns        17659 ns        39634
BenchmarkStyle/7/10        25520 ns        25515 ns        27416
BenchmarkStyle/10/10       32741 ns        32735 ns        21379
BenchmarkStyle/1/30        28981 ns        28977 ns        24143
BenchmarkStyle/4/30        48910 ns        48894 ns        14286
BenchmarkStyle/7/30        64869 ns        64858 ns        10770
BenchmarkStyle/10/30       81745 ns        81727 ns         8553
BenchmarkStyle/1/50        67838 ns        67823 ns        10301
BenchmarkStyle/4/50        93850 ns        93835 ns         7449
BenchmarkStyle/7/50       120960 ns       120941 ns         5817
BenchmarkStyle/10/50      145432 ns       145409 ns         4869
BenchmarkStyle/1/70       127017 ns       126997 ns         5514
BenchmarkStyle/4/70       159732 ns       159689 ns         4311
BenchmarkStyle/7/70       187992 ns       187876 ns         3716
BenchmarkStyle/10/70      220995 ns       220971 ns         3165
BenchmarkStyle/1/90       202850 ns       202835 ns         3457
BenchmarkStyle/4/90       239642 ns       239598 ns         2922
BenchmarkStyle/7/90       274152 ns       274127 ns         2550
BenchmarkStyle/10/90      308259 ns       308235 ns         2292
BenchmarkStyle/1/110      296100 ns       296083 ns         2364
BenchmarkStyle/4/110      339569 ns       339522 ns         2067
BenchmarkStyle/7/110      382001 ns       382057 ns         1831
BenchmarkStyle/10/110     422703 ns       423049 ns         1654
BenchmarkStyle/1/130      406635 ns       406967 ns         1721
BenchmarkStyle/4/130      455198 ns       455553 ns         1537
BenchmarkStyle/7/130      502563 ns       502946 ns         1387
BenchmarkStyle/10/130     550151 ns       550556 ns         1273
BenchmarkStyle/1/150      534260 ns       534652 ns         1329
BenchmarkStyle/4/150      597867 ns       598149 ns         1187
BenchmarkStyle/7/150      645176 ns       644847 ns         1095
BenchmarkStyle/10/150     696525 ns       696493 ns          989
BenchmarkStyle/1/170      674784 ns       675042 ns         1041
BenchmarkStyle/4/170      737893 ns       738354 ns          948
BenchmarkStyle/7/170      796882 ns       797382 ns          892
BenchmarkStyle/10/170     852761 ns       853294 ns          830
BenchmarkStyle/1/190      847716 ns       848197 ns          837
BenchmarkStyle/4/190      911023 ns       911579 ns          771
BenchmarkStyle/7/190      974989 ns       975495 ns          722
BenchmarkStyle/10/190    1043540 ns      1044120 ns          671
RUNNING: ../../../ftxui-benchmark-4.0.0newnew --benchmark_out=/tmp/tmpn0varima
2023-08-05T16:53:42+02:00
Running ../../../ftxui-benchmark-4.0.0newnew
Run on (16 X 4784.38 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 0.86, 1.38, 0.94
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BencharkBasic/0              855 ns          855 ns       817978
BencharkBasic/16           30625 ns        30642 ns        22846
BencharkBasic/32           58598 ns        58629 ns        11936
BencharkBasic/48           97162 ns        97213 ns         7199
BencharkBasic/64          135953 ns       136023 ns         5155
BencharkBasic/80          172212 ns       172296 ns         4061
BencharkBasic/96          208420 ns       208519 ns         3360
BencharkBasic/112         245834 ns       245948 ns         2847
BencharkBasic/128         273781 ns       273905 ns         2552
BencharkBasic/144         316991 ns       317138 ns         2198
BencharkBasic/160         346441 ns       346593 ns         1980
BencharkBasic/176         388961 ns       389125 ns         1798
BencharkBasic/192         424468 ns       424627 ns         1648
BencharkBasic/208         462705 ns       462878 ns         1512
BencharkBasic/224         497541 ns       497723 ns         1409
BencharkBasic/240         527948 ns       528136 ns         1309
BencharkBasic/256         567338 ns       567522 ns         1233
BencharkText/0           1114830 ns      1115223 ns          628
BencharkText/1           1081216 ns      1081603 ns          637
BencharkText/2           1108655 ns      1109033 ns          631
BencharkText/3           1099759 ns      1100119 ns          629
BencharkText/4           1109606 ns      1109955 ns          624
BencharkText/5           1127471 ns      1127806 ns          618
BencharkText/6           1147775 ns      1148130 ns          609
BencharkText/7           1165137 ns      1165500 ns          587
BencharkText/8           1256542 ns      1256916 ns          549
BencharkText/9           1197070 ns      1197410 ns          585
BencharkText/10          1587300 ns      1587712 ns          439
BenchmarkStyle/1/10         7261 ns         7263 ns        96530
BenchmarkStyle/4/10        17482 ns        17487 ns        40172
BenchmarkStyle/7/10        25284 ns        25291 ns        27700
BenchmarkStyle/10/10       32039 ns        32048 ns        21840
BenchmarkStyle/1/30        28458 ns        28466 ns        24546
BenchmarkStyle/4/30        48113 ns        48124 ns        14565
BenchmarkStyle/7/30        64400 ns        64415 ns        10888
BenchmarkStyle/10/30       80205 ns        80225 ns         8740
BenchmarkStyle/1/50        67078 ns        67095 ns        10427
BenchmarkStyle/4/50        91410 ns        91433 ns         7653
BenchmarkStyle/7/50       115568 ns       115595 ns         6061
BenchmarkStyle/10/50      136407 ns       136438 ns         5129
BenchmarkStyle/1/70       121865 ns       121894 ns         5743
BenchmarkStyle/4/70       151877 ns       151909 ns         4602
BenchmarkStyle/7/70       182333 ns       182374 ns         3865
BenchmarkStyle/10/70      210661 ns       210706 ns         3320
BenchmarkStyle/1/90       198175 ns       198217 ns         3536
BenchmarkStyle/4/90       232899 ns       232940 ns         3022
BenchmarkStyle/7/90       266834 ns       266884 ns         2633
BenchmarkStyle/10/90      295164 ns       295220 ns         2345
BenchmarkStyle/1/110      285371 ns       285428 ns         2452
BenchmarkStyle/4/110      327323 ns       327381 ns         2139
BenchmarkStyle/7/110      370196 ns       370262 ns         1903
BenchmarkStyle/10/110     404047 ns       404122 ns         1705
BenchmarkStyle/1/130      397629 ns       397705 ns         1760
BenchmarkStyle/4/130      444378 ns       444462 ns         1575
BenchmarkStyle/7/130      490318 ns       490398 ns         1429
BenchmarkStyle/10/130     527550 ns       527619 ns         1320
BenchmarkStyle/1/150      519871 ns       519958 ns         1343
BenchmarkStyle/4/150      576687 ns       576772 ns         1223
BenchmarkStyle/7/150      625753 ns       625854 ns         1125
BenchmarkStyle/10/150     681192 ns       681293 ns         1028
BenchmarkStyle/1/170      666201 ns       666309 ns         1050
BenchmarkStyle/4/170      722021 ns       722138 ns          968
BenchmarkStyle/7/170      763110 ns       763224 ns          908
BenchmarkStyle/10/170     830128 ns       830260 ns          843
BenchmarkStyle/1/190      826855 ns       826986 ns          847
BenchmarkStyle/4/190      889685 ns       889807 ns          786
BenchmarkStyle/7/190      950674 ns       950823 ns          734
BenchmarkStyle/10/190    1002979 ns      1003120 ns          693
Comparing ../../../ftxui-benchmark-4.0.0new to ../../../ftxui-benchmark-4.0.0newnew
Benchmark                               Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------
BencharkBasic/0                      -0.0467         -0.0443           896           855           895           855
BencharkBasic/16                     -0.0245         -0.0222         31396         30625         31338         30642
BencharkBasic/32                     +0.0024         +0.0039         58456         58598         58403         58629
BencharkBasic/48                     -0.0126         -0.0113         98401         97162         98323         97213
BencharkBasic/64                     -0.0066         -0.0054        136861        135953        136762        136023
BencharkBasic/80                     -0.0059         -0.0047        173237        172212        173118        172296
BencharkBasic/96                     -0.0037         -0.0025        209190        208420        209052        208519
BencharkBasic/112                    -0.0027         -0.0017        246510        245834        246360        245948
BencharkBasic/128                    -0.0309         -0.0299        282502        273781        282334        273905
BencharkBasic/144                    -0.0176         -0.0166        322656        316991        322480        317138
BencharkBasic/160                    -0.0360         -0.0351        359378        346441        359197        346593
BencharkBasic/176                    -0.0160         -0.0151        395286        388961        395099        389125
BencharkBasic/192                    -0.0037         -0.0022        426039        424468        425571        424627
BencharkBasic/208                    -0.0254         -0.0236        474770        462705        474077        462878
BencharkBasic/224                    -0.0206         -0.0193        507986        497541        507498        497723
BencharkBasic/240                    -0.0128         -0.0116        534813        527948        534326        528136
BencharkBasic/256                    -0.0104         -0.0096        573306        567338        573029        567522
BencharkText/0                       -0.0179         -0.0172       1135108       1114830       1134690       1115223
BencharkText/1                       -0.0496         -0.0487       1137683       1081216       1136998       1081603
BencharkText/2                       -0.0218         -0.0211       1133314       1108655       1132970       1109033
BencharkText/3                       -0.0338         -0.0332       1138201       1099759       1137893       1100119
BencharkText/4                       -0.0313         -0.0308       1145515       1109606       1145198       1109955
BencharkText/5                       -0.0234         -0.0228       1154433       1127471       1154098       1127806
BencharkText/6                       -0.0023         -0.0017       1150365       1147775       1150071       1148130
BencharkText/7                       -0.0470         -0.0464       1222571       1165137       1222222       1165500
BencharkText/8                       -0.0401         -0.0396       1309002       1256542       1308680       1256916
BencharkText/9                       -0.0388         -0.0383       1245396       1197070       1245136       1197410
BencharkText/10                      -0.0408         -0.0404       1654822       1587300       1654483       1587712
BenchmarkStyle/1/10                  -0.0107         -0.0103          7340          7261          7338          7263
BenchmarkStyle/4/10                  -0.0102         -0.0097         17663         17482         17659         17487
BenchmarkStyle/7/10                  -0.0092         -0.0088         25520         25284         25515         25291
BenchmarkStyle/10/10                 -0.0214         -0.0210         32741         32039         32735         32048
BenchmarkStyle/1/30                  -0.0180         -0.0176         28981         28458         28977         28466
BenchmarkStyle/4/30                  -0.0163         -0.0157         48910         48113         48894         48124
BenchmarkStyle/7/30                  -0.0072         -0.0068         64869         64400         64858         64415
BenchmarkStyle/10/30                 -0.0188         -0.0184         81745         80205         81727         80225
BenchmarkStyle/1/50                  -0.0112         -0.0107         67838         67078         67823         67095
BenchmarkStyle/4/50                  -0.0260         -0.0256         93850         91410         93835         91433
BenchmarkStyle/7/50                  -0.0446         -0.0442        120960        115568        120941        115595
BenchmarkStyle/10/50                 -0.0621         -0.0617        145432        136407        145409        136438
BenchmarkStyle/1/70                  -0.0406         -0.0402        127017        121865        126997        121894
BenchmarkStyle/4/70                  -0.0492         -0.0487        159732        151877        159689        151909
BenchmarkStyle/7/70                  -0.0301         -0.0293        187992        182333        187876        182374
BenchmarkStyle/10/70                 -0.0468         -0.0465        220995        210661        220971        210706
BenchmarkStyle/1/90                  -0.0230         -0.0228        202850        198175        202835        198217
BenchmarkStyle/4/90                  -0.0281         -0.0278        239642        232899        239598        232940
BenchmarkStyle/7/90                  -0.0267         -0.0264        274152        266834        274127        266884
BenchmarkStyle/10/90                 -0.0425         -0.0422        308259        295164        308235        295220
BenchmarkStyle/1/110                 -0.0362         -0.0360        296100        285371        296083        285428
BenchmarkStyle/4/110                 -0.0361         -0.0358        339569        327323        339522        327381
BenchmarkStyle/7/110                 -0.0309         -0.0309        382001        370196        382057        370262
BenchmarkStyle/10/110                -0.0441         -0.0447        422703        404047        423049        404122
BenchmarkStyle/1/130                 -0.0221         -0.0228        406635        397629        406967        397705
BenchmarkStyle/4/130                 -0.0238         -0.0243        455198        444378        455553        444462
BenchmarkStyle/7/130                 -0.0244         -0.0249        502563        490318        502946        490398
BenchmarkStyle/10/130                -0.0411         -0.0417        550151        527550        550556        527619
BenchmarkStyle/1/150                 -0.0269         -0.0275        534260        519871        534652        519958
BenchmarkStyle/4/150                 -0.0354         -0.0357        597867        576687        598149        576772
BenchmarkStyle/7/150                 -0.0301         -0.0295        645176        625753        644847        625854
BenchmarkStyle/10/150                -0.0220         -0.0218        696525        681192        696493        681293
BenchmarkStyle/1/170                 -0.0127         -0.0129        674784        666201        675042        666309
BenchmarkStyle/4/170                 -0.0215         -0.0220        737893        722021        738354        722138
BenchmarkStyle/7/170                 -0.0424         -0.0428        796882        763110        797382        763224
BenchmarkStyle/10/170                -0.0265         -0.0270        852761        830128        853294        830260
BenchmarkStyle/1/190                 -0.0246         -0.0250        847716        826855        848197        826986
BenchmarkStyle/4/190                 -0.0234         -0.0239        911023        889685        911579        889807
BenchmarkStyle/7/190                 -0.0249         -0.0253        974989        950674        975495        950823
BenchmarkStyle/10/190                -0.0389         -0.0393       1043540       1002979       1044120       1003120
OVERALL_GEOMEAN                      -0.0259         -0.0254             0             0             0             0

If you wish, feel free to take a look.

@ArthurSonzogni
Owner

Too bad: the latest change I made, which uses a union and an anonymous struct, does not comply with a warning I enforce:

anonymous types declared in an anonymous union are an extension

I will find another way.
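For context, one possible way to keep the flags packed without an anonymous struct inside an anonymous union is a single integer field with named accessors. This is only a sketch under that assumption; it is not the code that landed in this PR, and `StyleBit`, `PixelStyle`, `StyleChanged`, and `ChangedBits` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical style flags, one bit each. Names are illustrative only.
enum StyleBit : std::uint8_t {
  kBold = 1u << 0,
  kDim = 1u << 1,
  kInverted = 1u << 2,
  kUnderlined = 1u << 3,
  kBlink = 1u << 4,
};

// All flags live in one named byte, so no anonymous union or
// anonymous struct is needed.
struct PixelStyle {
  std::uint8_t bits = 0;

  bool Has(StyleBit flag) const { return (bits & flag) != 0; }

  void Set(StyleBit flag, bool on) {
    bits = on ? static_cast<std::uint8_t>(bits | flag)
              : static_cast<std::uint8_t>(bits & ~flag);
  }
};

// True when any flag differs: one integer compare instead of one per flag.
inline bool StyleChanged(PixelStyle previous, PixelStyle next) {
  return previous.bits != next.bits;
}

// Bit mask of the flags whose value differs between the two styles.
inline std::uint8_t ChangedBits(PixelStyle previous, PixelStyle next) {
  return static_cast<std::uint8_t>(previous.bits ^ next.bits);
}
```

The trade-off is that callers go through `Has` and `Set` instead of plain boolean members, which would be a more visible API change than a bitfield struct.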

@ArthurSonzogni
Owner

I removed the errors and landed additional tweaks.
I am now getting a ~18% performance improvement.

If this compiles, I think this is ready to ship. Could you please take a look, if you are willing?

RUNNING: ../../../ftxui-benchmark-4.0.0old --benchmark_out=/tmp/tmpwdbsrhp_
2023-08-06T13:10:23+02:00
Running ../../../ftxui-benchmark-4.0.0old
Run on (16 X 4784.38 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 0.56, 0.92, 0.61
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BencharkBasic/0              911 ns          911 ns       768497
BencharkBasic/16           35484 ns        35497 ns        19726
BencharkBasic/32           67752 ns        67776 ns        10377
BencharkBasic/48          110905 ns       110945 ns         6307
BencharkBasic/64          153428 ns       153481 ns         4548
BencharkBasic/80          194479 ns       194545 ns         3599
BencharkBasic/96          234711 ns       234790 ns         2984
BencharkBasic/112         275944 ns       276042 ns         2536
BencharkBasic/128         316548 ns       316660 ns         2212
BencharkBasic/144         361788 ns       361916 ns         1937
BencharkBasic/160         398733 ns       398866 ns         1740
BencharkBasic/176         439789 ns       439946 ns         1590
BencharkBasic/192         479783 ns       479951 ns         1459
BencharkBasic/208         521774 ns       521960 ns         1333
BencharkBasic/224         560303 ns       560495 ns         1245
BencharkBasic/240         602349 ns       602561 ns         1160
BencharkBasic/256         640668 ns       640898 ns         1091
BencharkText/0           1244609 ns      1245014 ns          561
BencharkText/1           1245577 ns      1246014 ns          561
BencharkText/2           1247203 ns      1247627 ns          561
BencharkText/3           1252097 ns      1252539 ns          559
BencharkText/4           1257367 ns      1257815 ns          557
BencharkText/5           1266390 ns      1266823 ns          553
BencharkText/6           1297606 ns      1298065 ns          539
BencharkText/7           1338284 ns      1338734 ns          524
BencharkText/8           1425898 ns      1426370 ns          492
BencharkText/9           1364798 ns      1365246 ns          514
BencharkText/10          1776927 ns      1777518 ns          395
BenchmarkStyle/1/10         7883 ns         7886 ns        88498
BenchmarkStyle/4/10        18463 ns        18469 ns        37934
BenchmarkStyle/7/10        26441 ns        26450 ns        26494
BenchmarkStyle/10/10       33955 ns        33967 ns        20642
BenchmarkStyle/1/30        32493 ns        32504 ns        21529
BenchmarkStyle/4/30        52728 ns        52746 ns        13223
BenchmarkStyle/7/30        69512 ns        69536 ns        10050
BenchmarkStyle/10/30       86308 ns        86337 ns         8107
BenchmarkStyle/1/50        77149 ns        77175 ns         9075
BenchmarkStyle/4/50       103041 ns       103077 ns         6738
BenchmarkStyle/7/50       129292 ns       129335 ns         5394
BenchmarkStyle/10/50      155908 ns       155961 ns         4486
BenchmarkStyle/1/70       143448 ns       143492 ns         4872
BenchmarkStyle/4/70       175020 ns       175079 ns         3995
BenchmarkStyle/7/70       206399 ns       206469 ns         3387
BenchmarkStyle/10/70      236667 ns       236748 ns         2951
BenchmarkStyle/1/90       229554 ns       229622 ns         3049
BenchmarkStyle/4/90       265978 ns       266072 ns         2633
BenchmarkStyle/7/90       302483 ns       302586 ns         2315
BenchmarkStyle/10/90      337726 ns       337844 ns         2074
BenchmarkStyle/1/110      336183 ns       336292 ns         2081
BenchmarkStyle/4/110      379151 ns       379275 ns         1845
BenchmarkStyle/7/110      422570 ns       422712 ns         1656
BenchmarkStyle/10/110     468247 ns       468410 ns         1492
BenchmarkStyle/1/130      464514 ns       464673 ns         1507
BenchmarkStyle/4/130      513565 ns       513742 ns         1360
BenchmarkStyle/7/130      561801 ns       561983 ns         1244
BenchmarkStyle/10/130     610600 ns       610805 ns         1144
BenchmarkStyle/1/150      613898 ns       614089 ns         1143
BenchmarkStyle/4/150      667154 ns       667385 ns         1048
BenchmarkStyle/7/150      720175 ns       720423 ns          970
BenchmarkStyle/10/150     765714 ns       765973 ns          904
BenchmarkStyle/1/170      779955 ns       780229 ns          895
BenchmarkStyle/4/170      843697 ns       843977 ns          831
BenchmarkStyle/7/170      899474 ns       899781 ns          778
BenchmarkStyle/10/170     955999 ns       956334 ns          731
BenchmarkStyle/1/190      968127 ns       968460 ns          723
BenchmarkStyle/4/190     1034276 ns      1034630 ns          675
BenchmarkStyle/7/190     1094213 ns      1094583 ns          636
BenchmarkStyle/10/190    1159374 ns      1159776 ns          603
RUNNING: ../../../ftxui-benchmark-4.0.0newnew --benchmark_out=/tmp/tmplxetqg9d
2023-08-06T13:11:24+02:00
Running ../../../ftxui-benchmark-4.0.0newnew
Run on (16 X 4784.38 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 0.91, 0.95, 0.65
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BencharkBasic/0              852 ns          853 ns       822241
BencharkBasic/16           27118 ns        27127 ns        25788
BencharkBasic/32           51542 ns        51560 ns        13565
BencharkBasic/48           86875 ns        86902 ns         8060
BencharkBasic/64          121597 ns       121635 ns         5755
BencharkBasic/80          154492 ns       154543 ns         4534
BencharkBasic/96          187337 ns       187402 ns         3734
BencharkBasic/112         220394 ns       220470 ns         3175
BencharkBasic/128         253011 ns       253095 ns         2764
BencharkBasic/144         282067 ns       282155 ns         2457
BencharkBasic/160         322349 ns       322452 ns         2173
BencharkBasic/176         355575 ns       355689 ns         1968
BencharkBasic/192         387652 ns       387780 ns         1805
BencharkBasic/208         422977 ns       423114 ns         1655
BencharkBasic/224         453466 ns       453619 ns         1542
BencharkBasic/240         488271 ns       488428 ns         1433
BencharkBasic/256         518883 ns       519058 ns         1348
BencharkText/0           1012816 ns      1013143 ns          690
BencharkText/1           1013479 ns      1013816 ns          690
BencharkText/2           1014736 ns      1015080 ns          690
BencharkText/3           1019297 ns      1019629 ns          686
BencharkText/4           1026880 ns      1027209 ns          681
BencharkText/5           1034830 ns      1035161 ns          676
BencharkText/6           1056471 ns      1056829 ns          662
BencharkText/7           1101588 ns      1101941 ns          636
BencharkText/8           1163902 ns      1164296 ns          596
BencharkText/9           1119182 ns      1119562 ns          626
BencharkText/10          1518824 ns      1519307 ns          460
BenchmarkStyle/1/10         7029 ns         7032 ns        99863
BenchmarkStyle/4/10        17223 ns        17228 ns        40713
BenchmarkStyle/7/10        24805 ns        24813 ns        28215
BenchmarkStyle/10/10       32376 ns        32386 ns        21605
BenchmarkStyle/1/30        26885 ns        26895 ns        26014
BenchmarkStyle/4/30        45969 ns        45984 ns        15229
BenchmarkStyle/7/30        62532 ns        62553 ns        11187
BenchmarkStyle/10/30       78470 ns        78497 ns         8931
BenchmarkStyle/1/50        62169 ns        62190 ns        11389
BenchmarkStyle/4/50        86263 ns        86291 ns         8118
BenchmarkStyle/7/50       109061 ns       109096 ns         6419
BenchmarkStyle/10/50      134509 ns       134554 ns         5205
BenchmarkStyle/1/70       113427 ns       113465 ns         6169
BenchmarkStyle/4/70       142471 ns       142519 ns         4918
BenchmarkStyle/7/70       171679 ns       171734 ns         4078
BenchmarkStyle/10/70      201516 ns       201578 ns         3473
BenchmarkStyle/1/90       182671 ns       182731 ns         3861
BenchmarkStyle/4/90       216773 ns       216843 ns         3227
BenchmarkStyle/7/90       250076 ns       250161 ns         2798
BenchmarkStyle/10/90      283477 ns       283564 ns         2468
BenchmarkStyle/1/110      258862 ns       258946 ns         2625
BenchmarkStyle/4/110      304000 ns       304102 ns         2300
BenchmarkStyle/7/110      342803 ns       342919 ns         2049
BenchmarkStyle/10/110     382510 ns       382621 ns         1829
BenchmarkStyle/1/130      361343 ns       361465 ns         1936
BenchmarkStyle/4/130      405703 ns       405840 ns         1730
BenchmarkStyle/7/130      449392 ns       449539 ns         1559
BenchmarkStyle/10/130     492745 ns       492909 ns         1420
BenchmarkStyle/1/150      464601 ns       464755 ns         1506
BenchmarkStyle/4/150      528503 ns       528673 ns         1345
BenchmarkStyle/7/150      577352 ns       577539 ns         1211
BenchmarkStyle/10/150     626990 ns       627174 ns         1115
BenchmarkStyle/1/170      605700 ns       605903 ns         1154
BenchmarkStyle/4/170      659115 ns       659331 ns         1062
BenchmarkStyle/7/170      713547 ns       713785 ns          982
BenchmarkStyle/10/170     767872 ns       768115 ns          912
BenchmarkStyle/1/190      752316 ns       752572 ns          930
BenchmarkStyle/4/190      811797 ns       812060 ns          876
BenchmarkStyle/7/190      869947 ns       870232 ns          804
BenchmarkStyle/10/190     919729 ns       920026 ns          766
Comparing ../../../ftxui-benchmark-4.0.0old to ../../../ftxui-benchmark-4.0.0newnew
Benchmark                               Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------
BencharkBasic/0                      -0.0642         -0.0642           911           852           911           853
BencharkBasic/16                     -0.2358         -0.2358         35484         27118         35497         27127
BencharkBasic/32                     -0.2392         -0.2393         67752         51542         67776         51560
BencharkBasic/48                     -0.2167         -0.2167        110905         86875        110945         86902
BencharkBasic/64                     -0.2075         -0.2075        153428        121597        153481        121635
BencharkBasic/80                     -0.2056         -0.2056        194479        154492        194545        154543
BencharkBasic/96                     -0.2018         -0.2018        234711        187337        234790        187402
BencharkBasic/112                    -0.2013         -0.2013        275944        220394        276042        220470
BencharkBasic/128                    -0.2007         -0.2007        316548        253011        316660        253095
BencharkBasic/144                    -0.2204         -0.2204        361788        282067        361916        282155
BencharkBasic/160                    -0.1916         -0.1916        398733        322349        398866        322452
BencharkBasic/176                    -0.1915         -0.1915        439789        355575        439946        355689
BencharkBasic/192                    -0.1920         -0.1920        479783        387652        479951        387780
BencharkBasic/208                    -0.1893         -0.1894        521774        422977        521960        423114
BencharkBasic/224                    -0.1907         -0.1907        560303        453466        560495        453619
BencharkBasic/240                    -0.1894         -0.1894        602349        488271        602561        488428
BencharkBasic/256                    -0.1901         -0.1901        640668        518883        640898        519058
BencharkText/0                       -0.1862         -0.1862       1244609       1012816       1245014       1013143
BencharkText/1                       -0.1863         -0.1864       1245577       1013479       1246014       1013816
BencharkText/2                       -0.1864         -0.1864       1247203       1014736       1247627       1015080
BencharkText/3                       -0.1859         -0.1859       1252097       1019297       1252539       1019629
BencharkText/4                       -0.1833         -0.1833       1257367       1026880       1257815       1027209
BencharkText/5                       -0.1829         -0.1829       1266390       1034830       1266823       1035161
BencharkText/6                       -0.1858         -0.1858       1297606       1056471       1298065       1056829
BencharkText/7                       -0.1769         -0.1769       1338284       1101588       1338734       1101941
BencharkText/8                       -0.1837         -0.1837       1425898       1163902       1426370       1164296
BencharkText/9                       -0.1800         -0.1800       1364798       1119182       1365246       1119562
BencharkText/10                      -0.1453         -0.1453       1776927       1518824       1777518       1519307
BenchmarkStyle/1/10                  -0.1083         -0.1083          7883          7029          7886          7032
BenchmarkStyle/4/10                  -0.0672         -0.0672         18463         17223         18469         17228
BenchmarkStyle/7/10                  -0.0619         -0.0619         26441         24805         26450         24813
BenchmarkStyle/10/10                 -0.0465         -0.0465         33955         32376         33967         32386
BenchmarkStyle/1/30                  -0.1726         -0.1726         32493         26885         32504         26895
BenchmarkStyle/4/30                  -0.1282         -0.1282         52728         45969         52746         45984
BenchmarkStyle/7/30                  -0.1004         -0.1004         69512         62532         69536         62553
BenchmarkStyle/10/30                 -0.0908         -0.0908         86308         78470         86337         78497
BenchmarkStyle/1/50                  -0.1942         -0.1942         77149         62169         77175         62190
BenchmarkStyle/4/50                  -0.1628         -0.1629        103041         86263        103077         86291
BenchmarkStyle/7/50                  -0.1565         -0.1565        129292        109061        129335        109096
BenchmarkStyle/10/50                 -0.1373         -0.1373        155908        134509        155961        134554
BenchmarkStyle/1/70                  -0.2093         -0.2093        143448        113427        143492        113465
BenchmarkStyle/4/70                  -0.1860         -0.1860        175020        142471        175079        142519
BenchmarkStyle/7/70                  -0.1682         -0.1682        206399        171679        206469        171734
BenchmarkStyle/10/70                 -0.1485         -0.1486        236667        201516        236748        201578
BenchmarkStyle/1/90                  -0.2042         -0.2042        229554        182671        229622        182731
BenchmarkStyle/4/90                  -0.1850         -0.1850        265978        216773        266072        216843
BenchmarkStyle/7/90                  -0.1733         -0.1733        302483        250076        302586        250161
BenchmarkStyle/10/90                 -0.1606         -0.1607        337726        283477        337844        283564
BenchmarkStyle/1/110                 -0.2300         -0.2300        336183        258862        336292        258946
BenchmarkStyle/4/110                 -0.1982         -0.1982        379151        304000        379275        304102
BenchmarkStyle/7/110                 -0.1888         -0.1888        422570        342803        422712        342919
BenchmarkStyle/10/110                -0.1831         -0.1831        468247        382510        468410        382621
BenchmarkStyle/1/130                 -0.2221         -0.2221        464514        361343        464673        361465
BenchmarkStyle/4/130                 -0.2100         -0.2100        513565        405703        513742        405840
BenchmarkStyle/7/130                 -0.2001         -0.2001        561801        449392        561983        449539
BenchmarkStyle/10/130                -0.1930         -0.1930        610600        492745        610805        492909
BenchmarkStyle/1/150                 -0.2432         -0.2432        613898        464601        614089        464755
BenchmarkStyle/4/150                 -0.2078         -0.2078        667154        528503        667385        528673
BenchmarkStyle/7/150                 -0.1983         -0.1983        720175        577352        720423        577539
BenchmarkStyle/10/150                -0.1812         -0.1812        765714        626990        765973        627174
BenchmarkStyle/1/170                 -0.2234         -0.2234        779955        605700        780229        605903
BenchmarkStyle/4/170                 -0.2188         -0.2188        843697        659115        843977        659331
BenchmarkStyle/7/170                 -0.2067         -0.2067        899474        713547        899781        713785
BenchmarkStyle/10/170                -0.1968         -0.1968        955999        767872        956334        768115
BenchmarkStyle/1/190                 -0.2229         -0.2229        968127        752316        968460        752572
BenchmarkStyle/4/190                 -0.2151         -0.2151       1034276        811797       1034630        812060
BenchmarkStyle/7/190                 -0.2050         -0.2050       1094213        869947       1094583        870232
BenchmarkStyle/10/190                -0.2067         -0.2067       1159374        919729       1159776        920026
OVERALL_GEOMEAN                      -0.1823         -0.1823             0             0             0             0
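
As a reading aid for the comparison table above, here is a minimal sketch of how the `Time` and `CPU` columns appear to be derived from the old and new timings. The formula `(new - old) / |old|` is an assumption that matches the rows shown; the function names are hypothetical and not part of the benchmark tooling.

```cpp
#include <cassert>
#include <cmath>

// Relative change between two timings: negative means the new build is faster.
// Assumed formula: (new - old) / |old|.
double RelativeChange(double old_ns, double new_ns) {
  assert(old_ns != 0.0);  // a zero baseline has no meaningful ratio
  return (new_ns - old_ns) / std::fabs(old_ns);
}

// Speedup factor implied by the same pair of timings.
double Speedup(double old_ns, double new_ns) {
  assert(new_ns != 0.0);  // avoid dividing by zero
  return old_ns / new_ns;
}
```

For the BencharkBasic/16 row, `RelativeChange(35484, 27118)` is about -0.2358, which is roughly 24% less time per iteration, or a speedup of about 1.31x.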

@clement-roblot
Contributor Author

Dang, those __builtin_expect hints are powerful! The code itself looks good to me, and you managed to keep it more readable than what I first implemented 🎉
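
For readers unfamiliar with the builtin, here is a rough sketch of how `__builtin_expect` (a GCC/Clang extension) can hint that a style change between neighbouring cells is rare. The macro names, the `Cell` struct and `StyleTransition` are hypothetical and simplified for illustration; they are not the code from this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical macros; they fall back to a plain expression on other compilers.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Hypothetical, simplified cell with only two style flags.
struct Cell {
  bool bold = false;
  bool dim = false;
};

// Returns the escape sequences needed to go from `prev` to `next`.
std::string StyleTransition(const Cell& prev, const Cell& next) {
  std::string out;
  // Neighbouring cells usually share a style, so a change is hinted as rare.
  if (SKETCH_UNLIKELY(next.bold != prev.bold || next.dim != prev.dim)) {
    // SGR 22 clears both bold and dim, so it is emitted when either is dropped.
    if ((prev.bold && !next.bold) || (prev.dim && !next.dim)) {
      out += "\x1B[22m";
    }
    // Then (re-)enable every flag that `next` has set.
    if (next.bold) out += "\x1B[1m";
    if (next.dim) out += "\x1B[2m";
  }
  return out;
}

// Usage: identical neighbours take the likely path and emit nothing.
void DemoStyleTransition() {
  Cell a, b;
  assert(StyleTransition(a, b).empty());
}
```

The hint does not change behaviour; it only tells the compiler which branch to lay out as the fall-through path, so any gain depends on how often styles really differ between adjacent cells.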

As for the failing tests, it looks to me like the bit fields for the style are left uninitialized, so they sometimes hold garbage values and break the tests. Forcing a proper initialisation of the Pixel struct everywhere it is used ends up lowering the performance gains from 12% to 6% on my machine. I also tried removing the bitfield from Pixel and replacing it with a bunch of bools that are initialized to 0. That last solution is the one I committed, and it ends up doing the same regarding performance.

The tests are passing now, at least on my machine.
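
To illustrate the initialization issue, here is a minimal sketch assuming the failures come from default-initialized bit-fields. Both structs are hypothetical stand-ins, not the actual Pixel definition.

```cpp
#include <cassert>

// Hypothetical bit-field layout. Before C++20, bit-fields cannot carry
// default member initializers, so `BitfieldStyle s;` leaves both flags
// with indeterminate values.
struct BitfieldStyle {
  bool bold : 1;
  bool dim : 1;
};

// Hypothetical plain-bool layout: the initializers apply to every object,
// however it is created.
struct BoolStyle {
  bool bold = false;
  bool dim = false;
};

// Value-initialization with `{}` zeroes the bit-fields, but every creation
// site has to remember to use it.
BitfieldStyle MakeBitfieldStyle() {
  return BitfieldStyle{};
}

// Plain default-initialization is already safe here.
BoolStyle MakeBoolStyle() {
  BoolStyle style;
  return style;
}

// Usage: both helpers return a style with every flag cleared.
void DemoStyleInit() {
  assert(!MakeBitfieldStyle().bold);
  assert(!MakeBoolStyle().dim);
}
```

The bool version trades a slightly larger struct for initialization that cannot be forgotten at a call site.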

@ArthurSonzogni ArthurSonzogni merged commit e2a205e into ArthurSonzogni:main Aug 7, 2023
7 checks passed