
migrate to core v3 #117

Draft – wants to merge 10 commits into master
Conversation

@bertsky (Contributor) commented Sep 16, 2024

Still a draft as long as v3 is in beta/RC, but we can already use the CI and discuss the changes (esp. to the tests).

@kba this closely resembles tests on OCR-D/ocrd_kraken#44 (covering variants with METS Caching and/or METS Server and/or parallel pages).

@bertsky (Contributor, Author) commented Sep 16, 2024

Oh, and this is based on #116, since I often cannot even run Calamari without that.

@bertsky (Contributor, Author) commented Sep 18, 2024

I wrote a simple script to measure and plot the GPU utilisation.

[GPU utilisation plot: ocrd-calamari-cuda-b32]
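The measurement script itself is not included in the PR; a minimal sketch of such a monitor – polling `nvidia-smi` at a fixed interval and logging samples to CSV for later plotting – could look like this (all names are illustrative, not the actual script):

```python
#!/usr/bin/env python3
# Hypothetical GPU utilisation monitor (illustrative, not the script used
# above): poll nvidia-smi periodically and append samples to a CSV file.
import csv
import subprocess
import sys
import time

def parse_sample(line):
    """Parse one nvidia-smi CSV line 'util, mem' into (int, int)."""
    util, mem = line.strip().split(", ")
    return int(util), int(mem)

def sample_gpu():
    """Query GPU utilisation [%] and memory use [MiB] via nvidia-smi."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used",
         "--format=csv,noheader,nounits"],
        text=True)
    return parse_sample(out)

def monitor(path, interval=0.5):
    """Append (time, util, mem) samples to a CSV until interrupted."""
    start = time.time()
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "util", "mem"])
        try:
            while True:
                util, mem = sample_gpu()
                writer.writerow([round(time.time() - start, 2), util, mem])
                time.sleep(interval)
        except KeyboardInterrupt:
            pass

if __name__ == "__main__":
    monitor(sys.argv[1] if len(sys.argv) > 1 else "gpu-util.csv")
```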

The rather simple modification 9611e2c (which I will cherry-pick into #116 for core v2) helps in two ways: it utilises the GPU better (because it avoids overly small batches when regions have only a few lines), and it thereby also allows increasing the batch size without causing OOM.
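The gist of 9611e2c can be sketched roughly as follows (a hypothetical rendering – `Image`, `predict_batch` and the function names stand in for the actual Calamari types and calls): instead of predicting region by region, flatten all line images of a page into one batch and regroup afterwards.

```python
# Hypothetical sketch of page-level batching: gather all line images of a
# page into a single batch (avoiding tiny per-region batches), predict
# once, then re-associate results with their regions.
from typing import Callable, Dict, List, Sequence

def recognize_page(regions: Dict[str, Sequence["Image"]],
                   predict_batch: Callable[[List["Image"]], List[str]]
                   ) -> Dict[str, List[str]]:
    # flatten: remember which region each line came from
    flat, owners = [], []
    for region_id, lines in regions.items():
        flat.extend(lines)
        owners.extend([region_id] * len(lines))
    # one large batch keeps the GPU busy and amortises per-call overhead
    texts = predict_batch(flat)
    # regroup the predictions per region, preserving line order
    result: Dict[str, List[str]] = {rid: [] for rid in regions}
    for region_id, text in zip(owners, texts):
        result[region_id].append(text)
    return result
```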

Unfortunately, fb2a680 does not accomplish what I expected – reducing the peaky GPU utilisation caused by the GPU waiting for the CPU and vice versa.

Here's a log for batch_size=64 without parallel page threads (but with METS Server) – i.e. before fb2a680:
[plot: ocrd-calamari-cuda-b64-B64-v3-p1]
And the same with 3 parallel page threads – still before fb2a680:
[plot: ocrd-calamari-cuda-b64-B64-v3-p3]
Now, after adding fb2a680 with ThreadPoolExecutor computing the predict_raw batches concurrently (shared across parallel page threads):
[plot: ocrd-calamari-cuda-b64-B64-v3-p3-bg]

Thus, surprisingly, the timeline still shows low average utilisation with lots of waiting time. This is also reflected by wall time and CPU time measurements (see below).
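For reference, the approach in fb2a680 can be sketched roughly like this (illustrative names, not the actual code): a `ThreadPoolExecutor` shared across the parallel page threads runs `predict_raw` in the background, so a page thread's CPU-side pre/post-processing can overlap with GPU prediction for other pages.

```python
# Hypothetical sketch of a shared background predictor: page threads
# submit their batches to one executor and only block when they need
# the result, overlapping CPU work with GPU prediction.
from concurrent.futures import ThreadPoolExecutor

# one executor for the whole processor, shared by all page threads
GPU_EXECUTOR = ThreadPoolExecutor(max_workers=1)

def predict_in_background(predict_raw, images):
    """Submit a batch to the shared GPU worker; returns a Future."""
    return GPU_EXECUTOR.submit(predict_raw, images)

def process_page(predict_raw, images, postprocess):
    future = predict_in_background(predict_raw, images)
    # ... the page thread could keep doing CPU work here ...
    raw = future.result()   # block only when the result is needed
    return postprocess(raw)
```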

It gets a little better if I split up the batches for the background thread (instead of having Calamari v1 do the batching), though:
[plot: ocrd-calamari-cuda-b64-B64-v3-p3-bgbatched]
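That refinement – pre-splitting the batches before handing them to the background executor, rather than letting Calamari v1 batch internally – might be sketched like this (again purely illustrative): chunks from different page threads can then interleave on the shared GPU worker.

```python
# Hypothetical sketch: split a page's lines into fixed-size chunks and
# submit each chunk separately to the shared executor, so work from
# different page threads can interleave at chunk granularity.
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

def chunked(seq, size):
    """Yield successive chunks of `size` items from `seq`."""
    it = iter(seq)
    while chunk := list(islice(it, size)):
        yield chunk

def predict_chunked(executor, predict_raw, images, batch_size=64):
    """Submit each chunk as its own task; gather results in order."""
    futures = [executor.submit(predict_raw, chunk)
               for chunk in chunked(images, batch_size)]
    results = []
    for fut in futures:
        results.extend(fut.result())
    return results
```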

Also, increasing the number of parallel page threads from 3 to 6 helps – but only in relative terms, not with respect to background threading. Here's with 6 threads before fb2a680:
[plot: ocrd-calamari-cuda-b64-B64-v3-p6]
And this is with 6 threads after fb2a680:
[plot: ocrd-calamari-cuda-b64-B64-v3-p6-bg]

I also tried with more than 1 background thread (i.e. the number of workers in the shared ThreadPoolExecutor), but that does not do much better either – here's the above with 2 "GPU" threads:
[plot: ocrd-calamari-cuda-b64-B64-v3-p6-bg2]
And the same with 4 bg threads:
[plot: ocrd-calamari-cuda-b64-B64-v3-p6-bg4]

Increasing the number of parallel page threads further, to 12 or 24, becomes still more inefficient.

Figures for a book with 180 pages of Fraktur:

| commit | OCRD_MAX_PARALLEL_PAGES | wall time | CPU time |
|---|---|---|---|
| bf755a3 (region-level batches) | 1 | 1148 s | 1082 s |
| bf755a3 (region-level batches) | 3 | 744 s | 1188 s |
| 9611e2c (page-level batches) | 1 | 1113 s | 1042 s |
| 9611e2c (page-level batches) | 3 | 698 s | 1105 s |
| fb2a680 (in 1 background thread) | 3 | 709 s | 1122 s |
| fb2a680 (in 1 background thread) | 6 | 665 s | 1178 s |
| fb2a680 (in 1 background thread) | 12 | 693 s | 1205 s |
| fb2a680 (in 2 background threads) | 6 | 660 s | 1169 s |
| fb2a680 (in 4 background threads) | 6 | 653 s | 1160 s |

Perhaps we must go for Calamari 2 with its efficient tfaip pipelining...

@codecov-commenter commented Sep 18, 2024

Codecov Report

Attention: Patch coverage is 85.38462% with 19 lines in your changes missing coverage. Please review.

Project coverage is 68.93%. Comparing base (4adf09f) to head (e68ce5f).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ocrd_calamari/recognize.py | 85.38% | 9 Missing and 10 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #117      +/-   ##
==========================================
- Coverage   71.07%   68.93%   -2.15%     
==========================================
  Files           5        4       -1     
  Lines         204      206       +2     
  Branches       50       55       +5     
==========================================
- Hits          145      142       -3     
- Misses         48       51       +3     
- Partials       11       13       +2     


@bertsky bertsky mentioned this pull request Sep 23, 2024
@mikegerber mikegerber self-assigned this Oct 7, 2024