Introduction to parameters in OCR-D

(as of writing this article, OCR-D/core is at 2.12.6, OCR-D/spec at 3.9.0)

The actual functionality of OCR-D is implemented in the form of processors: command line tools that adhere to the OCR-D CLI spec. For an overview of which processors are available and how to combine them into workflows, see the OCR-D workflow guide.

All OCR-D processors have the same command line interface, meaning they all support the same set of flags and options when invoked. However, processors can define processor-specific settings in their ocrd-tool.json, called parameters. When running a processor, users can specify these parameters with the -p and -P command line options.

Which parameters are supported by a processor?

To find out which parameters are supported by a processor, use the --help flag. For example, for ocrd-tesserocr-recognize, this is the help output:

$ ocrd-tesserocr-recognize --help

Usage: ocrd-tesserocr-recognize [OPTIONS]

  Recognize text in lines with Tesseract (using annotated derived images, or masking and cropping images from coordinate polygons)

Options:
  -I, --input-file-grp USE        File group(s) used as input
  -O, --output-file-grp USE       File group(s) used as output
  -g, --page-id ID                Physical page ID(s) to process
  --overwrite                     Remove existing output pages/images
                                  (with --page-id, remove only those)
  -p, --parameter JSON-PATH       Parameters, either verbatim JSON string
                                  or JSON file path
  -m, --mets URL-PATH             URL or file path of METS to process
  -w, --working-dir PATH          Working directory of local workspace
  -l, --log-level [OFF|ERROR|WARN|INFO|DEBUG|TRACE]
                                  Log level
  -J, --dump-json                 Dump tool description as JSON and exit
  -h, --help                      This help message
  -V, --version                   Show version

Parameters:
   "dpi" [number - -1]
    pixel density in dots per inch (overrides any meta-data in the
    images); disabled when negative
   "raw_lines" [boolean - false]
    Do not attempt additional segmentation
    (baseline+xheight+ascenders/descenders prediction) when using line
    images (i.e. when textequiv_level<region). Can increase accuracy for
    certain workflows. Disable when line segments/images may contain
    components of more than 1 line, or larger gaps/white-spaces.
   "textequiv_level" [string - "word"]
    Lowest PAGE XML hierarchy level to add the TextEquiv results to; when
    below `region`, implicitly adds segmentation below the line level,
    but requires existing line segmentation
    Possible values: ["region", "line", "word", "glyph"]
   "char_whitelist" [string - ""]
    Enumeration of character hypotheses (from the model) to allow
    exclusively; overruled by blacklist if set.
   "model" [string]
    tessdata model to apply (an ISO 639-3 language specification or some
    other basename, e.g. deu-frak or Fraktur)
   "overwrite_words" [boolean - false]
    Remove existing layout and text annotation below the TextLine level
    (regardless of textequiv_level).
   "char_blacklist" [string - ""]
    Enumeration of character hypotheses (from the model) to suppress;
    overruled by unblacklist if set.
   "char_unblacklist" [string - ""]
    Enumeration of character hypotheses (from the model) to allow
    inclusively.
   "padding" [number - 0]
    Number of background-filled pixels to add around the line image (i.e.
    the annotated AlternativeImage if it exists or the higher-level
    image cropped to the bounding box and masked by the polygon
    otherwise) on each side before recognition.

Default Wiring:
  ['OCR-D-SEG-BLOCK', 'OCR-D-SEG-LINE', 'OCR-D-SEG-WORD', 'OCR-D-SEG-GLYPH'] -> ['OCR-D-OCR-TESS']

You can find a description of the parameters in the section Parameters. Every parameter (e.g. overwrite_words) is listed with its name (overwrite_words), its datatype (boolean - so either true or false), its default value (false) and a description of what the parameter does ("Remove existing layout and text annotation below the TextLine level (regardless of textequiv_level)").
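The same information can also be read programmatically. The following Python sketch is only an illustration: it assumes that the tool description (as dumped with --dump-json) contains a "parameters" object whose entries carry "type", "default" and "description" keys. The TOOL_JSON sample is hand-written and abridged from the help output above; it is not actual dump output, and summarize_parameters is a hypothetical helper.

```python
import json

# Hand-written, abridged sample mirroring the help output above (an assumption,
# not the actual output of `ocrd-tesserocr-recognize --dump-json`).
TOOL_JSON = """
{
  "parameters": {
    "overwrite_words": {
      "type": "boolean",
      "default": false,
      "description": "Remove existing layout and text annotation below the TextLine level (regardless of textequiv_level)."
    },
    "model": {
      "type": "string",
      "description": "tessdata model to apply (an ISO 639-3 language specification or some other basename, e.g. deu-frak or Fraktur)"
    }
  }
}
"""

def summarize_parameters(tool_json):
    """Return one 'name [type - default]' line per parameter."""
    tool = json.loads(tool_json)
    lines = []
    for name, spec in tool.get("parameters", {}).items():
        if "default" in spec:
            # json.dumps renders the default the way it would be written in JSON
            lines.append('"%s" [%s - %s]' % (name, spec["type"], json.dumps(spec["default"])))
        else:
            # Parameters without a default are listed with their type only
            lines.append('"%s" [%s]' % (name, spec["type"]))
    return lines

for line in summarize_parameters(TOOL_JSON):
    print(line)
```

A parameter without a default, such as model, is printed with its type only, which matches how it appears in the help output.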

How can I pass parameters to a processor?

There are three ways to pass parameters to a processor:

1. -P KEY VALUE: set parameters individually
2. -p JSON_FILE: as a JSON file JSON_FILE
3. -p JSON_STRING: as literal JSON

Option 1 was introduced in OCR-D/core v2.11.0 and is currently the recommended way to specify parameters.

Option 2 allows defining the parameters in a JSON file, including #-prefixed comments. This is most useful for defining and describing sets of parameters for a processor.

Option 3 was the preferred way to pass parameters until the introduction of -P KEY VALUE. Its advantage over -p JSON_FILE is that the parameters can be defined ad hoc on the command line. A major disadvantage is that quoting can become tricky when there's another level of indirection, such as when running a processor within a Docker container.
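To make the quoting problem concrete, here is a small Python sketch using the standard library's shlex module. It quotes a JSON parameter string once, as needed for a direct shell invocation, and then a second time, as would be needed if the whole command were handed as a single string to another shell. The sh -c wrapping is an assumed scenario chosen for illustration, and ocrd-foo is the placeholder processor name used in this article.

```python
import json
import shlex

params = {"foo": "bar"}
json_string = json.dumps(params)

# One level of quoting: safe to paste after `-p` in a POSIX shell.
direct = "ocrd-foo -p " + shlex.quote(json_string)

# Two levels: the whole command becomes a single argument to another shell.
nested = "sh -c " + shlex.quote(direct)

print(direct)
print(nested)
```

The second printed line is noticeably harder to read than the first, which is the kind of escalation that -P KEY VALUE avoids for simple values.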

Can I combine parameter options?

You can combine all variants of parameter passing, and both -p and -P are repeatable. This allows for composition. For example, the following invocation

ocrd-foo -p defaults.json -P this-param 42

will first read the file defaults.json and parse it as JSON, then override the parameter this-param with the value 42 (a number).
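The override behaviour can be pictured as a left-to-right merge. The Python sketch below is an illustration, not OCR-D/core's actual implementation: it assumes that later options simply override earlier ones key by key, as the description above suggests. The contents of defaults.json and the helper merge_parameters are hypothetical.

```python
import json

def merge_parameters(sources):
    """Merge parameter dicts left to right; later sources win on conflicts."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged

# Stand-in for the content of defaults.json (hypothetical values).
defaults = json.loads('{"this-param": 1, "other-param": "keep"}')
# Stand-in for `-P this-param 42`.
override = {"this-param": 42}

print(merge_parameters([defaults, override]))
```

Parameters that are only present in the earlier source, such as other-param here, are left untouched by the later override.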

Examples

The following three ocrd-foo invocations are functionally equivalent (the first line merely creates param.json):

echo '{"foo": "bar"}' > param.json
ocrd-foo -p param.json
ocrd-foo -p '{"foo": "bar"}'
ocrd-foo -P foo bar

This illustrates that -P is the most intuitive and therefore recommended way to pass parameters.

Notes on syntax

The -p variants of passing parameters require a well-formed JSON object, that is:

Enclosed in {}
Keys (parameter name) and values (parameter value) separated with :
Keys must be double-quoted ("param-name")
Values must be valid JSON data types:
- string: double-quoted (e.g. "some string value")
- number: the digits of the number, decimal separator is . (e.g. 42, 3.1514)
- boolean: true or false
- array: A list of strings, numbers or booleans, separated by , and enclosed in []
- object: The same syntax as for the whole parameter JSON

One extension of JSON that OCR-D supports is #-prefixed comments, i.e. you can annotate the parameter JSON with comments like this:

{
  # This is set to true because we're augmenting existing OCR results
  # which may have words already
  "overwrite_words": true
}
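Standard JSON parsers reject such comments, so they have to be removed before parsing. The following Python sketch shows one way this could be done. It is a simplification under a stated assumption: only lines whose first non-blank character is # are treated as comments, and it does not claim to reproduce how OCR-D/core handles them. The helper parse_commented_json is hypothetical.

```python
import json

def parse_commented_json(text):
    """Parse JSON after dropping lines whose first non-blank character is '#'."""
    kept = [line for line in text.splitlines() if not line.lstrip().startswith("#")]
    return json.loads("\n".join(kept))

raw = """
{
  # This is set to true because we're augmenting existing OCR results
  # which may have words already
  "overwrite_words": true
}
"""

print(parse_commented_json(raw))
```

Because the sketch works line by line, a # that appears inside a string value on a line of its own data would survive, but a comment placed after a value on the same line would not be stripped.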

For the -P KEY VALUE variant, these rules apply:

KEY must not be quoted
VALUE can be any of the JSON data types described above
If VALUE is not a valid JSON data type, it is interpreted as a string. That has the advantage that you can write -P param-name string-value instead of -P param-name '"string-value"'.
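The fallback rule for VALUE can be sketched in a few lines of Python. This is an illustration of the rule as described above, assuming "valid JSON data type" means "parseable by a JSON parser"; it is not taken from OCR-D/core, and parse_value is a hypothetical helper.

```python
import json

def parse_value(value):
    """Interpret VALUE as JSON if possible, otherwise keep it as a plain string."""
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        return value

for raw in ["42", "true", "string-value", '"string-value"', '["a", 1]']:
    print(repr(raw), "->", repr(parse_value(raw)))
```

Note that under this rule a value such as 42 becomes a number; to pass it as a string it has to be written as '"42"'.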
