First, perform the following steps:
- run
make
from thetangent
directory. This will compile the C++ program calledmathindex
. - Add the
tangent-v031
top-level directory to yourPYTHONPATH
.
- Create a directory containing only MathML files. Make sure they end with the extension
.mml
. The.pmml
extension is not currently supported. - Create a file that contains the full paths of each of the MathML files. One path per line. In what follows, assume this file is called
$EQ-PATHS-FILE
. - In the config file named
hopper.cntl
, change thedoc_list
line to point to the$EQ-PATHS-FILE
you created above. - From the tangent directory, run
python3 index.py hopper.cntl
to create a database of symbol layout trees (SLTs). These trees will be written totangent/db-index/
. - Write the tuples for all of the SLTs to a file
$ALL-TUPLES
. From thetangent
directory, run:cat db-index/* | ./mathindex.exe -v 2> $ALL-TUPLES
- (DLMF-only) Organize the tuples into the original directory structure, with 1 directory per chapter, and 1 file per equation. Assume you want all chapters directories in
$DLMF-TUPLE-DIR
From the top-evel tangent-v031 directory:python hop-postProcess.py --eq_paths $EQ-PATHS-FILE --all_tuples $ALL-TUPLES --outdir $DLMF-TUPLE-DIR