Page Error in Allgemeine_kirchliche_Zeitung #1

Open
tboenig opened this issue Mar 2, 2022 · 3 comments

Comments

tboenig commented Mar 2, 2022

Hello,
I parsed and validated the folder 'Allgemeine_kirchliche_Zeitung' and found the following errors:

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\16_a23d4_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: Value '718,113 707,-2 765,-4 776,108 776,117 718,126' is not facet-valid with respect to pattern '([0-9]+,[0-9]+ )+([0-9]+,[0-9]+)' for type 'PointsType'.
Start: 12:73
URL: http://www.w3.org/TR/xmlschema-2/#cvc-pattern-valid

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\16_a23d4_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: The value '718,113 707,-2 765,-4 776,108 776,117 718,126' of attribute 'points' on element 'Coords' is not valid with respect to its type, 'PointsType'.
Start: 12:24
End: 12:71
URL: http://www.w3.org/TR/xmlschema-1/#cvc-attribute

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\18_1f153_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: Invalid content was found starting with element '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":TextLine}'. One of '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":AlternativeImage, "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":Coords}' is expected.
Start: 317:8
End: 317:16
URL: http://www.w3.org/TR/xmlschema-1/#cvc-complex-type

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\20_4e68f_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: Invalid content was found starting with element '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":TextLine}'. One of '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":AlternativeImage, "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":Coords}' is expected.
Start: 309:8
End: 309:16
URL: http://www.w3.org/TR/xmlschema-1/#cvc-complex-type

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\21_da7a2_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: Invalid content was found starting with element '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":TextLine}'. One of '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":AlternativeImage, "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":Coords}' is expected.
Start: 310:8
End: 310:16
URL: http://www.w3.org/TR/xmlschema-1/#cvc-complex-type

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\26_63d1f_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: Invalid content was found starting with element '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":TextLine}'. One of '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":AlternativeImage, "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":Coords}' is expected.
Start: 310:8
End: 310:16
URL: http://www.w3.org/TR/xmlschema-1/#cvc-complex-type

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\29_557de_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: Invalid content was found starting with element '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":TextLine}'. One of '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":AlternativeImage, "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":Coords}' is expected.
Start: 317:8
End: 317:16
URL: http://www.w3.org/TR/xmlschema-1/#cvc-complex-type

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\2_aa780_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: Value '628,132 628,-4 889,-4 889,132 889,150 746,160 744,160 744,160 626,145' is not facet-valid with respect to pattern '([0-9]+,[0-9]+ )+([0-9]+,[0-9]+)' for type 'PointsType'.
Start: 12:97
URL: http://www.w3.org/TR/xmlschema-2/#cvc-pattern-valid

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\2_aa780_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: The value '628,132 628,-4 889,-4 889,132 889,150 746,160 744,160 744,160 626,145' of attribute 'points' on element 'Coords' is not valid with respect to its type, 'PointsType'.
Start: 12:24
End: 12:95
URL: http://www.w3.org/TR/xmlschema-1/#cvc-attribute

System-ID: \DTGT\Data\data_line\Allgemeine_kirchliche_Zeitung\1860\page\3_b28d4_default.xml
Schema: http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15/pagecontent.xsd
Program name: Xerces
Error level: error
Description: Invalid content was found starting with element '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":TextLine}'. One of '{"http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":AlternativeImage, "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15":Coords}' is expected.
Start: 310:8
End: 310:16
URL: http://www.w3.org/TR/xmlschema-1/#cvc-complex-type
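
The `PointsType` failures above can be reproduced without a full schema validator. The following is a minimal sketch, assuming only the pattern quoted in the Xerces messages; the function name `is_valid_points` is hypothetical and not part of any PAGE tooling. Because XSD patterns are implicitly anchored, the sketch uses `re.fullmatch` rather than `re.search`.

```python
import re

# Pattern copied from the Xerces error message for type 'PointsType'.
# It requires at least two "x,y" pairs of non-negative integers.
POINTS_PATTERN = re.compile(r"([0-9]+,[0-9]+ )+([0-9]+,[0-9]+)")


def is_valid_points(value: str) -> bool:
    """Return True if the whole string matches the PointsType pattern."""
    # XSD patterns must match the entire value, hence fullmatch.
    return POINTS_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    reported = "718,113 707,-2 765,-4 776,108 776,117 718,126"
    print(is_valid_points(reported))
```

The minus signs in `707,-2` and `765,-4` are what break the match, since the pattern has no place for a sign character.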

Member

stweil commented Mar 2, 2022

Thank you for testing this. That seems to be a bug in Kraken and/or eScriptorium then.

Member

stweil commented Mar 14, 2022

Negative coordinates also exist in the PAGE XML for Stimmen_aus_Maria-Laach/. I have now reported that at https://gitlab.com/scripta/escriptorium/-/issues/568, and according to the first feedback it is indeed a bug in Kraken.
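
Until the upstream bug is fixed, already exported files could be repaired by clamping negative values to zero. This is only a sketch of one possible workaround, not how Kraken or eScriptorium handle it; the function name `clamp_points` is hypothetical, and clamping is an assumption about what an acceptable repair looks like. It does not check coordinates against the image width or height.

```python
def clamp_points(value: str) -> str:
    """Clamp negative x or y values in a PAGE 'points' string to 0."""
    clamped = []
    for pair in value.split():
        # Each pair has the form "x,y"; int() accepts a leading minus sign.
        x, y = (int(number) for number in pair.split(","))
        clamped.append(f"{max(x, 0)},{max(y, 0)}")
    return " ".join(clamped)


if __name__ == "__main__":
    print(clamp_points("718,113 707,-2 765,-4 776,108 776,117 718,126"))
```

Clamping moves the affected polygon vertices onto the page edge, which slightly changes the line shape; whether that is acceptable depends on how the ground truth is used.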

A different kind of error was fixed in commit 48cfe1f.
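
The thread does not say which error that commit addressed. For the other error class in the report, where a `TextLine` appears where `AlternativeImage` or `Coords` is expected, a quick scan might look like the sketch below. It assumes the affected parent element is a `TextRegion` without its own `Coords` child, which the validator output suggests but does not state; `regions_missing_coords` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the schema URL in the validator output.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"


def regions_missing_coords(xml_text: str) -> list:
    """Return the ids of TextRegion elements that have no direct Coords child."""
    root = ET.fromstring(xml_text)
    missing = []
    for region in root.iter(f"{{{PAGE_NS}}}TextRegion"):
        # find() only looks at direct children, so Coords nested inside
        # a TextLine does not count for the region itself.
        if region.find(f"{{{PAGE_NS}}}Coords") is None:
            missing.append(region.get("id", ""))
    return missing
```

Running this over each file in the `page` folder would list candidate regions to inspect; it is not a substitute for validating against the schema.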

Author

tboenig commented Mar 17, 2022

Thank you.
