Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the Compression element #928

Merged
merged 1 commit into from
Feb 25, 2019
Merged

Add the Compression element #928

merged 1 commit into from
Feb 25, 2019

Conversation

mattgarrish
Copy link
Member

This PR adds the Compression element that was introduced in the 3.1 schema. Fixes #904

@rdeltour rdeltour self-assigned this Jan 17, 2019
@rdeltour rdeltour added status: needs review Needs to be reviewed by a team member before further processing spec: EPUB 3.2 Impacting the support of EPUB 3.2 labels Jan 17, 2019
@rdeltour rdeltour added this to the 4.2.0-beta-1 milestone Jan 17, 2019
@danielweck
Copy link
Member

danielweck commented Feb 8, 2019

Unfortunately this RNC does not work in practice (let me know if I am wrong though, I am no RelaxNG expert).

default namespace = "urn:oasis:names:tc:opendocument:xmlns:container"
namespace epub_xenc_compress = "http://www.idpf.org/2016/encryption#compression"

start =
    element encryption {
        grammar {
            include "http://www.w3.org/TR/2013/NOTE-xmlsec-rngschema-20130411/Lenient-Encryption11-ghc.rnc" {
                start = xenc_EncryptedData | xenc_EncryptedKey
            }
            xenc_EncryptionMethodOtherParams |= attribute Algorithm { xsd:anyURI "http://www.idpf.org/2008/embedding" }
            xenc_EncryptionAlgorithms |= xsd:anyURI "http://www.idpf.org/2008/embedding"
            xenc_EncryptionProperty_child |= epub_xenc_Compression?
            epub_xenc_Compression = element epub_xenc_compress:Compression { epub_xenc_CompressionType }
            epub_xenc_CompressionType =
                attribute Method { "0" | "8" },
                attribute OriginalLength { xsd:nonNegativeInteger }
        }+
}

First of all, xenc_EncryptionProperty_child is not defined in the included Encryption11-ghc.rnc, this should in fact be xenc_EncryptionPropertyType which is defined as:

xenc_EncryptionPropertyType =
  mixed {
    attribute Target { xsd:anyURI }?,
    attribute Id { xenc_IdType }?,
    attribute xml:* { text }*,
    xenc_anyForeignElement+
  }

However, using the |= operator to refine these validation rules doesn't work either, because xenc_anyForeignElement "swallows" every children as valid. Furthermore, elements are ordered (unlike attributes), so there is this issue as well.

On a side note, this remote URL:
include "http://www.w3.org/TR/2013/NOTE-xmlsec-rngschema-20130411/Lenient-Encryption11-ghc.rnc"
...should be replaced with the local asset:
include "mod/security/Lenient-Encryption11-ghc.rnc"

So, I came up with this solution:

default namespace = "urn:oasis:names:tc:opendocument:xmlns:container"
namespace epub_xenc_compress = "http://www.idpf.org/2016/encryption#compression"
namespace xenc = "http://www.w3.org/2001/04/xmlenc#"

start =
    element encryption {
        grammar {
            include "mod/security/Lenient-Encryption11-ghc.rnc" {
                start = xenc_EncryptedData | xenc_EncryptedKey

                xenc_EncryptionPropertyType =
                  attribute Target { xsd:anyURI }?,
                  attribute Id { xenc_IdType }?,
                  attribute xml:* { text }*,
                  mixed {
                    (epub_xenc_Compression, xenc_anyForeignElement*) | xenc_anyForeignElement_+
                  }
            }
            xenc_EncryptionMethodOtherParams |= attribute Algorithm { xsd:anyURI "http://www.idpf.org/2008/embedding" }
            xenc_EncryptionAlgorithms |= xsd:anyURI "http://www.idpf.org/2008/embedding"

            xenc_anyForeignElement_ = element * - (xenc:* | epub_xenc_compress:Compression) { mixed { security_anyAttribute*, security_anyElement* } }

            epub_xenc_Compression = element epub_xenc_compress:Compression { epub_xenc_CompressionType }
            epub_xenc_CompressionType =
                attribute Method { "0" | "8" },
                attribute OriginalLength { xsd:nonNegativeInteger }
        }+
}

This allows the Compression element with optional xenc_anyForeignElement following siblings (not before, only after). This also continues to allow xenc_anyForeignElement under the condition that there are no Compression elements amongst children.

Example tests (trimmed XML for brevity):


<EncryptionProperty>
  <DummyBefore xmlns="http://www.dummy.org/fake/namespace" attr="val"></DummyBefore>
  <Compression xmlns="http://www.idpf.org/2016/encryption#compression" Method="0" OriginalLength="A"></Compression>
</EncryptionProperty>

Expected errors:

  • 'Error while parsing file: element "Compression" not allowed here; expected the element end-tag or text' (that's because of DummyBefore)
  • 'Error while parsing file: value of attribute "OriginalLength" is invalid; must be an integer' (it is A)

<EncryptionProperty>
  <Compression xmlns="http://www.idpf.org/2016/encryption#compression" Method="9" OriginalLength="0"></Compression>
  <DummyAfter xmlns="http://www.dummy.org/fake/namespace" attr="val"></DummyAfter>
</EncryptionProperty>

Expected errors:

  • 'Error while parsing file: value of attribute "Method" is invalid; must be equal to "0" or "8"' (it is 9)
  • (note that DummyAfter is allowed here)

<EncryptionProperty>
  <DummyBefore xmlns="http://www.dummy.org/fake/namespace" attr="val"></DummyBefore>
  <DummyAfter xmlns="http://www.dummy.org/fake/namespace" attr="val"></DummyAfter>
</EncryptionProperty>

Expected errors:

  • (none, this is perfectly valid)

@murata2makoto
Copy link
Contributor

I am responsible for the not-so-easy-to-customize design of xenc-schema.rnc. I was involved in the design of RELAX NG, NVDL, and the RNG schemas for XML security.

@danielweck You appear to disallow elder brothers of the Compression element on purpose. Is this backed by the current draft of OCF 3.2?

The http URLs do not cause network access because epubcheck use XML catalog (epub30-catalog.xml). So, no changes for schema locations are needed.

Finally, I would like to slightly modify the schema for consistency with other security RNG schemas. Here is my proposal.

default namespace = "urn:oasis:names:tc:opendocument:xmlns:container"
namespace xenc = "http://www.w3.org/2001/04/xmlenc#"
namespace epub_xenc_compress = "http://www.idpf.org/2016/encryption#compression"

start =
    element encryption {
        grammar {
            include "http://www.w3.org/TR/2013/NOTE-xmlsec-rngschema-20130411/Lenient-Encryption11-ghc.rnc" {
                start = xenc_EncryptedData | xenc_EncryptedKey
                xenc_EncryptionPropertyType =
                  mixed {
                    attribute Target { xsd:anyURI }?,
                    attribute Id { xenc_IdType }?,
                    attribute xml:* { text }*,
                    xenc_EncryptionPropertyContent
                  }
                xenc_anyForeignElement = element * - (xenc:* | epub_xenc_compress:*) {
                  mixed { security_anyAttribute*, security_anyElement* } }
            }
            xenc_EncryptionMethodOtherParams |= attribute Algorithm { xsd:anyURI "http://www.idpf.org/2008/embedding" }
            xenc_EncryptionAlgorithms |= xsd:anyURI "http://www.idpf.org/2008/embedding"

            xenc_EncryptionPropertyContent = epub_xenc_Compression?, xenc_anyForeignElement+
	    
            epub_xenc_Compression = element epub_xenc_compress:Compression { epub_xenc_CompressionType }
            epub_xenc_CompressionType =
                attribute Method { "0" | "8" },
                attribute OriginalLength { xsd:nonNegativeInteger }
        }+
    }

@mattgarrish mattgarrish changed the base branch from master to next/v4.2.0 February 15, 2019 17:54
@mattgarrish
Copy link
Member Author

mattgarrish commented Feb 15, 2019

I've updated the pull request per the discussions, but made one change to your suggested schema @murata2makoto. This line:

xenc_EncryptionPropertyContent = epub_xenc_Compression?, xenc_anyForeignElement+

requires a foreign element, but my understanding of the compression element is that it should be allowed alone. I also can't find anything in the specification that requires the Compression element be in any specific position, so I replaced it with this:

xenc_EncryptionPropertyContent = (epub_xenc_Compression & xenc_anyForeignElement*) | xenc_anyForeignElement+

But if either you or @danielweck sees anything wrong with this approach, let me know.

@rdeltour
Copy link
Member

@murata2makoto

The http URLs do not cause network access because epubcheck use XML catalog (epub30-catalog.xml). So, no changes for schema locations are needed.

In fact, I don't think EPUBCheck does use a catalog resolver! As far as I know this catalog file is mostly here to be used by third-party tools, e.g. to configure XML editors.

@danielweck the changes made by @mattgarrish look OK to me. Before I mere the PR, Can you confirm the bit about the Compression element position?

The only slight change I will make when integrating the PR is running the error assertions before the ones about the error content, to avoid an NPE if only 1 error is found. But that's a detail :-)

{
assertTrue(testReport.errorList.get(0).message
.contains("value of attribute \"Method\" is invalid; must be equal to \"0\" or \"8\""));
assertTrue(testReport.errorList.get(1).message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can possibly throw an NPE since we didn't assert at this point that there are at least two errors…

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I don't think EPUBCheck does use a catalog resolver! As far as I know this catalog file is mostly here to be used by third-party tools, e.g. to configure XML editors.

Then, does epubcheck always retreive schemas?

  - update the `ocf-encryption-30.rnc` schema to allow the `Compression`
    element as a child of `EncryptionProperty`
  - add tests

Fix #904
@rdeltour
Copy link
Member

does epubcheck always retreive schemas?

I may be wrong, but I think it uses local schemas and relative references everywhere.

@rdeltour rdeltour added status: ready to merge The pull request is ready to be merged and removed status: needs review Needs to be reviewed by a team member before further processing labels Feb 24, 2019
@danielweck
Copy link
Member

danielweck commented Feb 25, 2019

This looks good to merge.

I opened a separate WIP PR to possibly improve the schema, and most probably to add tests: #978

@rdeltour rdeltour merged commit ed6f1c4 into next/v4.2.0 Feb 25, 2019
@rdeltour rdeltour deleted the ocf-compression branch February 25, 2019 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec: EPUB 3.2 Impacting the support of EPUB 3.2 status: ready to merge The pull request is ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants