Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML de/serialization of normalizedstring fields is incorrect #737

Open
8 tasks
justahero opened this issue Jun 29, 2024 · 0 comments
Open
8 tasks

XML de/serialization of normalizedstring fields is incorrect #737

justahero opened this issue Jun 29, 2024 · 0 comments

Comments

@justahero
Copy link
Contributor

Based on the findings in #733 it was concluded that the XML serialization code does not parse the XML type normalizedstring correctly. This does not only affect the license name field, but all occurrences of this string type.

There is a NormalizedString type in /external_models available that replaces a set of forbidden characters, e.g. line feed or tab, in NormalizedString::new with white space characters. All XML schemas in the specification repository use this standard XML type normalizedstring, while the JSON schemas do not use such a string type or put any associated restrictions on them. Therefore this type should only apply to the XML serialization code.

The cyclonedx-bom code handles the normalizedstring XML type as a String, only when converting the spec types to their model representations the NormalizedString type is used, but for both JSON and XML parsed objects. That is wrong, the normalizedstring XML type & therefore the NormalizedString model type are only relevant as part of the XML specification & should be used in this context only.

To fix the issue, the code should be adapted in the following way.

  • introduce a NewType struct that encapsulates a String, which is handled as a normal string with serde_json, most likely under folder /spec, but applies replacing chars in XML serialization code
  • check the XML specification to see which types in the /spec folder need to change their type from String to the NewType struct, therefore documenting types closer to the XML specification. This needs to be done for all available spec versions
  • remove the NormalizedString type from the models folder, the model types that currently declare it, e.g. Option<NormalizedString>, should declare them with String only
  • implement FromXml & ToXml traits for the NewType struct that replaces the forbidden white space characters
  • update all conversion functions between model & spec types accordingly
  • update serialization & deserialization code of the string fields in the XML logic, for example instead of using read_simple_tag to read a String the NewType::read_xml_element trait method is called
  • update & fix tests, there are lot of specific tests to validate NormalizedString fields, which should become unnecessary, likely to eliminate a lot of unnecessary validations?
  • remove the validate_normalized_string function, it's very likely not needed anymore, which should simplify validation logic for a bunch of fields

These are a lot of changes, but the outcome is

  • all string types that define normalizedstring in the XML schemas are correctly parsed afterwards, similar to the behavior of other XML libraries in other languages
  • JSON & XML string types are handled the same way afterwards, the forbidden white space characters are allowed in JSON, but replaced in XML correctly
  • validation logic becomes simpler, same for a lot of tests
  • Relax validation of license name field #733 will be fixed, so are all other fields with the normalizedstring type
  • finally the XML serialization code follows the specifcation more closely, & therefore cyclonedx-bom gets closer to other BOM libraries of the CycloneDX family

cc @Shnatsel, @lfrancke (thanks & also thanks to all others on Discord for their input)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant