Merge pull request #10 from OP-TED/new_main
New main
NPJ-OP-LUX committed Oct 24, 2023
2 parents 3304524 + c611a57 commit 63cd9bd
Showing 25 changed files with 318 additions and 263 deletions.
Binary file added docs/antora/modules/ROOT/images/conmap1.png
Binary file added docs/antora/modules/ROOT/images/conmap2.png
Binary file added docs/antora/modules/ROOT/images/conmap3.png
6 changes: 4 additions & 2 deletions docs/antora/modules/ROOT/nav.adoc
@@ -10,12 +10,13 @@
* [.separated]#**Mapping suite contents**#
* xref:SWS::genref.adoc[General Reference]
* xref:SWS::glossary.adoc[Glossary]
* xref:mapping_suite/methodology.adoc[Methodology]
* xref:mapping_suite/repository-structure.adoc[Repository structure]
* xref:mapping_suite/index.adoc[Mapping suites]
//** xref:mapping_suite/repository-structure.adoc[Repository structure]
* xref:mapping_suite/mapping-suite-structure.adoc[Mapping suite package structure]
* xref:mapping_suite/code-list-resources.adoc[Code list mappings]
* xref:mapping_suite/preparing-test-data.adoc[Data samples]
* xref:mapping_suite/toolchain.adoc[Toolchain]
* [.separated]#**Reusing semantic web service artefacts**#
* xref:sample_app/sa_glossary.adoc[Glossary]
@@ -24,6 +25,7 @@
* xref:sample_app/jupyter_notebook_r.adoc[Jupyter Notebook - R]
* xref:sample_app/jupyter_notebook_python.adoc[Jupyter Notebook - Python]
* [.separated]#**Reference**#
* xref:mapping_suite/versioning.adoc[Versioning]
2 changes: 1 addition & 1 deletion docs/antora/modules/ROOT/pages/genref.adoc
@@ -9,6 +9,6 @@



A mapping suite within the TED Semantic Web Service is a set of mappings that defines how an XML document representing an e-Procurement Notice will be transformed to an equivalent RDF graph representation in conformance with the eProcurement ontology. These mappings are materialized in different forms, as it will be explained later, and a mapping suite will have all its relevant components organized in a package, which we refer to as a *mapping suite package*.A mapping suite can be further broken down into mapping suite packages, one per type of standard form mapped.
A mapping suite, within the TED Semantic Web Service, is a set of mappings that define how an XML document representing an e-Procurement Notice is transformed to an equivalent RDF graph representation that conforms to the eProcurement ontology. These mappings manifest in different forms, and a mapping suite has its relevant components organized in a package, which is referred to as a *mapping suite package*. A mapping suite can be broken down further into mapping suite packages, one per type of standard form mapped.

include::partial$feedback.adoc[]
4 changes: 2 additions & 2 deletions docs/antora/modules/ROOT/pages/index.adoc
@@ -9,14 +9,14 @@

Currently, the TED Semantic Web Service (TED-SWS) continuously converts Contract Award Notices from the XML standard form format available on the TED website into RDF. The RDF is stored in the CELLAR repository, and the data in CELLAR can be queried from the https://publications.europa.eu/webapi/rdf/sparql[SPARQL endpoint] on EU Vocabularies. Note that Contract Award Notices should not be confused with the Result Notices resulting from eForms.

This project falls clearly in the European data strategy which foresees a single market for data will allow it to flow freely within the EU and across sectors for the benefit of businesses, researchers and public administrations.
This project falls clearly within the European data strategy, which foresees a single market for data that will allow it to flow freely within the EU and across sectors, for the benefit of businesses, researchers and public administrations.


The following topics are included in this version of the Semantic Web Service Documentation:

////
== Mapping Suites
A mapping suite within the TED Semantic Web Service is a set of mappings that defines how an XML document representing an e-Procurement Notice will be transformed to an equivalent RDF graph representation in conformance with the eProcurement ontology. These mappings are materialized in different forms, as it will be explained later, and a mapping suite will have all its relevant components organized in a package, which we refer to as a *mapping suite package*.A mapping suite can be further broken down into mapping suite packages, one per type of standard form mapped.
A mapping suite within the TED Semantic Web Service is a set of mappings that defines how an XML document representing an e-Procurement Notice will be transformed to an equivalent RDF graph representation in conformance with the eProcurement ontology. These mappings are materialized in different forms, as it will be explained later, and a mapping suite will have all its relevant components organized in a package, which is referred to as a *mapping suite package*. A mapping suite can be further broken down into mapping suite packages, one per type of standard form mapped.
////


@@ -13,11 +13,11 @@ The *specific URIs* are directly used in the
https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/transformation/mappings[technical mapping files], and they can also be found in the
https://github.com/OP-TED/ted-rdf-mapping/blob/main/mappings/package_F03/transformation/conceptual_mappings.xlsx[conceptual mapping file].

*Important note:* Please ensure you adapt the above paths to the resources to match the tag and mapping suite package that you wish to check. For example, for the `2.1.1-rc.1` tag and for form `F20`, the links mentioned above will be:
*Important note:* Please ensure you adapt the above paths to the resources to match the tag and mapping suite package that you wish to check. For example, for the `2.1.1-rc.1` tag and for form `F06`, the links mentioned above will be:

* https://github.com/OP-TED/ted-rdf-mapping/tree/2.1.1-rc.1/mappings/package_F20/transformation/resources
* https://github.com/OP-TED/ted-rdf-mapping/tree/2.1.1-rc.1/mappings/package_F20/transformation/mappings
* https://github.com/OP-TED/ted-rdf-mapping/blob/2.1.1-rc.1/mappings/package_F20/transformation/conceptual_mappings.xlsx
* https://github.com/OP-TED/ted-rdf-mapping/tree/2.1.1-rc.1/mappings/package_F06/transformation/resources
* https://github.com/OP-TED/ted-rdf-mapping/tree/2.1.1-rc.1/mappings/package_F06/transformation/mappings
* https://github.com/OP-TED/ted-rdf-mapping/blob/2.1.1-rc.1/mappings/package_F06/transformation/conceptual_mappings.xlsx
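
The path adaptation described in the note above can be sketched in Python. This is a minimal illustration only, not part of the repository tooling: the `REPO` constant and the `package_links` function are hypothetical names, and the URL layout is simply the one visible in the links listed above.

```python
# Hypothetical helper: builds the three links for a given tag and form.
# The URL pattern is copied from the example links in this page.
REPO = "https://github.com/OP-TED/ted-rdf-mapping"


def package_links(tag: str, form: str) -> dict:
    """Return the resources, mappings and conceptual mapping links."""
    suffix = f"{tag}/mappings/package_{form}/transformation"
    return {
        # Folders are browsed through /tree/, single files through /blob/.
        "resources": f"{REPO}/tree/{suffix}/resources",
        "mappings": f"{REPO}/tree/{suffix}/mappings",
        "conceptual_mappings": f"{REPO}/blob/{suffix}/conceptual_mappings.xlsx",
    }


if __name__ == "__main__":
    for name, url in package_links("2.1.1-rc.1", "F06").items():
        print(f"{name}: {url}")
```

Passing `main` as the tag yields the links for the current state of the main branch.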
[cols="30%,20%,~"]
|===
@@ -32,12 +32,12 @@ https://github.com/OP-TED/ted-rdf-mapping/blob/main/mappings/package_F03/transfo
|at-voc:cpvsuppl|JSON format|Used a SPARQL query to get the values from the specific EU Voc
|at-voc:main-activity|CSV format|Used this format because the XML element from XSD schema is different than the code from the specific EU Voc
|at-voc:buyer-legal-type|CSV format|Used this format because the XML element from XSD schema is different than the code from the specific EU Voc
|at-voc:award-criterion-type|URI|Used only when we want to map to a specific value from the EU voc
|at-voc:procurement-procedure-type|URI|Used only when we want to map to a specific value from the EU voc
|at-voc:winner-selection-status|URI|Used only when we want to map to a specific value from the EU voc
|at-voc:non-award-justification|URI|Used only when we want to map to a specific value from the EU voc
|at-voc:economic-operator-size|URI|Used only when we want to map to a specific value from the EU voc
|at-voc:direct-award-justification|URI|Used only when we want to map to a specific value from the EU voc
|at-voc:award-criterion-type|URI|Used only when mapping to a specific value from the EU voc
|at-voc:procurement-procedure-type|URI|Used only when mapping to a specific value from the EU voc
|at-voc:winner-selection-status|URI|Used only when mapping to a specific value from the EU voc
|at-voc:non-award-justification|URI|Used only when mapping to a specific value from the EU voc
|at-voc:economic-operator-size|URI|Used only when mapping to a specific value from the EU voc
|at-voc:direct-award-justification|URI|Used only when mapping to a specific value from the EU voc
|===

include::partial$feedback.adoc[]
17 changes: 9 additions & 8 deletions docs/antora/modules/ROOT/pages/mapping_suite/index.adoc
@@ -5,7 +5,7 @@
:docdate: October 2023


////

== Prerequisites

To allow for a proper understanding of the Mapping Suite Documentation, the reader should have:
@@ -16,8 +16,8 @@ Understanding of RDF, RML and SPARQL:: Familiarity with RDF (Resource Descriptio

Understanding of EU Procurement Data and Familiarity with ePO:: If your goal is to understand how the mappings are used to transform specific types of EU procurement data, such as contract notices or award notices, it's important to have a basic understanding of these concepts, and the associated https://docs.ted.europa.eu/EPO/latest/index.html[eProcurement Ontology].

Familiarity with Spreadsheet editing tools:: Since most of the Conceptual mappings is done in spreadsheet working experience with spreadsheet editing tools such as MS Excel or Google Sheets, is desirable.
////
Familiarity with Spreadsheet editing tools:: Since most of the conceptual mapping is done in spreadsheets, working experience with spreadsheet editing tools such as MS Excel or Google Sheets is beneficial.



////
@@ -79,7 +79,7 @@ This section describes the upper level of the GitHub repository, the next sectio
a|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03[/package_F03]
https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F06[/package_F06]
https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F13[/package_F13]
https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F20[/package_F20]
//https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F20[/package_F20]
https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F21[/package_F21]
https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F22[/package_F22]
https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F23[/package_F23]
@@ -143,7 +143,7 @@ Test data is also provided in the mapping suite packages that are specific to th

|===

////

=== The lower level folders of the GitHub Repository

This section provides more detailed information on the content available in the lower levels of the repository:
@@ -164,11 +164,12 @@ This section provides more detailed information on the content available in the

** validation

////


// include::methodology.adoc[]
//include::ted-sws-introduction.adoc[Old Introduction]

//include::methodology.adoc[]

// include::toolchain.adoc[]
//include::toolchain.adoc[]

include::partial$feedback.adoc[]
@@ -5,31 +5,35 @@
:docdate: October 2023


In this section we describe the structure of a mapping suite package in GitHub. Such a package contains everything that is needed for the development and testing of a given “mapping suite” that is applicable to a certain set of notices. After the package is finalised, it can be used by a process to apply it to a large number of notices stored in a database, and transform those notices into RDF data.

A package is represented by a well-defined folder structure containing certain files. This folder structure is repeated for every developed mapping. Initial organisation of these packages is per Form number, but it may evolve.
== Mapping suite package structure

The table below shows the structure for the F03 package, where /output refers to ted-rdf-mapping/mappings/package_F03/output, where package_F03 can be replaced by the identifier of the other packages mapped (F06,13, F21, F22, F23 -F20 and es16 should be ignored).
This section describes the structure of a mapping suite package in GitHub. A package contains everything that is needed for the development and testing of a mapping suite that is applicable to a certain set of notices. After the package is finalised, it can be used by a process to apply it to a larger number of notices stored in a database, and transform these into RDF data.


A package is represented by a well-defined folder structure. This folder structure is duplicated for every mapping developed. The listing of these packages is by Form number.

The table below shows the structure of the https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03[F03 package]. In the table, /output refers to ted-rdf-mapping/mappings/package_F03/output, and package_F03 can be replaced by the identifier of any of the other packages mapped (F06, F13, F21, F22, F23). The es16 package can be ignored.

[cols="1,2,2"]

|===

|Folder|Subfolder|Description

|/output
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/output[/output]
|/Noticeid/test_suite_report, where the NoticeID is the ID of the notice published on TED for example:

000163-2021/test_suite_report
|Contains the semantic map of the concepts of the given notice (in this case, the F03 package).

a|/output
a|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/output[/output]
cont.
|rml_report/html
|https://github.com/OP-TED/ted-rdf-mapping/blob/main/mappings/package_F03/output/rml_report.html[rml_report.html]

shacl_validations.html
https://github.com/OP-TED/ted-rdf-mapping/blob/main/mappings/package_F03/output/shacl_validations.html[shacl_validations.html]

shacl_validations.json
https://github.com/OP-TED/ted-rdf-mapping/blob/main/mappings/package_F03/output/shacl_validations.json[shacl_validations.json]


a|Each folder contains a subfolder for each of the rdf files produced from the sample data (see section on data sampling to understand the folder structure) which is named by the NoticeID of the sample concerned (eg 000163-2021)
@@ -42,7 +46,7 @@ Another subfolder /test_suite report which in turn contains different files for



|/test_data
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/test_data[/test_data]
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/test_data/form_number_F03_2021[/test_data/form_number_F03_2021]

https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/test_data/form_number_F03_S01[/test_data/form_number_F03_S01]
@@ -60,37 +64,35 @@ a|Each folder contains the xml file as published on the TED website of the Notic

Each xml notice is named by the NoticeID of the sample concerned (eg 000163-2021)

|/transformation
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/transformation[/transformation]
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/transformation/mappings[/mappings]
|Contains the rml files used for the transformation. There is one rml file per section of each notice.

|/transformation
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/transformation[/transformation]
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/transformation/resources[/resources]
|Contains files concerning the code list mappings

|/transformation
|conceptual_mappings.xlsx
|Contains the initial conceptual mapping which is used right the rml.
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/transformation[/transformation]
|https://github.com/OP-TED/ted-rdf-mapping/blob/main/mappings/package_F03/transformation/conceptual_mappings.xlsx[conceptual_mappings.xlsx]
|Contains the initial conceptual mapping which is used to write the rml.

|/validation
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/validation[/validation]
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/validation/shacl[/shacl]
|

|/validation
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/validation[/validation]
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/validation/sparql/cm_assertions[/sparql/assertions]
|

|/validation
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/validation[/validation]
|https://github.com/OP-TED/ted-rdf-mapping/tree/main/mappings/package_F03/validation/sparql/integration_tests[/sparql/integration_tests]
|
|===
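
As an illustration of the folder structure listed in the table, the following Python sketch checks whether a local copy of a package contains the expected entries. It is a hedged example rather than an official tool: `EXPECTED_ENTRIES` and `missing_entries` are hypothetical names, and the list of entries is taken only from the rows of the table above (using the `cm_assertions` folder name that appears in the link target).

```python
from pathlib import Path

# Entries named in the table above, relative to a package folder such as
# mappings/package_F03. This list is an assumption based on that table only.
EXPECTED_ENTRIES = (
    "output",
    "test_data",
    "transformation/mappings",
    "transformation/resources",
    "transformation/conceptual_mappings.xlsx",
    "validation/shacl",
    "validation/sparql/cm_assertions",
    "validation/sparql/integration_tests",
)


def missing_entries(package_dir):
    """Return the expected entries that do not exist under package_dir."""
    root = Path(package_dir)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


if __name__ == "__main__":
    # Example path; adjust it to wherever the repository is checked out.
    for rel in missing_entries("mappings/package_F03"):
        print(f"missing: {rel}")
```

An empty result means that every folder and file named in the table is present; it says nothing about whether their contents are valid.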

////
=== Mapping suite package description for Semantic Engineers

In the first, initial, phase, when the Semantic Engineers start working on a new mapping suite, they will have to set up a package folder structure similar to the one described below, and will work on (or with) the files contained there.
=== Mapping suite package description for Semantic Engineers

*Assumption:* Regarding the naming and organisation of the various mapping suites, *one package per form number* is assumed to be THE way to organise these packages.
When the Semantic Engineers start working on a new mapping suite, they first need to set up a package folder structure similar to the one described below. One package per form number is the accepted way to organise packages.

*Challenge:* Are there better ways to deal with certain sections (sub-sections) that repeat across multiple forms? Consider Section I, for example, which in case of forms F03, F06, F25 contains “almost” the same information, therefore only one mapping should be written for it and RE-used in “final” form-mapping-packages. The problem is also discussed in a dedicated section below.

@@ -123,7 +125,7 @@ The content of this folder should be automatically generated by the mapping pack

=== Mapping suite package description for the Software Engineers

A package provided by the semantic engineers (SE) is enriched with additional artefacts that are generated automatically using the package expanding tools which take as input the artefacts provided by the SE. Here are some examples of these additional artefacts that are being generated:
A package provided by semantic engineers (SE) is enriched with additional artefacts that are generated automatically using the package expanding tools which take as input the artefacts provided by the SE. Here are some examples of additional artefacts generated:

* *Metadata* describing the parameters for selecting the notices that the mappings can be applied to, various version information, etc.
* *SPARQL queries* that can be used to validate and/or test the generated outputs
@@ -149,7 +151,7 @@ After the package processing/expansion, the structure of the example mapping pac
/sparql
/cm_assertions
*.rq
/shacl # this is a constant, when we know what the SHACL is (currently unknown)
/shacl # this is a constant, when a SHACL is known (currently unknown)
*.shacl.ttl # data shape file(s)
/test_data # manually and carefully selected test data
*.xml
@@ -166,9 +168,9 @@ After the package processing/expansion, the structure of the example mapping pac

* `/validation/sparql/cm_assertions` SPARQL queries automatically generated from the conceptual mapping

=== Mapping suite package description for the Semantic Engineers after the expansion
=== Mapping suite package description for the Semantic Engineers after execution

After the execution of a mapping, the mapping package will be further enriched, and will contain additional files, as a result of running the mapping suite on the included test data.
After the execution of a mapping, the mapping package will be further enriched, and will contain additional files, as a result of running the mapping suite on the included test data.

----
/package_Fxx
@@ -205,9 +207,10 @@ After the “execution” of a mapping, the mapping package will be further enri
*.xml
----

* `/output/<notice_file1>` for each example file we create a folder that will contain all the generated artefacts for that sample file
* `/output/<notice_file1>` for each example file a folder is created that contains all the generated artefacts for that sample file
* `/output/test_suite_report` validation reports summarising all individual reports
* `/output/<notice_file1>/<notice_file1>.ttl` the output of the transformation
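
The output layout described in the list above can be explored with a short Python sketch. It assumes, as stated there, one folder per notice directly under `/output`, each holding a `.ttl` file named after the folder, plus a `test_suite_report` folder with the summary reports. The function name `transformation_outputs` is hypothetical and the sketch is not part of the mapping suite tooling.

```python
from pathlib import Path


def transformation_outputs(package_dir):
    """Map each notice folder under <package_dir>/output to its .ttl file.

    The value is None when the expected <notice>/<notice>.ttl file is absent.
    """
    output_dir = Path(package_dir) / "output"
    results = {}
    for notice_dir in sorted(p for p in output_dir.iterdir() if p.is_dir()):
        # The summary folder holds validation reports, not a notice.
        if notice_dir.name == "test_suite_report":
            continue
        ttl_file = notice_dir / f"{notice_dir.name}.ttl"
        results[notice_dir.name] = ttl_file if ttl_file.is_file() else None
    return results
```

Notices mapped to `None` are those for which no transformation output was found, which may point to a mapping that was not executed on that sample.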
////

include::partial$feedback.adoc[]

include::partial$feedback.adoc[]
