When generating RDF from notices, the ted-rdf-mapping-eforms repository maps N source schemas to a single target ontology (v4.0), resulting in N individual mappings. With each update to the ontology, all N mappings must also be updated, adding complexity and cost.
By introducing an intermediate model as a pivot, we can simplify ontology upgrades. The idea is to map the N source schemas to the pivot, and then map the pivot to RDF. An ontology upgrade then touches a single mapping (pivot to RDF) instead of N.
One approach for this pivot model is to use an in-memory relational database (RDB), where the XML data is mapped to relational tables. RDF can then be produced from the relational schema using a mapping language such as R2RML.
This approach is expected to improve both performance and clarity in the mapping process.
- Initialization
  - Set up an in-memory relational database, prepopulating it with tables that contain controlled vocabularies.
- Populating Data
  - Traverse the XML data to fill the corresponding database tables.
    - Converting XML into relational tables is a well-known problem, and tools exist to automate this process, often utilizing XSD files. Ideally, a declarative language should be used for this translation.
- Mapping to RDF
  - Convert the populated relational tables into RDF triples, an O(n) operation.
    - During development, tools like Ontop can be used to execute SPARQL queries directly against the relational database.
    - While I haven't used the latest version of R2RML, my previous experience with D2RQ yielded excellent results.
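The three steps above can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the XML shape, the table and column names, and the `http://example.org/` namespace are all hypothetical, and the hand-written `to_ntriples` function merely stands in for what an R2RML processor would do from a declarative mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified notice data; real eForms XML is
# namespaced and far richer than this.
NOTICE_XML = """
<notices>
  <notice id="N-001"><title>Road works</title><procedureType>open</procedureType></notice>
  <notice id="N-002"><title>IT services</title><procedureType>restricted</procedureType></notice>
</notices>
"""

BASE = "http://example.org/"  # placeholder namespace, not the real ontology IRI


def init_db():
    """Step 1: create the in-memory pivot and preload a controlled vocabulary."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    db.executescript("""
        CREATE TABLE procedure_type (code TEXT PRIMARY KEY);
        CREATE TABLE notice (
            id TEXT PRIMARY KEY,
            title TEXT NOT NULL,
            procedure_type TEXT NOT NULL REFERENCES procedure_type(code)
        );
    """)
    db.executemany(
        "INSERT INTO procedure_type VALUES (?)", [("open",), ("restricted",)]
    )
    return db


def populate(db, xml_text):
    """Step 2: walk the XML and fill the pivot table."""
    root = ET.fromstring(xml_text)
    for n in root.iter("notice"):
        db.execute(
            "INSERT INTO notice VALUES (?, ?, ?)",
            (n.get("id"), n.findtext("title"), n.findtext("procedureType")),
        )


def to_ntriples(db):
    """Step 3: one pass over the rows, emitting N-Triples lines.

    No literal escaping is done; that is acceptable only for this toy data.
    """
    lines = []
    query = "SELECT id, title, procedure_type FROM notice ORDER BY id"
    for nid, title, ptype in db.execute(query):
        subject = f"<{BASE}notice/{nid}>"
        lines.append(f'{subject} <{BASE}title> "{title}" .')
        lines.append(
            f"{subject} <{BASE}procedureType> <{BASE}procedure-type/{ptype}> ."
        )
    return lines


db = init_db()
populate(db, NOTICE_XML)
triples = to_ntriples(db)
print("\n".join(triples))
```

In this layout an ontology upgrade would only change `to_ntriples` (or, in the proposed setup, the R2RML mapping), while `populate` stays tied to the source schema.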
- Lower Migration Costs: When updating to a new ontology version, only the R2RML mapping needs to be modified, rather than N individual mappings.
- Built-in Data Integrity: Database constraints, such as foreign keys tied to controlled vocabularies, help ensure data integrity during the transformation process.
- XML-to-Relational Challenges: It remains to be seen whether all SDK versions can be translated cleanly into standard relational tables, and further investigation is needed to confirm full compatibility. As a first test, I asked ChatGPT to derive a relational schema from the XML, and what it proposed appears to make sense.
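To make the data-integrity point concrete, here is a minimal sketch of a foreign key tied to a controlled vocabulary rejecting an invalid code at load time. The `currency` and `lot` tables and their values are assumptions chosen for illustration, and SQLite is used only because it ships with Python; any relational engine with foreign-key support would behave similarly.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
db.executescript("""
    -- Hypothetical controlled vocabulary and a table that references it.
    CREATE TABLE currency (code TEXT PRIMARY KEY);
    CREATE TABLE lot (
        id TEXT PRIMARY KEY,
        currency TEXT NOT NULL REFERENCES currency(code)
    );
    INSERT INTO currency VALUES ('EUR'), ('SEK');
""")


def try_insert(lot_id, currency):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        db.execute("INSERT INTO lot VALUES (?, ?)", (lot_id, currency))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("LOT-1", "EUR")   # code exists in the vocabulary
rejected = try_insert("LOT-2", "EURO")  # code is not in the vocabulary
print(accepted, rejected)
```

The bad value is caught while populating the pivot, before any RDF is produced, so the RDF mapping itself would not need to re-validate codes.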
Bonus: Extracting RDF via HTML
One of the simplest ways to extract RDF is to leverage the HTML representation of a notice.
Step-by-step:
Pros:
Cons: