# Digital Twin Design

Design documents and prototypes for traceable digital twin representations.

The project follows a simple structure.

    .
    ├── docs
    │   ├── design          Overall design of the digital twin, high level diagrams.
    │   ├── diagrams
    │   └── models
    ├── examples
    │   └── oml
    └── tools               Utilities aimed to aid in the design of ontologies.
        ├── oml-generators
        └── draw            Draw.io plugins for writing FTG+PM++ and UML models.

## Diagrams

The diagrams used throughout this project are either [diagrams.net][drawio] files or [sprotty][sprotty] images.
To create the FTG+PM++ diagrams, [a plugin][ftgpm-plugin] is required.
We host [a version with the plugin pre-installed][draw-rys].

## OML Subprojects

There are four projects under the `examples/oml` group.
These are all [opencaesar OML][oml] DSL ontologies.
Each project provides a Gradle build file.
Using these build files you can convert the OML into OWL and even run a Fuseki endpoint.
The Fuseki endpoint listens at `http://localhost:3030/`.

## Concepts

### Traceability

| Concept                 | Explanation                                                          |
|-------------------------|----------------------------------------------------------------------|
| Horizontal Traceability | Traceability between the FTG, the PM, and the PT.                    |
| Vertical Traceability   | Traceability between versions of artifacts.                          |
| Federation              | Orchestration + adaptation.                                          |
| Orchestration           | Sending data/requests to the correct endpoint.                       |
| Adaptation              | Converting heterogeneous files to their ontological representation. |

NOTE: Add some information about services.

### Drivetrain

- [Use case 1][use-case-1]: Update drivetrain model with increasing complexity.
- [Use case 2][use-case-2]: Calibrate drivetrain sensors.
- [Use case 3][use-case-3]: Measuring data and checking for outliers.

### Federation

We aim to provide transparent federation, not only for data queries but also for services.
The data can be quite large.
We want to allow the user to query this data and perform any necessary operations without impacting the knowledge graph (KG).
This prevents us from ingesting all the data; some of it will need to be accessed on demand.
One drawback is that this prevents reasoning on the data instances, but that is not the main purpose of the data.
Reasoning should still be possible on the larger group of types.

One possible way of achieving this is by using virtual data.
Operations can be applied to this data by encoding them into the models.
The data itself is never ingested into the KG.

#### Orchestration

A query which requires data from two sources:

                       /------ endpoint A
    query ----- fuseki <
                       \------ endpoint B

Some ideas for implementing this are:

1. Create a mapper on the ontology level with a service which gets called to federate the data.
2. Add Fuseki modules which act like middleware and grab the needed resources.
3. Use the included ARQ functionality.

#### ARQ

Fuseki uses ARQ, which allows us to create queries that depend on other SPARQL endpoints.
The `SERVICE` keyword takes the IRI of the remote endpoint.

    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?class ?label
    WHERE {
      ?class a owl:Class.
      SERVICE <...> {
        ?class rdfs:label ?label
      }
    }

The [federation example readme][federation-readme] goes into more detail on this concept.

#### Adaptation

Adaptation concerns data ingestion.
The data is stored in heterogeneous formats.
For convenience's sake we'll use CSV as an example of one such data format.

To solve this problem a translation layer needs to be put in place.
Options include converting the CSV to RDF/OWL using RML.
This, however, poses another problem.
The ontology is designed using the OML domain specific language.
We will lose traceability between the OML and the instances.
There is also the problem of properly splitting the data and still being able to refer to it using OML concepts.

A few possible solutions exist.
For one, we can create a custom mapper and create OML directly from CSVs.
This has the benefit of creating these instances on the same level as the types.
This approach, however, does not scale and needs to be performed on the whole CSV.
We cannot stream only the part of the data which is required for a query.

Another option is to keep using RML and adapt the ontology.
RML already addresses remote file access, and a new version is in the works which should support the generic ontology structure we currently use.
A streaming version of the RML mapper is available.
We've yet to test this.

### CRUD

Moving the dataset into the database can be done in a few ways.

1. Through the GUI.
2. By performing create operations on the endpoint.
3. Through file operations.

When using Fuseki we have access to services to read and write to the database.
This makes it possible to query and update using SPARQL or the Graph Store Protocol.

SPARQL queries can be used, such as this one, which selects all the available classes.

    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?class ?label ?description
    WHERE {
      ?class a owl:Class.
      OPTIONAL { ?class rdfs:label ?label }
      OPTIONAL { ?class rdfs:comment ?description }
    }
    LIMIT 25

## Setting up the project

You need a few things to get the project working.
I am lazy so this is what I did, starting from the repository root:

    curl -s "https://get.sdkman.io" | bash
    sdk install gradle 8.0.1
    (cd "examples/omlSystemDesignOntology2Layers" && gradle clean && gradle build && gradle publishToMavenLocal)
    (cd "examples/Drivetrain" && gradle clean && gradle build && gradle publishToMavenLocal)

Your mileage may vary.

[oml]: http://www.opencaesar.io/oml/
[drawio]: https://app.diagrams.net/
[sprotty]: https://projects.eclipse.org/projects/ecd.sprotty
[ftgpm-plugin]: https://msdl.uantwerpen.be/git/jexelmans/drawio/raw/master/src/main/webapp/myPlugins/ftgpm.js
[draw-rys]: https://draw.rys.app/
[use-case-1]: ./docs/diagrams/use_case_1.drawio
[use-case-2]: ./docs/diagrams/use_case_2.drawio
[use-case-3]: ./docs/diagrams/use_case_3.drawio
[federation-readme]: ./docs/federation_example.md
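
### Querying the endpoint from a script

Once a Fuseki endpoint is running, the class query from the CRUD section can also be sent from a script instead of the GUI. The following is a minimal sketch using only the Python standard library, not part of this project's tooling. It assumes the dataset exposes a SPARQL query service at `http://localhost:3030/<dataset>/sparql`; the dataset name `example-dataset` and the function names `build_query_request`, `parse_bindings`, and `select` are hypothetical and chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the dataset name is hypothetical; only the server root
# http://localhost:3030/ is given in this README.
ENDPOINT = "http://localhost:3030/example-dataset/sparql"

CLASS_QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?class ?label
WHERE {
  ?class a owl:Class.
  OPTIONAL { ?class rdfs:label ?label }
}
LIMIT 25
"""


def build_query_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL Protocol POST request carrying a URL-encoded query."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def parse_bindings(results_json: str) -> list[dict[str, str]]:
    """Flatten SPARQL JSON results into one {variable: value} dict per row."""
    results = json.loads(results_json)
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in results["results"]["bindings"]
    ]


def select(endpoint: str, query: str) -> list[dict[str, str]]:
    """Send the query to the endpoint and return the parsed rows."""
    request = build_query_request(endpoint, query)
    with urllib.request.urlopen(request) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

With the server running, `select(ENDPOINT, CLASS_QUERY)` should return one dictionary per result row. Unbound optional variables such as `?label` are simply absent from a row, so `row.get("label")` is the safer way to read them.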