
Digital Twin Design

Design documents and prototypes for traceable digital twin representations.

The project follows a simple structure.

.
├── docs       
│   ├── design         Overall design of the digital twin, high level diagrams.
│   ├── diagrams       
│   └── models         
├── examples           
│   └── oml            
└── tools              Utilities aimed to aid in the design of ontologies.
    ├── oml-generators 
    └── draw           Draw.io plugins for writing FTG+PM++ and UML models.

Diagrams

The diagrams used throughout this project are either diagrams.net files or sprotty images. To create the FTG+PM++ diagrams a plugin is required. We host a version with the plugin pre-installed.

OML Subprojects

There are four projects under the examples/oml group, all of them openCAESAR OML ontologies. Each project comes with a Gradle build file, which you can use to convert the OML into OWL and even run a Fuseki endpoint. The Fuseki endpoint listens at http://localhost:3030/.

Concepts

Traceability

Concept	Explanation
Horizontal Traceability	Traceability between the FTG, the PM, and the PT.
Vertical Traceability	Traceability between versions of artifacts.
Federation	Orchestration + adaptation.
Orchestration	Sending data/requests to the correct endpoint.
Adaptation	Converting heterogeneous files to their ontological representation.

NOTE: Add some information about services.

Drivetrain

Use case 1: Update drivetrain model with increasing complexity.
Use case 2: Calibrate drivetrain sensors.
Use case 3: Measure data and check for outliers.

Federation

We aim to provide transparent federation, not only for data queries but also for services. The data can be quite large, and we want users to be able to query it and perform any necessary operations without impacting the knowledge graph (KG). This rules out ingesting all the data; some of it will need to be accessed on demand. One drawback is that this prevents reasoning over the data instances, but that is not the main purpose of the data. Reasoning should still be possible over the larger group of types.

One possible way of achieving this is by using virtual data. Operations can be applied to this data by encoding them into the models. The data itself is never ingested into the KG.
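One way to picture virtual data is as a reference plus a recorded list of operations, both of which are only resolved when a consumer asks for values. The sketch below is purely illustrative: `VirtualSeries`, its `loader` callback, and the `apply` method are hypothetical names, not part of this project or of Fuseki.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class VirtualSeries:
    """A lazily evaluated data reference (hypothetical illustration)."""

    source: str                               # where the data lives, e.g. a file URL
    loader: Callable[[str], Iterable[float]]  # fetches the values on demand
    operations: Tuple[Callable[[float], float], ...] = ()  # recorded, not yet applied

    def apply(self, fn: Callable[[float], float]) -> "VirtualSeries":
        """Record an operation without touching the underlying data."""
        return VirtualSeries(self.source, self.loader, self.operations + (fn,))

    def materialize(self) -> Iterator[float]:
        """Fetch the data and run the recorded operations, one value at a time."""
        for value in self.loader(self.source):
            for fn in self.operations:
                value = fn(value)
            yield value
```

Under this assumption, only the `source` and a description of the operations would need to live in the KG; the values themselves are fetched when `materialize()` is iterated.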

Orchestration

A query which requires data from two sources:

                    /------ endpoint A
query ----- fuseki <
                    \------ endpoint B

Some ideas for implementing this are:

Create a mapper on the ontology level with a service which gets called to federate the data.
Add Fuseki modules which act like middleware and grab the needed resources.
Use the included ARQ functionality.

ARQ

Fuseki uses ARQ, Apache Jena's query engine, which lets us write queries that depend on other SPARQL endpoints through the SERVICE keyword.

    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT DISTINCT ?class ?label
    WHERE {
      ?class a owl:Class.
      SERVICE <https://fuseki.rys.app/labels> { 
        ?class rdfs:label ?label
      }
    }

The federation example readme goes into more detail on this concept.
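If such queries need to be generated rather than written by hand, for instance by an orchestration layer that decides which endpoint serves which pattern, a small helper could assemble them. This is a minimal sketch; `federated_select` and its parameters are hypothetical and not part of the project.

```python
from typing import Dict, Sequence


def federated_select(
    variables: Sequence[str],
    local_pattern: str,
    service_url: str,
    remote_pattern: str,
    prefixes: Dict[str, str],
) -> str:
    """Build a SELECT query whose remote part is wrapped in a SERVICE clause."""
    # Reject characters that would break out of the <...> IRI reference.
    if any(ch in service_url for ch in "<> \t\n"):
        raise ValueError(f"not a usable endpoint IRI: {service_url!r}")
    prefix_lines = [f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()]
    projection = " ".join(f"?{v}" for v in variables)
    lines = prefix_lines + [
        "",
        f"SELECT DISTINCT {projection}",
        "WHERE {",
        f"  {local_pattern}",
        f"  SERVICE <{service_url}> {{",
        f"    {remote_pattern}",
        "  }",
        "}",
    ]
    return "\n".join(lines)


query = federated_select(
    ["class", "label"],
    "?class a owl:Class .",
    "https://fuseki.rys.app/labels",
    "?class rdfs:label ?label",
    {
        "owl": "http://www.w3.org/2002/07/owl#",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    },
)
print(query)
```

The call at the bottom rebuilds the query from this section; the resulting string can then be sent to Fuseki like any other SPARQL query.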

Adaptation

Adaptation concerns data ingestion. The data is stored in heterogeneous formats; for convenience we'll use CSV as an example of one such format. To solve this problem, a translation layer needs to be put in place. One option is to convert the CSV to RDF/OWL using RML. This, however, poses another problem: the ontology is designed in the OML domain-specific language, so we would lose traceability between the OML and the instances. There is also the problem of properly splitting the data while still being able to refer to it using OML concepts.

A few possible solutions exist. For one, we can write a custom mapper that creates OML directly from CSVs. This has the benefit of creating the instances on the same level as the types. This approach, however, does not scale and has to process the whole CSV; we cannot stream only the part of the data that a query requires.

Another option is to keep using RML and adapt the ontology. RML already addresses remote file access, and a new version is in the works that should support the generic ontology structure we currently use. A streaming version of the RML mapper is also available, but we have yet to test it.
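To make the idea of a streaming translation layer concrete, the sketch below converts a CSV to N-Triples one row at a time. It is not RML and not the RML mapper; it only illustrates the row-by-row shape such a layer could take. The base IRI, the `row/` path segment, and the choice to emit every cell as a plain string literal are assumptions made for the example.

```python
import csv
import io
from typing import Iterator, TextIO
from urllib.parse import quote


def escape_literal(text: str) -> str:
    """Escape the characters N-Triples does not allow raw inside a literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(handle: TextIO, base: str, id_column: str) -> Iterator[str]:
    """Yield one N-Triples line per non-empty cell, reading the CSV row by row."""
    for row in csv.DictReader(handle):
        # Percent-encode the identifier so it is safe inside an IRI.
        subject = f"<{base}row/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            if column is None or column == id_column or not value:
                continue  # skip the key column, empty cells, and overflow fields
            predicate = f"<{base}{quote(column, safe='')}>"
            yield f'{subject} {predicate} "{escape_literal(value)}" .'


# Hypothetical drivetrain measurements, used only to show the call.
sample = io.StringIO("id,speed,note\n1,12.5,ok\n2,,checked\n")
for triple in csv_to_ntriples(sample, "http://example.org/", "id"):
    print(triple)
```

Because `csv_to_ntriples` is a generator, a consumer can stop reading as soon as it has the rows it needs, which is the property the custom whole-file mapper lacks.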

CRUD

Moving the dataset into the database can be done in a few ways.

Through the GUI.
By performing create operations on the endpoint.
Through file operations.

When using Fuseki we have access to services to read and write to the database. This makes it possible to query and update using SPARQL or the Graph Store Protocol.
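As a rough illustration of the two write paths, the sketch below builds (but does not send) one SPARQL Update request and one Graph Store Protocol request using only the Python standard library. The dataset name `example` is hypothetical, and the `/update` and `/data` paths assume the dataset exposes Fuseki's default service names; adjust both to match the actual configuration.

```python
from urllib.parse import urlencode
from urllib.request import Request

FUSEKI = "http://localhost:3030"  # endpoint named in this README
DATASET = "example"               # hypothetical dataset name


def update_request(sparql_update: str) -> Request:
    """SPARQL Update: POST the update text to the dataset's update service."""
    return Request(
        f"{FUSEKI}/{DATASET}/update",
        data=sparql_update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


def graph_store_put(turtle: str, graph_iri: str) -> Request:
    """Graph Store Protocol: PUT replaces the named graph with the given Turtle."""
    return Request(
        f"{FUSEKI}/{DATASET}/data?{urlencode({'graph': graph_iri})}",
        data=turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="PUT",
    )
```

Passing either request to `urllib.request.urlopen` would send it to a running server.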

SPARQL queries can be used for reading. The following one selects up to 25 of the available classes, together with their labels and descriptions where present.

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?class ?label ?description
WHERE {
  ?class a owl:Class.
  OPTIONAL { ?class rdfs:label ?label }
  OPTIONAL { ?class rdfs:comment ?description }
}
LIMIT 25
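When the results are requested in the standard SPARQL JSON results format (`application/sparql-results+json`), each row arrives as a set of bindings, and variables from unmatched OPTIONAL patterns are simply absent. A minimal sketch of flattening such a response follows; the function name and the sample payload are made up for illustration.

```python
import json
from typing import Dict, List, Optional


def rows_from_sparql_json(payload: str) -> List[Dict[str, Optional[str]]]:
    """Flatten a SPARQL JSON result into dicts; unbound variables become None."""
    doc = json.loads(payload)
    variables = doc["head"]["vars"]
    return [
        {var: binding.get(var, {}).get("value") for var in variables}
        for binding in doc["results"]["bindings"]
    ]


# Made-up response: one class with a label but no rdfs:comment.
sample = json.dumps({
    "head": {"vars": ["class", "label", "description"]},
    "results": {"bindings": [{
        "class": {"type": "uri", "value": "http://example.org/Sensor"},
        "label": {"type": "literal", "value": "Sensor"},
    }]},
})
for row in rows_from_sparql_json(sample):
    print(row)
```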

Setting up the project

You need a few things to get the project working. I am lazy so this is what I did:

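# Install SDKMAN and load it into the current shell.
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
sdk install gradle 8.0.1
# Run each build in a subshell so both paths resolve from the repository root.
(cd "examples/omlSystemDesignOntology2Layers" && gradle clean && gradle build && gradle publishToMavenLocal)
(cd "examples/Drivetrain" && gradle clean && gradle build && gradle publishToMavenLocal)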

Your mileage may vary.
