Friday, May 22, 2009

Semantics 1.1 Coming Soon

Intellidimension will soon be releasing a new version of its Semantics.SDK and Semantics.Server (v 1.1). Here are some of the key features:

  • Much improved query compilation. This includes better graph index statistics and caching of compiled query plans.
  • Support for distributed graphs. A graph can be partitioned over multiple servers for applications that require high-performance querying on large data models. Distributed graph partitions can be backed by SQL Server databases or just use the high-performance in-memory r/w cache. The framework ensures maximum parallelization during queries and loads.
  • A set of new visual tools for setting up models and testing and debugging queries. Many improvements to these tools are planned over the next several months.

Both products will be available for a free 60-day evaluation.

Stay tuned!

Tuesday, May 5, 2009

Defining an Entity Description Type

One of the first design issues encountered when using the Entity Framework is the design of your description types. The role of a description type is to define the data that makes up an entity. So you need to start with a clear understanding of your data model. Let’s start with a simple class of object like a Person and work through an example.

A Person will have a couple of simple literal properties such as:

  • Full Name
  • First Name
  • Last Name

The simplest way to define a description type with the Entity Framework is with a rulebase. So let’s define a rulebase for the Person description.


select ?s ?p ?o ?r where {?s ?p ?o. filter(?s=?r).} as <description>

We start by adding a description rule that includes all direct properties of the entity. This is the most basic form of a description type. Things get more complicated when we introduce references to other objects. This raises a common data modeling question: is an object contained by another object, or just referenced by it? So let’s add an example of each type of property.

  • Address (contains an Address object)
  • Knows (references another Person object)

So I will add two additional rules to the description rulebase to include these new properties.

select ?s ?p ?o ?r where {
?r x:Address ?s. ?s ?p ?o.
}as <description>


select ?s ?p ?o ?r where {
?r x:Knows ?s.
?s ?p ?o.
filter(?p=x:FullName)
} as <description>

The first rule includes all the direct properties of the contained Address object. The second rule includes just the full name of the referenced Person object to provide some user-friendly identity for the reference.

This description type is useful for retrieving an entity from the store. However, get and put operations are often asymmetric: you often retrieve more data than you want to store after editing. In this example we would not want to store any statements about the referenced Person object. To handle this we use a different description type when storing the entity, one that does not include any of the statements for the referenced object.

So we would remove the following rule from the storage description type.

select ?s ?p ?o ?r where {
?r x:Knows ?s.
?s ?p ?o.
filter(?p=x:FullName)
} as <description>
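
To make the asymmetry between retrieval and storage concrete, here is a minimal sketch in Python rather than Semantics.SDK code. It models each rule as a function over a small in-memory set of triples. The sample data and the function names (direct, contained_address, known_names, describe) are hypothetical and exist only for illustration.

```python
# Hypothetical in-memory model of the Person example; not Semantics.SDK code.
TRIPLES = {
    ("x:alice", "x:FullName", "Alice Smith"),
    ("x:alice", "x:Address", "x:addr1"),
    ("x:alice", "x:Knows", "x:bob"),
    ("x:addr1", "x:City", "Boston"),
    ("x:bob", "x:FullName", "Bob Jones"),
    ("x:bob", "x:LastName", "Jones"),
}

def direct(r):
    """Rule 1: every statement whose subject is the entity itself."""
    return {t for t in TRIPLES if t[0] == r}

def contained_address(r):
    """Rule 2: every statement about the Address object the entity contains."""
    addrs = {o for s, p, o in TRIPLES if s == r and p == "x:Address"}
    return {t for t in TRIPLES if t[0] in addrs}

def known_names(r):
    """Rule 3: only the x:FullName of each Person the entity references."""
    known = {o for s, p, o in TRIPLES if s == r and p == "x:Knows"}
    return {t for t in TRIPLES if t[0] in known and t[1] == "x:FullName"}

def describe(r, rules):
    """Union the statements selected by each rule for entity r."""
    result = set()
    for rule in rules:
        result |= rule(r)
    return result

RETRIEVAL_RULES = [direct, contained_address, known_names]
STORAGE_RULES = [direct, contained_address]  # the referenced Person is left out

retrieved = describe("x:alice", RETRIEVAL_RULES)
stored = describe("x:alice", STORAGE_RULES)
```

In this sketch the only difference between the two results is the full name of the referenced Person, which is read for display but never written back.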

Note that these description type rules are for illustrative purposes. It is sometimes much easier to classify containment vs. reference through annotations on the property in the ontology. In this case, you can have one set of rules for all classes in your data model.

For example we could get all contained objects using a generic rule.

select ?s ?p ?o where {
?r ?x ?y.
?x x:referenceType x:Containment.
<description>(?s, ?p, ?o, ?y).
}
as <description>

The rule above recursively includes all properties that are marked with a reference type of Containment.

As you can see description types are just a tool that can be configured to meet the needs of your data model.

Monday, May 4, 2009

Setting up an Entity Model

This article is about the RDF Entity Framework. It shows you how to set up a basic entity model using Semantics.Server.

Setting up Semantics.Server

The first step in setting up an entity model is configuring a Semantics.Server database with the graphs that will hold your entity model. We will use the Semantics.Server API in the Semantics.SDK to do this.

Create a provider graph that will hold the information about your model setup.

SemanticServerModel entityStore = new SemanticServerModel(ConnectionString);

entityStore.Graphs.CreateMultiGraph("http://entitystore/graph-provider");

Create a graph to hold the facts about the entity instances.

entityStore.Graphs.CreateGraph("http://entitystore/graph-fact");

Create a graph to hold facts about the ontology.

entityStore.Graphs.CreateGraph("http://entitystore/graph-ontology");

Create a rulebase for your CBD description type.

entityStore.Rulebases.CreateRulebase("http://entitystore/rulebase-cbd-description", EntityProviderRulebase.ConciseBoundedDescription);

Setting up the Entity Model

We will use the ModelSetup helper class to define an entity model for the Semantics.Server database we just set up. First we define a URI for our model.

ModelSetup modelSetup = new ModelSetup("http://entitystore/model/model-1");

Map the fact graph into the entity model. This connects the logical graph name with the physical name.

GraphSetup factGraph = new GraphSetup(EntityGraphUri.Fact, true);
factGraph.AddTarget("http://entitystore/graph-fact");
modelSetup.AddGraph(factGraph);

Map the ontology graph into the entity model.

GraphSetup ontologyGraph = new GraphSetup(EntityGraphUri.Ontology, true);
ontologyGraph.AddTarget("http://entitystore/graph-ontology");
modelSetup.AddGraph(ontologyGraph);

Map the description graph onto the fact graph since we are not using any other entity data graphs in this model.

GraphSetup descriptionGraph = new GraphSetup(EntityGraphUri.Description);
descriptionGraph.AddTarget("http://entitystore/graph-fact");
modelSetup.AddDefaultGraph(descriptionGraph);

Create a description type based on our CBD rules.

DescriptionTypeSetup cbdType = new DescriptionTypeSetup("http://entitystore/model/model-1/type-default");
cbdType.AddRulebase("http://entitystore/rulebase-cbd-description");
modelSetup.AddDefaultDescriptionType(cbdType);

Create the entity model.

SemanticServerProvider provider = new SemanticServerProvider(ConnectionString);
provider.Settings.ProviderGraphUri = "http://entitystore/graph-provider";
EntityModel model = modelSetup.CreateEntityModel(provider);

That's it.

Entity Framework Overview

The Entity framework is built on top of the Semantics.SDK and provides entity-based transactions on an RDF store such as Semantics.Server. This article will cover some of the basic concepts of the Entity framework.

Entity

An entity consists of two things: (1) a URI that identifies the entity and (2) a set of RDF statements that describe the entity.

Description Type

A description type defines the statements associated with an entity. In the simple case, just the statements with the entity URI as the subject value are considered part of the entity. In real applications an entity description is often much richer than this. A common description type is called a CBD (concise bounded description). A CBD is defined as all statements with the entity URI as the subject value plus any anonymous nodes recursively referenced by any of those statements. Description types can be defined using rules and/or custom .NET code. The example below shows the CBD definition using rules.
RULEBASE 
(
    SELECT ?s ?p ?o ?r WHERE {?s ?p ?o. filter(?r=?s)} AS <description>
    SELECT ?s ?p ?o ?r WHERE {
        ?r ?x ?y.
        {<description>(?s, ?p, ?o, ?y)}
        filter(isblank(?y))
    } AS <description>
)
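
The recursion in the CBD definition can be easier to follow as ordinary code. The following is a minimal sketch in Python, independent of the Semantics.SDK. It assumes triples are plain tuples of strings and that blank nodes are written with a "_:" prefix. The function names is_blank and cbd and the sample data are hypothetical.

```python
def is_blank(node):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return node.startswith("_:")

def cbd(triples, entity):
    """Collect the entity's statements plus those of any blank nodes reachable from it."""
    description = set()
    pending = [entity]
    visited = set()
    while pending:
        subject = pending.pop()
        if subject in visited:
            continue  # guards against cycles between blank nodes
        visited.add(subject)
        for s, p, o in triples:
            if s == subject:
                description.add((s, p, o))
                if is_blank(o):
                    pending.append(o)  # follow anonymous nodes only
    return description

triples = {
    ("ex:book", "dc:title", "RDF Basics"),
    ("ex:book", "dc:creator", "_:b1"),
    ("ex:book", "dc:relation", "ex:other"),
    ("_:b1", "foaf:name", "Pat"),
    ("_:b1", "foaf:knows", "_:b2"),
    ("_:b2", "foaf:name", "Sam"),
    ("ex:other", "dc:title", "Unrelated"),
}
book_cbd = cbd(triples, "ex:book")
```

The statements about ex:other are not included because it is a named resource: the description stops at URIs and only continues through anonymous nodes.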
 
Retrieving an Entity

An entity is retrieved using the RdfEntity framework via the EntityModel class. An entity model requires a URI for the model and an EntityServiceProvider in its constructor. We will discuss the EntityServiceProvider class later, but it can be thought of as an entity store driver.

EntityModel model = new EntityModel(provider, "model:MyModel");
model.UriResolver.AddNamespace("dc", "http://purl.org/dc/elements/1.1/");
model.UriResolver.AddNamespace("foaf", "http://xmlns.com/foaf/0.1/");
model.UriResolver.AddNamespace("model", "http://entitystore/model/model-1/");
model.UriResolver.AddNamespace("store", "http://entitystore/");

Entity derrish = model.GetEntity("model:DerrishRepchick");
derrish.DescriptionType = "model:CBD";
 
An entity is constructed when EntityModel.GetEntity is called. The entity is bound to the specified URI and the entity model, and therefore to the underlying service provider. All calls on the instance of the entity will be routed through the entity model that created it. Notice how we set the DescriptionType property on the entity. This tells the framework which statements to retrieve for this entity. The statements are not actually retrieved until a graph is accessed for that entity.
string creator = derrish.NamedNode[EntityGraphUri.Metadata]["dc:creator"].FirstOrDefault();
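
The deferred retrieval described above is a common lazy-loading pattern. As a rough sketch of the idea only (in Python, and not a description of how the Entity framework is implemented), the class below fetches statements the first time they are accessed and reuses them afterwards. LazyEntity, fake_store and the sample value are all hypothetical.

```python
class LazyEntity:
    """Hypothetical stand-in for an entity whose statements load on first access."""

    def __init__(self, uri, fetch):
        self.uri = uri
        self.description_type = None
        self._fetch = fetch        # callable(uri, description_type) -> list of statements
        self._statements = None    # nothing has been retrieved yet

    @property
    def statements(self):
        if self._statements is None:
            # First access: ask the store using whatever description type is set now.
            self._statements = self._fetch(self.uri, self.description_type)
        return self._statements

calls = []

def fake_store(uri, description_type):
    """Record each request so the example can show when retrieval happens."""
    calls.append((uri, description_type))
    return [(uri, "dc:creator", "example value")]

entity = LazyEntity("model:DerrishRepchick", fake_store)
entity.description_type = "model:CBD"
calls_before_access = len(calls)   # still zero: setting the type fetched nothing
first = entity.statements          # triggers the single fetch
second = entity.statements         # served from the cached result
```

Because the fetch is deferred, the description type can be set after the entity is constructed and still take effect on the first read.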
 
Entity Graphs

An entity model is logically and/or physically partitioned into multiple graphs. The Entity framework has some well-known graphs that are defined as follows. You may choose which, if any, of these graphs to support in your entity model. However, some are required for certain features of the Entity framework.

  • Ontology Graph: Contains statements about the ontology. This is a required graph.
  • Fact Graph: Contains explicit facts about entities. This is a required graph.
  • Description Graph: Contains the full entity description (facts + inferences). This is a required read-only graph.
  • Inference Graph: Contains only inferences about entities. This is an optional read-only graph.
  • Metadata Graph: Contains metadata about the entities. This is an optional read-only graph.
  • Provider Graph: Contains metadata about the underlying service provider. This is a required (optional for some providers) read-only graph.

Storing an Entity

A modified entity can be stored by calling the SaveToStore method. Make sure to set the correct description type on the entity before storing. An entity will often have asymmetric description types, meaning that more statements are retrieved than stored. For example, you may retrieve facts and inferences but only want to store facts. A similar issue exists when removing an entity from the store.

Service Providers

The Entity framework contains a built-in provider for Semantics.Server called SemanticServerProvider. This service provider allows an entity model to be maintained in a Semantics.Server database. The SemanticServerProvider has a convenient helper class for setting up an entity model called ModelSetup.

Querying a graph using SPARQL

The Intellidimension Semantics.SDK supports the SPARQL syntax for querying RDF data. This article will focus on using SPARQL to query a single in-memory graph and accessing the results.

The DataSource.Query Method

As mentioned in earlier articles, the abstract base class DataSource provides the interface to all sources of RDF data. All DataSource objects support a Query method with several overloads. The simplest overload takes a single string parameter that is the SPARQL query string to be executed on the DataSource object. The Query method returns a Table object that contains any results for the query.

SELECT Command

One of the most commonly used SPARQL commands is the SELECT command. When executing a SELECT command against a single DataSource object, the name of the graph does not need to be included in the FROM clause of the command since it is implied, as shown below.
GraphDataSource g = new GraphDataSource();

Table results = g.Query(@"
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  SELECT ?title WHERE
  {<http://example.org/book1> dc:title ?title}");
 
Per the SPARQL specification, the graph pattern is matched against the statements in the DataSource object. The variables in the select list are bound as specified in the graph pattern and returned as results.

Query Results

SPARQL commands such as the SELECT command return query results in a Table object. The Table object has a row for each distinct set of column values. The column values are defined in the select list of the SELECT command. Below is an example of how to iterate over the results of a SELECT command.
for (int i = 0; i < results.RowCount; i++)
{
    RdfLiteral title = results[i][0] as RdfLiteral;
}
 
The Semantics.SDK provides support for the XML serialization of query results using the SPARQL result syntax, as shown below.
SparqlXmlFormatter fmt = new SparqlXmlFormatter(
results, typeof(CommandSelect));
fmt.Write(stream);
 
Parameterized Queries

SPARQL queries can be parameterized using the Semantics.SDK via the QueryParameters object. The QueryParameters object is used to bind an RdfValue object to a named parameter that is specified in the query using the @ character followed by any valid variable name. A query parameter can be used anywhere that a query variable can be specified. Below is an example of the use of a query parameter in a SPARQL SELECT command.
QueryParameters qp = new QueryParameters();
qp.Add("id", new RdfUri("http://example.org/"));

g.Query(@"SELECT ?s ?p ?o WHERE
{?s ?p ?o. filter(?s=@id)}", qp);
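
To illustrate what binding a named parameter means during pattern matching, here is a small Python sketch. It is a toy matcher over tuples, not the Semantics.SDK query engine. It follows the conventions described above: terms beginning with ? are variables and terms beginning with @ are parameters. The match function and the sample data are hypothetical.

```python
def match(triples, pattern, params=None):
    """Return one dict of variable bindings per statement that matches the pattern.

    Terms starting with '?' are variables. Terms starting with '@' are
    parameters and are compared against their bound value in params.
    """
    params = params or {}
    rows = []
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(term, value) != value:
                    break
            else:
                expected = params[term[1:]] if term.startswith("@") else term
                if expected != value:
                    break
        else:
            rows.append(bindings)
    return rows

triples = [
    ("http://example.org/", "dc:title", "Home"),
    ("http://example.org/book1", "dc:title", "SPARQL Tutorial"),
]
rows = match(triples, ("@id", "?p", "?o"), {"id": "http://example.org/"})
```

Keeping the value outside the query text means the same query string can be reused with different bindings, and the value never has to be escaped into the query.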
 
Conclusion

This article provided an introduction to the objects and methods used to execute a SPARQL query and access the results using the Semantics.SDK. However, this article did not provide much insight into the Semantics.SDK support for all the SPARQL commands and extensions. The Semantics.SDK supports a variety of SPARQL commands such as SELECT, DESCRIBE, CONSTRUCT, ASK, INSERT, and DELETE. In addition, it provides support for inference rules and a variety of extension functions. These will all be discussed in future articles.

Reading and writing a RDF Graph

The Intellidimension Semantics.SDK contains a suite of RDF readers and formatters for .NET. These allow applications to read and write serialized RDF data from and to streams.

RdfReader and RdfFormatter

The Semantics.SDK has two abstract base classes that provide the interfaces for reading and formatting RDF data, respectively. Each operates on an instance of a stream object that must be created in the application code before using the class. This allows the RdfReader and RdfFormatter classes to operate on virtually any source that supports the TextReader or TextWriter interface as defined in System.IO. The RdfReader class is designed to read data from a stream into an instance of a DataSource, while the RdfFormatter class writes data from an instance of a DataSource to a stream.

RDF Syntaxes

The Semantics.SDK supports most of the standard RDF syntaxes. The list below gives each RDF syntax along with the RdfReader and RdfFormatter classes that implement the serialization for that syntax.

  • RDF/XML: RdfXmlReader, RdfXmlFormatter
  • N Triples: NTriplesReader, NTriplesFormatter
  • Turtle: TurtleReader, TurtleFormatter
  • RDFa: RdfaReader

Reading RDF

In the Semantics.SDK, all RDF graphs derive from the base class DataSource. This base class defines several overloaded Read methods for loading RDF data into an instance of a DataSource. The Read method is used to load RDF data into the DataSource from a string, stream or location specified as a URI. Each of these overloaded methods provides the option of specifying the default base URI for any relative URIs used in the RDF data. The code below shows an example of how to read an RDF/XML file into an in-memory graph. Note that one of the overloaded Read methods is a generic method in which the class of the RdfReader is specified.
GraphDataSource g = new GraphDataSource();
StreamReader s = new StreamReader(@"c:\sample.rdf");
g.Read(s);
 
Writing RDF

The DataSource class also defines several overloaded Format methods for writing RDF data from an instance of a DataSource to a string or stream. Like the Read method, the Format method also has an overloaded generic form in which the class of the RdfFormatter is specified. The code below shows an example of how to store the contents of an in-memory graph to a file using the N Triples syntax.
GraphDataSource g = new GraphDataSource();
StreamWriter s =
    new StreamWriter(@"c:\sample.rdf");
g.Format(s);
 
The code below shows an example of how to store the contents of an in-memory graph to a string using the turtle syntax.
GraphDataSource g = new GraphDataSource();
String s = g.Format();
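
For readers unfamiliar with the N Triples syntax mentioned above, the following Python sketch shows what a formatter for it has to produce: one statement per line, URIs in angle brackets, literals in escaped double quotes, and a terminating period. It is a simplified illustration and not the NTriplesFormatter implementation. It handles only URIs and plain literals, and the escape and to_ntriples names are hypothetical.

```python
def escape(text):
    """Escape the characters that must not appear raw inside a quoted literal."""
    return (text.replace("\\", "\\\\")   # backslash first, so later escapes survive
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def to_ntriples(statements):
    """Serialize (subject, predicate, (kind, value)) tuples, one statement per line."""
    lines = []
    for subject, predicate, (kind, value) in statements:
        obj = f"<{value}>" if kind == "uri" else f'"{escape(value)}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .\n")
    return "".join(lines)

output = to_ntriples([
    ("http://www.intellidimension.com/sdj.pdf",
     "http://www.w3.org/2000/01/rdf-schema#label",
     ("literal", "Getting Started With Graphs")),
])
```

A full formatter would also need to handle blank nodes, language tags and datatyped literals, which this sketch leaves out.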
 
Conclusion

This article provided a brief introduction to serializing RDF data to and from a stream. The Semantics.SDK also allows RDF data to be stored in and retrieved from a Microsoft SQL Server® database using Intellidimension Semantics.Server. This will be discussed in detail in future articles.

Getting Started with Graphs

The Intellidimension Semantics.SDK provides a simple object-oriented API for working with RDF data using Microsoft .NET. The most basic structure for managing RDF data is a graph, which is a collection of RDF statements. This article will focus on the basic interface the Semantics.SDK provides for manipulating statements in a graph.

In-memory Graphs

The Semantics.SDK provides an abstract base class DataSource that provides an interface for working with an RDF graph. The most commonly used implementation of a DataSource is the GraphDataSource, which implements an in-memory RDF graph.

Adding Statements

One of several ways to add statements to any DataSource is via its Add method.
GraphDataSource g = new GraphDataSource();

g.Add(new RdfUri("http://www.intellidimension.com/sdj.pdf"),
new RdfUri("http://www.w3.org/2000/01/rdf-schema#label"),
new RdfLiteral("Getting Started With Graphs"));
 
The parameters to the Add method specify the subject, predicate and object values, respectively, of the statement to be added to the graph. There are several overloads to the Add method.

Removing Statements

Similarly, a statement can be removed from a DataSource by calling its Remove method and specifying the subject, predicate and object values of the statement to be removed. Any one of these parameters can be set to null. In that case the parameters for the Remove method act as a statement mask where each null value is treated as a wildcard value for matching statements in the DataSource. In this manner multiple statements, or all of them, can be removed from the DataSource with a single call.

Getting Statements

A DataSource object is a collection of RDF statements, each represented by an instance of the class Statement. All DataSource objects provide access to the collection of statements via their GetStatements method. The code below shows how to iterate over all the statements in a DataSource.
foreach (Statement stmt in g.GetStatements())
{
    RdfValue s = stmt.Subject;
    RdfValue p = stmt.Predicate;
    RdfValue o = stmt.Object;
}
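
The statement-mask behaviour of Remove described earlier can be modeled in a few lines. The sketch below is written in Python against a plain set of tuples and is not the Semantics.SDK implementation. None plays the role of null as the wildcard, and the remove function and the sample data are hypothetical.

```python
def remove(statements, subject=None, predicate=None, obj=None):
    """Remove every statement matching the mask and return how many were removed.

    A None component matches any value, mirroring null in the mask.
    """
    mask = (subject, predicate, obj)
    matched = {
        st for st in statements
        if all(m is None or m == v for m, v in zip(mask, st))
    }
    statements.difference_update(matched)  # mutate the caller's set in place
    return len(matched)

graph = {
    ("ex:doc1", "rdfs:label", "First"),
    ("ex:doc2", "rdfs:label", "Second"),
    ("ex:doc1", "dc:creator", "ex:pat"),
}

removed_labels = remove(graph, predicate="rdfs:label")  # wildcard subject and object
remaining_after_labels = set(graph)
removed_rest = remove(graph)                            # all wildcards: clears the graph
```

With every component left as a wildcard the mask matches everything, which is how a single call can empty the whole graph.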
 
For each Statement, the Subject, Predicate and Object properties are retrieved as an instance of the RdfValue class.

The RdfValue Class

The RdfValue class is an abstract base class that is used to represent the subject, predicate and object values of a statement. An RDF resource is represented by the RdfUri class and an RDF literal is represented by the RdfLiteral class.
double d = (double)(RdfLiteral)stmt.Object;
DateTime dt = (DateTime)(RdfLiteral)stmt.Object;
 
The RdfValue class is widely used throughout the Semantics.SDK, and it supports some conveniences such as conversion operators to simplify the integration with other data types in .NET.

Conclusion

This article provided a brief introduction to manipulating statements in an RDF graph. The Semantics.SDK provides other capabilities, such as issuing queries to graphs, that will be discussed in future articles.

What is Semantics.Server?

Semantics.Server is a set of .NET assemblies that are installed on Microsoft SQL Server enabling it to store, query and perform inferencing on RDF data. All the RDF data stored using Semantics.Server is backed by relational tables thereby leveraging the capabilities of Microsoft SQL Server. Even inferencing can be backed by relational tables. This eliminates memory limits that are often reached when executing inference rules over large models. An evaluation copy of Semantics.Server can be downloaded from: http://www.intellidimension.com/

What is the Semantics.SDK?

The Semantics.SDK is a commercially available product for adding semantics-based computing to your .NET application. It consists of visual tools for working with RDF, RDFS/OWL ontologies, SPARQL queries and inference rules. These tools are based on a complete .NET-based API that is also available to the developer. Semantics.SDK features:
  • In-memory RDF graphs
  • SPARQL query engine
  • Inference engine
  • Extensible data service interface for integration
  • API for Semantics.Server

An evaluation copy of Semantics.SDK can be downloaded from:

http://www.intellidimension.com/