Example Walkthrough

This page will walk you through the Java version of the Simple Acquaintances example.

Setup

First, ensure that your system meets the prerequisites. Then clone the psl-examples repository:

git clone https://github.com/linqs/psl-examples.git

Running

Then move into the root directory for the simple acquaintances Java example:

cd psl-examples/simple-acquaintances/java

Each example comes with a run.sh script to quickly compile and run the example. To compile and run the example:

./run.sh

To see the output of the example, check the inferred-predicates/KNOWS.txt file:

cat inferred-predicates/KNOWS.txt

You should see some output like:

'Arti'	'Ben'	0.48425865173339844
'Arti'	'Steve'	0.5642937421798706
< ... 48 rows omitted for brevity ...>
'Jay'	'Dhanya'	0.4534565508365631
'Alex'	'Dhanya'	0.48786869645118713

The exact order of the output may change and some rows were left out for brevity.

Now that we have the example running, lets take a look inside the only source file for the example: src/main/java/org/linqs/psl/examples/simpleacquaintances/Run.java.

Configuration

All configuration in PSL is handled through the Config object. By default, PSL will look for two configuration files: psl.properties and log4j.properties. You can find these files in the src/main/resources directory. The Config class will automatically load these files (if they exist) and all the options in them. Configuration options can still be set using the addProperty() and setProperty() methods of the Config class.

Defining Predicates

The definePredicates() method defines the three predicates for our example:

model.addPredicate("Lived", ConstantType.UniqueStringID, ConstantType.UniqueStringID);
model.addPredicate("Likes", ConstantType.UniqueStringID, ConstantType.UniqueStringID);
model.addPredicate("Knows", ConstantType.UniqueStringID, ConstantType.UniqueStringID);

Each predicate here takes two unique string identifiers as arguments. Note that for unique identifiers, ConstantType.UniqueStringID and ConstantType.UniqueIntID are available. Having integer identifiers usually requires more pre-processing on the user’s side, but gains better performance.

Defining Rules

The defineRules() method defines six rules for the example. This page covers how rules can be defined in PSL. We will discuss the following two rules:

model.addRule("20: Lived(P1, L) & Lived(P2, L) & (P1 != P2) -> Knows(P1, P2) ^2");
model.addRule("5: !Knows(P1, P2) ^2");

The first first rule can be read as “If P1 and P2 are different people and have both lived in the same location, L, then they know each other”. Some key points to note from this rule are:

The second rule is a special rule that acts as a prior. Notice how this rule is not an implication like all the other rules. Instead, this rule can be read as “By default, people do not know each other”. Therefore, the program will start with the belief that no one knows each other and this prior belief will be overcome with evidence.

Loading Data

The loadData() method loads the data from the flat files in the data directory into the data store that PSL is working with. For berevity, we will only be looking at two files:

Inserter inserter = dataStore.getInserter(model.getStandardPredicate("Lived"), obsPartition);
inserter.loadDelimitedData(Paths.get(DATA_PATH, "lived_obs.txt").toString());

inserter = dataStore.getInserter(model.getStandardPredicate("Likes"), obsPartition);
inserter.loadDelimitedDataTruth(Paths.get(DATA_PATH, "likes_obs.txt").toString());

Both portions load data using an Inserter. The primary difference between the two calls is that the second one is looking for a truth value while the first one assumes that 1 is the truth value.

If we look in the files, we see lines like:

../data/lives_obs.txt

Jay	Maryland
Jay	California

../data/likes_obs.txt

Jay	Machine Learning  1
Jay	Skeeball 0.8

In lives_obs.txt, there is no need to use a truth value because living somewhere is a discrete act. You have either lived there or you have not. Liking something, however, is more continuous. Jay may like Machine Learning 100%, but he only likes Skeeball 80%.

Partitions

Here we must take a moment to talk about data partitions. In PSL, we use partitions to organize data. A partition is nothing more than a container for data, but we use them to keep specific chunks of data together or separate. For example if we are running evaluation, we must be sure not use our test partition in training. A more complete discussion of partitions and data storage in PSL can be found here on this page.

PSL users typically organize their data in at least three different partitions (all of which you can see in this example):

Running Inference

The runInference() method handles running inference for all the data we have loaded.

Before we run inference, we have to set up a database to use for inference:

StandardPredicate[] closedPredicates = new StandardPredicate[]{model.getStandardPredicate("Lived"), model.getStandardPredicate("Likes")};
Database inferDB = dataStore.getDatabase(targetsPartition, closedPredicates, obsPartition);

The getDatabase() method of DataStore is the proper way to get a database. This method takes a minimum of two parameters:

Now we are ready to run inference:

InferenceApplication inference = new MPEInference(model, inferDB);
inference.inference();

inference.close();
inferDB.close();

To the MPEInference constructor, we supply our model and the database to infer over. To see the results, then we will need to look inside of the target partition.

Output

The method writeOutput() handles printing out the results of the inference. There are two key lines in this method:

Database resultsDB = dataStore.getDatabase(targetsPartition);
...
for (GroundAtom atom : resultsDB.getAllGroundAtoms(model.getStandardPredicate("Knows"))) {

The first line gets a fresh database that we can get the atoms from. Notice that we are passing in targetsPartition as a write partition, but we are actually just reading from it.

The second line uses the Queries class to iterate over all the Knows atoms from the database we just created.

Evaluation

Lastly, the evalResults() method handles seeing how well our model did. The DiscreteEvaluator class provides basic tools to compare two partitions. In this example, we are comparing our target partition to our truth partition.

Edit This Page