We are happy to announce the release of PSL version 2.3.0! We have made great improvements to PSL in the areas of optimization and infrastructure. In this changelog, you will find a list of the major changes in 2.3.0 as well as information on migrating from 2.2.2. Version 2.3.0 is scheduled to be the last minor release before our next major release, PSL 3.0.0. PSL 3.0.0 will contain breaking changes, including dropping Java 7 support.

For those of you who learn better by example, check out the PSL examples repository. The repository includes tags that track specific versions of PSL, which help you ensure that the examples use the same version of PSL as you do.

Both the primary PSL repository and the PSL examples repository are changing their branching scheme. Previously, the default branch, master, tracked the latest stable PSL release, while the develop branch contained all active development. Going forward, only one branch, main, will be used, and it will contain all active development. Tags will continue to be used to track releases (major and minor).


Infrastructure Improvements

PSL 2.3.0 comes with various improvements to our infrastructure and software quality.

CI

Commit: d10e2f4f

Continuous integration (CI) for PSL has moved from Travis to GitHub Actions. In addition, many new checks have been added to the PSL CI, including:

  • Lint checks on our Java codebase. 35062c03
  • Updated PyPI versioning to create unique identifiers for each build. 1045d4bd
  • Non-release builds are pushed to the Sonatype snapshot repository (Java artifacts) and test.pypi.org.
  • Java versions 8, 11, 16, and 17 are tested. 49180f65
  • Python versions 3.7, 3.8, and 3.9 are tested. 49180f65

Testing

Along with improvements to our CI, we have made various improvements to make PSL’s testing more robust and accessible for new contributors:

  • Moved PSL test utils to their own package. 41800c48
  • Added a common test class, PSLBaseTest, that all PSL core tests can derive from. This base test includes common setup and cleanup functionality. 88d82829
  • Added standard JUnit assertions to the base PSL test class. Child tests no longer need to import the assertions or configure them (for example, when comparing floating point numbers). 1d6e681f
  • Incorporated the functionality from PSLTest into PSLBaseTest. 00cf883e
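
The idea of configuring floating point comparisons once in a shared base class can be sketched as follows. This is a Python analogy, not PSL's Java test code: the class names, the helper name, and the tolerance value are all assumptions made for illustration.

```python
import unittest


class BaseTestSketch(unittest.TestCase):
    """Hypothetical analogue of a shared base test class."""

    # Assumed tolerance; the value PSL uses is not stated in this changelog.
    EPSILON = 1e-5

    def assert_float_equals(self, expected, actual):
        # Child tests call this without choosing a tolerance themselves.
        self.assertAlmostEqual(expected, actual, delta=self.EPSILON)


class ExampleTest(BaseTestSketch):
    def test_sum(self):
        # 0.1 + 0.2 is not exactly 0.3 in floating point arithmetic.
        self.assert_float_equals(0.3, 0.1 + 0.2)
```

The child test only derives from the base class; it does not import assertion helpers or pick a tolerance.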

Systems Improvements

In our continued dedication to performant software, PSL 2.3.0 includes several systems-level improvements that affect both speed and memory usage:

  • Grounding will perform additional checks when instantiating rules to avoid creating trivial ground rules. fc898bfd
  • Grounding will bypass querying the database for atoms from closed predicates that it can safely infer a value for. befbf5ab
  • Grounding will skip instantiating certain closed atoms that are deemed useless. ee12f8f7
  • Batch allocation of ground rules is modified to optimize time spent allocating structures. b7f98282
  • Numeric constructors are replaced with calls to valueOf() of the appropriate type. 0a7d6c6f

PSL Interfaces

The PSL 2.3.0 release includes non-breaking changes to one PSL interface and a brand new mid-level interface.

CLI Run Script Updates

Commit (psl-examples): a8c01a64

The PSL run scripts (run.sh) provided with the PSL examples now include a variable (RUN_SCRIPT_VERSION) denoting the version of the run script. This version is independent of the PSL version and provides a mechanism for checking whether a script (which may have been copied to your own model) is out-of-date.

Additionally, the CLI run scripts will now work with snapshot builds even if PSL has not been built on the local machine. The script will first check for a locally built instance of PSL matching the specified version, and if that does not exist the script will fetch a matching snapshot build from our test servers. To update the build/jar being used, you must delete the existing jar to force a re-fetch.
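
The lookup order described above (local build first, then a fetched snapshot, with no automatic refresh) can be sketched in a few lines. This is an illustration of the logic only, not the contents of run.sh: the jar file name pattern, the directory arguments, and the `fetch` callback are all hypothetical.

```python
from pathlib import Path


def resolve_jar(version, local_dir, cache_dir, fetch):
    """Return a jar path, preferring a local build over a fetched snapshot."""
    name = f"psl-cli-{version}.jar"  # Hypothetical file name pattern.

    local_jar = Path(local_dir) / name
    if local_jar.is_file():
        return local_jar

    cached_jar = Path(cache_dir) / name
    if not cached_jar.is_file():
        # Only fetch when no jar is present; an existing jar is never refreshed.
        fetch(version, cached_jar)
    return cached_jar
```

Because an existing jar short-circuits the fetch, deleting it is the only way to pick up a newer snapshot, which matches the behavior described above.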

Inference Configuration Specification

Commit: 105bde52

Shortcuts are now provided to configure inference settings. Instead of individually specifying reasoners, term stores, term generators, etc., you can use inference classes that already have the required configuration. For example, to run Tandem Inference (TI) using the old method (which still works), you would include the options:

--infer -D inference.reasoner=SGDReasoner -D inference.termstore=SGDStreamingTermStore -D inference.termgenerator=SGDTermGenerator

Using the new method, you would just use:

--infer SGDStreamingInference

The available inference configurations can be viewed here.
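
One way to picture the shortcut is as a named preset that expands into the individual options. The sketch below is not how PSL implements inference classes; the `INFERENCE_PRESETS` table and the `expand_infer_args` helper are hypothetical, and only the three option values shown in the example above are taken from this changelog.

```python
# Options taken from the example above; other inference classes are omitted.
INFERENCE_PRESETS = {
    "SGDStreamingInference": {
        "inference.reasoner": "SGDReasoner",
        "inference.termstore": "SGDStreamingTermStore",
        "inference.termgenerator": "SGDTermGenerator",
    },
}


def expand_infer_args(inference_class=None, overrides=None):
    """Build the long-form CLI arguments equivalent to a shortcut."""
    options = dict(INFERENCE_PRESETS.get(inference_class, {}))
    # Explicitly supplied options take precedence over the preset.
    options.update(overrides or {})

    args = ["--infer"]
    for key, value in options.items():
        args += ["-D", f"{key}={value}"]
    return args
```

Calling `expand_infer_args("SGDStreamingInference")` yields the same arguments as the old-style invocation shown above.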

PSL Runtime

Commit: d86d0a8c

PSL 2.3.0 introduces a new mid-level interface to PSL: the PSL Runtime.

The motivation behind the runtime is to provide a single platform capable of running a full PSL pipeline (e.g. parsing rules, loading data, grounding, inference, and evaluation) that is more programmatically accessible than the CLI.

As a “mid-level” interface, the PSL Runtime is not intended for the common PSL use cases, but rather for people building new PSL interfaces or people who need access to PSL internals while still easily running PSL pipelines. We recommend that most users keep using the CLI or Python interface.

Pipeline Method Improvements

PSL 2.3.0 includes many improvements to some of our core pipeline functionality, namely optimization/reasoning, weight learning, and evaluation.

Optimization

  • Reworked the breaking conditions for all optimizers. Where applicable, solution feasibility has been made a higher priority when determining when to stop. Variable movement tracking is also included as a break condition. Reasoners also now have the option (reasoner.runfulliterations) to run until the maximum number of iterations is reached regardless of any other stopping criteria. [378d676c, 640ecba6, 728e1899, 130fb000]
  • Removed direct support for boolean reasoning (MaxWalkSat and MCSat). [4f4420a0]
  • Added more ways atoms can be initialized for inference (1.0 and 0.5). [b786a365]
  • SGD-based inference methods will now relax their hard constraints into soft ones. The chosen weight is the largest weight seen multiplied by a constant (inference.relax.multiplier). Whether the relaxation is quadratic or linear is controlled by the inference.relax.squared option. [e4aa914a]
  • Added weight normalization (inference.normalize) and relaxation (inference.relax) for all rules and inference methods. Weight normalization converts all weights to be in [0, 1] by dividing all weights by the largest weight. [6e213e9c]
  • Improved the logging of SGD and DCD to match the semantics and style of ADMM. [3fc33541, 438f38d2]
  • Added the ability to run an Evaluator between rounds of optimization using the reasoner.evaluate option. Evaluators already selected for the evaluation stage (e.g., the --eval CLI option) will be used. [e3c9f679]
  • SGD now uses the variable values that achieved the lowest objective as its solution. Since SGD steps are not guaranteed to decrease the MAP objective and the objective computation used to detect convergence is delayed one iteration, the best solution may not be the final state. [d2c866db]
  • SGD now uses first-order optimality conditions to measure convergence of stochastic gradient descent reasoning. [edc96214]
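
The weight normalization and hard-constraint relaxation described in the list above reduce to simple arithmetic, sketched here. This is not PSL's code: the function names, the representation of a rule as a (weight, squared) pair with `None` marking a hard constraint, and the default multiplier are assumptions for illustration. Weights are assumed to be non-negative.

```python
def normalize_weights(weights):
    """Scale weights into [0, 1] by dividing by the largest weight."""
    largest = max(weights)
    if largest <= 0.0:
        # Nothing to scale when every weight is zero.
        return list(weights)
    return [w / largest for w in weights]


def relax_hard_constraints(rules, multiplier=100.0, squared=False):
    """Replace each hard constraint (weight None) with a weighted soft rule.

    `rules` is a list of (weight, squared) pairs. The default multiplier
    is an assumption, not PSL's documented default.
    """
    soft_weights = [w for w, _ in rules if w is not None]
    largest = max(soft_weights, default=1.0)
    relaxed_weight = largest * multiplier
    return [
        (relaxed_weight, squared) if w is None else (w, is_squared)
        for w, is_squared in rules
    ]
```

For example, with weights 2, 5, and 10, normalization produces 0.2, 0.5, and 1.0. With a multiplier of 10 and a largest soft weight of 5, a relaxed hard constraint receives weight 50.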

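Keeping the best iterate rather than the final one, as described for SGD in the list above, can be sketched as follows. The sketch uses plain (non-stochastic) gradient steps on a caller-supplied objective to stay short; the function name, arguments, and defaults are hypothetical and do not reflect PSL's reasoner code. Values are clipped to [0, 1] because PSL truth values live in that range.

```python
def sgd_keep_best(objective, gradient, x0, learning_rate=0.1, iterations=50):
    """Run gradient steps and return the lowest-objective point seen."""
    x = list(x0)
    best_x, best_value = list(x), objective(x)

    for _ in range(iterations):
        grad = gradient(x)
        # Project each variable back into [0, 1] after the step.
        x = [min(1.0, max(0.0, xi - learning_rate * gi)) for xi, gi in zip(x, grad)]

        value = objective(x)
        if value < best_value:
            best_x, best_value = list(x), value

    return best_x, best_value
```

With an overly large learning rate the iterates can oscillate, in which case the returned point differs from the last state visited.
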
Weight Learning

Older methods that are now outperformed in both speed and quality of answer by more modern methods have been removed. The removed methods include all EM-based methods [ecdafb3e] and the maximum pseudo-likelihood learner [15c26eea]. The weight sampling method used in search-based learners has been improved by sampling from a hypersphere and a Dirichlet distribution [370462f3]. This allows these methods to get a better representation of the search space. For an overview of weight learning (theory and methods) in PSL, we recommend the paper A Taxonomy of Weight Learning Methods for Statistical Relational Learning.
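
For readers unfamiliar with the two sampling schemes, the standard constructions are sketched below using only the Python standard library. These are textbook methods (normalized Gaussian draws for the hypersphere surface, normalized gamma draws for the Dirichlet), not a copy of PSL's sampler, and how PSL maps such samples onto rule weights is not shown here. The function names are hypothetical.

```python
import math
import random


def sample_hypersphere(dimension, rng):
    """Sample a point uniformly from the surface of the unit hypersphere."""
    while True:
        point = [rng.gauss(0.0, 1.0) for _ in range(dimension)]
        norm = math.sqrt(sum(p * p for p in point))
        if norm > 0.0:
            return [p / norm for p in point]


def sample_dirichlet(alphas, rng):
    """Sample from a Dirichlet distribution by normalizing gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]
```

A Dirichlet sample is non-negative and sums to one, while a hypersphere sample has unit Euclidean length; both remove the overall scale of the weights from the search.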

Evaluation

PSL 2.3.0 includes some minor changes to evaluation, including the renaming of the RankingEvaluator to the AUCEvaluator [5b5f43de] and a rework of the TrainingMap to more rigorously handle all the possible states of tracked variables (i.e., the observed/unobserved status of training/truth variables). Evaluation will use the reworked training map in addition to configuration options when deciding which atoms to include in evaluation. The eval.closetruth option applies the closed-world assumption to truth atoms and includes target atoms that have no truth atom specified in evaluation, while the eval.includeobs option includes observed atoms from the target database in evaluation [39a7cdca, 2c1c8b6c].
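
One reading of how the two options affect atom selection is sketched below. This is an interpretation of the description above, not the TrainingMap implementation: the function name, the dictionary-based inputs, and the choice of 0.0 as the closed-world truth value are assumptions.

```python
def select_eval_pairs(targets, observed, truth, close_truth=False, include_obs=False):
    """Pair predicted values with truth values for evaluation.

    targets:  atom -> inferred value for unobserved target atoms
    observed: atom -> value for observed atoms in the target database
    truth:    atom -> truth value
    """
    candidates = dict(targets)
    if include_obs:
        candidates.update(observed)

    pairs = {}
    for atom, predicted in candidates.items():
        if atom in truth:
            pairs[atom] = (predicted, truth[atom])
        elif close_truth:
            # Closed-world assumption: a missing truth atom counts as false.
            pairs[atom] = (predicted, 0.0)
    return pairs
```

With both flags off, only target atoms that have an explicit truth atom are evaluated; each flag widens that set in a different direction.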

Online PSL

Inference Optimization (#283)

Commit: 8f80c384 (Jun 30, 2021)

  • Delayed the objective calculation in the SGD and DCD reasoners. This saves non-optimizing passes through the data.
  • The objective change used as a stopping criterion is now normalized by the number of terms. Future functionality may modify the number of ground terms.
  • Removed the learning rate as an instance variable of objective terms. This saves memory, as the learning rate is common across potentials and is now managed by the reasoner.
  • Added an option for setting the learning schedule for SGD inference.
  • Added an option for taking coordinate updates during SGD steps.
  • Implemented AdaGrad and Adam in the SGD reasoner. This improves the convergence of inference.
  • Moved "minimization" (which is actually taking a gradient step) out of the DCD/SGD terms and into the reasoner.
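
For reference, the textbook AdaGrad and Adam update rules look like the sketch below. These are the standard formulations, not PSL's reasoner code; the function names are hypothetical, and the hyperparameter defaults are common choices from the literature rather than PSL's settings.

```python
import math


def adagrad_step(x, grad, state, learning_rate=0.1, epsilon=1e-8):
    """One AdaGrad update; `state` holds the running sum of squared gradients."""
    for i, g in enumerate(grad):
        state[i] += g * g
        x[i] -= learning_rate * g / (math.sqrt(state[i]) + epsilon)
    return x


def adam_step(x, grad, m, v, t, learning_rate=0.1, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam update at step t (1-based) with bias-corrected moments."""
    for i, g in enumerate(grad):
        m[i] = beta1 * m[i] + (1.0 - beta1) * g
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
        m_hat = m[i] / (1.0 - beta1 ** t)
        v_hat = v[i] / (1.0 - beta2 ** t)
        x[i] -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return x
```

Both methods scale each coordinate's step by a running estimate of its gradient magnitude, so the per-variable state lives naturally with the reasoner rather than with individual terms.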

Introduce Online Messages (#308)

Commit: 586cbc75 (Jul 2, 2021)

Online messages are serializable objects for online client-server communication. The testing infrastructure for online term stores and reasoners relies on the definition of these objects.

Introduce Online Responses and Model Information Messages (#309)

Commit: 4265741b (Jul 6, 2021)

  • Action status messages are sent to the client after the online PSL server executes an online action.
  • QueryAtomResponses are sent to the client when it sends a QueryAtom action.
  • ModelInformation is sent when the client initially establishes a connection to the server.

Introduce OnlineInference Applications, OnlineClients, and OnlineServers (#310)

Commit: 9e406ea1 (Aug 7, 2021)

This version of OnlineInference only supports Exits and Stops. Online test utilities and an SGDOnlineInference application test are also introduced. Currently, the testing infrastructure only checks that SGDOnlineInference applications start, accept client connections, and shut down cleanly.

Support addAtom Actions in Online Inference Applications (#313)

Commit: a4bfb8e2 (Aug 18, 2021)

Online inference now supports addAtom actions. This required the introduction of online term stores and online grounding iterators.

Add Support for Remaining Model and Control Actions (#315)

Commit: 5a9cd7c6 (Sep 16, 2021)

Added support for the remaining model and control actions. This includes the DeleteAtom, ObserveAtom, UpdateObservation, and WriteInferredPredicates actions.

Support for Online Rule Actions (#319)

Commit: d7a6af1c (Nov 2, 2021)

This pull request contains the code changes that were necessary for supporting online rule actions. Notable changes include:

  • Hashcodes for abstract rules are no longer identity hashcodes. They are functions of the parameters defining the rules and are provided as an argument to the constructor of abstract rules. Fake rules now have a hashcode of 0. This change ensures that rules in rule actions have the same hash on the client and server.
  • Deactivating and deleting rules can throw off the term count, which is important in detecting convergence. Both the cache iterators and the grounding iterators now keep track of this count, so the reasoner has the most up-to-date count when it needs it.
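
Why a parameter-derived hash matters across processes can be shown with a small Python analogy. This is not PSL's Java code: the `RuleSketch` class and its fields are hypothetical, and unlike the change described above (where the hash is passed into the constructor), this sketch computes the hash inside the constructor for brevity.

```python
import hashlib


class RuleSketch:
    """Hypothetical rule whose hash is derived from its defining parameters."""

    def __init__(self, formula, weight, squared):
        self.formula = formula
        self.weight = weight
        self.squared = squared

        # Python salts str hashes per process, so use a digest that is
        # identical in every process (for example, client and server).
        key = f"{formula}|{weight!r}|{squared}".encode("utf-8")
        self._hash = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return (
            isinstance(other, RuleSketch)
            and (self.formula, self.weight, self.squared)
            == (other.formula, other.weight, other.squared)
        )
```

Two separately constructed rules with the same parameters hash identically, which is what an identity-based hash cannot provide once a rule has been serialized and rebuilt on the other side of a connection.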

Add OnlinePSL Grammar and OnlineActionLoader Class (#323)

Commit: ed404b80 (Nov 8, 2021)

Added the OnlinePSL grammar and the OnlineActionLoader class to parse user-provided online commands.

Online Delete Atoms (#326)

Commit: e290599c (Nov 18, 2021)

Fixed an issue where deleting the existing atom during an add always provided null to the online term store, which could lead to duplicated terms. The return value of the deleteAtom call is now used to delete atoms in online terms.

Online Action Interface (#325)

Commit: 9be3bd40 (Nov 26, 2021)

Introduced the online action interface, an interface for users to provide online actions via stdin.

Misc

New Logging/Options Infrastructure

Commit: bcf8fe45

In an effort to improve PSL’s logging infrastructure, we have updated our Log4J dependency to the latest version. PSL now uses Log4J 2 exclusively: the Log4J 1-to-2 bridge and the Log4J 1 configurations have been removed. This allows us to more easily stay up-to-date with any Log4J updates and stay ahead of any security issues.

This effort also includes a new logging utility class that acts as both a logger and logging configuration. When used statically, this class provides an interface into logging configuration including getting a new logger and setting the logging level. When used as an object, this class provides standard logging methods that pass through to a Log4J logger. Using this class, PSL developers only need to import one logging resource while having the specifics abstracted.
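
The dual static/object role described above can be pictured with the following Python analogy built on the standard `logging` module. It is not PSL's Java class: the name `LoggerSketch` and its method names are hypothetical.

```python
import logging


class LoggerSketch:
    """Hypothetical class that is both logging configuration and logger."""

    def __init__(self, name):
        self._logger = logging.getLogger(name)

    # Static use: configuration.
    @staticmethod
    def get_logger(name):
        return LoggerSketch(name)

    @staticmethod
    def set_level(level_name):
        logging.getLogger().setLevel(getattr(logging, level_name.upper()))

    # Object use: pass-through logging methods.
    def info(self, message, *args):
        self._logger.info(message, *args)

    def debug(self, message, *args):
        self._logger.debug(message, *args)
```

Callers import one class, configure logging through its static methods, and log through instances, without touching the underlying logging library directly.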

Commit: 6332b92c

All configuration options have been moved to a centralized location (the Options class). This allows uniform standards for the creation and use of configuration options.
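
The benefit of a central registry is that each option's name, default, and type handling are declared exactly once. The sketch below illustrates the pattern in Python; it is not PSL's Options class. The option names come from this changelog, but the `Option` helper, the attribute names, the defaults, and the descriptions are illustrative assumptions.

```python
class Option:
    """A single configuration option with a name, default, and description."""

    def __init__(self, name, default, description):
        self.name = name
        self.default = default
        self.description = description

    def get(self, config):
        # Cast configured values (often strings) to the default's type.
        if self.name not in config:
            return self.default
        value = config[self.name]
        if isinstance(self.default, bool) and isinstance(value, str):
            return value.strip().lower() == "true"
        return type(self.default)(value)


class Options:
    """Hypothetical central registry; defaults here are illustrative only."""

    INFERENCE_NORMALIZE = Option(
        "inference.normalize", True, "Normalize weights.")
    INFERENCE_RELAX_MULTIPLIER = Option(
        "inference.relax.multiplier", 100.0, "Relaxation weight multiplier.")
```

Code that needs a setting asks the registry entry for it, for example `Options.INFERENCE_RELAX_MULTIPLIER.get(config)`, instead of repeating the key string and default at each call site.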