We are happy to announce the release of PSL version 2.3.0! We have made great improvements to PSL in the areas of optimization and infrastructure. In this changelog, you will find a list of the major changes in 2.3.0 as well as information on migrating from 2.2.2. PSL 2.3.0 is scheduled to be the last minor release before our next major release, PSL 3.0.0, which will contain breaking changes, including dropping Java 7 support.
For those of you who learn better by example, check out the PSL examples repository. The examples repository includes tags that track specific versions of PSL, so you can make sure the examples use the same version of PSL as you do.
Both the primary PSL repository and the PSL examples repository are changing their branching scheme.
Previously, the default branch, master, tracked the latest stable PSL release, while the develop branch contained all active development.
Going forward, only one branch, main, will be used, and it will contain all active development.
Tags will continue to be used to track releases (major and minor).
- Infrastructure Improvements
- PSL Interfaces
- Pipeline Method Improvements
PSL 2.3.0 comes with various improvements to our infrastructure and software quality.
- Lint checks on our Java codebase. 35062c03
- Updated PyPI versioning to create unique identifiers for each build. 1045d4bd
- Non-release builds are pushed to the Sonatype snapshot repository (Java artifacts) and test.pypi.org.
- Java versions 8, 11, 16, and 17 are tested. 49180f65
- Python versions 3.7, 3.8, and 3.9 are tested. 49180f65
Along with improvements to our CI, we have made various improvements to make PSL’s testing more robust and accessible for new contributors:
- Moved PSL test utils to their own package. 41800c48
- Added a common test class, PSLBaseTest, that all PSL core tests can derive from. This base test includes common setup and cleanup functionality. 88d82829
- Added standard JUnit assertions to the base PSL test class. Child tests no longer need to import the assertions or configure them (for example, when comparing floating point numbers). 1d6e681f
- Incorporated the functionality from PSLTest into PSLBaseTest. 00cf883e
In our continued dedication to performant software, PSL 2.3.0 includes several systems-level improvements that affect both speed and memory usage:
- Grounding will perform additional checks when instantiating rules to avoid creating trivial ground rules. fc898bfd
- Grounding will bypass querying the database for atoms from closed predicates that it can safely infer a value for. befbf5ab
- Grounding will skip instantiating certain closed atoms that are deemed useless. ee12f8f7
- Batch allocation of ground rules is modified to optimize time spent allocating structures. b7f98282
- Numeric constructors are replaced with calls to valueOf() of the appropriate type. 0a7d6c6f
The PSL 2.3.0 release includes non-breaking changes to one PSL interface and a brand new mid-level interface.
CLI Run Script Updates
Commit (psl-examples): a8c01a64
The PSL run scripts (run.sh) provided with the PSL examples now include a variable (RUN_SCRIPT_VERSION) denoting the version of the run script. This version is independent of the PSL version and provides a mechanism for checking whether a script (which may have been copied to your own model) is out-of-date.
Additionally, the CLI run scripts will now work with snapshot builds even if PSL has not been built on the local machine. The script will first check for a locally built instance of PSL matching the specified version, and if that does not exist the script will fetch a matching snapshot build from our test servers. To update the build/jar being used, you must delete the existing jar to force a re-fetch.
Inference Configuration Specification
Shortcuts are now provided to configure inference settings. Instead of individually specifying reasoners, term stores, term generators, etc., you can use inference classes that already have the required configuration. For example, to run Tandem Inference (TI) using the old method (which still works), you would include the options:
--infer -D inference.reasoner=SGDReasoner -D inference.termstore=SGDStreamingTermStore -D inference.termgenerator=SGDTermGenerator
Using the new method, you would just use:
The available inference configurations can be viewed here.
PSL 2.3.0 introduces a new mid-level interface to PSL: the PSL Runtime.
The motivation behind the runtime is to provide a single platform capable of running a full PSL pipeline (e.g. parsing rules, loading data, grounding, inference, and evaluation) that is more programmatically accessible than the CLI.
As a “mid-level” interface, the PSL Runtime is not intended for the common PSL use cases, but rather for people building new PSL interfaces or people who need access to PSL internals while still easily running PSL pipelines. We recommend that most users keep using the CLI or Python interface.
Pipeline Method Improvements
PSL 2.3.0 includes many improvements to some of our core pipeline functionality, namely optimization/reasoning, weight learning, and evaluation.
- Reworked the breaking conditions for all optimizers. Where applicable, solution feasibility has been made a higher priority when determining when to stop. Variable movement tracking is also included as a break condition. Reasoners also now have the option (reasoner.runfulliterations) to run until the maximum number of iterations is reached regardless of any other stopping criteria. [378d676c, 640ecba6, 728e1899, 130fb000]
- Removed direct support for boolean reasoning (MaxWalkSat and MCSat). [4f4420a0]
- Added more ways atoms can be initialized for inference (1.0 and 0.5). [b786a365]
- SGD-based inference methods will now relax their hard constraints into soft ones. The chosen weight is the largest weight seen multiplied by a constant (inference.relax.multiplier). The choice of a quadratic or linear relaxation is chosen by the
- Added weight normalization (inference.normalize) and relaxation (inference.relax) for all rules and inference methods. Weight normalization converts all weights to be in [0, 1] by dividing all weights by the largest weight. [6e213e9c]
- Improved the logging of SGD and DCD to match the semantics and style of ADMM. [3fc33541, 438f38d2]
- Added the ability to run an Evaluator between rounds of optimization using the reasoner.evaluate option. Evaluators already selected for the evaluation stage (e.g., the --eval CLI option) will be used. [e3c9f679]
- SGD now uses the lowest objective variable values as its solution. Since SGD steps are not guaranteed to decrease the MAP objective and the objective computation used to detect convergence is delayed one iteration, the best solution may not be the final state. [d2c866db]
- SGD now uses first-order optimality conditions to measure convergence of stochastic gradient descent reasoning. [edc96214]
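The weight normalization and hard-constraint relaxation described in the list above can be illustrated with a small sketch. This is not PSL's implementation: the function names, the use of None to mark a hard constraint, the multiplier value, and the choice to relax before normalizing are all assumptions made for illustration.

```python
from typing import List, Optional


def relax_hard_constraints(weights: List[Optional[float]], multiplier: float) -> List[float]:
    """Replace hard constraints (None, an assumed encoding) with a soft weight.

    The soft weight is the largest weight seen multiplied by a constant.
    """
    soft = [w for w in weights if w is not None]
    largest = max(soft, default=1.0)
    relaxed_weight = largest * multiplier
    return [relaxed_weight if w is None else w for w in weights]


def normalize_weights(weights: List[float]) -> List[float]:
    """Scale weights into [0, 1] by dividing each one by the largest weight."""
    largest = max(weights, default=0.0)
    if largest <= 0.0:
        # Nothing to scale (empty list or all-zero weights).
        return list(weights)
    return [w / largest for w in weights]


rule_weights = [5.0, None, 2.0, 10.0]  # None marks a hard constraint (hypothetical).
relaxed = relax_hard_constraints(rule_weights, multiplier=100.0)
normalized = normalize_weights(relaxed)
print(relaxed)
print(normalized)
```

After normalization the relaxed constraint carries the weight 1.0 and every other rule keeps its relative size, which is the property the inference.normalize description relies on.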
Older methods that are now outperformed in both speed and quality of answer by more modern methods have been removed. The removed methods include all EM-based methods [ecdafb3e] and the maximum pseudo-likelihood learner [15c26eea]. The weight sampling method used in search-based learners has been improved by sampling from a hypersphere and a Dirichlet distribution [370462f3]. This allows these methods to get a better representation of the search space. For an overview of weight learning (theory and methods) in PSL, we recommend the paper A Taxonomy of Weight Learning Methods for Statistical Relational Learning.
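To make the two sampling strategies concrete, here is a minimal sketch of drawing a weight vector from a Dirichlet distribution and from the surface of a unit hypersphere. It uses standard constructions (normalized gamma draws and normalized Gaussian draws) and is not taken from the PSL codebase. The function names are hypothetical, and folding the hypersphere sample into the non-negative orthant is an assumption made here because rule weights are non-negative.

```python
import math
import random
from typing import List


def sample_dirichlet(alphas: List[float], rng: random.Random) -> List[float]:
    """Draw one point from a Dirichlet distribution by normalizing gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_hypersphere(dim: int, rng: random.Random) -> List[float]:
    """Draw a point from the unit hypersphere surface, kept in the non-negative orthant."""
    draws = [abs(rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(d * d for d in draws))
    return [d / norm for d in draws]


rng = random.Random(4)
dirichlet_weights = sample_dirichlet([1.0, 1.0, 1.0], rng)   # Entries sum to 1.
sphere_weights = sample_hypersphere(3, rng)                  # Euclidean norm is 1.
print(dirichlet_weights)
print(sphere_weights)
```

Both samplers cover directions in weight space rather than magnitudes, which fits a setting where only the relative size of the weights matters.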
PSL 2.3.0 includes some minor changes to evaluation, including the renaming of the RankingEvaluator to the AUCEvaluator [5b5f43de] and a rework of the TrainingMap to more rigorously handle all the possible states of tracked variables (i.e., the observed/unobserved status of training/truth variables). Evaluation will use the reworked training map in addition to configuration options when deciding which atoms to include in evaluation. The eval.closetruth option applies the closed-world assumption to truth atoms and includes target atoms that have no truth atom specified in evaluation, while the eval.includeobs option includes observed atoms from the target database in evaluation [39a7cdca, 2c1c8b6c].
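One way to read the two evaluation options is sketched below. This is an interpretation of the description above, not the TrainingMap implementation: the function, its parameters, and the dictionary-based representation of atoms are hypothetical, and the handling of observed atoms without a truth value is an assumption.

```python
from typing import Dict, Tuple


def atoms_for_evaluation(
    targets: Dict[str, float],
    observed: Dict[str, float],
    truth: Dict[str, float],
    close_truth: bool = False,
    include_obs: bool = False,
) -> Dict[str, Tuple[float, float]]:
    """Pair predicted values with truth values for the atoms that enter evaluation."""
    predictions = dict(targets)
    if include_obs:
        # Sketch of eval.includeobs: observed atoms are also evaluated.
        predictions.update(observed)

    pairs = {}
    for atom, predicted in predictions.items():
        if atom in truth:
            pairs[atom] = (predicted, truth[atom])
        elif close_truth:
            # Sketch of eval.closetruth: a missing truth atom is assumed to be 0.0.
            pairs[atom] = (predicted, 0.0)
    return pairs


targets = {"Knows(A,B)": 0.9, "Knows(A,C)": 0.2}
observed = {"Knows(B,C)": 1.0}
truth = {"Knows(A,B)": 1.0, "Knows(B,C)": 1.0}

default_pairs = atoms_for_evaluation(targets, observed, truth)
closed_pairs = atoms_for_evaluation(targets, observed, truth, close_truth=True)
with_observed = atoms_for_evaluation(targets, observed, truth, include_obs=True)
print(default_pairs)
print(closed_pairs)
print(with_observed)
```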
commit 8f80c384b31b13ef7e0e47a1e9984c3d61379c84 Date: Wed Jun 30 09:00:39 2021 -0700
Inference Optimization (#283)
- Delayed objective calculation in the SGD and DCD reasoners. This saves non-optimizing passes through the data.
- The objective change used as the stopping criterion is normalized by the number of terms. Future functionality may modify the number of ground terms.
- Removed the learning rate as an objective term instance variable. This saves memory, as the learning rate is common across potentials and is now managed by the reasoner.
- Added an option for setting the learning schedule for SGD inference.
- Added an option for taking coordinate updates during SGD steps.
- Implemented Adagrad and Adam in the SGD reasoner. This improves convergence of inference.
- Moved "minimization" (which is actually taking a gradient step) out of DCD/SGD terms and into the reasoner.
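For readers unfamiliar with Adam, the sketch below shows the textbook update applied to a toy objective, with variables clipped to [0, 1] the way PSL truth values are bounded. It is an illustration of the general technique only: the function name, default hyperparameters, and the toy gradient are assumptions and do not reflect the SGD reasoner's code or option names.

```python
import math
from typing import Callable, List


def adam_minimize(
    gradient: Callable[[List[float]], List[float]],
    start: List[float],
    learning_rate: float = 0.1,
    beta1: float = 0.9,
    beta2: float = 0.999,
    epsilon: float = 1e-8,
    steps: int = 500,
) -> List[float]:
    """Run textbook Adam steps, clipping each variable to [0, 1] after every update."""
    x = list(start)
    first_moment = [0.0] * len(x)
    second_moment = [0.0] * len(x)

    for step in range(1, steps + 1):
        grad = gradient(x)
        for i, g in enumerate(grad):
            # Exponential moving averages of the gradient and its square.
            first_moment[i] = beta1 * first_moment[i] + (1.0 - beta1) * g
            second_moment[i] = beta2 * second_moment[i] + (1.0 - beta2) * g * g
            # Bias correction for the zero-initialized averages.
            m_hat = first_moment[i] / (1.0 - beta1 ** step)
            v_hat = second_moment[i] / (1.0 - beta2 ** step)
            x[i] -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
            x[i] = min(1.0, max(0.0, x[i]))
    return x


def quadratic_gradient(x: List[float]) -> List[float]:
    """Gradient of (x0 - 0.3)^2 + (x1 - 0.8)^2, a stand-in for a real objective."""
    return [2.0 * (x[0] - 0.3), 2.0 * (x[1] - 0.8)]


solution = adam_minimize(quadratic_gradient, [0.5, 0.5])
print(solution)
```

Adagrad follows the same pattern but accumulates the sum of squared gradients instead of a decaying average and uses no first-moment term.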
commit 586cbc755f140aa0d0958b4448e521f228d2d466 Date: Fri Jul 2 09:16:25 2021 -0700
Introduce online messages. (#308) Online messages are serializable objects for online client-server communication. The testing infrastructure for online term stores and reasoners relies on the definition of these objects.
commit 4265741bd0dfa095e8ac16eff47f1ba14f648902 Date: Tue Jul 6 15:06:36 2021 -0700
Introduce online responses and model information messages. (#309) Action status messages are sent to the client after the online PSL server executes an online action. QueryAtomResponses are sent to the client when it sends a QueryAtom action. ModelInformation is sent when the client initially establishes a connection to the server.
commit 9e406ea18412195a7ecf88adc04a71c065f3a87c Date: Sat Aug 7 09:27:04 2021 -0700
Introduce OnlineInference applications, OnlineClients, and OnlineServers. (#310) This version of OnlineInference only supports Exits and Stops. Online test utilities and an SGDOnlineInference application test are also introduced. Currently, the testing infrastructure only tests that SGDOnlineInference applications start, accept client connections, and shut down cleanly.
commit a4bfb8e21bb405c0594b9d398dfb11f93e1d2668 Date: Wed Aug 18 08:49:55 2021 -0700
Support addAtom actions in online inference applications. (#313) This requires the introduction of online term stores and online grounding iterators.
commit 5a9cd7c6bced96094dbad1cb4cccac0af9d9d929 Date: Thu Sep 16 16:15:00 2021 -0700
Add support for remaining model and control actions. (#315) This includes DeleteAtom, ObserveAtom, UpdateObservation, and WriteInferredPredicates actions.
commit d7a6af1c8539ee1859357f4a6e6146bb33669a38 Date: Tue Nov 2 09:08:56 2021 -0700
Support for online rule actions. (#319) This pull request contains code changes that were necessary for supporting online rule actions. Notable changes include:
- Hashcodes for abstract rules are no longer identity hashcodes. They are functions of the parameters defining the rules and are provided as an argument to the constructor of abstract rules. Fake rules now have a hashcode of 0. This change ensures that rules in rule actions have the same hash on the client and server.
- Deactivating and deleting rules can throw off the term count that is important in detecting convergence. Both the cache iterators and grounding iterators now keep track of this count so the reasoner has the most up-to-date count when it needs it.
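The reason parameter-based hashcodes matter can be shown with a short sketch: a hash derived from the rule's defining parameters is identical in two separate processes, while an identity hash is not. The class and helper below are hypothetical and only illustrate the idea; they do not mirror PSL's rule classes or its actual hash function.

```python
import hashlib


def stable_hash(*parts: str) -> int:
    """Derive a 32-bit hash from string parts that is identical across processes."""
    joined = "\x1f".join(parts).encode("utf-8")
    digest = hashlib.sha256(joined).digest()
    return int.from_bytes(digest[:4], "big", signed=True)


class SketchRule:
    """A stand-in rule whose hash depends only on its defining parameters."""

    def __init__(self, body: str, weight: float, squared: bool):
        self.body = body
        self.weight = weight
        self.squared = squared
        # Computed once from the parameters, never from object identity.
        self._hash = stable_hash(body, repr(weight), repr(squared))

    def __hash__(self) -> int:
        return self._hash

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SketchRule):
            return NotImplemented
        return (self.body, self.weight, self.squared) == (other.body, other.weight, other.squared)


client_rule = SketchRule("Knows(A, B) -> Knows(B, A)", 1.0, True)
server_rule = SketchRule("Knows(A, B) -> Knows(B, A)", 1.0, True)
print(hash(client_rule) == hash(server_rule))
```

A cryptographic digest is used here only because Python's built-in string hash is randomized per process; any deterministic function of the parameters would serve the same purpose.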
commit ed404b8079a00175b9584bd64bdd6d6ead4d6ada Date: Mon Nov 8 14:32:04 2021 -0800
Add OnlinePSL grammar and OnlineActionLoader class. (#323) Add OnlinePSL grammar and OnlineActionLoader class to parse user-provided online commands.
commit e290599ce36ae566c26ec606dcb399830ce4369b Date: Thu Nov 18 10:40:05 2021 -0800
Online delete atoms. (#326) Fix an issue where deleting the existing atom during an add always provided null to the online term store, which could lead to duplicated terms. Now, the return value of the deleteAtom call is used to delete atoms in online terms.
commit 9be3bd40cc30e4ac2ad8b19fac49cdc07d03e9ef Date: Fri Nov 26 08:13:59 2021 -0800
Online action interface. (#325) Introduce the online action interface, an interface for users to provide online actions via stdin.
New Logging/Options Infrastructure
In an effort to improve PSL’s logging infrastructure, we have updated our Log4J dependency to the latest version. We have moved fully to Log4J 2 and removed the Log4J 1-to-2 bridge as well as the Log4J 1 configurations. This allows us to more easily stay up-to-date with any Log4J updates and stay ahead of any security issues.
This effort also includes a new logging utility class that acts as both a logger and logging configuration. When used statically, this class provides an interface into logging configuration including getting a new logger and setting the logging level. When used as an object, this class provides standard logging methods that pass through to a Log4J logger. Using this class, PSL developers only need to import one logging resource while having the specifics abstracted.
All configuration options have been moved to a centralized location (the Options class). This allows uniform standards for the creation and use of configuration options.