TreeLiker: A Tool For Relational Learning with Tree-like Patterns
TreeLiker is a collection of fast algorithms for working with complex structured data in relational form. The data can, for example, describe large organic molecules such as proteins, or groups of individuals such as social networks or predator-prey networks. The algorithms included in TreeLiker are unique in that, in principle, they are able to search given sets of relational patterns exhaustively, thus guaranteeing that if a good pattern capturing an important feature of the problem exists, it will be found. In experiments with real-life data, the algorithms were shown to be able to construct complete non-redundant sets of patterns for chemical datasets involving several thousand molecules, as well as for comparably large datasets from genomics and proteomics.
The included relational learning algorithms are tailored towards so-called tree-like features, for which some otherwise very hard (NP-hard) sub-problems become tractable. The problem of finding a complete set of informative features remains hard even for tree-like features. However, we were able to develop algorithms for tree-like features which scale well to problems of real-life size. Currently, our suite of machine learning algorithms integrates implementations of three relational learning algorithms: HiFi, RelF and Poly. These algorithms can be accessed through a simple scripting interface or through an intuitive GUI, which also allows users to evaluate the usefulness of the generated patterns when combined with several machine learning algorithms from WEKA.
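To make the notion of a tree-like feature more concrete, the following minimal Python sketch checks one structural reading of it. It assumes that a feature is a conjunction of literals and that "tree-like" means the bipartite graph linking each literal to the variables it contains is connected and acyclic. This is an illustrative assumption, not the formal definition, which is given in the papers listed below. The function name `is_tree_like` and the predicate names `atom` and `bond` are hypothetical and are not part of TreeLiker.

```python
from collections import defaultdict


def is_tree_like(literals):
    """Check whether a conjunction of literals is tree-like (assumed definition).

    `literals` is a list of (predicate, variables) pairs, for example
    ("bond", ("A", "B")). The check builds the bipartite incidence graph
    between literals and variables and tests whether it is a tree.
    """
    adjacency = defaultdict(set)
    edge_count = 0
    for index, (_predicate, variables) in enumerate(literals):
        literal_node = ("lit", index)
        adjacency[literal_node]  # register the node even if it has no variables
        for variable in set(variables):
            variable_node = ("var", variable)
            adjacency[literal_node].add(variable_node)
            adjacency[variable_node].add(literal_node)
            edge_count += 1

    if not adjacency:
        return False

    # Depth-first search to test connectivity.
    start = next(iter(adjacency))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)

    connected = len(seen) == len(adjacency)
    # A connected graph is a tree exactly when it has (nodes - 1) edges.
    return connected and edge_count == len(adjacency) - 1


# A chain atom(A), bond(A, B), atom(B) contains no cycle.
chain = [("atom", ("A",)), ("bond", ("A", "B")), ("atom", ("B",))]
# A triangle of bonds closes a cycle through the variables A, B and C.
triangle = [("bond", ("A", "B")), ("bond", ("B", "C")), ("bond", ("C", "A"))]

print(is_tree_like(chain))     # True
print(is_tree_like(triangle))  # False
```

Under this reading, cyclic patterns such as the triangle fall outside the tree-like class, which is the kind of restriction that makes the otherwise hard sub-problems tractable.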
The three algorithms were described in the following papers:
- RelF: Ondrej Kuzelka and Filip Zelezny. Block-Wise Construction of Tree-like Relational Features with Monotone Reducibility and Redundancy. Machine Learning, 83, 2011 - PDF
- Poly: Ondrej Kuzelka, Andrea Szaboova, Matej Holec and Filip Zelezny. Gaussian Logic for Predictive Classification. ECML/PKDD 2011: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - PDF (this paper described a restricted version of Poly)
- HiFi: Ondrej Kuzelka and Filip Zelezny. HiFi: Tractable Propositionalization through Hierarchical Feature Construction. Late Breaking Papers, the 18th International Conference on Inductive Logic Programming, 2008 - PDF
The manual for TreeLiker can be downloaded from here. It starts with an introduction to the most important concepts: what learning examples look like, what relational features are, how a bias for relational features can be specified, and so on. It then describes how TreeLiker can be used from the command line through a simple scripting environment. The command line interface provides access to all features of TreeLiker. TreeLiker-GUI provides a simple interface for quick experimentation with TreeLiker. The manual for TreeLiker-GUI can be downloaded from here.
Several datasets can be downloaded from here.
TreeLiker is licensed under the GNU GPL.