News

Dagstuhl Seminar on Inductive Programming

The 7th AAIP Workshop on Approaches and Applications of Inductive Programming has been accepted as Dagstuhl Seminar 17382 and will take place from September 17 to 20, 2017. The workshop is jointly organized by Ute Schmid (University of Bamberg), Stephen Muggleton (Imperial College London), and Rishabh Singh (MIT and Microsoft Research).

For more information, see the Dagstuhl Seminar Page.

15.03.2017
RuleML Blog Report about Dagstuhl Seminar AAIP'16

AAIP'16 in the RuleML Blog
http://via.aayo.ws/YmI4J

24.03.2016
CACM on Inductive Programming

The review article “Inductive programming meets the real world” by Gulwani, Hernández-Orallo, Kitzelmann, Muggleton, Schmid, and Zorn has been published in the Communications of the ACM, Vol. 58 No. 11, Pages 90-99, DOI: 10.1145/2736282.
see fulltext

1.11.2015
Wikipedia Page on Inductive Programming

José Hernández-Orallo and Ute Schmid created Wikipedia articles for Inductive Programming and Inductive Functional Programming.

15.01.2015
Dagstuhl Seminar "Approaches and Applications of Inductive Programming"

José Hernández-Orallo (Polytechnic University of Valencia, ES), Stephen H. Muggleton (Imperial College London, GB), Ute Schmid (Universität Bamberg, DE) and Benjamin Zorn (Microsoft Research - Redmond, US) organize Dagstuhl Seminar 15442 "Approaches and Applications of Inductive Programming" scheduled for October 25 to 30, 2015.

The seminar is a continuation of the AAIP workshop series.

Please visit the AAIP 15 Homepage.

07.10.2014
Report of Dagstuhl Seminar

We're pleased to inform you that the report of Dagstuhl Seminar 13502 has now been published as part of the periodical Dagstuhl Reports.

The report is available online at the DROPS Server.

31.03.2014
Dagstuhl Seminar "Approaches and Applications of Inductive Programming"

Ute Schmid (University of Bamberg), Emanuel Kitzelmann (University of Duisburg-Essen), Sumit Gulwani (Microsoft Research) and Marcus Hutter (Australian National University) organize Dagstuhl Seminar 13502 "Approaches and Applications of Inductive Programming" scheduled for Monday, December 09 to December 11, 2013. The seminar is a continuation of the AAIP workshop series.

Please visit the AAIP 13 Homepage.

15.12.2012
4th Workshop AAIP 2011

AAIP 2011 Homepage

Ute Schmid and Emanuel Kitzelmann organize the 4th Workshop on Approaches and Applications of Inductive Programming. It will take place on July 19, 2011, in Odense, Denmark. Co-located events are the 13th International ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (PPDP 2011) and the 21st International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2011).

Details can be found on the AAIP 2011 Homepage.

12.01.2011

Systems

ADATE

(Automatic Design of Algorithms Through Evolution)

Language: ADATE ML

Year: 1994

Category: Genetic Programming

Short Description

Automatic Design of Algorithms Through Evolution (ADATE) is an automatic programming system that synthesizes function definitions in a subset of the programming language Standard ML. ADATE is primarily intended as a system that automatically improves a part of an existing program, but is also useful for generating programs from scratch provided that they are fairly small.

The evolution of programs is guided by an evaluation function that tests a given program and says how good it is. Note that the function generated by ADATE might be only a small part of this given program. Thus, ADATE is especially suitable for so-called reinforcement learning, but also useful in more traditional machine learning where input-output pairs are employed to define the evaluation function.

For example, one possible application of ADATE is to automatically improve image processing algorithms. One may then start with a possibly advanced and complex algorithm, select a part of it to be improved and then define an evaluation function that measures the performance of the new versions of the original algorithm that result from plugging in ADATE generated functions instead of the selected part. ADATE has been used in this way to improve one of the best known image segmentation algorithms.

Automatic algorithm improvement as outlined above appears to be practically useful in applications where programs written by human beings contain subroutines or functions that are designed through experimental feedback rather than being optimally defined using purely theoretical considerations.

Depending on the final size of the function to be improved and its algorithmic complexity, ADATE may need to test millions or even billions of candidates. However, we have found experimentally that the number of evaluations needed is typically a low-degree polynomial in program size, provided that the training inputs and the evaluation function are well defined. The overall run time of ADATE approximately equals the number of evaluations multiplied by the time required to run a program on all the training inputs.

Therefore, ADATE is most easily used in applications where the overall program has a short run time and where the part of it selected for improvement is reasonably small, say smaller than 100 lines of Standard ML code.

ADATE can automatically generate non-trivial and novel algorithms that contain one or more help functions that are created by ADATE when and where needed. For example, ADATE can easily synthesize a functional program that generates a list of lists containing all permutations of a given input list even if it is not given any help functions to start with except the necessary data type constructors. In this case, the best synthesized program contains two invented auxiliary functions that perform combinations of list rotations and concatenations in non-trivial ways.

ADATE uses large-scale combinatorial search that employs sophisticated program transformations. Examples of such transformations include synthesising expressions that do not make a program worse and adding extra parameters to local help functions in order to make them more general.

The system currently runs on Linux clusters with MPI and can effectively utilize a few hundred CPU cores; a rewrite for practically unlimited scalability across growing numbers of cores and CPUs is planned.
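
The evaluation-driven search loop described above can be sketched in miniature. The following is a hypothetical Python toy, nowhere near ADATE's actual Standard ML program transformations: candidate "programs" are just coefficient pairs of a linear function, and only mutations that do not make the evaluation worse are kept.

```python
import random

# A toy sketch of evaluation-guided evolution in the spirit of ADATE
# (hypothetical setting: "programs" are coefficient pairs [a, b] of a
# linear function rather than Standard ML terms).

TRAIN = [(x, 3 * x + 1) for x in range(-5, 6)]  # target behaviour: f(x) = 3x + 1

def run(program, x):
    a, b = program
    return a * x + b

def evaluate(program):
    # Lower is better: summed squared error over all training inputs.
    return sum((run(program, x) - y) ** 2 for x, y in TRAIN)

def mutate(program, rng):
    child = list(program)
    child[rng.randrange(len(child))] += rng.choice([-1, 1])
    return child

def evolve(generations=500, seed=0):
    rng = random.Random(seed)
    best = [0, 0]
    for _ in range(generations):
        child = mutate(best, rng)
        # Keep transformations that do not make the program worse.
        if evaluate(child) <= evaluate(best):
            best = child
    return best

print(evolve())  # [3, 1] on this toy problem
```

The per-candidate cost is exactly one run over all training inputs, which is why, as noted above, ADATE is easiest to use when the overall program has a short run time.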

(Roland Olsson)

ALEPH

(A Learning Engine for Proposing Hypotheses)

Language: PROLOG

Year: 2007

Category: Inductive Logic Programming

Short Description

Aleph is intended to be a prototype for exploring ideas. Earlier incarnations (under the name P-Progol) originated in 1993 as part of a fun project undertaken by Ashwin Srinivasan and Rui Camacho at Oxford University. The main purpose was to understand ideas of inverse entailment which eventually appeared in Stephen Muggleton's 1995 paper: Inverse Entailment and Progol, New Gen. Comput., 13:245-286, available at ftp://ftp.cs.york.ac.uk/pub/ML_GROUP/Papers/InvEnt.ps.gz. Since then, the implementation has evolved to emulate some of the functionality of several other ILP systems. Some of these of relevance to Aleph are: CProgol, FOIL, FORS, Indlog, MIDOS, SRT, Tilde, and WARMR.

During routine use, Aleph follows a very simple procedure that can be described in 4 steps:

  1. Select example. Select an example to be generalised. If none exists, stop; otherwise proceed to the next step.
  2. Build most-specific-clause. Construct the most specific clause that entails the example selected, and is within language restrictions provided. This is usually a definite clause with many literals, and is called the "bottom clause." This step is sometimes called the "saturation" step. Details of constructing the bottom clause can be found in Stephen Muggleton's 1995 paper: Inverse Entailment and Progol, New Gen. Comput., 13:245-286, available at ftp://ftp.cs.york.ac.uk/pub/ML_GROUP/Papers/InvEnt.ps.gz.
  3. Search. Find a clause more general than the bottom clause. This is done by searching for some subset of the literals in the bottom clause that has the "best" score. Two points should be noted. First, confining the search to subsets of the bottom clause does not produce all the clauses more general than it, but is good enough for this thumbnail sketch. Second, the exact nature of the score of a clause is not really important here. This step is sometimes called the "reduction" step.
  4. Remove redundant. The clause with the best score is added to the current theory, and all examples made redundant are removed. This step is sometimes called the "cover removal" step. Note here that the best clause may make clauses other than the examples redundant. Again, this is ignored here. Return to Step 1.
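
The four steps above can be sketched in a propositional Python toy. This is a hypothetical, drastically simplified setting; Aleph itself works with first-order clauses and Prolog background knowledge, and entailment here is reduced to a subset test.

```python
from itertools import combinations

# A propositional toy in the spirit of Aleph's covering loop: examples are
# attribute sets, a "clause" is a set of attributes, and a clause covers an
# example if its attributes are a subset of the example's.

POS = [{"has_wings", "lays_eggs", "flies"},
       {"has_wings", "lays_eggs", "small"}]
NEG = [{"lays_eggs", "swims"}]

def saturate(example):
    # Step 2: the bottom clause is the most specific description,
    # i.e. the example's full attribute set.
    return frozenset(example)

def score(clause):
    # Step 3's scoring: positives covered minus negatives covered,
    # with a small penalty favouring shorter clauses.
    covers = lambda examples: sum(clause <= e for e in examples)
    return covers(POS) - covers(NEG) - 0.01 * len(clause)

def reduce_clause(bottom):
    # Step 3: search the subsets of the bottom clause for the best score.
    subsets = [frozenset(c) for r in range(1, len(bottom) + 1)
               for c in combinations(bottom, r)]
    return max(subsets, key=score)

def induce():
    # Steps 1 and 4: pick an uncovered positive, generalise it, and
    # remove the positives the resulting clause makes redundant.
    remaining, theory = list(POS), []
    while remaining:
        clause = reduce_clause(saturate(remaining[0]))
        theory.append(clause)
        remaining = [e for e in remaining if not clause <= e]
    return theory

print(induce())  # [frozenset({'has_wings'})]
```

The single surviving "clause" covers both positives and no negative, so the loop ends after one pass, mirroring the cover-removal step.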

(from the Aleph Manual)

ATRE

(Apprendimento di Teorie Ricorsive da Esempi: Learning of Recursive Theories from Examples)

Language:

Year: 1998

Category: Inductive Logic Programming

Short Description

coming soon...

De-Typechecker

Language: Haskell

Year: 2005

Category: Typechecking and theorem proving

Short Description

Unlike Djinn, which is a separate, standalone system, de-typechecker uses Haskell's own type-checker to create terms from types; the terms can then be evaluated by the same Haskell system.

(Oleg Kiselyov)

DIALOGS

(Dialogue-based Inductive and Abductive LOGic program Synthesiser)

Language: PROLOG

Year: 1997

Category: Inductive Logic Programming

Short Description

The DIALOGS (Dialogue-based Inductive/Abductive LOGic program Synthesiser) technique resulted from an effort to build a fully interactive version of SYNAPSE and to extend its power while simplifying its machinery.

The main objective was to relieve the specifier of all burden by having the technique ask for exactly and only the information it needs. As a result, no evidence needs to be prepared in advance, since the technique invents its own evidence and queries an oracle (usually a human specifier) about it.

This is suitable for all levels of expertise of human users, as the queries are formulated in the specifier's (initially unknown) conceptual language, in a program-independent way, and such that the specifier must know the answers if s/he really feels the need for the program. For instance, the query "When does sort([X,Y],R) hold?" can be answered by

(R=[X,Y] ∧ X ≤ Y) ∨ (R=[Y,X] ∧ X > Y)

thereby expanding the perceived subset of the conceptual language of the specifier, and giving the same information as in a property provided to SYNAPSE. The evidence language implicitly amounts to (non-recursive) normal logic programs. Type declarations are provided as language bias. Mode and determinism information are not required, because the focus is on synthesising the logic component of logic programs. The technique is schema-guided, and currently has two schemas (divide-and-conquer and accumulate) where multiple base clauses and multiple recursive clauses are possible. The hypothesis language is normal logic programs, where negation is restricted to the case discriminants and appears there by extraction from the query answers (i.e., it can only be applied to primitive predicates and could thus be avoided by using the complementary primitives in the query answers, as we did above by writing X > Y instead of ¬ X ≤ Y).

(Pierre Flener)

Djinn

Language: Haskell

Year: 2005

Category: Typechecking and theorem proving

Short Description

Djinn takes a (Haskell) type and gives back a function of that type if one exists. The system uses a decision procedure for intuitionistic propositional calculus due to Roy Dyckhoff. It's a variation of Gentzen's LJ system. This means that (in theory) Djinn will always find a function if one exists, and if one doesn't exist Djinn will terminate telling you so.
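
The idea of finding a term from its type can be sketched as follows. This is a hypothetical Python toy for the implicational fragment only; unlike Djinn's decision procedure (Dyckhoff's calculus), this depth-bounded search is incomplete and the term syntax is invented for the example.

```python
# A toy sketch of type inhabitation in the spirit of Djinn.
# A type is a base name like "a" or a tuple ("->", argument, result);
# ctx is a tuple of (variable name, type) pairs in scope.

def inhabit(goal, ctx=(), depth=3):
    """Return a lambda term of the given type, or None if none is found."""
    if depth == 0:
        return None
    if isinstance(goal, tuple):
        # Goal a -> b: build a lambda abstraction \x. body.
        name = f"x{len(ctx)}"
        body = inhabit(goal[2], ctx + ((name, goal[1]),), depth)
        return None if body is None else f"\\{name} -> {body}"
    for name, t in ctx:
        if t == goal:                           # use a hypothesis directly
            return name
        if isinstance(t, tuple) and t[2] == goal:
            arg = inhabit(t[1], ctx, depth - 1)  # apply f : a -> goal
            if arg is not None:
                return f"({name} {arg})"
    return None

# The K combinator recovered from its type a -> b -> a:
print(inhabit(("->", "a", ("->", "b", "a"))))  # \x0 -> \x1 -> x0
```

For an uninhabited type such as the bare atom a, the search correctly returns None, echoing Djinn's behaviour of reporting when no function exists.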

(Lennart Augustsson)

FLIP

Language: C++

Year: 1999

Category: Induction of Functional Logic Programs

Short Description

FLIP implements a framework for the Induction of Functional Logic Programs (IFLP) from facts. This can be seen as an extension of the now consolidated field of Inductive Logic Programming (ILP). Inspired by the inverse resolution operator of ILP, the system is based on the reversal of narrowing, the usual operational mechanism of Functional Logic Programming. The main advantages of the FLIP system over the most widely used ILP systems are a natural handling of functions, without the use of mode or determinism declarations, and its power for inducing short recursive programs. Its applications are mainly program synthesis, program debugging, and data mining of small, highly structured documents.

The FLIP system accepts both positive and negative examples. Examples consist of sets of equations whose right-hand sides are normalised with respect to the program to be induced.

Background knowledge can also be used as a conditional functional logic program. Note that any logic program can be expressed as a conditional functional logic program. Therefore, virtually any logic program can be used as background knowledge as well as functional definitions. FLIP must be given the function symbols from the background knowledge that should be used in the definition of the function to be learned.

The FLIP system can induce 'correct' programs from very few examples. There is no need for good examples (in the sense of Ling), nor for the Basic Representative Sets (BRS) that systems such as FOIL, GOLEM, and Progol, amongst others, require.

(Cèsar Ferri)

FOIL/FFOIL

Language: C

Year: 1996

Category: Inductive Logic Programming

Short Description

coming soon...

GOLEM

Language: C

Year: 1992

Category: Inductive Logic Programming

Short Description

Golem (Muggleton and Feng, 1990) is an ILP system, written in C, which iteratively constructs a set of definite clauses from groups of atomic examples based on Plotkin's Relative Least General Generalisation (RLGG). Plotkin's approach provides a method of constructing a unique clause which covers a set of examples relative to given background knowledge. However, such a clause can in the worst case contain infinitely many literals, or at best grow exponentially with the number of examples involved. By bounding the depth of variables and imposing a determinacy constraint, the Golem system was guaranteed to build clauses whose length was bounded by a polynomial function of various features of the background knowledge.

Golem is given positive and negative examples together with extensional background knowledge. It starts by randomly choosing pairs of positive examples. For each pair, the ij-determinate RLGG is constructed and then maximally generalised by dropping body atoms while maintaining consistency with respect to the negative examples. All reduced clauses are evaluated with respect to the degree to which they compress the information in the examples. The pair of examples leading to the most compressive clause is then extended by randomly adding examples to produce triples and later n-tuples, for which further RLGG clauses are constructed and reduced in a similar fashion. The single clause construction terminates once further extensions of the RLGG no longer lead to improved compression. Once a single clause has been asserted, all covered examples are removed and the clause construction process continues on all remaining examples. The process terminates when no remaining pair of examples has a consistent RLGG.
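
The core operation, Plotkin's least general generalisation of two atoms, can be sketched as follows. This is a hypothetical Python toy for ground terms only; the relative lgg that Golem actually uses additionally folds in the background knowledge, and the determinacy bound discussed above is ignored.

```python
# A toy sketch of Plotkin's least general generalisation (lgg) on atoms.
# Terms are constants (strings) or tuples (functor, arg1, arg2, ...).

def lgg(t1, t2, table=None):
    # The table maps each distinct pair of differing subterms to one
    # shared fresh variable, as lgg requires.
    table = {} if table is None else table
    if t1 == t2:
        return t1
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        # Same functor and arity: generalise argument-wise.
        return (t1[0],) + tuple(lgg(a, b, table)
                                for a, b in zip(t1[1:], t2[1:]))
    # Differing terms: one shared fresh variable per distinct pair.
    return table.setdefault((t1, t2), f"V{len(table)}")

# lgg of daughter(mary, ann) and daughter(eve, tom):
print(lgg(("daughter", "mary", "ann"), ("daughter", "eve", "tom")))
```

Because the variable table is shared across the whole generalisation, repeated pairs of differing subterms map to the same variable, which is what makes the result least general rather than merely general.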

Golem was applied to a variety of relational learning problems including the learning of simple Prolog programs as well as real-world applications involving protein-fold prediction, drug-activity prediction, learning of computer-aided design rules and various chess problems.

The use of inverse tabling methods made the efficiency of Golem comparable to that of the contemporary ILP system FOIL. FOIL greedily searches the space guided by an information measure similar to that used in ID3. This measure supports the addition of a literal in the body of a clause on the basis of its ability to discriminate between positive and negative examples. This gains efficiency at the expense of completeness. For instance, the literal

partition(Head,Tail,List1,List2)

in the recursive quick-sort clause does not produce any discrimination between positive and negative examples. As a result, quick-sort cannot be learned by FOIL. The problem recurs throughout a large class of predicates in which new variables in the body are functionally derived from terms in the head of the clause. Such predicates include arithmetic multiplication and list reversal, and the problem arises in various real-world domains.

Golem's key weaknesses with respect to the later-developed Progol system were associated with the required use of ground unit clause background knowledge and its limitation to the construction of determinate clauses. The determinacy constraint was particularly awkward for applications involving learning the properties of chemical compounds described in the form of atom and bond relationships.

(S.H. Muggleton)

IGOR I

Language: LISP

Year: 2006

Category: Analytical Inductive Functional Programming

Short Description

IGOR I is a modern extension of Summers' seminal THESYS system. The underlying program template describes the set of all functional programs with the following restrictions: built-in functions can only be first-order, and no nested or mutual recursion is allowed. IGOR I adopts the two-step approach of THESYS. Synthesis is still restricted to structural problems, where only the structure of the arguments matters, but not their contents, such as in list reversing. Nevertheless, the scope of synthesisable programs is considerably larger. For instance, tree-recursive functions and functions with hidden parameters can be induced. Most notably, programs consisting of a calling function and an arbitrary set of further recursive functions can be induced. The first step of synthesis (trace construction) is therefore expanded such that traces can contain nestings of conditions. The second step is expanded such that the synthesis of a function can rely on the invention and synthesis of other functions (that is, IGOR I uses a technique of function invention in correspondence to the concept of predicate invention).

(Ute Schmid)

IGOR II

Language: MAUDE

Year: 2008

Category: Analytical Inductive Functional Programming

Short Description

IGOR II relies on constructor-term rewriting techniques. Thereby characteristics of modern functional programming languages are covered: Functions are defined with respect to algebraic datatypes and control flow is guided by pattern matching. Specifications are presented as sets of example equations. Several dependent functions, e.g., the mutual recursive solution for even and odd, can be specified and learned simultaneously. The example equations can contain variables, which is especially convenient for specifying more complex problems where large sets of ground examples would be necessary to cover all possible cases of the desired behaviour of the program to be learned.

IGOR II performs synthesis in one step from a given set of example equations together with the algebraic datatype. Background knowledge can be provided in the form of additional example equations. Due to the generality of user-defined datatypes, list problems as well as problems on natural numbers, trees, or any other algebraic type can be solved. Since the Boolean values can also be defined as an algebraic type, functions yielding trees, lists, numbers, etc., as well as predicates, e.g., equality and order relations, can be induced.

The patterns are assured to be non-unifying such that the equations constituting the induced program can be regarded as a set of equations, where order does not matter. As a consequence, e.g., member cannot be defined based on the patterns (X,[]), (X,[X|Xs]) and (X,[Y|Xs]) since these patterns unify and hence cannot be considered in an arbitrary order. Thus, for solving problems where pattern variables need to be compared regarding equality or any other predicate, patterns alone do not suffice and if-then-else conditionals are needed. Currently, IGOR II can introduce such conditionals if needed when the condition is an equality or order test on the pattern variables such that, e.g., member or inserting a number at the right place into an ordered list can be induced. When providing background knowledge, a broader class of semantic problems becomes feasible. For example, insertion sort can be synthesised providing insert as background knowledge; quick sort can be synthesised providing append and partition.

Induction in IGOR II relies on calculating least general generalisations (lggs) over the example equations and refining these lggs using the following three techniques: splitting rules by pattern refinement, introducing function calls, and function invention.
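
The first of these techniques, splitting rules by pattern refinement, can be sketched as follows. This is a hypothetical Python toy over lists encoded as constructor terms, far simpler than IGOR II's constructor-term rewriting in MAUDE; the example equations, specifying last on lists, are invented for the illustration.

```python
# A toy sketch of splitting rules by pattern refinement.
# Lists are "nil" or ("cons", head, tail); each example equation is a
# (lhs, rhs) pair, here specifying the function last.

EXAMPLES = [
    (("last", ("cons", "a", "nil")), "a"),
    (("last", ("cons", "b", "nil")), "b"),
    (("last", ("cons", "a", ("cons", "b", "nil"))), "b"),
]

def outer(term):
    # The outermost constructor of a term.
    return term if isinstance(term, str) else term[0]

def split_by_pattern(examples, path):
    # Partition the example equations by the constructor found at `path`
    # (a list of argument indices) inside the lhs argument: each group
    # can then be covered by its own, more refined rule.
    groups = {}
    for lhs, rhs in examples:
        sub = lhs[1]
        for i in path:
            sub = sub[i]
        groups.setdefault(outer(sub), []).append((lhs, rhs))
    return groups

# One rule matching last(cons(X, Xs)) cannot cover all three examples, so
# the pattern is refined on the tail position: nil yields the base case
# last([X]) = X, cons yields the recursive case last([X|Xs]) = last(Xs).
print(sorted(split_by_pattern(EXAMPLES, [2])))  # ['cons', 'nil']
```

In IGOR II proper, the choice of where to refine falls out of the least general generalisation of the example equations; here the tail position is supplied by hand.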

(Ute Schmid)

Inverse typechecking and theorem proving in intuitionistic and classical logics

Language: Kanren Project

Year: 2006

Category: Typechecking and theorem proving

Short Description

This program enumerates terms given a particular type. It uses a search strategy that appears better than breadth-first search and is certainly better than depth-first search: it seems complete (like BFS), that is, if a term with a given type exists, it will be found, yet it takes far fewer resources than breadth-first search. The advantage of using a logic-programming system is that we can give the system the overall shape of the expected terms (a term with `blanks'), and the system will fill in the blanks. We can also restrict the set of generated terms to normal terms.

(Oleg Kiselyov)

Magic Haskeller

  • Susumu Katayama's Magic Haskeller homepage

Language: Haskell

Year: 2006

Category: Exhaustive Search-based Inductive Programming

Short Description

coming soon...

PROGOL

Language: PROLOG

Year: 1999

Category: Inductive Logic Programming

Short Description

coming soon...

SYNAPSE

(SYNthesis of Algorithms from PropertieS and Examples)

Language: PROLOG

Year: 1992

Category: Inductive Logic Programming

Short Description

The SYNAPSE (SYNthesis of Algorithms from PropertieS and Examples) technique takes as evidence a set of (non-recursive) Horn clauses describing a single intended relation. Ground unit clauses are called (positive) examples and data-drive the synthesis; all other clauses are called properties and are used to find instances of the place-holders of the schema. For instance, the clauses

sort([X,Y],[X,Y]) ← X ≤ Y
and
sort([X,Y],[Y,X]) ← X > Y

are properties. No other bias is given, though types are inferred from the examples. Mode and determinism information are not required, because the focus is on synthesising the logic component of logic programs. Synthesis is passive, although there is an expert mode where the technique asks for a preference among the possible instances of some place-holders, rather than non-deterministically choosing each from a repository. These problem-independent repositories form the (partitioned) background knowledge. The technique is based on a divide-and-conquer schema where multiple base clauses and multiple recursive clauses are possible. The hypothesis language is normal logic programs, where negation is restricted to the case discriminants and appears there by extraction from the properties (i.e., it can only be applied to primitive predicates and could thus be avoided by using the complementary primitives in the properties).

(Pierre Flener)

 

This file was last modified on Thursday July 08, 2010