CN2 Rule Induction
==================

.. figure:: icons/cn2ruleinduction.png

CN2 Rule Induction

Signals
-------

**Inputs**

-  **Data**

   Data set.

-  **Preprocessor**

   Preprocessed data.

**Outputs**

-  **Learner**

   The CN2 learning algorithm with settings as specified in the dialog.

-  **CN2 Rule Classifier**

   Trained classifier (a subtype of Classifier). **CN2 Rule Classifier**
   is only trained and forwarded if learning data is connected.

Description
-----------

The CN2 algorithm is a classification technique designed for the
efficient induction of simple, comprehensible rules of form "if *cond*
then predict *class*", even in domains where noise may be present.

.. figure:: images/CN2RuleInduction-stamped.png

1. Name under which the learner appears in other widgets. The default
   name is *CN2 Rule Induction*.

2. *Rule ordering*:

   - **Ordered**; induce ordered rules (decision list). Rule
     conditions are found and the majority class is assigned in the
     rule head.
   - **Unordered**; induce unordered rules (rule set). Learn rules for
     each class individually, in regard to the original learning
     data.

3. *Covering algorithm*:

   - **Exclusive**; after covering a learning instance, remove it from
     further consideration.

   - **Weighted**; after covering a learning instance, decrease its
     weight (multiplication by *gamma*) and in-turn decrease its
     impact on further iterations of the algorithm.


4. *Rule search*:

   - **Evaluation measure**; select a heuristic to evaluate found
     hypotheses:

     a. `Entropy <https://en.wikipedia.org/wiki/Entropy_(information_theory)>`_ (measure of unpredictability of content)
     b. Laplace Accuracy
     c. Weighted Relative Accuracy

   - **Beam width**; remember the best rule found thus far and monitor a
     fixed number of alternatives (the beam).

5. *Rule filtering*:

   - **Minimum rule coverage**; found rules must cover at least the
     minimum required number of covered examples. Unordered rules must
     cover this many target class examples.

   - **Maximum rule length**; found rules may combine at most the
     maximum allowed number of selectors (conditions).

   - **Default alpha**; significance testing to prune out most
     specialised (less frequently applicable) rules in regard to the
     initial distribution of classes.

   - **Parent alpha**; significance testing to prune out most
     specialised (less frequently applicable) rules in regard to the
     parent class distribution.

6. Tick 'Apply Automatically' to auto-communicate changes to other
   widgets and to immediately train the classifier if learning data is
   connected. Alternatively, press ‘Apply‘ after configuration.

Examples
--------

Having trained the model, induced rules can be quickly reviewed and
interpreted. Showcased in the example below is the
:doc:`CN2 Rule Viewer <../visualize/cn2ruleviewer>` widget .

.. figure:: images/CN2-Induction-Example1.png

The second schema tests the accuracy of the algorithm, compares its
performance to the :doc:`Classification Tree <../classify/classificationtree>` 
on a specific data set, and presents a standard use of the widget.

.. figure:: images/CN2-Induction-Example2.png

References
----------

1. "Separate-and-Conquer Rule Learning", Johannes Fürnkranz, Artificial
   Intelligence Review 13, 3-54, 1999

2. "The CN2 Induction Algorithm", Peter Clark and Tim Niblett,
   Machine Learning Journal, 3 (4), pp261-283, (1989)

3. "Rule Induction with CN2: Some Recent Improvements", Peter Clark and
   Robin Boswell, Machine Learning - Proceedings of the 5th
   European Conference (EWSL-91), pp151-163, 1991

4. "Subgroup Discovery with CN2-SD", Nada Lavrač et al., Journal of
   Machine Learning Research 5 (2004), 153-188, 2004