Alma Lilia Garcia Almanza
PhD Studies (October 2003-2007) (Thesis 1MB)
Alma submitted her thesis in October 2007, and passed her viva on 18 January 2008. She was examined by Professor Qiang Shen (External Examiner, University of Aberystwyth) and Dr Paul Scott (Internal Examiner).
Research and Contributions
Alma also invented the Repository Method. Given a set of decision trees, the Repository Method returns a repository of rules. These rules have higher precision than the input decision trees. Furthermore, the rules combine the decision trees in an intelligent way, based on the precision and novelty of the rules encoded in the trees. This is particularly important for handling extremely imbalanced data sets, where positive cases are rare, which makes the method invaluable in chance discovery. As far as we are aware, this is the first classification method designed for handling extremely imbalanced data sets.
The Repository Method is governed by a threshold. The threshold gives the user a handle over risk management: by varying the threshold, one can generate different rule repositories that cover different parts of the frontier in ROC (Receiver Operating Characteristic) space. This makes the Repository Method a practical tool in financial forecasting.
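The idea of collecting rules by precision and novelty under a threshold can be illustrated with a minimal Python sketch. It is not the thesis implementation: the rule representation, the helper names (rule_fires, precision, build_repository), the toy data, and in particular the definition of novelty used here (a rule is novel if it covers at least one positive case that no rule already in the repository covers) are all assumptions made for illustration. Each rule is taken to be one root-to-leaf path of a decision tree, written as a conjunction of conditions.

```python
from typing import Dict, List, Sequence, Set, Tuple

Case = Dict[str, float]
Condition = Tuple[str, str, float]   # (feature, operator, value); operator is ">" or "<="
Rule = Tuple[Condition, ...]         # a conjunction of conditions (one tree path)


def rule_fires(rule: Rule, case: Case) -> bool:
    """Return True if every condition of the rule holds for the case."""
    for feature, op, value in rule:
        if op == ">" and not case[feature] > value:
            return False
        if op == "<=" and not case[feature] <= value:
            return False
    return True


def precision(rule: Rule, cases: Sequence[Case], labels: Sequence[int]) -> float:
    """Fraction of the cases matched by the rule that are truly positive."""
    fired = [label for case, label in zip(cases, labels) if rule_fires(rule, case)]
    return sum(fired) / len(fired) if fired else 0.0


def build_repository(candidates: Sequence[Rule], cases: Sequence[Case],
                     labels: Sequence[int], threshold: float) -> List[Rule]:
    """Keep rules whose precision reaches the threshold and that are novel.

    Novelty here is an assumption of this sketch: a rule is novel if it
    covers at least one positive case not covered by the repository so far.
    """
    repository: List[Rule] = []
    covered: Set[int] = set()
    for rule in candidates:
        if precision(rule, cases, labels) < threshold:
            continue
        hits = {i for i, (case, label) in enumerate(zip(cases, labels))
                if label == 1 and rule_fires(rule, case)}
        if hits - covered:
            repository.append(rule)
            covered |= hits
    return repository


# Toy imbalanced data set: two positive cases out of six (hypothetical values).
cases = [{"x": 1, "y": 5}, {"x": 2, "y": 1}, {"x": 3, "y": 6},
         {"x": 4, "y": 2}, {"x": 5, "y": 7}, {"x": 6, "y": 0}]
labels = [0, 0, 0, 0, 1, 1]

# Candidate rules, imagined as paths extracted from several decision trees.
rule_y = (("y", ">", 4),)                    # low precision, covers one positive
rule_x = (("x", ">", 4),)                    # high precision, covers both positives
rule_xy = (("x", ">", 4), ("y", ">", 4))     # high precision, but adds nothing new
candidates = [rule_y, rule_x, rule_xy]

strict = build_repository(candidates, cases, labels, threshold=0.5)
lenient = build_repository(candidates, cases, labels, threshold=0.3)
print(len(strict), len(lenient))
```

In this sketch a lower threshold admits more rules, and therefore flags more cases as positive at the cost of more false alarms, while a higher threshold keeps only the most precise rules. Sweeping the threshold in this way is what traces out different operating points in ROC space.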
The Scenario Method and Repository Method also help to speed up Genetic Programming. Some trees that appear to perform poorly on their own may contain useful parts, yet those trees are typically discarded through evolution. By combining useful parts of different trees, including the apparently weak ones, useful building blocks are accumulated more effectively. This allows Genetic Programming to generate trees or rule sets of the same quality in far fewer generations or evaluations.
The Evolving Comprehensible Rules (ECR) Method combines the Scenario Method and Genetic Programming. It is particularly effective in finding scarce opportunities, i.e. when working with imbalanced data sets.
Other publications can be found at the Computational Finance & Economics Research Laboratory.