Approximate reasoning applied to unsupervised database mining
Corresponding Author
Lawrence J. Mazlack
Computer Science Area, ECECS Department, University of Cincinnati, Cincinnati, OH 45221-0030
Abstract
A computational approach is presented for unsupervised, reactive database mining. The approach depends on soft computing techniques. Database mining seeks to discover noteworthy, unrecognized associations between database items. A novel approach is suggested in which unsupervised search is controlled by dissonance reduction. Both crisp and noncrisp data are subject to discovery. Another aspect of uncertainty is the metric that controls discovery. Issues include coherence measures, granularization, user-intelligible results, unsupervised recognition of interesting results, and concept equivalent formation. © 1997 John Wiley & Sons, Inc.