The Effect of Correction Factor in Synthesizing Global Rules in a Multi-Database Mining Scenario
|Published in:||Issue 2, (Vol. 3) / 2009|
|Author(s):||Thirunavukkarasu Ramkumar, Rengaramanujam Srinivasan|
|Abstract.||Multi-database mining using local pattern analysis has recently been identified as an efficient strategy for mining the multiple data sources of an interstate business organization. In this approach, frequent patterns from the individual sites are forwarded to the central head and synthesized there. Various synthesizing models [5,7] have been proposed to form global patterns from the forwarded high-frequency rules. Earlier, we proposed a model for synthesizing high-frequency rules on the basis of the transaction population of the sites and the support and confidence of the rule at the respective sites. The rules forwarded by the local sites are "strong" rules, which satisfy the minimum support and confidence thresholds at their respective sites. The rules synthesized from such forwarded patterns should closely match the mono-mining results, i.e., the results that would be obtained if all the databases were put together and mined as one. When a rule is present at a site but fails to satisfy the minimum support threshold there, it is not allowed to take part in the rule synthesizing process. In such situations the correction factor "h" plays a vital role in inferring the global support and confidence values. A suitable choice of the correction factor "h" enables the domain expert to obtain valid synthesized results. This paper brings out the impact of the correction factor on obtaining synthesized results close to the mono-mining results.|
|Keywords:||Rule Synthesizing, Transaction Population, Correction Factor, Weighting Model|
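The role of the correction factor described in the abstract can be illustrated with a minimal sketch. It is not the authors' published model: the abstract does not give the formula, so the sketch assumes that a site which did not forward the rule contributes an estimated support of `h * min_support`, with `h` between 0 and 1, and that sites are weighted by their share of the total transaction population. The names `Site` and `synthesize_support` are hypothetical, and only support (not confidence) is handled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    transactions: int          # transaction population of the site
    support: Optional[float]   # local support of the rule; None if the rule was not forwarded


def synthesize_support(sites: List[Site], min_support: float, h: float) -> float:
    """Estimate the global support of one rule from per-site reports.

    ASSUMPTION (not taken from the paper): a site that did not forward the
    rule is credited with support h * min_support, and every site is
    weighted by its share of the total transaction population.
    """
    if not 0.0 <= h <= 1.0:
        raise ValueError("h is assumed to lie in [0, 1]")
    total = sum(s.transactions for s in sites)
    estimate = 0.0
    for s in sites:
        # Use the reported support, or the corrected guess for a silent site.
        local = s.support if s.support is not None else h * min_support
        estimate += (s.transactions / total) * local
    return estimate


if __name__ == "__main__":
    # Hypothetical data: the third site holds the rule below the threshold.
    sites = [Site(1000, 0.30), Site(3000, 0.20), Site(1000, None)]
    for h in (0.0, 0.5, 1.0):
        print(h, round(synthesize_support(sites, min_support=0.10, h=h), 4))
```

With this made-up data the estimate moves from 0.18 at `h = 0` to 0.20 at `h = 1`, which shows how the choice of `h` shifts the synthesized value toward or away from what pooled mining would report.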
1. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of 20th International Conference on Very Large Databases, pp. 487-499.
2. Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 207-216.
3. Kum H C, Pai J, Wan W, Duncan D (2003) SIAM International Conference on Data Mining.
4. Liu H, Lu H, Yao J (2001) Toward multi database mining: Identifying relevant databases. IEEE Transactions on Knowledge and Data Engineering 13(4): 541-553.
5. Ramkumar T, Srinivasan R (2008) Modified algorithm for synthesizing high-frequency rules from different data sources. Knowledge and Information Systems 17(3): 313-334.
6. Wu X, Zhang C, Zhang S (2005) Database classification for multi-database mining. Information Systems 30(1): 71-88.
7. Wu X, Zhang S (2003) Synthesizing high-frequency rules from different data sources. IEEE Transactions on Knowledge and Data Engineering 15(2): 353-367.
8. Zhang N, Yao Y Y, Ohshima M (2003) Peculiarity oriented multi-database mining. IEEE Transactions on Knowledge and Data Engineering 15(4): 952-960.
9. Zhang C, Liu M, Nie W, et al. (2004) Identifying global and exceptional patterns in multi-database mining. IEEE Computational Intelligence Bulletin 3(1): 19-24.
10. Zhang S, Wu X, Zhang C (2003) Multi-database mining. IEEE Computational Intelligence Bulletin 2(1): 5-13.
11. Zhang S, Zhang C, Wu X (2004) Knowledge Discovery in Multiple Databases. Springer.
This article is licensed under a
Creative Commons Attribution-ShareAlike 4.0 International License.