Locating previously unknown patterns in data-mining results: a dual data- and knowledge-mining method

Siadaty, Mir S; Knaus, William A

doi:10.1186/1472-6947-6-13

BMC Medical Informatics and Decision Making

Table 1 The dual mining algorithm

From: Locating previously unknown patterns in data-mining results: a dual data- and knowledge-mining method

1	Given a database (DB) to be mined, select a relevant knowledgebase (KB);
2	Map the terminology of the DB to that of the KB, and/or vice versa. In biomedicine, databases usually use an "aggregate classification", while knowledgebases usually use "detailed clinical vocabularies";
3	Choose the "primary unit of analysis" for the DB and for the KB. Examples of a unit of analysis for the DB are each 'patient', each 'visit', or an 'episode of care'; for the KB, an example is each biomedical article;
4	Choose a type of pattern, a mining method for finding that type of pattern, and its measure of pattern strength. For example, association rules where the strength of a rule is measured by Spearman's Rho;
5	Given the list of attributes and their sampling probabilities, generate m n-tuples, where m and n are integers. m is the number of n-tuples chosen together for a single iteration; for example, m can be 20, 50, or 100. n is the number of attributes within a pattern; for example, n may range from 2 to 5;
6	Evaluate the batch of m n-tuples in the DB, and estimate the strength of each n-tuple;
7	Estimate strengths of the same n-tuples in the KB;
8	Estimate the surprise score (SS) of each n-tuple from its pair of strengths in the DB and the KB. Also estimate the statistical significance of the scores;
9	Update the list of sampling probabilities of the attributes by using the estimated SSs. Attributes observed more frequently in n-tuples with high SS receive higher sampling probabilities, while attributes of low-SS n-tuples receive lower probabilities;
10	Start over from step 5, until all n-tuples generated from the list of attributes are exhausted, or the time limit is reached.
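
The loop in steps 5 to 10 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the surprise score (absolute DB strength minus absolute KB strength), the exponential weight update, and the parameter lr are all hypothetical choices, and the significance test of step 8 is omitted. The callables db_strength and kb_strength stand in for whatever mining method and strength measure were chosen in step 4.

```python
import math
import random


def sample_tuple(attrs, weights, n, rng):
    """Step 5: draw n distinct attributes, weighted by sampling probability."""
    pool = list(attrs)
    w = [weights[a] for a in pool]
    chosen = []
    for _ in range(n):
        i = rng.choices(range(len(pool)), weights=w, k=1)[0]
        chosen.append(pool.pop(i))
        w.pop(i)
    return tuple(sorted(chosen))


def dual_mining(attrs, db_strength, kb_strength, n=2, m=20,
                max_iter=100, lr=0.5, seed=0):
    """Hypothetical sketch of steps 5-10; max_iter stands in for the time limit."""
    rng = random.Random(seed)
    weights = {a: 1.0 for a in attrs}      # unnormalized sampling weights
    total = math.comb(len(attrs), n)       # number of distinct n-tuples
    scores = {}                            # n-tuple -> surprise score
    for _ in range(max_iter):
        if len(scores) >= total:           # step 10: all n-tuples exhausted
            break
        batch, attempts = [], 0
        want = min(m, total - len(scores))
        while len(batch) < want and attempts < 100 * m:
            attempts += 1
            t = sample_tuple(attrs, weights, n, rng)
            if t not in scores and t not in batch:
                batch.append(t)
        for t in batch:
            # Steps 6-8: assumed score, high when the pattern is strong
            # in the DB but weak in the KB.
            ss = abs(db_strength(t)) - abs(kb_strength(t))
            scores[t] = ss
            # Step 9: assumed update, raising weights of attributes seen
            # in high-SS tuples and lowering those in low-SS tuples.
            for a in t:
                weights[a] *= math.exp(lr * ss)
    return scores, weights
```

Sorting the returned scores in descending order would then give a ranked list of candidate patterns that are strong in the data but weakly represented in the literature.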


ISSN: 1472-6947
