
MIDClass

Microarray Data Classification by Association Rules and Gene Expression Intervals. Microarrays are a well-established technology for analyzing the expression of many genes in a single reaction, with applications ranging from cancer diagnosis to drug response. A microarray is a matrix of known samples of DNA, cDNA, or oligonucleotides, called probes, which hybridize with mRNA sequences. The expression level of each gene is given by the amount of mRNA binding to the corresponding entry. The aim is to find either sets of genes that characterize particular disease states or experimental conditions, or highly correlated genes that share common biological features.

MIDClass is a tool based on association rules that classifies gene expression data. It exploits the idea that intervals of gene expression values can better discriminate subtypes within the same class.

Association rule mining finds sets of items (called frequent itemsets) whose number of occurrences in the dataset exceeds a predefined threshold. It then generates association rules from those itemsets, subject to a minimum confidence constraint. Market basket analysis is a classic example of this kind of mining: customers' habits are characterized by finding associations between the items placed in their shopping baskets. In MIDClass, the items are gene expression intervals, and the baskets are the phenotypes containing (i.e. described by) sets of gene expression intervals. The aim of MIDClass is to extract maximal frequent itemsets and then use them as rules whose antecedent is a conjunction of gene expression intervals and whose consequent is the class label.
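The basket analogy above can be illustrated with a minimal Python sketch. This is not the MIDClass implementation: the toy data, the interval names ("low", "mid", "high"), the thresholds, the brute-force search, and the confidence-weighted vote in `classify` are all assumptions made for illustration only. Each item is a hypothetical (gene, interval) pair, and each sample is a basket of such items with a class label.

```python
from itertools import combinations

# Toy data (hypothetical): each sample is a basket of (gene, interval) items
# together with its class label.
samples = [
    ({("geneA", "high"), ("geneB", "low"), ("geneC", "mid")}, "tumor"),
    ({("geneA", "high"), ("geneB", "low"), ("geneC", "low")}, "tumor"),
    ({("geneA", "high"), ("geneB", "low"), ("geneC", "high")}, "tumor"),
    ({("geneA", "low"), ("geneB", "high"), ("geneC", "mid")}, "normal"),
    ({("geneA", "low"), ("geneB", "high"), ("geneC", "low")}, "normal"),
]


def frequent_itemsets(baskets, min_support):
    """Brute-force search: map each frequent itemset to its support."""
    items = sorted(set().union(*baskets))
    n = len(baskets)
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(cset <= b for b in baskets) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            break  # no frequent itemset of this size, so none larger either
    return result


def maximal(itemsets):
    """Keep only itemsets that have no frequent proper superset."""
    return [s for s in itemsets if not any(s < t for t in itemsets)]


def mine_rules(samples, min_support=0.6, min_confidence=0.8):
    """Per class, turn maximal frequent itemsets into (antecedent, label, confidence)."""
    all_baskets = [b for b, _ in samples]
    rules = []
    for label in sorted({lab for _, lab in samples}):
        class_baskets = [b for b, lab in samples if lab == label]
        for antecedent in maximal(frequent_itemsets(class_baskets, min_support)):
            covered = sum(antecedent <= b for b in all_baskets)
            if covered == 0:
                continue
            correct = sum(antecedent <= b for b in class_baskets)
            confidence = correct / covered
            if confidence >= min_confidence:
                rules.append((antecedent, label, confidence))
    return rules


def classify(basket, rules):
    """Assumed scoring: sum the confidence of every matching rule per label."""
    votes = {}
    for antecedent, label, confidence in rules:
        if antecedent <= basket:
            votes[label] = votes.get(label, 0.0) + confidence
    return max(votes, key=votes.get) if votes else None


rules = mine_rules(samples)
for antecedent, label, confidence in rules:
    print(sorted(antecedent), "=>", label, f"(confidence {confidence:.2f})")
```

The exhaustive enumeration is only practical for a handful of items; a real microarray dataset with thousands of genes would need a dedicated maximal frequent itemset miner.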

MIDClass was developed with other authors as part of a research project at the University of Catania.

MIDClass is supported by StudioPigola on all editions of Windows 10.

 

 REFERENCES

  • 2013 Rosalba Giugno, Alfredo Pulvirenti, Luciano Cascione, Giuseppe Pigola, Alfredo Ferro, “MIDClass: Microarray Data Classification by Association Rules and Gene Expression Intervals”, PLoS One, 01/2013; 8:e69873. DOI:10.1371/journal.pone.0069873