
Feature Selection Phase

Assume that a particular biological phenomenon is monitored in a high-throughput experiment under n different conditions, for instance a cell-cycle study performed on several differently synchronized cultures. Each experiment is assumed to measure the expression levels of m genes at a number of time points. A set of n data matrices is thus produced, one per experiment. These matrices do not necessarily cover exactly the same phases of the studied phenomenon, nor do they need to have the same dimensions or use the same time sampling interval.

Assume that p-values are estimated in some way for each gene in each experiment. The p-values from all the experiments are then joined together to form a single matrix with as many rows as experiments and as many columns as genes. Next, we can apply the hybrid aggregation procedure(1), which after a finite number of iterations transforms this matrix into a single vector consisting of one overall p-value per gene. The hybrid aggregation procedure is schematically illustrated in the above figure. The values of the resulting vector can be interpreted as the consensus p-values supported by all n experiments. They can then be used to select a subset of genes that are potentially of interest for the studied biological phenomenon. Assume that a set of s (1 ≤ s ≤ m) genes has been selected, either by applying a predefined p-value threshold or by retaining a certain percentage of the genes with the lowest p-values. The time expression profiles of these s genes can subsequently be extracted from the original n data matrices, yielding n new matrices. Each gene of interest is thus represented by multiple expression profiles, one per experiment, which shed light on the gene's function from different experimental perspectives.
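The steps described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions, not the implementation from (1): the three aggregation operators used here (arithmetic, geometric and harmonic mean), the stopping tolerance, and the function names hybrid_aggregate, select_genes and extract_profiles are all hypothetical choices made for the example. Each experiment matrix is assumed to have one row per gene and one column per time point.

```python
import numpy as np


def hybrid_aggregate(pvals, tol=1e-10, max_iter=1000):
    """Collapse an (experiments x genes) p-value matrix to one value per gene.

    Assumption: the aggregation operators are the arithmetic, geometric and
    harmonic means; the original procedure may use a different operator set.
    """
    # Keep values strictly positive so that log and 1/x are defined.
    current = np.clip(np.asarray(pvals, dtype=float), 1e-300, 1.0)
    for _ in range(max_iter):
        arithmetic = current.mean(axis=0)
        geometric = np.exp(np.log(current).mean(axis=0))
        harmonic = current.shape[0] / (1.0 / current).sum(axis=0)
        # The operator outputs become the rows of the next matrix.
        current = np.vstack([arithmetic, geometric, harmonic])
        # Stop once all rows agree for every gene (column).
        if np.max(current.max(axis=0) - current.min(axis=0)) < tol:
            break
    return current.mean(axis=0)


def select_genes(consensus, threshold=None, fraction=None):
    """Return sorted gene indices chosen by a threshold or by a lowest fraction."""
    if (threshold is None) == (fraction is None):
        raise ValueError("give exactly one of threshold or fraction")
    if threshold is not None:
        return np.flatnonzero(consensus <= threshold)
    s = max(1, int(round(fraction * consensus.size)))
    return np.sort(np.argsort(consensus)[:s])


def extract_profiles(experiments, selected):
    """Keep only the selected gene rows in every (genes x time points) matrix."""
    return [matrix[selected, :] for matrix in experiments]


# Toy data: n = 3 experiments, m = 4 genes (made-up p-values).
pvals = np.array([
    [0.010, 0.5, 0.03, 0.9],
    [0.020, 0.4, 0.20, 0.8],
    [0.005, 0.6, 0.04, 0.7],
])
consensus = hybrid_aggregate(pvals)
selected = select_genes(consensus, fraction=0.5)

# The experiments may have different numbers of time points.
rng = np.random.default_rng(0)
experiments = [rng.normal(size=(4, t)) for t in (5, 8, 6)]
profiles = extract_profiles(experiments, selected)
```

Because every mean lies between the smallest and the largest input value, the rows of the intermediate matrix move closer together at each pass, which is why the loop stops after a finite number of iterations for any positive tolerance. The resulting list profiles holds one reduced matrix per experiment, each with s rows.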


(1) Tsiporkova, E. and Boeva, V. Nonparametric Recursive Aggregation Process. Kybernetika, Journal of the Czech Society for Cybernetics and Information Sciences 40(1) (2004) 51-70.

Technical University of Sofia-branch Plovdiv, Tsanko Dyustabanov 25, 4000 Plovdiv, Bulgaria