The Distribution of Gene Duplicates

Predicted and observed numbers of gene duplicates in 18 MIPS (Munich Information Center for Protein Sequences) functional categories of the Saccharomyces cerevisiae genome.  Chi-squared analyses revealed a significant difference between the predicted and observed values for six individual categories (p<0.0027, df=1) and for the overall distribution (p<<0.001, df=17).  Asterisks mark the categories whose observed values deviated significantly from the predicted distribution.  For each category, the left bar is the predicted number of duplicated genes when genes assigned to more than one functional category are kept in the analysis.  The middle and right bars are the predicted and observed numbers, respectively, of duplicated genes when all genes that reside in more than one functional category are removed from the analysis.
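
For readers who want to see how tests like the ones in the caption work, here is a minimal Python sketch.  It is not the analysis from the paper: the counts are made up, and the function names are hypothetical.  The overall test compares observed and predicted counts across all categories, so 18 categories give df=17.  The caption does not say how the per-category tests were set up; the sketch assumes each one compares duplicates inside the category against duplicates outside it, which gives df=1.  The 0.0027 threshold is close to 0.05/18, which would be a Bonferroni correction for 18 tests, but that reading is a guess.

```python
import math


def chi_squared_statistic(observed, expected):
    """Pearson's chi-squared statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(statistic):
    """Upper-tail p-value of a chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Made-up counts for three categories (NOT the data from the paper).
observed = [30, 50, 20]
expected = [40.0, 40.0, 20.0]

# Overall goodness-of-fit statistic; df = number of categories - 1.
# Turning it into a p-value for df > 1 needs a chi-squared survival
# function, for example scipy.stats.chi2.sf(statistic, df).
overall_statistic = chi_squared_statistic(observed, expected)
overall_df = len(observed) - 1

# Per-category tests (assumed setup): inside the category vs. outside it.
total = sum(observed)
alpha = 0.05 / 18  # assumed Bonferroni threshold for 18 categories
per_category = []
for o, e in zip(observed, expected):
    statistic = chi_squared_statistic([o, total - o], [e, total - e])
    p = p_value_df1(statistic)
    per_category.append((statistic, p, p < alpha))

print("overall statistic:", overall_statistic, "df:", overall_df)
for statistic, p, significant in per_category:
    print("category statistic:", statistic, "p:", p, "significant:", significant)
```

With real data, `observed` and `expected` would each hold 18 counts, one per functional category, and the expected counts would need to sum to the same total as the observed counts.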

This is a short paper I wrote in grad school at Indiana University to satisfy my interest in duplication and repetition in biological systems.  It is unpublished, but you can download The Distribution of Single-Function Duplicate Genes Among Functional Groups here (PDF, 312 KB).

