Knowledge Integration for Gene Target Selection
Graciela Gonzalez, PhD
Juan C. Uribe
Contact: [email protected]

GeneRanker in a Nutshell
GeneRanker integrates knowledge from the biomedical literature, curated PPI databases, and protein network topology.
It seeks to prioritize lists of genes by their association with specific diseases and phenotypes [1].
Such associations may or may not have been published (thus, this is not text mining alone).

[1] Gonzalez G, Uribe JC, Tari L, Brophy C, Baral C. Mining Gene-Disease relationships from Biomedical Literature: Incorporating Interactions, Connectivity, Confidence, and Context Measures. Pacific Symposium on Biocomputing; 2007; Maui, Hawaii.

GeneRanker Interface
1. The user types a disease or biological process to search for.
2. Genes found to be associated with the disease are extracted from the literature.
3. Protein-protein interactions involving those genes are then pulled from the literature and curated sources.
4. The protein network is built and each gene is ranked.
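The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not give GeneRanker's scoring formula, so the score used here (number of interaction partners that are literature-derived seed genes, with ties broken by total degree) is a stand-in, and the function name `rank_genes` and the gene labels are hypothetical.

```python
from collections import defaultdict


def rank_genes(seed_genes, interactions):
    """Rank genes in a protein network by a toy topology score.

    seed_genes: genes associated with the disease in the literature (step 2).
    interactions: iterable of (gene_a, gene_b) protein-protein pairs (step 3).
    The score is an ASSUMPTION for illustration, not GeneRanker's method:
    primary key = number of neighbors that are seed genes,
    secondary key = total number of neighbors, then gene name.
    """
    # Step 4a: build an undirected network as an adjacency map.
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)

    seeds = set(seed_genes)
    genes = seeds | set(neighbors)

    def sort_key(gene):
        linked = neighbors.get(gene, set())
        return (-len(linked & seeds), -len(linked), gene)

    # Step 4b: rank every gene in the network, best first.
    return sorted(genes, key=sort_key)


# Hypothetical toy input: GENE_X is not a seed but interacts with all seeds.
seeds = ["GENE_A", "GENE_B", "GENE_C"]
ppis = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_X"),
    ("GENE_B", "GENE_X"),
    ("GENE_C", "GENE_X"),
    ("GENE_Y", "GENE_X"),
]
ranked = rank_genes(seeds, ppis)
```

In this toy input, GENE_X ranks first even though it is not among the seed genes, which mirrors the point that a prioritized association need not have been published.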

Collaboration: application of GeneRanker to a biological context, with Dr. Michael Berens, Director of the Brain Tumor Unit at the Translational Genomics Research Institute (TGen).
GeneRanker is available as an online application at http://www.generanker.org.
Each gene is scored and can be annotated (count of co-occurrences and statistical representation).

Evaluation of GeneRanker
[Figure: Mining genes related to glioma: precision by method. Methods compared: ranked list (top 50), ranked list (top 100), ranked list (top 200), gene-disease search, random list. Axis: 0% to 100%. Legend: Related (>10 articles); Possibly Related (1 to 10 articles); No evidence of relation or not a gene.]
The contextual (PubMed search) method shows a jump of more than 20% in precision over NLP-based extraction.
Results on synthetic networks show AUC > 0.984.
Empirical validation against a glioma dataset shows consistent results (118 vs 22 differentially expressed probes from the top vs the bottom of the list).

Complementary Work
CBioC (www.cbioc.org): shows PPIs, gene-disease, and gene-bioprocess associations extracted from abstracts.
BANNER (sourceforge.banner.org): an open source entity recognizer, available now (we are presenting a poster on this one).
Gene normalization: a similar open source system will be available soon.
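The precision-by-method evaluation reported above groups the genes returned by each method into three categories according to how many articles support the relation. A minimal sketch of that bookkeeping follows; the thresholds come from the figure legend, while the function names, the input format (a dict of article counts per gene), and the gene labels are assumptions made for illustration.

```python
from collections import Counter

# Category labels follow the figure legend.
CATEGORIES = ("related", "possibly related", "no evidence")


def categorize(article_count):
    """Map a supporting-article count to a legend category."""
    if article_count > 10:
        return "related"            # Related (>10 articles)
    if article_count >= 1:
        return "possibly related"   # Possibly Related (1 to 10 articles)
    return "no evidence"            # No evidence of relation or not a gene


def precision_breakdown(ranked_genes, article_counts, k):
    """Fraction of the top-k genes that falls into each category.

    article_counts is a hypothetical lookup {gene: number of articles};
    genes missing from it are treated as having no evidence.
    """
    top = ranked_genes[:k]
    if not top:
        raise ValueError("need at least one gene in the top-k list")
    counts = Counter(categorize(article_counts.get(g, 0)) for g in top)
    return {label: counts[label] / len(top) for label in CATEGORIES}


# Hypothetical toy data, not results from the study.
toy_ranking = ["G1", "G2", "G3", "G4"]
toy_counts = {"G1": 25, "G2": 3, "G3": 0, "G4": 11}
top4 = precision_breakdown(toy_ranking, toy_counts, k=4)
top2 = precision_breakdown(toy_ranking, toy_counts, k=2)
```

Running the same function with k set to 50, 100, and 200 on a real ranked list would give the per-category fractions that a stacked bar per method displays.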

