NERIC 2017 Speakers
Cathy H. Wu, PhD
Systems integration is becoming a driving force for 21st-century biology. Researchers are systematically tackling gene functions and complex regulatory processes by studying organisms at different levels of organization, from genomes, transcriptomes and proteomes to metabolomes and interactomes. Fully realizing the value of such high-throughput data requires advanced bioinformatics for integration, mining, comparative analysis, and functional interpretation. My group conducts bioinformatics and computational biology research and has developed a bioinformatics resource at the Protein Information Resource (PIR) with integrated databases and analytical tools to support genomics, proteomics and systems biology research [Wu et al., 2003]. PIR is a member of the UniProt Consortium, which provides the central international resource on protein sequence and function [Wu et al., 2006]. The PIR web site and the UniProt web site at PIR are accessed by researchers worldwide, with over 4 million hits per month from over 100,000 unique sites.
Our research encompasses protein evolution-structure-function relationships, biological text mining, protein ontology, proteomic bioinformatics, computational systems biology, and bioinformatics cyberinfrastructure. The protein-centric bioinformatics framework we are developing connects data mining, text mining and ontology for functional analysis of genes and proteins in the systems biology context. This integrative approach reveals hidden relationships among the various components of biological systems, allows researchers to ask complex biological questions and gain a better understanding of disease processes, and facilitates target discovery. We will further establish a new Center for Bioinformatics and Computational Biology at the University of Delaware to foster collaborative interdisciplinary research and to offer graduate degree programs in Bioinformatics and Computational Biology that train the next generation of researchers and educators.
- Protein family classification, functional annotation, and structure-function analysis – As a central approach to protein annotation for the UniProt Knowledgebase, we employ a classification-driven, rule-based method. The PIRSF system classifies proteins from superfamily to subfamily levels to reflect the evolutionary relationships of proteins and their domain architecture, allowing comparative studies of protein function and evolution [Wu et al., 2004; Nikolskaya et al., 2006]. Coupled with manually curated, structure-guided rules, the system supports the standardization and accurate annotation of protein names, functions, and functional sites [Wu et al., 2006]. This systematic approach provides high-quality functional annotation while keeping pace with the exponential growth of molecular sequence data.
- Biological text mining – With an ever-increasing volume of scientific literature now available electronically, we have been collaborating with several Natural Language Processing research groups to develop algorithms for text mining and information extraction [Hirschman et al., 2002]. Several projects have led to tools directly accessible from the iProLINK text mining resource [Hu et al., 2004], including the BioThesaurus of gene/protein names that allows the identification of synonymous and ambiguous names [Liu et al., 2006] and the RLIMS-P text mining system to extract phosphorylation information (kinase, protein substrate, and phosphorylation sites) from Medline abstracts [Hu et al., 2005]. We plan to develop a “configurable, intelligent and integrated” text mining system as the link bridging PubMed and databases for knowledge discovery. We co-organize the BioCreative Challenge Evaluations, bringing together both the text mining and biological research communities to evaluate and guide the future development of text mining systems.
- Biomedical ontology – As biomedical ontologies have emerged as critical tools in biological research for semantic integration of complex data in disparate resources, we have developed a Protein Ontology (PRO) in the OBO (Open Biomedical Ontologies) Foundry framework [Natale et al., 2007]. Extending from the evolutionary relationships of protein classes to the representation of multiple protein forms of genes (e.g., isoforms, post-translational modifications), PRO allows precise definition of protein objects in biological context (e.g., pathways, networks, complexes) and specification of relationships with other ontologies (such as Gene Ontology) [Arighi et al., 2009]. The project aims to capture the knowledge of protein biology embedded in the scientific literature to facilitate pathway, network and disease modeling.
- Omics data integration and pathway/network analysis – Designed for data integration in a distributed environment, the iProClass database provides rich protein annotation with data from over 100 molecular databases [Wu et al., 2004]. It is also the underlying data warehouse for gene/protein ID and name mapping. Building upon iProClass and UniProt, we have developed the iProXpress system for functional profiling and pathway analysis of large-scale gene expression and proteomic data [Huang et al., 2007]. iProXpress has been applied to several studies, including proteomic profiling of melanosomes and lysosome-related organelle proteomes, identification of signaling pathways and networks underlying estrogen-induced apoptosis of breast cancer cells, and analysis of cellular pathways in radiation-resistant cells [Chi et al., 2006; Hu et al., 2007; 2008]. As part of the NIAID biodefense proteomics program, we have integrated various omics data on pathogens and their hosts, allowing biologists to query and analyze data from multiple disparate proteomic centers about pathogen-host relationships. We have conducted integrative bioinformatics analysis of protein structure, function and evolution to identify potential targets for hemorrhagic viruses [Mazumder et al., 2007]. We plan to further develop network mining, visualization and prediction methods and to couple them with the integrative bioinformatics approach to facilitate data-driven hypothesis generation.
Agenda (Information subject to change)
Wednesday, August 16th
|2:00 - 9:00 PM||Registration Desk Hours of Operation (Diamond Foyer)|
|5:00 - 6:00 PM||Reception and Poster Session I Setup (Emerald Promenade)|
|6:00 - 6:15 PM||Welcoming Remarks (Emerald I-III Ballrooms): Ralph Budd, MD, Director, Vermont Center for Immunology and Infectious Diseases, UVM College of Medicine; Judith Van Houten, PhD, Director, Vermont Genetics Network; UVM President Thomas Sullivan; and UVM Provost David Rosowsky, PhD|
|6:15 - 7:30 PM||Plated Dinner (Emerald I-III Ballrooms)|
|7:30 - 9:00 PM||Poster Session I (Undergraduate students and Cores) (Emerald Promenade)|
Thursday, August 17th
|7:00 AM - 5:30 PM||Registration Desk Hours of Operation (Diamond Foyer)|
|7:00 - 8:15 AM||Breakfast, Career Tables for interested students, Poster Session II Setup|
|8:30 - 9:15 AM||Opening Remarks (Emerald I-III Ballrooms)|
|9:30 - 10:00 AM||Break and Refreshments (Emerald Promenade)|
|10:00 AM - 12:00 PM||Concurrent Sessions|
|10:30 - 11:30 AM||COBRE/INBRE Business Grant Managers Meeting with Arina Kramer, Grants Specialist at the Center for Research Capacity Building, NIGMS (Carlton Boardroom)|
|12:00 - 1:30 PM||Lunch and Networking; VGN Baccalaureate Partner Institution Meeting (Carlton Boardroom)|
|1:30 - 3:30 PM||Concurrent Scientific Sessions (solicit themes from PIs and from abstracts received). Development/Genetics (Emerald I Ballroom) Chair,|
|3:30 - 3:45 PM||Break and Refreshments (Emerald Promenade)|
|3:45 - 4:45 PM||Keynote Lecture I (Emerald III Ballroom)|
|4:45 - 6:00 PM||Poster Session II (Graduate students and Postdoctoral Fellows) (Emerald III Ballroom)|
|6:00 PM||Dinner on your own in Burlington|
Friday, August 18th
|8:00 - 9:00 AM||Breakfast (Exhibition Hall)|
|9:00 - 10:00 AM||Cool ideas from IDeA programs (Emerald III Ballroom)|
|10:00 AM - 12:00 PM||Concurrent Scientific Sessions: Neuroscience (Emerald I Ballroom), Chair, Judith Van Houten, PhD; Cardiovascular and Pulmonary Systems (Emerald III Ballroom), Chair, Bruce Stanton, PhD; Bioinformatics, Computational Biology, Complex Systems (Emerald II Ballroom), Chair, Cathy Wu, PhD, Edward G. Jefferson Chair of Bioinformatics & Computational Biology, University of Delaware|
|12:00 - 1:00 PM||Lunch and Networking Box lunches (G's Restaurant)|
|1:00 PM||Conference Concludes|