David Haussler (born 1953) is an American bioinformatician known for leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project, and subsequently for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome.[12][13][14]
Haussler was elected a member of the National Academy of Engineering in 2018 for developments in computational learning theory and bioinformatics, including first assembly of the human genome, its analysis, and data sharing.
During summers while he was in college, Haussler worked for his brother, Mark Haussler, a biochemist at the University of Arizona studying vitamin D metabolism. They were the first to measure the levels of calcitriol, the hormonal form of vitamin D, in the human bloodstream.[16] Between 1975 and 1979 he traveled and worked a variety of jobs, including a job at a petroleum refinery in Burghausen, Germany, tomato farming on Crete, and farming kiwifruit, almonds, and walnuts in Templeton, California. While in Templeton he worked on his master's degree at nearby California Polytechnic State University.[9]
Haussler was an assistant professor in Mathematics and Computer Science at the University of Denver in Colorado from 1982 to 1986. Since 1986 he has been at UC Santa Cruz, initially in the Computer Science Department and, from 2004, as an inaugural member of the Biomolecular Engineering Department.[9]
Haussler's research combines mathematics, computer science, and molecular biology.[6] He develops new statistical and algorithmic methods to explore the molecular function and evolution of the human genome, integrating cross-species comparative and high-throughput genomics data to study gene structure, function, and regulation.[21][22][23][24] He is credited with pioneering the use of hidden Markov models (HMMs), stochastic context-free grammars, and discriminative kernel methods for analyzing DNA, RNA, and protein sequences. He was the first to apply discriminative kernel methods to the genome-wide search for gene expression biomarkers in cancer, now a major effort of his laboratory.
As a collaborator on the international Human Genome Project, his team, featuring programming work by graduate student Jim Kent, computationally assembled the first draft of the human genome[25] and posted it on the Internet on July 7, 2000.[26] Following this, his team developed the UCSC Genome Browser,[27][28][29] a web-based tool that is used extensively in biomedical research and serves as the platform for several large-scale genomics projects. These include the National Human Genome Research Institute (NHGRI)'s ENCODE project, which uses omics methods to explore the function of every base in the human genome (and for which UCSC served as the Data Coordination Center); NIH's Mammalian Gene Collection; NHGRI's 1000 Genomes Project to explore human genetic variation; the Human Pangenome Reference Consortium, which aims to replace the single reference human genome with a collection of genomes from around the world; and the National Cancer Institute (NCI)'s Cancer Genome Atlas project to explore the genomic changes in cancer.
His group's informatics work on cancer genomics, including the UCSC Cancer Genomics Browser,[30] provides a complete analysis pipeline from raw DNA reads through the detection and interpretation of mutations and altered gene expression in tumor samples. His group collaborates with researchers at medical centers nationally, including members of the Stand Up To Cancer "Dream Teams" and the Cancer Genome Atlas, to discover molecular causes of cancer and develop a new personalized, genomics-based approach to cancer treatment.[31]
He co-founded the Genome 10K Project (now superseded by the Vertebrate Genomes Project) to assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species—to capture genetic diversity as a resource for the life sciences and for worldwide conservation efforts.[33][34]
Freund, Yoav (1993). Data filtering and distribution modeling algorithms for machine learning (PhD thesis). University of California, Santa Cruz. OCLC 679396091.
Baum, Eric B.; Haussler, David (1988-01-01). "What size net gives valid generalization?". Proceedings of the 1st International Conference on Neural Information Processing Systems. NIPS'88. Cambridge, MA, USA: MIT Press: 81–90.