SIMA: Systems Integration for Manufacturing Applications, NIST - National Institute of Standards and Technology
Technical Research Projects
Data Standards for Structural Bioinformatics
Principal Investigator: T.N. Bhat
(301) 975-5448
talapady.bhat@nist.gov

Objective:
To develop standards for structural bioinformatics for pre-clinical data that include X-ray and NMR structural data for biological macromolecules, with particular emphasis on Internet-based databases of importance to biotechnology and the Semantic Web.

Background:
The last decade has seen an amazing explosion in the field of bioinformatics fueled by the large-scale genomics sequencing efforts funded by NIH, NSF, DOE and private industry. Recently, new initiatives in Structural Genomics and Proteomics are underway that will be much more data intensive, with both large volumes and more complex data representations. Today the harvesting and management of large sets of biological structural data, and the mining of the information contained therein, is an activity that is transforming biological science, biotechnology, and the pharmaceutical industry. In most cases, the amount of data is enormous: thousands of macromolecular structures, millions of protein sequences, tens of thousands of structural and sequence neighbors.
In this era of bioinformatics, there is a growing need for specialized, critically evaluated, and reliable data delivered using easy-to-use Web tools. NIST has been the leader in such data activities. It has a long history of producing, evaluating, and disseminating chemical data and is increasingly applying this expertise to the biosciences. Researchers who are either developing drug treatments for AIDS or studying the virus that causes the disease have a new resource: the HIV Structural Reference Database, an online database of AIDS-related protein structures that was developed in part using SIMA funds and unveiled for public use by NIST in 2004. Since its release it has drawn considerable attention within NIST and has quickly become one of the most popular NIST databases. This work has resulted in two major awards: (1) the NIST Judson C. French Award (2005) and (2) the Science Spectrum Trailblazer Award from Science Spectrum Magazine (2006). In 2007 it earned another major award, Emerald Honors from Science Spectrum Magazine.

Developed in collaboration with the National Cancer Institute, the HIV Structural Reference Database (HIVSDB) receives, annotates, archives, and distributes structural data for proteins involved in making HIV, the virus that causes AIDS, as well as for molecules that inhibit the activities of these proteins. Until now, much of this information was not widely available because it was unpublished. The new database contains data from both the published literature and direct contributions by industrial and other laboratories.
The database (SRD No. 102, copyrighted by the Department of Commerce, as mandated by Congress) will be especially useful in developing strategies for inhibiting the activities of the HIV protease, which is essential for maturation of HIV. In addition, the database is expected to help scientists understand and circumvent the problem of mutations that make HIV resistant to certain drugs. It is one of NIST’s most widely used databases.

NIST scientists annotate the structural data with information from various sources and index or classify the entries so that users can reliably find particular structures. NIST has helped to develop a novel technique for indexing HIV protease inhibitors. This, in turn, has enabled scientists to rapidly and reliably retrieve data on all enzyme-inhibitor complexes of a given kind, such as those involving a mutant strain that is resistant to a particular drug. The HIV database is a model for developing and testing new technologies to annotate and standardize HIV inhibitor names, and for evaluating structural data for macromolecules. At present this Web page has the largest collection of 3-D structures of AIDS targets integrated with the 2-D structures of their inhibitors. In 2008 several updates were posted to this Web page; one of the major updates was the inclusion of about 1000 additional protease inhibitors. During 2008, a significant amount of time was also spent on security-related updates for the HIVSDB and the Enzyme Thermodynamics database.

Enzymes, the biological machinery of chemical catalysis, are key components of technological and industrial growth using biological data. For this reason, during 2007, using in part SIMA funds, NIST established a revised data resource for enzyme thermodynamics data (SRD No. 74). In 2008 a new database with an emphasis on biofuels was released to the public. In 2009 additional information will be posted to this resource.
 


 


Page created October 2008
