Protein Structure Initiative
The Protein Structure Initiative (PSI) is a federal, university, and industry effort aimed at dramatically reducing the cost and time required to determine a three-dimensional protein structure. The long-term goal of the PSI is to make the three-dimensional atomic-level structures of most proteins easily obtainable from knowledge of their corresponding DNA sequences. Funding is provided by the U.S. National Institute of General Medical Sciences (NIGMS). The PSI website describes the U.S. Centers that are performing structural genomics and PSI-related technology development, and also hosts PSI meeting reports.
An information portal, the PSI SGKB (Knowledgebase), provides general information and search features, such as protein sequence and keyword searching, along with modules describing target selection, experimental details, models, annotation, metrics, and technology. The portal is hosted at Rutgers University, New Brunswick, at the Protein Data Bank (PDB).
A PSI Materials Repository at the Harvard Institute of Proteomics has been funded to provide easy access to the many thousands of clones produced by PSI-funded Centers since 2000. Clones will be available starting in 2008.
In 2000, the US National Institute of General Medical Sciences of the National Institutes of Health funded the Protein Structure Initiative (PSI), a ten-year project to uncover the three-dimensional shapes of a wide range of proteins. The Joint Center for Structural Genomics (JCSG), based at The Scripps Research Institute in La Jolla, California, USA, is one of four large-scale centers involved in the production phase of the PSI. Four centers focus on high-throughput protein-structure determination, six specialized centers deal with difficult-to-solve proteins, such as membrane proteins, and two others provide new approaches to molecular modeling.
Ian Wilson, director of the JCSG, thinks the timing is perfect for the PSI centers to produce large numbers of new protein structures for the research community: “With more and more DNA sequences becoming available each day, the possibilities for the future of protein structure determination are tremendous.” A central goal of the PSI is to enable the prediction of three-dimensional structures for most proteins from knowledge of their corresponding DNA sequence. In principle, this can be done by inferring the structure of a protein from the known structures of representative members of the protein’s family. “Most of the big protein families have been mapped, but still for 70% of known families we have no structural data,” says Adam Godzik, of the Burnham Institute for Medical Research in La Jolla, and head of bioinformatics at the JCSG. This makes for a huge number of potential target proteins if one wants representatives from all families, and it raises difficult questions: how do you choose which families to target, and then which proteins within those families to obtain structures from?
“We are dealing with a continually expanding universe of proteins, so we had to have some rules about target selection,” says Wilson. For the PSI, 70% of the target protein families are selected communally through the PSI’s Target Selection Committee. “We all sit down and execute a draft to decide which families each center will get,” says Wilson. “By virtue of choosing particular families we avoid overlap, but also with this selection process each center can optimize specific targets within families for themselves,” says Godzik. Another 15% of target proteins are decided upon by each center, and the final 15% are community targets proposed by outside researchers.
Godzik says that it is most effective for individual centers to decide which proteins to go after within the families they have been assigned, because each center relies on different ‘reagent genomes’ (large sets of genomic DNAs used to isolate homologous sequences). At JCSG, it is Godzik, along with his bioinformatics team, who is responsible for determining the specific proteins JCSG will work on. By aligning a protein family with all 100 genomes available at JCSG, they first identify all homologous proteins. Then, using their own software, they assign a crystallization score to each homologous gene identified within the family. The score is a measure of the likelihood of success of the corresponding protein in the structure determination pipeline. “We take the ones that we predict to be most likely to succeed from this tool, and then we work our way down the list,” he says.
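The ranking step Godzik describes can be illustrated with a minimal Python sketch. This is not JCSG's software: the Homolog record, the gene and genome names, the score values, and the convention that a higher crystallization score means a better chance of success are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Homolog:
    """One homologous gene found for a protein family (hypothetical record)."""
    gene_id: str
    genome: str
    # Assumed convention: a higher score means the protein is more likely
    # to succeed in the structure determination pipeline.
    crystallization_score: float


def rank_targets(homologs):
    """Order homologs from most to least likely to succeed."""
    return sorted(homologs, key=lambda h: h.crystallization_score, reverse=True)


# Made-up homologs of a single family, one per reagent genome.
family = [
    Homolog("gene_a", "genome_1", 0.42),
    Homolog("gene_b", "genome_2", 0.91),
    Homolog("gene_c", "genome_3", 0.67),
]

# Work down the list, starting with the most promising target.
for rank, homolog in enumerate(rank_targets(family), start=1):
    print(rank, homolog.gene_id, homolog.genome, homolog.crystallization_score)
```

Keeping the whole sorted list, rather than only the top entry, mirrors the idea of working down the list when the first-choice target fails.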