Variable gap penalty for protein sequence-structure alignment

Title: Variable gap penalty for protein sequence-structure alignment
Publication Type: Journal Article
Year of Publication: 2006
Authors: Madhusudhan, MS, Marti-Renom, MA, Sanchez, R, Sali, A
Journal: Protein Eng Des Sel
Keywords: Algorithms; Amino Acid Sequence; Models, Molecular; Molecular Sequence Data; Proteins/*chemistry; Sequence Alignment/*methods; Sequence Analysis, Protein/*methods; *Sequence Homology, Amino Acid; *Software

The penalty for inserting gaps into an alignment between two protein sequences is a major determinant of the alignment accuracy. Here, we present an algorithm for finding a globally optimal alignment by dynamic programming that can use a variable gap penalty (VGP) function of any form. We also describe a specific function that depends on the structural context of an insertion or deletion. It penalizes gaps that are introduced within regions of regular secondary structure, buried regions, straight segments and also between two spatially distant residues. The parameters of the penalty function were optimized on a set of 240 sequence pairs of known structure, spanning the sequence identity range of 20-40%. We then tested the algorithm on another set of 238 sequence pairs of known structures. The use of the VGP function increases the number of correctly aligned residues from 81.0 to 84.5% in comparison with the optimized affine gap penalty function; this difference is statistically significant according to Student’s t-test. We estimate that the new algorithm allows us to produce comparative models with an additional approximately 7 million accurately modeled residues in the approximately 1.1 million proteins that are detectably related to a known structure.
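The idea of a dynamic programming alignment that accepts a gap penalty function of any form can be illustrated with a short sketch. The code below is not the algorithm from the paper; it is a generic global alignment recurrence that considers every possible gap length at each cell, which costs O(nm(n+m)) time. The function names (`align`, `gap_a`, `gap_b`), the secondary-structure string `ss_b`, and all penalty weights are hypothetical choices made for illustration only.

```python
from typing import Callable, Tuple

# A gap function receives (start, end, other_pos): the unaligned residues
# are seq[start:end], placed at position other_pos of the other sequence.
GapFn = Callable[[int, int, int], float]


def align(a: str, b: str,
          score: Callable[[str, str], float],
          gap_a: GapFn, gap_b: GapFn) -> Tuple[float, str, str]:
    """Global alignment with arbitrary (finite) gap penalty functions."""
    n, m = len(a), len(b)
    neg_inf = float("-inf")
    S = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    S[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best, prev = neg_inf, None
            if i > 0 and j > 0:  # align a[i-1] with b[j-1]
                best, prev = S[i - 1][j - 1] + score(a[i - 1], b[j - 1]), (i - 1, j - 1)
            for k in range(i):  # leave a[k:i] unaligned
                cand = S[k][j] - gap_a(k, i, j)
                if cand > best:
                    best, prev = cand, (k, j)
            for k in range(j):  # leave b[k:j] unaligned
                cand = S[i][k] - gap_b(k, j, i)
                if cand > best:
                    best, prev = cand, (i, k)
            S[i][j], back[i][j] = best, prev

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        if pi < i and pj < j:          # aligned pair
            out_a.append(a[pi]); out_b.append(b[pj])
        elif pj == j:                  # a[pi:i] opposite a gap in b
            out_a.append(a[pi:i]); out_b.append("-" * (i - pi))
        else:                          # b[pj:j] opposite a gap in a
            out_a.append("-" * (j - pj)); out_b.append(b[pj:j])
        i, j = pi, pj
    return S[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def identity(x: str, y: str) -> float:
    return 1.0 if x == y else -1.0


# Hypothetical structural context for sequence b: H = helix, L = loop.
ss_b = "HHLHH"
OPEN, EXTEND, HELIX = 1.0, 0.5, 2.0  # illustrative weights, not fitted values


def gap_b(k: int, j: int, i: int) -> float:
    # Deleting residues of b costs extra for every helix residue skipped.
    return OPEN + EXTEND * (j - k) + HELIX * ss_b[k:j].count("H")


def gap_a(k: int, i: int, j: int) -> float:
    # Inserting into b costs extra when both flanking residues are helical.
    in_helix = 0 < j < len(ss_b) and ss_b[j - 1] == "H" and ss_b[j] == "H"
    return OPEN + EXTEND * (i - k) + (HELIX if in_helix else 0.0)


result = align("AAAA", "AAAAA", identity, gap_a, gap_b)
print(result)  # (2.5, 'AA-AA', 'AAAAA')
```

Because every residue pair scores the same in this toy input, the placement of the gap is decided entirely by the structural context: the single gap lands opposite the loop residue rather than inside either helix.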


Authors (full names): Madhusudhan, M S; Marti-Renom, Marc A; Sanchez, Roberto; Sali, Andrej
Grants: DE016274/DE/NIDCR NIH HHS/United States; GM54762/GM/NIGMS NIH HHS/United States; GM62529/GM/NIGMS NIH HHS/United States
Publication types: Comparative Study; Research Support, N.I.H., Extramural
Country of publication: England
Journal (full title): Protein engineering, design & selection : PEDS
Citation: Protein Eng Des Sel. 2006 Mar;19(3):129-33. Epub 2006 Jan 19.