One approach to identifying proteins by mass spectrometry begins with the separation of proteins by gel electrophoresis or liquid chromatography. The proteins are then cleaved with sequence-specific endoproteases. Following digestion, the resulting peptides are analyzed by determining their molecular masses or their sequences. For protein identification, the experimentally obtained masses or sequences are compared with theoretical masses or sequences compiled in various databases.
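The mass comparison step can be illustrated with a minimal Python sketch. It assumes standard monoisotopic residue masses, unmodified peptides, and a hypothetical 10 ppm tolerance; the function names and the candidate list are illustrative only and do not come from any particular search engine.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per peptide (free N and C termini)


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONOISOTOPIC[aa] for aa in peptide) + WATER


def match_masses(observed, candidates, tolerance_ppm=10.0):
    """Map each observed mass to candidate peptides within the tolerance.

    tolerance_ppm is an assumed, illustrative value.
    """
    matches = {}
    for mass in observed:
        hits = []
        for peptide in candidates:
            theoretical = peptide_mass(peptide)
            error_ppm = abs(theoretical - mass) / theoretical * 1e6
            if error_ppm <= tolerance_ppm:
                hits.append(peptide)
        matches[mass] = hits
    return matches
```

In practice the candidate list would come from an in silico digest of a protein database, and real search tools also account for charge states, modifications, and missed cleavages, all of which this sketch ignores.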
Trypsin is the favored enzyme for this application for the following reasons: A) tryptic peptides carry a basic residue (Arg or Lys) at the C terminus and are therefore good candidates for collision-activated dissociation (CAD) in tandem experiments (low charge states and high mass-to-charge ratios); B) it is relatively inexpensive; and C) optimal digestion conditions have been well characterized.
An inherent limitation of trypsin is the size of the peptides that it generates. For most organisms, more than 50% of tryptic peptides are shorter than 6 amino acids, which is too small for mass spectrometry-based sequencing.
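The peptide size distribution can be estimated with an in silico digest. The sketch below is a simplified Python illustration: it assumes the commonly used rule that trypsin cleaves C-terminal to Lys or Arg except when the next residue is Pro, it allows no missed cleavages, and the function names are hypothetical.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, unless followed by P.

    Simplified rule with no missed cleavages (an assumption of this sketch).
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def fraction_shorter_than(peptides, min_length=6):
    """Fraction of peptides with fewer than min_length residues."""
    if not peptides:
        return 0.0
    short = sum(1 for p in peptides if len(p) < min_length)
    return short / len(peptides)
```

Running `fraction_shorter_than(tryptic_digest(seq))` over every sequence in a proteome FASTA file would give an estimate of the share of tryptic peptides below the 6-residue threshold mentioned above.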
One recent publication examined the use of multiple proteases (trypsin, LysC, ArgC, AspN and GluC) in combination with either CAD or electron transfer dissociation (ETD) to improve protein identification (1). The results indicated a significant improvement over digestion with a single protease: trypsin alone yielded 27,822 unique peptides corresponding to 3313 proteins, whereas the combination of proteases with either CAD or ETD fragmentation yielded 92,095 unique peptides mapping to 3908 proteins.
1. Swaney DL, Wenger CD, Coon JJ (2010). Value of using multiple proteases for large-scale mass spectrometry-based proteomics. Journal of Proteome Research, 9(3), 1323-9. PMID: 20113005