Help -- emPAI Calc Search Fields

Protein Database

Select the protein database to be searched. The databases available on emPAI Calc are:

Name               Description
IPI_Human          International Protein Index
Ecoli SwissProt    High quality, curated protein database
yeast_orf          Saccharomyces Genome Database (SGD)
Ecoli_MoriLab      GenoBase (Escherichia coli K12 W3110)
Arabidopsis_tair   The Arabidopsis Information Resource (TAIR)

New protein databases can be added from FASTA files upon request. [Contact]


Specify the reagent used for protein digestion.
At present, only trypsin is supported (cleaves C-terminal to K or R, but not before P).
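
The trypsin rule above can be sketched in a few lines of Python. This is a minimal illustration, not emPAI Calc's own code: the function name `tryptic_digest` is hypothetical, and the sketch assumes complete digestion with no missed cleavages.

```python
def tryptic_digest(sequence):
    """Cut after every K or R, unless the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        # Empty string at the end of the sequence, so a C-terminal K/R still closes a peptide.
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep any C-terminal remainder that does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


print(tryptic_digest("MKPAKRGLR"))  # ['MKPAK', 'R', 'GLR']
```

In the example, the K in "MKP" is not cleaved because it is followed by P.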


Select any known modifications. emPAI Calc supports fixed modifications. Fixed modifications are applied universally, to every instance of the specified residue(s) or terminus. There is no computational overhead associated with a fixed modification; it is simply equivalent to using a different mass for the modified residue(s) or terminus. For example, selecting Carboxymethyl (C) means that all calculations will use 161 Da as the mass of cysteine.
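
The cysteine example can be checked with a short sketch. The residue masses and the carboxymethyl mass shift below are standard monoisotopic values supplied here as assumptions (the help text does not say which mass type emPAI Calc uses), and `apply_fixed_modification` is a hypothetical helper name.

```python
# Monoisotopic residue masses in Da (assumed values; only a few residues shown).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "C": 103.00919, "K": 128.09496}

# Mass added to cysteine by carboxymethylation, in Da (assumed value).
CARBOXYMETHYL_DELTA = 58.00548


def apply_fixed_modification(residue_masses, residue, delta):
    """Return a new mass table in which every `residue` carries `delta`."""
    modified = dict(residue_masses)  # copy, so the unmodified table is preserved
    modified[residue] = modified[residue] + delta
    return modified


modified_masses = apply_fixed_modification(RESIDUE_MASS, "C", CARBOXYMETHYL_DELTA)
print(round(modified_masses["C"]))  # 161
```

Because the modification is just a changed entry in the mass table, every later mass calculation picks it up at no extra cost.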

MW range

The range of peptide masses, in Da. Observable peptides are screened by this MW range, which users can set to match their own experimental settings (e.g. the MS scan range).
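
A minimal sketch of such a screen follows. It assumes monoisotopic masses of neutral peptides (the help text does not specify the mass type). The function names and the 500-5000 Da range are hypothetical, chosen only for illustration.

```python
# Monoisotopic residue masses in Da (standard values, assumed here).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per peptide (N-terminal H plus C-terminal OH)


def peptide_mass(peptide):
    """Mass of the neutral peptide: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def within_mw_range(peptides, low, high):
    """Keep peptides whose mass lies in the inclusive range [low, high]."""
    return [p for p in peptides if low <= peptide_mass(p) <= high]


print(within_mw_range(["GK", "MKPAK", "ACDEFGHIK"], 500, 5000))
# ['MKPAK', 'ACDEFGHIK']
```

Here "GK" (about 203 Da) falls below the lower bound and is dropped.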

LC Conditions (for Mascot and Xome only)

Name: Shinoda, K. et al. (2006)
Column: C18
Mobile phase: (A) water and (B) acetonitrile, both with 0.2% formic acid
Gradient: linear gradient; (B) 1%-40% (0-50 min), (B) 40%-50% (50-55 min)
Ref.: [1]

Name: Ishihama, Y. (2005)
Column: C18
Mobile phase: (A) 0.5% acetic acid; (B) 0.5% acetic acid, 80% acetonitrile
Gradient: (B) 5%-60% (0-180 min)
Ref.: [2]

Name: Krokhin, O. V. et al. (2004)
Column: C18
Mobile phase: water and acetonitrile, with 0.1% TFA
Gradient: linear gradient of 1-80% acetonitrile in 60 min
Ref.: [3]

Name: Petritis, K. et al. (2003)
Column: fused silica capillary columns packed with 5-um C18 particles [5]
Mobile phase: (A) acetic acid/TFA/water (0.2:0.05:100, v/v); (B) TFA/acetonitrile/water (0.1:90:10, v/v)
Gradient: nonlinear (exponential) gradient [6]
Ref.: [4,6]

[1] Shinoda, K. et al. J. Proteome Res. 5, 3312-3317 (2006). [PubMed]
[2] Ishihama, Y. J. Chromatogr. A 1067, 73-83 (2005). [PubMed]
[3] Krokhin, O. V. et al. Mol. Cell. Proteomics 3, 908-19 (2004). [PubMed]
[4] Petritis, K. et al. Anal. Chem. 75, 1039-48 (2003). [PubMed]
[5] Shen, Y. et al. Anal. Chem. 73, 1766-75 (2001). [PubMed]
[6] Shen, Y. et al. Anal. Chem. 73, 3011-21 (2001). [PubMed]

Peptide Type (for Mascot and Xome only)

Reporting the results from a search which includes multiple queries can be complex, because it is not always clear which peptide "belongs" to which protein. The use of red and bold typefaces is intended to highlight the most logical assignment of peptides to proteins. (from Mascot Userguide)

Bold / light: The first time a peptide match to a query appears in the report, it is shown in bold face.
Red / Black: A black peptide is a second (or lower) ranked candidate assigned to an MS/MS peak. Whenever the top-ranking peptide match appears, it is shown in red.
Checked / Unchecked: It is not always clear which peptide "belongs" to which protein. A "Checked" peptide is one whose peak is also assigned to other protein(s) as a "black" peptide.

This means that protein hits with peptide matches that are both bold and red ("Bold red") are the most likely assignments. By default, emPAI Calc counts "Bold red" and "Checked light red" peptides as observed peptides.
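
The default selection can be pictured as a simple filter on the peptide type label. The sketch below is illustrative only: the `(sequence, peptide_type)` tuples, the sequences, and the function name `select_observed` are all hypothetical, and only the two default type labels come from the text above.

```python
# Peptide types counted as observed by default, as stated in the help text.
DEFAULT_TYPES = {"Bold red", "Checked light red"}


def select_observed(matches, allowed_types=DEFAULT_TYPES):
    """Keep (sequence, peptide_type) pairs whose type is in `allowed_types`."""
    return [(seq, ptype) for seq, ptype in matches if ptype in allowed_types]


# Hypothetical peptide matches for one protein hit.
matches = [
    ("LVNELTEFAK", "Bold red"),
    ("YLYEIAR", "Light red"),
    ("AEFVEVTK", "Checked light red"),
    ("QTALVELLK", "Bold black"),
]

print([seq for seq, _ in select_observed(matches)])
# ['LVNELTEFAK', 'AEFVEVTK']
```

Passing a different `allowed_types` set corresponds to changing the peptide type selection on the search form.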

For more detail, refer to the Matrix Science website.

Peptide Filtering (for Mascot and Xome only)

Users can filter the peptides in their Mascot/Xome results by an arbitrary score threshold or by the "identity threshold", which indicates the reliability of a peptide identification.
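
The two filtering modes can be sketched as follows. This is a hedged illustration rather than emPAI Calc's implementation: the dictionary keys `score` and `identity_threshold`, the function name `filter_peptides`, and the use of "greater than or equal" as the comparison are all assumptions.

```python
def filter_peptides(matches, score_threshold=None):
    """Filter peptide matches by score.

    With `score_threshold` set, keep matches scoring at or above that fixed
    value. Otherwise keep matches scoring at or above their own identity
    threshold, which can differ from query to query.
    """
    kept = []
    for match in matches:
        if score_threshold is not None:
            cutoff = score_threshold
        else:
            cutoff = match["identity_threshold"]
        if match["score"] >= cutoff:
            kept.append(match)
    return kept


# Hypothetical matches for illustration.
matches = [
    {"sequence": "LVNELTEFAK", "score": 72.0, "identity_threshold": 41.0},
    {"sequence": "YLYEIAR", "score": 35.0, "identity_threshold": 40.0},
    {"sequence": "AEFVEVTK", "score": 44.0, "identity_threshold": 42.0},
]

print([m["sequence"] for m in filter_peptides(matches)])
# ['LVNELTEFAK', 'AEFVEVTK']
print([m["sequence"] for m in filter_peptides(matches, score_threshold=50)])
# ['LVNELTEFAK']
```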

Description of "score" and "identity threshold" (from the Matrix Science website)

In Mascot, the score for an MS/MS match is based on the absolute probability (P) that the observed match between the experimental data and the database sequence is a random event. The reported score is -10Log(P). So, during a search, if 1.5 x 10^5 peptides fell within the mass tolerance window about the precursor mass, and the significance threshold was chosen to be 0.05, (a 1 in 20 chance of a false positive), this would translate into a score threshold of 65.
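
The figure of 65 in the quoted example can be reproduced with a short calculation. The sketch assumes that the significance level is divided by the number of candidate peptides before the -10Log(P) transform, with Log taken as base 10. This is an interpretation of the quoted example, not Mascot's documented algorithm, and the function names are hypothetical.

```python
import math


def score_from_probability(p):
    """Convert a probability into a score, taking -10Log(P) as -10 * log10(P)."""
    return -10 * math.log10(p)


def identity_threshold(n_candidates, significance=0.05):
    """Score threshold for a given number of candidate peptides.

    Assumes the per-candidate probability cutoff is significance / n_candidates.
    """
    return score_from_probability(significance / n_candidates)


print(round(identity_threshold(1.5e5)))  # 65
```

Under this assumption, a larger pool of candidate peptides raises the threshold, since a random match of a given quality becomes more likely.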

If the quality of an MS/MS spectrum is poor, particularly if the signal to noise ratio is low, a match to the "correct" sequence might not exceed this absolute threshold. Even so, the match to the correct sequence could have a relatively high score, which is well differentiated from the quasi-normal distribution of 1.5 x 10^5 random scores. In other words, the score is an outlier. This would indicate that the match is not a random event and, on inspection, such matches are often found to be either the correct match or a match to a close homologue. For this reason, Mascot also attempts to characterise the distribution of random scores, and provide a second, lower threshold to highlight the presence of any outlier. The lower, relative threshold is reported as the "homology" threshold while the higher, absolute threshold is reported as the "identity threshold".

(C) 2007 Institute for Advanced Biosciences, Keio University