PSORT applies a set of rules to the amino acid sequence to generate an overall prediction of the protein localization site. These rules are derived from experimental observations. For example, when analysing a Gram-negative organism, the possible localization sites are the cytoplasm, cytoplasmic membrane, periplasm, outer membrane and extracellular space.
Blast2GO can assign subcellular localization sites to proteins based on their amino acid sequence via PSORTb. PSORTb is an algorithm for bacterial and archaeal protein sequences that uses a probabilistic system to predict the most probable localization. Once the sites are predicted, their corresponding cellular component GO terms can be merged with the existing Blast2GO annotations.
Starting from a previously loaded Blast2GO project with protein sequences, the PSORTb tool can be found under Analysis → Run PSORTb.
If the loaded project contains nucleotide sequences, the "Translate Longest ORF" tool can be used first to obtain the predicted protein sequences, which can then be analysed with PSORTb.
Figure 1. Run PSORTb in the Blast2GO menu.
Wizard and parameters
The wizard allows the algorithm parameters to be adjusted (Figure 2).
The tool performs different analyses depending on the Organism Type and the Gram Stain. It can be used with sequences from Gram-positive bacteria, Gram-negative bacteria or archaea. For more details on the core algorithm, visit psortb.org.
The algorithm returns a score between 0 and 10 for each localization site. The Cutoff parameter sets the minimum score above which a site is considered a possible localization.
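The three wizard parameters can be pictured as a small settings object. The following Python sketch is illustrative only and is not part of Blast2GO or PSORTb: the class name, the field names and the site labels are assumptions. The Gram-negative sites come from the introduction above, while the Gram-positive and archaeal site lists reflect how PSORTb 3 is commonly described and should be checked against psortb.org.

```python
from dataclasses import dataclass
from typing import List, Optional

# Candidate localization sites per analysis type.
# Labels are assumed names; only the Gram-negative list is stated in this manual.
SITES = {
    ("bacteria", "negative"): [
        "Cytoplasmic", "CytoplasmicMembrane", "Periplasmic",
        "OuterMembrane", "Extracellular",
    ],
    ("bacteria", "positive"): [
        "Cytoplasmic", "CytoplasmicMembrane", "Cellwall", "Extracellular",
    ],
    ("archaea", None): [
        "Cytoplasmic", "CytoplasmicMembrane", "Cellwall", "Extracellular",
    ],
}


@dataclass(frozen=True)
class PsortbParams:
    """Hypothetical container for the wizard settings."""
    organism_type: str         # "bacteria" or "archaea"
    gram_stain: Optional[str]  # "positive" or "negative"; None for archaea
    cutoff: float              # minimum score, on the 0 to 10 scale

    def candidate_sites(self) -> List[str]:
        """Validate the settings and return the sites that will be scored."""
        key = (self.organism_type, self.gram_stain)
        if key not in SITES:
            raise ValueError(f"unsupported combination: {key}")
        if not 0 <= self.cutoff <= 10:
            raise ValueError("cutoff must lie between 0 and 10")
        return SITES[key]


params = PsortbParams("bacteria", "negative", cutoff=7.5)
print(params.candidate_sites())
```

Keeping the site lists keyed by organism type and Gram stain makes it explicit that the choice of analysis determines which localizations can appear in the results.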
Figure 2. PSORTb wizard where the user can adjust the parameters.
The tool iterates over the input sequences and analyses each of them with PSORTb 3. The process opens a new tab and, as the results come back, they are shown in a table.
The table contains one row for each sequence. The table columns are:
- Sequence name: shows each sequence identifier.
- Final localization: contains the predicted localization name.
- Final score: represents the prediction score for the localization.
- GO ID: the Gene Ontology ID associated with the localization.
- Secondary Localization: a possible secondary localization when there is more than one score above the cutoff.
- The next 6 columns, hidden by default, show the score for all possible localizations.
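To make the relation between the per-site scores and the table columns concrete, here is a minimal Python sketch of how one row could be derived. It is not the tool's implementation. The function and class names are hypothetical, the scores are invented, and the comparison with the cutoff is assumed to be inclusive. The GO IDs are real cellular component terms, but the mapping Blast2GO actually applies may differ.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Illustrative site-to-GO mapping (assumption; the tool's own mapping may differ).
SITE_TO_GO = {
    "Cytoplasmic": "GO:0005737",          # cytoplasm
    "CytoplasmicMembrane": "GO:0005886",  # plasma membrane
    "Periplasmic": "GO:0042597",          # periplasmic space
    "OuterMembrane": "GO:0009279",        # cell outer membrane
    "Extracellular": "GO:0005576",        # extracellular region
}


@dataclass
class ResultRow:
    sequence_name: str
    final_localization: str
    final_score: float
    go_id: str
    secondary_localization: Optional[str]


def build_row(name: str, scores: Dict[str, float],
              cutoff: float) -> Optional[ResultRow]:
    """Rank sites by score and keep those reaching the cutoff.

    Returns None when no site passes; how the tool reports that case
    is not covered by this sketch.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    passing = [(site, score) for site, score in ranked if score >= cutoff]
    if not passing:
        return None
    best_site, best_score = passing[0]
    # A secondary localization exists only if a second site also passes.
    secondary = passing[1][0] if len(passing) > 1 else None
    return ResultRow(name, best_site, best_score,
                     SITE_TO_GO.get(best_site, ""), secondary)


# Invented scores for a single Gram-negative sequence.
example_scores = {
    "Cytoplasmic": 0.3,
    "CytoplasmicMembrane": 6.0,
    "Periplasmic": 3.2,
    "OuterMembrane": 0.3,
    "Extracellular": 0.2,
}
print(build_row("seq1", example_scores, cutoff=3.0))
```

With a cutoff of 3.0, two sites pass in this example, so the row carries both a final and a secondary localization. With a stricter cutoff, the same scores would yield no passing site.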
Figure 3. PSORTb results in a Blast2GO table.
Merge GO information
The GO IDs from the prediction can be merged into the original Blast2GO project as cellular component characterisation of the sequences.
The merge option is available in the right side-panel of the PSORTb results (Figure 3).
The merge wizard asks for the Blast2GO project file into which the GO results should be merged and adds the GO information to the project, matching entries by Sequence Name. Note: The initial Blast2GO project must be saved as a file before running the Merge GOs option.
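Conceptually, the merge is a join on the sequence name. The Python sketch below illustrates that idea with plain dictionaries. It does not read or write Blast2GO project files, and the function name, the data layout and the handling of unmatched names are assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def merge_go(project: Dict[str, Set[str]],
             predictions: Dict[str, str]) -> Tuple[Dict[str, Set[str]], List[str]]:
    """Add predicted GO IDs to existing annotations, matching by sequence name.

    project:     sequence name -> set of GO IDs already annotated
    predictions: sequence name -> GO ID predicted from the localization
    Returns the merged annotations and the names that had no match.
    """
    # Copy the sets so the original project annotations stay unchanged.
    merged = {name: set(go_ids) for name, go_ids in project.items()}
    unmatched = []
    for name, go_id in predictions.items():
        if name not in merged:
            unmatched.append(name)
        elif go_id:
            # Using a set avoids duplicating a term that is already present.
            merged[name].add(go_id)
    return merged, unmatched


# Hypothetical annotations and predictions.
project = {"seq1": {"GO:0003677"}, "seq2": set()}
predictions = {"seq1": "GO:0005886", "seq3": "GO:0005737"}

merged, unmatched = merge_go(project, predictions)
print(merged)
print(unmatched)
```

Because matching relies on the name alone, sequence identifiers in the PSORTb results need to be identical to those in the saved project for a prediction to be attached.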
For more information regarding PSORTb, visit the psortb.org documentation page.