How do I decide which identifications are correct?
After importing identifications for your features, you may find that some compounds have more than one possible identification. In the Review Compounds screen, the possible identifications for the selected compound are listed at the bottom-right. To confirm the compound's true identification, you need only click the star alongside the Compound ID, marking it as the accepted identification.
However, how do you decide which identification is the correct one? Here are a few considerations that can influence your decision:
Highest identification score
If your metabolite database assigns a score to each identification, where the highest score for a given compound indicates the most likely identification, that score appears in the Score column of the Possible identifications list.
You could simply accept the identification with the highest score, but even if you have score values, it's still recommended that you validate the identifications with some of the other considerations listed here.
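As a rough illustration of why the top score alone may not settle the matter, the sketch below ranks a compound's possible identifications and reports how far the best one is ahead of the runner-up. The compound IDs and scores are made-up placeholders, and the helper names are hypothetical; nothing here is part of the software itself.

```python
def rank_identifications(possible_ids):
    """Sort (compound_id, score) pairs from highest to lowest score."""
    return sorted(possible_ids, key=lambda pair: pair[1], reverse=True)

def top_margin(ranked):
    """Score gap between the best candidate and the runner-up.

    Returns None when there is only one candidate to consider.
    """
    if len(ranked) < 2:
        return None
    return ranked[0][1] - ranked[1][1]

# Placeholder values for illustration only.
possible_ids = [("candidate_B", 39.8), ("candidate_A", 41.2), ("candidate_C", 12.5)]
ranked = rank_identifications(possible_ids)
margin = top_margin(ranked)
```

A small margin between the first two candidates suggests that the score cannot separate them, which is when the other considerations listed here matter most.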
The presence of peptides
If you can safely ignore amino acids and peptides for your compound identifications, this may provide a way to drastically reduce the number of identifications under consideration for a given compound. This is especially true with peptide chains, as the same set of amino acids could be present in different orders to produce the same compound mass, as seen in this example:
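The point about residue order can also be checked with a short calculation. The sketch below assumes a linear peptide built from glycine, alanine and serine, uses standard monoisotopic residue masses, and shows that every ordering of the same residues gives the same neutral mass. The function names are illustrative only.

```python
from itertools import permutations
from math import fsum

# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}
WATER = 18.01056  # added once per linear peptide for the terminal H and OH

def peptide_mass(sequence):
    """Neutral monoisotopic mass of a linear peptide given as one-letter codes."""
    # fsum gives an exactly rounded sum, so the result does not depend on order.
    return round(fsum([RESIDUE_MASS[r] for r in sequence] + [WATER]), 5)

def same_mass_sequences(residues):
    """Map every distinct ordering of the residues to its mass."""
    return {"".join(p): peptide_mass(p) for p in permutations(residues)}

orderings = same_mass_sequences("GAS")
distinct_masses = set(orderings.values())
```

Here `orderings` holds six different tripeptides, while `distinct_masses` contains a single value, so a mass-based search cannot tell them apart.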
Mismatch in the features' abundance profiles
Typically, you would expect each adduct or charge state of a given compound to have similar abundance levels in the various runs of each experimental condition. For example, in an experiment with control and treated conditions, all adducts of a given compound might be more abundant in the control samples than in the treated samples.
However, if the compound comprised ions whose abundance profiles were different, that could indicate that one or more of the identifications is unreliable. The screen below, displayed by double-clicking a compound in the Review Compounds screen, shows one such situation:
Here, the compound under review is composed of two compound ions, the two adducts being a sodium ion and a potassium ion. By selecting both adducts' rows in the table, we have displayed their abundance profiles overlaid on the same graph. We can clearly see that they are not consistent. One adduct has a relatively high average abundance in Control whereas the other has a low abundance; and the same is true for the Treated condition.
Note that these profiles will be exactly the same, no matter which of the compound's identifications is selected in this screen. That's because the set of features listed isn't dependent on the identification; it's dependent on the compound itself. Consequently, all possible identifications for this compound — in other words, the compound as a whole — may have to be rejected.
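If you have exported the abundances, a simple consistency check along these lines is possible. The sketch below is only one way to do it: it averages each adduct's abundance per condition and tests whether both adducts change in the same direction between Control and Treated. The abundance values are invented for illustration and the function names are hypothetical.

```python
from statistics import mean

def condition_means(abundances, conditions):
    """Average a feature's abundance over the runs of each condition."""
    grouped = {}
    for value, condition in zip(abundances, conditions):
        grouped.setdefault(condition, []).append(value)
    return {c: mean(values) for c, values in grouped.items()}

def profiles_agree(profile_a, profile_b, conditions,
                   reference="Control", other="Treated"):
    """True if both features change in the same direction between conditions."""
    a = condition_means(profile_a, conditions)
    b = condition_means(profile_b, conditions)
    return (a[reference] > a[other]) == (b[reference] > b[other])

# Invented values: three Control runs followed by three Treated runs.
conditions = ["Control"] * 3 + ["Treated"] * 3
sodium_adduct = [9000, 8500, 9200, 2100, 1900, 2300]
potassium_adduct = [1500, 1700, 1400, 7800, 8100, 7600]

consistent = profiles_agree(sodium_adduct, potassium_adduct, conditions)
```

With these values `consistent` is `False`, mirroring the situation described above, where one adduct is high in Control and the other is low. A direction check is deliberately crude; a correlation across runs would be a stricter alternative.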
Adduct retention times that don't make sense
If one of your possible identifications is based on a pair of adducts (i.e. the compound is composed of two features) that you know should have the same retention time (e.g. Na+ and K+), but the features' retention times are significantly different, that can be a reason to reject the identification.
That exact scenario can be seen here:
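The same check can be written down as a small sketch. It assumes retention times in minutes and a tolerance of 0.1 min, which is an arbitrary illustrative value rather than a recommended setting; choose one that suits your chromatography. The retention times in the example are invented.

```python
def retention_times_consistent(rt_minutes, tolerance=0.1):
    """True if all adduct retention times fall within `tolerance` minutes.

    The default tolerance is an assumption for illustration only.
    """
    return max(rt_minutes) - min(rt_minutes) <= tolerance

# Invented retention times for the Na+ and K+ adducts of one compound.
adduct_rts = {"M+Na": 5.02, "M+K": 7.85}
co_eluting = retention_times_consistent(adduct_rts.values())
```

Here `co_eluting` is `False`, since the two adducts elute almost three minutes apart, which would be grounds for questioning the identification.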
Knowledge about the samples
Finally, if your knowledge of how the samples were obtained leads you to believe that certain compounds are unlikely to be present, you can use that to inform your decision about which identification is correct.