Why is fragmentation data required to be in a separate file?
As explained in the FAQ "What are fragment databases?", Progenesis QI requires you to use a separate fragment database (MSP file) to store the fragmentation data, in addition to the CSV or SDF database file containing compound properties like neutral mass and retention time.
There are two main reasons why this data is required to be in a separate file, rather than being added to the SDF or CSV file:
1. Ease of updating externally sourced compound databases
Suppose you download an SDF file from an external source (e.g. MassBank) which contains no fragmentation data, and then proceed to augment it with fragmentation patterns you have observed in your experiments.
Now suppose that MassBank releases a new version of the SDF file. Updating to this SDF would require transplanting the fragmentation data from the old SDF into the new one, either manually or using a script; either way, this is a time-consuming task.
By storing your fragmentation data in a separate MSP file, which links to the SDF via compound IDs, you can update the SDF by simply downloading the new version and replacing the old one. Neither the MSP nor the updated SDF needs to be edited, since they can still be linked up using compound IDs.
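The linking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries stand in for the parsed SDF and MSP files, and the IDs, field names and numeric values are hypothetical, not Progenesis QI's actual file formats or internals.

```python
# Hypothetical stand-in for a parsed SDF: compound properties keyed by compound ID.
old_sdf = {
    "CMPD001": {"name": "Compound A", "neutral_mass": 194.0804},
    "CMPD002": {"name": "Compound B", "neutral_mass": 180.0647},
}

# Hypothetical stand-in for a parsed MSP: each record holds a compound ID
# plus (m/z, intensity) peaks. The values are illustrative only.
msp_records = [
    {"compound_id": "CMPD001", "peaks": [(138.07, 100.0), (110.07, 25.0)]},
]


def link(sdf, msp):
    """Pair each fragment record with its compound by ID, skipping unmatched IDs."""
    return [(sdf[r["compound_id"]], r) for r in msp if r["compound_id"] in sdf]


# A newer release of the compound database: same IDs, one revised and one new entry.
new_sdf = dict(old_sdf)
new_sdf["CMPD001"] = {"name": "Compound A", "neutral_mass": 194.0804, "retention_time": 3.2}
new_sdf["CMPD003"] = {"name": "Compound C", "neutral_mass": 210.0753}

linked_old = link(old_sdf, msp_records)
linked_new = link(new_sdf, msp_records)  # msp_records is reused unchanged
```

Swapping `old_sdf` for `new_sdf` requires no change to `msp_records`, because the only thing the two sides share is the compound ID.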
2. Difference in fragmentation patterns of adducted forms
Different adducted forms of a compound may produce different fragmentation patterns.
However, SDF files only describe the neutral compound, so storing fragmentation patterns for different adducts in them would be problematic.
With a separate MSP file, it is much easier to store fragmentation data for multiple adducts of the same compound: a record is added in the MSP file for each adduct form.
Progenesis QI can then easily choose the fragmentation data for the correct adducted form when doing a compound search.
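As a rough illustration of one record per adduct form, the sketch below indexes fragment records by compound ID and adduct. The record layout, adduct labels and peak values are hypothetical and chosen only for the example; this is not how Progenesis QI is known to implement its search.

```python
# Hypothetical fragment records: one per (compound ID, adduct) pair.
# Peaks are (m/z, intensity) tuples with illustrative values only.
fragment_records = [
    {"compound_id": "CMPD001", "adduct": "M+H", "peaks": [(138.07, 100.0), (110.07, 30.0)]},
    {"compound_id": "CMPD001", "adduct": "M+Na", "peaks": [(160.05, 100.0)]},
]

# Index the records so that each adduct form of a compound has its own entry.
index = {(r["compound_id"], r["adduct"]): r["peaks"] for r in fragment_records}


def fragments_for(compound_id, adduct):
    """Return the stored peaks for this adduct form, or None if there are none."""
    return index.get((compound_id, adduct))


protonated = fragments_for("CMPD001", "M+H")
sodiated = fragments_for("CMPD001", "M+Na")
missing = fragments_for("CMPD001", "M+K")  # no record stored for this adduct
```

Keying on both the compound ID and the adduct is what lets the same compound carry several distinct fragmentation patterns without touching the compound database itself.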