The approaches we rejected during development of Progenesis SameSpots
During the development of SameSpots, several approaches to the missing values problem were investigated and rejected.
Remove all spots that aren't fully matched
This is the easiest way to get rid of any missing values, but doing so throws away too much data. For example:
Including unmatched spots
After removing unmatched spots
This method (known as 'listwise deletion') is used by some competing proteomics packages, but we found it wasn't good enough for us. It also means you get less data to work with as you add more replicates, because every extra gel is another chance for a spot to go unmatched and be removed.
Removing spots that aren't fully matched assumes that the removed spots are a relatively small proportion of the entire dataset and are representative of it; that is, that spots are unmatched completely at random. In some cases, however, missing values are indicative of some pattern and cannot safely be assumed to reflect randomness. In such circumstances, removing them can introduce substantial bias and an unacceptable loss of power.
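To get a feel for how quickly listwise deletion eats into a dataset, here is a minimal sketch. It assumes, purely for illustration, that every spot is matched on each gel independently with the same probability (95% here); the function name and the numbers are hypothetical, and real experiments will not behave this neatly.

```python
def fraction_fully_matched(match_probability: float, n_gels: int) -> float:
    """Expected fraction of spots matched on every one of n_gels gels.

    Assumes each spot is matched independently on each gel with the
    same probability (an illustrative simplification).
    """
    return match_probability ** n_gels


# Assumed per-gel match rate of 95%; only fully matched spots survive deletion.
for n_gels in (2, 4, 8, 12):
    kept = fraction_fully_matched(0.95, n_gels)
    print(f"{n_gels:>2} gels: {kept:.0%} of spots survive listwise deletion")
```

Under these assumed numbers, only about 54% of spots would remain fully matched across 12 gels, even though each individual gel matches 95% of them.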
REJECTED: lose too much of your valuable data
Replace the missing values with zero
Replacing missing values with an estimated value (imputation) is always going to be less accurate than measuring the real value. A naive approach of simply replacing the missing values with zero can cause more problems than it solves by distorting estimates, standard errors and hypothesis tests.
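The distortion is easy to see on a small made-up example. The sketch below uses hypothetical normalised volumes for a single spot across five replicate gels, two of which are unmatched, and compares the summary statistics of the observed values with those of the zero-filled series.

```python
from statistics import mean, stdev

# Hypothetical normalised volumes for one spot across five replicate gels;
# None marks a gel where the spot was not matched.
volumes = [1.02, 0.98, None, 1.05, None]

observed = [v for v in volumes if v is not None]
zero_filled = [0.0 if v is None else v for v in volumes]

# Zero-filling drags the mean down and inflates the spread.
print(f"observed only: mean={mean(observed):.2f}, sd={stdev(observed):.2f}")
print(f"zero-filled:   mean={mean(zero_filled):.2f}, sd={stdev(zero_filled):.2f}")
```

Any test built on the zero-filled mean and standard deviation inherits that distortion.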
REJECTED: data bias likely, more levels of assumption
Replace the missing values using a statistical model
Instead of replacing the missing values with zero, we could look at the surrounding data to help us estimate the real value. However, the model used will at best be only approximately true and could still bias your results. This method is also likely to mask real changes, because the missing values will be replaced by nearby values - making it look as though there is no change in the spot series.
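The masking effect can be illustrated with a deliberately simple stand-in for a statistical model: filling each gap with the mean of the values that were observed. The series below is hypothetical, with the spot unmatched on every treated gel; real imputation models are more sophisticated, but they lean on the surrounding data in the same way.

```python
from statistics import mean

# Hypothetical spot series: three control gels, then three treated gels.
# The spot is unmatched (None) on every treated gel.
series = [1.00, 1.10, 0.90, None, None, None]

# Stand-in "model": fill each gap with the mean of the observed values.
observed_mean = mean([v for v in series if v is not None])
imputed = [observed_mean if v is None else v for v in series]

control, treated = imputed[:3], imputed[3:]
fold_change = mean(treated) / mean(control)
print(f"fold change after imputation: {fold_change:.2f}")
```

Whatever really happened to the spot on the treated gels, the imputed series reports a fold change of 1, i.e. no change.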
REJECTED: data bias, masking likely
What does Progenesis SameSpots do?
In a 2D experiment, the gels typically don't have holes in them: the data isn't really 'missing'; we just need to be able to measure it. By using our unique approach of first aligning your gels, then using a single representative spot pattern for the whole experiment, we have a valid measurement for every spot on every image. This completely avoids the problems with missing values.
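The idea of one spot pattern shared by every image can be sketched with toy data. This is not the SameSpots algorithm: the alignment step is omitted entirely (the images are assumed to be already aligned), and the image grids, spot outlines and the measure function are all hypothetical. It only shows why a shared pattern leaves no gaps.

```python
# Toy aligned images: 2D grids of pixel intensities with identical geometry.
images = {
    "gel_A": [[0, 5, 5, 0], [0, 5, 5, 0], [1, 0, 0, 9]],
    "gel_B": [[0, 1, 1, 0], [0, 1, 0, 0], [1, 0, 0, 8]],
}

# One shared spot pattern: each spot is a set of (row, column) pixels.
spot_pattern = {
    "spot_1": {(0, 1), (0, 2), (1, 1), (1, 2)},
    "spot_2": {(2, 3)},
}


def measure(image, pattern):
    """Sum the pixel intensity inside every spot outline of the shared pattern."""
    return {
        spot: sum(image[r][c] for r, c in pixels)
        for spot, pixels in pattern.items()
    }


# Every image is measured with the same outlines, so every spot gets a value.
volumes = {name: measure(image, spot_pattern) for name, image in images.items()}
for name, spots in volumes.items():
    print(name, spots)
```

Because the same outlines are applied to every image, a faint spot yields a small measured value rather than a missing one.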


