Data snooping

Data snooping is a form of statistical bias in which the data or the analysis is manipulated until it yields statistically significant results. Because every statistical test is probabilistic and carries some chance of a false positive, the more ways you slice or re-analyze the same data, the more likely you are to observe a significant result. Some of these results may reflect genuine effects, but others appear significant purely by chance.
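
To make the "by chance" part concrete, here is a minimal sketch of how quickly false positives accumulate. It assumes the tests are independent and that no real effect exists, so each p-value is uniformly distributed between 0 and 1. The function names, the 0.05 threshold, and the choice of 20 tests are illustrative assumptions, not something taken from the linked material.

```python
import random


def family_wise_error_rate(alpha, num_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_snooping(alpha, num_tests, trials=10_000, seed=0):
    """Estimate the same probability by simulation.

    Assumption: under the null hypothesis each p-value is uniform on [0, 1),
    so a uniform random draw stands in for running a real test.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One "study": run num_tests analyses and keep any that looks significant.
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(family_wise_error_rate(0.05, 20))  # analytic value, about 0.64
    print(simulate_snooping(0.05, 20))       # simulated estimate
```

With 20 analyses at a 0.05 threshold, the chance of at least one spurious "significant" result is roughly 64% under these assumptions, even though nothing real is there to find.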

[Blog] [Slides]

Data preprocessing

Nonsense input data produces nonsense output. Data preprocessing is a data mining technique used to transform raw data into a useful and efficient format.
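
As one possible illustration of such a transformation, the sketch below fills in missing values and rescales a numeric column. Mean imputation and z-score standardization are assumed here as example steps; the text above does not prescribe any particular method, and the function name `preprocess` is hypothetical.

```python
from statistics import mean, pstdev


def preprocess(values):
    """Impute missing entries (None) with the mean, then z-score standardize.

    Both steps are illustrative choices, not the only way to preprocess data.
    """
    observed = [float(v) for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")

    # Step 1: replace missing entries with the mean of the observed ones.
    fill = mean(observed)
    filled = [fill if v is None else float(v) for v in values]

    # Step 2: rescale to zero mean and unit (population) standard deviation.
    center = mean(filled)
    spread = pstdev(filled)
    if spread == 0:
        # A constant column carries no variation; map it to zeros.
        return [0.0 for _ in filled]
    return [(v - center) / spread for v in filled]


if __name__ == "__main__":
    print(preprocess([1.0, None, 3.0]))
```

The missing entry is replaced by the mean of the observed values, so after standardization it sits at zero, and the remaining values are expressed in units of standard deviation.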

[Slides]