### Estimating Confidence Intervals: Classifier Models M1 vs. M2

- Suppose we have 2 classifiers, M1 and M2. Which one is better?
- Use 10-fold cross-validation to obtain err'(M1) and err'(M2)
- These mean error rates are just *estimates* of error on the true population of *future* data cases
- What if the difference between the 2 error rates is just attributed to *chance*?
- Use a **test of statistical significance**
- Obtain **confidence limits** for our error estimates
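
The steps above can be sketched in code. This is a minimal illustration, not a definitive procedure: it assumes both classifiers were evaluated on the *same* 10 folds, and it uses a paired t-test as one common choice of significance test. The function name `paired_t_test` and the per-fold error rates are hypothetical and made up for illustration; the only fixed fact is the two-sided 5% critical value of Student's t for 9 degrees of freedom, 2.262.

```python
import math
import statistics

# Two-sided 5% critical value of Student's t with 9 degrees of freedom,
# i.e. the right value only when k = 10 folds are used.
T_CRIT_9DF_5PCT = 2.262


def paired_t_test(err_m1, err_m2, t_crit=T_CRIT_9DF_5PCT):
    """Paired t-test on per-fold error rates measured on the same CV folds.

    Returns the mean difference err(M1) - err(M2), the t statistic,
    confidence limits for the mean difference, and whether the difference
    is significant at the level implied by t_crit.
    """
    if len(err_m1) != len(err_m2):
        raise ValueError("both models need one error rate per fold")
    k = len(err_m1)
    diffs = [a - b for a, b in zip(err_m1, err_m2)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample std. dev. uses k - 1).
    std_err = statistics.stdev(diffs) / math.sqrt(k)
    if std_err == 0:
        raise ValueError("all per-fold differences are identical; t is undefined")
    t_stat = mean_diff / std_err
    half_width = t_crit * std_err
    return {
        "mean_diff": mean_diff,
        "t": t_stat,
        "ci": (mean_diff - half_width, mean_diff + half_width),
        "significant": abs(t_stat) > t_crit,
    }


# Hypothetical per-fold error rates (illustrative values, not real results).
err_m1 = [0.20, 0.22, 0.19, 0.21, 0.23, 0.20, 0.18, 0.22, 0.21, 0.20]
err_m2 = [0.17, 0.18, 0.18, 0.17, 0.19, 0.18, 0.16, 0.17, 0.19, 0.17]

result = paired_t_test(err_m1, err_m2)
print(result)
```

If the confidence limits for the mean difference exclude zero (equivalently, if |t| exceeds the critical value), the difference between M1 and M2 is unlikely to be due to chance alone at the 5% level. With a different number of folds or a different significance level, `t_crit` must be replaced by the matching value from a t-distribution table.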

