Design of Experiments 101: Cross Validation

What is an experiment?

An experiment is a procedure that you perform in order to validate (or to reject) your hypothesis.

Your hypothesis might be that the selection strategy, the classifier (or regressor), or a smart combination of those that you developed performs better than others. Or maybe you just want to let your approaches loose in the wild (on your data) and assess the results.

For the sake of simplicity, let’s assume that you have a paradigm H (your hypothesis), a data set X, and a performance measure E (this is how you assess the performance of your approach numerically; e.g. classification accuracy).

The following approach works for supervised learning too, not just for active learning.

A simple example

The main idea behind design of experiments is:

the design of the experiment is similar to a contest.

The Contest: Alice has a dataset consisting of 100 data points and wants to know whether Bob or Carl is the better data scientist. So, she gives Bob and Carl 75 data points and asks each of them to provide the best model they can achieve. After that, Alice will compare both models on the 25 data points she held back.

The Optimization: Now, both data scientists try to find the best parameters for their model. They also split the data: 60 for training and 15 for validation. After training several models with different parameters on the 60 data points, each of them chooses the model which performed best on the remaining 15 data points.

The Comparison: Finally, Alice evaluates the final models of both data scientists on her held-out data. Bob wins if his model performs better; otherwise, Carl wins.

Our terminology

In the following, we use these terms to describe the different kinds of subsets (see also wikipedia):

  • Outer training set: the data Bob and Carl are given by Alice to find their best approach (75 data points)
  • Outer test set (often: test or evaluation set): the data Alice held back to test Bob’s and Carl’s approach (25 data points)
  • Inner training set (often: training set): the data Bob and Carl used to train a model with specific parameters of their approach (60 data points)
  • Inner test set (often: validation set): the data Bob and Carl used to determine the best parameter set (15 data points)
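The four subsets can be sketched as plain index lists. This is only a minimal illustration, assuming a random split; the function name `contest_split` and the seed are hypothetical and not part of the original contest.

```python
import random

def contest_split(n_points=100, n_outer_test=25, n_inner_test=15, seed=0):
    """Split indices into Alice's held-back set and Bob's/Carl's inner sets."""
    rng = random.Random(seed)
    indices = list(range(n_points))
    rng.shuffle(indices)
    outer_test = indices[:n_outer_test]        # 25 points Alice holds back
    outer_train = indices[n_outer_test:]       # 75 points given to Bob and Carl
    inner_test = outer_train[:n_inner_test]    # 15 points for validation
    inner_train = outer_train[n_inner_test:]   # 60 points for training
    return outer_train, outer_test, inner_train, inner_test

outer_train, outer_test, inner_train, inner_test = contest_split()
print(len(outer_train), len(outer_test), len(inner_train), len(inner_test))
# prints: 75 25 60 15
```

Note that the inner sets are carved out of the outer training set only, so the outer test set never takes part in training or tuning.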

How can Bob and Carl do better (improve the generalization of their training procedure)?

So far, both data scientists had just one fixed training set (inner training set) and one validation set (inner test set). By chance, this particular validation set could be especially difficult for one parameter setting and easy for another. Hence, we should ensure that every instance is used for testing once.

In k-fold cross validation, the data given by Alice (75 data points) is split into \(k=5\) folds, i.e., 5 subsets with 15 instances each. To predict the labels of the first fold, the data from folds 2, 3, 4, and 5 is used for training. For the second fold, the algorithm is trained on folds 1, 3, 4, and 5, and so on. Because every instance is used for validation exactly once, the performance estimate depends less on one lucky or unlucky split. Hence, it is more probable that the parameter setting which performed best actually is the best for the given data.
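The fold construction described here might look as follows. This is a sketch under the assumption that folds are formed by shuffling indices and dealing them out in turn; `k_fold_indices` is a hypothetical helper name.

```python
import random

def k_fold_indices(n_points, k, seed=0):
    """Return k (train, test) index pairs; each index is in a test fold exactly once."""
    rng = random.Random(seed)
    indices = list(range(n_points))
    rng.shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_indices(n_points=75, k=5)
for train, test in splits:
    print(len(train), len(test))  # prints "60 15" five times
```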

But now a problem occurs: for the best parameter setting, each data scientist has 5 different models because of the k-fold cross validation. As Carl did not know what to do, he chose one at random. Bob had a better idea: he took the parameter setting he had found to be best and trained the model on all the data he was given.
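Bob's idea can be illustrated with a toy example. Everything below is an assumption made for illustration: the synthetic one-dimensional data, the k-nearest-neighbour classifier, and the candidate values for its parameter are hypothetical stand-ins for whatever model Bob actually tunes.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (labels are 0 or 1)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in neighbours)
    return 1 if 2 * votes > len(neighbours) else 0

def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)

def cv_accuracy(data, k, n_folds=5):
    """Mean validation accuracy of parameter k over n_folds folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i in range(n_folds):
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        scores.append(accuracy(train, folds[i], k))
    return sum(scores) / n_folds

# Hypothetical toy data: 75 points, label is 1 if x > 0.5, with 10% label noise.
rng = random.Random(0)
data = []
for _ in range(75):
    x = rng.uniform(0, 1)
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

candidates = [1, 3, 5, 7, 9]  # assumed parameter grid (odd values avoid ties)
best_k = max(candidates, key=lambda k: cv_accuracy(data, k))

def final_model(x):
    """Bob's final model: the best parameter, trained on all 75 points."""
    return knn_predict(data, x, best_k)

print(best_k, final_model(0.9))
```

The five models trained during cross validation are only used to pick `best_k`; the model that Bob hands to Alice is the one fitted on the complete outer training set.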

How can Alice do better?

Alice faces a situation similar to Bob’s and Carl’s. Maybe someone just got lucky, or the selection of training and test instances happened to favor one of the competitors. Hence, Alice also performs k-fold cross validation (here \(k=4\)): Bob and Carl are asked to provide 4 different models, one per outer training set, and Alice checks whether the results are consistent.

To be even more certain, she calculates only one performance value per k-fold cross validation. Then she repeats the selection of instances multiple times to make sure that the results are not due to chance.
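A sketch of Alice's repeated evaluation could look like this. It assumes that the single performance value per cross validation is the accuracy pooled over all four test folds; the nearest-class-mean classifier and the synthetic data are hypothetical placeholders for a competitor's approach.

```python
import random
import statistics

def fit(train):
    """Toy classifier (assumption): store the mean x value of each class."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs)
    return means

def predict(means, x):
    return min(means, key=lambda label: abs(x - means[label]))

def repeated_cv(data, n_folds=4, n_repeats=10, seed=0):
    """One pooled accuracy per repetition of n_folds-fold cross validation."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)  # new selection of instances for each repetition
        folds = [shuffled[i::n_folds] for i in range(n_folds)]
        hits = 0
        for i in range(n_folds):
            train = [p for j, f in enumerate(folds) if j != i for p in f]
            model = fit(train)
            hits += sum(predict(model, x) == y for x, y in folds[i])
        scores.append(hits / len(data))  # one value per cross validation
    return scores

# Hypothetical toy data: 100 points, label is 1 if x > 0.5, with 10% label noise.
rng = random.Random(1)
data = []
for _ in range(100):
    x = rng.uniform(0, 1)
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

scores = repeated_cv(data)
print(f"mean={statistics.mean(scores):.3f} std={statistics.stdev(scores):.3f}")
```

A small standard deviation across repetitions suggests that the ranking does not hinge on one particular split.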

Summary: How do you split your data?

The main idea of cross validation is to prevent the model from seeing the test data during training. This means that the test data has been used neither for training nor for tuning. If we want to rank different algorithms with their best parameter settings, we need two-stage (nested) cross validation: algorithm selection is the outer cross validation, and on each outer training set we perform a separate inner cross validation. More details can be found in the Wikipedia pages mentioned above.
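Putting both stages together, the two-stage procedure might be sketched as follows. As before, the k-nearest-neighbour classifier, its parameter grid, and the synthetic data are assumptions chosen only to keep the example self-contained.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (labels are 0 or 1)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return 1 if 2 * sum(y for _, y in neighbours) > len(neighbours) else 0

def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)

def make_folds(data, n_folds):
    return [data[i::n_folds] for i in range(n_folds)]

def all_but(folds, i):
    return [p for j, f in enumerate(folds) if j != i for p in f]

def select_k(outer_train, candidates, n_inner=5):
    """Inner cross validation: tune the parameter on the outer training set only."""
    inner = make_folds(outer_train, n_inner)
    def mean_score(k):
        return sum(accuracy(all_but(inner, i), inner[i], k)
                   for i in range(n_inner)) / n_inner
    return max(candidates, key=mean_score)

def nested_cv(data, candidates, n_outer=4):
    """Outer cross validation: evaluate the tuned approach on unseen folds."""
    outer = make_folds(data, n_outer)
    results = []
    for i in range(n_outer):
        outer_train, outer_test = all_but(outer, i), outer[i]
        best_k = select_k(outer_train, candidates)  # never sees outer_test
        # Refit on the whole outer training set, then test on the held-out fold.
        results.append((best_k, accuracy(outer_train, outer_test, best_k)))
    return results

# Hypothetical toy data: 100 points, label is 1 if x > 0.5, with 10% label noise.
rng = random.Random(2)
data = []
for _ in range(100):
    x = rng.uniform(0, 1)
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

results = nested_cv(data, candidates=[1, 3, 5, 7, 9])
for best_k, score in results:
    print(best_k, round(score, 3))
```

With 100 points, each outer split reproduces the numbers from the contest: 75 points for the outer training set, 25 for the outer test set, and inner folds of 15 points each.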

If you are interested in how to evaluate active learning algorithms, please see the paper:
Challenges of Reliable, Realistic and Comparable Active Learning Evaluation by Kottke, Calma et al.
