
In a previous blog I mentioned two examination timetabling datasets: the Carter dataset and the ITC (International Timetabling Competition) dataset. I thought it would be worthwhile describing these datasets for anybody who is interested, as they are the datasets that everybody in the scientific community uses to test proposed algorithms.
In this blog, I’ll discuss the Carter dataset and save the discussion of the ITC datasets for a later blog.

The Carter dataset was introduced in 1996 by Carter, Laporte and Lee in a paper published in the Journal of the Operational Research Society.

In this paper they made the observation that "One of the major drawbacks of most articles in the timetabling literature is that testing is limited to randomly generated problems and perhaps to one practical example."

They go on to say "In this study, in addition to running randomly generated problems, we carried out a comparison of thirteen unconstrained real-life examples."

The thirteen problems were taken from Canadian institutions (10), the London School of Economics (1), King Fahd University of Petroleum and Minerals (1) and Purdue University (1).

One of the main characteristics of these problems is that the objective we are trying to minimise is quite simple. We are trying to spread each student's examinations as widely as possible. That is, if a student sits one exam followed immediately by another, the penalty is greater than if the student has an exam, a rest period, and then another examination. Two rest periods are better still, and so on, until the student has five or more rest periods between the two exams, at which point no penalty is incurred.
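The penalty scheme just described can be sketched in a few lines of Python. The weights 16, 8, 4, 2, 1 are the ones conventionally used with these instances (a pair of exams s timeslots apart costs 2^(5 - s) per affected student); the function names are my own, and note that the published Carter cost is usually this sum divided by the total number of students.

```python
def proximity_penalty(slot_a: int, slot_b: int) -> int:
    """Penalty one student contributes for a pair of exams in the given timeslots."""
    gap = abs(slot_a - slot_b)  # gap = number of rest periods + 1
    if 1 <= gap <= 5:
        return 2 ** (5 - gap)   # 16, 8, 4, 2, 1 as the gap widens
    return 0                    # same slot is a hard clash; 5+ rest periods cost nothing

def total_penalty(student_schedules: list[list[int]]) -> int:
    """Sum the penalty over every pair of exams for every student."""
    total = 0
    for slots in student_schedules:
        for i in range(len(slots)):
            for j in range(i + 1, len(slots)):
                total += proximity_penalty(slots[i], slots[j])
    return total

# One student with exams in timeslots 1, 2 and 8:
# the back-to-back pair (1, 2) costs 16; the other pairs are far enough apart to be free.
print(total_penalty([[1, 2, 8]]))  # -> 16
```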

The formulation for the Carter dataset does not capture some aspects of real-world examination timetabling.

However, capturing all the complexities inherent in the real world was not the point of the 1996 paper.
The point was that, for the first time, Carter et al. provided the scientific community with a standard set of problems, and these thirteen instances have been the bedrock of examination timetabling research ever since.

If you decide to use this dataset, you should be aware that there has been some confusion over five of the instances, as two different versions of each were in circulation. The recent paper by Qu et al. provides the definitive source for the variants.

4 Responses

  1. Is it safe to say that the ITC version is the Carter++ version of the examination problem?
    Or does the Carter version have a constraint which the ITC version doesn't have?

    It looks like we're all looking at the same problems though, because the ITC examination problem is also an example in Drools Solver 🙂
    A very long time ago, I did a blog series about the UML domain model and the DRL constraint implementation I used:
    http://blog.athico.com/2008/02/solving-examination-problem-domain.html

    PS: I have nearly finished improving my benchmarker suite so that it outputs a graph of how the best score changes over time during solving. I'll blog the graphs of my examination implementation soon.

  2. It's not really for me to say if we can call the ITC dataset, Carter++.

    My personal point of view is "not really", as they are different instances and include more constraints.
    There is, of course, a relationship (they are both examination timetabling problems), but there are also many differences.

    I am not as familiar with the ITC dataset (yet) as I am with Carter, so I just need to take a look at it before I give a definitive response – but I will do so soon.

  3. .. sorry, just to follow up. As far as I am aware, they both have common hard constraints and both try to spread out the exams. So in this respect, ITC is an extension of Carter.