Understanding and Evaluating Generalisation for Superhuman AI Systems