Simulations. If the group difference were apparent in the features, the red and blue dots in the feature plot would be clearly separated. If the group difference were apparent in the kernels, the kernel matrices would show a checkerboard pattern with four squares: high similarity within the first half and within the second half of subjects, and low similarity between the first and the second half of subjects. Each kernel should also have its strongest similarity on the diagonal, because each subject should be more similar to themselves than to any other subject. In the second set of kernel plots, the diagonal is removed for visualisation purposes to show the group difference more clearly. a) Simulating two groups of subjects that differ in their state means. The error distributions over all 10 iterations show that the Fisher kernel recovers the simulated group difference in every run with 0% error (1). Features, kernel matrices, and kernel matrices with the diagonal removed are shown for the first iteration for the linear naïve kernel (2), the linear naïve normalised kernel (3), and the linear Fisher kernel (4). The Fisher kernel matrices show a clear checkerboard pattern corresponding to the within-group similarity and the between-group dissimilarity of the first and the second half of subjects. b) Simulating two groups of subjects that differ in their transition probabilities. None of the kernels reliably recovers the group difference, as shown by the error distributions over all 10 iterations (1) and by the features and kernel matrices of one example iteration (2-4). c) Simulating two groups of subjects that differ in their transition probabilities, but excluding state parameters when constructing the kernels. The Fisher kernel performs best in recovering the group difference, as shown by the error distributions over all 10 iterations (1).
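
The checkerboard pattern and the diagonal removal described in this caption can be illustrated with a minimal sketch. It uses toy Gaussian features with a group-specific mean shift and a plain linear kernel. It does not use the HMM-derived features, the naïve kernels, or the Fisher kernel from the simulations. The subject count, feature dimension, shift size, and all variable names are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects = 20   # hypothetical: 10 subjects per group
n_features = 50   # hypothetical feature dimension

# Group labels: first half of subjects is group 0, second half is group 1.
group = np.repeat([0, 1], n_subjects // 2)

# Toy features: each group has its own mean (+1 or -1) plus unit Gaussian noise.
shift = np.where(group == 0, 1.0, -1.0)[:, None]
X = shift + rng.normal(size=(n_subjects, n_features))

# Linear kernel: inner product between every pair of subjects.
K = X @ X.T

# Copy of the kernel with the diagonal masked out, as one might do before
# plotting so that self-similarity does not dominate the colour scale.
K_nodiag = K.copy()
np.fill_diagonal(K_nodiag, np.nan)

# Quantify the checkerboard: mean similarity within groups (excluding the
# diagonal) versus mean similarity between groups.
same_group = group[:, None] == group[None, :]
off_diag = ~np.eye(n_subjects, dtype=bool)
within_mean = K[same_group & off_diag].mean()
between_mean = K[~same_group].mean()

print(f"within-group mean similarity:  {within_mean:.2f}")
print(f"between-group mean similarity: {between_mean:.2f}")
```

When the group difference is present in the features, `within_mean` exceeds `between_mean`, and plotting `K_nodiag` as a heat map shows the four-square pattern. If the features carried no group information, the two means would be similar and no pattern would appear.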