Barcoded DNA from the THP1 cell line was mixed 1:1 with a random barcoded sample. Only the THP1 reads were analyzed to infer ‘pure’ matches, while the mixture was analyzed to characterize matching efficiency for contaminated samples. The match probability was inferred by comparing a MinION sketch to 1,099 reference files from the Cancer Cell Line Encyclopedia (CCLE) generated by the Broad Institute. (A) The posterior probability of an exact match between the MinION sketch of the ‘pure’ THP1 cell line (considering a single barcode) and each CCLE reference file, plotted as a function of sketching time and the number of SNPs analyzed. The red line indicates the THP1 reference file; the other cell lines are shown in grey. (B) 10,000 simulated sketching runs of the THP1 cell line were matched against its reference file. The number of SNPs needed to reach a 99.9% match probability (x-axis) is plotted against the number of runs in which it was observed (y-axis). (C) The posterior probability that the contaminated (50% mixed) sample matched THP1, plotted as a function of sketching time and the number of SNPs analyzed.
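To make the quantity on the y-axis of panels A and C concrete, the following is a minimal sketch of how a posterior match probability over a set of reference files could be updated one SNP at a time. It is a toy model and not the method used to generate the figure: it assumes a uniform prior over references, a single allele per site, and a fixed per-SNP `error_rate`. The function names, the `error_rate` value, and the toy references are all hypothetical and chosen only for illustration.

```python
import math


def update_log_posterior(log_post, observed_allele, reference_alleles, error_rate=0.1):
    """Apply one Bayesian update across all references for a single observed SNP.

    Assumption (toy model): a reference that carries the observed allele explains
    the observation with probability 1 - error_rate, otherwise with error_rate.
    """
    updated = []
    for lp, ref_allele in zip(log_post, reference_alleles):
        likelihood = (1.0 - error_rate) if ref_allele == observed_allele else error_rate
        updated.append(lp + math.log(likelihood))
    # Renormalize in log space (log-sum-exp) so the posteriors sum to 1.
    peak = max(updated)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in updated))
    return [v - log_norm for v in updated]


def snps_to_threshold(observations, references, threshold=0.999, error_rate=0.1):
    """Return (best_reference_index, snps_used) once any posterior reaches threshold.

    observations: list of (site_index, allele) pairs in the order they are read.
    references:   list of allele lists, one per reference file (hypothetical layout).
    Returns (None, len(observations)) if the threshold is never reached.
    """
    n = len(references)
    log_post = [math.log(1.0 / n)] * n  # uniform prior (assumption)
    for used, (site, allele) in enumerate(observations, start=1):
        alleles_at_site = [ref[site] for ref in references]
        log_post = update_log_posterior(log_post, allele, alleles_at_site, error_rate)
        best = max(range(n), key=lambda k: log_post[k])
        if math.exp(log_post[best]) >= threshold:
            return best, used
    return None, len(observations)


# Toy usage: three made-up references over 20 sites; the sample matches reference 0.
toy_references = [
    ["A"] * 20,
    ["B"] * 20,
    ["A" if i % 2 == 0 else "B" for i in range(20)],
]
toy_observations = [(i, "A") for i in range(20)]
best_index, snps_used = snps_to_threshold(toy_observations, toy_references)
```

Repeating `snps_to_threshold` over many simulated observation orders and tallying `snps_used` would give a histogram of the kind described for panel B, under the same toy assumptions.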