(A) As described in Figure 1B and C and Materials and Methods, the 8mer (P4–P4’) 3Cpro polyprotein cleavage motif was initially generated from unique, concatenated 8mer cleavage sites across 796 enteroviral polyprotein sequences. To assess how well the motif captures both viral and host targets, it was then used in a low-threshold (p-value = 0.1) FIMO (MEME Suite) search across a training set of 2678 nonredundant enteroviral polyproteins from ViPR and 27 experimentally validated human targets of 3Cpro (Laitinen et al., 2016). In the graph, the X-axis represents the log10 of the p-value reported by FIMO, used as an indicator of the strength of the cleavage motif hit, or cleavage score. The left Y-axis depicts the number of uncalled ‘true positives’, that is, motif hits within the enteroviral polyprotein training set that overlap with the initial set of 8mer polyprotein cleavage sites used to generate the motif (black). The right Y-axis depicts the number of called false positive sites, that is, any motif hits in the training set of enteroviral polyprotein sequences that are not true positives (gray). (Above) Each line depicts a single experimentally validated enteroviral 3Cpro cleavage site within a human protein, as reported in Laitinen et al., 2016, and is ordered along the X-axis by its corresponding cleavage score. Vertical dotted lines mark the thresholds chosen for comparing capture capability. Thresholds that capture 95%, 99%, or 100% of true positives in the polyprotein dataset capture 4, 7, and 16 human hits, respectively.
(B) Pseudocounts in the position-specific scoring matrix of the motif shown in (A) were adjusted by total information content: the two most information-dense positions, P1 and P1’, were assigned a pseudocount of 0, the least information-dense position, P3, was assigned a pseudocount of 1, and the remaining positions were assigned a pseudocount value relative to the most information-dense position, P1. This optimized motif was then used in a FIMO search against the same training set as described in (A). Thresholds that capture 95%, 99%, or 100% of true positives in the polyprotein dataset capture 16, 23, and 24 human hits, respectively.
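The threshold comparison reported in both panels can be illustrated with a minimal Python sketch. It assumes the FIMO p-values of the true-positive polyprotein hits and of the validated human sites have already been collected into plain lists. The function names and all numeric values below are hypothetical placeholders for illustration, not data or code from the study.

```python
import math

def threshold_for_capture(true_positive_pvalues, capture_percent):
    """Return the smallest p-value cutoff that captures at least
    `capture_percent` (an integer, 1-100) of the true-positive hits."""
    ranked = sorted(true_positive_pvalues)
    # Ceiling division in integers avoids floating-point rounding surprises.
    needed = (capture_percent * len(ranked) + 99) // 100
    return ranked[max(needed, 1) - 1]

def count_captured(pvalues, cutoff):
    """Count hits whose p-value is at or below the cutoff."""
    return sum(1 for p in pvalues if p <= cutoff)

def cleavage_score(pvalue):
    """Cleavage score as plotted on the X-axis: log10 of the FIMO p-value."""
    return math.log10(pvalue)

# Hypothetical p-values, for illustration only.
polyprotein_true_positives = [1e-8, 1e-7, 1e-6, 1e-5, 1e-4,
                              1e-3, 1e-2, 2e-2, 5e-2, 9e-2]
human_validated_sites = [5e-7, 3e-4, 4e-2, 8e-2]

for percent in (50, 90, 100):
    cutoff = threshold_for_capture(polyprotein_true_positives, percent)
    captured = count_captured(human_validated_sites, cutoff)
    print(f"{percent}% capture: score cutoff {cleavage_score(cutoff):.2f}, "
          f"{captured} human hits")
```

Applying the same two functions to the hits from the original motif and from the pseudocount-adjusted motif would give the kind of side-by-side human-hit counts listed for (A) and (B).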