(A) Consensus sequence around all the identified TSSs, generated using the WebLogo algorithm (Crooks et al., 2004). The site of transcription initiation (the TSS) is labeled +1 and the preceding nucleotide position is labeled −1. (B) Percentage of mismatches between the genomic sequence and the first base (red squares) or all the other bases (blue dots) of the complementary DNAs (cDNAs) after alignment of the TSS reads to the genome. A, G, C, and T on the bottom line correspond to the genomic sequence. A high percentage of mismatches (∼60%) was observed specifically at the first cDNA position when a pyrimidine is encoded in the genome at the corresponding location. In 80% of the cases, the mismatch was a pyrimidine-to-A mismatch. (C) As in (A), but taking into account only the TSSs for which a pyrimidine-to-A mismatch at the 5′-end of the cDNAs was observed. The vertical arrows pointing to an A indicate that the +1 nucleotide position, a pyrimidine in the genome, corresponds to a mismatched A at the first position of the cDNA. Up to 30% of the cDNA 5′-ends mapping to a pyrimidine conform to this consensus. Note that the conserved A lies at position −7 here, whereas it lies at position −8 in the general consensus; together with the pyrimidine-to-A mismatch, this suggests that transcription may start not at the pyrimidine but at the following A. Indeed, if the base at the 5′-end of the cDNA reads were removed and the reads realigned to the genome, they would conform perfectly to the general consensus, including the A at position −8, suggesting that an extra A is added at the 5′-end of the transcript during its synthesis. (D) Schematic model depicting how a one-nucleotide ‘backward shift’ relative to the template, occurring during transcription of the first three consecutive As, might lead to the incorporation of four As at the beginning of the transcript.
This scenario is supported by the fact that for TSSs mapped at PyAAA sequences, the corresponding transcripts carried three or four As at about equal frequencies, which suggests that this ‘one nucleotide back-shifting’ phenomenon occurs about half of the time when transcription initiates on this particular motif.
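The realignment argument in (C) and (D) can be illustrated with a minimal Python sketch. The genome and read sequences, the coordinates, and the helper names `mismatches` and `trim_and_shift` are hypothetical and chosen only for illustration; they are not taken from the data or from the analysis pipeline used in the study.

```python
def mismatches(read, genome, start):
    """Return the 0-based read positions that differ from the genome
    when the read is aligned, without gaps, at genome index `start`."""
    window = genome[start:start + len(read)]
    return [i for i, (r, g) in enumerate(zip(read, window)) if r != g]


def trim_and_shift(read, start):
    """Remove the 5' base of the read and move the alignment start
    one base downstream, as proposed in the legend for panel (C)."""
    return read[1:], start + 1


# Hypothetical example: a pyrimidine (C, index 3) followed by AAA.
genome = "GGTCAAAGTC"
# Hypothetical read carrying four As, as expected after a
# one-nucleotide backward shift during initiation on the three As.
read = "AAAAGTC"
start = 3  # the 5' end of the read maps onto the pyrimidine

# Aligned as is, only the first read base mismatches (C in genome, A in read).
print(mismatches(read, genome, start))

# After trimming the 5' base, the read matches the genome perfectly,
# with its first base on the A that follows the pyrimidine.
trimmed, new_start = trim_and_shift(read, start)
print(mismatches(trimmed, genome, new_start))
```

In this toy case the untrimmed read shows a single pyrimidine-to-A mismatch at its first position, while the trimmed read aligns without mismatch one base downstream, which is the pattern the model in (D) would predict for a transcript carrying four As.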