Genetics: Probing the phenomics of noncoding RNA

  John S Mattick (corresponding author)
  Garvan Institute of Medical Research, Australia

It has been known since the late 1970s that many DNA sequences are transcribed but not translated. Moreover, most protein-coding genes in mammals are fragmented, with only a small fraction of the primary RNA transcript being spliced together to form messenger RNA. For many years it was assumed that untranslated RNA molecules served no useful purpose but, starting in the mid-1990s, a small body of researchers, including the present author (Mattick, 1994), have been arguing that these RNAs transmit regulatory information, possibly associated with the emergence of multicellular organisms. This is supported by the observation that the proportion of noncoding genomic sequences broadly correlates with developmental complexity, reaching over 98% in mammals (Liu et al., 2013), although others have argued that the increase in genome size is due to the inefficiency of selection against non-functional elements as body size goes up and population size goes down (Lynch, 2007).

High-throughput sequencing analyses over the past decade have shown that the majority of the mammalian genome is transcribed, often from both strands, and have revealed an extraordinarily complex landscape of overlapping and interlacing sense and antisense, alternatively spliced, protein-coding and non-protein-coding RNAs, the latter generally referred to as long noncoding RNAs (lncRNAs). Moreover, the repertoire of these lncRNAs differs from one cell type to another (Carninci et al., 2005; Cheng et al., 2005; Birney et al., 2007; Mercer et al., 2012). While some transcripts may encode previously unrecognized small proteins, the function or otherwise of the vast majority of lncRNAs remains to be determined.

Because many lncRNAs appear to be expressed at low levels, and many have lower sequence conservation than messenger RNAs, one interpretation has been that these RNAs represent transcriptional noise from complex genomes cluttered with evolutionary debris. However, assessments of sequence conservation rely on assumptions about the non-functionality and representative distribution of reference sequences, which are not verified and cannot be directly tested (Pheasant and Mattick, 2007). Nonetheless, many lncRNAs show patches of relative sequence conservation (Derrien et al., 2012), and even more do so at the secondary structural level (Smith et al., 2013).

Expression analyses have shown that lncRNAs originate from all over the genome and are expressed at different times during differentiation and development (Dinger et al., 2008), often exhibiting highly cell-specific patterns (Mercer et al., 2008). The precision of lncRNA expression is consistent with evidence suggesting that many are associated with chromatin-modifying complexes, thereby acting as regulators of the epigenetic control of differentiation and development (Mercer and Mattick, 2013).

A number of lncRNAs have also been linked to complex diseases such as cancer (Mattick, 2009) and to other complex physiological processes (see, for example, Rapicavoli et al., 2013). However, these results seem at odds with the fact that few lncRNAs have been identified in traditional genetic screens. The reason for this is likely a combination of phenotypic, technical and expectational bias: mutations in protein-coding regions of the genome generally have phenotypes that are more severe, and are easier to identify, than those in non-coding regions. It is worth noting, however, that ∼95% of all variants associated with complex (as opposed to monogenic) diseases in humans map to non-coding, presumably regulatory, sequences (Freedman et al., 2011).

Still, the gold standard in this field is the targeted in vivo silencing or deletion of specific genes, and since few such experiments have been conducted to date, some researchers have remained sceptical about the biological significance of lncRNAs. Now, in eLife, John Rinn, Paolo Arlotta and co-workers at Harvard, MIT, the Broad Institute, Rutgers and Regeneron Pharmaceuticals (including Martin Sauvageau, Loyal Goff and Simona Lodato as joint first authors) report the results of the first large-scale attack on the question (Sauvageau et al., 2013). They selected 18 lncRNA genes in the mouse genome that had been stringently assessed for lack of protein-coding capacity and that did not overlap with known protein-coding genes or other known gene annotations (hence the name long intergenic noncoding RNAs, or lincRNAs) and generated knockout mouse mutants by replacing the lncRNA gene with a lacZ reporter cassette.

Sauvageau, Goff, Lodato et al. report discernible developmental problems in five of the 18 mutants, with three exhibiting embryonic or post-natal lethality, two of which exhibited growth defects in the survivors. The phenotypes of two of the mutants were analyzed in detail: one of the mutants that died showed defects in multiple organs (including the lung, heart and gastrointestinal tract), and one of the mutants that survived with growth defects also showed defects in the cerebral cortex. Other mutants that did not exhibit overt developmental defects showed brain-specific expression patterns and may be associated with cognitive defects that are not grossly apparent at the developmental level.

Another group (Grote et al., 2013) recently generated a different knockout allele for one of the 18 lincRNAs interrogated by Sauvageau et al., and also reported an embryonic lethal phenotype, albeit with some differences. Importantly, the approach used by Grote et al. also provided strong evidence that the mutant defects were not caused by an indirect effect on an overlapping genomic element, such as an enhancer for a nearby gene.

The work of Sauvageau, Goff, Lodato et al. is a mini tour-de-force that shows that there are lncRNAs with important developmental functions in vivo, and it joins a small number of studies from other pioneering groups that show the same thing (Lewejohann et al., 2004; Gutschner et al., 2013; Li et al., 2013), although not all of the targeted lncRNAs showed a phenotype. Similarly, other knockout experiments of widely expressed lncRNAs, as well as some of the most highly conserved elements in the mammalian genome, also did not yield discernible phenotypes (Ahituv et al., 2007; Nakagawa et al., 2011), which should sound a note of caution about the interpretation of negative results.

Indeed, since most lncRNAs are expressed in the brain (Mercer et al., 2008) and many are primate-specific (Derrien et al., 2012), it may be that much of the lncRNA-mediated genetic information in humans (and in mammals generally) is devoted to brain function, and therefore not easily detectable in developmental, as opposed to cognitive, screens. A good example is a noncoding RNA called BC1 that is widely expressed in the brain: knockout of BC1 causes no visible anatomical consequences, but it leads to a behavioural phenotype that would be lethal in the wild (Lewejohann et al., 2004).

Although evidence for the hypothesis that lncRNAs have a role in mammalian development, brain function and physiology is growing, there is also a clear need for more sophisticated and comprehensive phenotypic screens, especially with respect to cognitive function.

References

    Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H, Giresi PG, Goldy J, Hawrylycz M, Haydock A, Humbert R, James KD, Johnson BE, Johnson EM, Frum TT, Rosenzweig ER, Karnani N, Lee K, Lefebvre GC, Navas PA, Neri F, Parker SC, Sabo PJ, Sandstrom R, Shafer A, Vetrie D, Weaver M, Wilcox S, Yu M, Collins FS, Dekker J, Lieb JD, Tullius TD, Crawford GE, Sunyaev S, Noble WS, Dunham I, Denoeud F, Reymond A, Kapranov P, Rozowsky J, Zheng D, Castelo R, Frankish A, Harrow J, Ghosh S, Sandelin A, Hofacker IL, Baertsch R, Keefe D, Dike S, Cheng J, Hirsch HA, Sekinger EA, Lagarde J, Abril JF, Shahab A, Flamm C, Fried C, Hackermüller J, Hertel J, Lindemeyer M, Missal K, Tanzer A, Washietl S, Korbel J, Emanuelsson O, Pedersen JS, Holroyd N, Taylor R, Swarbreck D, Matthews N, Dickson MC, Thomas DJ, Weirauch MT, Gilbert J, Drenkow J, Bell I, Zhao X, Srinivasan KG, Sung WK, Ooi HS, Chiu KP, Foissac S, Alioto T, Brent M, Pachter L, Tress ML, Valencia A, Choo SW, Choo CY, Ucla C, Manzano C, Wyss C, Cheung E, Clark TG, Brown JB, Ganesh M, Patel S, Tammana H, Chrast J, Henrichsen CN, Kai C, Kawai J, Nagalakshmi U, Wu J, Lian Z, Lian J, Newburger P, Zhang X, Bickel P, Mattick JS, Carninci P, Hayashizaki Y, Weissman S, Hubbard T, Myers RM, Rogers J, Stadler PF, Lowe TM, Wei CL, Ruan Y, Struhl K, Gerstein M, Antonarakis SE, Fu Y, Green ED, Karaöz U, Siepel A, Taylor J, Liefer LA, Wetterstrand KA, Good PJ, Feingold EA, Guyer MS, Cooper GM, Asimenos G, Dewey CN, Hou M, Nikolaev S, Montoya-Burgos JI, Löytynoja A, Whelan S, Pardi F, Massingham T, Huang H, Zhang NR, Holmes I, Mullikin JC, Ureta-Vidal A, Paten B, Seringhaus M, Church D, Rosenbloom K, Kent WJ, Stone EA, Batzoglou S, Goldman N, Hardison RC, Haussler D, Miller W, Sidow A, Trinklein ND, Zhang ZD, Barrera L, Stuart R, King DC, Ameur A, Enroth S, Bieda MC, Kim J, Bhinge AA, Jiang N, Liu J, Yao F, Vega VB, Lee CW, Ng P, Shahab A, Yang A, Moqtaderi Z, Zhu Z, Xu X, Squazzo S, Oberley MJ, Inman D, Singer MA, Richmond TA, Munn KJ, Rada-Iglesias A, Wallerman O, Komorowski J, Fowler JC, Couttet P, Bruce AW, Dovey OM, Ellis PD, Langford CF, Nix DA, Euskirchen G, Hartman S, Urban AE, Kraus P, Van Calcar S, Heintzman N, Kim TH, Wang K, Qu C, Hon G, Luna R, Glass CK, Rosenfeld MG, Aldred SF, Cooper SJ, Halees A, Lin JM, Shulha HP, Zhang X, Xu M, Haidar JN, Yu Y, Ruan Y, Iyer VR, Green RD, Wadelius C, Farnham PJ, Ren B, Harte RA, Hinrichs AS, Trumbower H, Clawson H, Hillman-Jackson J, Zweig AS, Smith K, Thakkapallayil A, Barber G, Kuhn RM, Karolchik D, Armengol L, Bird CP, de Bakker PI, Kern AD, Lopez-Bigas N, Martin JD, Stranger BE, Woodroffe A, Davydov E, Dimas A, Eyras E, Hallgrímsdóttir IB, Huppert J, Zody MC, Abecasis GR, Estivill X, Bouffard GG, Guan X, Hansen NF, Idol JR, Maduro VV, Maskeri B, McDowell JC, Park M, Thomas PJ, Young AC, Blakesley RW, Muzny DM, Sodergren E, Wheeler DA, Worley KC, Jiang H, Weinstock GM, Gibbs RA, Graves T, Fulton R, Mardis ER, Wilson RK, Clamp M, Cuff J, Gnerre S, Jaffe DB, Chang JL, Lindblad-Toh K, Lander ES, Koriabine M, Nefedov M, Osoegawa K, Yoshinaga Y, Zhu B, de Jong PJ
    (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project
    Nature 447:799–816.
    https://doi.org/10.1038/nature05874

    Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, Kodzius R, Shimokawa K, Bajic VB, Brenner SE, Batalov S, Forrest AR, Zavolan M, Davis MJ, Wilming LG, Aidinis V, Allen JE, Ambesi-Impiombato A, Apweiler R, Aturaliya RN, Bailey TL, Bansal M, Baxter L, Beisel KW, Bersano T, Bono H, Chalk AM, Chiu KP, Choudhary V, Christoffels A, Clutterbuck DR, Crowe ML, Dalla E, Dalrymple BP, de Bono B, Della Gatta G, di Bernardo D, Down T, Engstrom P, Fagiolini M, Faulkner G, Fletcher CF, Fukushima T, Furuno M, Futaki S, Gariboldi M, Georgii-Hemming P, Gingeras TR, Gojobori T, Green RE, Gustincich S, Harbers M, Hayashi Y, Hensch TK, Hirokawa N, Hill D, Huminiecki L, Iacono M, Ikeo K, Iwama A, Ishikawa T, Jakt M, Kanapin A, Katoh M, Kawasawa Y, Kelso J, Kitamura H, Kitano H, Kollias G, Krishnan SP, Kruger A, Kummerfeld SK, Kurochkin IV, Lareau LF, Lazarevic D, Lipovich L, Liu J, Liuni S, McWilliam S, Madan Babu M, Madera M, Marchionni L, Matsuda H, Matsuzawa S, Miki H, Mignone F, Miyake S, Morris K, Mottagui-Tabar S, Mulder N, Nakano N, Nakauchi H, Ng P, Nilsson R, Nishiguchi S, Nishikawa S, Nori F, Ohara O, Okazaki Y, Orlando V, Pang KC, Pavan WJ, Pavesi G, Pesole G, Petrovsky N, Piazza S, Reed J, Reid JF, Ring BZ, Ringwald M, Rost B, Ruan Y, Salzberg SL, Sandelin A, Schneider C, Schönbach C, Sekiguchi K, Semple CA, Seno S, Sessa L, Sheng Y, Shibata Y, Shimada H, Shimada K, Silva D, Sinclair B, Sperling S, Stupka E, Sugiura K, Sultana R, Takenaka Y, Taki K, Tammoja K, Tan SL, Tang S, Taylor MS, Tegner J, Teichmann SA, Ueda HR, van Nimwegen E, Verardo R, Wei CL, Yagi K, Yamanishi H, Zabarovsky E, Zhu S, Zimmer A, Hide W, Bult C, Grimmond SM, Teasdale RD, Liu ET, Brusic V, Quackenbush J, Wahlestedt C, Mattick JS, Hume DA, Kai C, Sasaki D, Tomaru Y, Fukuda S, Kanamori-Katayama M, Suzuki M, Aoki J, Arakawa T, Iida J, Imamura K, Itoh M, Kato T, Kawaji H, Kawagashira N, Kawashima T, Kojima M, Kondo S, Konno H, Nakano K, Ninomiya N, Nishio T, Okada M, Plessy C, Shibata K, Shiraki T, Suzuki S, Tagami M, Waki K, Watahiki A, Okamura-Oho Y, Suzuki H, Kawai J, Hayashizaki Y, FANTOM Consortium, Genome Network Project Core Group
    (2005) The transcriptional landscape of the mammalian genome
    Science 309:1559–1563.
    https://doi.org/10.1126/science.1112014

    Lynch M
    (2007) The origins of genome architecture
    Sunderland, MA: Sinauer Associates.

    Mercer TR, Dinger ME, Sunkin SM, Mehler MF, Mattick JS
    (2008) Specific expression of long noncoding RNAs in the mouse brain
    Proceedings of the National Academy of Sciences of the United States of America 105:716–721.
    https://doi.org/10.1073/pnas.0706729105

Article and author information

Author details

  1. John S Mattick, Reviewing Editor

    St Vincent’s Clinical School and the School of Biotechnology and Biomolecular Sciences, Garvan Institute of Medical Research, Sydney, Australia
    For correspondence
    j.mattick@garvan.org.au
    Competing interests
    The author declares that no competing interests exist.

Publication history

  1. Version of Record published: December 31, 2013 (version 1)

Copyright

© 2013, Mattick

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. John S Mattick
(2013)
Genetics: Probing the phenomics of noncoding RNA
eLife 2:e01968.
https://doi.org/10.7554/eLife.01968
