Introduction

Most teleost fishes have a stage-structured life cycle that includes a transition between larval and juvenile phases known as metamorphosis; this transition is regulated by thyroid hormones (TH) [1, 2]. Of all teleost fishes, flatfishes experience one of the most extreme metamorphosis, with significant changes occurring in their body organization and appearance during this period, switching from a symmetrical to an asymmetrical body plan [3, 4]. However, metamorphic changes are not always as pronounced in other fish species. For example, the metamorphosis of zebrafish is mainly marked by relatively discrete pigmentation changes that appear to be regulated by TH [58].

Metamorphosis in teleost fishes is not only marked by visible changes in the body, but also by a range of ecological, physiological, biochemical, and behavioral changes. These changes are thought to be initiated and coordinated by a surge of TH, which regulates various signaling pathways through the action of specific transcription factors known as thyroid hormone receptors (TRα. TRβ). For example, there is evidence that TH is associated with the transition between oceanic and coral reef environments in the convict surgeonfish[9], controls pigmentation changes in zebrafish, clownfish, and grouper [10, 11], regulates ossification processes in zebrafish and flatfishes [12, 13] and is involved in the shift of visual perception by controlling the expression of opsin genes in many species [14, 15]. More recently, it has also been suggested that the metabolic changes that occur during larval development in teleosts may be regulated by TH, as demonstrated in clownfish [14]. Of note in some cases, like groupers (elongated spines) [1618] or carapids (vexillum appendage) [19], some changes occur very early on and are considered as temporary specialization of the pelagic larval stages, serving as anti-predator defense [20], flotation [21] or camouflage [22]. It is still unclear if these are late developmental processes occurring after hatching or early manifestations of metamorphosis.

Beside TH signaling pathway, other actors have been shown to be important in metamorphosis regulation. For example, studies have provided clear evidence that corticoids and TH are interacting together to regulate amphibian metamorphosis [2325]. But as far as we know, there is limited information available regarding the interaction between corticoids and TH during fish metamorphosis. Although a synergistic effect of cortisol and TH has been observed in flatfish metamorphosis (advancement of morphological changes), there has been insufficient investigation into the communication between corticoids and TH pathways during teleost metamorphosis [26, 27]. More research is needed, and the use of genomic analysis would be a good way to investigate which pathways are associated with early larval development and metamorphosis.

The use of high-throughput sequencing techniques, such as transcriptomics, has made it possible to study gene expression in greater detail, especially when combined with a high-quality annotated genome, which has enabled the identification of genes that may be involved in the key biological changes that occur during metamorphosis. These techniques have provided valuable insights into the underlying molecular mechanisms that drive metamorphosis in teleost fishes [28]. Most of the studies investigating the transcriptomic changes during marine fish larval development have been focused on commercial fish species used in aquaculture to: (i) gain insight into the key biological processes that occur, (ii) identify the genes involved in these processes, and (iii) find ways to improve rearing conditions to ensure high survival rates and harmonious development [28]. However, these studies rarely mention metamorphosis to explain the onset of the various processes occurring during the transition between larval and juvenile stages. This is another reason why studying the molecular changes occurring during the larval development of the Malabar grouper Epinephelus malabaricus is very relevant. In addition, as mentioned above, groupers display elongate appendages during early larval development that disappear over time, providing an interesting way to study the development of these enigmatic structures [16, 29].

Our study will thus allow for a better understanding of the biological processes at play during the early larval development and metamorphosis, and to understand the carry-over effect in the context of aquaculture. Indeed, it is well known that rearing conditions may impact welfare and growth at later stages and understanding the molecular changes occurring during the development of this species might be useful to enhance survival rates [3032].

Grouper (Family Serranidae, Subfamily Epinephelinae) are a group of fish of both economic and ecological importance. Inhabiting temperate and tropical waters of eastern and southern regions Indo-Pacific region, East Atlantic, Mediterranean regions, and the intertropical American zone, they comprise 165 species in 16 genera [33, 34]. Ecologically, groupers provide a wide variety of important functions as large top-level predators [35]. However, due to their high economic value on the food market, more than 40 species are at risk of extinction [36, 37]. This has led to the widespread development of grouper aquaculture farms, which produced 155,000 tons per year according to the Food and Agriculture Organization of the United Nations in 2015, with 95% of global production occurring in Asia [38, 39]. Despite wide variations in growth rate, body size, and color, groupers share many biological traits and lifestyles, such as protogynous hermaphroditism, complex social structure [40], and a biphasic lifestyle. Like many marine fishes, groupers larvae hatched after 24 to 48 h of embryonic development giving rise to a transparent elongated larvae surrounded by an embryonic fin fold. After a couple of days melanophores colonize the tail (after the anus) and the gut. Shortly after the elongated appendages composed of two pelvic spines and the second dorsal spines both displaying melanophores at their tips appear. The elongation of these spines occurs before notochord flexion and their regression is concomitant with the appearance of the adult-like body pattern [17, 18, 4143]. Their regression as well as the development of the adult-like body pattern has been demonstrated to be under the control of thyroid hormones in E. coioides suggesting that it corresponds to the TH-regulated metamorphosis [29].

In order to gain insight into the molecular pathways involved in grouper larval development, we assembled a chromosome-scale genome sequence of E. malabaricus and conducted a transcriptomic analysis of nine developmental stages ranging from freshly hatched larvae to roughly two-month-old juveniles. We investigated the expression patterns of genes involved in the TH pathway and four biological processes known to be regulated by TH in other teleost species during the metamorphosis step: ossification, pigmentation, visual perception, and metabolic transition. In addition, we used TH pathway and downstream regulated biological processes activation as indicators to look for the potential involvement of corticoids during larval development. We observed the activation of the TH pathway during the regression of fin spines, which in other grouper species coincides with the surge of TH and marks the beginning of metamorphosis. Interestingly, the activation of the TH pathway at this stage was associated with the activation of corticoids pathways as well as the four biological processes we investigated. Especially noteworthy is the observation of an early activation of the two regulatory pathways (TH and corticoids) occurring before the formation of the elongated fin spines during early larval develpoment.

Results and Discussion

Genome assembly, phasing, scaffolding, and annotation

A total of 46 Gbp of PacBio HiFi reads (~43X coverage, Supp. Table S1) were assembled into a fully haplotype phased genome of the Malabar grouper (Epinephelus malabaricus) with the primary phase consisting of 298 contigs across 1.09 Gbp genome length, a contig N50 of 7.4 Mbp, and a genome level BUSCO completeness of 93.6% with 1.3% duplication (Table 1, Supp. Table S2). The raw assembly was further scaffolded by Phase Genomics using HiC data, resulting in a 1.03 Gbp assembly across 24 pseudo-chromosomes (Table 1). The scaffolded pseudo-chromosomes ranged from 22.5 Mbp to 50.6 Mbp in size and contained 90.5 % of the contigs and 92.8 % of the contig length (Supp. Fig. S1). The gene model annotation resulted in 26,140 protein coding genes, with a BUSCO completeness of 95.5% and a duplication level of 1.3%. The final GC content was 41.3 % and the assembly contained 56.4% repeat regions overall, which were mainly made up of DNA transposons (28.9%), followed by LINEs (5.3%), and LTR elements (2.2%) (Table 1, Supp. Tables S2 and S3). The genome length, GC content, repeat content, number of gene models, and BUSCO values are similar to other published chromosome-level grouper genomes, for example Epinephelus lanceolatus [44], E. akaara [45], and E. moara [46].

Statistics of the Epinephelus malabaricus chromosome-scale genome assembly, scaffolding and gene annotation.

Additional statistics can be found in Supp. Table S2.

General transcriptomic results

Transcriptomic analysis of E. malabaricus larval development was performed on grouper larvae raised in the Okinawa Prefectural Sea Farming Center. An average of 77.1 M reads were obtained per sample (pooled or individual entire larvae), which after quality control and mapping resulted in an average of 65.8 M uniquely mapped reads (85.6%) per sample for differential gene analysis. Sampled larvae from one day to two months old were sorted according to their morphology allowing us to sequence nine developmental stages (D01, D03, D06, D10, D13, D18, D32, D60, Juvenile) (Table 1). Principal component analysis (PCA) performed on all genes allowed to distinguish between three distinct groups: early developmental phase (composed of D01), intermediate developmental phase (composed of D03, D06, D10, D13 and D18) and late developmental phase (composed of D32, D60 and Juvenile) (Fig. 1A).

Transcriptomic results of E. malabaricus larval development.

A) Principal component analysis of different larval stages using variance stabilizing transformed complete transcriptome. B) Cluster analysis using the coseq R package, focusing on genes that are upregulated on days 3 and/or 32. The number of genes in each cluster are shown above each graph. Adjusted p-values and functional annotations for the four gene clusters in this figure can be found in the supplementary data file. Gene expression data was generated from whole larvae.

The analysis of upregulated genes during this post-embryonic development series revealed two major peaks of gene expression that underlies the clusters of regulated genes. Indeed, the cluster analysis shows 2,651 genes upregulated on D03 and to a lesser degree on D32 (clusters 1 and 2), 1,515 genes upregulated on D32 (cluster 3), and 785 genes upregulated on D32 and to a lesser degree on D03 (cluster 4) (Fig. 1B). Unsurprisingly, these two transitions, D01 to D03 and D18 to D32, also show the highest number of differentially expressed genes with 14,830 genes (7,151 up, 7,679 down) between D01 and D03, and 10,774 genes (5,320 up and 5,454 down) between D18 and D32 (Supp. Table S4). This suggests that there are two major events occurring in terms of gene expression: one early on, at day 3 and one later around day 32. This last event corresponds to the separation between the intermediate and late developmental phase and is concomitant with the regression of the elongated spines, an overall change of shape and progression of the pigmentation. In other grouper species, the regression of the elongated spines corresponds to the onset of metamorphosis and is associated with an increase of TH levels [29]. However, the very early event is more striking as such a global gene expression change very early on has, to our knowledge, never been reported in other teleost fish species.

Two periods of activation of the TH signaling pathway during grouper post-embryonic development

We investigated the expression patterns of key genes involved in the hypothalamo-pituitary-thyroid axis (tshb, trhr1a, trhr1a-like, trhr1b, trhr2) as well as in TH synthesis (tg, tpo, nis), TH metabolism (dio1, dio2, dio3), and finally the genes encoding thyroid hormone receptors (trα, trαβ, trβ). These genes all play important roles in the regulation of TH levels and TH signaling in the body and understanding their expression patterns during larval development can illuminate the underlying mechanisms that drive this process.

The gene encoding the pituitary thyroid stimulating hormone (tshb) is strongly expressed very early on during larval development at D01, decreases from D03, and then strongly increases again at D32 (Fig. 2A). Accordingly, we also observed two surges of expression for the hypothalamic factors trhr1aa, trhr1a like, trhr1b, and trhr2, at D03 and between D32 and D60, suggesting two distinct periods of stimulation of TH synthesis, one early on around D03 and one later at around D32. This pattern can also be seen in the expression of the corticotropin-releasing hormone (crhb) and receptors (crhr1a, crhr1b, and crhr2), which stimulates the synthesis of TH [47] (see section “Possible involvement of corticoid pathways in metamorphosis” and Fig. 5 below). Interestingly, we also observed a peak of expression for tg at D03, the gene encoding for the TH precursor, and a strong increase of expression starting at D32 (Fig. 2B). The respective order of appearance of TSH and Tg (TSH at D32, Tg after) is consistent with what we would expect but a bit later than expected given the morphological transformation. It would be interesting to revisit this in a future series of experiments, with tighter temporal sampling to study how gene expression and morphological transformation aligned. A similar expression pattern was obtained for tpo, the gene encoding for the enzyme adding iodine to TH precursor, as well as for sis, gene encoding for the symporter involved in transferring iodide into thyrocytes. Once produced, TH, particularly T4, are transported into target cells where they convert them into the active form T3 mostly by dio2 and dio1 or degraded by dio3 and dio1. As it has been observed for HPT factors, tg, tpo, and sis, we first noticed two peaks of expression of dio2 with a very early one at hatching (D01) and a second one at D32 suggesting two distinct periods in which active TH (that is T3) is required. In accordance with this observation, we notice a minimal expression of the T3 degrading enzyme dio3 at these two periods followed by a final late increase after D32. dio1, whose net function is unclear [48], shows a regular increase of expression that becomes maximal at the juvenile stages (J) (Fig. 2C). Finally, thyroid hormones receptors (TRs) expression levels increased throughout the entire larval development with a stronger increase of trβ at D60 (Fig. 2D). Taken together, these data reinforce the existence of two distinct periods of TH signaling activity, one early on at D03, and one late at D32 (Fig. 2C).

Expression levels of TH signaling pathway genes in E. malabaricus.

A) TRH: thyroid releasing hormone, TSH: thyroid stimulating hormone. B), DUOX: dual oxidase, TG: thyroglobulin, TPO: thyroperoxidase, SIS: sodium iodine symporter. C) DIO: deiodinase. D) TR: thyroid hormone receptor. Colored lines join the average values of each stage. Biochemical pathways adapted from Roux et al.15. E) T4 and T3 levels (in ng/g of larvae) during early larval development. # indicates that the value is below the quantification limit, and different letters indicate significant differences (one-way ANOVA followed by a Tukey HSD test for T4 levels only as no significant differences were observed after ANOVA for T3 levels). Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.

These results suggest the activation of the TH axis around D32, which coincides with the regression of the elongated appendages (second dorsal spine and pelvic spines) and the appearance of the adult-like pigmentation pattern, indicating that metamorphosis in E. malabaricus occurs around D32 in our rearing conditions. These observations are consistent with what has been observed in E. coioides, in which TH levels peak around 40 dph when the pelvic and second dorsal spines regress and adult-like pigmentation pattern formation is ongoing [29]. Interestingly, the high expression levels of tshb, trhr, tg, tpo, sis, dio3, and TRs at the very beginning of development (D01-D03) suggest a precocious activation of TH synthesis, which, to our knowledge, has not been observed in groupers nor in other teleost fishes so far (Fig. 2). Measurements of TH levels during these early development stages showed an early peak of T4 at D03, confirming the early activation of the TH pathway observed with gene expression patterns (Fig. 2E).

TH involvement in elongate appendage and regression

As mentioned in the introduction, many marine fish larvae present several morphological features that improve larval survival rates during their pelagic phase [49]. This is what we observe in grouper with the formation of elongated spines of the dorsal and pelvic fins that are supposed to have a defensive function [43, 50, 51]. These spines then regress while adult-like pigmentation pattern appears and TH surge corresponding to the TH regulated metamorphosis. It is well known that during fish larval development genes involved in ossification are under the control of TH. In zebrafish, TH control the proper morphogenesis and ossification in the majority of the bones, during post-embryonic development and metamorphosis [52]. This is why we investigated the expression changes of some of these genes in E. malabaricus. Interestingly, we observed, once again, two surges in the expression of the following genes: bone gamma-carboxyglutamate (bglap), periostin (postnb), and phosphate regulating endopeptidase (phex), three key genes implicated in the mineralization of tissues. The first at D13 following the early surge in thyroid hormone (TH) signaling genes, and the second starting at D60 (Fig. 3A). The first surge of gene expression coincides with the appearance and growth of the dorsal and pelvic elongated spines starting at D10 (Fig. 3B, shown by green arrowhead), while the second surge coincides with the regression of these spines, a process known to be regulated by TH in E. coioides [29]. The coincidence of both the growth and the regression of the elongated spines with the activation of the TH pathway in E. malabaricus may suggest that TH may play a role not only in the regression of these spines but also in their formation in this species.

Biological processes likely to be under TH control during E. malabaricus metamorphosis.

A) Expression patterns of key genes involved in ossification process and known to be regulated by TH in teleosts. Bglap: bone gamma carboxyglutamate protein, mgp: matrix gla protein, postna: periostin a, postnb: periostine b, phex: phosphate regulating endopeptidase homolog X linked. B) Pictures of E. malabaricus at D03, D10, D32, D60 illustrate the elongation of the dorsal and pelvic floating spines (green arrow heads at D10 and D32) and their regression (red arrow heads at D60). C) Expression patterns of genes involved in pigmentation. Three areas of interest were chosen to illustrate the appearance of melanophores (C6 to C8: at the top of the head, C11 to C13: above internal organs and, C20 to C25: close to the caudal peduncle,) and xanthophores (C7 to C8 at the top of the head, C14 to C16 above internal organs and C26: close to the caudal peduncle). D) Expression patterns of genes encoding for the rhodopsins (rh1) and the visual cone opsins (rh2A, rh2B, rh2C, opnlw, opnsw1, opnsw2A-1, opnsw2A-2, opnsw2B). Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.

Other TH regulated biological processes are also activated during grouper metamorphosis

Pigmentation changes are often the most visible changes in some teleost species such as clownfish [10]. In grouper, the pigmentation changes are accompanied by the regression of the dorsal and pelvic spines. The acquisition of an adult pigmentation pattern is characterized by the formation of brown and white vertical bars in E. malabaricus (Fig. 3C, juvenile stage). To reveal the molecular regulations driving these pigmentation changes, we assessed the expression of key pigmentation genes involved in white (iridophore genes), black (melanophore genes) and yellow (xanthophore genes) pigment cells known to be regulated by TH in zebrafish and clownfish [10, 11].

The expression level of the iridophore gene flh2a showed a strong increase from D03, followed by a decrease at D32 and a new surge at D60 (Fig. 3C). The first increase may correspond to the appearance of iridophores on the ventral cavity whereas the second may coincide with the formation of the white bars. In contrast, its paralogue fhl2b remained relatively stable throughout the development. Xanthophores start colonizing the larval body at D10, which may explain the increase of the expression level of two xanthophores markers, gtsm3 and perp6, which play a role in concentrating and trafficking lipophilic pigments [53]. On the other hand, scarb1, which is involved in carotenoid deposition in zebrafish, increased slightly at D03 (Fig. 3C). Similarly, melanophore genes are displaying a strong increase of their expression level at D03 and D06 that may be related to the colonization of melanophores on larval body (tyrp1b, tyr, Fig. 3C).

During their metamorphosis in the wild, fish larvae also undergo ecological changes such as habitat transition (from ocean to coastal environment) and food habits. It is well known that in many fish species this ecological transition is accompanied by a change in color vision [54]. Since TH appeared critical in the regulation of genes involved in vision in salmonids, zebrafish and clownfish [14, 15, 55, 56], we investigated the regulation of genes encoding for visual opsin. We expected to find at least eight visual cone opsin genes in E. malabaricus according to the phylogeny of opsin genes in teleosts[57] (opnsw1, opnsw2Aa, opnsw2Ab, opnsw2B, rh2A, rh2B, rh2C, opnlw) and one rhodopsin gene (rh1) [57, 58]. These genes were indeed expressed in our transcriptomic data. We observed that the medium wavelength opsin (rh2A, rh2B, rh2C), and the long wavelength opsin (opnlw) were highly expressed at the beginning of the larval development (at D03 for rh2B, rh2C and opnlw, and at D10 for rh2A) (Fig. 3D). These surges of expression are followed by the increase of the expression levels of opnsw2B, opnsw2Bb and opnlw from D32. From D18 the rhodopsin involved in scotopic vision (rh1) increases. The expression level opnsw1 remained low and stable during the entire development (Fig. 3D). It is again very interesting to note that these changes coincide with both TH signaling peaks. As these genes are regulated by TH in other species and according to the observed expression patterns, we may assume that this is also the case in E. malabaricus.

The timing of cone opsin (opnsw2a1, opsnw2a2 and opnlw) expression in E. malabaricus is similar to E. bruneus [59], but different from E. akaara where opnsw2 is strongly expressed early and then decreases [60]. However, the expression levels of mid wavelength opsins and opnlw are similar between E. malabaricus and E. akaara, suggesting their involvement in cone photoreceptor differentiation, while rod photoreceptors differentiate during metamorphosis in E. akaara and E. malabaricus larvae.

Metamorphosis is accompanied by a metabolic shift

Because metamorphosis is known to be energetically demanding and because the ecology of the planktonic larvae and the demersal juveniles are different, we investigated metabolic gene expression. Fig. 4 shows the expression profile of the genes encoding for the rate limiting steps enzymes involved in glycolysis, (phosphofructokinase, pfkma and pfkmb), and citric acid cycle (citrate synthase, cs; isocitrate dehydrogenase, idh3a; oxoglutarate dehydrogenase complex, ogdhl, dlst2). The expression profile of all the genes associated with these pathways are shown in Supp. Fig. 2.

Metabolic transition and corticoids expression levels of E. malabaricus.

A) Schematization of the metabolic transition occuring during E. malabaricus larval development showing that young larvae rely on aerobic metabolism whereas older larvae rely on anaerobic metabolism. Expression levels of genes involved in glycolysis (pfkma, pfkmb), krebs cycle (idh3, dlstb). Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.

These profiles revealed a clear overall pattern: glycolysis genes are poorly expressed at the very beginning of the larval development while their expression increases throughout the development. This is particularly visible for pfkma which starts to increase from D10 and reaches its highest expression level at D32, likely coinciding with the onset of metamorphosis, and then decreases until juvenile stage (J) (Fig. 4A). The genes involved in the rate limiting steps of the citric acid cycle (cs, idh3, dlst) are more expressed during early larval stages and then decrease progressively. It is also worth noting that several genes involved in both glycolysis and TCA cycle are encountering these two peaks of expression during the larval development (gpi1b, aldoaa, gapdh1, pgam1a, pgam1b, pgam2, eno1b, pkma, dlsta, dldh, sdhb, mdh2, Supp. Fig. 2). The lactic acid fermentation genes show an increase throughout the larval development with peaks of expression at D18 for ldha and at D03 for ldhc (Supp. Fig. 2). Taken together, these results reveal that at the very beginning of the development larval fish mainly rely on the citric acid cycle for aerobic energy production and then switch progressively to anaerobic energy production via glycolysis and lactic fermentation. This trend is similar to what has been observed in other fish species such as sea bass [28, 61], but contrasts with the situation of other species such as the clownfish [14]. Thyroid hormones are known to play a role in the regulation of metabolism in mammals [62], so it is likely that a similar regulatory process occurs during the development of E. malabaricus larvae, as it has been recently observed in the development of clownfish larvae[14]. Larval development and metamorphosis are very sensitive periods during which larvae must face a myriad of challenges: disperse into the open ocean, find food, escape from predators, locate and swim toward a suitable habitat, metamorphose and settle. All these challenges are highly demanding in terms of energy, it is thus very important for the larvae to properly allocate this energy to ensure the success of these various challenges. The regulation by TH of genes involved in processes such as glycolysis, lactic fermentation and citric acid cycle might be a way for larvae to tune their energetic source to enhance their survival and the success of metamorphosis.

Possible involvement of corticoid pathways in grouper larval development

Synergistic action of cortisol and THs has been encountered during flatfish larval development and more specifically during its metamorphosis. However, crosstalk between corticoids and TH pathways has remained poorly investigated during fish post-embryonic development20. For this reason, we decided to investigate eight key genes genes involved in the Hypothalamo-Pituitary-Interrenal axis: crha, crhb, crhr1a, crhr1b, crhr2, pomc-a1, pomc-a2, pomc-b, mr, gr1, gr2 which encode respectively for the corticotropin releasing hormone (which stimulates the production of POMC and the stress hormone ACTH), the receptors of the CRH which are involved in the production of the stress-related hormone ACTH the pro-opiomelanocortin A1, A2 and B (precursors of several hormones such as ACTH) and corticoid receptors: mineralocorticoid receptor (MR) and glucocorticoid receptor (GR1&2) (Fig. 5) We also scrutinized the expression levels of genes encoding for key proteins involved in corticoid synthesis: star, fdx1, fdx2, fdxr, cyp11a1, hsd3b1, cyp17a1, cyp21a2, cyp11c1, hsd11b1, hsd11b2.

Expression levels of genes involved in the Hypothalamo-Pituitary-Interranal axis (HPI) and corticoids synthesis.

A) Expression levels of genes involved in Hypothalamo-Pituitary-Interenal axis (HPI): crha (corticotropin releasing hormone a), crhb (corticotropin releasing hormone b), crhr1a (corticotropin releasing hormone receptor 1a), crhr1b (corticotropin release hormone receptor 1b), crhr2 (cortico release hormone receptor 2), pomc-a1 (propiomelanocortin a1), pomc-a2 (propiomelanocortin a2), pomc-b (propiomelanocortin b), mr (mineralocorticoid receptor), gr1 (glucocorticoid receptor 1), gr2 (glucocorticoid receptor 2). B) Expression levels of genes involved in corticoids synthesis: star (steroidogenic acute regulatory protein), fdx1 (ferredoxine 1), fdx2 (ferredoxine 2), fdxr (ferredoxine reductase), cyp11a1 (Cytochrome P450 Family 11 Subfamily A Member 1), hsd3b1 (Hydroxysteroid dehydrogenases 3β1), cyp17a1 (Cytochrome P450 Family 17 Subfamily A Member 1), cyp21a2 (Cytochrome P450 Family 21 Subfamily A Member 2), cyp11c1 (Cytochrome P450 Family 11 Subfamily C Member 1), hsdllbl, hsd11b2 (Hydroxysteroid dehydrogenase 11 3α1&2). C) Cortisol levels (in ng/g of larvae) during early larval development. # indicates that the value is below the quantification limit, and different letters indicate significant differences (one-way ANOVA followed by a Tukey HSD test). Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.

Most of the genes of this pathway displayed a similar pattern as described previously above with a surge of expression between D03 and D10 and a second one between D32, D60 (crhb, crhr1b, crhr2, pomc-a2, mr, gr1) (Fig. 5A). The expression level of the gene encoding for CRHR2 started to increase after D01 and remained relatively stable all along whereas crha was lowly expressed (Fig. 5A). A surge of expression was observed for pomc-a1 at D03 followed by a constant decrease until the juvenile stage. High expression of pomc-b was observed at D13, D18 and Juvenile stage. Finally, gr2 expression level increased strongly at D03, then remained stable and increased again at D60. The relatively high expression of the crhr genes may suggest an increase in the sensitivity to CRH to mediate the production of POMC by the pituitary gland, a process that seems to occur twice during E. malabaricus larval development.

Concomitantly, a two-step increase in star is observed: first at D03 and a second at D32. This may suggest an increase in the production of cortisol following the high expression of pomc-a2. Indeed, POMC is the precursor of the Adreno Cortico Trophic Hormone (ACTH) which is the pituitary factor stimulating cortisol production by the inter-renal gland [63]. The expression levels of genes involved in cortisol production corroborate this hypothesis. Indeed, we observe an increase of expression around D03 and D06 for fdx1, cyp11a1, hsd3b1, cyp11c1, hsd11b1, hsd11b2, as well as a second increase of expression from D32 for fdx1, fdxr, hsd3b1, cyp11c1, hsd11b1, hsd11b2. Interestingly, measurements of cortisol levels during early larval development (between D01 and D10) showed that cortisol concentration starts to increase from D3, coinciding with the expression levels of star, and is followed by a stronger increase from D10. Those results first indicate that HPI axis and cortisol production are activated at the beginning of the larval development around the timing of activation of TH pathway genes between D03 and D10. Second, the transcriptomic data also showed an activation of the corticoid pathway genes around D32 as it has been observed for TH pathway genes. There is contrasting evidence of communication between these two pathways during teleost fish larval development with some data suggesting a synergic and other an antagonistic relationship. In terms of synergy, an increase in cortisol level concomitantly with an increase in TH levels has been observed in flatfish [26], golden sea bream [64] and silver sea bream [65]. Cortisol was also shown to enhance in vitro the action of TH on fin ray resorption (phenomenon occurring during flatfish metamorphosis) in flounder[27]. It has also been shown that cortisol regulates local T3 bioavailability in the juvenile sole via regulation of deiodinase 2 in an organ-specific manner [66]. On the antagonistic side, it has been shown that experimentally induced hyperthyroidism in common carp decreases cortisol levels[67], whereas cortisol exposure decreases TH levels in European eel [68]. Given this scattered evidence, the existence of a crosstalk active during teleost larval development and metamorphosis has never been formally demonstrated. The results we obtained in grouper are clearly indicating that HPI axis is activated during both early development and metamorphosis and that cortisol synthesis is activated during early development. This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians [25], but functional experiments need to be conducted to confirm this hypothesis. It is worth to note, however, that the increase of the gene encoding POMC-A2 may not only be linked to cortisol synthesis as POMC is also a precursor of other hormones and notably melanocytes-stimulating hormones [63]. Those hormones belong to the melanocortin system that is involved in body pigmentation, but also in social behavior, appetite, and stress physiology [69]. The increase in pomc-a2 observed during E. malabaricus may thus also be involved in the onset of pigmentation pattern. Taken together, these results brought a first insight into the potential role of corticoids in the larval development of E. malabaricus and call for functional experiments directly testing a possible synergy. Given the results obtained in our study, E. malabaricus could be a good model to investigate the potential role of corticoids and thyroid hormones in elongate appendages formation during early larval development as well as during metamorphosis and if there is an interplay between the two pathways. Such interplay could have relevant consequences in terms of aquaculture and claim for an examination of the role of stress in regulating fish larval development and impacting metamorphosis triggering.

Overall, the results obtained in this study revealed a very precocious surge of expression of genes involved in two key hormonal pathways (corticoids and TH) that are known to control ontogenetic transitions, but which are also involved in the regulation of many biological processes [70, 71]. This indicates that the early post-embryonic period in grouper may correspond to such an ontogenetic transition that has been ignored until now and that could be linked to the formation of the specific elongate appendages present in groupers.

More generally, the fact that the outcome of metamorphosis is very variable from one species to another (e.g., differences in metamorphosis between clownfish, grouper, flatfish, etc.) and that it also allows exquisite acclimation of the juveniles to their local environment [47], highlights the capacity of this transitional step, controlled by environmentally connected hormonal systems, to change rapidly in accordance with ecological needs [72]. Finally, considering that rearing conditions during larval metamorphosis in an aquaculture context may impact growth and welfare at later life stages, understanding the molecular changes occurring during the development of a species might prove useful to enhance survival rates.

Material and Methods

Larval husbandry

This study was conducted in partnership with the Okinawa Prefectural Sea Farming Center, Motobu-cho, Okinawa, Japan. Epinephelus malabaricus larvae and juveniles were obtained from various clutches obtained from natural spawning in 2020, 2021 and 2023. Larvae were reared under natural condition in 50,000 L of natural sea water in circular tanks. Light exposure duration followed natural daylight hours, salinity (approximately 33–34 ppm) and temperature (approximately 27 °C on average) remained relatively stable as the tanks were constantly renewed with natural seawater. Microalgae (Nannochloropsis sp.) was added from hatching until 15 days post hatching (dph) to maintain the nutritional value of live-feed organisms and create a green-water environment. Rotifers Brachionus sp. (S type) were enriched with fish oil and distributed twice a day from 1 dph to maintain a concentration of 10 ind/mL until 13 dph. Artemia nauplii were added twice a day from 13 dph to 20 dph. Frozen copepods were given five times a day from 13 dph until 20 dph. Artificial food was given from 20 dph during daytime by automatic feeding (one distribution every hour).

Sample collection and tissue collection

In order to assemble and functionally annotate the genome, tissues for DNAsequencing and RNA sequencing were collected on 08. September 2020 from two approximately four-month-old fish sourced from the Okinawa Prefectural Sea Farming Center. The fish were euthanized by cervical dislocation, and immediately dissected. The liver and muscle tissues of one fish were immediately frozen in liquid nitrogen for PacBio HiFi and Hi-C sequencing, respectively. Brain, gill, liver, heart, caudal fin, eye, spleen, stomach, intestine, muscle, skin spinal cord and spinal nerve tissues were taken from the second fish and stored in RNAlater™ (ThermoFisher Scientific) for tissue specific transcriptome sequencing.

For the larval developmental analysis, whole larval and juvenile fish were sampled between 30. April 2021 and 02. June 2021, ranging from 1 day post hatching (dph) to approximately 2 months (Table 2). A total of four clutches spawned in early and late April were sampled during this period and larvae were collected and sorted according to their morphology allowing us to sequence 8 developmental stages. Larvae and juveniles were euthanized in the afternoon (between 13:00 and 15:00) with MS222 solution (200 mg/L, Sigma-A5040) before being placed in RNAlater. Larger fish were cut open for improved RNAlater penetration and samples were kept at 4°C for two to eight days before being stored at -20°C until extraction. Larvae for TH and cortisol measurements were sampled in triplicates between 17. June 2023 and 26. June 2023 at D01 (n=120 per replicate), D03 (n=120 per replicate), D06 (n=60 per replicate) and D10 (n=40 per replicate), as described in Roux et al. [14] and kept at -80 until analysis. TH and cortisol extraction and measurement were outsourced to ASKA Pharmaceutical Medical Co., Kanagawa, Japan. Detailed protocols can be found in Supp. protocol S1 for TH and Supp. protocol S2 for cortisol.

Morphological description of the larval and juvenile stages sampled for the transcriptomic analysis.

D01: 1 day post hatching (dph), D03: 3 dph, D10: 10 dph, D:13 1315 dph, D18: 18–20 dph, D32: 32–34 dph, D60: ca. 60 dph, J: ca. 60 dph with juvenile phenotype. NL is “notochord length” for preflexion and flexion larvae, SL is “standard length” for postflexion and older stages, and TL is “total length” for all stages.

All sampling conducted in this study was done under the approval from the Animal Care and Use Committee at the Okinawa Institute of Science and Technology Graduate University (approval N°2021-328).

DNA extraction and sequencing

Genomic DNA was extracted from liver tissue using the NucleoBond HMW DNA extraction kit (Machery-Nagel). Library preparation was carried out with the SMRTbell Express Template Prep Kit 2.0 and SMRTbell Enzyme Cleanup Kit, Sequencing primer v2, Sequel II Binding Kit 2.0, and Sequel II Sequencing Kit 2.0 (Pacific Biosciences). Sequencing was done on a Sequel II System, using three SMRT Cell 8M flow cells through diffusion loading of 60–100pM library. Hi-C library preparation and sequencing was carried out by Phase Genomics from muscle tissue using the Phase Genomics Proximo Animal Kit v3.0 and sequenced on a Illumina HiSeq 4000 with 150 bp PE.

RNA extraction and sequencing

For the functional genome annotation, tissue samples were homogenized using a Kinematica Polytron PT1200E Homogenizer and RNA was extracted using the Maxwell RSC simplyRNA Tissue Kit (Promega: AS1340). Individually barcoded IsoSeq Express libraries of all 13 tissues were prepared by the OIST Sequencing Section using the SMRTbell Express Template Prep Kit 2.0. The libraries were sequenced on a PacBio Sequel 2 across two SMRT Cell 8M flow cells.

For the developmental transcriptomic analysis, samples from 1 to 32 dph were homogenized in thioglycerol using metal beads lysing matrix tubes (MPB) in an automated homogenizer (FastPrep-24 5G MPB). Bigger samples (60 dph and juveniles) were manually homogenised in thioglycerol using 14 mL round bottom tubes and a tissue grinder (Tissue Ruptor II, Qiagen). Samples from 1 and 3 dph consisted of pools of three larvae in triplicates, while all remaining timepoints consisted of triplicates of single individuals. RNA extraction was then carried out as for the tissue samples using the Maxwell RSC simplyRNA Tissue Kit (Promega: AS1340). Library preparation was carried out at the OIST Sequencing Section using the NEBNext Ultra II Directional RNA Library Prep Kit. The final pooled library was then split across two Illumina Nova Seq SP flowcells for sequencing with 150 bp PE reads.

Genome assembly, scaffolding and phasing

The genome assembly was carried out using unprocessed PacBio HiFi reads with the diploid aware Improved Phased Assembler (IPA, V1.3.1, https://github.com/PacificBiosciences/pbipa) using default parameters, which resulted in a primary and alternative phase genome. The two phased genomes were assessed using purge_haplotigs [73] using default parameters to generate a genome-wide read-depth histogram; however, no purging was necessary. Completeness of the final assembly was assessed using BUSCO (V4.1.2) [74] with the actinopterygii_odb10 database. Scaffolding and phasing were outsourced to Phase Genomics (See Supp. Protocol S3 for detail).

Genome and functional annotation

Genome annotation was carried out as described Ryu et al. (2022)[75]. Briefly, repeat content analysis was done in RepeatModeler[76] (V2.0.1), RepeatMasker[77] (V4.1.1), the vertebrata library of Dfam (V3.3) [78], and GenomeTools (V1.6.1) [79]. Annotation was done using BRAKER2 [80] and associated programs [8192]. For this, the ISO-seq data from the adult tissue and RNA-seq data from the larval samples (see below for quality control process) were used together with publicly available protein data (Supp. Table S5). Post-processing was carried out as described by Ryu et al. (2022) [75] using the Swiss-Prot protein database (UniProt) [93] with Diamond [83] (V2.0.9) and Pfam domains [94] identified by InterProScan (V5.48.83.0) [95]. Gene model statistics were calculated using the get_general_stats.pl script from the eval package (V2.2.8) [96]. Finally, functional annotation was carried out with the filtered gene models produced by BRAKER. The amino acid sequences were blasted against the non-redundant protein database (downloaded 15. November 2021) using blastp (V2.10.0+; parameters: -show_gis -num_threads 10 -evalue 1e-5 -word_size 3 -num_alignments 20 - outfmt 14 -max_hsps 20) [97]. Additionally, protein domains were assigned using InterProScan (V5.48.83.0; parameters: --disable-precalc --goterms --pathways -f xml) [95]. The blast and interproscan results were then loaded into OmicsBox [98, 99] for postprocessing.

Differential gene expression analysis

The differential gene expression analysis for the larval developmental stages was carried out on the sequencing data from the whole larval and juvenile fish. Before processing, the data from the two lanes were merged per sample. Low quality bases and adaptor sequences were filtered using Trim Galore (V0.6.5) [100] and cutadapt (V2.10) [101] using default parameters with the exception of “--length 30”. Kraken2 (V2.0.9-beta) [102] was used to remove bacterial reads using the bacterial and archeal database (V4.08.20) and “--confidence 0.3”. Cleaned reads were mapped using STAR (V2.7.9a) [103] with “--quantMode GeneCounts” and “-- outSAMtype BAM SortedByCoordinate”, using the filtered gff file produced by the braker2 annotation outlined above for the genome indexing (--genomeSAindexNbases 13, -- sjdbOverhang 149). The unstranded mapped reads were then loaded into Rstudio (V2022.02.4) [104] using R (V3.6.3) [105]. DESeq2 (V1.36.0) [106] was used for general data analysis, with coseq (V1.20.0) [107, 108] being used for cluster analysis. The cluster analysis was carried out on differentially expressed genes only, as determined through likelihood ratio test (LRT) analysis (full model: design = ~ dph, reduced model: reduced = ~ 1, adjusted p-value threshold: 0.001) in DESeq2. Adjusted p-values and annotations for the group of genes represented in Fig 1B in this study can be found in the Suppl. Data File. Normalisation was done in DESeq2, while the following parameters were used for coseq: model = “Normal”, transformation = “arcsin”, seed = 1234, iter=10000. Specific genes belonging to clusters where D03 and/or Day32 showed upregulation were then re-clustered with the same parameter for visualization. A complete representation of all initial clusters found in this study can be found in Supp. Fig. 3. Pairwise analysis of differentially expressed genes between two time points was done using the Wald test in DESeq2 (design = ~ dph, adjusted p-value threshold: 0.01, log2FoldChange ≥ ±0.58). Figures were plotted using ggplot2 (V3.4.1) [109], and the analysis made general use of the tidyverse package (V1.3.2) [110]. Lastly, expression levels shown in figures 25 are normalised gene counts produced by DESeq2.

Acknowledgements

We are grateful to Misaki Yamauchi and Hiroyuki Nakamura at the Okinawa Prefectural Sea Farming Center, who provided us with the samples of groupers. This work was supported by the Okinawa Institute of Science and Technology Graduate University. We are also thankful for the help and support provided by the Sequencing Section, especially Mayumi Kawamitsu and Albert Murzabaev, and the Scientific Computing Section of the Research Support Division at Okinawa Institute of Science and Technology Graduate University. We thank Konstantin Khalturin (Marine Genomics Unit, OIST) for his support of our sampling. Scaffolding was carried out by Mary Wood from Phase Genomics. Vincent Laudet would like to dedicate this manuscript to the memory of Donald D. Brown.

Code and Data Availability

Code used in genome assembly and annotation, as well as in developmental transcriptome analysis can be found under the following DOI: 10.5281/zenodo.10972118. All raw and assembled data used in this study has been deposited on GenBank under umbrella BioProject PRJNA798702, with the principal phased assembly in BioProject PRJNA798188, the alternate phased assembly in BioProject PRJNA798189, and the raw data in BioProject PRJNA794870. BioSamples can be found under SAMN24662200 (genome sequencing), SAMN24664212 (ISO-seq), and SAMN24664213 - SAMN24664234 / SAMN32359227 - SAMN32359229 (RNA-seq). PacBio and Illumina raw data can be found under SRR17639994 - SRR17640023 / SRR22859365 - SRR22859367. The final scaffolded assembly can be found under accession JANUFT000000000 and the alternative phase under accession JANUFU000000000. The genome and functional annotation are deposited on figshare under DOI: 10.6084/m9.figshare.25486387. The raw gene expression count data and hormone measurements can be found in the supplementary data excel file. Any additional processed data used in this publication (e.g., list of genes in each cluster) can be requested from the corresponding author.

Author Contributions

R.H. carried out all bioinformatical analyses and contributed to the design of the work, the interpretation, and the writing of the manuscript. N.R. carried out the biological interpretation of the data and contributed to the writing of the manuscript. K.M. and P.P. designed and carried out the larval collection for transcriptomic analysis. S.M. and M.I. contributed to the lab work and data collection for transcriptomic analysis and genome assembly. H.C. and S.M. sampled larvae for TH and cortisol measurements. V.L. contributed to the design of the work, the interpretation of the data, and writing the manuscript. T.R. contributed to the design of the work and writing of the manuscript.

Competing Interests

The authors declare no competing interests.

Materials & Correspondence

Correspondence and material requests should be addressed to Roger Huerlimann at either roger.huerlimann@oist.jp or roger.huerlimann@gmail.com.

Additional Declarations

The authors declare no competing interests.

Supplementary material

Hi-C contact map after scaffolding.

E. malabaricus genome contig contact matrix using Hi-C data. The color bar indicates contact density from dark red (high) to white (low).

Metabolic shift during grouper metamorphosis.

Expression levels of genes involved in glycolysis, lactic fermentation, and citric acid cycle at each developmental stage (D01, D03, D06, D10, D13, D18, D60, J) extracted from transcriptomic data. Enzyme highlighted in red represents rate limiting steps for each metabolic pathway.

Complete cluster analysis of differentially expressed genes.

Cluster analysis of differentially expressed genes (n = 22,135, LRT analysis (full model: design = ~ dph, reduced model: reduced = ~ 1, adjusted p-value threshold: 0.001). Genes contained in clusters: 1) 925, 2) 382, 3) 548, 4) 1,193, 5) 804, 6) 1,503, 7) 2,025, 8) 786, 9) 2,399, 10) 570, 11) 801, 12) 1,141, 13) 2,439, 14) 1,471, 15) 681, 16) 1,702, 17) 2,765.

PacBio HiFi data generated for E. malabaricus genome assembly based on three SMRT cells.

Additional genome assembly, scaffolding and annotation statistics.

Detailed repeat annotation results using the DFAM repeat database.

Differentially expressed, upregulated and downregulated genes between consecutive time points.

Origin of protein sequences used for genome annotation in braker2.

Supp. protocol S1: T3 and T4 measurements

ASKA Pharmaceutical Medical Co., Ltd., 5-36-1 Shimosakunobe, Kawasaki Takatsu-ku, Kanagawa 213-8522, Japan

Before sample preparation, a whole fish body was homogenized in distilled water by a ball mill (ShakeMaster® NEO) with stainless beads. The homogenate sample was transferred to a polypropylene (PP) tube and spiked with isotope-labelled internal standards solution containing 13C6-T4, 13C6-T3, and 13C6-rT3. The homogenate sample was denatured with acetonitrile and then, equilibrated for 30 min at room temperature on dark

After equilibration, the sample was centrifuged at 3000 rpm for 3 min, and then the supernatant was decanted into a new PP tube which was added distilled water. The sample was applied to an Oasis MCX cartridge which had been successively conditioned with methanol, distilled water, and 1% acetic acid solution. After the cartridge was washed with distilled water followed by methanol, the thyroid hormones were eluted with methanol/distilled water/ammonia solution (70:30:1,v/v/v). After the sample was evaporated to dryness, the residue was dissolved with methanol/distilled water pyridine solution (40:60:1, v/v/v). The sample was subjected to an LC-MS/MS system for determination of T4, T3 and rT3. The SRM transitions were m/z 777.8/731.6 for T4, 651.9/605.8 for T3 and 651.9/507.7 for rT3. The measurement ranges were 4–4000 pg/tube for T4 and 0.5–500 pg/tube for both T3 and rT3. The limits of quantification were 4 pg/tube for T4 and 0.5 pg/tube for both T3 and rT3..

Supp. protocol S2: Cortisol measurements

ASKA Pharmaceutical Medical Co., Ltd., 5-36-1 Shimosakunobe, Kawasaki Takatsu-ku, Kanagawa 213-8522, Japan

Before sample preparation, a whole fish body was homogenized in distilled water by a ball mill (ShakeMaster® NEO) with stainless beads. The homogenate sample was transferred to a glass tube and spiked with isotope-labelled internal standard solution containing F-d4. F was extracted with 4 mL of methyl tert-butyl ether.

After the organic layer was evaporated to dryness, the extract was dissolved in 0.5 mL of methanol and diluted with 1 mL of distilled water. The sample was applied to OASIS MAX cartridge which had been successively conditioned with 3 mL of methanol and 3 mL of distilled water. After the cartridge was washed with 1 mL of distilled water, 1 mL of methanol/distilled water/acetic acid (45:55:1,v/v/v), and 1 mL of 1% pyridine solution, the F was eluted with 1 mL of methanol/pyridine (100:1,v/v).

After evaporation, the residue was reacted with 50 μf of mixed solution (80 mg of 2-methyl-6-nitrobenzoic anhydride, 20 mg of 4-dimethylaminopyridine, 40 mg of picolinic acid and 10 μl. of triethylamine in 1 mL of acetonitrile) for 30 min. at room temperature. After the reaction, the sample was dissolved in 0.5 mL of ethyl acetate/hexane/acetic acid (15:35:1, v/v) and the mixture was applied to HyperSep SI cartridge which had been successively conditioned with 3 mL of acetone and 3 mL of hexane. The cartridge was washed with 1 mL of hexane, and 2 mL of ethyl acetate/hexane (3:7, v/v). Steroids was eluted with 2.5 mL of acetone/hexane (7:3, v/v). After evaporation, the residue was dissolved in 0.1 mL of acetonitrile/distilled water (2:3, v/v) and the solution was subjected to a LC-MS/MS. The SRM transitions was m/z 468.2/309.2 for F and 472.2/454.3 for F-d4. The measurement ranges was 10–100000 pg/tube. The limits of quantification of F was 10 pg/tube.

Supp. protocol S3: HiC scaffolding protocol used by Phase Genomics

Chromatin conformation capture data was generated using a Phase Genomics (Seattle, WA) Proximo Hi-C 4.0 Kit, which is a commercially available version of the Hi-C protocol [1]. Following the manufacturer’s instructions for the kit, intact cells from two samples were crosslinked using a formaldehyde solution, digested using the DPNII restriction enzyme, and proximity ligated with biotinylated nucleotides to create chimeric molecules composed of fragments from different regions of the genome that were physically proximal in vivo, but not necessarily genomically proximal. Continuing with the manufacturer’s protocol, molecules were pulled down with streptavidin beads and processed into an Illumina-compatible sequencing library. Sequencing was performed on an Illumina HiSeq 4000, generating a total of 159,606,973 read pairs.

Reads were aligned to the draft assembly also following the manufacturer’s recommendations [2]. Briefly, reads were aligned using BWA-MEM[3] with the -5SP and -t 8 options specified, and all other options default. SAMBLASTER[4] was used to flag PCR duplicates, which were later excluded from analysis. Alignments were then filtered with samtools[5] using the -F 2304 filtering flag to remove non-primary and secondary alignments. FALCON-Phase[6] was used to correct likely phase switching errors in the primary contigs and alternate haplotigs from FALCON-Unzip and output its results in pseudohap format, creating one complete set of contigs for each phase.

Phase Genomics’ Proximo Hi-C genome scaffolding platform was used to create chromosome-scale scaffolds from FALCON-Phase’s phase 0 assembly, following the same single-phase scaffolding procedure described in Bickhart et al. [7]. As in the LACHESIS method [8], this process computes a contact frequency matrix from the aligned Hi-C read pairs, normalized by the number of DPNII restriction sites (GATC) on each contig, and constructs scaffolds in such a way as to optimize expected contact frequency and other statistical patterns in Hi-C data.

Juicebox[9,10] was then used to correct scaffolding errors, and FALCON-Phase was run a second time to detect and correct phase switching errors that were not detectable at the contig level, but which were detectable at the chromosome-scaffold level.

Metadata generated by FALCON-Phase about scaffold phasing was used to generate matching. assembly files (a file format used by Juicebox) and subsequently used to produce a diploid, fully-phased, chromosome-scale set of scaffolds using a purpose-built script[11]. In these final scaffolds, phase 0 included 24 scaffolds spanning 1,027,591,625 bp (92.82% of input) with a scaffold N50 of 43,313,630 bp, and phase 1 included 24 scaffolds spanning 1,019,278,383 bp (91.98% of input) with a scaffold N50 of 43,164,609 bp.