Chromosome-scale genome assembly of the European common cuttlefish Sepia officinalis

Simone Rencken; Georgi Tushev; David Hain; Elena Ciirdaeva; Oleg Simakov; Gilles Laurent

doi:10.7554/eLife.107393.2

eLife Assessment

This manuscript reports a high-quality genome assembly of the European cuttlefish, Sepia officinalis, a representative species of the Cephalopod lineage. This solid work relies on current best practices in genome sequencing and assembly, combining PacBio HiFi long reads and Hi-C chromatin conformation capture, and on state-of-the-art comparative genomic analyses, including chromosome number evolution and analyses of expanded gene families. The resulting genome will be a valuable resource for researchers interested in cuttlefish biology and comparative genomics in general.

https://doi.org/10.7554/eLife.107393.2.sa4

Significance of findings

valuable: Findings that have theoretical or practical implications for a subfield

Strength of evidence

solid: Methods, data and analyses broadly support the claims with only minor weaknesses

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

Coleoid cephalopods, a subclass of mollusks that includes octopuses, cuttlefish and squid, exhibit sophisticated biological features such as dynamic and neurally driven camouflage behavior, inter-individual communication, single-lens camera-like eyes, the largest brains among invertebrates and a distinctive embryonic development.

The common cuttlefish Sepia officinalis has served as a model organism in various research fields, spanning biophysics, neurobiology, behavior, evolution, ecology and biomechanics.

More recently, it has become a model to investigate the neural mechanisms underlying cephalopod camouflage, using quantitative behavioral approaches alongside molecular techniques to characterize the identity, evolution and development of neuronal cell types.

Despite significant interest in this animal, a high-quality, annotated genome of this species is still lacking. To address this, we sequenced and assembled a chromosome-scale genome for S. officinalis. Our assembly spans 5.68 billion base pairs and comprises 1n=47 repeat-rich chromosome scaffolds. This was unexpected because the haploid karyotypes of other decapods indicate 46 chromosomes. Detailed comparisons of our data to those from published decapod genome assemblies and to another recent genome assembly of S. officinalis (itself suggesting 1n=49 chromosomes) in fact revealed clear homologies between 46 scaffolds across all the datasets. The discrepancies between datasets are explained by highly repetitive regions that impair proper read alignment. We conclude that the true karyotype of S. officinalis is probably 1n=46 chromosomes, a likely ancestral and, if true, conserved decapod karyotype.

Our results include a comprehensive gene annotation and full-length transcript prediction, which we used to characterize orthologous gene families across mollusks. We identified several large-scale expansions specific to cephalopods, with many genes specific to neural or non-neural tissues of adult S. officinalis. In summary, this genome should provide a valuable resource for future research on the evolution, brain organization, information processing, development and behavior in this important clade.

Introduction

Coleoid cephalopods (octopus, squid, cuttlefish) are a highly derived group of mollusks, characterized by the largest nervous systems among all invertebrates (ca. 500 million neurons in adult octopus of which 200 million are in the central brain^1,2, compared to ca. 140,000 in the fruit fly³ or 70 million in the mouse⁴) and specializations with a great historical importance for neuroscience (e.g., “giant axons”⁵ and “giant synapses”^6–8). These animals exhibit very sophisticated behaviors such as dynamic camouflage^9–15, learning^16–18, social communication^19–21 and hunting^22–24 as well as two-stage sleep^25–27. Because of these characteristics, cephalopods have been the focus of many fundamental studies by biologists, biophysicists and physiologists over the past century^28–37. Thanks to advances in sequencing technologies, coleoid cephalopods are now also emerging as animal models in molecular neurobiology, evolution and genomics³⁸. Recent studies examined cephalopod biology from the perspectives of single-cell gene expression^39–42, genome topology and gene regulation^43,44.

Despite recent technical progress in genomics, the genomes of cephalopods remain challenging to assemble because of their large sizes and high repeat fractions. While the genomes of Octopodiformes (Octopus, Eledone, Argonauta) are either smaller than (1.1 Gigabases or Gb⁴⁵) or comparable in size to that of humans (around 3 Gb^46,47), the typical genomes of Decapodiformes (squids and cuttlefish) often reach 6 Gb^48,49. The biggest contributing factors to this genome expansion are transposable elements (TEs), with different TE classes differentiating squid and cuttlefish genomes from those of octopods⁵⁰.

Besides such differences in genome sizes, karyotypes in coleoid cephalopods are also unusual when compared to those of other invertebrates, including other mollusks: for example, haploid chromosome numbers are around 30 in octopuses and around 46 in squids and cuttlefish^49,51, while they are between 8 and 19 in non-cephalopod mollusks^52–59. Karyotype reconstruction and validation techniques in animals with such large genomes are also not as developed as they are for more familiar invertebrate and vertebrate species.

The common cuttlefish Sepia officinalis has been used as a model organism for studies in neural control and development^60,61, behavior^9,17, evolution⁶², environmental studies^63,64 and biomechanics^65,66. Found in the Mediterranean, the English Channel and the European coastal Atlantic, it can produce large egg clutches containing hundreds of embryos, several times throughout its semelparous reproductive cycle^67,68. More recently, this species has emerged as an important model to study the neural basis of cephalopod camouflage^69,70 building on extensive behavioral studies^9–15 and neuronal cell type characterization, evolution and development^71–73.

Despite a widespread interest in this animal, including the recent sequencing of several cephalopod species as part of the Aquatic Symbiosis Genomics project⁷⁴, a detailed, annotated genomic resource for this species is still lacking. We describe here the sequencing, assembly, and annotation of the Sepia officinalis genome.

Our data provided initial evidence for the existence of 1n=47 chromosomes. We compared them with available genome assemblies (i) from another S. officinalis individual by the Darwin Tree of Life project (DToL)⁷⁵, itself suggesting 49 chromosomes, and (ii) from other decapod species such as Euprymna scolopes, Doryteuthis pealeii and Acanthosepion esculentum, each with estimates of 46 chromosomes⁴⁹. When compared with the assemblies generated for the other species, both S. officinalis assemblies contained additional and independent chromosome splits. We thus investigated the repeat content and the alignment of raw data at the discrepant chromosome junctions to assess whether these differences were of a technical or biological nature.

Results

We sequenced the genome of a 6-month-old Sepia officinalis male individual, reared from eggs collected on the Portuguese Atlantic coast. The DNA was extracted from brain tissue and sequenced using long-read (PacBio HiFi) and chromatin conformation (Hi-C) methods (Figure 1A,B). The sex of the animal was confirmed by qPCR following a recently described protocol⁷⁶.

*Sepia officinalis* assembly statistics and quality control.
A) Specimen of *S. officinalis* (credit: Stephan Junek, MPI for Brain Research). B) Overview of the genome assembly workflow. Genome size was estimated from short DNA reads (Illumina) using GenomeScope^77,78. The primary assembly was generated from long DNA reads (PacBio Sequel II) and chromatin conformation capture (Hi-C) reads (Dovetail OmniC) with hifiasm¹⁶⁷. The assembly was scaffolded with YAHS⁸² and residual small scaffolds were manually placed in chromosomes. C) Snail plot of the chromosome-scale *S. officinalis* assembly generated using blobtools2¹⁹¹, showing scaffold statistics (e.g. number of scaffolds, scaffold N50 length), base composition and completeness measured using Benchmarking Universal Single-Copy Orthologs (BUSCO)⁸⁰ against the *metazoa_odb12* database. D) Hi-C heatmap showing the 47 chromosome-scale scaffolds, with few sequences remaining in unplaced scaffolds. The x- and y-axes show the genome position in Mbp. The heatmap was generated using juicebox⁸³; 0-7039 observed counts (balanced) are shown.

Genome size and heterozygosity

The haploid genome size of S. officinalis was estimated to be ∼5.14 Gb based on k-mer estimation from the short-read data, with a high repeat content of 54 %^77,78. The heterozygosity rate was estimated to be 1.03 %, which is higher than in octopus genomes, yet moderate among marine invertebrates^46,47,49.

The size of the scaffolded assembly was 5.68 Gb (Figure 1C), about 10% greater than our initial GenomeScope estimate. Indeed, the sizes of most published metazoan genome assemblies deviate by more than 10% from estimates, with proportionally greater deviations as assembly size increases⁷⁹.
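The size difference quoted above follows directly from the two reported values. A minimal arithmetic check is sketched here; the function and variable names are illustrative and are not taken from the assembly pipeline.

```python
# Sizes reported in the text, in gigabases (Gb).
assembly_gb = 5.68        # scaffolded assembly
kmer_estimate_gb = 5.14   # k-mer based GenomeScope estimate


def relative_excess(assembly: float, estimate: float) -> float:
    """Fractional excess of the assembly size over the k-mer estimate."""
    return (assembly - estimate) / estimate


excess = relative_excess(assembly_gb, kmer_estimate_gb)
print(f"Assembly exceeds the estimate by {excess:.1%}")
```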

Completeness of the genome was assessed using Benchmarking Universal Single-Copy Orthologs (BUSCO⁸⁰) with metazoa_odb12. It was 94.3% [single copy: 93.2%, duplicated: 1.2%], with 4.5% fragmented and 1.2% missing BUSCOs. The BUSCO score with mollusca_odb12 for the final assembly was 90.2% completeness [single copy: 89.1%, duplicated: 1.1%] with 4.9% fragmented and 4.9% missing BUSCOs.

Assembly-based karyotype

The assembly was generated from PacBio HiFi reads and chromosome conformation capture (Hi-C) reads with ca. 23-fold and 11-fold coverage, respectively. Scaffolding was performed using Hi-C reads, placing 95.8 % of bases in 47 scaffolds. The Hi-C heatmap in Figure 1D shows the chromatin contacts in clusters corresponding to these 47 scaffolds, suggesting that the 47 clusters each correspond to a single chromosome.

To further test the quality of our assembled S. officinalis karyotype, we used several scaffolding programs and manually curated the scaffolds using two independent approaches. The first rested on HapHiC⁸¹, a tool based on Hi-C data that uses allele information from primary assembly programs, and scaffolds assemblies with the constraint of an expected number of chromosomes. The resulting Hi-C contact maps with an input of 46, 47, 48, 49 and 50 (Supplementary Figure 1) supported 47 scaffolds: an input of 46 scaffolds resulted in one clear chromosome merger (blue arrowhead); higher input numbers resulted in false chromosome splits (black arrowheads, 48, 49 and 50 scaffolds).

Second, we manually curated the scaffolds (obtained with YAHS⁸²) using JBAT⁸³. These two approaches, conducted independently and both based on Hi-C signal, again converged on 47 chromosomal scaffolds.

Karyotype comparisons with other Decapods

Several chromosome-scale genome assemblies of decapod cephalopods have been published recently. From those studies, a haploid karyotype of 46 chromosomes seems to be common and conserved among Decapodiformes^42,49,84 (Figure 3A). Because our estimate for S. officinalis differed from this number, we sought to investigate it further.

Besides assembly-based karyotypes, several cephalopod species have been investigated using cytogenetic techniques. These studies, however, report widely varying estimates of chromosome numbers for decapods, reflecting the difficulty of resolving large (diploid) chromosome numbers in situ. For example, 1n=46 chromosomes has been reported for two species of cuttlefish (Acanthosepion esculentum and Acanthosepion lycidas) and three loliginid squids⁸⁵; while 1n=34 chromosomes has been reported for Aurosepina arabica⁸⁶ and 1n=24 chromosomes in Acanthosepion pharaonis⁸⁷. In Sepia officinalis, a karyotype of 1n=52 has even been described from testis samples⁸⁸.

Comparison with another chromosome-scale Sepia officinalis assembly

A chromosome-scale assembly for Sepia officinalis was released recently by the Wellcome Sanger Institute’s Darwin Tree of Life project⁷⁵ (DToL, GCA_964300435.1). That genome was assembled from a male individual using high coverage PacBio Sequel II (∼51x) and Arima2 Hi-C (∼80x) data, with a final assembly size of 5.8 Gb. The haploid chromosome number was estimated to be 49. To compare both S. officinalis datasets directly, we downloaded the DToL data and created two new assemblies (one per dataset) using the pipeline described above (hifiasm using PacBio HiFi and Hi-C data). The resulting assemblies were overall very similar, with the DToL assembly having a slightly higher contiguity (N50 length, see Table 1) and BUSCO completeness (Supplementary Figure 2A,B), owing to its higher sequencing coverage.
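For readers less familiar with the contiguity metric used in this comparison, the N50 length can be computed from a list of scaffold lengths as sketched below. The toy lengths are hypothetical and do not correspond to either assembly.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

Because N50 weights sequences by their length, it is not the median scaffold length: in the toy example the median is 40 while the N50 is larger.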

Statistics of *S. officinalis* assemblies from two independent datasets, assembled using a common pipeline.

After scaffolding with YAHS, both datasets reached the previously identified chromosome numbers (1n=47 for MPIBR and 1n=49 for DToL, Figure 2A,B). To further investigate this surprising discrepancy, we aligned both assemblies using Winnowmap^89,90 to locate the differences between them (Figure 2C). We observed four “breakpoints” (BP) of chromosome scaffolds: one in the MPIBR assembly compared to DToL (BP1: DToL_5 = MPIBR_40+44) and three in the DToL assembly compared to MPIBR (BP2: DToL_31+40 = MPIBR_2, BP3: DToL_41+46 = MPIBR_6, BP4: DToL_44+45 = MPIBR_7). We also aligned the assemblies to the chromosome-scale genome of another cuttlefish Acanthosepion esculentum (1n=46, GCA_964036315.1). In this alignment, all four breakpoints were collinear with single A. esculentum chromosomes (Figure 2D).
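The bookkeeping behind these breakpoints can be made explicit: every chromosome that appears as two scaffolds in one assembly inflates that assembly's count by one. The sketch below encodes the four breakpoints listed above and removes the extra scaffolds; the data structure and function name are illustrative only.

```python
# Scaffold counts after YAHS scaffolding, as reported in the text.
scaffold_counts = {"MPIBR": 47, "DToL": 49}

# For each breakpoint: the assembly in which the chromosome appears split,
# and the scaffolds that together match one scaffold in the other assembly.
breakpoints = {
    "BP1": ("MPIBR", ["MPIBR_40", "MPIBR_44"]),  # matches DToL_5
    "BP2": ("DToL", ["DToL_31", "DToL_40"]),     # matches MPIBR_2
    "BP3": ("DToL", ["DToL_41", "DToL_46"]),     # matches MPIBR_6
    "BP4": ("DToL", ["DToL_44", "DToL_45"]),     # matches MPIBR_7
}


def reconcile(counts, splits):
    """Subtract one scaffold per extra fragment of a split chromosome."""
    merged = dict(counts)
    for assembly, parts in splits.values():
        merged[assembly] -= len(parts) - 1
    return merged


reconciled = reconcile(scaffold_counts, breakpoints)
print(reconciled)  # {'MPIBR': 46, 'DToL': 46}
```

Joining the split scaffolds therefore brings both assemblies to the same count of 46.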

Comparison of two *Sepia officinalis* chromosome-scale assemblies indicates chromosome number of 1n=46.
Datasets were collected from two *S. officinalis* animals, one as described in this study (MPIBR), the second by the Darwin Tree of Life consortium (DToL)⁷⁵. Both datasets were assembled using a common pipeline (hifiasm and YAHS). A) Hi-C contact map of the MPIBR primary assembly, scaffolded using YAHS without manual curation. The 47 assembled chromosome scaffolds are shown as blue boxes. B) Hi-C contact map of the DToL primary assembly, scaffolded using YAHS without manual curation, showing 49 assembled chromosome scaffolds as blue boxes. C) Whole-genome alignment of both scaffolded assemblies using Winnowmap2⁸⁹, showing DToL on the x-axis and MPIBR on the y-axis. The 4 “breakpoints” of chromosomes in either of the assemblies (3 breaks in DToL chromosomes compared to MPIBR, 1 break in MPIBR compared to DToL) are highlighted in different colors. D) Ribbon diagram showing the four breakpoints from C) compared to the chromosome-scale assembly from another cuttlefish, *Acanthosepion esculentum* (1n=46). The colors of the breakpoints are the same in panels C and D.

Syntenic comparison of three decapod species.
A) Taxonomy of selected cephalopod species showing their genome size (in gigabases, Gb) and haploid chromosome numbers. Taxonomy information was downloaded from NCBI taxonomy browser, divergence times for Coleoidea and Decapodiformes from¹⁹² and for Sepiidae from⁹⁴. B) Genome-wide syntenic relationship between chromosomes of *E. scolopes*⁴⁹ (top), *D. pealeii*⁴⁹ (middle) and *S. officinalis* (bottom). Colored braids connect syntenic regions across genomes, with chromosomes drawn to physical scale. *Euprymna* chromosomes 45 and 46 are not shown because they contain too few orthogroups. C) Detailed synteny of *Sepia* chromosomes 40 (magenta) and 43 (dark blue), which are joined in the other species and account for the different haploid chromosome number in *Sepia*. Riparian plots were generated using GENESPACE v1.2.3¹⁷⁵.

To better understand the potential cause of these divergent chromosome numbers, we analyzed the Hi-C and HiFi coverage in the breakpoint regions (Supplementary Figure 3A). First, we aligned the HiFi reads to the scaffolds and extracted all alignments along the 200 kb terminal scaffold windows to find any notable drops in coverage, or reads spanning any of the scaffold junctions. We detected no spanning reads. This is not surprising given that no contigs were assembled at these sites, resulting in the observed scaffold junctions. More interestingly, we noted a ∼5-fold decrease in HiFi coverage along the DToL scaffold_40 (part of BP2) relative to its flanking regions, indicating a highly repetitive, low-mappability region at this boundary.

Next, we realigned the Hi-C data to the scaffolded assemblies using bwa-mem2⁹¹ and extracted all trans Hi-C pairs (between-scaffold contacts) using pairtools⁹². We normalized trans Hi-C contacts to the scaffold length and compared contact rates between breakpoint scaffolds to the baseline contact rate (computed from pairs of scaffolds with a clear 1-to-1 match between assemblies), and the contact rate within scaffolds (intra-scaffold pairs) (Supplementary Figure 3B,C). The contact rates within breakpoints were consistently lower than within scaffolds, likely falling below the threshold to be merged during assembly.
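A length-normalized trans contact rate of the kind described above could be computed as in the following sketch. The text does not give the exact normalization, so dividing by the product of the two scaffold lengths is an assumption made here for illustration; the input is taken to be a plain list of (scaffold, scaffold) name pairs, and all names and numbers are hypothetical.

```python
from collections import Counter


def trans_contact_rates(pairs, scaffold_lengths):
    """Count between-scaffold Hi-C pairs and normalize them by scaffold length.

    ASSUMPTION: rate = contacts / (length_a * length_b). The study only states
    that contacts were normalized to scaffold length.
    """
    counts = Counter()
    for scaf_a, scaf_b in pairs:
        if scaf_a == scaf_b:
            continue  # within-scaffold (cis) pair, not a trans contact
        key = tuple(sorted((scaf_a, scaf_b)))  # orientation-independent key
        counts[key] += 1
    return {
        key: n / (scaffold_lengths[key[0]] * scaffold_lengths[key[1]])
        for key, n in counts.items()
    }


# Hypothetical toy input: scaffold names and lengths are made up.
toy_pairs = [("s1", "s2"), ("s2", "s1"), ("s1", "s1"), ("s1", "s3")]
toy_lengths = {"s1": 10, "s2": 20, "s3": 5}
toy_rates = trans_contact_rates(toy_pairs, toy_lengths)
print(toy_rates)
```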

However, the contact rates at three of four breakpoints (BP1, BP3, BP4) were significantly elevated above the genome-wide background distribution (empirical p = 0.010, 0.005, 0.005 respectively), suggesting that they may represent intra-chromosomal contacts disrupted by a misassembly. Notably, BP2 was not significant (empirical p = 0.170), likely due to the low coverage and mappability around the DToL scaffold_40 boundary. Considered jointly, the three DToL breakpoint scaffold pairs showed significantly higher trans contact rates than the background (Wilcoxon rank-sum, one-tailed, U = 1771, p = 0.004).
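An empirical p-value of the kind reported above compares one observed contact rate with a background distribution. The sketch below shows one common one-sided estimator with a pseudocount; whether the study used this exact estimator is not stated, so it should be read as an assumption, and the background values are invented for illustration.

```python
def empirical_p(observed, background):
    """One-sided empirical p-value: share of background values that are at
    least as large as the observed value.

    ASSUMPTION: a +1 pseudocount is applied so that the estimate is never
    exactly zero; the study may have used a different convention.
    """
    exceed = sum(1 for value in background if value >= observed)
    return (exceed + 1) / (len(background) + 1)


# Hypothetical background contact rates (arbitrary units).
toy_background = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
toy_p = empirical_p(0.85, toy_background)
print(toy_p)
```

With this estimator the smallest attainable p-value is 1 / (n + 1) for n background values, so the resolution of the test is set by the number of background scaffold pairs.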

Lastly, we analyzed the repeat landscape around the 200 kb scaffold ends using RepeatMasker⁹³ and the custom repeat library that we had generated for Sepia officinalis (described further below). Compared to control scaffolds of the same assembly, we observed consistently elevated repeat content at the breakpoint junctions (mean 71.5% vs 67.6% masked bases), with an enrichment of unclassified repeats (32.1% vs 30.0%), which could explain a repeat-driven assembly fragmentation or scaffolding failure. The BP2 DToL scaffold_40 junction window was 99.99% masked (99.2% unclassified repeats), providing a likely mechanistic explanation for both the HiFi coverage drop and the absence of a significant trans Hi-C signal at this breakpoint. Taken together, these analyses suggest that the different chromosome numbers across the two S. officinalis assemblies are due to technical reasons, caused by repeat-rich scaffold boundaries that impair HiFi and Hi-C read alignment and in turn, correct assembly in these regions.
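The masked fraction of a terminal scaffold window can be illustrated as follows. The sketch assumes a soft-masked sequence in which repeats are written in lowercase; the study does not state how masked bases were counted, so this representation, the function name and the toy sequence are all assumptions for illustration.

```python
def terminal_masked_fraction(sequence, window=200_000, end="right"):
    """Fraction of soft-masked (lowercase) bases in a terminal window.

    ASSUMPTION: repeats are encoded as lowercase letters (soft-masking).
    If the sequence is shorter than the window, the whole sequence is used.
    """
    if not sequence:
        raise ValueError("sequence must be non-empty")
    if end not in ("left", "right"):
        raise ValueError("end must be 'left' or 'right'")
    region = sequence[-window:] if end == "right" else sequence[:window]
    masked = sum(1 for base in region if base.islower())
    return masked / len(region)


# Toy sequence with an 8-base window instead of 200 kb.
toy_scaffold = "ACGTacgtacGT"
right_fraction = terminal_masked_fraction(toy_scaffold, window=8, end="right")
left_fraction = terminal_masked_fraction(toy_scaffold, window=8, end="left")
print(right_fraction, left_fraction)
```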

Comparisons with other chromosome-scale Decapodiformes genomes

To further investigate why our initial estimate of the chromosome number of Sepia officinalis differed from those in other studies, we compared our assembly with the chromosome-scale genomes of two other decapod cephalopods, Euprymna scolopes⁴³ (a sepiolid) and Doryteuthis pealeii⁴⁹ (a loliginid), both described as having 46 chromosomes (Figure 3A).

Using orthogroups to perform linkage analysis, we could detect a clear chromosome homology between Doryteuthis and Sepia, and to a lesser extent between Euprymna and Sepia (Figure 3B). We observed some small-scale rearrangements between Euprymna and Doryteuthis, such as a fusion of chromosomes 24 and 40 from Doryteuthis in chromosome 2 of Euprymna, which has also been observed in other Sepiolids^42,84. Dotplots showing detailed pairwise synteny comparisons are shown in Supplementary Figure 4 (Sepia to Doryteuthis) and Supplementary Figure 5 (Sepia to Euprymna). The species tree inferred as part of this analysis places Euprymna as sister to Sepia and Doryteuthis, matching recent phylogenetic analyses^53,94 (note, however, that this tree was constructed without the inclusion of outgroup taxa, and therefore lacks a reliable root).

This comparison revealed that chromosomes 40 and 43 in Sepia are merged into one in Euprymna and in Doryteuthis (Figure 3C), as they were in the DToL Sepia officinalis assembly (Figure 2C). Together with the whole-genome alignments between the cuttlefish assemblies described earlier (Figure 2D), these results lead to the following parsimonious conclusion: that the karyotype for S. officinalis is 1n=46 chromosomes, as it is for other Decapodiformes.

We also performed a synteny comparison including the cuttlefish A. esculentum that was recently annotated by genome liftover from Acanthosepion pharaonis⁹⁵. That assembly, constructed from a female individual, showed a low read coverage for a sex chromosome in a ZZ/Z0 sex determination system⁹⁶. Our syntenic comparison indicated a strong homology between the inferred Z chromosome of A. esculentum and chromosome 46 of S. officinalis (Supplementary Figure 6A). As indicated above, we determined the sex of our S. officinalis specimen by replicating the analysis used to identify the Z chromosome in A. esculentum⁹⁶. For this, we aligned short reads (Illumina) from the same S. officinalis individual to the assembly and examined the normalized read coverage for each chromosome (Supplementary Figure 6B). In contrast to the low coverage observed in the female A. esculentum assembly (Supplementary Figure 6C), we observed no significant decrease in read coverage for chromosome 46, suggesting that our material came from a male animal. Additionally, we used a recently published genotyping protocol for cephalopods⁷⁶ and performed qPCR on extracted genomic DNA from tissue samples, confirming the male sex of our sequenced individual.
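The logic of the coverage-based sex assignment can be sketched as follows: in a ZZ/Z0 system, a female carries a single Z copy, so its normalized coverage is expected to be near half that of the autosomes, whereas a male (ZZ) shows no such drop. Normalizing to the median depth across chromosomes is an assumption made here, and the chromosome labels and numbers are hypothetical.

```python
from statistics import median


def normalized_coverage(mapped_bases, chrom_lengths):
    """Per-chromosome mean depth divided by the median depth across chromosomes.

    ASSUMPTION: median normalization; the study states only that normalized
    read coverage was examined for each chromosome.
    """
    depth = {c: mapped_bases[c] / chrom_lengths[c] for c in chrom_lengths}
    reference = median(depth.values())
    return {c: d / reference for c, d in depth.items()}


# Hypothetical toy values: chr46 has half the depth, as expected for the
# single Z chromosome of a Z0 female.
toy_chrom_lengths = {"chr1": 100, "chr2": 200, "chr46": 100}
toy_mapped_bases = {"chr1": 3000, "chr2": 6000, "chr46": 1500}
toy_coverage = normalized_coverage(toy_mapped_bases, toy_chrom_lengths)
print(toy_coverage)
```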

Genome repeat landscape

After creating a custom repeat library for Sepia officinalis using RepeatModeler⁹⁷, we masked the genome using RepeatMasker⁹³, resulting in 71.17% masked bases. The categories of repeats are shown in Figure 4A. Most repeats were not characterized (39.65% of total bp) and presumably represent ancient repeats that diverged beyond recognition⁹⁸.

Genome annotation for *Sepia officinalis*.
A) Annotation of repeat landscape of the *S. officinalis* genome, annotated using RepeatModeler⁹⁷. Full repeat landscape is shown on the left, annotated repeats (excluding unclassified or simple repeats) are shown on the right. **B-C)** Quality control of gene annotation and comparison to two other cuttlefish species using OMArk¹¹⁷. Results shown for *Acanthosepion lycidas* (GCA_963932145.1, Ensembl Genebuild), *Sepia officinalis* (BRAKER, this study) and *Acanthosepion pharaonis*¹⁹³ (BRAKER). Lophotrochozoa was used as the ancestral clade. B) Completeness assessed by the presence of genes conserved in the clade, classified as *single* or multiple copies (*duplicated*), or *missing*. C) Consistency assessed by the proportion of proteins placed in the correct lineage (*consistent*); placement in incorrect lineages randomly (*inconsistent*) or to specific species (*contamination*), or no placement in known gene families (*unknown*). D) Phylogenetic tree of 13 molluscan species used for analysis of gene families with Orthofinder¹²². Species are colored by clade: purple = coleoid cephalopods, blue = nautiloid (non-coleoid cephalopod), green = non-cephalopod mollusk. E) Heatmap of largest gene families (orthogroups from Orthofinder, with more than 100 genes in any species), ordered from largest gene count across all species on the left. Families with at least one gene in *S. officinalis* are depicted. Rows show gene counts for each species (color capped at 500 genes), columns show orthogroups and their annotation by eggNOG mapper^120,121 or InterProScan¹¹⁹, if available. Clade colors match D).

Retroelements constituted the largest characterized repeat category (17.32%) followed by DNA transposons (5.92%); 5.83% were annotated as simple repeats. As observed in other Decapodiformes, the Sepia genome contained almost no short interspersed nuclear elements (SINEs), supporting the hypothesis that the SINE expansion observed in octopuses occurred independently in their lineage⁵⁰.

Gene modeling and annotation

The genome was further annotated using BRAKER^80,99–114 combining short- and long-read RNA-seq data and publicly available protein data from multiple molluscan species (Doryteuthis pealeii⁴⁹, Euprymna scolopes¹¹⁵, Octopus bimaculoides⁴⁹, Octopus vulgaris⁴⁷, Nautilus pompilius¹¹⁶, and Pecten maximus⁵⁴). A total of 18,663 gene models and 23,768 proteins were annotated.

The gene annotation was evaluated using BUSCO⁸⁰ v5.5.0 in protein mode, showing high completeness of the annotation (metazoa_odb10: C:98.2%[S:77.3%,D:20.9%], F:1.0%, M:0.8%, n:954 and mollusca_odb10: C:81.5%[S:59.6%,D:21.9%], F:0.9%, M:17.6%, n:5295). We further checked the completeness and consistency of our gene models using OMArk¹¹⁷, and compared them to genome annotations for two other cuttlefish species, Acanthosepion pharaonis and Acanthosepion lycidas (Figure 4B). Completeness was assessed by comparing the annotated genes to conserved orthologs (present in 80% of extant species) in a given taxonomic clade (here the superorder Lophotrochozoa). The S. officinalis annotation missed 181 out of 2373 genes, more than the A. lycidas annotation (96 genes) but fewer than the A. pharaonis annotation (458 genes).
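The compact BUSCO notation used above (C: complete, S: single copy, D: duplicated, F: fragmented, M: missing, n: number of BUSCO groups searched) can be unpacked with a short parser. This is an illustrative sketch and not part of the BUSCO software; the function name is hypothetical.

```python
import re


def parse_busco(summary):
    """Parse a one-line BUSCO summary into a dict of category values.

    Percentages are returned as floats; 'n' is returned as an integer.
    """
    fields = {}
    for key, value in re.findall(r"([CSDFMn]):(\d+(?:\.\d+)?)", summary):
        fields[key] = int(value) if key == "n" else float(value)
    return fields


# Summary string for metazoa_odb10 as reported in the text.
metazoa = parse_busco("C:98.2%[S:77.3%,D:20.9%], F:1.0%, M:0.8%, n:954")
print(metazoa)

# Single-copy and duplicated BUSCOs together make up the complete fraction.
complete_from_parts = metazoa["S"] + metazoa["D"]
```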

In a consistency assessment, where the presence of known gene families from the lineage is evaluated (Figure 4C), S. officinalis contains low proportions of taxonomically inconsistent or fragmented proteins (ca. 13%) similar to A. lycidas (9%), giving confidence in the annotation. In contrast, more than 50% of A. pharaonis proteins are labeled as inconsistent or fragmented, which could indicate annotation errors¹¹⁷.

Overall, the annotations contain different numbers of predicted proteins, which may reflect differences in the annotation method and the reference data used. The genome of S. officinalis contains fewer proteins (23,768) than those of A. lycidas (35,949) or A. pharaonis (53,515). In comparison, two octopus genomes contain similar numbers of proteins (O. vulgaris: 30,134; O. bimaculoides: 29,037) and were produced using NCBI’s RefSeq¹¹⁸ pipeline, suggesting that the differences observed across cuttlefish proteomes are probably technical rather than biological in origin.

Lastly, we assigned orthology information to the S. officinalis proteome using InterProScan¹¹⁹ and eggNOG-mapper¹²⁰ to aid the interpretability of the resource for future transcriptomic or proteomic studies. Overall, 89% of proteins (21,204 out of 23,768) received an annotation from InterProScan from at least one of their databases. 59% of proteins (14,126 out of 23,768) were annotated by eggNOG-mapper, reflecting the more stringent orthology filters and prioritization of full-length matches implemented in the program¹²¹.

Analysis of expanded gene families

We sought to investigate the S. officinalis gene annotation and place it in the context of gene repertoires from other cephalopod or molluscan species. First, we collected available genome annotations from 12 other molluscan species (Table 2) and clustered them using OrthoFinder v.3.1.0¹²², resulting in 23,658 orthogroups, hereafter named gene families.

Overview of gene annotation of 13 molluscan species used for gene family analysis.

We then investigated the 36 gene families that contain more than 100 genes in any of the species and thus reflect large-scale gene family expansions; 17 of these families contain at least one gene of S. officinalis (Figure 4E). We used the InterProScan and eggNOG-mapper annotations to infer functional roles of these genes, selecting the most common gene annotation as the name of the gene family.
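The selection of large gene families described here amounts to a two-step filter over a table of per-species gene counts. The sketch below uses hypothetical orthogroup identifiers, species labels and counts; only the threshold of more than 100 genes in any species and the requirement of at least one S. officinalis gene come from the text.

```python
def large_families(counts, threshold=100, focal="Sepia_officinalis"):
    """Return (all large families, large families with a focal-species gene).

    A family is 'large' if any species has more than `threshold` genes in it.
    """
    large = {
        og: per_species
        for og, per_species in counts.items()
        if max(per_species.values()) > threshold
    }
    with_focal = {
        og: per_species
        for og, per_species in large.items()
        if per_species.get(focal, 0) >= 1
    }
    return large, with_focal


# Hypothetical counts; identifiers and numbers are made up for illustration.
toy_counts = {
    "OG_a": {"Sepia_officinalis": 120, "Species_B": 40},
    "OG_b": {"Sepia_officinalis": 0, "Species_B": 300},
    "OG_c": {"Sepia_officinalis": 15, "Species_B": 20},
}
toy_large, toy_with_focal = large_families(toy_counts)
print(sorted(toy_large), sorted(toy_with_focal))
```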

The zinc finger C2H2-type transcription factors (TFs) were grouped into three of the large gene families, with the largest family (OG0000000) only present in decapod cephalopods. This likely reflects the largely independent expansions in the octopod and decapod lineages that date back to a burst of transposon activity ca. 25 million years ago^46,48,49. The largest expansion across mollusks occurs in the cadherin-like family (OG0000001): 310 in S. officinalis, 283 in D. pealeii, 209 in A. lycidas, 102 in O. vulgaris, 55 in O. bimaculoides, with low but non-zero counts in bivalves (C. virginica, M. gigas). This profile is consistent with the protocadherin expansion first described in O. bimaculoides⁴⁶ and subsequently shown to be present across cephalopods^48,49,123.

HPGDS (OG0000005, hematopoietic prostaglandin D synthase) is a glutathione-S-transferase family member that catalyzes the conversion of prostaglandins, which have well-described roles in immune responses in vertebrates and insects^124,125. This family shows a broad expansion in decapods, with a lesser expansion in octopods.

Additionally, members of the glutathione-S-transferase families have been co-opted as S-crystallins, structural proteins found in the lens of cephalopods that may, or may not, retain enzymatic functions^126,127.

Two large families are mostly lineage-restricted. The RING-type zinc finger family (OG0000058) has 103 copies in S. officinalis and 26 in A. lycidas but is absent in all other species except for E. scolopes. Conversely, OG0000002 (unknown function) has 479 copies in E. scolopes and only a few copies in the other species. This interesting Sepiolid-specific expansion warrants further characterization.

We estimated gene family evolution rates using CAFE5¹²⁸ for all families with fewer than 100 copies in any species (this excludes the families described above, as very large copy-number differences between species preclude likelihood calculations under the applied birth-death model). After comparing different model parameters, we chose a gamma model with three rate categories, allowing for evolutionary rate variation among gene families. Out of the 12,895 gene families analyzed, 1,813 showed a significant (p < 0.05) expansion or contraction in at least one of the species. We focused our analysis on the 30 most significantly expanded families; among them were several retrotransposon-associated domains that have expanded specifically in S. officinalis: five families carrying Retrovirus-related Pol polyprotein domains, two Reverse transcriptase domain families, and four Ribonuclease H-like families (Supplementary Figure 7A). There was no coordinate-based overlap of the coding sequences with annotated TEs from the RepeatMasker output (Methods).

In addition to the three large gene families of C2H2 zinc finger expansions, 45 gene families containing this TF type showed a significant change in the CAFE5 analysis. Notably, eight of the significant gene families, as well as four of the largest gene families, were annotated as CCHC-type zinc fingers, which contain a “zinc knuckle” motif that is characteristic of retroviral nucleocapsid proteins¹²⁹ and is functionally integrated in the genomes of several species, including humans¹³⁰.

Some gene families without any relationship to retrotransposons were also expanded. For example, the UGT2A1-related family is a UDP-glucuronosyltransferase, a class of enzymes central to phase II detoxification and conjugation of metabolites, reported in other mollusks in the context of environmental chemical tolerance¹³¹, and in insects in the context of pigmentation¹³². We also detected a family of homeodomain-like proteins, representing an expansion of this important TF family.

Tissue-specific expression of expanded gene families

To place the identified gene families in a functional context, we profiled their expression in the bulk RNA-seq data (taken from multiple tissues of S. officinalis) used originally for gene modeling (Figure 5A). Principal component analysis (PCA) revealed the largest axis of variation in gene expression to separate brain tissues from peripheral tissues, with skin being the most transcriptomically distinct (Figure 5A), consistent with the high number of tissue-specific differentially expressed (DE) genes identified in non-neural tissues (Figure 5B). We identified the genes belonging to expanded families that were differentially expressed across tissues and tested them for enrichment of gene ontology^133,134 (GO) terms to gain additional insight. The large families excluded from CAFE5 modelling and the significantly expanded families identified by CAFE5 were analyzed separately.

Expression of expanded gene families in tissue bulk RNA-seq data.
Bulk RNA-seq data collected from one adult S. officinalis from different brain tissues (optic lobes - yellow, basal lobes - turquoise, vertical and subvertical lobes - orange, posterior subesophageal mass - purple), retina (red) and skin (blue, from the dorsal mantle). Tissue color code is identical throughout the figure. A) Principal component analysis (PCA) of the data, showing the first 2 PCs, colored by tissue. B) Barplot showing number of differentially expressed (DE) genes (i.e. marker genes) for each tissue, calculated against all other tissues using DESeq2¹⁸⁷. C) Largest gene families (orthogroups) with differential expression in bulk RNA-seq data. Dot size shows the number of DE genes for each tissue. Families with enriched GO terms are highlighted in grey. D, E) Dotplots of enriched gene ontology^133,134 (GO) terms for large gene families, identified with clusterProfiler¹⁸⁹ using a hypergeometric test. Dot size shows the number of expressed genes per family with this GO term, x-axis shows the percentage of expressed genes from all genes with this GO term. Dot color shows the adjusted p-value after Benjamini-Hochberg false discovery rate (FDR) correction. CC: cellular component, MF: molecular function, BP: biological process. F) Heatmap of z-scored expression of all DE genes from the largest gene families with enriched GO terms.
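The enrichment statistics named in the legend (a hypergeometric test followed by Benjamini-Hochberg correction) can be sketched in plain Python as follows. This is an illustrative re-implementation under simplifying assumptions, not the clusterProfiler code used for the figure; the function names and the example p-values are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, universe, annotated, drawn):
    """P(X >= k): probability of seeing at least k annotated genes when
    `drawn` genes are sampled from a `universe` containing `annotated`
    genes that carry the GO term."""
    upper = min(annotated, drawn)
    total = comb(universe, drawn)
    hits = sum(
        comb(annotated, i) * comb(universe - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four GO terms
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

The upper tail is used because over-representation asks whether a term occurs at least as often as observed; the adjusted values are what the dot color in panels D and E would encode.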

Eleven of the largest gene families were expressed in our data (Figure 5C) and five had enriched GO terms (Figure 5D,E). Among them, the cadherin family showed brain-restricted expression and GO terms related to cell–cell adhesion and calcium binding, consistent with their role in neuronal connectivity and circuit formation^46,135. Two C2H2 zinc finger gene families were expressed in the optic and vertical/subvertical lobes of the brain and in the skin, with GO terms related to DNA-binding, transcriptional regulation or development. The RING-type zinc finger family was expressed specifically in the skin, with GO terms including zinc binding and ubiquitin protein ligase activity, the canonical function of RING-domain E3 ligases¹³⁶. Genes of the HPGDS/S-crystallin family were expressed in the brain (basal and optic lobes and posterior subesophageal mass) and skin, with GO terms related to glutathione metabolism, matching their described enzymatic function. We did not find expression in the retina, which is expected given that S-crystallins are expressed in lentigenic cells of the eye^42,137 and these cells were not included during sampling.

Among the 30 most significantly expanded families examined (out of 1,813 total), expression was widespread (20/30) and tissue-specific differential expression was common (17/30), suggesting that a substantial proportion of expanded paralogs represent functional coding sequences with specialized spatial deployment (Supplementary Figure 7B). Ten of the retrotransposon-associated families were differentially expressed in the brain (optic and vertical/subvertical lobes) and skin, arguing against these loci being inactive repeat fragments and supporting their inclusion as transcribed gene models. Two significantly expanded families showed both differential expression and enriched GO terms (Supplementary Figure 7C). The first was the UGT2A1-related family, which had the largest number of differentially expressed genes overall, with expression concentrated in the skin, retina and posterior subesophageal mass of the brain. Enriched GO terms matched the described enzymatic function for this family, namely UDP-glycosyltransferase activity. The second gene family was the homeodomain-like family with enrichment for DNA binding terms consistent with their role as transcription factors, and was preferentially expressed in the vertical and subvertical brain lobes with weaker expression in other areas.

Collectively, many differentially expressed genes from expanded families were restricted to specific tissues or brain subregions (Figure 5F and Supplementary Figure 7D), indicating that paralogs within an expanded family have adopted distinct spatial expression domains and possibly, specialized functions.

Discussion

This study presents a detailed, chromosome-scale genome assembly and annotation for the common cuttlefish Sepia officinalis, providing an additional important resource for comparative genomics, neurobiology, and evolutionary studies within cephalopods and mollusks. The assembly was derived from a combination of PacBio HiFi long reads, Dovetail Omni-C chromatin conformation capture, and multiple rounds of scaffolding and manual curation. The resulting assembly has a haploid genome size of 5.68 Gb, and contains 47 chromosome-like clusters. This number prompted us to perform detailed analyses of chromosome structure and homology across various decapod genomes, leading to a likely consensus of 46 chromosomes. This resource should be valuable for future efforts in cephalopod genomics, especially within the Decapodiformes, where large and repetitive genomes have hindered previous efforts.

Chromosome Number and Karyotypic Variation

Our assembly contains an estimated haploid chromosome number of 47, a result produced by two independent scaffolding approaches (YAHS⁸² and HapHiC⁸¹) and supported by Hi-C contact maps. This karyotype, however, differed from those reported for other decapod cephalopods, such as Euprymna scolopes and Doryteuthis pealeii⁴⁹, each with 46 chromosomes. Syntenic comparisons across datasets revealed that chromosomes 40 and 43 in S. officinalis correspond to a single chromosome in these other species, suggesting that the apparent split in Sepia may result from a technical artefact.
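The logic of this synteny-based check can be sketched in Python: each assembled chromosome is assigned to the reference chromosome it is collinear with, and any reference chromosome hit by more than one assembled chromosome marks a candidate split. The mapping below is a toy example with hypothetical reference labels; only the pairing of chromosomes 40 and 43 and the totals of 47 and 46 come from the text.

```python
from collections import defaultdict


def collapse_by_reference(best_hit):
    """Group query chromosomes by the reference chromosome they map to.

    Returns the number of distinct reference chromosomes and a dict of
    reference chromosomes that are covered by more than one query
    chromosome (candidate assembly splits)."""
    groups = defaultdict(list)
    for query_chrom, ref_chrom in best_hit.items():
        groups[ref_chrom].append(query_chrom)
    candidate_splits = {
        ref: sorted(queries) for ref, queries in groups.items() if len(queries) > 1
    }
    return len(groups), candidate_splits


# Toy assignment: 47 assembled chromosomes, one-to-one with hypothetical
# reference labels except that chromosomes 40 and 43 share one reference.
best_hit = {q: f"ref_{q}" for q in range(1, 48)}
best_hit[43] = "ref_40"

predicted_n, candidate_splits = collapse_by_reference(best_hit)
print(predicted_n, candidate_splits)
```

A real analysis would derive the assignment from syntenic blocks rather than a hand-written dictionary, and a shared reference chromosome could also reflect a genuine fission, which is why independent evidence is needed.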

We also compared our results to the recently published S. officinalis genome from the Darwin Tree of Life project (DToL)⁷⁵, which proposed 49 chromosomes for that species. This difference was surprising: two genome assemblies from the same species are usually expected to have the same karyotype. The discrepancy is attributable to one chromosome split in our assembly and three splits in the DToL assembly, and could stem from technical differences of the Hi-C methods (Dovetail Omni-C vs. Arima v2), sequencing depth (11-fold vs. 83-fold coverage) or the tissue used (optic lobe vs. eye). We investigated these splits, or breakpoints, in two ways. First, we reassembled both datasets with the same parameters and aligned both to the genome of another cuttlefish, A. esculentum (1n=46). In this alignment, all four breakpoints were collinear with single A. esculentum chromosomes, providing a phylogenetically grounded prediction of 1n=46 for Sepia officinalis. Second, we investigated read depth at the breakpoint junctions, and saw a significant increase of Hi-C read pairs spanning the junctions compared to random chromosome pairings. Together with a higher repeat content at the breakpoints, these analyses suggest a technical cause for the differences between assemblies: repeat-rich scaffold boundaries impaired HiFi and Hi-C read alignment and, in turn, made correct assembly challenging in these regions.

In conclusion, the two independent genome assemblies and our analysis suggest three possible karyotypes for Sepia officinalis: 46, 47, or 49. The most likely explanation based on chromosome synteny is that the true number is 46, matching the numbers established for other cuttlefish and decapod species^42,49,84. The values of 47 (our initial estimate) and 49 (DToL estimate) would therefore be erroneous, explained by technical factors related to repeat content and low alignment rates preventing complete assembly.

Incorporating ultra-long read data may help to correctly assemble these problematic regions, as is now common for telomere-to-telomere assemblies¹³⁸.

Another, but less likely, explanation for the observed differences could be variations in chromosomal architecture of different Sepia officinalis populations, due to idiosyncratic individual chromosome fusion. Intraspecific chromosome number variation is rare but not unprecedented; for instance, chromosomal polymorphism has been described in the butterfly Leptidea sinapis across different populations^139,140. Notably, the two S. officinalis individuals used for sequencing originated from different regions - the Portuguese Atlantic (this study) and the French Mediterranean (DToL project), raising the possibility of geographic variation in chromosome structure. Future population-level analyses will hopefully determine whether S. officinalis exhibits karyotypic polymorphism.

Additional methods, such as cytogenetic karyotyping or optical mapping (e.g. BioNano¹⁴¹; imaging of fluorescently tagged, linearized DNA), could be used to validate chromosome numbers. However, whereas karyotypes of octopuses have been consistent throughout the literature (1n=30)^142,143, those measured in decapods vary greatly. For example, 1n=46 chromosomes have been reported for two species of cuttlefish (A. esculentum and A. lycidas) and three loliginid squids⁸⁵; 1n=36 has been reported for A. arabica⁸⁶ and 1n=24 for A. pharaonis⁸⁷. In S. officinalis, a karyotype of 1n=52 is reported for testis samples⁸⁸. Combining cytogenetic preparations with fluorescent labeling of centromeric or telomeric sequences, as demonstrated in the octopus A. areolatus¹⁴³, could help resolve these issues. Establishing a routine staining protocol would enable comprehensive tests at the species- and population-level.

Taken together, our results illustrate the difficulty of assembling large genomes with high repeat content and large karyotypes, at least from sequencing data alone. Internal validation methods and genome comparisons across species are therefore important. Convergence of reliable estimates will, in turn, help identify chromosomal fusion-with-mixing events (FWM; fusion of two ancestral chromosomes followed by extensive shuffling of their gene content) that are clade specific. Early branching order in Decapodiformes has been notoriously unstable^53,84,94,144–147; thus, such rare and irreversible FWM characters could be useful in further phylogenetic analysis of this clade^51,148.

In addition to studying chromosomal topology in phylogenetic reconstructions, some of the most interesting aspects of these rearrangements relate to changes of and innovation in regulatory elements that underlie phenotypic diversity. In coleoid cephalopods, it is thought that an ancient large-scale genome rearrangement was combined with lineage-specific changes and repeat expansions^48–50. This restructuring gave rise to hundreds of tightly linked, evolutionarily unique microsyntenies, corresponding to distinct topological compartments with specialized regulatory architectures that contribute to complex, tissue-specific expression patterns in the nervous system and elsewhere⁴³. Extending this, chromosomal conformation analyses in E. scolopes revealed that co-regulated eye and light-organ genes cluster at topologically associating domain (TAD) boundaries, and that an evolutionarily recent rearrangement at the dachshund (DAC) locus may have been instrumental in the emergence of the symbiotic light organ in Euprymna - directly linking specific chromosomal topology to morphological innovation⁴⁴.

To understand the broader functional impact of these changes across coleoids, a recent study investigating Micro-C, RNA-seq, and ATAC-seq data from multiple species revealed broadly conserved chromatin domains, but also many lineage-specific chromatin loops that form novel regulatory signatures and impact expression profiles across species and tissues¹⁴⁹.

Despite the observed small-scale regulatory changes, the chromosomes of decapods are considered to be more closely related to the ancestral coleoid karyotype than those of octopods. The derived octopod karyotype becomes apparent when comparing it to the genome of the vampire squid, an early-branching octopodiform (sister to all octopods) which retained features of the ancestral, decapod-like karyotype¹⁵⁰. Taken together, the conserved karyotype of decapods accommodates fine-scale regulatory diversity that might underlie morphological diversity among species, which suggests that many regulatory innovations are still being evolutionarily explored through rearrangements within the existing chromosomes.

Genome Size and Repeat Landscape

The size of the genome of S. officinalis is estimated to be 5.68 Gb - comparable to those of other Decapodiformes and roughly twice the size of typical Octopodiformes genomes. We found that over 71% of the genome is repetitive, the repeats being dominated by unclassified elements and retrotransposons. The near absence of SINEs reinforces the idea that their expansion in octopuses was a lineage-specific event⁵⁰. These results underscore the evolutionary complexity and dynamism of cephalopod genomes, shaped by waves of transposon activity that likely played a role in the evolution of novel traits^43,49.

Gene Annotation and Comparative Assessment

We annotated 18,663 gene models and 23,768 proteins using a combination of short- and long-read RNA-seq data and molluscan protein references. Compared with other annotated cuttlefish genomes (A. lycidas, A. pharaonis), our annotation is conservative concerning gene numbers but stands out in terms of completeness and taxonomic consistency.

OMArk-based evaluations confirmed that our gene models have low levels of fragmentation and contamination, and high lineage-specific coherence. Striking differences across species in the number of predicted genes, duplications, and missing orthologs likely reflect variation in the annotation pipelines and reference data rather than true biological differences. Note that the comprehensive annotation of protein isoforms remains challenging even for model organisms^151,152. Still, our dataset provides a solid foundation that may be refined with additional (long-read) transcriptomic data taken from diverse tissues and developmental stages.

We characterized the S. officinalis gene models further by clustering them into families together with 12 other cephalopod and non-cephalopod mollusks. As in other coleoid genomes, we observed large-scale expansions of gene families such as C2H2 zinc finger transcription factors and protocadherins, genes implicated in regulatory innovation, neural development and plasticity^46,48,49,123. These conserved expansions may be linked to the unusual cognitive complexity and behavioral sophistication of these invertebrates.

We profiled the expression of expanded gene families in bulk RNA-seq data from multiple S. officinalis tissues and found that many genes were differentially expressed, frequently in a tissue-restricted manner. This spatial partitioning of paralogs across tissues validates many expanded families as likely functional and biologically relevant, and suggests that subfunctionalization or neofunctionalization may have accompanied gene family expansion in S. officinalis, with individual family members acquiring specialized roles^153,154. Note that we only investigated the expression of the largest copy number families, and the most significantly expanded families. Whether this pattern generalizes to the broader set of expanded families remains to be determined.

Among the large families, the cadherin and C2H2 zinc finger families showed preferential expression in neural tissues, consistent with known roles in synaptic organization and transcriptional regulation of neural development, respectively. The skin-specific expression of the RING-type zinc finger family is notable given its restricted expansion in decapods.

RING domains are characteristic of E3 ubiquitin ligases that catalyze protein ubiquitination and target substrates for proteasomal degradation^155,156. Their expression could hint at unique demands in protein turnover in the skin, but the specific function of this gene family in cuttlefish skin remains to be determined.

Many of the smaller but significantly expanded gene families, which were predominantly found in decapod genomes, were linked to retrotransposons, like retrovirus-related Pol polyprotein, reverse transcriptase domain, and ribonuclease H-like families. We confirmed that these genes were genuine coding sequences and not artifacts of TE repeats, and that they were expressed in S. officinalis tissues. The combination of coding-sequence annotation, expression, and tissue specificity is consistent with at least some of these loci having been retained as functional genes after retroelement-related origins.

Notably, we observed many expanded families containing CCHC-type zinc fingers, which contain a “zinc knuckle” motif, characteristic of retroviral nucleocapsid proteins¹²⁹. In addition to the retroviral function, some eukaryotic proteins containing CCHC-domains have been described, which likely originate from domesticated nucleocapsid sequences and play important roles in RNA metabolism^130,157. Taken together, these gene families may represent a recent wave of retroelement activity, and the variation in copy number across different Decapodiformes is consistent with repeated recruitment and/or independent retention of retroelement-derived sequences in different decapod lineages.

Two non-retroviral gene families were also differentially expressed in S. officinalis tissues: a UDP-glucuronosyltransferase (UGT) family and a homeodomain-like family. The expression of the UGT family, particularly in skin and retina is interesting: enzymes of the UGT family are involved in pigment metabolism in insects¹³², but have also been reported in mollusks as playing roles in environmental chemical tolerance¹³¹. The homeodomain-like family contains transcription factors with important roles in body patterning and brain regionalization¹⁵⁸. We found members of this family to be expressed in the vertical and subvertical lobes of the brain, but also more weakly in other brain areas, suggesting novel regulatory roles of this transcription factor class in higher-order neural circuits in brain areas associated with learning and memory in cephalopods¹⁸.

A potential caveat of this analysis is that the gene modeling approaches used for the species were different: while the majority of genomes were annotated with the RefSeq pipeline¹¹⁸, others used Ensembl genebuild¹⁵⁹, BRAKER^80,99–114 or other custom approaches. This may have introduced artificially inflated gene counts due to insufficient isoform resolution, or missing gene models due to the absence of reference data. In the future, as genome annotations become progressively refined (as for D. pealeii¹⁶⁰), gene family-level analyses will become more powerful and enable the identification of additional lineage- or species-specific innovations.

Conclusions and Outlook

This study provides a new genomic resource for Sepia officinalis. The chromosome-scale assembly and annotation open the door to in-depth studies of gene regulation, neural circuit development, and genome evolution across coleoid cephalopods.

As we move toward a more complete picture of cephalopod genome evolution, the integration of chromosomal synteny, regulatory architecture, and transcriptomic diversity across species will be especially important. The S. officinalis genome represents an important step on this path, enabling high-resolution comparative and functional analyses of one of the most enigmatic and evolutionarily successful invertebrate lineages.

Materials & Methods

Animal husbandry

All research and animal care procedures were carried out following the institutional guidelines that comply with national and international laws and policies (DIRECTIVE 2010/63/EU; German animal welfare act; FELASA guidelines). European cuttlefish Sepia officinalis were hatched from eggs collected in the Portuguese Atlantic and reared in a seawater system at 20°C. The closed system contains 4,000 L of artificial seawater (ASW; Instant Ocean) with a salinity of 33‰ and pH of 8–8.5. Water quality was tested daily and adjusted as required. Trace elements and amino acids were supplied weekly. Marine LED lights above each tank provided a 12/12-h light/dark cycle with gradual onsets and offsets at 07:00 and 19:00. The animals were fed live food (either Hemimysis spp. or small Palaemonetes spp.) ad libitum twice per day. The animals were housed together in 120 L glass tanks with a constant water through-flow resulting in five complete water exchanges per hour. Enrichment consisted of natural fine-grained sand substrate, seaweed (Caulerpa prolifera), rocks of different sizes and various natural and man-made three-dimensional objects.

Tissue preparation

Animals underwent terminal anesthesia to prevent animal suffering or distress, following the Guidelines for the Care and Welfare of Cephalopods in Research^161,162. Animals were transferred to a bucket containing 2% (v/v) ethanol in ASW. After 5 minutes, animals were gently probed for simple avoidance reflexes, and the ethanol concentration in the ASW was gradually increased to a maximum of 5%. Sufficient depth of anesthesia was reached when the animal no longer reacted even to stronger stimuli (touching and pinching with tweezers), and no reaction was observed when a hand was moved in the visual area and the cornea was touched¹⁶³. In this deep anesthesia, the animals were decapitated and the tissue was rapidly transferred into ice-cold, calcium-free Ringer solution (460 mM NaCl, 10 mM KCl, 51 mM MgCl₂, 10 mM glucose, 2 mM glutamine, 10 mM HEPES, pH 7.4) bubbled with oxygen. Brain and body tissue was dissected under a stereoscope and flash-frozen in liquid nitrogen before storage at -80°C until further use.

Unless stated otherwise, sequencing libraries were prepared from the same individual (6-month-old adult Sepia officinalis, F1 from eggs collected in Portugal).

Long-read Whole Genome Library Preparation and Sequencing

The long-read library was prepared and sequenced at the MPI for Plant Research in Cologne, Germany. Genomic DNA was extracted from flash-frozen brain tissue with the MagAttract HMW DNA kit (Qiagen, Cat. no. 67563) and the sequencing library was prepared with the SMRTbell express template prep kit 2.0 (PacBio, Cat. no. 100-938-900). The library was sequenced on 5 SMRT cells on the PacBio Sequel II with the Sequel II binding kit 2.2 (102-089-000), Sequel II sequencing kit 2.0 (101-820-200) and SMRT Cell 8M tray (101-389-001).

Long-read RNA library preparation and sequencing

RNA was isolated from various flash-frozen tissues (different brain areas, mantle/epidermis, arm/tentacle; 5-10 mg each). First, tissue samples were homogenized in TRIzol using a homogenizer (VDI12/S12N-5S, VWR, Germany). RNA was isolated and DNase I-treated according to the Direct-zol RNA Mini prep kit from ZymoResearch (R2050). RNA was quantified by spectrophotometry (Nanodrop, Thermo Scientific, USA) and its quality was assessed with an RNA 6000 Nano chip on a Bioanalyzer (Agilent Technologies, Germany).

The Iso-Seq libraries were prepared and sequenced at the Sequencing facility of the MPI for Plant Research, using a method targeting the 5’ cap and poly-A tail of mRNAs, described in detail in¹⁶⁴. Briefly, mRNA was pooled from the 8 tissue samples and cDNA was synthesized using the TeloPrime Full-Length cDNA Amplification Kit V2 (Lexogen, Cat Nr: 013.08 or 013.24). The sequencing library was prepared with the SMRTbell express template prep kit 2.0 (PacBio, Cat. no. 100-938-900). The pooled library was sequenced on 1 SMRT cell on the PacBio Sequel II system using the Sequel II binding kit 2.1 (101-843-000), Sequel II sequencing kit 2.0 (101-820-200) and SMRT Cell 8M tray (101-389-001).

Further, a separate library was prepared from optic lobe cDNA and sequenced on a second SMRT cell using the same reagents described above.

Omni-C Library Preparation and Sequencing

An Omni-C library was prepared and sequenced at the MPI for Plant Research in Cologne, Germany. The library was prepared from brain tissue using the Dovetail Omni-C™ Kit (Dovetail Genomics, Cat. No. 21005) according to the manufacturer’s protocol. Briefly, chromatin was fixed in place in the nucleus. Fixed chromatin was digested with DNaseI then extracted. Chromatin ends were repaired and ligated to a biotinylated bridge adapter followed by proximity ligation of adapter-containing ends. After proximity ligation, crosslinks were reversed and the DNA was purified from proteins. Purified DNA was treated to remove biotin that was not internal to ligated fragments. The sequencing library was generated using Illumina-compatible adapters. Biotin-containing fragments were isolated using streptavidin beads before PCR enrichment of the library. The library was sequenced on an Illumina NextSeq 2000 platform to generate 400 million 2 x 150 bp read pairs.

Short-read DNA sequencing

High molecular weight genomic DNA was isolated from brain tissue (optic lobe) using the NEB Monarch gDNA Purification Kit (T3010S). First, 10 mg of flash-frozen tissue was submerged in 100 µl of tissue lysis buffer, cut into small pieces and transferred into a 1.5 ml tube containing an additional 100 µl of lysis buffer. Then, 3 µl of 20 U/µl Proteinase K was added to the tissue sample, which was incubated for 2 h at 56°C in a thermal mixer with agitation until the tissue pieces were completely dissolved. After incubation, the tissue lysate was cleared of debris by centrifugation for 3 min at maximal speed (Eppendorf, Germany). The clean supernatant was transferred to a new 1.5 ml tube and incubated with 3 µl RNase A for 5 min at 56°C with agitation at full speed. After RNase A incubation, the tissue lysate was column purified and eluted with 80 µl of elution buffer according to the manufacturer's protocol. The purity of the gDNA was assessed using A260/280 and A260/230 absorbance ratios measured with a NanoDrop spectrophotometer (ThermoFisher Scientific). The integrity of the obtained gDNA was checked by 0.75% agarose gel electrophoresis, loading 100 ng of gDNA sample and using HindIII-digested Lambda DNA (NEB#3012) as a marker.

DNA sequencing libraries were prepared from 500 ng of high MW gDNA using Illumina DNA PCR-Free Tagmentation Library Prep Kit (20041794) and UD Indexes Set A (20026121).

Obtained dual-indexed single-stranded libraries were quantified using the Qubit single-stranded DNA (ssDNA) assay kit (Q10212, ThermoFisher Scientific). 1000 pM of the final library were run on P3 reagents and Illumina NextSeq2000 using Illumina DNA PCR-Free Sequencing and Indexing primer (20041797) with 151 bp paired-end reads.

Short-read RNA library preparation and sequencing

For short-read RNA sequencing, tissue from another animal (8-month-old adult, F0 from eggs collected in Normandie, France) was used. RNA was isolated from various flash-frozen tissues (different brain areas, skin and retina; 5 mg each). First, tissue samples were homogenized in TRIzol using a homogenizer (VDI12/S12N-5S, VWR, Germany), followed by DNase I treatment and column purification according to the Direct-zol RNA Micro prep kit from ZymoResearch (R2062). The RNA integrity and quantity were measured with the Qubit fluorometer (Invitrogen, Q33216) and the 2100 Bioanalyzer (Agilent Technologies, Germany).

RNA-seq libraries were prepared from 300 ng of total RNA, using the Illumina TruSeq stranded mRNA library prep kit (Cat:20020594) and IDT for Illumina xGen UDI-UMI Adapters (Cat:10005903). Libraries were sequenced on an Illumina NextSeq500 (Mid output, 300 cyc) and NextSeq2000 (P3, 300 cyc) with 145 bp paired-end reads.

Nuclear genome assembly

Before assembly, PacBio HiFi reads were subjected to additional trimming to remove residual adapter sequences using NCBI’s VecScreen tool. The k-mer distribution was estimated using Meryl¹⁶⁵ within the Merfin¹⁶⁶ package with a k-mer size of 21, and genome size was estimated using GenomeScope⁷⁷ from Illumina short reads and PacBio HiFi data. The primary assembly was generated with hifiasm¹⁶⁷ using a combination of HiFi and Hi-C reads. Scaffolding was performed iteratively with YAHS⁸² on the phased haplotype 1. To optimize the tool’s performance for a large, fragmented assembly, we adjusted the run parameters for scaffolding resolutions (-r 1000,2000,5000,10000,20000,30000,50000,70000,100000,150000,200000,250000,300000,350000,400000,500000,700000,1000000,2000000,5000000,10000000,15000000,20000000,30000000,40000000,50000000,60000000,70000000,80000000,100000000,110000000,120000000,150000000,170000000,200000000,500000000), repetitions per resolution (-R3), minimum read mapping quality (-q1) and telomeric sequence (--telo-motif “TTAGGG”), taken from the only experimentally determined telomere motif for cephalopods (Amphioctopus areolatus¹⁴³, accessed via TeloBase¹⁶⁸). Finally, the assembly was manually curated using JBAT⁸³ to place residual scaffolds into chromosome scaffolds. Different versions of the assembly were evaluated based on Hi-C coverage, mRNA alignment and completeness based on BUSCO⁸⁰ v5.5.0 (metazoa_odb10 and mollusca_odb10).
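As a rough illustration of how the scaffolding parameters listed above fit together, the following Python sketch assembles them into a single YAHS argument list. The input and output names (hap1.contigs.fa, omnic.bam, sepia_yahs) are hypothetical placeholders, and the positional-argument order and the -o output prefix are assumed from the YAHS command-line usage rather than stated in this study.

```python
# Scaffolding resolutions (in bp) as listed in the text, in ascending order.
RESOLUTIONS = [
    1_000, 2_000, 5_000, 10_000, 20_000, 30_000, 50_000, 70_000,
    100_000, 150_000, 200_000, 250_000, 300_000, 350_000, 400_000,
    500_000, 700_000, 1_000_000, 2_000_000, 5_000_000, 10_000_000,
    15_000_000, 20_000_000, 30_000_000, 40_000_000, 50_000_000,
    60_000_000, 70_000_000, 80_000_000, 100_000_000, 110_000_000,
    120_000_000, 150_000_000, 170_000_000, 200_000_000, 500_000_000,
]


def build_yahs_command(contigs_fasta, hic_bam, out_prefix):
    """Return the argument list for one scaffolding run (nothing is executed)."""
    return [
        "yahs",
        "-r", ",".join(str(r) for r in RESOLUTIONS),  # resolution ladder
        "-R", "3",                                    # rounds per resolution
        "-q", "1",                                    # minimum mapping quality
        "--telo-motif", "TTAGGG",                     # telomeric repeat
        "-o", out_prefix,                             # assumed output prefix flag
        contigs_fasta,
        hic_bam,
    ]


# Hypothetical file names, for illustration only
cmd = build_yahs_command("hap1.contigs.fa", "omnic.bam", "sepia_yahs")
print(" ".join(cmd))
```

Building the list in code rather than typing it by hand avoids stray spaces inside the comma-separated resolution string, which would otherwise be parsed as separate arguments.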

Mitochondrial genome assembly

To assemble the mitochondrial genome, we aligned the PacBio HiFi and Omni-C reads to a published S. officinalis mitochondrial genome¹⁶⁹ (NC_007895.1) using minimap2¹⁷⁰. The matching reads were extracted using seqtk¹⁷¹ subseq and assembled using hifiasm¹⁶⁷. The resulting contigs contained multiple repeats of the circular mitochondrial genome, and were aligned back to NC_007895.1 to extract the final sequence.

Nuclear genome annotation

Repetitive elements in the genome were softmasked using RepeatMasker v4.1.7-p1⁹³ (with rmblast v2.14.1+; -xsmall and -gff options) after creating a custom repeat library with RepeatModeler v.2.0.6⁹⁷ (without LTRstruct option).

Gene models were created using BRAKER3^80,99–114, run via the Docker container, on the softmasked genome using both RNA-seq and protein data. We used long-read Iso-Seq data and short-read Illumina RNA-seq data from various tissues (brain, skin, mantle, retina) generated for this study (see above). For protein data, publicly available proteomes for Doryteuthis pealeii⁴⁹, Euprymna scolopes¹¹⁵, Octopus bimaculoides⁴⁹, Octopus vulgaris⁴⁷, Nautilus pompilius¹¹⁶ and Pecten maximus⁵⁴ were used for training, as well as a previously generated proteome for Sepia officinalis from StringTie gene models (this study, described below). RNA-seq data were input into BRAKER3 using the --bam option, and protein files were specified with the --prot_seq option. Untranslated regions (UTRs) were added with the --addUTR=on parameter. The BUSCO completeness of the resulting gene set was maximized using TSEBRA within BRAKER on the BUSCO lineage metazoa_odb10.

For the initial protein set used as input for BRAKER, the short and long RNA reads were aligned to the genome using minimap2¹⁷⁰. Transcript models were predicted with StringTie v3.0.0¹⁷² using the --conservative and --mix options. The resulting GTF files were combined using the transcript merge mode, resulting in a set of non-redundant transcripts. Finally, TransDecoder v5.7.0¹⁷³ was run with default parameters to translate coding regions in the transcripts.

The final gene annotations for S. officinalis were assessed for completeness using OMArk v.0.3.0¹¹⁷ with the ancestral clade Lophotrochozoa and without including splice information, accessed via their webserver. Proteins were annotated with orthology information using InterProScan¹¹⁹ v5.73-104 including lookup of annotations and GO terms (options -iprlookup -goterms). Further orthology information was added using the eggNOG-mapper v2.1.12¹²⁰ webserver with the eggNOG v5.0 database and default parameters.

Whole genome alignment and synteny analysis

For whole genome alignments, the assembly produced for Sepia officinalis in this study was aligned to either the published assembly from the Darwin Tree of Life project (GCA_964300435.1) or the Acanthosepion esculentum genome (GCA_964036315.1) using Winnowmap2⁸⁹, and the alignments were visualized with a custom script in R v4.4.2¹⁷⁴.

For synteny analyses, proteome and gtf files were downloaded for Doryteuthis pealeii and Euprymna scolopes. Annotation files for Acanthosepion esculentum were recently generated by liftover annotation from Acanthosepion pharaonis⁹⁶ and downloaded from Zenodo⁹⁵.

Synteny analyses between all chromosomes of the compared species were performed using the R package GENESPACE v.1.2.3¹⁷⁵ with default parameters, described briefly below.

Protein sequence similarity was first estimated using DIAMOND2¹⁰⁹ in fast mode, and orthogroups and pairwise orthologues were inferred using OrthoFinder v2.5¹⁷⁶ with hierarchical orthogroups (HOGs) enabled. Prior to synteny inference, tandem arrays were condensed to their most central representative gene, and gene rank order was recalculated on these array-representative genes to reduce confounding effects of tandem duplication on collinearity detection.
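The tandem-array condensation step can be illustrated with a minimal sketch. It assumes genes are already ordered along one chromosome and that each carries a hypothetical array identifier. Taking the middle member of each array as its "most central" representative is an assumption made for illustration, not the GENESPACE implementation.

```python
from collections import defaultdict

def condense_tandem_arrays(genes):
    """genes: list of (gene_id, array_id) tuples in chromosomal order.

    Returns {gene_id: new_rank} for one representative gene per array.
    """
    members = defaultdict(list)
    for idx, (gene_id, array_id) in enumerate(genes):
        members[array_id].append((idx, gene_id))
    # Keep the middle member of each array as its representative (assumption).
    reps = sorted(m[len(m) // 2] for m in members.values())
    # Recalculate gene rank order on the representatives only.
    return {gene_id: rank for rank, (_, gene_id) in enumerate(reps, start=1)}

# Hypothetical example: g2, g3 and g4 form one tandem array.
genes = [("g1", "a1"), ("g2", "a2"), ("g3", "a2"), ("g4", "a2"), ("g5", "a3")]
ranks = condense_tandem_arrays(genes)
```

In this toy input the three-gene array collapses to `g3`, so collinearity is then assessed on three ranked genes instead of five.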

Syntenic blocks were identified pairwise between all genome combinations using MCScanX¹⁷⁷, constrained to DIAMOND hits where both query and target genes belonged to the same orthogroup (onlyOgAnchors = TRUE). Initial anchor hits were clustered into large syntenic regions using a density-based spatial clustering approach (dbscan¹⁷⁸), with a minimum block size of five anchor genes (blkSize = 5) and a maximum of five intervening non-anchor genes permitted within a block (nGaps = 5). Anchor clustering used a search radius of 25 gene-rank positions (blkRadius = 25). All hits falling within a syntenic buffer of 100 gene-rank positions around confirmed block anchors (synBuff = 100) were retained as syntenic. No secondary syntenic hits were included (nSecondaryHits = 0). Syntenic orthogroups were integrated across all pairwise comparisons and collapsed into a pan-genome annotation anchored to S. officinalis as the reference genome.

Syntenic relationships were visualized as riparian plots and pairwise dotplots using the built-in plotting functions of GENESPACE v1.2.3. Riparian plots were constructed using physical chromosomal coordinates (useOrder = FALSE) with S. officinalis as the reference, displaying all three genomes. A second riparian plot was generated highlighting a region of interest. Pairwise dotplots were produced for the S. officinalis–D. pealeii and S. officinalis–E. scolopes genome comparisons, displaying only synteny-validated hits (type = “syntenic”) with a minimum synteny score of 10 (minScore = 10) and a minimum of 10 genes per chromosome pair required for display (minGenes2plot = 10).

Coverage analysis

For the analysis of breakpoints between S. officinalis assemblies, Hi-C data was aligned using bwa-mem2 v2.3⁹¹ and quantified using pairtools v1.1.0⁹². Contacts between scaffolds (trans pairs) were extracted from deduplicated read pairs (pair type UU). For each of the four breakpoints, all trans pairs between the two flanking scaffold halves were extracted using awk. As controls, the corresponding joined scaffold from the opposing assembly was paired with the nearest-length uninvolved scaffold from the same assembly, and trans pairs were extracted identically.
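The extraction itself was done with awk; an equivalent filter can be sketched in Python. The column layout assumed here (read ID, chrom1, pos1, chrom2, pos2, strand1, strand2, pair type) is the usual layout of pairtools .pairs files, and the function and scaffold names are hypothetical.

```python
def count_trans_pairs(lines, scaffold_a, scaffold_b, pair_type="UU"):
    """Count read pairs of the given type that link two different scaffolds."""
    wanted = {scaffold_a, scaffold_b}
    n = 0
    for line in lines:
        if line.startswith("#"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom1, chrom2, ptype = fields[1], fields[3], fields[7]
        if ptype == pair_type and chrom1 != chrom2 and {chrom1, chrom2} == wanted:
            n += 1
    return n

# Hypothetical records for illustration only.
example = [
    "## pairs format v1.0",
    "r1\tscaf1\t100\tscaf2\t5000\t+\t-\tUU",
    "r2\tscaf1\t200\tscaf1\t9000\t+\t-\tUU",
    "r3\tscaf2\t300\tscaf1\t7000\t-\t+\tUU",
    "r4\tscaf1\t400\tscaf2\t1000\t+\t+\tUR",
]
n_trans = count_trans_pairs(example, "scaf1", "scaf2")
```

Restricting the count to the two flanking scaffold halves would additionally require a position filter on columns 3 and 5.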

Genome-wide background and intra-scaffold contact distributions were computed in a single pass over each pairs file using awk, recording the total number of trans pairs and intra-scaffold pairs (>1 kb separation) for all scaffold pairs across the genome. Trans contact rates were normalized by the product of the two scaffold lengths (pairs per Mb²) to correct for length bias. For visualization, genome-wide trans pair positions were binned into 500 kb windows along each scaffold to produce contact density tracks. To test whether trans contact rates at breakpoints were elevated above background, empirical p-values were computed as the fraction of all genome-wide scaffold pair rates equal to or exceeding the observed breakpoint rate. For the three DToL breakpoints jointly, a one-tailed Wilcoxon rank-sum test was applied against the background distribution.
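The length normalization and the empirical p-value can be written out as a short sketch. The numbers below are invented for illustration and the function names are hypothetical.

```python
def trans_rate_per_mb2(n_pairs, len_a_bp, len_b_bp):
    """Trans pairs normalized by the product of the two scaffold lengths (pairs per Mb^2)."""
    return n_pairs / ((len_a_bp / 1e6) * (len_b_bp / 1e6))

def empirical_p(observed, background):
    """Fraction of background rates equal to or exceeding the observed rate."""
    return sum(rate >= observed for rate in background) / len(background)

# Hypothetical background rates for all scaffold pairs in one assembly.
background = [0.5, 0.8, 1.0, 1.2, 2.0, 3.5, 0.9, 1.1, 0.7, 4.0]
obs = trans_rate_per_mb2(n_pairs=1400, len_a_bp=20_000_000, len_b_bp=20_000_000)
p = empirical_p(obs, background)
```

Here 1,400 pairs between two 20 Mb scaffolds give 3.5 pairs per Mb², and two of the ten background rates are at least that high.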

HiFi reads were aligned to the scaffolded assemblies using minimap2¹⁷⁰ and duplicate-marked alignments were removed. HiFi read depth was computed over the terminal 200 kb at each scaffold end using pysam v0.22.1¹⁷⁹ count_coverage with a mapping quality threshold of ≥10, binned at 1 kb resolution. Spanning reads, defined as reads with supplementary alignments crossing a scaffold boundary, were identified by querying split alignments at each junction using pysam.

Repeat content at scaffold junctions was characterized by extracting 200 kb windows flanking each breakpoint and the corresponding correct scaffold termini, and running RepeatMasker v4.1.7-p1⁹³ (with rmblast v2.14.1+) against a de novo repeat library generated by RepeatModeler v.2.0.6⁹⁷ from the same assembly (described above). Repeat annotations were parsed and collapsed into eight classes (SINE, LINE, DNA transposons, LTR elements, simple repeats, low-complexity regions, unknown repeats, and other), and the masked fraction per class was quantified for each window.
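A minimal sketch of the class collapsing and per-class masked fraction follows. It assumes RepeatMasker-style class/family labels such as "LINE/L2", non-overlapping annotations within a window, and half-open coordinates. The mapping table and function names are illustrative assumptions.

```python
CLASS_MAP = {
    "SINE": "SINE",
    "LINE": "LINE",
    "DNA": "DNA transposon",
    "LTR": "LTR element",
    "Simple_repeat": "simple repeat",
    "Low_complexity": "low complexity",
    "Unknown": "unknown",
}

def collapse_class(rm_class):
    """Map a class/family label to one of eight broad classes."""
    return CLASS_MAP.get(rm_class.split("/")[0], "other")

def masked_fraction(annotations, window_len):
    """annotations: (start, end, rm_class) tuples inside one window."""
    totals = {}
    for start, end, rm_class in annotations:
        cls = collapse_class(rm_class)
        totals[cls] = totals.get(cls, 0) + (end - start)
    return {cls: bp / window_len for cls, bp in totals.items()}

# Hypothetical annotations in a 200 kb window.
window = [(0, 20_000, "LINE/L2"), (50_000, 60_000, "DNA/hAT"),
          (70_000, 80_000, "LINE/CR1"), (100_000, 102_000, "Satellite")]
fractions = masked_fraction(window, 200_000)
```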

For sex chromosome analysis, read coverage across chromosomes was analyzed as described recently⁹⁶. Short reads were aligned using STAR v2.7.11b¹⁸⁰ to our chromosome-scale assembly. For A. esculentum, short reads (ERR12945500) and assembly (GCA_964036315.1) were downloaded from NCBI. The sequencing depth was calculated using mosdepth¹⁸¹ with a window size of 500,000 bp and normalized to the median coverage of the first chromosome. Chromosomes with significantly reduced read coverage were identified by a one-sided Wilcoxon rank-sum test of each chromosome’s normalized depth windows against all remaining chromosomes. P-values were corrected using the Benjamini-Hochberg method and considered only for chromosomes with at least 10% decrease in median normalized depth.
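Two steps of this procedure, normalization to the median depth of the first chromosome and Benjamini-Hochberg adjustment, can be sketched with the standard library alone. The rank-sum test itself is not reimplemented here. Chromosome names, depth values and function names are hypothetical.

```python
from statistics import median

def normalize_depth(window_depth, reference="chr1"):
    """Divide every window depth by the median depth of the reference chromosome."""
    ref_median = median(window_depth[reference])
    return {chrom: [d / ref_median for d in depths]
            for chrom, depths in window_depth.items()}

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-window depths; chr46 has roughly half the depth of chr1.
depth = {"chr1": [30, 32, 28], "chr46": [15, 16, 14]}
norm = normalize_depth(depth)
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```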

Gene family expansion analysis

Orthogroups were inferred across 13 molluscan species (Table 2), including S. officinalis, using OrthoFinder v3.1.0¹²² with default parameters. The input proteomes included the longest protein isoform per gene for each species. The rooted species tree from OrthoFinder^182,184 was converted to an ultrametric tree using the R package ape¹⁸³ v5.8.1.
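Selecting the longest protein isoform per gene can be sketched as follows, assuming each record already carries its gene identifier. The tuple layout and names are illustrative assumptions; ties keep the first isoform encountered.

```python
def longest_isoform_per_gene(proteins):
    """proteins: iterable of (gene_id, isoform_id, sequence) tuples.

    Returns {gene_id: (isoform_id, sequence)} with the longest sequence per gene.
    """
    best = {}
    for gene_id, isoform_id, seq in proteins:
        if gene_id not in best or len(seq) > len(best[gene_id][1]):
            best[gene_id] = (isoform_id, seq)
    return best

# Hypothetical records for illustration only.
records = [("geneA", "geneA.t1", "MKT"), ("geneA", "geneA.t2", "MKTLLV"),
           ("geneB", "geneB.t1", "MA")]
longest = longest_isoform_per_gene(records)
```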

Gene families were filtered by removing orthogroups present in only a single species, and by separating orthogroups containing 100 or more gene copies in any species, as extreme copy-number differences in gene families prevent likelihood calculation under the applied birth-death model.
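The two filters can be expressed as a short sketch over a table of per-species copy numbers. The data structure, species labels and function name are assumptions made for illustration.

```python
def filter_orthogroups(counts, max_copies=100):
    """counts: {orthogroup: {species: copy_number}}.

    Returns (kept, large): families passed on for rate estimation, and
    families set aside because some species has max_copies or more genes.
    """
    kept, large = {}, {}
    for og, per_species in counts.items():
        present = [n for n in per_species.values() if n > 0]
        if len(present) <= 1:
            continue  # present in only a single species (or in none)
        if max(present) >= max_copies:
            large[og] = per_species
        else:
            kept[og] = per_species
    return kept, large

# Hypothetical copy-number table.
counts = {
    "OG1": {"sepoff": 3, "octbim": 2},
    "OG2": {"sepoff": 4, "octbim": 0},
    "OG3": {"sepoff": 120, "octbim": 5},
}
kept, large = filter_orthogroups(counts)
```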

Gene family evolution rates were estimated using CAFE5¹²⁸ v5.1.1 on the filtered orthogroups, using the ultrametric species tree as input. Four models were evaluated: the base model (single global lambda), and Gamma models with k = 2, 3, and 4 rate categories, which allow evolutionary rate variation among gene families. The Gamma k = 3 model was selected because it had the lowest final negative log-likelihood (-lnL) score, indicating the best fit. All subsequent statistical inferences were performed under this model.

For families showing statistically significant expansion or contraction (p < 0.05 after Bonferroni correction), branch-specific copy-number changes were extracted from the CAFE5 output. Families were categorized as S. officinalis-specific, coleoid-specific, or broad expansions based on the distribution of significant changes across the phylogeny.

To assess whether expanded gene families in S. officinalis contained genes derived from or embedded within repetitive elements, a coordinate-based overlap analysis was performed. For each gene in an expanded orthogroup, the overlap between its coding sequence (CDS) coordinates and RepeatMasker annotations was computed using bedtools intersect v2.30¹⁸⁵. To avoid double-counting when multiple repeat annotations overlapped the same coding bases, overlapping repeat intervals were merged per gene prior to summing covered bases, and the overlap fraction was computed as merged covered bases divided by total CDS length.
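The merge-then-sum logic can be sketched in a few lines. The sketch assumes half-open coordinates on a single chromosome and non-overlapping CDS exons; it mirrors the described computation rather than the bedtools command line actually used.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def cds_repeat_fraction(cds_exons, repeats):
    """Fraction of CDS bases covered by at least one repeat annotation."""
    # Clip every repeat to every CDS exon it overlaps.
    clipped = [(max(c_start, r_start), min(c_end, r_end))
               for c_start, c_end in cds_exons
               for r_start, r_end in repeats
               if max(c_start, r_start) < min(c_end, r_end)]
    # Merge before summing so shared bases are counted once.
    covered = sum(end - start for start, end in merge_intervals(clipped))
    total = sum(c_end - c_start for c_start, c_end in cds_exons)
    return covered / total

# Hypothetical gene with two CDS exons and three repeat annotations.
frac = cds_repeat_fraction([(100, 200), (300, 400)],
                           [(150, 250), (180, 320), (390, 500)])
```

Without the merge, bases 180 to 200 would be counted twice and the fraction would be 0.5 instead of 0.4.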

Bulk RNA-seq analysis

Quality-filtered paired-end RNA-seq reads were aligned to the S. officinalis genome assembly using STAR¹⁸⁰ (v2.7.11b) with the following parameters: --outSAMmultNmax 1 (retaining only uniquely mapping reads), --outFilterIntronMotifs RemoveNoncanonical (removing reads with non-canonical splice junctions), and --outSAMtype BAM SortedByCoordinate.

Gene-level read counts were quantified using featureCounts¹⁸⁶ from the Subread package (v2.0.8) with the gene annotation described above (GTF format). Counting was performed at the exon level (-t exon) and summarized by gene (-g gene_id), using paired-end mode (-p --countReadPairs) to count fragments rather than individual reads. Only reads with mapping quality ≥ 255 (-Q 255), corresponding to uniquely mapped reads in the STAR output, were included in the count matrix. The resulting gene count matrix was used as input for differential expression (DE) analysis.

Tissue-specific marker genes were identified using DESeq2¹⁸⁷ (v1.42.0) with a one-vs-all comparison strategy. For each tissue, samples were grouped into a binary condition (“target” vs. “other”), where “target” represented the tissue of interest and “other” comprised all remaining tissues. Raw count matrices were filtered to retain genes with ≥10 counts in at least 3 samples prior to analysis. DE testing was performed using the Wald test, and log2 fold changes were shrunk using the apeglm¹⁸⁸ method. Genes were classified as tissue markers if they met the following criteria: adjusted p-value (Benjamini-Hochberg FDR) < 0.05, log2 fold change > 1, and mean expression in the target tissue > 5.
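The pre-filter and the marker criteria are simple thresholds and can be restated as a sketch. Function and parameter names are hypothetical; the statistics themselves would come from DESeq2.

```python
def passes_count_filter(counts, min_count=10, min_samples=3):
    """Keep a gene if at least min_samples samples have min_count or more reads."""
    return sum(c >= min_count for c in counts) >= min_samples

def is_tissue_marker(padj, log2fc, mean_target,
                     alpha=0.05, min_lfc=1.0, min_mean=5.0):
    """Apply the three marker criteria; a missing adjusted p-value fails."""
    return (padj is not None and padj < alpha
            and log2fc > min_lfc and mean_target > min_mean)

# Hypothetical values for illustration only.
keep = passes_count_filter([12, 0, 15, 30, 2])
marker = is_tissue_marker(padj=0.001, log2fc=2.3, mean_target=40.0)
```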

To generate gene-level GO^133,134 annotations for the S. officinalis genome, GO term assignments from the InterProScan output (detailed earlier) were used. First, transcript IDs were collapsed to gene IDs and for genes with multiple transcripts, GO terms were aggregated using a union strategy, collecting all unique GO terms across all isoforms and annotation sources to maximize coverage. GO terms present in fewer than 5 genes or more than 500 genes in the expressed gene universe were excluded from enrichment testing to reduce noise from overly specific or generic terms.
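The union strategy and the term-size filter can be sketched as follows, assuming a transcript-to-gene lookup is available. Identifiers and function names are invented for illustration.

```python
from collections import defaultdict

def gene_level_go(transcript_go, transcript_to_gene):
    """Union of GO terms over all transcripts of each gene."""
    gene_go = defaultdict(set)
    for transcript, terms in transcript_go.items():
        gene_go[transcript_to_gene[transcript]].update(terms)
    return dict(gene_go)

def filter_terms_by_size(gene_go, universe, min_genes=5, max_genes=500):
    """Keep terms annotated to between min_genes and max_genes universe genes."""
    term_genes = defaultdict(set)
    for gene, terms in gene_go.items():
        if gene in universe:
            for term in terms:
                term_genes[term].add(gene)
    return {term: genes for term, genes in term_genes.items()
            if min_genes <= len(genes) <= max_genes}

# Hypothetical annotations; the size bounds are lowered to fit the toy data.
tx_go = {"t1": {"GO:1"}, "t2": {"GO:2"}, "t3": {"GO:1"}}
tx2gene = {"t1": "gA", "t2": "gA", "t3": "gB"}
gene_go = gene_level_go(tx_go, tx2gene)
usable = filter_terms_by_size(gene_go, {"gA", "gB"}, min_genes=2, max_genes=500)
```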

GO enrichment analysis was performed using the clusterProfiler^189,190 (v4.12.6) enricher function with custom GO annotations generated as described above. For each gene set (DE genes for each tissue), enrichment was tested against the background universe of all expressed genes (genes passing the low-count filter in DESeq2). Over-representation was assessed using the hypergeometric test, with p-values adjusted for multiple testing using the Benjamini-Hochberg procedure. GO terms with adjusted p-value (FDR) < 0.05 were considered significantly enriched.
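The over-representation test reduces to an upper-tail hypergeometric probability, shown here as a standard-library sketch. The variable names are illustrative: k of the n genes in the set carry the term, and K of the N universe genes carry it.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Hypothetical example: 2 of 3 marker genes carry a term found in 4 of 10 genes.
p_enrich = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)
```

The resulting p-values across all tested terms would then be adjusted with the Benjamini-Hochberg procedure.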

S. officinalis members of each expanded gene family were cross-referenced against tissue-specific marker genes identified by DESeq2 (as described earlier). For families with at least one differentially expressed member, GO enrichment was performed using enricher() from clusterProfiler as described for tissue marker genes. For families with at least one enriched GO term, expression patterns of differentially expressed members were visualized as a heatmap of z-scored VST values, with samples grouped by tissue.
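Row-wise z-scoring for the heatmap can be sketched as below. Using the sample standard deviation is an assumption, and the input values are invented.

```python
from statistics import mean, stdev

def zscore_row(values):
    """Center and scale one gene's values across samples."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

# Hypothetical VST values of one gene across three samples.
z = zscore_row([1.0, 2.0, 3.0])
```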

Supplementary figures

HapHiC scaffolding for different numbers of expected chromosome scaffolds shows 47 chromosomes as most supported.
Hi-C contact maps from HapHiC⁸¹ are shown for 46, 47, 48, 49 and 50 expected chromosome scaffolds. Assembled chromosomes are shown as blue boxes, Hi-C signal indicating a false (unsupported) merger is shown by a cyan arrow, and false splits are shown by black arrows. The contact maps differ from the map shown in Figure 1, which was created using YAHS and manual curation.

BUSCO completeness results.
A) Comparison of two *S. officinalis* chromosome-scale assemblies, which were constructed from two independent datasets (this study: MPIBR, Darwin Tree of Life project: DToL), assembled using a common pipeline (hifiasm¹⁶⁷ with PacBio HiFi and Hi-C reads). Results are shown for the database *metazoa_odb12*; the zoom-in shows only duplicated, fragmented and missing fractions to improve readability. The DToL assemblies have slightly higher completeness than MPIBR, due to the higher sequencing coverage used as input. In both datasets, compared to the primary assembly (“.hic”), the phased haplotypes (“.hic.hap1” and “.hic.hap2”) have fewer duplicated but more missing genes. B) BUSCO results for the *mollusca_odb12* database, showing the same trend as in A). C) Comparison of different BUSCO databases *odb10* and *odb12* on the manually curated assembly (“sepoff241117”). For the *mollusca* gene sets (top), a strong improvement in completeness was observed between *odb10* and *odb12*, reflecting that the updated gene set is more concise and conserved across species. For the *metazoa* gene sets (bottom) the completeness was marginally increased for *odb12* compared to *odb10*.

Analysis of raw data at breakpoints between *S. officinalis* assemblies hints at a technical cause of breakpoints.
A) Coverage of HiC and HiFi data shown for pairs of scaffolds exhibiting breakpoints. Blue shows MPIBR data, orange shows DToL data. For each breakpoint, trans HiC contacts are shown on top across the full scaffold, with terminal 200 kb windows highlighted in yellow. Both terminal windows are shown below with aligned HiFi reads (grey horizontal bars) and normalized HiFi read density. Trans HiC contacts are shown as purple dots. Right grey box: same data shown for the complete breakpoint scaffold of the other assembly, with trans HiC contacts calculated to a size-matched scaffold. B) Distribution of normalized trans HiC contact rate (pairs per Mb²) for random scaffold pairs (“background pairs”, grey) and within scaffolds (“intra scaffold”, green) for MPIBR (left) and DToL (right) data. Values for scaffolds with breakpoints are indicated in blue and orange, respectively. C) Histogram of contact rates from B) shown for random scaffold pairs and breakpoint pairs. Contact rates and empirical p-values of breakpoint pairs are indicated in blue (left, MPIBR) and orange (right, DToL). Joint p-value for three rates for DToL breakpoints is indicated in box (Wilcoxon rank-sum, one-tailed). D) Repeat analysis of 200 kb scaffold ends at breakpoints and control scaffolds (grey box). Overall repeat content (% of base pairs) and type are shown.

Syntenic relationship between *S. officinalis* and *D. pealeii* chromosomes.
Dot plot showing finer-resolution syntenic anchor hits (perfectly collinear blast hits within the same orthogroup). Genes are ordered along the chromosomes, gridlines are shown every 1000 genes. Only chromosome pairs with a minimum synteny score of 10 and at least 10 syntenic genes are shown. Synteny analysis and visualization were performed using GENESPACE v1.2.3¹⁷⁵.

Syntenic relationship between *S. officinalis* and *E. scolopes* chromosomes.
Dot plot showing finer-resolution syntenic anchor hits (perfectly collinear blast hits within the same orthogroup). Genes are ordered along the chromosomes, gridlines are shown every 1000 genes. Only chromosome pairs with a minimum synteny score of 10 and at least 10 syntenic genes are shown. *E. scolopes* chromosomes 45 and 46 are not shown because they contain too few orthogroups. Synteny analysis and visualization were performed using GENESPACE v1.2.3¹⁷⁵.

Syntenic comparison of four decapod species hints at a cephalopod sex chromosome.
A) Riparian plot showing synteny relationships of chromosomes from four decapod species, generated using GENESPACE¹⁷⁵ with orthogroups. *Euprymna* chromosomes 45 and 46 are not shown because they contain too few orthogroups. Chromosome split in *S. officinalis* compared to other species is shown in purple, putative sex chromosome as identified recently⁹⁶ is shown in cyan. B) Normalized coverage of sequencing data in *S. officinalis* chromosomes. C) Normalized coverage of short reads to female *A. esculentum* genome, reproduced from⁹⁶. A decrease in read coverage is visible for chromosome 46, the putative Z sex chromosome. Read depth was calculated from Illumina gDNA reads in windows of 500,000 bp and normalized to the median coverage of chromosome 1. Box plots show median divergence (box dividing line), interquartile range (box), and 1.5 times the interquartile range (whiskers). The putative Z chromosome is highlighted in cyan. Chromosomes with significantly reduced read coverage (orange label) were identified by a one-sided Wilcoxon rank-sum test of each chromosome’s normalized depth windows against all remaining chromosomes (Benjamini-Hochberg-corrected, at least 10% decrease in median normalized depth, * p < 0.05, ** p < 0.01, *** p < 0.001).

Gene family expansion analysis.
A) Gene family expansion analysis using CAFE5¹²⁸ with a gamma model (k=3) on all smaller gene families (fewer than 100 genes in any species). 30 families with the most change in different categories are shown (expanded only in *S. officinalis* (pink), in all coleoids (orange), in all species (yellow), in non-cephalopod mollusks (green) or overall contraction (blue)). Rows show change (expansion or contraction) of gene families in any species, columns show orthogroups and annotation, if available. Dots show significant change (p < 0.05), gene counts are shown for any orthogroup with at least 12 genes in any species. B) Gene families with differential expression in bulk RNA-seq data. Dot size shows the number of DE genes for each tissue. C) Dotplots of enriched GO terms for large gene families, enriched using clusterProfiler using a hypergeometric test. Dot size shows the number of expressed genes per family with this GO term, x-axis shows percentage of expressed genes from all genes with this GO term. Dot color shows adjusted p-value after Benjamini-Hochberg false discovery rate (FDR) correction. CC: cellular component, MF: molecular function, BP: biological process. D) Heatmap of z-scored expression of all DE genes from the gene families with enriched GO terms.

Data availability

The genome assembly and raw data can be found at the BioProject PRJNA1091451 on NCBI. Raw sequencing reads are deposited at SRA (study accession SRP570862). The code for the genome assembly and annotation is available at https://gitlab.mpcdf.mpg.de/mpibr/laur/cuttlefishomics/soffgenome. Genome annotation files are deposited at https://public.brain.mpg.de/Laurent/sepoff/annotation/.

Acknowledgements

We thank Bruno Huettel for PacBio and Omni-C sequencing. We thank Xitong Liang, Theodosia Woo and Mathieu Renard for help with tissue dissection. We thank Darrin Schultz, Dalila Destanović and Thea Rogers for helpful discussions and advice on manual curation and gene modeling. We thank Victor Nieto Caballero for initial discussions on gene family expansion analysis.

This work was funded by the Max Planck Society (GL) and the European Research Council (GL; ERC grant CAMOUFLAGE, 101141501). O.S. was supported by the ERC’s Horizon 2020: European Union Research and Innovation Programme, grant No. 945026.

All authors declare no conflict of interest. Views and opinions expressed are those of the authors only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.

Additional information

Author contributions

O.S. and G.L. conceived the study design. S.R., O.S. and G.L. wrote the first draft of the manuscript. D.H. collected tissue for sequencing. E.C. and S.R. extracted DNA and RNA from tissue, and E.C. performed Illumina sequencing. G.T. performed genome assembly. G.T. and S.R. manually curated and annotated the genome. S.R., G.T., and O.S. performed genomic analyses. S.R. created the figures. S.R., O.S. and G.L. coordinated and led the project. G.L. acquired funding for this study.

All authors contributed to the review and editing of the manuscript.

Funding

EC | European Research Council (ERC)

https://doi.org/10.3030/101141501

Gilles Laurent

EC | European Research Council (ERC)

https://doi.org/10.3030/945026

Oleg Simakov

References

1. Giuditta A., Libonati M., Packard A., Prozzo N. (1971) Nuclear counts in the brain lobes of Octopus vulgaris as a function of body size. Brain Res 25:55–62. https://doi.org/10.1016/0006-8993(71)90566-x
2. Young J. (1963) The number and sizes of nerve cells in Octopus. Proceedings of the Zoological Society of London 140:229–254. https://doi.org/10.1111/j.1469-7998.1963.tb01862.x
3. Dorkenwald S., Matsliah A., Sterling A.R., Schlegel P., Yu S., McKellar C.E., Lin A., Costa M., Eichler K., Yin Y., et al. (2024) Neuronal wiring diagram of an adult brain. Nature 634:124–138. https://doi.org/10.1038/s41586-024-07558-y
4. Herculano-Houzel S., Mota B., Lent R. (2006) Cellular scaling rules for rodent brains. Proc Natl Acad Sci 103:12138–12143. https://doi.org/10.1073/pnas.0604911103
5. Hodgkin A.L., Huxley A.F., Katz B. (1952) Measurement of current-voltage relations in the membrane of the giant axon of Loligo. J Physiol 116:424–448. https://doi.org/10.1113/jphysiol.1952.sp004716
6. Llinás R., Steinberg I.Z., Walton K. (1981) Presynaptic calcium currents in squid giant synapse. Biophys J 33:289–321. https://doi.org/10.1016/S0006-3495(81)84898-9
7. Llinás R., Steinberg I.Z., Walton K. (1981) Relationship between presynaptic calcium current and postsynaptic potential in squid giant synapse. Biophys J 33:323–351. https://doi.org/10.1016/S0006-3495(81)84899-0
8. Llinas R.R. (1984) The Squid Giant Synapse. In: Kleinzeller A., editors. Current Topics in Membranes and Transport: The Squid Axon. Academic Press. pp. 519–546. https://doi.org/10.1016/S0070-2161(08)60483-9
9. Hanlon R.T., Messenger J.B. (2018) Cephalopod Behaviour, 2nd ed. Cambridge University Press. https://doi.org/10.1017/9780511843600
10. Josef N., Berenshtein I., Rousseau M., Scata G., Fiorito G., Shashar N. (2017) Size Matters: Observed and Modeled Camouflage Response of European Cuttlefish (Sepia officinalis) to Different Substrate Patch Sizes during Movement. Front Physiol 7. https://doi.org/10.3389/fphys.2016.00671
11. Osorio D., Ménager F., Tyler C.W., Darmaillacq A.-S. (2022) Multi-level control of adaptive camouflage by European cuttlefish. Curr Biol 32:2556–2562. https://doi.org/10.1016/j.cub.2022.04.030
12. Marshall N.J., Messenger J.B. (1996) Colour-blind camouflage. Nature 382:408–409. https://doi.org/10.1038/382408b0
13. Zylinski S., Osorio D., Shohet A.J. (2009) Perception of edges and visual texture in the camouflage of the common cuttlefish, Sepia officinalis. Philos Trans R Soc B Biol Sci 364:439–448. https://doi.org/10.1098/rstb.2008.0264
14. Zylinski S., Darmaillacq A.-S., Shashar N. (2012) Visual interpolation for contour completion by the European cuttlefish (Sepia officinalis) and its use in dynamic camouflage. Proc R Soc B Biol Sci 279:2386–2390. https://doi.org/10.1098/rspb.2012.0026
15. Zylinski S., Osorio D., Shohet A.J. (2009) Cuttlefish camouflage: context-dependent body pattern use during motion. Proc R Soc B Biol Sci 276:3963–3969. https://doi.org/10.1098/rspb.2009.1083
16. Jozet-Alves C., Bertin M., Clayton N.S. (2013) Evidence of episodic-like memory in cuttlefish. Curr Biol 23:R1033–R1035. https://doi.org/10.1016/j.cub.2013.10.021
17. Schnell A.K., Boeckle M., Rivera M., Clayton N.S., Hanlon R.T. (2021) Cuttlefish exert self-control in a delay of gratification task. Proc R Soc B Biol Sci 288:20203161. https://doi.org/10.1098/rspb.2020.3161
18. Boycott B.B., Young J.Z. (1955) A memory system in Octopus vulgaris Lamarck. Proc R Soc Lond Ser B - Biol Sci 143:449–480. https://doi.org/10.1098/rspb.1955.0024
19. Mather J.A. (2006) Behaviour Development: A Cephalopod Perspective. Int J Comp Psychol 19. https://doi.org/10.46867/ijcp.2006.19.01.02
20. Hall K., Hanlon R. (2002) Principal features of the mating system of a large spawning aggregation of the giant Australian cuttlefish Sepia apama (Mollusca: Cephalopoda). Mar Biol 140:533–545. https://doi.org/10.1007/s00227-001-0718-0
21. Norman M.D., Finn J., Tregenza T. (1999) Female impersonation as an alternative reproductive strategy in giant cuttlefish. Proc R Soc Lond B Biol Sci 266:1347–1349. https://doi.org/10.1098/rspb.1999.0786
22. Sampaio E., Seco M.C., Rosa R., Gingins S. (2021) Octopuses punch fishes during collaborative interspecific hunting events. Ecology 102:e03266. https://doi.org/10.1002/ecy.3266
23. Sampaio E., Sridhar V.H., Francisco F.A., Nagy M., Sacchi A., Strandburg-Peshkin A., Nührenberg P., Rosa R., Couzin I.D., Gingins S. (2024) Multidimensional social influence drives leadership and composition-dependent success in octopus–fish hunting groups. Nat Ecol Evol 8:2072–2084. https://doi.org/10.1038/s41559-024-02525-2
24. Feord R.C., Sumner M.E., Pusdekar S., Kalra L., Gonzalez-Bellido P.T., Wardill T.J. (2020) Cuttlefish use stereopsis to strike at prey. Sci Adv 6:eaay6036. https://doi.org/10.1126/sciadv.aay6036
25. Medeiros S.L. de S., Paiva M.M.M. de, Lopes P.H., Blanco W., Lima F.D. de, Oliveira J.B.C. de, Medeiros I.G., Sequerra E.B., de Souza S., Leite T.S., et al. (2021) Cyclic alternation of quiet and active sleep states in the octopus. iScience 24:102223. https://doi.org/10.1016/j.isci.2021.102223
26. Iglesias T.L., Boal J.G., Frank M.G., Zeil J., Hanlon R.T. (2019) Cyclic nature of the REM sleep-like state in the cuttlefish Sepia officinalis. J Exp Biol 222:jeb174862. https://doi.org/10.1242/jeb.174862
27. Pophale A., Shimizu K., Mano T., Iglesias T.L., Martin K., Hiroi M., Asada K., Andaluz P.G., Van Dinh T.T., Meshulam L., et al. (2023) Wake-like skin patterning and neural activity during octopus sleep. Nature 619:129–134. https://doi.org/10.1038/s41586-023-06203-4
28. Young J.Z. (1974) The central nervous system of Loligo I. The optic lobe. Philos Trans R Soc Lond B Biol Sci 267:263–302. https://doi.org/10.1098/rstb.1974.0002
29. Young J.Z. (1962) The optic lobes of Octopus vulgaris. Philos Trans R Soc Lond B Biol Sci 245:19–58. https://doi.org/10.1098/rstb.1962.0005
30. Young J. (1960) The Visual System of Octopus: (1) Regularities in the Retina and Optic Lobes of Octopus in Relation to Form Discrimination. Nature 186:836–839. https://doi.org/10.1038/186836a0
31. Young J.Z. (1991) Computation in the Learning System of Cephalopods. Biol Bull 180:200–208. https://doi.org/10.2307/1542389
32. Young J.Z. (1963) Light- and Dark-Adaptation in the Eyes of Some Cephalopods. Proc Zool Soc Lond 140:255–272. https://doi.org/10.1111/j.1469-7998.1963.tb01863.x
33. Young J.Z. (1962) The retina of cephalopods and its degeneration after optic nerve section. Philos Trans R Soc Lond B Biol Sci 245:1–18. https://doi.org/10.1098/rstb.1962.0004
34. Nixon M., Young J.Z. (2003) The brains and lives of cephalopods. Oxford University Press.
35. Case N.M., Gray E., Young J. (1972) Ultrastructure and synaptic relations in the optic lobe of the brain of Eledone and Octopus. J Ultrastruct Res 39:115–123. https://doi.org/10.1016/s0022-5320(72)80012-1
36. Dilly P., Gray E.G., Young J.Z. (1963) Electron microscopy of optic nerves and optic lobes of Octopus and Eledone. Proc R Soc Lond B Biol Sci 158:446–456. https://doi.org/10.1098/rspb.1963.0057
37. Stephens P., Young J.Z. (1969) The glio-vascular system of cephalopods. Philos Trans R Soc Lond B Biol Sci 255:1–12. https://doi.org/10.1098/rstb.1969.0001
38. Baden T., Briseño J., Coffing G., Cohen-Bodénès S., Courtney A., Dickerson D., Dölen G., Fiorito G., Gestal C., Gustafson T., et al. (2023) Cephalopod-omics: Emerging Fields and Technologies in Cephalopod Biology. Integr Comp Biol 63:1226–1239. https://doi.org/10.1093/icb/icad087
39. Styfhals R., Zolotarov G., Hulselmans G., Spanier K.I., Poovathingal S., Elagoz A.M., De Winter S., Deryckere A., Rajewsky N., Ponte G., et al. (2022) Cell type diversity in a developing octopus brain. Nat Commun 13:7392. https://doi.org/10.1038/s41467-022-35198-1
40. Songco-Casey J.O., Coffing G.C., Piscopo D.M., Pungor J.R., Kern A.D., Miller A.C., Niell C.M. (2022) Cell types and molecular architecture of the Octopus bimaculoides visual system. Curr Biol. https://doi.org/10.1016/j.cub.2022.10.015
41. Duruz J., Sprecher M., Kaldun J.C., Al-Soudy A.-S., Lischer H.E., van Geest G., Nicholson P., Bruggmann R., Sprecher S.G. (2023) Molecular characterization of cell types in the squid Loligo vulgaris. eLife 12:e80670. https://doi.org/10.7554/eLife.80670
42. Gavriouchkina D., Tan Y., Parey E., Ziadi-Künzli F., Hasegawa Y., Piovani L., Zhang L., Sugimoto C., Luscombe N., Marlétaz F., et al. (2025) A single-cell atlas of the bobtail squid visual and nervous system highlights molecular principles of convergent evolution. Nat Ecol Evol 9:1245–1262. https://doi.org/10.1038/s41559-025-02720-9
43. Schmidbaur H., Kawaguchi A., Clarence T., Fu X., Hoang O.P., Zimmermann B., Ritschard E.A., Weissenbacher A., Foster J.S., Nyholm S.V., et al. (2022) Emergence of novel cephalopod gene regulation and expression through large-scale genome reorganization. Nat Commun 13:2172. https://doi.org/10.1038/s41467-022-29694-7
44. Rouressol L., Briseno J., Vijayan N., Chen G.Y., Ritschard E.A., Sanchez G., Nyholm S.V., McFall-Ngai M.J., Simakov O. (2023) Emergence of novel genomic regulatory regions associated with light-organ development in the bobtail squid. iScience 26:107091. https://doi.org/10.1016/j.isci.2023.107091
45. Yoshida M., Hirota K., Imoto J., Okuno M., Tanaka H., Kajitani R., Toyoda A., Itoh T., Ikeo K., Sasaki T., et al. (2022) Gene Recruitments and Dismissals in the Argonaut Genome Provide Insights into Pelagic Lifestyle Adaptation and Shell-like Eggcase Reacquisition. Genome Biol Evol 14:evac140. https://doi.org/10.1093/gbe/evac140
46. Albertin C.B., Simakov O., Mitros T., Wang Z.Y., Pungor J.R., Edsinger-Gonzales E., Brenner S., Ragsdale C.W., Rokhsar D.S. (2015) The octopus genome and the evolution of cephalopod neural and morphological novelties. Nature 524:220–224. https://doi.org/10.1038/nature14668
47.
1. Destanović D.
2. Schultz D.T.
3. Styfhals R.
4. Cruz F.
5. Gómez-Garrido J.
6. Gut M.
7. Gut I.
8. Fiorito G.
9. Simakov O.
10. Alioto T.S.
11. et al.
2023A chromosome-level reference genome for the common octopus, Octopus vulgaris (Cuvier, 1797)G3 Genes Genomes Genetics 13:jkad220https://doi.org/10.1093/g3journal/jkad220 PubMed Google Scholar
48.
1. Belcaid M.
2. Casaburi G.
3. McAnulty S.J.
4. Schmidbaur H.
5. Suria A.M.
6. Moriano-Gutierrez S.
7. Pankey M.S.
8. Oakley T.H.
9. Kremer N.
10. Koch E.J.
11. et al.
2019Symbiotic organs shaped by distinct modes of genome evolution in cephalopodsProc Natl Acad Sci 116:3030–3035https://doi.org/10.1073/pnas.1817322116 PubMed Google Scholar
49.
1. Albertin C.B.
2. Medina-Ruiz S.
3. Mitros T.
4. Schmidbaur H.
5. Sanchez G.
6. Wang Z.Y.
7. Grimwood J.
8. Rosenthal J.J.C.
9. Ragsdale C.W.
10. Simakov O.
11. et al.
2022Genome and transcriptome mechanisms driving cephalopod evolutionNat Commun 13https://doi.org/10.1038/s41467-022-29748-w PubMed Google Scholar
50.
1. Marino A.
2. Kizenko A.
3. Wong W.Y.
4. Ghiselli F.
5. Simakov O
2022Repeat Age Decomposition Informs an Ancient Set of Repeats Associated With Coleoid Cephalopod DivergenceFront Genet 13https://doi.org/10.3389/fgene.2022.793734 PubMed Google Scholar
51.
1. Simakov O.
2. Bredeson J.
3. Berkoff K.
4. Marletaz F.
5. Mitros T.
6. Schultz D.T.
7. O’Connell B.L.
8. Dear P.
9. Martinez D.E.
10. Steele R.E.
11. et al.
2022Deeply conserved synteny and the evolution of metazoan chromosomesSci Adv 8:eabi5884https://doi.org/10.1126/sciadv.abi5884 PubMed Google Scholar
52.
1. Zou Y.
2. Fu J.
3. Liang Y.
4. Luo X.
5. Shen M.
6. Huang M.
7. Chen Y.
8. You W.
9. Ke C
2024Chromosome-level genome assembly of the ivory shell Babylonia areolataSci Data 11:1201https://doi.org/10.1038/s41597-024-04001-9 PubMed Google Scholar
53.
1. Chen Z.
2. Baeza J.A.
3. Chen C.
4. Gonzalez M.T.
5. González V.L.
6. Greve C.
7. Kocot K.M.
8. Arbizu P.M.
9. Moles J.
10. Schell T.
11. et al.
2025A genome-based phylogeny for Mollusca is concordant with fossils and morphologyScience 387:1001–1007https://doi.org/10.1126/science.ads0215 PubMed Google Scholar
54.
1. Zeng Q.
2. Liu J.
3. Wang C.
4. Wang H.
5. Zhang L.
6. Hu J.
7. Bao L.
8. Wang S
2021High-quality reannotation of the king scallop genome reveals no ‘gene-rich’ feature and evolution of toxin resistanceComput Struct Biotechnol J 19:4954–4960https://doi.org/10.1016/j.csbj.2021.08.038 PubMed Google Scholar
55.
1. Sun S.
2. Han X.
3. Han Z.
4. Liu Q
2024Chromosomal-scale genome assembly and annotation of the land slug (Meghimatium bilineatum)Sci Data 11:35https://doi.org/10.1038/s41597-023-02893-7 PubMed Google Scholar
56.
1. Männer L.
2. Schell T.
3. Spies J.
4. Galià-Camps C.
5. Baranski D.
6. Ben Hamadou A.
7. Gerheim C.
8. Neveling K.
9. Helfrich E.J.N.
10. Greve C
2024Chromosome-level genome assembly of the sacoglossan sea slug Elysia timida (Risso, 1818)BMC Genomics 25:941https://doi.org/10.1186/s12864-024-10829-7 PubMed Google Scholar
57.
1. Ma B.
2. Jin W.
3. Fu H.
4. Sun B.
5. Yang S.
6. Ma X.
7. Wen H.
8. Wu X.
9. Wang H.
10. Cao X
2023A High-Quality Chromosome-Level Genome Assembly of a Snail Cipangopaludina cathayensis (Gastropoda: Viviparidae)Genes 14https://doi.org/10.3390/genes14071365 PubMed Google Scholar
58.
1. Liu Z.
2. Huang Y.
3. Chen H.
4. Liu C.
5. Wang M.
6. Bian C.
7. Wang L.
8. Song L
2023Chromosome-level genome assembly of the deep-sea snail Phymorhynchus buccinoides provides insights into the adaptation to the cold seep habitatBMC Genomics 24:679https://doi.org/10.1186/s12864-023-09760-0 PubMed Google Scholar
59.
1. Peñaloza C.
2. Gutierrez A.P.
3. Eöry L.
4. Wang S.
5. Guo X.
6. Archibald A.L.
7. Bean T.P.
8. Houston R.D
2021A chromosome-level genome assembly for the Pacific oyster Crassostrea gigasGigaScience 10:giab020https://doi.org/10.1093/gigascience/giab020 PubMed Google Scholar
60.
1. Arias-Montecino A.
2. Sykes A.
3. Álvarez-Hernán G.
4. de Mera-Rodríguez J.A.
5. Calle-Guisado V.
6. Martín-Partido G.
7. Rodríguez-León J.
8. Francisco-Morcillo J.
2024Histological and scanning electron microscope observations on the developing retina of the cuttlefish (Sepia officinalis Linnaeus, 1758)Tissue Cell 88:102417https://doi.org/10.1016/j.tice.2024.102417 PubMed Google Scholar
61.
1. O’Brien C.E.
2. Mezrai N.
3. Darmaillacq A.-S.
4. Dickel L
2017Behavioral development in embryonic and early juvenile cuttlefish (Sepia officinalis)Dev Psychobiol 59:145–160https://doi.org/10.1002/dev.21476 PubMed Google Scholar
62.
1. Bellingham J.
2. Morris A.G.
3. Hunt D.M
1998The Rhodopsin Gene of the Cuttlefish Sepia Officinalis: Sequence and Spectral TuningJ Exp Biol 201:2299–2306https://doi.org/10.1242/jeb.201.15.2299 PubMed Google Scholar
63.
1. Chemello G.
2. Faraoni V.
3. Notarstefano V.
4. Maradonna F.
5. Carnevali O.
6. Gioacchini G
2022First Evidence of Microplastics in the Yolk and Embryos of Common Cuttlefish (Sepia officinalis) from the Central Adriatic Sea: Evaluation of Embryo and Hatchling Structural Integrity and DevelopmentAnimals 13https://doi.org/10.3390/ani13010095 PubMed Google Scholar
64.
1. Court M.
2. Macau M.
3. Ranucci M.
4. Marquês T.
5. Repolho T.
6. Lopes V.M.
7. Rosa R.
8. Paula J.R
2024Oxygen loss compromises growth and cognition of cuttlefish newbornsProc Biol Sci 291:20241291https://doi.org/10.1098/rspb.2024.1291 PubMed Google Scholar
65.
1. Gladman N.W.
2. Askew G.N
2023The hydrodynamics of jet propulsion swimming in hatchling and juvenile European common cuttlefish, Sepia officinalisJ Exp Biol 226https://doi.org/10.1242/jeb.246225 PubMed Google Scholar
66.
1. Yang T.
2. Jia Z.
3. Chen H.
4. Deng Z.
5. Liu W.
6. Chen L.
7. Li L
2020Mechanical design of the highly porous cuttlebone: A bioceramic hard buoyancy tank for cuttlefishProc Natl Acad Sci U S A 117:23450–23459https://doi.org/10.1073/pnas.2009531117 PubMed Google Scholar
67.
1. Von Boletzky S.
1987Fecundity variation in relation to intermittent or chronic spawning in the cuttlefish, Sepia officinalis L. (Mollusca, Cephalopoda)Bull Mar Sci 40:382–388Google Scholar
68.
1. Laptikhovsky V.
2. Salman A.
3. Onsoy B.
4. Katagan T
2003Fecundity of the common cuttlefish, Sepia officinalis L. (Cephalopoda, Sepiidae): A new look at an old problemSci Mar 67:279–284https://doi.org/10.3989/scimar.2003.67n3279 Google Scholar
69.
1. Reiter S.
2. Hülsdunk P.
3. Woo T.
4. Lauterbach M.A.
5. Eberle J.S.
6. Akay L.A.
7. Longo A.
8. Meier-Credo J.
9. Kretschmer F.
10. Langer J.D.
11. et al.
2018Elucidating the control and development of skin patterning in cuttlefishNature 562:361–366https://doi.org/10.1038/s41586-018-0591-3 PubMed Google Scholar
70.
1. Woo T.
2. Liang X.
3. Evans D.A.
4. Fernandez O.
5. Kretschmer F.
6. Reiter S.
7. Laurent G
2023The dynamics of pattern matching in camouflaging cuttlefishNature 619:122–128https://doi.org/10.1038/s41586-023-06259-2 PubMed Google Scholar
71.
1. Imarazene B.
2. Andouche A.
3. Bassaglia Y.
4. Lopez P.-J.
5. Bonnaud-Ponticelli L
2017Eye Development in Sepia officinalis Embryo: What the Uncommon Gene Expression Profiles Tell Us about Eye EvolutionFront Physiol 8:613https://doi.org/10.3389/fphys.2017.00613 PubMed Google Scholar
72.
1. Cocci P.
2. Mosconi G.
3. Palermo F.A
2023Effect of polycyclic aromatic hydrocarbons on homeobox gene expression during embryonic development of cuttlefish, Sepia officinalisChemosphere 325:138315https://doi.org/10.1016/j.chemosphere.2023.138315 PubMed Google Scholar
73.
1. Andouche A.
2. Bassaglia Y.
3. Baratte S.
4. Bonnaud L
2013Reflectin genes and development of iridophore patterns in Sepia officinalis embryos (Mollusca, Cephalopoda)Dev Dyn 242:560–571https://doi.org/10.1002/dvdy.23938 PubMed Google Scholar
74.
1. McKenna V.
2. Archibald J.
3. Beinart R.
4. Dawson M.
5. Hentschel U.
6. Keeling P.
7. Lopez J.
8. Martin-Duran J.
9. Petersen J.
10. Sigwart J.
11. et al.
2024The Aquatic Symbiosis Genomics Project: probing the evolution of symbiosis across the Tree of Life [version 2; peer review: 1 approved, 1 approved with reservations]Wellcome Open Res 6https://doi.org/10.12688/wellcomeopenres.17222.2 PubMed Google Scholar
75.
1. The Darwin Tree of Life Project Consortium
2022Sequence locally, think globally: The Darwin Tree of Life Project. Proc Natl Acad Sci 119:e2115642118https://doi.org/10.1073/pnas.2115642118 PubMed Google Scholar
76.
1. Rubino F.A.
2. Coffing G.C.
3. Gibbons C.J.
4. Small S.T.
5. Desvignes T.
6. Pessutti J.
7. Petersen A.M.
8. Arkhipkin A.
9. Shcherbich Z.
10. Postlethwait J.H.
11. et al.
2025A non-invasive method to genotype cephalopod sex by quantitative PCRbioRxiv https://doi.org/10.1101/2025.10.28.685099 PubMed Google Scholar
77.
1. Ranallo-Benavidez T.R.
2. Jaron K.S.
3. Schatz M.C
2020GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomesNat Commun 11:1432https://doi.org/10.1038/s41467-020-14998-3 PubMed Google Scholar
78.
1. Vurture G.W.
2. Sedlazeck F.J.
3. Nattestad M.
4. Underwood C.J.
5. Fang H.
6. Gurtowski J.
7. Schatz M.C
2017GenomeScope: fast reference-free genome profiling from short readsBioinformatics 33:2202–2204https://doi.org/10.1093/bioinformatics/btx153 PubMed Google Scholar
79.
1. Hjelmen C.E
2024Genome size and chromosome number are critical metrics for accurate genome assembly assessment in EukaryotaGenetics 227:iyae099https://doi.org/10.1093/genetics/iyae099 PubMed Google Scholar
80.
1. Simão F.A.
2. Waterhouse R.M.
3. Ioannidis P.
4. Kriventseva E.V.
5. Zdobnov E.M
2015BUSCO: assessing genome assembly and annotation completeness with single-copy orthologsBioinformatics 31:3210–3212https://doi.org/10.1093/bioinformatics/btv351 PubMed Google Scholar
81.
1. Zeng X.
2. Yi Z.
3. Zhang X.
4. Du Y.
5. Li Y.
6. Zhou Z.
7. Chen S.
8. Zhao H.
9. Yang S.
10. Wang Y.
11. et al.
2024Chromosome-level scaffolding of haplotype-resolved assemblies using Hi-C data without reference genomesNat Plants 10:1184–1200https://doi.org/10.1038/s41477-024-01755-3 PubMed Google Scholar
82.
1. Zhou C.
2. McCarthy S.A.
3. Durbin R
2023YaHS: yet another Hi-C scaffolding toolBioinformatics 39:btac808https://doi.org/10.1093/bioinformatics/btac808 PubMed Google Scholar
83.
1. Dudchenko O.
2. Shamim M.S.
3. Batra S.S.
4. Durand N.C.
5. Musial N.T.
6. Mostofa R.
7. Pham M.
8. Hilaire B.G.S.
9. Yao W.
10. Stamenova E.
11. et al.
2018The Juicebox Assembly Tools module facilitates de novo assembly of mammalian genomes with chromosome-length scaffolds for under $1000bioRxiv https://doi.org/10.1101/254797 Google Scholar
84.
1. Sanchez G.
2. Fernández-Álvarez F.Á.
3. Bernal A.
4. Heath-Heckman E.
5. Lami R.
6. McFall-Ngai M.
7. Nishiguchi M.
8. Nyholm S.
9. Simakov O.
10. Allcock A.L.
11. et al.
2026Rapid mid-Cretaceous diversification of squid and cuttlefish preceded radiation into coastal niches. Nat Ecol Evol https://doi.org/10.1038/s41559-026-03009-1 PubMed Google Scholar
85.
1. Gao Y.M.
2. Natsukari Y
1990Karyological studies on seven cephalopodsVenus Jpn J Malacol 49:126–145https://doi.org/10.18941/venusjjm.49.2_126 Google Scholar
86.
1. Jazayeri A.
2. Papan F.
3. Motamedi H.
4. Mahmoudi S
2011Karyological investigation of Persian Gulf cuttle fish (Sepia arabica) in the coasts of Khuzestan provinceLife Sci J 8 Google Scholar
87.
1. Ebrahimi Pour M.
2009The Study of Persian Gulf Cuttlefish (Sepia pharaonis) Chromosome Via Incubation of Blood Cells. Appl Biol 22:1–8https://doi.org/10.22051/jab.2009.3307 Google Scholar
88.
1. Vitturi R.
2. Rasotto M.B.
3. Farinella-Ferruzza N
1982The chromosomes of 16 molluscan species. Ital J Zool 49:61–71https://doi.org/10.1080/11250008209439373 Google Scholar
89.
1. Jain C.
2. Rhie A.
3. Hansen N.F.
4. Koren S.
5. Phillippy A.M
2022Long-read mapping to repetitive reference sequences using Winnowmap2Nat Methods 19:705–710https://doi.org/10.1038/s41592-022-01457-8 PubMed Google Scholar
90.
1. Jain C.
2. Rhie A.
3. Zhang H.
4. Chu C.
5. Walenz B.P.
6. Koren S.
7. Phillippy A.M
2020Weighted minimizer sampling improves long read mappingBioinformatics 36:i111–i118https://doi.org/10.1093/bioinformatics/btaa435 PubMed Google Scholar
91.
1. Vasimuddin M.
2. Misra S.
3. Li H.
4. Aluru S.
2019Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS):314–324https://doi.org/10.1109/IPDPS.2019.00041 Google Scholar
92.
1. Abdennur N.
2. Fudenberg G.
3. Flyamer I.M.
4. Galitsyna A.A.
5. Goloborodko A.
6. Imakaev M.
7. Venev S.V.
2023Pairtools: from sequencing data to chromosome contactsbioRxiv https://doi.org/10.1101/2023.02.13.528389 PubMed Google Scholar
93.
1. Smit A.
2. Hubley R.
3. Green P.
no dateRepeatMaskerhttp://www.repeatmasker.org
94.
1. López-Córdova D.A.
2. Avaria-Llautureo J.
3. Ulloa P.M.
4. Braid H.E.
5. Revell L.J.
6. Fuchs D.
7. Ibáñez C.M
2022Mesozoic origin of coleoid cephalopods and their abrupt shifts of diversification patternsMol Phylogenet Evol 166:107331https://doi.org/10.1016/j.ympev.2021.107331 PubMed Google Scholar
95.
1. Coffing G.
2. Tittes S.
3. Small S.
4. Songco-Casey J.O.
5. Pungor J.
6. Piscopo D.
7. Miller A.
8. Niell C.
9. Kern A
2024Data for: Cephalopod Sex Determination and its Ancient Evolutionary OriginZenodo https://doi.org/10.5281/zenodo.14010217 Google Scholar
96.
1. Coffing G.C.
2. Tittes S.
3. Small S.T.
4. Songco-Casey J.O.
5. Piscopo D.M.
6. Pungor J.R.
7. Miller A.C.
8. Niell C.M.
9. Kern A.D
2025Cephalopod sex determination and its ancient evolutionary originCurr Biol 35:931–939https://doi.org/10.1016/j.cub.2025.01.005 PubMed Google Scholar
97.
1. Flynn J.M.
2. Hubley R.
3. Goubert C.
4. Rosen J.
5. Clark A.G.
6. Feschotte C.
7. Smit A.F
2020RepeatModeler2 for automated genomic discovery of transposable element familiesProc Natl Acad Sci 117:9451–9457https://doi.org/10.1073/pnas.1921046117 PubMed Google Scholar
98.
1. Feschotte C
2008The contribution of transposable elements to the evolution of regulatory networksNat Rev Genet 9:397–405https://doi.org/10.1038/nrg2337 PubMed Google Scholar
99.
1. Hoff K.J.
2. Lomsadze A.
3. Borodovsky M.
4. Stanke M
2019Whole-Genome Annotation with BRAKERMethods Mol Biol Clifton NJ 1962:65–95https://doi.org/10.1007/978-1-4939-9173-0_5 PubMed Google Scholar
100.
1. Brůna T.
2. Hoff K.J.
3. Lomsadze A.
4. Stanke M.
5. Borodovsky M
2021BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein databaseNAR Genomics Bioinforma 3:lqaa108https://doi.org/10.1093/nargab/lqaa108 PubMed Google Scholar
101.
1. Brůna T.
2. Gabriel L.
3. Hoff K.J
2024Navigating Eukaryotic Genome Annotation Pipelines: A Route Map to BRAKER, Galba, and TSEBRAarXiv https://doi.org/10.48550/arXiv.2403.19416 Google Scholar
102.
1. Gabriel L.
2. Hoff K.J.
3. Brůna T.
4. Borodovsky M.
5. Stanke M
2021TSEBRA: transcript selector for BRAKERBMC Bioinformatics 22:566https://doi.org/10.1186/s12859-021-04482-0 PubMed Google Scholar
103.
1. Hoff K.J.
2. Lange S.
3. Lomsadze A.
4. Borodovsky M.
5. Stanke M
2016BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUSBioinforma Oxf Engl 32:767–769https://doi.org/10.1093/bioinformatics/btv661 PubMed Google Scholar
104.
1. Stanke M.
2. Schöffmann O.
3. Morgenstern B.
4. Waack S
2006Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sourcesBMC Bioinformatics 7:62https://doi.org/10.1186/1471-2105-7-62 PubMed Google Scholar
105.
1. Stanke M.
2. Diekhans M.
3. Baertsch R.
4. Haussler D
2008Using native and syntenically mapped cDNA alignments to improve de novo gene findingBioinformatics 24:637–644https://doi.org/10.1093/bioinformatics/btn013 PubMed Google Scholar
106.
1. Li H
2023Protein-to-genome alignment with miniprotBioinformatics 39:btad014https://doi.org/10.1093/bioinformatics/btad014 PubMed Google Scholar
107.
1. Iwata H.
2. Gotoh O
2012Benchmarking spliced alignment programs including Spaln2, an extended version of Spaln that incorporates additional species-specific featuresNucleic Acids Res 40:e161https://doi.org/10.1093/nar/gks708 PubMed Google Scholar
108.
1. Gotoh O
2008A space-efficient and accurate method for mapping and aligning cDNA sequences onto genomic sequenceNucleic Acids Res 36:2630–2638https://doi.org/10.1093/nar/gkn105 PubMed Google Scholar
109.
1. Buchfink B.
2. Xie C.
3. Huson D.H
2015Fast and sensitive protein alignment using DIAMONDNat Methods 12:59–60https://doi.org/10.1038/nmeth.3176 PubMed Google Scholar
110.
1. Kovaka S.
2. Zimin A.V.
3. Pertea G.M.
4. Razaghi R.
5. Salzberg S.L.
6. Pertea M
2019Transcriptome assembly from long-read RNA-seq alignments with StringTie2Genome Biol 20:278https://doi.org/10.1186/s13059-019-1910-1 PubMed Google Scholar
111.
1. Brůna T.
2. Lomsadze A.
3. Borodovsky M
2024GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomesGenome Res 34:757–768https://doi.org/10.1101/gr.278373.123 PubMed Google Scholar
112.
1. Huang N.
2. Li H
2023compleasm: a faster and more accurate reimplementation of BUSCOBioinformatics 39:btad595https://doi.org/10.1093/bioinformatics/btad595 PubMed Google Scholar
113.
1. Pertea G.
2. Pertea M
2020GFF Utilities: GffRead and GffComparePreprint at F1000Research https://doi.org/10.12688/f1000research.23297.2 PubMed Google Scholar
114.
1. Gabriel L.
2. Brůna T.
3. Hoff K.J.
4. Ebel M.
5. Lomsadze A.
6. Borodovsky M.
7. Stanke M
2024BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRAGenome Res 34:769–777https://doi.org/10.1101/gr.278090.123 PubMed Google Scholar
115.
1. Rogers T.F.
2. Yalçın G.
3. Briseno J.
4. Vijayan N.
5. Nyholm S.V.
6. Simakov O
2024Gene modelling and annotation for the Hawaiian bobtail squid, Euprymna scolopesSci Data 11:40https://doi.org/10.1038/s41597-023-02903-8 PubMed Google Scholar
116.
1. Zhang Y.
2. Mao F.
3. Mu H.
4. Huang M.
5. Bao Y.
6. Wang L.
7. Wong N.-K.
8. Xiao S.
9. Dai H.
10. Xiang Z.
11. et al.
2021The genome of Nautilus pompilius illuminates eye evolution and biomineralization. Nat Ecol Evol 5:927–938https://doi.org/10.1038/s41559-021-01448-6 PubMed Google Scholar
117.
1. Nevers Y.
2. Rossier V.
3. Train C.M.
4. Altenhoff A.
5. Dessimoz C.
6. Glover N
2022Multifaceted quality assessment of gene repertoire annotation with OMArkbioRxiv https://doi.org/10.1101/2022.11.25.517970 Google Scholar
118.
1. Goldfarb T.
2. Kodali V.K.
3. Pujar S.
4. Brover V.
5. Robbertse B.
6. Farrell C.M.
7. Oh D.-H.
8. Astashyn A.
9. Ermolaeva O.
10. Haddad D.
11. et al.
2025NCBI RefSeq: reference sequence standards through 25 years of curation and annotationNucleic Acids Res 53:D243–D257https://doi.org/10.1093/nar/gkae1038 PubMed Google Scholar
119.
1. Blum M.
2. Andreeva A.
3. Florentino L.C.
4. Chuguransky S.R.
5. Grego T.
6. Hobbs E.
7. Pinto B.L.
8. Orr A.
9. Paysan-Lafosse T.
10. Ponamareva I.
11. et al.
2025InterPro: the protein sequence classification resource in 2025Nucleic Acids Res 53:D444–D456https://doi.org/10.1093/nar/gkae1082 PubMed Google Scholar
120.
1. Cantalapiedra C.P.
2. Hernández-Plaza A.
3. Letunic I.
4. Bork P.
5. Huerta-Cepas J
2021eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic ScaleMol Biol Evol 38:5825–5829https://doi.org/10.1093/molbev/msab293 PubMed Google Scholar
121.
1. Huerta-Cepas J.
2. Forslund K.
3. Coelho L.P.
4. Szklarczyk D.
5. Jensen L.J.
6. von Mering C.
7. Bork P.
2017Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-MapperMol Biol Evol 34:2115–2122https://doi.org/10.1093/molbev/msx148 PubMed Google Scholar
122.
1. Emms D.M.
2. Liu Y.
3. Belcher L.
4. Holmes J.
5. Kelly S
2025OrthoFinder: scalable phylogenetic orthology inference for comparative genomicsbioRxiv https://doi.org/10.1101/2025.07.15.664860 Google Scholar
123.
1. Styfhals R.
2. Seuntjens E.
3. Simakov O.
4. Sanges R.
5. Fiorito G
2019In silico Identification and Expression of Protocadherin Gene Family in Octopus vulgarisFront Physiol 9https://doi.org/10.3389/fphys.2018.01905 PubMed Google Scholar
124.
1. Ahmed S.
2. Stanley D.
3. Kim Y
2018An Insect Prostaglandin E2 Synthase Acts in Immunity and ReproductionFront Physiol 9:1231https://doi.org/10.3389/fphys.2018.01231 PubMed Google Scholar
125.
1. Kanaoka Y.
2. Urade Y
2003Hematopoietic prostaglandin D synthaseProstaglandins Leukot Essent Fatty Acids 69:163–167https://doi.org/10.1016/s0952-3278(03)00077-2 PubMed Google Scholar
126.
1. Tomarev S.I.
2. Chung S.
3. Piatigorsky J
1995Glutathione S-transferase and S-crystallins of cephalopods: evolution from active enzyme to lens-refractive proteinsJ Mol Evol 41:1048–1056https://doi.org/10.1007/BF00173186 PubMed Google Scholar
127.
1. Tan W.-H.
2. Cheng S.-C.
3. Liu Y.-T.
4. Wu C.-G.
5. Lin M.-H.
6. Chen C.-C.
7. Lin C.-H.
8. Chou C.-Y
2016Structure of a Highly Active Cephalopod S-crystallin Mutant: New Molecular Evidence for Evolution from an Active Enzyme into Lens-Refractive ProteinSci Rep 6:31176https://doi.org/10.1038/srep31176 PubMed Google Scholar
128.
1. Mendes F.K.
2. Vanderpool D.
3. Fulton B.
4. Hahn M.W
2021CAFE 5 models variation in evolutionary rates among gene familiesBioinformatics 36:5516–5518https://doi.org/10.1093/bioinformatics/btaa1022 PubMed Google Scholar
129.
1. Summers M.F.
2. Henderson L.E.
3. Chance M.R.
4. South T.L.
5. Blake P.R.
6. Perez-Alvarado G.
7. Bess J.W.
8. Sowder R.C.
9. Arthur L.O.
10. Sagi I.
11. et al.
1992Nucleocapsid zinc fingers detected in retroviruses: EXAFS studies of intact viruses and the solution-state structure of the nucleocapsid protein from HIV-1Protein Sci 1:563–574https://doi.org/10.1002/pro.5560010502 PubMed Google Scholar
130.
1. Aceituno-Valenzuela U.
2. Micol-Ponce R.
3. Ponce M.R
2020Genome-wide analysis of CCHC-type zinc finger (ZCCHC) proteins in yeast, Arabidopsis, and humansCell Mol Life Sci 77:3991–4014https://doi.org/10.1007/s00018-020-03518-7 PubMed Google Scholar
131.
1. Qi R.
2. Li P.
3. Wang Q.
4. Miao J.
5. Li Z.
6. Pan L
2026Integrated analysis of detoxification pathways and hepatotoxicity of butylated hydroxytoluene (BHT) in the Manila clam (Ruditapes philippinarum)Ecotoxicol Environ Saf 309:119677https://doi.org/10.1016/j.ecoenv.2026.119677 PubMed Google Scholar
132.
1. Ahn S.-J.
2. Marygold S.J
2021The UDP-Glycosyltransferase Family in Drosophila melanogaster: Nomenclature Update, Gene Expression and Phylogenetic AnalysisFront Physiol 12https://doi.org/10.3389/fphys.2021.648481 PubMed Google Scholar
133.
1. Gene Ontology Consortium
2026The Gene Ontology knowledgebase in 2026Nucleic Acids Res 54:D1779–D1792https://doi.org/10.1093/nar/gkaf1292 PubMed Google Scholar
134.
1. Ashburner M.
2. Ball C.A.
3. Blake J.A.
4. Botstein D.
5. Butler H.
6. Cherry J.M.
7. Davis A.P.
8. Dolinski K.
9. Dwight S.S.
10. Eppig J.T.
11. et al.
2000Gene Ontology: tool for the unification of biologyNat Genet 25:25–29https://doi.org/10.1038/75556 PubMed Google Scholar
135.
1. Redies C
1997Cadherins and the formation of neural circuitry in the vertebrate CNSCell Tissue Res 290:405–413https://doi.org/10.1007/s004410050947 PubMed Google Scholar
136.
1. Joazeiro C.A.P.
2. Wing S.S.
3. Huang H.
4. Leverson J.D.
5. Hunter T.
6. Liu Y.-C
1999The Tyrosine Kinase Negative Regulator c-Cbl as a RING-Type, E2-Dependent Ubiquitin-Protein LigaseScience 286:309–312https://doi.org/10.1126/science.286.5438.309 PubMed Google Scholar
137.
1. Ryu K.-B.
2. Jo G.-H.
3. Gil Y.-C.
4. Jeon D.
5. Choi N.-R.
6. Jung S.-H.
7. Jo S.
8. An H.S.
9. Lee H.-Y.
10. Eyun S.
11. et al.
2023Eye development and developmental expression of crystallin genes in the long arm octopus, Octopus minorFront Mar Sci 10https://doi.org/10.3389/fmars.2023.1136602 Google Scholar
138.
1. Li H.
2. Durbin R
2024Genome assembly in the telomere-to-telomere eraNat Rev Genet 25:658–670https://doi.org/10.1038/s41576-024-00718-w PubMed Google Scholar
139.
1. Lukhtanov V.A.
2. Dincă V.
3. Talavera G.
4. Vila R
2011Unprecedented within-species chromosome number cline in the Wood White butterfly Leptidea sinapis and its significance for karyotype evolution and speciationBMC Evol Biol 11:109https://doi.org/10.1186/1471-2148-11-109 PubMed Google Scholar
140.
1. Lukhtanov V.A.
2. Dincă V.
3. Friberg M.
4. Šíchová J.
5. Olofsson M.
6. Vila R.
7. Marec F.
8. Wiklund C
2018Versatility of multivalent orientation, inverted meiosis, and rescued fitness in holocentric chromosomal hybridsProc Natl Acad Sci 115:E9610–E9619https://doi.org/10.1073/pnas.1802610115 PubMed Google Scholar
141.
1. Lam E.T.
2. Hastie A.
3. Lin C.
4. Ehrlich D.
5. Das S.K.
6. Austin M.D.
7. Deshpande P.
8. Cao H.
9. Nagarajan N.
10. Xiao M.
11. et al.
2012Genome mapping on nanochannel arrays for structural variation analysis and sequence assemblyNat Biotechnol 30:771–776https://doi.org/10.1038/nbt.2303 PubMed Google Scholar
142.
1. Wang J.
2. Zheng X
2017Comparison of the genetic relationship between nine Cephalopod species based on cluster analysis of karyotype evolutionary distanceComp Cytogenet 11:477–494https://doi.org/10.3897/compcytogen.v11i3.12752 PubMed Google Scholar
143.
1. Adachi K.
2. Ohnishi K.
3. Kuramochi T.
4. Yoshinaga T.
5. Okumura S.-I
2014Molecular cytogenetic study in Octopus (Amphioctopus) areolatus from JapanFish Sci 80:445–450https://doi.org/10.1007/s12562-014-0703-4 Google Scholar
144.
1. Tanner A.R.
2. Fuchs D.
3. Winkelmann I.E.
4. Gilbert M.T.P.
5. Pankey M.S.
6. Ribeiro Â.M.
7. Kocot K.M.
8. Halanych K.M.
9. Oakley T.H.
10. da Fonseca R.R.
11. et al.
2017Molecular clocks indicate turnover and diversification of modern coleoid cephalopods during the Mesozoic Marine RevolutionProc R Soc B Biol Sci 284:20162818https://doi.org/10.1098/rspb.2016.2818 PubMed Google Scholar
145.
1. Anderson F.E.
2. Lindgren A.R
2021Phylogenomic analyses recover a clade of large-bodied decapodiform cephalopodsMol Phylogenet Evol 156:107038https://doi.org/10.1016/j.ympev.2020.107038 PubMed Google Scholar
146.
1. Lindgren A.R.
2. Pankey M.S.
3. Hochberg F.G.
4. Oakley T.H
2012A multi-gene phylogeny of Cephalopoda supports convergent morphological evolution in association with multiple habitat shifts in the marine environmentBMC Evol Biol 12:129https://doi.org/10.1186/1471-2148-12-129 PubMed Google Scholar
147.
1. Sanchez G.
2. Setiamarga D.H.E.
3. Tuanapaya S.
4. Tongtherm K.
5. Winkelmann I.E.
6. Schmidbaur H.
7. Umino T.
8. Albertin C.
9. Allcock L.
10. Perales-Raya C.
11. et al.
2018Genus-level phylogeny of cephalopods using molecular markers: current status and problematic areasPeerJ 6:e4331https://doi.org/10.7717/peerj.4331 PubMed Google Scholar
148.
1. Schultz D.T.
2. Haddock S.H.D.
3. Bredeson J.V.
4. Green R.E.
5. Simakov O.
6. Rokhsar D.S
2023Ancient gene linkages support ctenophores as sister to other animalsNature 618:110–117https://doi.org/10.1038/s41586-023-05936-6 PubMed Google Scholar
149.
1. Rogers T.F.
2. Stock J.
3. Schulz N.G.
4. Yalçin G.
5. Rencken S.
6. Weissenbacher A.
7. Clarence T.
8. Schultz D.T.
9. Ragsdale C.W.
10. Albertin C.B.
11. et al.
2025Genome reorganisation and expansion shape 3D genome architecture and define a distinct regulatory landscape in coleoid cephalopodsbioRxiv https://doi.org/10.1101/2025.08.29.672809 PubMed Google Scholar
150.
1. Yoshida M.
2. Tóth E.
3. Kon-Nanjo K.
4. Kon T.
5. Hirota K.
6. Toyoda A.
7. Toh H.
8. Miyazawa H.
9. Terauchi M.
10. Noguchi H.
11. et al.
2025Giant genome of the vampire squid reveals the derived state of modern octopod karyotypesiScience 28https://doi.org/10.1016/j.isci.2025.113832 PubMed Google Scholar
151.
1. Amaral P.
2. Carbonell-Sala S.
3. De La Vega F.M.
4. Faial T.
5. Frankish A.
6. Gingeras T.
7. Guigo R.
8. Harrow J.L.
9. Hatzigeorgiou A.G.
10. Johnson R.
11. et al.
2023The status of the human gene catalogueNature 622:41–47https://doi.org/10.1038/s41586-023-06490-x PubMed Google Scholar
152.
1. Frankish A.
2. Carbonell-Sala S.
3. Diekhans M.
4. Jungreis I.
5. Loveland J.E.
6. Mudge J.M.
7. Sisu C.
8. Wright J.C.
9. Arnan C.
10. Barnes I.
11. et al.
2023GENCODE: reference annotation for the human and mouse genomes in 2023Nucleic Acids Res 51:D942–D949https://doi.org/10.1093/nar/gkac1071 PubMed Google Scholar
153.
1. Lynch M.
2. Force A
2000The probability of duplicate gene preservation by subfunctionalizationGenetics 154:459–473https://doi.org/10.1093/genetics/154.1.459 PubMed Google Scholar
154.
1. Force A.
2. Lynch M.
3. Pickett F.B.
4. Amores A.
5. Yan Y.
6. Postlethwait J
1999Preservation of Duplicate Genes by Complementary, Degenerative MutationsGenetics 151:1531–1545https://doi.org/10.1093/genetics/151.4.1531 PubMed Google Scholar
155.
1. Freemont P.S
2000Ubiquitination: RING for destruction?Curr Biol 10:R84–R87https://doi.org/10.1016/S0960-9822(00)00287-6 PubMed Google Scholar
156.
1. Gamsjaeger R.
2. Liew C.K.
3. Loughlin F.E.
4. Crossley M.
5. Mackay J.P
2007Sticky fingers: zinc-fingers as protein-recognition motifsTrends Biochem Sci 32:63–70https://doi.org/10.1016/j.tibs.2006.12.007 PubMed Google Scholar
157.
1. Wang Y.
2. Yu Y.
3. Pang Y.
4. Yu H.
5. Zhang W.
6. Zhao X.
7. Yu J
2021The distinct roles of zinc finger CCHC-type (ZCCHC) superfamily proteins in the regulation of RNA metabolismRNA Biol 18:2107–2126https://doi.org/10.1080/15476286.2021.1909320 PubMed Google Scholar
158.
1. Focareta L.
2. Sesso S.
3. Cole A.G
2014Characterization of Homeobox Genes Reveals Sophisticated Regionalization of the Central Nervous System in the European Cuttlefish Sepia officinalisPLOS One 9:e109627https://doi.org/10.1371/journal.pone.0109627 PubMed Google Scholar
159.
1. Aken B.L.
2. Ayling S.
3. Barrell D.
4. Clarke L.
5. Curwen V.
6. Fairley S.
7. Fernandez Banet J.
8. Billis K.
9. García Girón C.
10. Hourlier T.
11. et al.
2016The Ensembl gene annotation systemDatabase J Biol Databases Curation 2016:baw093https://doi.org/10.1093/database/baw093 PubMed Google Scholar
160.
1. Veenstra G.J.C
2025pita_genes_v0.3: Doryteuthis pealeii gene modelsZenodo https://doi.org/10.5281/zenodo.15222686
161.
1. Ponte G.
2. Roumbedakis K.
3. Galligioni V.
4. Dickel L.
5. Bellanger C.
6. Pereira J.
7. Vidal E.A.
8. Grigoriou P.
9. Alleva E.
10. et al.
2023General and species-specific recommendations for minimal requirements for the use of cephalopods in scientific researchLab Anim 57https://doi.org/10.1177/00236772221111261 PubMed Google Scholar
162.
1. Fiorito G.
2. Affuso A.
3. Basil J.
4. Cole A.
5. de Girolamo P.
6. D’Angelo L.
7. Dickel L.
8. Gestal C.
9. Grasso F.
10. Kuba M.
11. et al.
2015Guidelines for the Care and Welfare of Cephalopods in Research – A consensus based on an initiative by CephRes, FELASA and the Boyd GroupLab Anim 49:1–90https://doi.org/10.1177/0023677215580006 PubMed Google Scholar
163.
1. Andrews P.L.R.
2. Darmaillacq A.-S.
3. Dennison N.
4. Gleadall I.G.
5. Hawkins P.
6. Messenger J.B.
7. Osorio D.
8. Smith V.J.
9. Smith J.A
2013The identification and management of pain, suffering and distress in cephalopods, including anaesthesia, analgesia and humane killingJ Exp Mar Biol Ecol 447:46–64https://doi.org/10.1016/j.jembe.2013.02.010 Google Scholar
164.
1. Cartolano M.
2. Huettel B.
3. Hartwig B.
4. Reinhardt R.
5. Schneeberger K
2016cDNA Library Enrichment of Full Length Transcripts for SMRT Long Read SequencingPLOS One 11:e0157779https://doi.org/10.1371/journal.pone.0157779 PubMed Google Scholar
165.
1. Rhie A.
2. Walenz B.P.
3. Koren S.
4. Phillippy A.M
2020Merqury: reference-free quality, completeness, and phasing assessment for genome assembliesGenome Biol 21:245https://doi.org/10.1186/s13059-020-02134-9 PubMed Google Scholar
166.
1. Formenti G.
2. Rhie A.
3. Walenz B.P.
4. Thibaud-Nissen F.
5. Shafin K.
6. Koren S.
7. Myers E.W.
8. Jarvis E.D.
9. Phillippy A.M
2022Merfin: improved variant filtering, assembly evaluation and polishing via k-mer validationNat Methods 19:696–704https://doi.org/10.1038/s41592-022-01445-y PubMed Google Scholar
167.
1. Cheng H.
2. Concepcion G.T.
3. Feng X.
4. Zhang H.
5. Li H
2021Haplotype-resolved de novo assembly using phased assembly graphs with hifiasmNat Methods 18:170–175https://doi.org/10.1038/s41592-020-01056-5 PubMed Google Scholar
168.
1. Lyčka M.
2. Bubeník M.
3. Závodník M.
4. Peska V.
5. Fajkus P.
6. Demko M.
7. Fajkus J.
8. Fojtová M
2024. TeloBase: a community-curated database of telomere sequences across the tree of life. Nucleic Acids Res 52:D311–D321. https://doi.org/10.1093/nar/gkad672
169.
1. Akasaki T.
2. Nikaido M.
3. Tsuchiya K.
4. Segawa S.
5. Hasegawa M.
6. Okada N
2006. Extensive mitochondrial gene arrangements in coleoid Cephalopoda and their phylogenetic implications. Mol Phylogenet Evol 38:648–658. https://doi.org/10.1016/j.ympev.2005.10.018
170.
1. Li H
2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100. https://doi.org/10.1093/bioinformatics/bty191
171.
1. Li H.
2013. Seqtk: a fast and lightweight tool for processing FASTA or FASTQ sequences.
172.
1. Shumate A.
2. Wong B.
3. Pertea G.
4. Pertea M
2022. Improved transcriptome assembly using a hybrid of long and short reads with StringTie. PLOS Comput Biol 18:e1009730. https://doi.org/10.1371/journal.pcbi.1009730
173.
1. Haas B.J.
2026. TransDecoder/TransDecoder: TransDecoder source. GitHub. https://github.com/TransDecoder/TransDecoder
174.
1. R Core Team
2024. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing.
175.
1. Lovell J.T.
2. Sreedasyam A.
3. Schranz M.E.
4. Wilson M.
5. Carlson J.W.
6. Harkess A.
7. Emms D.
8. Goodstein D.M.
9. Schmutz J
2022. GENESPACE tracks regions of interest and gene copy number variation across multiple genomes. eLife 11:e78526. https://doi.org/10.7554/eLife.78526
176.
1. Emms D.M.
2. Kelly S
2019. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20:238. https://doi.org/10.1186/s13059-019-1832-y
177.
1. Wang Y.
2. Tang H.
3. DeBarry J.D.
4. Tan X.
5. Li J.
6. Wang X.
7. Lee T.
8. Jin H.
9. Marler B.
10. Guo H.
11. et al.
2012. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res 40:e49. https://doi.org/10.1093/nar/gkr1293
178.
1. Hahsler M.
2. Piekenbrock M.
3. Doran D
2019. dbscan: Fast Density-Based Clustering with R. J Stat Softw 91:1–30. https://doi.org/10.18637/jss.v091.i01
179.
1. pysam-developers
2026. pysam.
180.
1. Dobin A.
2. Davis C.A.
3. Schlesinger F.
4. Drenkow J.
5. Zaleski C.
6. Jha S.
7. Batut P.
8. Chaisson M.
9. Gingeras T.R
2013. STAR: ultrafast universal RNA-seq aligner. Bioinforma Oxf Engl 29:15–21. https://doi.org/10.1093/bioinformatics/bts635
181.
1. Pedersen B.S.
2. Quinlan A.R
2018. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics 34:867–868. https://doi.org/10.1093/bioinformatics/btx699
182.
1. Emms D.M.
2. Kelly S
2018. STAG: Species Tree Inference from All Genes. bioRxiv. https://doi.org/10.1101/267914
183.
1. Paradis E.
2. Schliep K
2019. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35:526–528. https://doi.org/10.1093/bioinformatics/bty633
184.
1. Emms D.M.
2. Kelly S
2017. STRIDE: Species Tree Root Inference from Gene Duplication Events. Mol Biol Evol 34:3267–3278. https://doi.org/10.1093/molbev/msx259
185.
1. Quinlan A.R.
2. Hall I.M
2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842. https://doi.org/10.1093/bioinformatics/btq033
186.
1. Liao Y.
2. Smyth G.K.
3. Shi W
2014. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30:923–930. https://doi.org/10.1093/bioinformatics/btt656
187.
1. Love M.I.
2. Huber W.
3. Anders S
2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15:550. https://doi.org/10.1186/s13059-014-0550-8
188.
1. Zhu A.
2. Ibrahim J.G.
3. Love M.I
2019. Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics 35:2084–2092. https://doi.org/10.1093/bioinformatics/bty895
189.
1. Xu S.
2. Hu E.
3. Cai Y.
4. Xie Z.
5. Luo X.
6. Zhan L.
7. Tang W.
8. Wang Q.
9. Liu B.
10. Wang R.
11. et al.
2024. Using clusterProfiler to characterize multiomics data. Nat Protoc 19:3292–3320. https://doi.org/10.1038/s41596-024-01020-z
190.
1. Yu G.
2. Wang L.-G.
3. Han Y.
4. He Q.-Y
2012. clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters. OMICS J Integr Biol 16:284–287. https://doi.org/10.1089/omi.2011.0118
191.
1. Challis R.
2. Richards E.
3. Rajan J.
4. Cochrane G.
5. Blaxter M
2020. BlobToolKit – Interactive Quality Assessment of Genome Assemblies. G3 Genes|Genomes|Genetics 10:1361–1374. https://doi.org/10.1534/g3.119.400908
192.
1. Kröger B.
2. Vinther J.
3. Fuchs D
2011. Cephalopod origin and evolution: A congruent picture emerging from fossils, development and molecules. BioEssays 33:602–613. https://doi.org/10.1002/bies.201100001
193.
1. Song W.
2. Li R.
3. Zhao Y.
4. Migaud H.
5. Wang C.
6. Bekaert M
2021. Pharaoh Cuttlefish, Sepia pharaonis, Genome Reveals Unique Reflectin Camouflage Gene Set. Front Mar Sci 8. https://doi.org/10.3389/fmars.2021.639670
1. Rencken S
2. Tushev G
3. Hain D
4. Ciirdaeva E
5. Simakov O
6. Laurent G
2025. Sepia officinalis genome sequencing and assembly. NCBI BioProject ID PRJNA1091451. https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1091451
1. Rencken S
2. Tushev G
3. Hain D
4. Ciirdaeva E
5. Simakov O
6. Laurent G
2025. Sepia officinalis isolate:GLC-03058 Genome sequencing and assembly. NCBI Sequence Read Archive ID SRP570862. https://www.ncbi.nlm.nih.gov/sra/?term=SRP570862
1. Rencken S
2. Tushev G
3. Hain D
4. Ciirdaeva E
5. Simakov O
6. Laurent G
2025. Genome annotation files. Public repository of MPI for Brain Research. https://doi.org/10.17617/1.5n7h-4385
1. Rencken S
2. Tushev G
2026. Code for Sepia officinalis genome assembly. GitLab ID mpibr/laur/cuttlefishomics/soffgenome. https://gitlab.mpcdf.mpg.de/mpibr/laur/cuttlefishomics/soffgenome

Article and author information

Author information

Simone Rencken
Max Planck Institute for Brain Research, Frankfurt am Main, Germany; Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands
ORCID iD: 0009-0003-9341-2898
Georgi Tushev
Max Planck Institute for Brain Research, Frankfurt am Main, Germany
ORCID iD: 0000-0002-3340-9422
David Hain
Max Planck Institute for Brain Research, Frankfurt am Main, Germany; Faculty of Biological Sciences, Goethe University, Frankfurt am Main, Germany
ORCID iD: 0000-0002-8979-7938
Elena Ciirdaeva
Max Planck Institute for Brain Research, Frankfurt am Main, Germany
ORCID iD: 0009-0000-7360-7725
Oleg Simakov
Department of Neuroscience and Developmental Biology, University of Vienna, Vienna, Austria
ORCID iD: 0000-0002-3585-4511
Gilles Laurent
Max Planck Institute for Brain Research, Frankfurt am Main, Germany
ORCID iD: 0000-0002-2296-114X
- For correspondence: g.laurent@brain.mpg.de

Author Notes

Competing interests: No competing interests declared

Version history

Preprint posted: April 24, 2025
Sent for peer review: May 1, 2025
Reviewed Preprint version 1: August 1, 2025
Reviewed Preprint version 2: May 15, 2026
Version of Record published: May 28, 2026

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.107393. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
