Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Editors
- Reviewing Editor: Andres Jara-Oseguera, The University of Texas at Austin, Austin TX, United States of America
- Senior Editor: Kenton Swartz, National Institute of Neurological Disorders and Stroke, Bethesda, United States of America
Reviewer #1 (Public review):
Summary:
In this manuscript, Taujale et al describe an interdisciplinary approach to mine the human channelome and further discover orthologues across diverse organisms, culminating in delineating co-conserved patterns in an example ion channel: CALHM. Overall, this paper comes in two sections, one where 419 human ion channels and 48,000+ channels from diverse organisms are found through a multidisciplinary data mining approach, and a second where this data is used to find co-conserved sequences, whose functional significance is validated via experiments on CALHM1 and CALHM6. Overall, this is an intriguing data-first approach to better understand even understudied ion channels like CALHM6. However, more needs to be done to pull this story together into a single, coherent narrative.
Strengths:
This manuscript takes advantage of modern-day LLM tools to better mine the literature for ion channel sequences in humans and for orthologous ion channel sequences in other species. The authors explore the 'dark channelome' of understudied ion channels to better reveal what evolution has to tell us about our own proteins, and in the final section of the paper they illustrate how this information can be used in experimental studies. Finally, they provide a wealth of information in the supplementary tables (in the form of Excel spreadsheets) for others to explore. Overall, this is a creative approach to a wide-reaching problem that can be applied to other families of proteins.
Weaknesses:
Overall, while a considerable amount of work has been done for this manuscript, the presentation, both in terms of writing and figures, leaves much to be desired. One can imagine a story that clearly describes the need for a better-curated sequence database of ion channels, and clearly describes how existing resources fall short, but here this is not very clearly illustrated.
One question that arises with the part of the manuscript that discusses the identification and classification of ion channels is whether the authors plan to make these sequences available to the wider public. For the 419 human sequences, making a small database to share this result so that these sequences can be easily searched and downloaded would be desirable. There are a variety of acceptable formats for this: GitHub, figshare, Zenodo, or a university website would each allow a wider community to access their hard work. The authors have included enough information in the supplementary tables that this could be done by a motivated reader, but providing such a resource would greatly expand the impact of this paper. The same question can be asked of the 48,000+ ion channels from diverse organisms. For these, there is an additional concern that some may not be properly sequenced genes. What checks have been done to confirm this? UniProt contains a good deal of unreviewed sequences, especially from single-celled organisms. Potentially, this is covered in the sentence in the Methods: "Finally, the results obtained from both the full-length and pore domains were retained as true orthologous relationships to remove extraneous hits." But this process could be discussed in more detail, clearly showing that gene duplicates and fragments have been excluded from this final set of ion channel orthologues. Related to this, does this analysis include or exclude isoforms?
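To make the requested checks concrete, the following is a minimal sketch of what such a filter might look like. It is not the authors' pipeline: the `Hit` record, the `filter_orthologs` function, and the 70% length threshold are all hypothetical and chosen only for illustration. It keeps a hit only when both the full-length and pore-domain searches agree, discards likely fragments, and retains one isoform per gene.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    accession: str  # hypothetical sequence identifier
    gene: str       # gene the sequence is annotated to
    length: int     # sequence length in residues

def filter_orthologs(full_length_hits, pore_hits, ref_length, min_fraction=0.7):
    """Keep hits found by both searches, drop fragments, keep one isoform per gene.

    min_fraction is an assumed cutoff, not a value taken from the manuscript.
    """
    pore_ids = {h.accession for h in pore_hits}
    # Consensus step: a hit must appear in both the full-length and pore-domain results.
    consensus = [h for h in full_length_hits if h.accession in pore_ids]
    # Fragment check: discard sequences much shorter than the human reference.
    complete = [h for h in consensus if h.length >= min_fraction * ref_length]
    # Isoform check: keep only the longest sequence annotated to each gene.
    longest = {}
    for h in complete:
        if h.gene not in longest or h.length > longest[h.gene].length:
            longest[h.gene] = h
    return sorted(longest.values(), key=lambda h: h.accession)
```

Reporting how many sequences are removed at each of these steps would let readers judge how much of the 48,000+ set rests on unreviewed or partial entries.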
Another aspect of the identification and classification of ion channel genes that could be improved is the figures for this section. One is relatively used to seeing trees as shown in Figures 3 and 4, which show relationships between genes as distances or evolutionary relationships. The decision to show the families of ion channels in Figure 1 as pie charts within a UMAP embedding is intriguing but somewhat non-intuitive and difficult to understand. Illustrating these results with a standard tree-like visualization of the relationship of these channels to each other would be preferred.
One aspect of the pie-chart/UMAP visualization that works well is the highlighting of the 'dark' ion channels according to their status as designated by IDG, which highlights a strength of this whole paper. However, throughout the paper, this could be emphasized more as the key advantage of this approach, along with how this or similar approaches could be used for other families of proteins. Specifically, in the initial statement describing 'light' vs 'dark' channels, the importance of this distinction and the historical preference in science to study that which has already been studied could be discussed more, even including references to other studies that take this kind of approach. An example of a relevant reference here is the Structural Genomics Consortium and its goal of solving structures of proteins whose functions may not be well characterized. Furthermore, this initial statement mentioning 'light channels' was initially confusing -- does this mean light-sensing channels? As one reads on, this is clearly not the case, but for such an important central focus of this paper, these kinds of misunderstandings do not serve the authors well. Clarifying these motivations throughout the entire paper would strengthen it considerably.
Additionally, since the authors have generated this UMAP visualization, it would be interesting to understand how the human vs orthologue gene sets compare in this space. Furthermore, Figure 1, for just the human analysis, should say more clearly that this is an analysis of the human gene set and include more of the information in the text: 419 human ion channel sequences, 75 sequences previously unidentified, 4 major groups and 55 families, 62 outliers, etc. Clearer visualizations of these categories and numbers within the UMAP (and newly included tree) visualization would help guide the reader to better understand these results.
One of the most peculiar aspects of this paper is that it feels like two papers, one about better documenting the ion channel genes across species, and another with well-executed experiments on CALHM channels. One suggestion for how to link these two sections together better is to show that previous methods to analyze conserved residues in CALHM were significantly lacking. What results would that give? Why was this not enough? Were there just not enough identified CALHM orthologues to give strong signals in conservation analysis?
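As an illustration of the kind of simpler baseline meant here, the sketch below scores each column of a multiple sequence alignment by Shannon entropy. This is an assumed stand-in for "previous methods", not the co-conservation analysis used in the manuscript, and the function name `column_conservation` is hypothetical. It assumes all aligned sequences have the same length and use `-` for gaps.

```python
import math
from collections import Counter

def column_conservation(alignment):
    """Return one score per alignment column, from 0 (variable) to 1 (fully conserved)."""
    n_cols = len(alignment[0])
    max_entropy = math.log2(20)  # upper bound for the 20 standard amino acids
    scores = []
    for i in range(n_cols):
        # Ignore gap characters when counting residues in this column.
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        # Shannon entropy of the residue distribution in this column.
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores
```

Running a baseline like this on the previously known CALHM sequences and again on the expanded orthologue set, and showing which of the mutated positions stand out only in the latter, would directly connect the two halves of the paper.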
Some of the analysis pipeline is unclear. Specifically, the RAG analysis seems critical, but it is unclear how it works: is it built on top of the GPT framework, and does it recursively query the model about its answers to prompts? Some example prompts would be useful to understand this. Furthermore, the existence of 76 auxiliary non-pore-containing 'ion channel' genes in this analysis is a little confusing, as it seems that part of the pipeline looks for pore-lining residues. How many of these are picked up in the larger orthologue search? Is it harder to check that these are indeed ion channel genes? A further discussion of the choice to include these auxiliary sequences would be relevant. This could simply be further discussion of the literature that has made the same choice in the past.
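For readers unfamiliar with the term, a generic retrieval-augmented generation step can be sketched as below. This is a toy illustration of the general idea only and makes no claim about the authors' implementation: the bag-of-words retriever, the prompt wording, and the function names `retrieve` and `build_prompt` are all assumptions, and the call to a language model is deliberately left out.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True)
    return ranked[:k]

def build_prompt(question, passages, k=2):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}"
```

Stating which of these pieces the pipeline uses (the document store, the retriever, the prompt template, and whether answers are fed back into further queries) would resolve most of the ambiguity.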
Overall, this manuscript is a valuable contribution to the field, but it requires a few main things to make it truly useful. Namely, how has this approach really improved the ability to identify conserved residues over a less-involved approach? A better description of their methods and results is required in the first section of the paper, as well as some cosmetic improvements.
Reviewer #2 (Public review):
Summary:
In this paper, the authors defined the "channelome," consisting of 419 predicted human ion channels as well as 48,000 ion channel orthologs from other organisms. Using this information, the ion channels were clustered into groups, which can potentially be used to make predictions about understudied ion channels in the groups. The authors then focused on the CALHM ion channel family, mutating conserved residues and assessing channel function.
Strengths:
The curation of the channelome provides an excellent resource for researchers studying ion channels. Supplemental Table 1 is well organized with an abundance of useful information.
Weaknesses:
There are substantial concerns regarding the analysis of the CALHM channels as detailed below.
(1) There are significant problems with the methodology used for the electrophysiology studies. A pulse protocol is used to assess the current-voltage relationship (-100 to +140 mV), which extends far beyond the physiological range; currents for the mutant channels were only assessed at +120 mV. It is also unclear why a holding potential of 0 mV was used for CALHM6 recordings; the channel is already open at this voltage (and in Figure 4, only n = 3 for CALHM6). Further, proper controls were not performed. Inhibitors such as Gd3+ can be used to ensure that only CALHM currents are being measured.
(2) In line 334, the authors state that "expression levels of wild-type proteins and mutants are comparable." However, Western blots showing CALHM protein abundance (Supplementary Figure 3) are not of acceptable quality - in the top blot, WT CALHM1 can't even be seen. Representative blots were not shown for all mutants, and there was no effort to determine if levels were statistically significant compared to the wild-type control. Even if there is more or less protein, what does this mean? The protein could be in an intracellular compartment and not at the plasma membrane. In mammalian cells, CALHM6 is localized to intracellular compartments and only translocates to the plasma membrane upon activating stimulus (Danielli et al, EMBO J, 2023). Thus, if CALHM6 is only intracellular, the protein amount would not change, but the measured current would. Abundant intracellular CALHM1 has also been observed in mammalian cells transfected with this protein (Dreses-Werringloer et al., Cell, 2008). The best way to determine if mutations impact CALHM channel localization is to express GFP-tagged constructs in Xenopus oocytes and look for surface expression.
(3) Since the authors have not definitively shown that there are no defects in localization, they cannot make the claim in lines 346-356 that the mutations "either abolished or markedly reduced channel activity." Further, from their data, there is speculation regarding how these residues impact conformational changes during channel opening and closing. Line 404 - again, there is no concrete evidence that any of these residues play a role in gating function. Lines 406-433 - this entire paragraph is speculation without data to back it up. There is also a lack of specificity with statements such as "all mutants showed either reduced or completely abolished activity." What is meant by activity? Do the authors mean conductance?
(4) Line 303 - 13 aligned amino acids were conserved across all CALHM homologs - are these also aligned in the related connexin and pannexin families? It is likely that the cysteines and the proline in TM2 are, since CALHM channels share many similarities with connexins and pannexins (Siebert et al, JBC, 2013). As in line 207, it would be expected that the pannexin, connexin, and CALHM channel families would group together. Related to this, see Line 406 - in connexins, there is also a proline kink in TM2 that may play a role in mediating conformational changes between channel states (Ri et al, Biophysical Journal, 1999).
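Returning to the control requested in point (1), the inhibitor-sensitive current can be obtained by subtracting the current remaining in the presence of the blocker from the control current at each voltage step. The sketch below is a minimal illustration under stated assumptions: the 20 mV step size is assumed (only the -100 to +140 mV range comes from the manuscript), and both function names are hypothetical.

```python
def voltage_steps(start_mv=-100, stop_mv=140, step_mv=20):
    """Test potentials in mV; the 20 mV increment is an assumed value."""
    return list(range(start_mv, stop_mv + 1, step_mv))

def inhibitor_sensitive_current(control, with_inhibitor):
    """Subtract the current recorded with the blocker from the control current, step by step."""
    if len(control) != len(with_inhibitor):
        raise ValueError("both recordings must cover the same voltage steps")
    return [c - b for c, b in zip(control, with_inhibitor)]
```

Plotting the subtracted current against `voltage_steps()` for wild-type and each mutant would show whether the reported differences reflect CALHM-mediated current rather than endogenous oocyte conductances.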