Introduction

Tropical islands of the Indo-Australian Archipelago (IAA), the Indo-Pacific, and the Caribbean and Atlantic have had complex geological pasts that have affected their species distribution patterns. While several Pacific islands are true oceanic islands arising de novo from the seafloor, many islands of the Caribbean and the IAA were once connected with each other or to the continental landmasses during periods of global sea-level fall in the Quaternary (Lohman et al., 2011; Voris, 2000). These land bridges and the additional habitat they offered during periods of sea-level fall facilitated on one hand colonisations among islands and between islands and the mainland (Andersen et al., 2015; Cros et al., 2020; Irestedt et al., 2013; Moyle et al., 2009; Ng et al., 2017; Pujolar et al., 2022) and on the other, range expansions within islands in response to fluctuations in suitable habitat space. The habitat type and quality of these intermittently exposed land bridges varied, ranging from open savannah biomes to lowland forests (Cannon et al., 2009; Lohman et al., 2011). In the case of stratified elevational gradients, with tropical upland montane forests expanding onto lower slopes during periods of global cooling, upland montane species might have been able to disperse across otherwise persistent dual barriers of land and sea (Cannon et al., 2009).

Tropical islands’ geologic pasts make them an important natural system to study species responses to changing habitat at both ecological and evolutionary timescales. Birds are an ideal clade for this because they are taxonomically well-characterised, are ecologically well-studied, and charismatic. Avian lineages have arisen on tropical islands during the Pleistocene (Andersen et al., 2015; Garg et al., 2018; Irestedt et al., 2013; Moyle et al., 2009), concomitant with complex, clade-wise colonisation and recolonisation trajectories (Filardi and Moyle, 2005; Jønsson et al., 2008, 2011b, 2014) including colonisations of the mainland (Jønsson et al., 2011a) and repeated, independent colonisation events (Cibois et al., 2011). In birds, these responses to changing habitat are governed by species traits in several continental taxa and non-single-island endemics (Brüniche-Olsen et al., 2019, 2021), but responses in single-island endemics are poorly understood.

With such complex biogeographic pasts, tropical, single-island endemics represent either refugial populations, or in-situ diversifications as explained by taxon cycles (Ricklefs, 1970; Ricklefs and Cox, 1972, 1978). Being confined to single islands, these species are exceptionally vulnerable to ongoing anthropogenic climate change which is unprecedented and unlike the past Pleistocene and early Holocene climatic change (Crowley, 1990). This is exacerbated because we are currently experiencing a period of relatively high sea levels with reduced habitat availability for species resulting in population bottlenecks across taxa (Hewitt, 2000; Willis et al., 2004) including birds (Nadachowska-Brzyska et al., 2015; Smith et al., 2021). Information on tropical, single-island endemics’ demographic responses to past climate change can inform conservation efforts, owing to the genomic signatures that predispose a species to extinction (Mays et al., 2018; Spielman et al., 2004) and the fact that demographic history is directly correlated to extinction risk and resilience (Wilder et al., 2023).

Pairwise Sequential Markovian Coalescent (PSMC) is a powerful method to reconstruct the effective population size (Ne) for a species using a single diploid genome (Li and Durbin, 2011) and has been used to infer demographic history across taxa (Chattopadhyay et al., 2019; Kim et al., 2016; Kozma et al., 2016; Murray et al., 2017; Nadachowska-Brzyska et al., 2015). Using this along with paleo-ecological niche modelling allows us to directly correlate the demographic history of a species with its distributional range at different time points (Chattopadhyay et al., 2019). We perform these analyses on a global panel of tropical single-island endemics to understand the effects of past climatic changes on species.

Results

Bird species panel

Our final species panel comprises 31 tropical single-island endemic bird species. Out of these, PSMC analyses were possible for 23 species and Ecological Niche Modelling (ENM) was possible for 29 species. Both PSMC and ENM analyses could be done successfully for 21 species (Table 1). These included eight Papuan species, five Philippine species, three Caribbean species, and three species from the Bismarck Archipelago. Of these 21 species, 14 species were passerines, with two from the white-eye family (Zosteropidae). Of the non-passerines, two species were parrots (family Psittacidae). All other families were represented by single species.

Details of the taxa included in this study.

PSMC = Pairwise Sequential Markovian Coalescent, ENM= Ecological Niche Modelling, y = yes, n = no.

Paleo-habitat reconstruction

Among the 19 climatic variables used (Supplementary table S1), we observed precipitation of the warmest quarter (BIO18) to contribute the most for each island endemic bird species except for the Caribbean species Amazona guildingii. 25 out of the 29 bird species experienced an increase in suitable habitat from the Last Interglacial (LIG) to the Last Glacial Maximum (LGM) (Supplementary table S2). 12 out of 14 Papuan species showed an increase in habitat availability from the LIG to the LGM, with Cicinnurus regius and Pseudorectus ferrugineus as exceptions. All Philippine species showed an increase in habitat from the LIG to the LGM as well (Figure 12, Supplementary information S2). Two out of the three Caribbean species experienced a decline in suitable habitat during this period. Of the remaining species, all showed an increase in available habitat. Upland Papuan species remained confined to upland montane regions at all the time points we reconstructed available habitat, with Rhagologus leucostigma as a notable exception (Supplementary information S2). Papuan lowland species remained largely confined to regions which were not newly exposed land bridges at all times as well. From the LGM to the present day, habitat decreased for 20 out of the 29 bird species, and two species experienced no change (Supplementary table S2). For the present study, we primarily concentrate on habitat fluctuation between the LIG and the LGM as this is the period for which we have comparative evidence of fluctuations of both habitat as well as effective population size.

Effective population size for passerines.

(a) Pairwise Sequential Markovian Coalescent (PSMC) plots using the settings –p “2 + 2 + 30 * 2 + 4 + 6 + 10” displaying reconstructed effective population size values with time for passerines based on whether habitat availability increased (left) or decreased (right) during the Last Glacial Period (LGP). Colours indicate the archipelago the bird belongs to, and the line style indicates the dietary habit of the bird species. Bold lines indicate large (> 50 g body mass) bird species. The grey bands indicate the approximate durations of the Last Interglacial (LIG) and the Last Glacial Maximum (LGM). Black arrows indicate if habitat availability increased or decreased during the LGP. A mutation rate of 1.4 x 10e–9 years/site and a generation time of 2 years for passerines, and a mutation rate of 1.91 x 10e–9 years/site and a generation time of 1 year for non-passerines were used to generate plots. Only species for which both Ecological Niche Modelling and PSMC analyses were possible are shown. Zosterops hypoxanthus is not displayed because its Ne values far exceed those of the other species. (b) Comparisons of Effective Population Size (Ne) at the Last Interglacial (LIG) and Last Glacial Maximum (LGM) incorporating bootstrapped Ne values generated using the same PSMC settings. Boxplots display bootstrapped Ne values, and blue points display the non-bootstrapped Ne value. Outlying bootstrapped and non-bootstrapped Ne values are not displayed. “***” indicates p < 0.001.

Effective population size for passerines.

(a) Pairwise Sequential Markovian Coalescent (PSMC) plots using the settings –p “2 + 2 + 30 * 2 + 4 + 6 + 10” displaying reconstructed effective population size values with time for non passerines based on whether habitat availability increased (left) or decreased (right) during the Last Glacial Period (LGP). Colours indicate the archipelago the bird belongs to, and the line style indicates the dietary habit of the bird species. Bold lines indicate large (> 50 g body mass) bird species. The grey bands indicate the approximate durations of the Last Interglacial (LIG) and the Last Glacial Maximum (LGM). Black arrows indicate if habitat availability increased or decreased during the LGP. A mutation rate of 1.4 x 10e–9 years/site and a generation time of 2 years for passerines, and a mutation rate of 1.91 x 10e–9 years/site and a generation time of 1 year for non-passerines were used to generate plots. Only species for which both Ecological Niche Modelling and PSMC analyses were possible are shown. (b) Comparisons of Effective Population Size (Ne) at the Last Interglacial (LIG) and Last Glacial Maximum (LGM) incorporating bootstrapped Ne values generated using the same PSMC settings. Boxplots display bootstrapped Ne values, and blue points display the non-bootstrapped Ne value. Outlying bootstrapped and non-bootstrapped Ne values are not displayed. “***” indicates p < 0.001, and “*” indicates p < 0.05.

Strong Quaternary fluctuations in effective population size

PSMC analyses could be successfully done for 23 species (Table 1). Results indicate large effective population size (Ne) variations in almost all bird species, corresponding to Quaternary climate shifts (Figure 3). The demographic history of these species reconstructed using PSMC analyses extends back over a million years (Supplementary information S1). Reconstructed Ne values were generally concordant across the three PSMC settings used, except for five species (Centropus unirufus, Dicaeum eximium, Irena cyanogastra, Sterrhoptilus dennistouni, Zosterops hypoxanthus) where Ne values had large differences across PSMC settings measured at the LIG and LGM (Supplementary table S3). However, in three of these species as well, trends of Ne increase or decrease from the LIG to LGM were robust across all three PSMC models considered (Centropus unirufus, Irena cyanogastra, Sterrhoptilus dennistouni) (Supplementary table S4). We found no significant difference between the different sets of PSMC settings used to estimate the change in Ne during the LGP (Kruskal-Wallis test, χ2= 0.11, DF = 2, p = 0.94).

Example Ecological Niche Modelling plots for species from Papua (Cnemophilus loriae), Australia (Alectura lathami), the Philippines (Irena cyanogastra), and the Caribbean (Puerto Rico, Todus mexicanus).

LIG = Last Interglacial. LGM = Last Glacial Maximum. MDH = Mid Holocene. CUR = Current. The continuous heatmap represents the probability of occurrence of the species and red points are known occurrences from GBIF. For all the plots see Figure S2.

The 14 passerine species we analysed showed overall lower Ne values in the LGM as compared to the LIG (Figure 12). We obtained similar results for non-passerines as well, with Rhynochetos jubatus as a notable exception using the –p “4 + 30 * 2 + 4 + 6 + 10” setting, displaying a large peak followed by a crash in Ne (Supplementary information S1). This is likely a PSMC artefact, because plots using other parameter settings did not show a peak (Supplementary information S1). Along with overall lower values, we also see a smaller range of Ne values in the LGM as compared to the LIG (Figure 12).

Correlation between historical fluctuations in Ne and distribution

Habitat change was poorly associated with change in Ne for the 21 species for which both PSMC and ENM analyses were possible (Cramer’s V = 0.11). However, when analysed separately, passerine species only showed a strong association (Cramer’s V = 0.98), while non-passerines showed a weak negative association (Cramer’s V = –0.15).

Bayesian multivariate regression models revealed that an overall fluctuation in Ne was associated with species biology and habitat change during the Last Glacial Period (LGP) even after controlling for geographical island group as a random effect (Figure 4). Most confidence intervals and beta coefficients did not overlap with zero (Figure 4) suggesting significant relationships. We found a positive association between change in habitat area (LIG to LGM) and Ne (β = 9.4, 95% CI: [1.95, 21.07]) and a negative relationship with body mass (β = −8.32, 95% CI: [–15.84, –2.59]). That is, large-bodied species showed decreases in Ne during the LGP. The change in Ne was positively correlated with the interaction term of habitat change and body mass (β = 10.28, 95% CI: [0.05, 25.18]). This suggests that habitat change and body mass act synergistically to predict the change in Ne. Diet specialists like frugivores (β = –11.89, 95% CI: [–22.57, –3.36]) and invertivores (β = –11.32, 95% CI: [–21.85, –2.85]) also showed significant negative associations with population change. However, omnivores (β = –1.28, 95% CI: [–12.77, 11.24]) did not show any association with population change. We found a marginally negative effect of clade identity as well (β = –3.64, 95% CI: [–7.91, –0.43]), that is, passerine status was associated with reducing Ne in the LGP in our dataset. Whether or not a species was a passerine was an important predictor of Ne in combination with the change in habitat from LIG to LGM (β = 8.01, 95% CI: [1.9, 17.11], suggesting that passerines respond positively to habitat change, as suggested by Cramer’s V as well. Finally, the random effects parameter for country showed that there exists country-to-country variation (β = 0.86, 95% CI: [0.04, 2.25]) (Figure 4, Supplementary table S5).

Results of the best Bayesian multivariate regression model performed.

The response variable represents the changes in effective population size (increased (=1) or decreased (=0)) during the Last Glacial Period. Shapes represent mean values, thick lines represent the 80% confidence intervals, and thin lines represent 95% confidence intervals. Model parameters and coefficients are provided in table S6. ΔHabitat = the change in suitable habitat from the Last Interglacial to the Last Glacial Maximum.

Discussion

Island birds are vulnerable to the effects of climate change

Our results on the fluctuations in paleohabitat and Ne during the LGP highlight the vulnerability of island birds to climatic fluctuations. Climate fluctuations change habitat availability and, in turn, Ne, making these species particularly sensitive to a changing environment. In response to landmass expansion, several species in our panel across islands showed an increase in their Ne during the LGP (Figure 12, Supplementary information S1), although the overall relationship was weak (Cramer’s V = 0.11). This could be because large amounts of intermittently exposed land bridge habitats remained uninhabited by the species in our panel (Supplementary figure S2). For example, we know that large amounts of exposed land bridge habitat available to our panel’s Papuan species ranged from lowland evergreen to seasonal rainforest, with an expansion of upland habitats of evergreen to seasonal rainforest. Transitional hill forest formed an ecotone between these two (Cannon et al., 2009). However, most of our species inhabit dense forest (Supplementary table), which is unlikely to comprise newly exposed landbridge habitat. Moreover, rainfall is overall lower in the LGM in the tropics including South and Southeast Asia, where most of our species are from (McGee, 2020), indicating a possible explanation of why we observed precipitation of the warmest quarter to be the largest contributing bioclimatic variable for all but one Caribbean species.

Our results also reveal that both passerine and non-passerine island endemics have entered the Holocene with low Ne, based on the PSMC plots (Figure 12). A loss of Ne predisposes species to extinction (Mays et al., 2018; Spielman et al., 2004) with a potentially causal relationship for birds (Evans and Sheldon, 2008). The current climate crisis has greatly imperilled biodiversity with the documented loss of many vertebrate species (Steadman and Martin, 2003; Tan et al., 2023; Willis et al., 2004). Fossil evidence has revealed the extinction of several species in the Caribbean since the LGM (Morgan and Woods, 1986; Orihuela et al., 2020; Steadman et al., 1984), and many of the species in our panel are Southeast Asian where rapid habitat loss is ongoing (Sodhi et al., 2004). Climate warming, sea level rise, and changes in vegetation are all associated with this loss, and flightless birds and endemics are particularly prone to extinction (Fromm and Meiri, 2021). Explicitly correlating past climate with demographic history—known to predict extinction risk (Wilder et al., 2023)—allows for a more robust validation of the latter.

Habitat loss, rise in sea level, and warming temperatures can rapidly accelerate species extinction, particularly in the tropics (Şekercioğlu et al., 2012). Effective population size, and in extension genetic diversity are associated with species survival and extinction risk (Frankham, 2005). Coalescent effective population size as used in PSMC is an important predictor of species vulnerability to extinction (Brüniche-Olsen et al., 2021; Wilder et al., 2023). Thus, an evolution-informed understanding of species vulnerability to climate change and their associations with paleohabitats can be an important tool to predict species vulnerability to the current climate crisis and future extinction risk (Brüniche-Olsen et al., 2021; Chattopadhyay et al., 2019; Gabrielli et al., 2024; Germain et al., 2023).

A strong passerine association might be driven by their rapid diversification

Passerines are a hyperdiverse clade representing over 60% of extant avian diversity. The most recent, fossil-calibrated passerine phylogeny (Oliveros et al., 2019) shows that passerines began diversifying in the Middle (47 mya) to Late Eocene (38–39 mya) with increasing diversification rates suggested as we move towards the present. Crown passerines originated in the Australo-Pacific region (Oliveros et al., 2019), and this places the many Papuan and Australian species in our panel close to the passerine diversification centre which is known to have rapid diversification rates (McCullough et al., 2022). Because bird lineages are known to have arisen on tropical islands in the Pleistocene (Andersen et al., 2015; Irestedt et al., 2013; Moyle et al., 2009), it is likely that the Southeast Asian species in our panel represent newly arisen lineages over refugia, possibly belonging to currently rapidly diversifying clades. This is supported by the fact that several of these species belong to small, often oligotypic families such as Oreoicidae (Aleadryas rufinucha), Ptilonorhynchidae (Amblyornis subalaris), Cnemophilidae (Cnemophilus loriae), Eulacestomatidae (Eulacestoma nigropectus), Ifritidae (Ifrita kowaldi), Machaerirhynchidae (Machaerirhynchus nigripectus), and Rhagologidae (Rhagologus leucostigma). Many of these species belong to poorly-studied tropical genera for which species-level divergence times are unavailable. The strong passerine association between change in Ne and available habitat area (Cramer’s V = 0.98; figure 4, β = 9.4) reveals that passerines are strongly plastic to the environment, most possibly as a consequence of their ability to rapidly diversify. This ability to diversify is reflected in their higher mutation rates as used in PSMC (Lanfear et al., 2010).

Non-passerines representing a paraphyletic outgroup do not show this trend (Cramer’s V = – 0.15). The seven non-passerine species for which both ENM and PSMC analyses were possible belong to six different families (Table 1), representing a diverse sample across the avian phylogeny and precluding non-passerine phylogenetic insight. Notably, we confirm (Cramer’s V = 0.11) that habitat area is positively correlated with genomic diversity in a clade non-specific manner for birds in general (Brüniche-Olsen et al., 2019, 2021).

Species traits influenced historic fluctuations in Ne during the LGP

The interplay of species traits and habitat availability determine how Ne values fluctuate with time in non-endemic birds (Brüniche-Olsen et al., 2021), and our results confirm this for tropical island endemics as well. Overall, habitat change in the LGP was positively associated with Ne fluctuations (Figure 4, β = 9.4). While diet generalists like omnivores showed no significant association with Ne fluctuations, specialists like frugivores (Figure 4, β = –11.89) and invertivores (Figure 4, β = –11.32) showed significant negative associations with fluctuation in Ne, implying that specialist species tended to show decreases in Ne in the LGP. Previous work has shown that large-bodied species have lower values of standing genetic diversity at a given time-point (Brüniche-Olsen et al., 2019, 2021; Eo et al., 2011). We find large-bodied species to show a negative association with Ne fluctuations (Figure 4, β = –8.32) as well. Traits like body size and diet determine extinction risk (Ripple et al., 2017; Willis et al., 2004), and our results add to the understanding of how traits modulate Ne changes through time.

Assessing the relative contribution of species traits and habitat availability in determining Ne requires their quantitative measurements. Our methods generated values for available habitat at various time-points (Supplementary tableS2). However, Ne values can so far only be reconstructed using coalescent methods like PSMC. These values depend upon the parameter settings used. While we find these to be generally concordant across settings (Kruskal-Wallis, p = 0.94, Supplementary table S4), the precise, estimated values vary. We therefore chose to cautiously measure only the direction of change (increase or decrease) and not use numerical values.

Caveats of PSMC analyses

Coalescent methods for inferring demographic history like PSMC assume no operant selection. Our bootstrapped PSMC estimates partially account for this by subsampling from across the genome and potentially breaking any linked regions under selection. Further, PSMC traces a local population’s demographic history rather than the entire species’ history (Gattepaille et al., 2013; Heller et al., 2013; Ptak and Przeworski, 2002). Population structure is thus a confounding factor. For the majority of the species in our panel, modelled habitat areas in the LGM were largely contiguous (Supplementary figure S2), making panmixia and reduced population structure a possibility, with a subsequent increase in isolation and population structure towards present day. Moreover, Papua is the only island in our panel large enough to likely support population structure.

Large peaks followed by apparent collapses in Ne have also recently been shown to be an artefact of PSMC analyses (Hilgers et al., 2024). We addressed this caveat by using three different parameter combinations and tested for consensus amongst the different parameters (increase/decrease/no change). Finally, hybridization may also increase the estimates of Ne, due to increase in heterozygosity. However, we currently lack species specific studies to address this issue. With additional population genomic data on island endemics, we can address this caveat in the future. Because of these potential caveats, we only infer the direction of population size change rather than actual estimates of Ne.

Methods

Species selection

We queried Avibase (Lepage et al., 2014) for all species endemic to tropical islands and selected all species for which assembled whole genome sequences were available on GenBank using short-read data from the Illumina platform. We excluded genomes assembled from museum specimens. This resulted in a panel of 31 species (Table 1; Supplementary table S6–S7).

Pairwise Sequential Markovian Coalescent (PSMC) analyses

For each assembled diploid genome, we identified the contigs corresponding to the sex chromosomes using Satsuma ver. 2.0 (Grabherr et al., 2010) by aligning contigs to a Gallus gallus (chicken; BioSample: SAMN02981218) genome as a reference. All contigs mapping to the sex chromosomes were removed following standard practice (Hawkins et al., 2018; Liu and Hansen, 2016). Next, we obtained all available Illumina short reads for each species from the Sequence Read Archive (SRA) database of the NCBI, and checked them for errors using FastQC (Andrews, 2010). We used Trimmomatic (Bolger et al., 2014) for preprocessing and trimming the reads using the parameter settings “ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:True LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36”.

The cleaned reads were aligned to the autosomes using BWA-MEM ver. 0.7.17 (Li, 2013) using default parameters. Next, we used SAMtools ver. 1.10 (Danecek et al., 2021) to merge all the resulting files to generate a single bam file for each species. We further sorted and removed duplicates using SAMtools and estimated the depth of coverage for each species. We excluded species with a depth of coverage < 18x. This is a sufficient value of coverage for PSMC analyses (Li and Durbin, 2011). Next, we implemented the SAMtools mpileup-bcftools pipeline to identify SNPs. The minimum and maximum depths for calling SNPs was set to one-third and double the mean depth respectively following Nadachowska-Brzyska et al. (Nadachowska-Brzyska et al., 2015).

However, because sex chromosome reads could potentially map to other regions of the genome, we also performed PSMC using an alternate method. For this, we directly mapped raw reads files onto the genome and then called SNPs on only autosomal regions using the SAMtools mpileup-bcftools pipeline, after which we performed PSMC as above (Supplementary Information S3–4). This was done for five species. Results between the two methods were not significantly different (Supplementary information S3–4).

We used three sets of parameters for PSMC analyses and looked for consensus trends amongst them because PSMC analyses can result in spurious peaks of effective population size (Hilgers et al., 2024). We used the following parameter sets for PSMC analyses: –t5 –b –r1 –p “4 + 30 * 2 + 4 + 6 + 10”, –t5 –b –r1 –p “2 + 2 + 30 * 2 + 4 + 6 + 10”, and –t5 –b –r1 –p “1 + 1 + 1 + 1 + 30 * 2 + 4 + 6 + 10”. These parameter options were chosen following Nadachowska-Brzyska et al. (2015). We performed 30 iterations for parameter optimisation and ran 100 bootstrap replicates using the “splitfa” PSMC utility on the psmcfa file to judge the uncertainty in our estimates. Bootstrapping was done by randomly sampling chromosome segments with replacement and running PSMC on them. To plot the results from the PSMC analyses, we used previously estimated values of mutation rates and generation times for passerines (1.4 x 10e–9 years/site and 2 years respectively; (Ellegren et al., 2012; Nadachowska-Brzyska et al., 2016) and non-passerines (1.91 x 10e–9 years/site and 1 year respectively) (Nam et al., 2010) (Supplementary material S1, supplementary table S3).

We extracted the precise values of the population scaled mutation rate (theta) from the bootstrapped and non-bootstrapped .psmc files and used these to calculate the precise values of Ne at the LIG and LGM (Supplementary information S5). We defined the LIG to be equal to 108,000–143,000 years ago, and the LGM to be equal to the atomic time interval closest to 20,000 years ago. We performed this for all three PSMC settings used and compared boxplots of bootstrapped PSMC values. We further performed the Kruskal-Wallis test for comparing across the three different PSMC settings using the “ggstatsplot” package in R (ver. 0.13.0; Patil, 2021).

Species and climate records

We performed reconstructions of paleohabitats through ecological niche models and reconstructed species distributions from four time periods: the Last Interglacial (LIG, approx. 110,000–130,000 years ago), the Last Glacial Maximum (LGM, approx. 20,000 years ago), the Mid-Holocene (MDH, approx. 6000 years ago) and Current (CUR, present day).

We used R (ver. 4.2.1; R Core Team, 2021) for all paleo-habitat modelling analyses. We accessed the location records of our bird species from the Global Biodiversity Information Facility (GBIF) in September, 2023 using the “rbgif” package (ver. 3.7.9) (Chamberlain and Boettiger, 2017) in R. We discarded duplicates and retained only human observed records using the “tidyverse” (v-2.0.0) package (Wickham and RStudio, 2023) followed by the ‘CoordinateCleaner’ package (v-3.0.1) (Zizka et al., 2023). We used the “spThin” package (Aiello-Lammens et al., 2014) to account for spatial autocorrelation and finally discarded spurious distribution records like records overlaying water bodies, buildings, roadways, and railways using QGIS (v-3.34+; https://www.qgis.org/). GBIF dataset keys used to access all the data points are in Supplementary table S8.

We used 19 climate variables (labelled BIO1 – BIO19; Supplementary table S4) for the four different time periods considered. We downloaded CHELSA (https://chelsa-climate.org/; Karger et al., 2021) climate predictors at a resolution of 2.5 arc-minutes (∼5 km) for all regions.

We followed Chattopadhyay et al., (2019) and used a global dataset approach to account for idiosyncratic biases due to smaller datasets to extract the location points of each endemic-island group against each climatic variable. We tested for multicollinearity using variance inflation factor (VIF) using the ‘usdm’ R package (Naimi, 2014) and considered variables for further analyses if their VIF value was ≤ 5 (Shrestha, 2020).

Habitat Suitability Modelling and Area Calculation

We used R to reconstruct paleo-climatic suitable habitat for 29 endemic island birds through ecological niche models for the four time-periods considered. For this, we accessed the Global Biodiversity Information Facility (GBIF) to extract our species’ occurrence records. We could not perform ENM analyses for two species due to a paucity in the number of GBIF occurrence points for them (Table 1).

Data partitioning and model evaluation

For each species, we used available occurrence records from the present day to generate pseudo-absence data points. We generated 500–10,000 pseudo-absence/background points for each species depending upon the area of the island (Supplementary table S6), except for Amazona guildingii endemic to Saint Vincent where we generated 15 points. Saint Vincent’s small land area could not accommodate more points than this. We used a subset of bioclimatic variables to generate a weighted average ensemble species distribution model (eSDM) for each island group. eSDMs were implemented in the R package “sdm” (v-1.2.37; Naimi and Araujo, 2016) applying the ‘MaxEnt’, ‘GLM’, and ‘BRT’ algorithms. eSDMs account for the limitations of different models by generating a weighted average of multiple models (Araújo and New, 2007; Dormann et al., 2018; Naimi and Araujo, 2016). We used k-fold cross-validation (CV) with replication for each method for training and test datasets across each endemic island group (Supplementary table S9). Model accuracy was measured using the Area Under the Curve (AUC) and True Skill Statistic (TSS) metrics. We first performed the above analyses for the present-day (CUR) distribution using the weighted-average of AUC and TSS. AUC and TSS values greater than 0.9 and 0.75 respectively have been shown to be indicators of superior model performance (Ahmad et al., 2019) and we chose a similar threshold (AUC ≥ 0.9 & TSS ≥ 0.8) to get the best ensemble model for most of the species except for 10 species for which data quality was poor (Supplementary table S9). Models for the other time points: the LIG, LGM, and MDH, were generated using the eSDM model generated based on CUR data.

Suitable area analysis

All spatial analyses were carried out using the R package ‘terra’ (ver. 1.7.13) (Hijmans et al., 2024). Equal area projections were used to calculate the absolute suitable area of each species across the four time-periods. To account for a spherical Earth and the limitations of landmass depiction of the original eSDM raster (World Geodetic System 1984) we reprojected it into a pseudocylindrical projection (Eckert IV). Further, we used the average quantile threshold pixels to measure the availability of absolute suitable areas for the LIG, LGM, MDH, and CUR periods (Supplementary table S2).

Statistical Analyses

We checked for associations between Ne and habitat change during the Last Glacial Period (from LIG to LGM) (LGP) using Cramer’s V implemented in the“polycor” package in R (Fox, 2022). Cramver’s V varies between –1 to 1, with values closer to 0 suggesting no-correlation. This was done for the entire dataset, and then for passerines and non-passerines separately.

We further explored the effects of species biology, habitat, and phylogenetic constraint on the fluctuation in Ne during the LGP using Bayesian Multilevel Models (MLMs), using the “brms” package in R (Bürkner, 2018). Our fixed effect predictors were habitat change from the LIG to LGM, diet (invertivore, frugivore, or omnivore; Supplementary table S6), body mass, and clade identity (passerine and non-passerine). The latter allows us to check if belonging to the rapidly diversifying passerine clade results in a signal. We also included country (the island/archipelago a species is endemic to) as a random effect variable for the analysis to account for country-specific effects. For the response variable i.e., the change in Ne, a Bernoulli distribution with a logit link was used because it is a binary response variable. Thus, species traits, clade identity, and change in habitat were three sets of independent predictor variables, with the change in Ne as the response variable.

We ran a series of models (n=12) to find the best-fit model (brms11, Supplementary table S10) based on leave-one-out cross-validation. We generated a total of 16,000 posterior samples by using the No-U-Turn Sampler (Hamiltonian Monte Carlo) algorithm with 4 chains with 5000 iterations each (with 1000 warm-up iterations; supplementary Information S6). Further, we estimated the posterior parameters with 95% confidence intervals to find negative or positive associations between the Ne and its predictors. In addition, we assessed the posterior convergence of the sample through the R-hat statistic (R^ = 1) where values close to 1 suggest an ideal convergence. We also reported the effective sample size for bulk and tail distributions (Supplementary table S9).

Data availability

The raw data used for this study are available on NCBI and details of the accession IDs are provided in Supplementary table S7. All codes used in this study are provided on the Zenodo URL https://doi.org/10.5281/zenodo.14603965.

Acknowledgements

BC acknowledges the support of the Trivedi School of Biosciences. KMG acknowledges the support from the DBT-Ramalingaswami Fellowship (No. BT/HRD/35/02/2006). VI is supported by DBT grant (BT/PR42830/BRB/10/1995/2021) to KMG.

Additional files

Supplementary figures

Supplementary tables

Additional information

Funding

Trivedi School of Biosciences

  • Balaji Chattopadhyay

DBT Ramalingaswami Fellowship (BT/HRD/35/02/2006)

  • Kritika M Garg

Department of Biotechnology (BT/PR42830/BRB/10/1995/2021)

  • Kritika M Garg