Introduction

Numerous animals, spanning diverse taxa, navigate within a three-dimensional world in which changes in altitude are common. Birds, fish, mammals and insects change their flight altitude, climbing height, or swimming depth from near the ground to great heights above it, or to depths below the water surface, during foraging, nesting, or in search of shelter [4, 17, 20, 23, 22, 16]. Humans, too, experience that landscapes and views of the environment change with increasing altitude during activities like hiking, climbing, or aeroplane travel. This prompts the question of whether animals that have evolved to solve three-dimensional challenges are also adapted to efficiently use bird's-eye view perspectives for navigational purposes. Over the past decades, researchers have delved into the navigational strategies employed by insects, commonly referred to as their navigational toolkit [46]. This toolkit primarily comprises a compass and an odometer, which can be synergistically employed as a path integrator, enabling the integration of distance and direction travelled. Additionally, insects use landmark guidance and exploration behaviour in their navigation. Taken together, the navigational toolkit has been studied by analysing the insects' walking or flight paths, mostly in two dimensions (e.g. [5, 43]).

Accordingly, the visual mechanisms that have been considered are primarily based on the information that can be seen from close to the ground (frog's-eye view) [50]. However, when flying insects exit their nest for the first time and are not yet familiar with their surroundings, they increase their distance to the nest in loops, arcs and spirals and vary their flight altitude during so-called learning flights [10, 29, 35, 38, 47]. They may therefore learn and use visual sceneries at different altitudes. The visual scenery may change drastically when insects change their flight altitude in cluttered environments. Flying insects use views from above the clutter, i.e. bird's-eye views, to recognise ground-level landmarks and locations for large-scale navigation [3, 9, 11, 30]. Such bird's-eye views might be relevant not only for high altitudes and navigation on a large spatial scale, but also on smaller spatial scales and at lower altitudes in the context of local homing.

This might be especially helpful for underground-nesting species, such as the bumblebee Bombus terrestris, whose nest entrance is often inconspicuously located within the undergrowth. In such cluttered environments, bumblebees need to change their flight altitude both when learning the surroundings and when homing. Bird's eye views might help guide homing behaviour in the near range of the hidden nest hole by providing a kind of overview, whereas frog's eye views might help pinpointing the nest entrance. Computational models suggest that views of the visual scenery from within cluttered environments are sufficient to explain the return journey of ants [49, 48, 2, 31] and bees [13, 14, 50] under a variety of experimental conditions. These models rely on visual memories of the environment acquired during previous journeys or learning flights at the nest site or a flower. To navigate visually, the modelled insect compares its current view with the memorised ones and steers toward the most familiar memories (e.g. by active scanning [2, 1], rotation-invariant comparison [41], or oscillating left and right turns [24, 25]). This class of models can simulate route-following behaviour within clutter, even at different flight altitudes [36, 7]. Despite all these results at the model level, how insects use views at different altitudes to navigate has never been studied experimentally. Because this is a particular challenge for bumblebees in finding their nest hole, we investigated it in a combined experimental and model analysis.

We addressed the question of whether bumblebees, Bombus terrestris, learn views at different altitudes and whether they can return home using either frog's or bird's eye views alone. We first investigated the navigation of modelled bees guided by standard models (multi-snapshot models), which are broadly used in the literature for homing [51]. We then designed behavioural experiments to challenge bees in the same environment. The analysis was based on homing experiments in cluttered laboratory environments, in which the flight altitude during the learning and homing flights was systematically constrained by manipulating the environment. We related our results to the predictions of snapshot models [52, 50], comparing the performance of bird's and frog's eye view snapshots in the same cluttered environment as in the behavioural experiments.

Results

Snapshot models perform best with bird’s eye views

To guide our hypotheses about which views bees should prioritise during their homing flight, we compared predictions of homing models based on either bird's eye views or frog's eye views. In previous studies on the mechanisms of visual homing, models could replicate the insects' homing behaviour, at least in spatially relatively simple situations [6, 2, 25]. These models assume that insects learn one or multiple views, so-called snapshots, around their goal location, such as their nest. During their return, they compare their current views with the stored snapshot(s). We therefore tested, in a cluttered meadow (clutter), the homing performance of rotational image difference models based on two input types: the brightness values of the pixels [52, 12] or contrast-weighted nearness values encoding the depth and contrast of the environment based on optic-flow information [13] (for details see Methods). The environment for the model simulations was the same as that later used in the experimental analysis (Fig. 2 A&B). It consisted of an inconspicuous goal on the floor, i.e. the nest hole, surrounded by multiple, similar-looking objects creating an artificial meadow around the nest. Thus, we investigated the model performance of homing in a cluttered environment and carried out an image comparison of the snapshots taken either above the meadow or within it.

Brightness-based homing model at four altitude layers in the environment (0.02 m, 0.15 m, 0.32 m, and 0.45 m) for the area around the artificial meadow (clutter). The rows show different parameters for the memorised snapshots (eight positions taken either outside or inside the meadow, and either above the meadow, bird's eye view, or close to the ground, frog's eye view). A&B: Examples of panoramic snapshots, in A from the bird's eye view outside the clutter and in B from the frog's eye view inside the clutter. The axes of the panoramic images refer to the azimuthal directions (x-axis) and to the elevational directions from the simulated bee's point of view (y-axis, dorsal meaning upwards, ventral downwards, equatorial towards the horizon). C: Rendered layers of the environment for comparison with the current view of the simulated bee. The layers are at heights of 0.02 m (orange), 0.15 m (blue), 0.32 m (green) and 0.45 m (red). D&E: The first column shows where the snapshots were taken in relation to the nest position (nest position in black, objects in red and snapshot positions indicated by coloured arrows). The other two columns show the comparison of the memorised snapshots for two layers of the environment (0.02 m and 0.45 m, as shown in C). The heatmaps show the image similarity between the current view at each position in the arena and the memorised snapshots taken around the nest (blue = very similar, white = very different). Additionally, white lines and arrows present the vector field from which the homing potential is derived. Red circles indicate the positions of the objects and the white dot indicates the nest position. The background colour of each column indicates the height of the current views that the snapshots are compared to. D: Memorised bird's eye view snapshots taken outside the clutter (distance to the nest = 0.55 m) and above it (height = 0.45 m) can guide the model at the highest altitude (red background) to the nest but fail to do so at the three lower altitudes. E: Memorised frog's eye view snapshots taken outside the clutter (distance to the nest = 0.55 m) and close to the floor (height = 0.02 m) can only guide the model towards the center.

A: We trained two groups of bees in a cylindrical flight arena with cylindrical objects ('artificial meadow') cluttered around the nest entrance. The first group, B+ F+, was trained with the arena ceiling at twice the height of the objects, providing space to fly above them; these bees might have memorised both a frog's eye (F+) and a bird's eye (B+) view when leaving the nest. The second group, B- F+, was trained with the arena ceiling restricted to the height of the objects, allowing the bees to use only frog's eye views. Both groups were tested for their return home in three test conditions: HighCeiling (BF), Covered (B) and LowCeiling (F). In test BF the artificial meadow was shifted from the training position to another position in the arena to exclude the use of potential external cues; the bees could use both a frog's and a bird's eye view during return. In test B a partial ceiling above, and a transparent wall around, the objects prevented the bees from entering the artificial clutter during return. In test F the ceiling was lowered to the top of the objects, allowing the bees to use only frog's eye views during return. B: 3D view of the setup with the hive and the foraging chamber.

Eight snapshots around the nest position were compared to panoramic images, based either on brightness values or on the contrast-weighted nearness of the rendered environment, at four altitudes. These altitudes were chosen to provide information related to ecologically relevant heights for the bees during their return home (Fig. 5). They were either close to the ground (0.02 m), resembling walking or flying close to the floor, at half the height of the objects (0.15 m), at the maximum height of the objects (0.32 m), or above the objects (0.45 m). We hypothesised that the simulated bee could memorise panoramic snapshots either at high altitude (bird's eye views) or close to the floor (frog's eye views). Further, these snapshots were taken either close to the nest (0.1 m) or just outside the clutter (0.55 m) (Fig. 1, positions of snapshots). Previous behavioural studies showed that views close to the goal are acquired during learning walks or flights [11, 42], and modelling studies demonstrated their successful use during homing [12, 14, 13]. Views just at the border of the clutter capture the change of the environment from within the clutter, surrounded by many objects, to the clearance outside it. Such abrupt changes of the environment have been shown to trigger reorientation by convoluted flight patterns [8]. The comparison of simulated homing with memorised snapshots taken either from the bird's eye or from the frog's eye view revealed the best performance, i.e. the highest image similarity and homing vectors pointing towards the nest, with bird's eye view snapshots taken outside the clutter (brightness model: Fig. 1 and Fig. 9; contrast-weighted nearness model: Fig. 13). Bird's eye view snapshots led the simulated bee very close to the nest (brightness model: Fig. 1 D&E and Fig. 9 - 10; contrast-weighted nearness model: Fig. 13 - 14), while frog's eye view snapshots inside the clutter could only lead into the clutter but not to the nest (brightness model: Fig. 1 F&G and Fig. 11 - 12; contrast-weighted nearness model: Fig. 15 and 16). Snapshots taken above the clutter showed the best homing performance when compared to current views above the clutter. Based on these results we made qualitative predictions about the bumblebees' homing behaviour, assuming they are guided by homing mechanisms akin to the one tested in simulation. First, bumblebees should return to their nest by flying above the clutter. Second, bees that could not acquire views above the clutter should not be able to locate their nest entrance.

Examples of return flights and bees' search for their nest in a cluttered environment (N = 26). A&B: Exemplary flight trajectories in 3D (left column) and a top view in 2D (right column) from the group B+ F+ in the BF condition (A), and from the group B- F+ in the B condition (B). The colour indicates the time, blue the start and red the end of the flight. The objects are depicted by red cylinders in the 3D plot and as red circles in the 2D plot. The black dots in the 2D plots show the visual nest position within the clutter. A: The bee searches for the nest within the clutter at a low flight altitude. B: The bee mainly tries to enter the covered clutter from the side. C: Spatial search distribution represented by hexagonal binning of the percentage of visits of all bees (relative to each bee's total flight time) in the BF condition from group B+ F+. Orange circles indicate the true and the visual nest positions. Black circles indicate the object positions of the clutter. D: Search percentage for the two groups B+ F+ (filled boxes, N = 26) and B- F+ (hatched boxes, N = 26) in the tests BF, F and B, relative to the total flight time. The search percentage at the true nest is given in blue and at the visual nest in green. For all tested conditions and both groups, the bees searched more at the visual nest within the clutter than at the true nest location (refer to SI Table 1 for statistical tests).

Entry points of the bees (N = 26) into the clutter, from the side (A, circular histogram in grey) and from the top (B, scatter plot in blue). The direction of the direct path from the arena entrance to the nest is given by the green triangle. The kernel density estimate (KDE) of the entries from the side of the clutter is shown as a black, dashed line in A. The radial axis represents the normalised magnitude of the KDE.

Probability density distributions of the flight altitude and the search distributions for the B condition of the groups B- F+ (A&B) and B+ F+ (C&D). A flight altitude of 0 is at the floor, 300 mm is at the height of the objects and 600 mm at the ceiling. A: The group B- F+, constrained to a low ceiling during training, shows two peaks, one at a low and one at a high altitude. C: The group B+ F+ shows a broader distribution, with three peaks at half the height of the objects, just above the height of the objects, and close below the ceiling. B&D: The search distributions reveal that the bees tried to enter the clutter between the objects.

Schematic of a multi-step process with visually triggered reloading of memorised home vectors. A bee entering the arena (grey circle) could have experienced the normal path integration vector (black arrow) pointing towards the true nest location at the training position in the clutter (black dot within the light red circle). However, the visual scene changed drastically in the test. Hence, a vector memory might point coarsely to the clutter (light green arrow), triggered by the prominent visual cue of the clutter shifted to the test position (dark red circle). Since this vector would not point precisely towards the nest, the bee could search along the border of the clutter (curved, yellow arrow) for a previously experienced entry position. This position could reload a refined clutter vector pointing to the more precise nest position within the clutter (blue arrow).

Examples of the memorised snapshots based on brightness values for inside and outside the clutter as well as from the bird’s and frog’s eye view perspective. The axes of the panoramic images refer to the azimuthal directions (x-axis) and to the elevational directions from the simulated bee’s point of view (y-axis, dorsal meaning upwards, ventral downwards, equatorial towards the horizon).

Examples of the memorised snapshots based on contrast-weighted nearness for inside and outside the clutter as well as from the bird’s and frog’s eye view perspective. The axes of the panoramic images refer to the azimuthal directions (x-axis) and to the elevational directions from the simulated bee’s point of view (y-axis, dorsal meaning upwards, ventral downwards, equatorial towards the horizon).

Brightness-based homing model at four altitudes (0.02 m, 0.15 m, 0.32 m, and 0.45 m) with bird's eye view snapshots taken outside the clutter (distance = 0.55 m, height = 0.45 m). A: We applied the model to a list of images, one of them (memory 0) being the memory inside the model. The image difference function is minimal for the memorised image at a null rotation, as expected. If the other images are not too far from the nest, local minima appear for each of these images, shifted according to the nest bearing. B-D: Heatmaps of the full environment (B) and of only the cluttered area (C, as shown in D) of the image similarity. The colour indicates the image difference between the view at each position in the arena and the snapshots taken around the nest. The model leads the simulated bees with bird's eye view snapshots taken outside the clutter to the nest position when the images are compared to images at an altitude of 0.45 m.

Brightness-based homing model at four altitudes (0.02 m, 0.15 m, 0.32 m, and 0.45 m) with bird's eye view snapshots taken inside the clutter (distance = 0.1 m, height = 0.45 m). A: We applied the model to a list of images, one of them (memory 0) being the memory inside the model. The image difference function is minimal for the memorised image at a null rotation, as expected. If the other images are not too far from the nest, local minima appear for each of these images, shifted according to the nest bearing. B-D: Heatmaps of the full environment (B) and of only the cluttered area (C, as shown in D) of the image similarity. The colour indicates the image difference between the view at each position in the arena and the snapshots taken around the nest. The model leads the simulated bees, with bird's eye view snapshots taken inside the clutter, only coarsely to the clutter but not to the nest when the images are compared to images at an altitude of 0.45 m.

Brightness-based homing model at four altitudes (0.02 m, 0.15 m, 0.32 m, and 0.45 m) with frog's eye view snapshots taken inside the clutter (distance = 0.1 m, height = 0.02 m). A: We applied the model to a list of images, one of them (memory 0) being the memory inside the model. The image difference function is minimal for the memorised image at a null rotation, as expected. If the other images are not too far from the nest, local minima appear for each of these images, shifted according to the nest bearing. B-D: Heatmaps of the full environment (B) and of only the cluttered area (C, as shown in D) of the image similarity. The colour indicates the image difference between the view at each position in the arena and the snapshots taken around the nest. The model leads the simulated bees, with frog's eye view snapshots taken inside the clutter, only to the center of the clutter (altitude of 0.32 m) or shifted away from the nest (altitude of 0.45 m), but not to the nest.

Brightness-based homing model at four altitudes (0.02 m, 0.15 m, 0.32 m, and 0.45 m) with frog's eye view snapshots taken outside the clutter (distance = 0.55 m, height = 0.02 m). A: We applied the model to a list of images, one of them (memory 0) being the memory inside the model. The image difference function is minimal for the memorised image at a null rotation, as expected. If the other images are not too far from the nest, local minima appear for each of these images, shifted according to the nest bearing. B-D: Heatmaps of the full environment (B) and of only the cluttered area (C, as shown in D) of the image similarity. The colour indicates the image difference between the view at each position in the arena and the snapshots taken around the nest. The model leads the simulated bees, with frog's eye view snapshots taken outside the clutter, only to the center of the clutter (altitude of 0.32 m) but not to the nest.

Contrast-weighted nearness based homing model at four altitudes (0.02 m, 0.15 m, 0.32 m, and 0.45 m) with bird's eye view snapshots taken outside the clutter (distance = 0.55 m, height = 0.45 m). A: We applied the model to a list of images, one of them (memory 0) being the memory inside the model. The image similarity, based on [13], is maximal for the memorised image at a null rotation, as expected. If the other images are not too far from the nest, local maxima appear for each of these images, shifted according to the nest bearing. B-D: Heatmaps of the full environment (B) and of only the cluttered area (C, as shown in D) of the image similarity. The colour indicates the image similarity between the view at each position in the arena and the snapshots taken around the nest. The model leads the simulated bees, with bird's eye view snapshots taken outside the clutter, only to the center of the clutter (altitude of 0.32 m) or shifted away from the nest (altitude of 0.45 m), but not to the nest.

Contrast-weighted nearness based homing model at four altitudes (0.02 m, 0.15 m, 0.32 m, and 0.45 m) with bird's eye view snapshots taken inside the clutter (distance = 0.1 m, height = 0.45 m). A: We applied the model to a list of images, one of them (memory 0) being the memory inside the model. The image similarity, based on [13], is maximal for the memorised image at a null rotation, as expected. If the other images are not too far from the nest, local maxima appear for each of these images, shifted according to the nest bearing. B-D: Heatmaps of the full environment (B) and of only the cluttered area (C, as shown in D) of the image similarity. The colour indicates the image similarity between the view at each position in the arena and the snapshots taken around the nest. The model leads the simulated bees, with bird's eye view snapshots taken inside the clutter, only to the center of the clutter (altitude of 0.45 m) but not to the nest.

Contrast-weighted nearness based homing model at four altitudes (0.02 m, 0.15 m, 0.32 m, and 0.45 m) with frog's eye view snapshots taken inside the clutter (distance = 0.1 m, height = 0.02 m). A: We applied the model to a list of images, one of them (memory 0) being the memory inside the model. The image similarity, based on [13], is maximal for the memorised image at a null rotation, as expected. If the other images are not too far from the nest, local maxima appear for each of these images, shifted according to the nest bearing. B-D: Heatmaps of the full environment (B) and of only the cluttered area (C, as shown in D) of the image similarity. The colour indicates the image similarity between the view at each position in the arena and the snapshots taken around the nest. The model leads the simulated bees, with frog's eye view snapshots taken inside the clutter, only to the center of the clutter (altitude of 0.45 m) but not to the nest.

Contrast-weighted nearness based homing model at four altitudes (0.02 m, 0.15 m, 0.32 m, and 0.45 m) with frog's eye view snapshots taken outside the clutter (distance = 0.55 m, height = 0.02 m). A: We applied the model to a list of images, one of them (memory 0) being the memory inside the model. The image similarity, based on [13], is maximal for the memorised image at a null rotation, as expected. If the other images are not too far from the nest, local maxima appear for each of these images, shifted according to the nest bearing. B-D: Heatmaps of the full environment (B) and of only the cluttered area (C, as shown in D) of the image similarity. The colour indicates the image similarity between the view at each position in the arena and the snapshots taken around the nest. The model leads the simulated bees, with frog's eye view snapshots taken outside the clutter, only to the center of the clutter (altitude of 0.32 m) or slightly shifted away from the center (altitude of 0.45 m), but not to the nest.

Frog’s eye views are sufficient for bees’ homing in clutter

Having shown that snapshot models perform best with bird's eye views, we tested whether homing bees employ this strategy accordingly. Based on the model results, we hypothesised that bees should show the best homing performance when they can learn bird's eye views, and perform worse when having access only to frog's eye views during learning. We first needed to assess their ability to return using only the visual cues provided by the clutter. Two groups of bees were trained to find their way in a cylindrical flight arena (with a diameter of 1.5 m and a height of 0.8 m) from the nest entrance to the foraging entrance and back (Fig. 2). The group B+ F+ was trained with unrestricted access to bird's eye views (B+) above the clutter and frog's eye views (F+) within the clutter. The group B- F+ had access only to frog's eye views within the clutter during the training period. To test whether the bees associated the clutter with their nest location and did not use unintended visual cues, e.g. outside above the flight arena (even though we tried to avoid such cues), the clutter was shifted in the tests to another position in the arena, creating two possible nest locations: the true nest location in an external reference system, as during the training condition, and a visual nest location relative to the clutter (see Materials and Methods). Further, the ceiling of the arena was either placed at the height of the clutter, allowing only frog's eye views (F), or high above the clutter (BF), allowing the bees to get both frog's and bird's eye views.

In the tests BF and F, in which the bees had physical access to the clutter, they searched for their nest within the clutter. This search was in the vicinity of the visual nest entrance, though it was not always precisely centred on it; instead, it showed some spatial spread around this location. A comparison of the time spent at the two possible nest entrance locations showed that the bees were able to return to the visual nest in the clutter even when the clutter was shifted relative to an external reference frame (Fig. 3). Furthermore, we obtained a higher search percentage at the visual nest in the clutter for the BF and F tests, for both groups trained with either B+ F+ or B- F+ (Fig. 3 and Fig. 18; statistical results in SI Table 1). Most time was spent within the arena close to the visual nest location. Still, the spatial distribution of search locations also shows high search percentages in two to three other areas near the nest location (Fig. 3). These other search areas can be explained by locations between the objects that look similar to the visual nest location, as the simulations depicted a rather broad area with a high image similarity (Fig. 1; Fig. 13 - 14 for the contrast-weighted nearness model). Both groups, B+ F+ and B- F+, showed similar search distributions during the F and BF tests, indicating that the ceiling height did not change the bees' search distributions (t-test results with Bonferroni correction: SI Table 1). In conclusion, when the bees had physical access to the clutter during the test, i.e. when they could fly between the objects in the BF and F tests, they were able to find the nest location within the clutter using only frog's eye views.

Four exemplary flight trajectories of the first outbound flights of bees in 3D (left subplot) and a top view in 2D (right subplot). The colour indicates the time, blue the start and red the end of the flight. The objects are depicted by red cylinders in the 3D plot and as red circles in the 2D plot. The black dots in the 2D plots show the nest position within the clutter.

Search distributions for the group B+ F+ (left column) in the condition F (condition BF is shown in the main results in Fig. 2D) and for group B- F+ (right column) in the tests BF and F (N = 26 for each plot). The bees of group B+ F+ in the condition F searched for the nest inside the clutter and spent more time at the visual nest (in the clutter) than at the true nest (as during training). The bees of group B- F+ in the conditions F and BF searched for the nest within the clutter and spent more time at the visual nest than at the true nest.

Statistical results of t-tests

Bird’s eye views are not sufficient for bees to return

Based on the results of the simulations (Fig. 1), we hypothesised that bees could pinpoint their nest position using only bird's eye views. In addition, we observed that during the first outbound flights, bees quickly increased their altitude to fly above the objects surrounding the nest entrance (Fig. 17). Therefore, we restricted the bees' access to the clutter by a transparent cover around it, so that they could not access the clutter from the side but only from above (test B, Fig. 2 A). In this test, the bees tried to enter the clutter sideways between the objects, as the search distributions show (Fig. 5 B&D). The bees did not search at the nest location above the clutter (Fig. 5 B&D), which they could have done had they learned to enter the clutter from above. We can therefore reject the hypothesis that bees use bird's eye views to return home in a cluttered environment. Rather, frog's eye views seem to be sufficient for homing. Nevertheless, an interesting aspect with respect to finding the nest hole is revealed by the entry positions into the clutter. The entry points from all three tests (B, BF and F, for both groups) from the top and the side of the clutter supported the finding that most bees tried to cross the boundary of the clutter from the side, while only very few tried to cross it from the top (Fig. 4). The entry points from the side concentrated mainly on three locations, indicating that the bees learned similar entrance points to the clutter (Fig. 4 A). Taken together, these results show that the bees learned certain positions at which they entered the clutter, indicating that these positions might be used to store snapshots for the return.

Flight altitude changes with training views

We investigated the influence of the views potentially available during training on the bees' altitude during homing by comparing the flight altitudes employed. The flight altitude probability density distributions for the F and BF tests showed, for both groups, that the bees flew mostly very low while searching for the nest entrance (Fig. 19). The distributions, however, looked very different for the test B. The flight altitude distribution of the group B- F+ shows two peaks, one around 130 mm and one around 550 mm (Fig. 5 A). These peaks indicate that the bees either flew just below half the object height, probably trying to enter the covered clutter from the side, or flew close to the ceiling of the arena. The flights close to the ceiling might reflect exploratory behaviour in a section of the flight arena the bees could not experience before, as they were trained with a ceiling constrained to the height of the objects. The behaviour of flying at a low altitude, trying to enter the clutter from the side, fits the peaks of search time around the covered clutter (Fig. 2 A&B). The group B+ F+ showed a broader probability density distribution with three shallower peaks: one at an altitude around 170 mm, corresponding to half the height of the objects, a second around 370 mm, just above the objects, and a third just below the ceiling, where the bees might have explored the boundaries of the arena. This indicates that the bees trained with B+ F+ seemed to have learned to fly above the objects and used this information when the cover blocked the entrance to the clutter. The group B- F+, constrained during training to fly only up to the top of the objects but not above, seemed unable to search for another way to enter the clutter from above. Although the bees exposed to B+ F+ during training did not search above the nest position, they seemed to have learned the height of the objects, flying more at that height, which the altitude-constrained group B- F+ could not learn. Overall, there is a clear difference in the flight altitude distributions between the two groups and between the tests. The group trained with B+ F+ flew more at the height of the objects when the clutter was covered, while the group trained with B- F+ seemed to just explore the newly available space at a very high altitude, flying little at the height of the objects. This indicates that the flight altitude during learning influences the altitude at which the bees fly during their return flights, and that this could affect which aspects of the environment are learned.

Flight altitude distributions for the tests BF and F for the groups B+ F+ and B- F+ (N = 26 for each plot).

Exemplary flight trajectories in 3D (left column) and a top view in 2D (right column) from the group B+ F+ in the B test. The colour indicates the time, blue the start and red the end of the flight. The objects are depicted by red cylinders in the 3D plot and as red circles in the 2D plot. The black dots in the 2D plot show the nest position within the clutter. The bee tries to enter the covered clutter from the side but eventually increases its altitude and flies above the clutter. However, the bee does not search for the visual nest above the covered clutter.

Discussion

We investigated how bumblebees find their way home in a cluttered environment using views at different altitudes. The results of simulations of snapshot-based homing models [51] suggested the best homing performance with bird's eye views from above the clutter. Surprisingly, bees performed equally well whether they could experience bird's eye views during training or only frog's eye views, i.e. views taken close to the ground. Even bees trained with both frog's and bird's eye views predominantly used frog's eye views for homing. Thus, instead of relying solely on snapshots, we suggest a multi-step process for homing in clutter, supported by the bees' behaviour, that draws on a variety of tools from the navigational toolkit.

Working range of snapshot models is limited inside clutter

Snapshot matching is widely assumed to be one of the most prominent navigational strategies of central-place foraging insects [50]. In the simulations of this study, we found that bird's eye view snapshots led to the best homing performance, while frog's eye view snapshots could only find the clutter but not the nest position within it. Additionally, the model steered to the nest only when the current views were above the objects (altitudes 0.32 m and 0.45 m). Previous studies showed that snapshots at higher altitudes result in larger catchment areas than snapshots close to the ground [32]. Furthermore, in a particular form of snapshot model, i.e. skyline snapshots, occluding objects lead to smaller catchment areas [33]. Our model simulations support both conclusions, as the bird's eye views above the clutter were less occluded. Behavioural studies also confirm the advantage of higher-altitude snapshots, as honeybees and bumblebees were found to use ground features of the environment while navigating on a large spatial scale (i.e. hundreds of metres) [30, 3]. A modelling and robotics approach by Stankiewicz and Webb [40] also confirmed these findings. Overall, our model analysis showed that snapshot models cannot find home with views from within a cluttered environment, but only with views from above it.

Bees’ homing performance in a cluttered environment

To our great surprise, we found that the behavioural performance of bumblebees differed quite substantially from our snapshot-based model predictions when we tested bees trained either with bird's and frog's eye views or with frog's eye views only. Both groups of bees performed equally well in finding the nest position between the cluttered objects. However, bees that could experience heights above the objects during training also used bird's eye views during homing, flying at the altitude of the objects. This finding indicates that they learned to fly above the objects and to find other possibilities to enter the clutter. From a flight control perspective, changing between flying up and down could be energetically more demanding than changing between left and right. Large-scale studies (e.g. [30]) report that bees fly more within a horizontal plane than up and down. Moreover, from a flight control perspective, it might be difficult for the bees to approach the clutter from above and then descend to the nest between the objects. This hypothesis is supported by a study showing that bees prefer to avoid flying above obstacles at short distances when they have the choice to fly around them [44].

A multi-step process as a hypothetical homing mechanism in clutter

Since snapshot models could not explain the bees' homing behaviour in clutter, we propose an alternative explanatory hypothesis for the bees' behaviour, based on reloading a path integration vector as described by Webb [45]. This alternative suggests that the bee can reload a memorised home vector, triggered by external stimuli such as visual cues, to navigate back home. In our study, a standard path integration vector of a bee entering the arena to return home (as illustrated by the bee's position in Fig. 6) would point toward the true nest hole at the training position in the flight arena (Fig. 6, black arrow). Following the conceptual framework proposed by Webb [45], a vector memory could be triggered by the presence of the cluttered objects around the nest entrance as a visual cue (Fig. 6, light green arrow). In the training situation, both vectors were congruent, pointing in the same direction. However, they would point in different directions in the test situation, as the clutter, and thus the visual nest hole, was displaced within the flight arena. Given that the cluttered objects were the most salient cues within the cylindrical environment, it is plausible that the bees relied on them as robust cues for homing. Based on the association between the clutter and the nest, this vector could be used by bees that had learned the shortest path between the entrance to the flight arena and the nest to enter the clutter. The behavioural findings support the use of the shortest path, with a clear peak in the entrance positions around the clutter in Fig. 4. However, we also found two other entry positions that many bees used, which could be explained by locations the bees memorised for entering the clutter. These memories might be triggered by visual cues, e.g. a memorised snapshot at the border of the clutter (Fig. 6, yellow arrow), and could be used to refine the coarse clutter vector memory. As the bee traverses the clutter's border, it eventually reaches a location where it learned to enter the clutter during training. This hypothesis is supported by the observed similarity in the bees' entry points into the clutter (Fig. 4). Once within the clutter, the bee might use the distance between the nest and the wall as a reference, coupled with the direction memorised at the entry point, to locate the nest (Fig. 6, blue arrow). This hypothetical multi-step process underlines the complexity of the bees' navigation strategy in cluttered environments, involving the integration of various environmental cues and memories to achieve precise homing.
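To make the hypothesised sequence concrete, the following minimal sketch casts it as a three-phase state machine. This is an illustration of the hypothesis only; the phase names and trigger conditions (`sees_clutter`, `at_learned_entry`) are our assumptions, not measured rules.

```python
from enum import Enum, auto

class Phase(Enum):
    GLOBAL_VECTOR = auto()   # follow the vector memory pointing coarsely to the clutter
    BORDER_SEARCH = auto()   # search along the clutter border for a learned entry point
    LOCAL_VECTOR = auto()    # follow the refined vector from the entry point to the nest

def homing_step(phase, sees_clutter, at_learned_entry, at_nest):
    """Hedged sketch of the hypothesised multi-step homing process (Fig. 6)."""
    if at_nest:
        return None                      # homing finished
    if phase is Phase.GLOBAL_VECTOR and sees_clutter:
        return Phase.BORDER_SEARCH       # clutter reached: switch to border search
    if phase is Phase.BORDER_SEARCH and at_learned_entry:
        return Phase.LOCAL_VECTOR        # familiar view reloads the refined vector
    return phase
```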

Outlook

Our study underscores the limitations inherent in snapshot models, revealing their inability to provide precise positional estimates within densely cluttered environments, especially when compared to the navigational abilities of bees using frog's-eye views. Notably, bees trained with bird's-eye views demonstrated adaptability in spatially constrained situations, although this strategy was not employed explicitly for searching for the nest above the clutter. Future research should extend these findings to larger scales and explore the development of 3D snapshot models that account for altitude variations. Furthermore, these insights extend beyond bees and may have implications for other species, such as birds, with altitude fluctuations during foraging or nesting [4]. The use of bird's-eye views in densely cluttered forests, akin to our findings with bees, prompts consideration of similar behaviours in other flying animals, but also in walking animals, such as ants navigating varied vegetation heights [17]. Switching views might considerably affect the ability of animals to solve spatial problems, as shown in humans [21], and understanding how best to combine this information for navigation will no doubt benefit the development of autonomously flying robots, for instance by helping them navigate cluttered environments.

Materials and Methods

View-based homing models

We rendered an environment in graphics software (Blender 2.82a) with 40 randomly placed red cylinders, creating an artificial meadow (clutter) surrounding the nest entrance, placed on a floor with a black and white pattern with a 1/f spatial frequency distribution (a distribution observed in nature [39]). A panoramic image was taken on a 1 cm spaced grid along the x-axis of the arena. The arena was 1.5 m in diameter, as in the behavioural experiments. For the rendering we focused only on relevant cues, i.e. the objects and the patterned floor. We did not render the nest hole, which was visually covered during the behavioural experiments. The arena wall was also not rendered, as it did not contain any information for bees other than for flight control, and simulations with a wall led to similar results (data not shown). The grid was sampled at four altitudes: 2 cm for walking or hovering bees, 15 cm for bees flying at half the object height, 32 cm for bees flying just above the objects, and 45 cm for bees flying very high, to compare how the calculated similarity changes with altitude. In the next step, eight images around the nest were taken and stored as memorised snapshots, to which the current views were compared (examples in Fig. 7-8; see the placement sketch below).
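To illustrate how such a random yet flyable object constellation can be generated, the following minimal Python sketch uses rejection sampling. The minimum spacing, the seed, and the function name are assumptions for illustration; the actual configuration followed the object densities and distances of Gonsek et al. [18].

```python
import numpy as np

def place_cylinders(n=40, meadow_radius=0.4, min_dist=0.06, seed=1):
    """Rejection-sample n cylinder centres (m) inside a circular meadow,
    keeping a minimum centre-to-centre spacing so a bee can fly through.
    min_dist is an assumed value for illustration."""
    rng = np.random.default_rng(seed)
    positions = []
    while len(positions) < n:
        # uniform sample in the disc around the nest at the origin
        r = meadow_radius * np.sqrt(rng.uniform())
        theta = rng.uniform(0, 2 * np.pi)
        p = np.array([r * np.cos(theta), r * np.sin(theta)])
        if all(np.linalg.norm(p - q) >= min_dist for q in positions):
            positions.append(p)
    return np.array(positions)  # shape (n, 2): x/y centres of the cylinders

clutter = place_cylinders()  # 40 cylinders in a 0.8 m diameter meadow
```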

We used the method presented in Doussot et al. [14] to calculate the image differences according to the brightness and the contrast-weighted nearness methods, each with eight snapshots. The image difference was calculated either by the brightness method, relying on the brightness value of each pixel in one panoramic image, or by the contrast-weighted nearness method (see SI Methods), relying on the contrast, calculated as the Michelson contrast (the ratio of the luminance amplitude (Imax - Imin) to the luminance background (Imax + Imin)), weighted by the inverse of the distance [13]. We wanted to predict the simulated bee's probable endpoint by analysing the vector fields of the different homing models. Convergence points in these vector fields were determined using the Helmholtz-Hodge decomposition, focusing on the curl-free component, represented by the potential ϕ [14]. After applying the Helmholtz-Hodge decomposition to each vector field, we scaled the resulting potential between 0 and 1 for all homing models. With these potentials, we were able to plot heatmaps estimating the areas in the arena towards which view-based agents (simulated bees) would most likely steer. These heatmaps were compared to the bees' search distributions in the arena.
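As an illustration of the convergence-point analysis, the sketch below extracts the curl-free potential of a vector field by solving the Poisson equation spectrally. It assumes periodic boundary conditions, which is a simplification relative to the decomposition used in [14]; all names are ours.

```python
import numpy as np

def curl_free_potential(vx, vy, dx=0.01):
    """Helmholtz-Hodge sketch: solve laplacian(phi) = div(V) for the
    curl-free potential phi of the homing vector field V = (vx, vy),
    assuming periodic boundaries (FFT-based Poisson solve)."""
    ny, nx = vx.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    KX, KY = np.meshgrid(kx, ky)
    div_hat = 1j * KX * np.fft.fft2(vx) + 1j * KY * np.fft.fft2(vy)
    k2 = KX**2 + KY**2
    k2[0, 0] = 1.0                    # avoid division by zero (mean of phi is arbitrary)
    phi_hat = -div_hat / k2           # FFT of grad(phi) is i*k*phi_hat
    phi_hat[0, 0] = 0.0
    phi = np.real(np.fft.ifft2(phi_hat))
    return (phi - phi.min()) / (phi.max() - phi.min())  # scaled to [0, 1]
```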

Animal handling

We used four B. terrestris hives provided by Koppert B.V., The Netherlands, which were ordered sequentially to test one colony at a time. The bee hives arrived in small boxes and were transferred under red light (invisible to bees [15]) to acrylic nest boxes of 30 × 30 × 30 cm. The nest box was placed on the floor and connected to an experimental arena. We covered the nest box with a black cloth to mimic the natural lighting of the underground habitats of B. terrestris [19]. The bee colonies were provided with pollen balls ad libitum directly within the nest boxes. The pollen balls were made of 50 mL ground commercial pollen collected by honeybees (W. Seip, Germany) and 10 mL water. The bees reached a foraging chamber via the experimental arena containing gravity feeders. These are small bottles with a plate at the bottom so that the bees can access the sugar solution through small slots in the plate. The feeders were filled with a sweet aqueous solution (30% saccharose, 70% water in volume). Lighting from above was provided in a 12 h/12 h cycle, and the temperature was kept constant at 20 °C. Throughout the experiments, foraging bees were individually marked using coloured, numbered plastic tags glued with melted resin to their thorax.

Experimental arena

The experimental arena was a cylinder with a diameter of 1.5 m and a height of 80 cm, as in [14]. A red and white pattern, perceived as black and white by bumblebees [15], with a 1/f spatial frequency distribution (a distribution observed in nature [39]) covered the wall and floor of the cylindrical arena; this provided the bees with enough contrast to allow them to use optic flow. Red light came from 18 neon tubes (36 W Osram + Triconic) filtered by a red acrylic plate (Antiflex ac red 1600 ttv). Bees did not see these lights and perceived the red-white pattern as black and white. An adjustable transparent ceiling prevented the bees from exiting the arena. It allowed lighting from 8 neon tubes (52 W Osram + Triconic) and 8 LEDs (5 W GreenLED), as in [14], and recording from five high-speed cameras with different viewing angles. A foraging bee exiting the hive crossed three small acrylic boxes (inner dimensions 8 × 8 × 8 cm) with closable exits, which were used to select the bee to be tested and, during the experiments, to block bees from entering the arena. The bee then walked through a plastic tube 2.5 cm in diameter and entered the cylindrical arena through the nest entrance, which was surrounded by visual objects. The foraging chamber was reached through a hole in the arena wall at a height of 28 cm above the floor. After foraging, the bees re-entered the flight arena through the same hole and, once they found the nest entrance within the arena, walked back through the plastic tube and the acrylic boxes to the hive. The objects surrounding the nest entrance consisted of 40 randomly placed red cylinders (2 cm in diameter and 30 cm in height), creating an artificial meadow. The artificial meadow was 80 cm in diameter to allow its displacement to a different location in the flight arena. We considered the artificial meadow dense enough to pose a challenge in finding the nest within the cylinder constellation, as shown in the snapshot-model comparison. For example, if only three cylinders were used, the bee might only search there, because these would be the only conspicuous landmarks. Additionally, the configuration had to be sufficiently sparse for the bee to fly through [37]. We used the object density and object distances of Gonsek et al. [18] as a reference to find a randomly distributed object configuration. Red lighting from below was used only during recordings. The cylindrical arena had a door, allowing the experimenter to change the objects within the arena.

Experimental design

We tested two groups of bees, each in three tests, trained with the objects of the artificial meadow surrounding the nest entrance. The two groups differed in the training condition. For the group B+ F+, the flight altitude was unconstrained, and the bees could experience bird's eye views above the objects. The second group, B- F+, was restricted during training to a maximum flight altitude equal to the height of the objects, and thus to frog's eye views.

We tested 26 bees per group (4 colonies were used, 2 per group; from each colony 13 bees are included in the analysis, resulting in a total of 52 individuals). In both groups, the foraging bees travelled between their nest and the foraging chamber. The return flight of each bee was recorded in all tests of a given experiment. Between individual tests, the bees were allowed to forage ad libitum. An artificial meadow surrounded the nest, and the area around the nest positions was cleaned with 70% ethanol between tests to avoid chemical markings.

To test the behaviour of the bees, we locked up to six individually marked bees in the foraging chamber at a time. Each bee participated either in group B+ F+ (high ceiling during training, making frog's and bird's eye views available) or B- F+ (low ceiling during training, making only frog's eye views available) and was tested once in each test of the respective experiment. The order of the tests was pseudo-random, except that in the group B- F+ the BF test was always run last, so that the bees would not experience bird's eye views before the B test. Before the tests, the cylindrical arena was emptied of bees, the spatial arrangement of objects was shifted, and the nest entrance was closed. One bee at a time was allowed to search for its home for three minutes after take-off. After this time, the ceiling height and the artificial meadow were returned to the training configuration, and the nest entrance was opened. The bees then had up to two minutes to take off; otherwise, they were captured and released close to the nest. Between tests, the bees could fly ad libitum between the nest and the foraging chamber under the training conditions.

For the tests, the artificial meadow was placed at a different location than during the training condition; thus, it no longer surrounded the true nest entrance leading to the hive, but instead indicated the location of a visual nest entrance. The true nest entrance and the visual nest entrance were covered by a piece of paper with the same texture as the arena floor, so that they were not discernible by the bees. For the test BF, no other constraint was added, to test whether the bees associated their nest entrance with the artificial meadow. The B test consisted of a transparent wall and ceiling on top of the objects, which prevented the bees from entering the meadow; to return to the nest location in the meadow, the bees could only pinpoint the position from above, using bird's eye views. In the F test, the flight altitude was constrained to the height of the artificial meadow, so that the bees could no longer experience bird's eye views from above the meadow and thus had to use frog's eye views to return to their nest.

Flight trajectories

Bee trajectories were recorded at 62.5 Hz (16 ms between two consecutive frames) with five synchronised Basler cameras (Basler acA 2040um-NIR) with different viewing angles (similar to [14, 34]). One camera was placed above the middle of the arena to track the bumblebees' movements in the horizontal plane, and the other four cameras were distributed around the arena to record the position of the bees from different angles, minimising the error when triangulating the bee's 3D position. The recording started before the bees entered the setup, and the first 60 frames were used to calculate a bee-less image of the arena (background image). During the rest of the recording, only crops, i.e. image sections (40 × 40 pixels) containing large differences between the background image and the current image (i.e. potentially containing the bee), were saved to the hard drive, together with the position of the crop in the image. The recording scripts were written in C++. The image crops were analysed by a custom-written neural network to classify each crop as containing a bee or not. When non-biological speeds (above 4 m/s [19]) or implausible positions (outside the arena) were observed, the crops neighbouring these time points were manually reviewed.
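A minimal Python/OpenCV sketch of this crop-extraction step is given below (the original recording pipeline was written in C++). The background estimator (a median), the difference threshold, the assumption of grayscale frames, and all names are ours, for illustration only.

```python
import cv2
import numpy as np

def extract_bee_crops(frames, n_bg=60, crop=40, thresh=30):
    """Build a bee-less background from the first n_bg grayscale frames, then
    save a 40x40 crop around the largest foreground blob of each later frame."""
    bg = np.median(np.stack(frames[:n_bg]).astype(np.float32), axis=0)
    crops = []
    for i, frame in enumerate(frames[n_bg:], start=n_bg):
        diff = cv2.absdiff(frame.astype(np.float32), bg)
        mask = (diff > thresh).astype(np.uint8)
        n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
        if n > 1:  # label 0 is the background
            j = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
            cx, cy = centroids[j].astype(int)
            y0 = max(cy - crop // 2, 0)
            x0 = max(cx - crop // 2, 0)
            crops.append((i, (x0, y0), frame[y0:y0 + crop, x0:x0 + crop]))
    return crops  # frame index, crop position, and the image section
```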

The trajectories were analysed in Python (version 3.8.17), notably with OpenCV. A detailed list of the packages used is published in the data publication. The time course of the 3D positions of the bees within the arena is shown for a selection of flights (Fig. 2 and S13). For each of the tests, a distribution of presence in the flight arena was computed using hexagonal binning of the 2D positions, to show the search areas of the bees qualitatively (see the sketch below). Bees that collected food in the foraging chamber returned home and searched for their nest entrance. Even when objects are displaced to a novel location [26, 27, 38, 28], replaced by smaller, differently coloured, or camouflaged objects [26, 27, 13], or moved to create visual conflict between several visual features [14], bees search for their nest entrance associated with such objects. Therefore, we assumed that bees entering the arena after visiting the foraging chamber would search for their nest location. The returning bees could spend an equal amount of time at any location in the arena, or concentrate their search around the physical position of their nest or around the visual objects associated with the nest entrance [14]. In our experiments, the nest entrance was surrounded by cylindrical objects. During the tests, when the objects were shifted to a novel location, a bee guided by the objects might have searched for its nest around the objects. In contrast, bees not using the objects to navigate might have searched for their nest at the original location. We therefore quantified the time spent around two locations: at the true nest and at the nest according to the objects ('visual nest'). Additionally, the positions at which the bees first crossed the boundary of the cluttered objects, at the side or at the top of the clutter, were visualised as a circular histogram (entries from the side) and a scatter plot (entries from the top). These were used to describe where the bees entered the cluttered area. As the clutter was shifted to another position in the arena, we can exclude the use of compass, odour or magnetic cues.
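The hexagonal binning itself can be reproduced with matplotlib; the sketch below normalises the bin counts to a percentage, as in the presence maps of Fig. 3. The grid size, the axis extent, and the function name are assumptions for illustration.

```python
import matplotlib.pyplot as plt

def plot_search_distribution(xs, ys, ax, arena_radius=0.75):
    """Hexagonally binned presence map for one test condition; xs, ys are the
    pooled 2D positions (m) of all bees, already normalised per bee."""
    hb = ax.hexbin(xs, ys, gridsize=30,
                   extent=(-arena_radius, arena_radius, -arena_radius, arena_radius))
    counts = hb.get_array()
    hb.set_array(100.0 * counts / counts.sum())  # percent of flight time per bin
    plt.colorbar(hb, ax=ax, label='presence [% of flight time]')
    ax.set_aspect('equal')
    ax.set_xlabel('x [m]')
    ax.set_ylabel('y [m]')
```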

Statistical analysis for hypotheses testing

Hypotheses about the time spent in one area compared to another were tested using the dependent t-test for paired samples. Hypotheses involving multiple comparisons were tested using a Bonferroni correction of the significance level. As long as a hypothesis concerned only two areas, no adjustment was made to the significance level. With a sample size of 26 bees, we were able to detect a difference in the time spent in two areas (e.g. the fictive and true nest locations) of 0.25 s, assuming a standard deviation of 0.38 s (estimated from [14]), with a power of 80% at a significance level of 0.05. The analysis was performed in Python using the SciPy library.
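A sketch of the corresponding SciPy call is shown below, assuming arrays of the per-bee search times at the two locations; the Bonferroni step simply divides the significance level by the number of comparisons. Function and argument names are ours.

```python
from scipy import stats

def compare_search_times(t_visual, t_true, n_comparisons=1, alpha=0.05):
    """Paired t-test on the time each of the 26 bees spent at the visual vs.
    the true nest location; significance is Bonferroni-corrected when several
    area comparisons are made."""
    t_stat, p_value = stats.ttest_rel(t_visual, t_true)
    return t_stat, p_value, p_value < alpha / n_comparisons
```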

Acknowledgements

We would like to thank Vedant Dixit, Maximilian Stahlsmeier, Pia Hippel and Helene Schnellenberg for their help during the data collection. Additionally, we would like to thank Sina Mews for helpful discussions on statistical models. This project was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) and the Agence Nationale de la Recherche (ANR, French National Research Agency). We also acknowledge support for the publication costs by the Open Access Publication Fund of Bielefeld University.

Supporting Information

Methods

Rotational Image Difference Functions

For each position (x, y) in the arena an equirectangular panoramic image (360 deg along the azimuth, 180 deg along the elevation) was acquired. To determine the most familiar direction, each view in the arena was compared to the memorised snapshots. This comparison was based on the minimum rotational image difference function (RIDF, Eq. 1) for the brightness model. The minimum RIDF is the minimum root mean squared image difference $d_{x,y}$ between two views (the current view $I_{x,y}$ and the view at the nest $I_N$) over the azimuthal viewing directions $\alpha$, weighted by $w(v)$. Here $w(v)$ is a sine wave along the y-axis counterbalancing the oversampling of the poles in the transformation of the 3D sphere mimicking the bee's eye to 2D equirectangular images, giving values of 1 at the equator and 0 at the poles (Eq. 2). In the equations, $(u, v)$ corresponds to the viewing direction along the azimuth $u$ and along the elevation $v$. The image resolution was $(N_u, N_v) = (360, 180)$ pixels.
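The typeset equations did not survive extraction; the following LaTeX reconstruction of Eqs. 1 and 2 is our best reading of the definitions above, and the exact normalisation may differ from the original.

```latex
% Hedged reconstruction of Eqs. 1-2 from the surrounding text.
d_{x,y}(\alpha) = \sqrt{\frac{1}{N_u N_v}\sum_{u=1}^{N_u}\sum_{v=1}^{N_v}
w(v)\,\bigl(I_{x,y}(u+\alpha, v) - I_N(u, v)\bigr)^{2}} \qquad \text{(Eq. 1)}

w(v) = \sin\!\left(\frac{\pi v}{N_v}\right) \qquad \text{(Eq. 2)}
```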

The RIDF $d_{s_i,x,y}$ is calculated for each snapshot $s_i$ in $S = \{s_0, s_1, \ldots, s_n\}$ around the nest location, and the heading direction $h_{s_i,x,y}$ at each grid location $(x, y)$ and for each snapshot $s_i$ is determined by taking the location of the minimum RIDF (Eq. 3). To weight the heading directions, the ratio $w_{s_i}$ between the minimum RIDF over all snapshots, $d_{min}$, and the current RIDF $d_{s_i}$ was calculated (Eq. 5). The homing vector $\vec{HV}$ results from the weighted circular mean of the different heading directions $h_{s_i}$ (Eq. 6).
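A compact Python sketch of this pipeline (RIDF minimisation per snapshot, then the weighted circular mean of Eqs. 3-6) is shown below. The elevation weighting and the weighting ratio follow the description above but are not guaranteed to match the original implementation.

```python
import numpy as np

def ridf_homing_vector(current, snapshots):
    """Brightness-model sketch: for each memorised snapshot, find the azimuthal
    rotation of the current view minimising the weighted RMS image difference
    (Eq. 1), then combine the headings by a weighted circular mean (Eqs. 3-6)."""
    n_v, n_u = current.shape                            # elevation x azimuth
    w = np.sin(np.pi * np.arange(n_v) / n_v)[:, None]   # Eq. 2, pole weighting
    headings, d_min = [], []
    for snap in snapshots:
        d = [np.sqrt(np.mean(w * (np.roll(current, a, axis=1) - snap) ** 2))
             for a in range(n_u)]
        best = int(np.argmin(d))
        headings.append(2 * np.pi * best / n_u)         # Eq. 3, most familiar direction
        d_min.append(d[best])
    d_min = np.array(d_min)
    weights = d_min.min() / d_min                       # Eq. 5, ratio w_si
    hv = np.sum(weights * np.exp(1j * np.array(headings)))
    return np.angle(hv)                                 # Eq. 6, homing direction
```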

The contrast-weighted nearness model uses the contrast, weighted by the depth of the environment, to steer an agent home. The contrast was calculated as the ratio of the luminance amplitude (i.e. the standard deviation of the luminance) to the luminance background (i.e. the average of the luminance) within a 3 × 3 pixel window of the snapshot image (i.e. Michelson contrast). As described in [13], we used the rotational similarity function between the current view $I_{x,y}$ and the memorised view $I_N$ (Eq. 7). As for the brightness-based model, the homing vector (Eq. 6) was computed as the weighted circular mean of the vectors derived from each memorised view (Eq. 9).
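For illustration, the local contrast computation can be sketched as follows, using a uniform 3 × 3 filter; pairing it with a rendered depth map yields the contrast-weighted nearness. The filter choice and the names are assumptions, not the original implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def contrast_weighted_nearness(image, distance):
    """Local contrast (std / mean of the luminance in a 3x3 window) weighted
    by nearness (1 / distance); image and distance are equirectangular maps."""
    mean = uniform_filter(image.astype(np.float64), size=3)
    sq_mean = uniform_filter(image.astype(np.float64) ** 2, size=3)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))
    contrast = np.divide(std, mean, out=np.zeros_like(std), where=mean > 0)
    return contrast / distance
```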