1 Project Overview
Natural Power, alongside Project partners NatureMetrics and EDF Renewables, were successful in securing funding via an Offshore Wind Growth Partnership (OWGP) Innovation Grant for 50% of the Project costs for an environmental DNA (eDNA) Fish Ecology Research Project. This 18-month Project involved surveying using both traditional trawl and eDNA survey methods around a commercial offshore wind farm to trial the viability of eDNA sampling and provide a method for such data collection for fish ecology assessments.
Four surveys were conducted over 12 months beginning in March 2022 and continuing in each consecutive season (quarter), mirroring the typical compliance frequency for fish ecology monitoring (Natural Power, 2021). The data from the concomitant trawls and eDNA samples were compared to assess the effectiveness of the eDNA method. For eDNA analysis, two typical assays used by NatureMetrics for ‘fish’ and ‘invertebrate’ were tested as well as a new ‘marine vertebrate’ assay currently in the research and development (R&D) stage. Comparative analysis of the ‘fish’ and ‘marine vertebrate’ assays is presented in this report. The R&D ‘marine vertebrate’ assay also provided results on marine mammal species, which are included in this report.
An initial review of invertebrate eDNA results proved encouraging and the project scope was extended, in August 2022, to allow for ground truthing of the eDNA data. Results from the ‘invertebrate’ assay will be investigated in detail and presented in a subsequent report.
The Project was carried out at Blyth Offshore Demonstrator, where pre- and post-construction monitoring surveys using traditional methods have been conducted for over a decade. The 2022 dataset was also compared against the historical fish community data from the site to investigate alignment with known long-term ecological trends. Additional sampling was also undertaken within the turbine array, where trawling could not be conducted for safety reasons, with a view to obtaining information on how the area in the vicinity of the turbines is utilised by fish species.
1.1 Project Drivers
Governments and the energy industry are rapidly transitioning to renewable sources of energy in response to climate change. This has led to ambitious time-bound goals, with the overall aim of Net Zero by 2050 (GovUK, 2022). One key target is deriving 50 gigawatts (GW) of energy from offshore wind by 2030 (Department for Energy Security and Net Zero and Department for Business and Trade, 2023). The Crown Estate’s Offshore Wind Leasing Round (Round 4) will bring approximately 8 GW of additional capacity around England and Wales (Crown Estate, 2023), and Crown Estate Scotland’s ScotWind leasing round aims to provide as much as 27.6 GW of offshore wind capacity delivered between 2027 and 2032 (Crown Estate Scotland, 2023). This places increasing pressure on existing supply chains, not only for offshore wind infrastructure, but also to satisfy the environmental and consenting requirements for offshore developments. With current resource limitations (e.g., vessel availability and specialist staff), innovative methods of obtaining biological data at greater frequencies without an increase in surveying effort are essential to meet demand. Furthermore, as Offshore Wind Farm (OWF) developments move into deeper offshore waters, current survey techniques can become more challenging or infeasible, potentially limiting the amount of targeted site-specific data that can be obtained.
Environmental DNA methods provide a non-invasive solution which can outperform conventional methods for a variety of terrestrial and marine biological surveys (Fediajevaite et al. 2021). Despite this, offshore environmental surveys using eDNA are relatively new, and have typically been conducted in nearshore areas (e.g., Ely et al. 2021; Monuki, Barber, and Gold 2021; Mynott and Marsh 2020) or around oil and gas platforms (Alexander et al. 2022; Mauffrey et al. 2021; Cordier et al. 2019). Such studies have shown that eDNA often captures additional species compared to traditional methods, including those which are ecologically important in environmental assessments for consent applications and ongoing monitoring (Mynott & Marsh, 2020).
There is a developing body of literature comparing eDNA-based surveys with conventional surveys for fish ecology (e.g., Alexander et al. 2022; Port et al. 2016; Thomsen et al. 2016; Stoeckle, Ausubel, and Coogan 2022). However, to date only one other study has investigated the potential application of eDNA methods around offshore wind infrastructure. That study concluded that eDNA was a powerful future tool but recommended comparative studies of eDNA and trawl data to further understanding (Ray et al. 2023).
1.2 Aims & Objectives
The overall aim of the project was to compare eDNA-based methods with conventional fish trawl surveys, with a view to identifying whether eDNA can support or replace traditional methods for environmental baseline setting and compliance monitoring. This was evaluated throughout the project, from survey design and method development through to statistical analysis of the resulting datasets, to answer the key questions listed below:
1. Can eDNA samples be practically obtained offshore whilst working around a commercial OWF?
2. Does the number of species and/or species composition differ between the two methods?
3. Can seasonal trends be identified in eDNA data and if so, do they align with trawl data (present and historic)?
4. Can spatial trends be identified in eDNA data and if so, do they align with trawl data (present and historic)?
5. Does eDNA data identify any differences in fish community composition in the vicinity of the turbines from that of trawl stations? If so, does it support the theory that artificial reef habitat is having a positive effect on fish ecology in the area?
6. Can eDNA provide data on other species groups such as invertebrates and marine mammals?
7. Which eDNA assay (‘fish’ or ‘marine vertebrate’ (henceforth referred to as ‘vertebrate’)) performs the best?
8. Does eDNA sampling reduce survey costs and sampling effort?*
*NB. Savings in survey cost and sampling effort (question 8) are discussed qualitatively throughout the report.
1.3 Report Scope
This report has been produced as a key output of the OWGP Innovation Grant. The purpose of the funding is to support projects which result in market-ready technologies, products, and services to accelerate offshore wind site development during the consenting phase. The study aims to identify a potential solution to overcome a current challenge associated with offshore wind site development through the investigation of an innovative environmental survey technology to feed into the assessment process. This report focuses primarily on fish ecology assessment, but also includes reference to data obtained on invertebrates (investigated in detail in a subsequent report), as well as marine mammals which could feed into future research.
2 Sampling Design & Methodology
2.1 Survey Design
The Project was carried out at Blyth Offshore Demonstrator (BOD), located approximately 5 km off the coast of Blyth, northeast England. Pre- and post-construction monitoring surveys at the site using traditional trawl methods have been conducted for over a decade as a condition of the Marine Licence.
Three trawl locations were selected from those conducted during the pre/post-construction monitoring at BOD ensuring coverage of a range of habitats and depths representative of the site conditions. The existing monitoring station numbering has been retained in this report. Nearshore and offshore locations were chosen as previous results indicated that fish catch composition changed with depth (Natural Power, 2021). A new trawl station as close to the turbines as practically possible was also selected to compare species composition within the turbine area (from eDNA samples) with the nearby trawl station.
Survey locations were Station 8 (nearshore along the cable route), 3 (offshore, at the greatest depth), 5 (furthest north and slightly inshore) and Station 9 (new station closest to the turbines) as illustrated in Figure 2.1. Water samples for eDNA analysis were collected from within the BOD array at eDNA stations 1 and 2 (where trawling could not be conducted due to safety risks of snagging gear), and at the beginning and end of each trawl sampling station.
The different taxa targeted by both survey techniques undergo seasonal migrations as well as changes in biological activity (e.g., spawning) that are considered to alter the quantity of eDNA produced. Therefore, sampling was conducted quarterly to align with previous survey frequency (Natural Power, 2021) and to capture this seasonal variation.
The surveys took place during 2022 on the following dates, within the pre-defined seasonal sampling windows:
Winter: 28th March
Spring: 24th May
Summer: 2nd September
Autumn: 12th December
Additional sampling took place on 5th January 2023 to conduct benthic invertebrate sampling for the extension of the project, as well as obtaining one remaining set of eDNA samples from Station 8 which were not collected in the autumn survey due to equipment failure.
2.2 Sampling Methodology
2.2.1 Traditional techniques
An otter trawl was used with a commercially comparable net (80 mm mesh in the main body and cod-end with the foot rope using 15-20 cm rubber hoppers). The net was towed for 30 minutes at approximately 3.5 knots. Any reduction of the tow duration was recorded on the deck logs, together with the reason the tow was hauled early. Coordinates and times (GMT) for the beginning and end of each trawl were recorded using a handheld GPS or the vessel GPS (preferred). Depth of water (m) was recorded from the vessel’s system, as well as the prevailing weather conditions and sea state.
For each haul, the catch was unloaded into fish boxes/bins and photographed with a waterproof label showing the sample number. Any unidentifiable or unusual specimens were also photographed for later identification/verification. All individuals in the catch were identified to species level, enumerated and measured where appropriate in accordance with Annex IV of EU Regulation 2019/1241: total length for fish and elasmobranch species (width for skates and rays), carapace width for crabs (length for lobsters) and mantle length for squid. All macro-invertebrate species were enumerated, with sub-sampling undertaken if large quantities were captured (as specified in Boyd et al. 2016). Following identification and enumeration, all specimens were returned to the sea.
2.2.2 Environmental DNA field sampling
At the beginning and end of each tow, three 5 L replicate water samples for eDNA analysis were collected using a Niskin bottle. The vessel was positioned relative to the wind/tide/current to avoid the cable leading under the vessel and the Niskin bottle was deployed from the side of the vessel, avoiding the propeller. There was no vessel transiting during the deployment. Taking the above into consideration, the vessel was orientated into the tide where possible when taking the samples, to minimise any contamination from the trawl catch.
The bottle was lowered on a Dyneema cable, moved up and down a few times to flush the inside, and the sample taken c. 1 m above the seabed to capture the near-benthic eDNA without disturbing the seafloor. The correct length of cable was payed out using the depth on the vessel sounder and meter markings on the cable. Three 5 L replicates were also taken at two sample stations (eDNA Stations 1 and 2) as close to the turbines as possible. The coordinates and time (GMT) at which each sample was taken were recorded using the NatureMetrics app or a handheld GPS. Depth of water (m) was recorded from the vessel’s system.
A clean set of gloves was worn for each sample to prevent contamination. Sampling kits with all the equipment required for eDNA sampling were provided by NatureMetrics. Once on board, the water sample for each replicate was transferred from the Niskin bottle to a single-use sterile sample bag for filtering.
A peristaltic pump (vampire sampler) was used to filter the water samples. One end of a length of tubing was connected to the filter and the tubing was fed through the pump. The other end of the tubing was then placed into the sample bag and the pump was turned on to filter the sample (Figure 2.2 and Figure 2.3). Once the sample bag was empty, any water remaining in the tubing was pumped through to ensure the complete sample had passed through the filter. A preservation buffer was then added to the filter and caps added to both ends of the filter. The filter and completed data sheet were stored in the corresponding specimen bag labelled with the station and sample number for analysis in the laboratory.
The Niskin bottle and buckets were cleaned with spray bleach and flushed with deionized water between each sample station.
During each survey, two field control samples were taken (towards the beginning and end of the survey) to test for contamination that may have occurred during sampling. Field controls were collected by filling the cleaned Niskin bottle with deionized water. The deionized water was then emptied into a sample bag and the water was filtered following the above procedure.
2.2.3 Environmental DNA Laboratory Analysis
Technical details of the eDNA laboratory analysis have been provided in Appendix A, with an overview summarised below.
DNA Extraction, Amplification & Sequencing
In the laboratory, DNA was extracted from each filter and a DNA extraction blank was processed with each batch to assess potential contamination in the extraction process. DNA was then purified and quantified.
Two typical assays used by NatureMetrics for ‘fish’ and ‘invertebrate’ were tested as well as a vertebrate assay. A genetic barcode region was amplified using primers specific to the assay for each sample. The amplified DNA was then sequenced to identify unique genetic sequences.
Bioinformatics
The raw sequence data were processed and compared to genetic reference databases of species through a bioinformatics process to generate an output list of taxa detected in each sample for ecological analysis.
For each assay, assignments were made to the lowest possible taxonomic level, using similarity thresholds >90%. A country-based sense-checking step was also implemented in line with the Global Biodiversity Information Facility (GBIF) occurrence records for the United Kingdom. All taxonomic units with species-level identifications were queried against the International Union for Conservation of Nature (IUCN) Red List to obtain global threat status.
The number of reads assigned to each species per sample during taxonomic assignment against the reference database (i.e., the read count; as in Muri et al. 2016) was used for downstream analysis.
3 Data analysis
As sequence read counts (henceforth referred to as ‘read counts’) from eDNA and abundance of fauna in traditional trawl datasets vary in their units and scale of capture, it was decided to analyse the two datasets separately. Each dataset was processed and analysed separately with univariate and multivariate techniques (see Sections 3.3 and 3.4 for details), based on the read count and abundance information respectively. Both datasets were then simplified to presence/absence of each species at a station during each survey and combined to allow simple multivariate comparisons to be made.
3.1 Trawl data processing
Raw trawl data were imported into R programming software (R Core Team, 2021). The counts of any species sub-sampled during trawl surveys were raised by the appropriate amount in order to provide the total abundance in the whole sample. Catch data were then standardised to 30-minute trawls using a standardisation factor to account for differences in sampling effort.
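The two processing steps described above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, and the function names, the `fraction_sorted` parameter and the example numbers are hypothetical; the exact raising and standardisation factors applied in the Project are not stated in this report.

```python
def raise_subsample(count, fraction_sorted):
    """Raise a sub-sampled count to an estimate for the whole catch.

    fraction_sorted is the (assumed) proportion of the catch that was
    actually sorted and counted, e.g. 0.25 for a quarter of the catch.
    """
    if not 0 < fraction_sorted <= 1:
        raise ValueError("fraction_sorted must be in (0, 1]")
    return count / fraction_sorted


def standardise_to_30_min(count, tow_minutes, target_minutes=30.0):
    """Scale a catch count linearly to a standard tow duration."""
    if tow_minutes <= 0:
        raise ValueError("tow_minutes must be positive")
    return count * target_minutes / tow_minutes


# Hypothetical example: 25 fish counted in a quarter of the catch
# from a tow that was hauled early after 20 minutes.
whole_catch = raise_subsample(25, 0.25)
standardised = standardise_to_30_min(whole_catch, tow_minutes=20)
print(whole_catch, standardised)
```

A linear scaling assumes catch rate is constant over the tow, which is a simplification adopted here for illustration only.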
3.2 eDNA data processing
All analysis for eDNA was conducted using the ‘vertebrate’ and ‘fish’ assays separately for comparison. Results from the ‘invertebrate’ assay will be investigated in detail in a subsequent report.
Read count datasets were filtered to remove reads that were identified as likely contaminants using the ‘decontam’ package (Davis et al. 2018) in R. This filtering used prevalence testing, which flags species as potential contaminants if read counts in samples are similar to those in control runs (field blanks). Where a species was deemed to be a potential contaminant (using the default probability threshold of 0.1), it was removed from the dataset for all stations within that month. Decontamination was carried out for each survey separately. Seven species were removed during the decontamination process across the three assays, including John dory (Zeus faber) from the fish assay in the May 2022 survey (p = 0.066, prevalence = 4), and brill (Scophthalmus rhombus) from the fish assay in the December 2022 survey (p = 0.083, prevalence = 3). Other contaminants were non-fish taxa (e.g., Acanthogorgiidae, an unidentified duck species (Anatidae) and marine worms (Spionidae)).
In addition, only taxa identified to species level were retained for analysis, with the exception of a skate identified through bioinformatics as either the cuckoo ray (Leucoraja naevus) or shagreen ray (Leucoraja fullonica), which was included in the analysis as Leucoraja sp. The freshwater species stone loach (Barbatula barbatula) and minnow (Phoxinus phoxinus) were removed from the dataset, as their eDNA was likely to be present in the vicinity from either freshwater input or deposition by predator species, rather than the fish themselves being present. The same may apply to three-spined stickleback (Gasterosteus aculeatus), which is commonly found in brackish water; however, the species is very tolerant of salinity changes and some populations are known to be anadromous (Kottelat & Freyhof, 2007). Taking a cautious approach, this species was also removed due to uncertainty around whether these records indicate marine presence.
As part of the decontamination process, the blue whale (Balaenoptera musculus), which was recorded in the spring vertebrate assay data, was removed from the dataset as it appeared in only one of the three replicate samples at a station. The record initially seemed unlikely to be genuine; however, following a sighting of two blue whales in the central North Sea just north of Newcastle in 2020, there is growing evidence to suggest their presence in shallow waters of the central North Sea (Lavallin et al. 2023).
Seasonal variation in occurrence is known from historic trawls and drop-down video at the site, with a seasonal signal for haddock (Melanogrammus aeglefinus), whiting (Merlangius merlangus), plaice (Pleuronectes platessa), grey gurnard (Eutrigla gurnardus), dab (Limanda limanda), long rough dab (Hippoglossoides platessoides), and lemon sole (Microstomus kitt) (Natural Power, 2021). To investigate whether eDNA showed a similar seasonal trend, seasonal occurrences of these species were visualised from the eDNA data, with read counts considered to provide an indication of the extent of occurrence (see caveats listed in the discussion of this report).
3.3 Univariate analysis
The following species diversity indices were calculated for both trawl and eDNA data:
Number of Species (S) (Taxa): provides the number of species present in a sample, with no indication of relative abundances;
Effective species: the number of equally abundant species needed to obtain the same mean proportional species abundance as that observed in the survey data;
Number of individuals / read counts (n) (Abundance): provides the total number of individuals or read counts counted;
Species Diversity - Shannon-Wiener index (H’): measures the uncertainty in predicting the identity of the next species withdrawn from a sample. Typically between 1.5 and 3.5, a lower value shows lower diversity;
Species Richness - Margalef’s index (d): measures the number of species present for a given number of individuals. The higher the index, the greater the diversity;
Pielou’s evenness (J’): shows how evenly the individuals in a sample are distributed among species. J’ ranges from zero to one. The less variation in abundance between the species in a sample, the higher J’ is.
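The indices listed above can be sketched for a single sample as follows. This is an illustrative Python version, not the R code used in the study. It assumes natural logarithms throughout and takes effective species to be exp(H’), the Hill number of order 1; the report does not state which formulation was used. The function name and dictionary keys are hypothetical.

```python
import math


def diversity_indices(counts):
    """Univariate indices for one sample of abundances or read counts."""
    counts = [c for c in counts if c > 0]   # ignore absent species
    s = len(counts)                         # number of species, S
    n = sum(counts)                         # individuals / reads, n
    props = [c / n for c in counts]
    shannon = -sum(p * math.log(p) for p in props)             # H'
    margalef = (s - 1) / math.log(n) if n > 1 else math.nan    # d
    pielou = shannon / math.log(s) if s > 1 else math.nan      # J'
    return {
        "S": s,
        "N": n,
        "H": shannon,
        "d": margalef,
        "J": pielou,
        "effective_species": math.exp(shannon),  # assumed exp(H')
    }


# Hypothetical catch: one dominant species and three scarce ones.
print(diversity_indices([120, 5, 3, 2]))
```

For a perfectly even sample of four species, H’ equals ln(4), J’ equals 1 and the effective number of species equals 4, which is a convenient sanity check on any implementation.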
These univariate indices enable the reduction of large datasets into useful metrics, which can be used to accurately describe community structures. However, where eDNA read counts are used, it should be noted that read counts are not directly linked to abundance, and indices that incorporate abundance may be less reliable with eDNA compared to trawls. Regardless, results are presented equally for both methods for comparison.
3.4 Multivariate analyses
Multivariate analysis is an effective method for detecting subtle changes in species community datasets. Multivariate analyses were performed in R using the vegan package (Oksanen et al. 2022). Due to the partially skewed nature of species data and its varying abundances, a square root transformation was applied to normalise the trawl data distribution, and a fourth root transformation was applied to the eDNA read counts, reducing the dominant effect of highly abundant taxa. A Bray-Curtis resemblance matrix was then calculated from the transformed data.
To cluster stations based on community composition, hierarchical clustering with similarity profile (SIMPROF) permutation testing was used to identify coherent groups of stations. This process creates a dendrogram of similarity between stations and descends through its nodes, testing for significant multivariate structure within each node, until all significant groups of stations are identified.
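The transformation and resemblance steps can be illustrated with a short Python sketch (the study itself used the vegan package in R). The function names and the example counts are hypothetical and chosen only to show the arithmetic.

```python
def root_transform(counts, root):
    """Apply an n-th root transformation (root=2 square root, root=4 fourth root)."""
    return [c ** (1.0 / root) for c in counts]


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two samples of equal length."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    # Two empty samples are treated as identical here (an assumption).
    return numerator / denominator if denominator else 0.0


# Hypothetical trawl abundances for three species at two stations.
station_a = root_transform([4, 0, 16], root=2)
station_b = root_transform([1, 9, 16], root=2)
print(bray_curtis(station_a, station_b))
```

The stronger fourth root transformation compresses large values more than the square root, which is why it suits read counts that can span several orders of magnitude.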
During the summer survey, trawl sampling at Station 8 was not possible due to the presence of static fishing gear in the region. It was, however, possible to collect eDNA samples at the planned trawl start and end locations. In order to reduce potential effects from this unbalanced design (Anderson & Walsh, 2013), Station 8 was removed prior to ANOSIM and permutational multivariate analysis of variance (PERMANOVA) tests when using the trawl data (see below).
ANOSIM was used to determine whether there was a significant difference in community composition between surveys, between stations and, in the case of the eDNA dataset, between stations within and outside the turbine development area. This uses a Bray-Curtis dissimilarity matrix to determine whether the mean rank of dissimilarities between groups is greater than that within groups, where groups are variables such as station and survey. The resultant R statistic quantifies that difference, with values of 0 representing random groupings (i.e., there is no significant influence of group on species composition), and values closer to –1 or +1 showing a stronger influence of groups.
PERMANOVA was used to determine whether the same variables tested with ANOSIM affected community composition. PERMANOVA is a semiparametric method that partitions multivariate variation within dissimilarity measures. This tests whether the centroid and spread of dissimilarity differ between groups (e.g., stations, surveys, location), and permutes with random draws from the dataset to calculate the probability of the given groups explaining variation in composition.
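To make the R statistic concrete, the sketch below computes it from a dissimilarity matrix and estimates a permutation p-value. It is a simplified Python illustration and not the vegan implementation used in the study: it uses the standard form R = (mean between-group rank minus mean within-group rank) / (M / 2), where M is the number of sample pairs, and assumes every grouping contains at least one within-group and one between-group pair. All function names and the example matrix are hypothetical.

```python
import itertools
import random


def average_ranks(values):
    """Rank values from 1, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R statistic for a square dissimilarity matrix."""
    pairs = list(itertools.combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim(dist, groups, permutations=999, seed=0):
    """Return (R, p), with p estimated by shuffling the group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical matrix: samples 0-1 and 2-3 form two tight groups.
d = [[0.0, 0.1, 0.9, 0.9],
     [0.1, 0.0, 0.9, 0.9],
     [0.9, 0.9, 0.0, 0.1],
     [0.9, 0.9, 0.1, 0.0]]
print(anosim(d, ["eDNA", "eDNA", "trawl", "trawl"]))
```

With only four samples there are very few distinct relabellings, so the p-value in this toy example cannot be small even though R is at its maximum; real datasets need enough samples for the permutation test to be informative.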
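The partitioning that PERMANOVA performs can be sketched for a single grouping factor as follows. This Python illustration is an assumption-laden simplification of what the study ran in R: it computes the one-way pseudo-F statistic directly from squared dissimilarities and obtains a p-value by permuting group labels. Function names and the example matrix are hypothetical.

```python
import itertools
import random


def pseudo_f(dist, groups):
    """One-way PERMANOVA pseudo-F from a square dissimilarity matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared dissimilarities over all pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2
                   for i, j in itertools.combinations(range(n), 2)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in itertools.combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permanova(dist, groups, permutations=999, seed=0):
    """Return (pseudo-F, p), with p estimated by shuffling group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical matrix with two well-separated groups of two samples.
d = [[0.0, 0.1, 0.9, 0.9],
     [0.1, 0.0, 0.9, 0.9],
     [0.9, 0.9, 0.0, 0.1],
     [0.9, 0.9, 0.1, 0.0]]
print(permanova(d, ["autumn", "autumn", "spring", "spring"]))
```

Because the statistic responds to differences in both centroid and spread, a significant result is usually interpreted alongside a test of dispersion; that step is outside this sketch.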
3.5 Comparison
3.5.1 Data processing
In order to compare the species occurrence in eDNA and trawls directly, the differences in species detected by the two methods were investigated. Venn diagrams were constructed with the VennDiagram package in R (Chen 2022) to visualize the species that were detected uniquely by each sampling methodology or shared between both datasets. The lists of species detected by one method but not the other are presented to determine whether key species are missed, or conversely detected, and whether there is a pattern in those differences. Stations 1 and 2 were omitted from eDNA analysis for this element, as they were not sampled by trawls.
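The three regions of a two-set Venn diagram correspond to simple set operations, as in the Python sketch below. The function name is hypothetical and the species lists are shortened, illustrative examples rather than the study's full detection lists.

```python
def compare_detections(trawl_species, edna_species):
    """Split species into shared, trawl-only and eDNA-only groups."""
    trawl, edna = set(trawl_species), set(edna_species)
    return {
        "shared": sorted(trawl & edna),
        "trawl_only": sorted(trawl - edna),
        "edna_only": sorted(edna - trawl),
    }


# Illustrative lists only.
result = compare_detections(
    ["dab", "plaice", "grey gurnard"],
    ["dab", "plaice", "herring", "sprat"],
)
for region, species in result.items():
    print(region, len(species), species)
```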
3.5.2 Multivariate
As the units and scale of trawl abundance and eDNA read counts are not equal, these measures were transformed to presence/absence of each species in each sample. Due to the binary nature of this occurrence data, the Jaccard index of dissimilarity was adopted as a measure of distance. The Jaccard similarity is the proportion of species that are shared between a pair of samples, and the dissimilarity is one minus this proportion. For two samples x and y, this can be written as
\[ d_J(x, y) = 1 - \frac{a}{a + b + c} \]
where a is the number of species present in both samples, b is the number of species present in x but not y, and c is the number of species present in y but not x.
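The reduction to presence/absence and the Jaccard calculation can be sketched in Python as follows. The function names and the abundance and read count values are hypothetical; the study performed this step in R.

```python
def to_presence(counts_by_species):
    """Reduce abundances or read counts to the set of species present."""
    return {species for species, count in counts_by_species.items() if count > 0}


def jaccard_dissimilarity(x, y):
    """Jaccard dissimilarity 1 - a / (a + b + c) between two species sets."""
    a = len(x & y)   # present in both samples
    b = len(x - y)   # present in x but not y
    c = len(y - x)   # present in y but not x
    if a + b + c == 0:
        return 0.0   # two empty samples are treated as identical (an assumption)
    return 1 - a / (a + b + c)


# Hypothetical trawl abundances and eDNA read counts at one station.
trawl = to_presence({"dab": 12, "plaice": 3, "whiting": 0})
edna = to_presence({"dab": 5400, "herring": 230, "plaice": 0})
print(jaccard_dissimilarity(trawl, edna))
```

Here one species is shared (dab), one is unique to the trawl (plaice) and one is unique to eDNA (herring), so a = b = c = 1 and the dissimilarity is 1 - 1/3.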
As Station 8 was missed during the summer trawl survey, this station was removed from the eDNA and trawl datasets in all seasons for multivariate analysis to improve comparability. Using the Jaccard distances, ANOSIM was used to determine whether species composition varied by sampling methodology. Non-Metric Multidimensional Scaling (NMDS) plots were produced to examine the similarity between sampling methods.
4 Method Development
A key aspect of the Project was to trial how the eDNA method could be practically implemented at sea, whilst working on a commercial offshore wind farm site. Table 4.1 outlines the lessons learnt and improvements made throughout the Project.
Following a few initial adjustments to improve efficiency, the methods were successfully implemented, and no issues were significant enough to prevent the surveys from being completed and the project aims from being met.
As such, the Project concluded that eDNA samples can be practically obtained offshore whilst working around a commercial OWF.
5 Results
5.1 Species Occurrence and Community Composition
A total of 26 fish species and 1,483 individuals were captured during the 2022 trawl surveys. The most abundant species of fish were whiting, haddock, dab, plaice, and long rough dab (Table 5.1). This aligns with the post-construction compliance monitoring results whereby in year 3, the most abundant species from the trawl catches across seasons were dab, plaice and haddock followed by long rough dab (Natural Power, 2021). A total of 59 species of fish were detected in the fish assay (including within turbine Stations 1 and 2) and 54 species were detected in the fish assay from trawl stations alone (Table 5.2). Read counts indicate the most abundant species of fish were haddock, long rough dab, and dab, followed by whiting, lemon sole and plaice.
There were 42 species of fish identified in the vertebrate assay (including within turbine Stations 1 and 2) and 41 fish species detected in the vertebrate assay from the trawl stations alone (Appendix B, Table 11.1). Read counts indicate the most abundant species of fish were cod (Gadus morhua), long rough dab, lemon sole and hake (Merluccius merluccius).
Several species were detected by one sampling method but not the other. Nineteen species were recorded using both the trawl and fish assay eDNA method, whilst 35 species were unique to the eDNA data and seven species were unique to the trawls (Figure 5.1).
When using the fish assay, the species present in the trawl data but not detected by eDNA were: red gurnard (Aspitrigla cuculus), tub gurnard (Chelidonichthys lucerna), grey gurnard (Eutrigla gurnardus), anglerfish (Lophius sp.), thornback ray (Raja clavata), cuckoo ray (Leucoraja naevus) and small-spotted catshark (Scyliorhinus canicula). However, anglerfish was identified to species level using eDNA as European angler/common monkfish (Lophius piscatorius). In addition, a Triglidae sequence was frequently detected in the fish assay data, which is likely to represent one or several of the gurnard species identified (which share the same zero-radius Operational Taxonomic Unit (ZOTU) sequence in the fish assay). A ray was identified in the vertebrate assay as either the cuckoo or shagreen ray (Leucoraja naevus/Leucoraja fullonica) and was included in the final vertebrate assay dataset as Leucoraja sp. (Appendix B).
Trawls did not catch the following species that were detected by eDNA (using the fish assay): Atlantic wolffish (Anarhichas lupus), European eel (Anguilla anguilla), spotted dragonet (Callionymus maculatus), Yarrell’s blenny (Chirolophis ascanii), five-bearded rockling (Ciliata mustela), northern rockling (Ciliata septentrionalis), herring (Clupea harengus), crystal goby (Crystallogobius linearis), goldsinny (Ctenolabrus rupestris), lumpsucker (Cyclopterus lumpus), bass (Dicentrarchus labrax), lesser weever (Echiichthys vipera), fourbeard rockling (Enchelyopus cimbrius), anchovy (Engraulis encrasicolus), witch (Glyptocephalus cynoglossus), sea snail (Liparis liparis), Montagu’s sea snail (Liparis montagui), shanny (Lipophrys pholis), anglerfish, thickback sole (Microchirus variegatus), bull rout (Myoxocephalus scorpius), European smelt (Osmerus eperlanus), butterfish (Pholis gunnellus), Norwegian topknot (Phrynorhombus norvegicus), pollack (Pollachius pollachius), saithe (Pollachius virens), sand goby (Pomatoschistus minutus), tadpole fish (Raniceps raninus), sea trout (Salmo trutta), pilchard (Sardina pilchardus), Atlantic mackerel (Scomber scombrus), brill (Scophthalmus rhombus), sea scorpion (Taurulus bubalis), Atlantic horse mackerel (Trachurus trachurus) and bib (Trisopterus luscus).
For the vertebrate assay, thirteen species were common to both the trawl and vertebrate assay eDNA method, whilst 28 species were unique to the eDNA data and 13 species were unique to the trawls (Appendix B, Figure 11.1). Again, gurnard species were present in the trawl but not in the eDNA when using the vertebrate assay, as were species such as haddock, whiting and plaice (which were not identified to species level in this assay). Species such as Atlantic salmon (Salmo salar) and sea trout, and bottom-dwelling fish species such as rockling, goby and shanny, were present in the eDNA data but not in the trawl. A full breakdown of species by method using the vertebrate assay is provided in Appendix B, Table 11.1.
There was a significant difference in community composition between methods when read counts from the fish assay and trawl abundance were simplified to presence or absence at each station and dissimilarity was calculated using the Jaccard dissimilarity index (ANOSIM P = 0.001; R = 0.9615) (Figure 5.2). Similarly, a significant difference was found between eDNA (using the vertebrate assay) and trawl abundance (ANOSIM P = 0.002; R = 0.606) (Appendix B, Figure 11.2).
The NMDS plot (Figure 5.2) clearly shows the dissimilarity, with no overlap between the two methods. This is expected given the greater number of species detected in the fish assay eDNA results than in the trawl catch data. To investigate the species driving this dissimilarity between sampling methods, redundancy analysis was conducted with constrained ordination. This allowed the investigation and visualization of species that were significantly different between methods, filtered to only include those that explained over 45% of variation for the fish assay (Figure 5.3) and vertebrate assay (Appendix B, Figure 11.3).
Figure 5.3 suggests that the main differences in species composition between the methods are that flatfish and ling are more dominant in the trawl results, whilst sprat (Sprattus sprattus) and herring are common in the eDNA results but are absent from, or present only in small numbers in, the trawl catches. Bib and Norway pout are also driving the dissimilarity between methods, as are five-bearded rockling and sand goby. This may reflect fish that can easily escape trawl nets, either due to being smaller or evading capture (e.g., by hiding in crevices), but are being captured by the eDNA method (fish assay).
There is more overlap between the methods when comparing community composition at stations using trawl catch data and the vertebrate assay eDNA data, although the results remain significantly different (Appendix B, Figure 11.2). Species driving dissimilarity are again flatfish for the trawls and, to an extent, cod, common dragonet and Norway pout in the vertebrate assay eDNA (Appendix B, Figure 11.2).
5.2 Seasonal Trends
Seasonal fluctuations in the abundance of certain fish species in the area can be seen in the trawl sampling results (Table 5.1) and in the eDNA results (Table 5.2; Appendix B, Table 11.2).
The 2022 trawl results found whiting and haddock in much greater abundances in the autumn, long rough dab and lemon sole in highest abundances in winter, while dab and plaice were less abundant in winter than in the other seasons and most abundant in spring (Table 5.1, Figure 5.4).
Seasonal trends from the eDNA data are largely consistent with those of the trawl data, particularly the fish assay data. Whiting was present in the highest relative abundances during autumn in the trawl and fish assay data. Cod and haddock were most abundant in the autumn trawls, but most prevalent in winter in the fish assay results (Table 5.2, Figure 5.5). Whiting and haddock were not captured by the vertebrate eDNA assay (Appendix B, Table 11.2, Figure 11.4). Dab were most abundant in the spring trawls and had the highest relative abundances in spring and summer in both the eDNA fish and vertebrate assays. Long rough dab were most abundant in winter and spring in the trawl and both fish and vertebrate assay results. Plaice were most abundant in spring in the trawl and fish assay data but were not detected using the vertebrate assay.
Univariate analysis of the trawl data shows that diversity, effective species number and evenness values are lowest in autumn at all trawl stations other than Station 8, where they are lowest in spring (Figure 5.6). Conversely, richness values are lowest in spring at trawl Stations 3, 5 and 8 and lowest in autumn at trawl Station 9 (Figure 5.6). This reflects the seasonal and migratory variation in species occurrence: for offshore Stations 3, 5 and 9, diversity indices are lower in autumn due to the large abundances of whiting and haddock, whereas diversity indices for the inshore Station 8 are lowest in spring, when the catch was dominated by dab with some flounder (Platichthys flesus) and plaice, with only one other fish species present.
Multivariate analysis was used to determine whether community composition in the trawl data differed significantly between surveys. Both ANOSIM and PERMANOVA found that community structure differs significantly between seasons (ANOSIM: P = 0.002; R = 0.5741; PERMANOVA: P < 0.001; R2 = 0.57605; F = 7.5402).
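For readers unfamiliar with ANOSIM, the R statistic compares the ranks of between-group and within-group dissimilarities, and significance is assessed by permuting the group labels. The sketch below is a minimal pure-Python illustration under that standard definition. The report does not state which software was used, and the dissimilarity values and function names here are invented.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2),
    where M = n(n - 1) / 2 is the number of sample pairs."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_w = sum(within) / len(within)
    mean_b = sum(between) / len(between)
    return (mean_b - mean_w) / (len(pairs) / 2)


def anosim(dist, groups, permutations=999, seed=0):
    """Return (R, p), with p estimated by randomly relabelling the samples."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(labels)
        if anosim_r(dist, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Invented dissimilarities for four samples (not Project data).
dist = [
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.70, 0.85],
    [0.80, 0.70, 0.00, 0.20],
    [0.90, 0.85, 0.20, 0.00],
]
seasons = ["spring", "spring", "autumn", "autumn"]
r_stat, p_value = anosim(dist, seasons)
print(r_stat)  # 1.0: every between-season pair is more dissimilar than any within-season pair
```

R close to 1 indicates that samples within a group are more similar to each other than to samples from other groups, while R close to 0 indicates no grouping. With only four samples, as in this toy example, the permutation p-value cannot be small, so the sketch is illustrative of the calculation only.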
Fish and vertebrate assay univariate analysis across seasons shows that diversity values are greatly reduced at Stations 1 and 2 in autumn (Figure 5.7; Appendix B, Figure 11.5). Diversity values for Station 9, which lies closest to Stations 1 and 2, are also reduced in autumn in the fish assay and vertebrate assay data and in the trawl data (Figure 5.6). This is likely due to a reduced number of taxa with high relative abundance (dab, haddock and whiting) recorded within the turbine stations and a much higher abundance of haddock and whiting at Station 9 in the trawl data in autumn (Table 5.1).
When comparing eDNA and trawl univariate results over seasons, Station 8 has lower diversity values in spring than in other seasons in the trawl data, and this is not reflected in the eDNA assay results. The trawl catch was dominated by dab with some flounder, while plaice was the only other fish species recorded in the trawl catch. This is similar to the historical data at BOD, where post-construction year 3 monitoring had only dab and plaice present in the trawl catch at Station 8. The eDNA method, however, detected 31 species at Station 8 in spring using the fish assay and 21 fish species using the vertebrate assay. These species included dab and plaice but also several species that are not generally captured using the otter trawl method, such as crystal goby (Crystallogobius linearis) and sand eel species (Ammodytes), as well as haddock, herring, sprat, poor cod (Trisopterus minutus), bib, sea scorpion and bull rout.
As with the seasonal changes seen in the trawl data, multivariate analysis showed a statistically significant difference in community composition between surveys in the eDNA data. When using the fish assay, ANOSIM suggested that community composition in eDNA varies significantly between seasons (P = 0.001; R = 0.6395). This is supported by PERMANOVA, with survey season a significant predictor (P = 0.001; R2 = 0.439; F = 5.716). When using the vertebrate assay, ANOSIM suggested that community composition in eDNA varies significantly between seasons (P = 0.001; R = 0.8307), as did PERMANOVA (P < 0.001; R2 = 0.573; F = 14.966).
5.3 Spatial Trends
Univariate analyses of the trawl catch data show diverse and species-rich communities across all trawl stations (Table 5.3, Figure 5.8).
Station 5 has the greatest abundance and lowest diversity scores of the four trawl stations. Station 3, followed by Station 9, has the lowest abundance and highest diversity scores of the trawl stations (Figure 5.8).
Univariate analysis of the eDNA datasets included Stations 1 and 2, despite these stations not being present in the trawl data. The fish assay detected more fish taxa and a greater number of reads than the vertebrate assay, both overall and at each station (Table 5.4; Appendix B, Table 11.3). Stations 8 and 9 have the highest number of taxa in the fish assay data, whereas Stations 3 and 8 have the highest number of taxa in the vertebrate assay data. The univariate diversity indices (Shannon-Wiener, richness, evenness and effective species number) are all higher when using the fish assay than the vertebrate assay; however, both sets of eDNA results indicate a diverse and species-rich community across the project area (Figure 5.9; Appendix B, Figure 11.6).
When comparing eDNA univariate results with those of the trawl, results are broadly similar: diversity values are highest at Stations 3 and 9 in the trawl data and at Stations 8 and 9 in the eDNA fish assay data (excluding within-turbine Stations 1 and 2), and species richness is lowest at Station 5 in both datasets (Figure 5.8 and Figure 5.9). Diversity and evenness are slightly greater at Station 3 than at Station 5 in the fish assay results.
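The four univariate indices named above can be computed from a single sample as in the following Python sketch. It assumes the natural logarithm for Shannon-Wiener, Pielou's definition of evenness and exp(H') for effective species number; the report does not state which variants were used, and the counts are invented.

```python
import math


def diversity_indices(counts):
    """Return richness, Shannon-Wiener H', Pielou evenness and effective
    species number for a {taxon: count} mapping (zero counts are ignored)."""
    values = [n for n in counts.values() if n > 0]
    total = sum(values)
    richness = len(values)
    # H' = -sum(p_i * ln(p_i)), with p_i the proportion of taxon i
    shannon = -sum((n / total) * math.log(n / total) for n in values)
    # Pielou's J = H' / ln(S); undefined for a single taxon, reported as 0 here
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return {
        "richness": richness,
        "shannon": shannon,
        "evenness": evenness,
        "effective_species": math.exp(shannon),
    }


# Invented, perfectly even sample of four taxa (not Project data).
even_sample = {"dab": 25, "whiting": 25, "plaice": 25, "haddock": 25}
result = diversity_indices(even_sample)
print(result["richness"], round(result["shannon"], 3), round(result["evenness"], 3))
```

For a perfectly even sample the effective species number equals richness, and a catch dominated by one or two species, such as the autumn whiting and haddock catches, lowers H', evenness and effective species number without necessarily lowering richness.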
When using hierarchical clustering (SIMPROF) based on Bray-Curtis dissimilarity, two significant clusters were identified from the trawls. Station 8 (which is the closest inshore) was clustered on its own, with a separate cluster containing stations 9, 5 and 3. Within the cluster of three stations, there was greater similarity between species composition at stations 9 and 5 (Figure 5.10). Station 3 is least similar to the other stations in the cluster and this station lies furthest offshore. Station 8 has the least similarity to the other trawl stations, and this may be expected due to the shallower waters supporting a slightly different community than the deeper offshore locations and is consistent with the findings of the historical data (Natural Power, 2021). Stations 5 and 9 shared most similarity; Station 5 is further north and slightly inshore and Station 9 was not trawled previously but was included in this Project to provide a trawl station as close to the turbines as possible.
Multivariate analysis of whether community composition differed between stations gave mixed results: ANOSIM found no significant difference between stations (P = 0.054; R = 0.22), whereas PERMANOVA did (P < 0.001; R2 = 0.534; F = 4.716).
Hierarchical clustering (SIMPROF) of eDNA stations, using the fish assay, found three significant clusters. Stations 1 and 2 (within the area occupied by turbines and therefore not sampled by trawls) were clustered separately from the rest. Station 8 (the closest to shore) was more closely related to the other stations outside the turbine development area but was clustered alone. Stations 3 (furthest offshore), 5 (furthest north) and 9 were clustered together (Figure 5.11). This mirrors the hierarchical clustering (SIMPROF) of the trawl stations from the trawl data (Figure 5.10).
When using the fish assay, ANOSIM suggested that community composition in eDNA varies significantly between stations within the array area (Stations 1 and 2) and those outside (Stations 3, 5, 8 and 9) (P = 0.002; R = 0.303), but not between stations overall (P = 0.458; R = 0.005). This is supported by PERMANOVA, with station found not to be a significant predictor (P = 0.544; R2 = 0.097; F = 0.943), whereas location (P = 0.005; R2 = 0.081; F = 3.157) was significant. When using the vertebrate assay, hierarchical clustering resulted in Stations 1 and 2 being clustered together, as were Stations 3 and 5. However, in contrast to the fish assay, Stations 8 and 9 were also clustered together (Appendix B, Table 11.3, Figure 11.7).
In the multivariate analysis using the vertebrate assay, ANOSIM suggested that community composition in eDNA varies significantly between stations within and outside the turbine development area (P = 0.02; R = 0.238), but not between stations (P = 0.260; R = 0.0647). Conversely, PERMANOVA suggests that station is a significant predictor (P = 0.001; R2 = 0.138; F = 2.715), as is location (P < 0.001; R2 = 0.108; F = 8.474).
5.4 Within Turbine Stations
Given the assumptions and caveats surrounding using read counts as a proxy for abundance (see Section 6. Discussion), and different numbers of stations within the development area (Stations 1 & 2) and those outside (all others), data were simplified in an attempt to identify whether species occurring within these two areas differ biologically or ecologically. The mean read count of each species was calculated per area, and then normalized for each area so that the value represents the percentage mean contribution of the read counts of that species.
Table 5.5 shows the relative occurrence of species within each area for the fish assay, which can be used to compare between areas for each species. For example, five-bearded rockling (Ciliata mustela) contributes 0.14% of mean read counts outside the turbine area but 13.26% at stations within it, suggesting higher levels of occurrence within the turbines.
The species which occur in greater relative abundance within the turbines than outside include a variety of bottom-dwelling flat fish, such as flounder (Platichthys flesus), plaice and witch (Glyptocephalus cynoglossus), and other bottom-dwelling fish species which prefer rocky, reefy or sandy habitats, such as goldsinny (Ctenolabrus rupestris), bull rout (Myoxocephalus scorpius), four-bearded rockling (Enchelyopus cimbrius) and Norway bullhead (Micrenophrys lilljeborgii).
There is also a group of pelagic fish which, according to the fish assay, occur in greater relative abundances inside the turbines than outside, including cod, poor cod (Trisopterus minutus), Atlantic salmon (Salmo salar), sea trout (Salmo trutta), herring (Clupea harengus) and sprat (Sprattus sprattus). The predicted habitat type in the turbine area is mud (Figure 5.12); however, many of the species found in greater relative abundances here prefer coarser sediment types. For example, the goldsinny inhabits rocks or algae (MarLIN), Norway bullhead inhabits hard bottoms or algae (GBIF), the four-bearded rockling dwells on muddy sand between patches of hard substrate (Fishbase) and the bull rout is usually found on rocky substrate with sand and mud (MarLIN).
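The percentage mean contribution described above (mean read count per species within an area, normalised so that the values for the area sum to 100%) can be sketched in Python as follows. The read counts are invented and the function name is hypothetical.

```python
def percent_mean_contribution(samples):
    """samples: list of {species: read_count} dicts for one area.
    Returns {species: percentage of the summed mean read counts}."""
    species = set().union(*samples) if samples else set()
    # Mean read count per species across all samples in the area;
    # a species missing from a sample counts as zero reads.
    means = {s: sum(sample.get(s, 0) for sample in samples) / len(samples)
             for s in species}
    total = sum(means.values())
    return {s: 100 * m / total for s, m in means.items()} if total else {}


# Invented read counts for two samples within the array (not Project data).
inside = [
    {"cod": 300, "dab": 100},
    {"cod": 100, "dab": 300, "sprat": 200},
]
contribution = percent_mean_contribution(inside)
print(contribution["cod"], contribution["dab"], contribution["sprat"])  # 40.0 40.0 20.0
```

Averaging before normalising means that areas with different numbers of stations, such as the two stations inside the array and the four outside, are placed on the same scale.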
Seasonal trends inside and outside the turbine stations
Cod were found in greatest abundances in the trawl catches in autumn (Table 5.6). The fish assay shows an overall greater abundance of cod in the winter and a greater percentage mean contribution within the turbines (9.4%) than outside the turbine area (1.42%) (Table 5.6). This coincides with the spawning period for cod in the area, which is January to April (Ellis et al. 2012). Conversely, haddock were found in greater relative abundance outside the turbines in autumn, winter and spring but in greater relative abundance within the turbines in summer (Table 5.6).
Atlantic herring (Clupea harengus) spawning grounds lie close to the BOD site, and herring spawn from August to October in this region (Ellis et al. 2012; Coull et al. 1998). Herring are found at greater relative abundance within the turbines in spring, whilst sprat are found in much greater abundances within the turbines in summer (Table 5.6).
Similar trends are seen in the vertebrate assay results, with cod and poor cod greater at the within-turbine stations than outside (Appendix B, Table 11.4, Table 11.5). However, herring and sprat were not identified to species level in the vertebrate assay. This is also true for the bottom-dwelling species five-bearded rockling, four-bearded rockling, goldsinny and Norway bullhead.
5.5 Application to other Species Groups
Marine mammal data
The following marine mammal species were identified by the vertebrate assay – minke whale (Balaenoptera acutorostrata), white-sided/white-beaked dolphin (Lagenorhynchus species), bottlenose dolphin (Tursiops truncatus), and harbour porpoise (Phocoena phocoena) (Table 5.7).
6 Discussion
Suitability of the eDNA method whilst working at an OWF
This study compared the occurrence of fish species in concurrent water samples for eDNA sequencing and trawls around a commercial offshore wind farm. In terms of method development and suitability whilst working offshore at a commercial wind farm site, eDNA sampling was successfully implemented. A number of lessons learnt have been outlined to improve efficiencies, but none of these were significant enough to impact the completion of the surveys to meet the project aims.
During one of the surveys, a trawl station could not be sampled by trawl due to the presence of static fishing gear; however, it was possible to collect eDNA samples from the trawl start and end locations. Furthermore, eDNA samples were collected from stations within the turbine locations, where it has never been possible to trawl due to the health and safety risk of gear snagging. The method developed during this study can be used to practically sample the fish ecology offshore whilst working around a commercial OWF.
Species occurrence and community composition between methods
eDNA consistently detected a greater number of species compared to traditional methods, including smaller fish species, migratory species, and bottom-dwellers that are not often captured in trawl gear due to biases associated with gear selectivity and limitations in the locations in which trawl fishing can occur.
The most abundant species were consistent across the two methods, as well as being in line with historical site data. This indicates that the eDNA method provides data for the species captured by the traditional method as well as many other species that would not typically be captured by trawl sampling alone.
Despite this, trawls appeared to capture some species not identified by concurrent eDNA samples, including three species of gurnard and a species of elasmobranch. In the case of the gurnard species, this was due to low taxonomic resolution of the assay (as the family level (Triglidae) was identified). Similarly, the cuckoo ray recorded in the trawl survey was identified as either the cuckoo or shagreen ray in the vertebrate assay eDNA data. As it was identified as one of two species of the same genus of skate, the decision was taken to include this in the final dataset as Leucoraja sp. As such, it may be overly precautionary to remove taxa not identified to species level in the eDNA data, and identification to genus can be seen as a suitable step for consideration in the data decontamination process, as with the traditional trawl method, where taxa can occasionally only be identified to genus level (e.g., as Leucoraja sp.). Low abundances of elasmobranchs, and therefore the potentially lower concentration of eDNA produced, may also result in their eDNA not being detected. Fish identification in the field can also be subject to human error, and identification of the juvenile forms of closely related species and/or species that interbreed can be difficult. eDNA would remove this potential source of error.
When read counts from the fish/vertebrate assay and trawl abundance were simplified to a presence/absence metric at each station there was a strong statistical difference in community composition found between the methods, likely due to greater numbers of species detected by eDNA.
Ordination plots indicate that the dissimilarity between the methods is primarily driven by flat fish species being more dominant in the trawls, whilst pelagic species including key forage fish (herring, sprat and Norway pout) were prevalent in the eDNA data. This aligns with known selectivity of the otter trawl gear. It also indicates the ability for eDNA sampling to detect both commercially and ecologically important species.
Seasonal & spatial trends
Both seasonal and spatial patterns in species occurrence and community composition were similar between the trawl and eDNA based sampling methodologies.
Seasonal trends noted in the 2022 trawl data were evident in the eDNA results; for example, peak whiting catches and read counts (using the fish assay) were recorded in autumn and peak dab catches/read counts were recorded in spring/summer (using both the fish and vertebrate assays). Historical pre- and post-construction monitoring at the site found that whiting and dab have contributed to seasonal differences over the entire monitoring period. Overall, there have generally been larger catches of whiting in the summer and of dab in autumn and spring. Although some slight seasonal variations from historic data were noted, these seasons are consecutive and the variations could relate to specific survey timings within the seasonal sampling window. Sampling timings were often dictated by suitable weather windows for trawling; however, the use of eDNA methods alone could alleviate such restrictions by reducing the health and safety limitations of standard survey equipment.
When comparing univariate analysis between the eDNA and trawl methods, station trends in diversity indices were broadly similar. However, seasonal trends in the eDNA data at stations differed from both the 2022 and historic trawl data. For example, at Station 8 this was driven by the greater species diversity in the eDNA data in spring (in both the fish and vertebrate assays). Multivariate analysis showed the same hierarchical clustering and similar ANOSIM results for the 2022 trawl data and eDNA fish assay (for stations sampled by both methods). These results indicate that eDNA methods not only pick up individual species trends but can also be used to calculate ecological diversity metrics and to track seasonal and between-station differences in community composition.
Spatial trends: inside/outside the array area
Utilising eDNA methods allowed the area within the array to be surveyed which is not typically feasible using trawl methods and allowed for an assessment of the species composition around the turbines for the first time. The results indicate species which occur in greater relative abundance within the turbines (compared to outside) include bottom dwelling fish that prefer coarser rocky, reefy or sandy habitats. Given the predicted habitat type in the turbine area is mud, this finding supports the hypothesis made in the original Environmental Statement (NaREC, 2012) that the artificial hard substrate created by turbines may be providing sheltered feeding grounds for fish (such as mature cod and haddock).
Cod occurred in greater relative abundance within the turbines than outside for the vertebrate and fish assays, with the greatest relative abundance in winter, coinciding with their spawning period. Herring were found at greater relative abundance within the turbines in spring, whilst sprat and herring were found in greater abundances within the turbines in summer according to the fish assay. It may be the case that these species are utilising the shelter and food provided by the colonised artificial substrates as nursery and/or feeding grounds. In addition, the timing of the peak in haddock within the turbine area (summer) differing from the peak in cod (winter) could relate to cod being active hunters, preying on a range of species, including haddock (Durant et al. 2020).
Both Atlantic salmon and sea trout are migratory species and were both picked up in the eDNA data. These species are not typically captured using traditional trawl methods due to gear selectivity and nearshore migration routes which do not often interact with trawl fishing areas. Improving the understanding of their oceanic ecology and distribution has been identified as a current knowledge gap (Rikardsen et al. 2021).
There was also a group of pelagic species (from the fish assay) that occur in greater relative abundance inside the turbines, including forage fish species. Herring and sprat are key forage fish species, playing an important role in the marine food chain (Engelhard et al. 2014). These findings indicate that eDNA methods can be used to evidence and potentially assess net positive impacts from offshore wind infrastructure due to the ability to obtain robust samples within the array area.
Other species groups
Four marine mammal species were identified by the vertebrate assay: minke whale, white-sided/white-beaked dolphin, bottlenose dolphin, and harbour porpoise. Information from visual boat-based surveys such as those conducted as part of Blyth OWF’s monitoring programme confirms the presence of these species in the area. Minke whales are known to be seasonally present (late summer/autumn) while harbour porpoises, the most abundant species in the area, are known to be present year-round. The dolphin species also occur year-round but are observed in groups rather than singly (in contrast to harbour porpoise and minke whale).
The vertebrate assay detected the seasonal occurrence of minke whale and demonstrated that harbour porpoise are present all year round. More frequent sampling would be advised for capturing the dolphin species as they travel in groups and could be missed due to the timing of sampling.
Nonetheless, this work provides evidence for use of eDNA for describing the marine mammal fauna of the area.
eDNA data suitability and assay performance
This study provides a proof of concept for the use of eDNA to describe the fish ecology around offshore wind farms; however, there are some limitations to the method. Firstly, the comparison of results between the fish and vertebrate assays, when looking at fish ecology alone, indicates the importance of selecting or designing the most effective assay to deliver the required information on the receptor(s).
The vertebrate assay primers have good efficiency, in that they correspond well with sequences for fish, mammal and bird taxa. However, the region targeted can have the same sequences for multiple fish species, such as haddock, whiting and cod. The fish assay is designed to amplify and differentiate fishes which often results in identifying more fishes with more specific taxonomy assignment than assays designed to target a broader range of taxa such as the vertebrate assay. As such, the fish assay detected more fish species and more fish taxa identified to species level than the broader vertebrate assay.
A perceived limitation of the eDNA method was that read counts have to be used as a proxy for abundance. In this study, fourth root transformation of read counts was used as a proxy for abundance. Despite this initial concern, for the purposes of baseline setting or monitoring around offshore wind farms, this does not seem to be an issue, as univariate and multivariate results from the trawls and eDNA were broadly consistent with each other, other than the greater number of species and therefore greater diversity captured in the eDNA data. Seasonal and spatial differences in community composition are captured effectively using the eDNA method, with the same patterns captured by both eDNA and trawl results.
Whilst these limitations exist, eDNA may provide a better tool for fish ecology assessments as it offers data on a wider range of species (not just commercially important species), including migratory species and species of conservation importance, providing a robust baseline and informing better targeted mitigation. It also allows for sampling within turbines, which cannot be surveyed by trawl. Furthermore, it is a non-destructive sampling method, as opposed to trawling.
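The effect of a fourth root transformation on read counts can be illustrated with a short Python sketch. The read counts are invented and the function name is hypothetical.

```python
def fourth_root(counts):
    """Apply a fourth root transformation to a {taxon: read_count} mapping."""
    return {taxon: n ** 0.25 for taxon, n in counts.items()}


# Invented read counts for one sample (not Project data).
reads = {"sprat": 10000, "dab": 625, "plaice": 16}
transformed = fourth_root(reads)

# The sprat:plaice ratio is 625:1 in raw reads but about 5:1 after transformation.
raw_ratio = reads["sprat"] / reads["plaice"]
new_ratio = transformed["sprat"] / transformed["plaice"]
print(raw_ratio, round(new_ratio, 3))
```

The transformation compresses large values far more than small ones, so highly amplified taxa no longer dominate dissimilarity and diversity calculations while the rank order of taxa within a sample is preserved.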
Survey costs and sampling effort
Replacing traditional survey methods for assessing fish populations around OWFs with eDNA sampling provides greater opportunities to collect the data required as a larger pool of vessels becomes accessible to undertake the survey work. The survey work can also be combined with other site-based activities (e.g., site investigation work). This greatly reduces the costs, resource consumption (e.g., fuel) and risks of delays to surveys and therefore to subsequent consents.
Furthermore, using eDNA removes issues around uncertainty in the data arising from gear selectivity and human error in the misidentification of species. The adoption of the method has the potential for huge benefits to the industry, with more efficient, affordable and scalable consenting and site survey solutions which will speed up the development of OWFs and reduce costs for developers/operators, ultimately reducing the cost of overall energy production.
7 Recommendations
Given the findings of the Project, the authors believe eDNA sampling provides a viable alternative to traditional fish survey methods around OWFs. Regulator acceptance of eDNA for use in offshore baseline setting and monitoring will therefore be a key step towards accelerating and improving environmental monitoring for future offshore wind development. The findings from the Project will be shared with regulators to encourage discussion and provide evidence of the benefits demonstrated.
8 Acknowledgements
The authors would like to thank the Offshore Wind Growth Partnership and EDF Renewables for providing the funding for the Project to go ahead. We would also like to thank Natural Power and NatureMetrics who provided in kind contributions to the Project. We would like to thank the crew of the RV Princess Royal as well as the Natural Power survey team who conducted the offshore survey work. Finally, we would like to thank the NatureMetrics bioinformatics team for undertaking the laboratory analysis.
9 References
10 Document Information
11 Appendices
11.1 Appendix A. eDNA Laboratory Analysis
DNA Extraction
DNA was extracted from each filter using a DNeasy Blood and Tissue Kit (Qiagen) with the modified protocol for disc filters in buffer described in Spens et al., 2017. An extraction blank was processed with each batch of extractions to assess potential contamination in the extraction process. DNA was purified to remove polymerase chain reaction (PCR) inhibitors using a DNeasy PowerClean Pro Cleanup Kit (Qiagen). Purified DNA extracts were quantified using a Qubit 3.0 fluorometer (Thermo Scientific).
DNA Amplification
The 18S ribosomal ribonucleic acid (RNA) (invertebrates) (Capra et al. 2016) and the 12S ribosomal RNA (teleost fish and vertebrates) (Miya et al. 2015; Riaz et al. 2011; Kelly et al. 2014) genes were amplified via a two-step PCR process. In the first step, multiple PCR replicates were performed on each water sample for each assay. PCR positive controls (i.e., a mock community with a known composition of proprietary synthetic sequences that do not match biological records) were included to verify sequence quality and PCR negative controls (i.e., PCR grade water) were included to detect potential cross-contamination. Amplification success was confirmed via gel electrophoresis. Successfully amplified first round PCR replicates were pooled per sample and purified using magnetic beads.
Library Preparation and Sequencing
A sequencing library was prepared from the purified amplicons using a combinatorial dual index approach, following Illumina’s 16S Metagenomic Sequencing Library Preparation protocol. Indexed PCR products were subsequently purified using magnetic beads prior to being quantified, normalised, and pooled in equal volumes. The final pooled library was denatured, diluted and sequenced on an Illumina MiSeq using a V3 600 cycle reagent kit. A PhiX control library (Illumina) was included on each sequencing run to provide a quality control for cluster generation, sequencing, and alignment, and a calibration control for cross-talk matrix generation, phasing, and prephasing.
Bioinformatics
All libraries were processed together for each of the three assays. Sequences were demultiplexed with bcl2fastq and processed via a custom NatureMetrics eDNA analysis pipeline. Paired-end FASTQ reads for each sample were merged with USEARCH (Edgar 2010). Forward and reverse primers were trimmed from the merged sequences using cutadapt (Martin 2011) with a length filter of 80-120 bp. Sequences were quality filtered with USEARCH to retain only those with an expected error rate per base of 0.01 or below and dereplicated by sample, retaining singletons to obtain zero-radius Operational Taxonomic Units (zOTUs). Unique sequences from all samples were denoised in a single analysis with UNOISE (Edgar, 2016).
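The expected-error quality filter can be illustrated in Python. The sketch assumes Phred+33 quality encoding, as used in current Illumina FASTQ files, and the standard conversion of a Phred score Q to an error probability of 10^(-Q/10). The function names are hypothetical and this is not the NatureMetrics pipeline code.

```python
def expected_error_rate(quality, offset=33):
    """Mean per-base error probability for a FASTQ quality string.
    Each character encodes a Phred score Q = ord(char) - offset (assumed
    Phred+33), and the error probability for that base is 10 ** (-Q / 10)."""
    probabilities = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return sum(probabilities) / len(probabilities)


def passes_filter(quality, max_rate=0.01):
    """Keep a read only if its expected error rate per base is at or below max_rate."""
    return bool(quality) and expected_error_rate(quality) <= max_rate


# Invented quality strings: 'I' encodes Q40 and '+' encodes Q10 under Phred+33.
high_quality = "I" * 100
low_quality = "+" * 100
print(passes_filter(high_quality), passes_filter(low_quality))  # True False
```

Filtering on the expected error rate uses every base in the read, so a read with a few poor bases can still pass if the remainder is of high quality.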
Consensus taxonomic assignments were made for each zOTU using sequence similarity searches against the NCBI nucleotide (NCBI nt) reference. Searches against databases were made using blastn (Altschul et al. 1990; Camacho et al. 2009) and required hits to have a minimum e-score of 1e-20 and cover at least 90% of the query sequence. The taxonomic identification associated with all hits was converted to match the GBIF taxonomic backbone.
Assignments were made to the lowest possible taxonomic level where there was consistency in the matches, with minimum similarity thresholds of 98%, 95% and 92% for species, genus, and higher-level assignments respectively. Identifications were sense-checked against GBIF occurrence records for presence in the UK and elevated to higher taxonomic levels where required (rgbif; Chamberlain et al., 2022).
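One way to picture the threshold-based assignment is the Python sketch below. The consensus rule used here (all hits at or above the threshold for a rank must agree on the name at that rank) and the use of family as the only higher level are assumptions made for illustration; the actual NatureMetrics consensus logic is not described in this report. The hit identities are invented.

```python
# (rank, minimum percent similarity), following the thresholds quoted above.
THRESHOLDS = (("species", 98.0), ("genus", 95.0), ("family", 92.0))


def consensus_assignment(hits):
    """hits: list of dicts with 'identity' (percent) plus 'species', 'genus'
    and 'family' names. Returns (rank, name) for the lowest rank at which
    all qualifying hits agree, or None if no rank qualifies."""
    for rank, threshold in THRESHOLDS:
        names = {h[rank] for h in hits if h["identity"] >= threshold}
        if len(names) == 1:  # assumed rule: a single consistent name is required
            return rank, names.pop()
    return None


# Invented hits for one zOTU: two skates of the same genus match almost equally well.
hits = [
    {"identity": 99.0, "species": "Leucoraja naevus",
     "genus": "Leucoraja", "family": "Rajidae"},
    {"identity": 98.5, "species": "Leucoraja fullonica",
     "genus": "Leucoraja", "family": "Rajidae"},
]
print(consensus_assignment(hits))  # ('genus', 'Leucoraja')
```

Under this assumed rule, a sequence that matches two congeneric species equally well is reported at genus level, which is the situation described for the cuckoo and shagreen rays in the Discussion.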
zOTUs were clustered at 97% similarity with USEARCH to obtain OTUs. An OTU-by-sample table was generated by mapping all dereplicated reads for each sample to the OTU representative sequences with USEARCH at an identity threshold of 97%.
All OTUs with species-level identifications were queried against the IUCN Red List (rredlist; Chamberlain 2018) to obtain global threat status and the Global Register of Introduced and Invasive Species (GRIIS) to obtain their invasive status in the UK. The OTU table was filtered to remove low abundance OTUs from each sample (<0.02% or <10 reads, whichever is the greater threshold for the sample). Unassigned OTUs, and OTUs identified to human and domesticated mammals, were removed from the dataset for subsequent analyses.
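The per-sample low abundance filter can be sketched in Python as follows. The threshold for each sample is taken as the greater of 0.02% of that sample's total reads and 10 reads, as stated above; the OTU names and read counts are invented and the function name is hypothetical.

```python
def filter_low_abundance(sample_reads, min_fraction=0.0002, min_reads=10):
    """Remove OTUs below the per-sample threshold, defined as the greater of
    min_fraction of the sample's total reads and min_reads."""
    total = sum(sample_reads.values())
    threshold = max(min_fraction * total, min_reads)
    return {otu: n for otu, n in sample_reads.items() if n >= threshold}


# Invented samples (not Project data).
deep_sample = {"OTU_1": 90000, "OTU_2": 9975, "OTU_3": 15, "OTU_4": 10}  # 100,000 reads
shallow_sample = {"OTU_1": 985, "OTU_2": 10, "OTU_3": 5}                  # 1,000 reads

# Deep sample: 0.02% of 100,000 is 20 reads, so OTU_3 and OTU_4 are removed.
print(sorted(filter_low_abundance(deep_sample)))
# Shallow sample: 0.02% of 1,000 is 0.2 reads, so the 10-read floor applies.
print(sorted(filter_low_abundance(shallow_sample)))
```

Using the greater of the two thresholds means the proportional cut-off governs deeply sequenced samples while the fixed 10-read floor governs shallow ones.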