Feral Cat Bait Analysis

Investigating the longevity of baits and the response of cats and other species when encountering the bait

Justin G Cally https://justincally.github.io/blog/
2022-10-10

Setup

Load useful packages and some data for the analysis

Show code
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  cache = FALSE,
  message = FALSE, 
  warning = FALSE, 
  tidy.opts = list(width.cutoff = 60), tidy = TRUE
)
library(tidyverse)
library(survival)
library(tidycmprsk)
library(survminer)
library(ggsurvfit)
library(data.table)
library(brms)
library(survminer)
library(ggplot2)
library(kableExtra)
require(AICcmodavg) # has the aictab function
library(adjustedCurves)
library(patchwork)

read_excel_allsheets <- function(filename, tibble = FALSE) {
    sheets <- readxl::excel_sheets(filename)
    x <- lapply(sheets, function(X) readxl::read_excel(filename, sheet = X))
    if(!tibble) x <- lapply(x, as.data.frame)
    names(x) <- sheets
    x
}

bait_state <- read_csv("data/Bait_Attractiveness.csv") %>%
  mutate(`24-hour precipitation (sqrt)` = sqrt(prev24), 
         `24-hour temperature` = prev24temp, 
         Location = factor(Location)) %>%

  # mutate(Season = as.integer(factor(Season)),
  #        Habitat = factor(Habitat),
  #        Location = factor(Location)) %>%
  as.data.frame()

bait_state <- read_csv("data/Bait_Attractiveness_time_varying.csv") %>%
  mutate(`24-hour precipitation (sqrt)` = sqrt(rain24), 
         `24-hour temperature` = temp24, 
         `Cumulative Rain (sqrt)` = sqrt(cumrain),
         Location = factor(Location)) %>%
  group_by(BaitID) %>%
  arrange(time1) %>%
  mutate(cumtemp = cumsum(temp24)) %>%
  ungroup() %>%
  as.data.frame() 

delwp_cols <- c(Teal = "#00B2A9",
                Navy = "#201547",
                Environment = "#CEDC00",
                `Climate Change` = "#FDDA24",
                Energy = "#0072CE",
                Water = "#71C5E8",
                Planning = "#642667",
                Infrastructure = "#AF272F",
                FFR = "#E57200",
                Corporate = "#201547")

Bait Fate Analysis

We investigate the effect of several environmental covariates on the survivorship of feral cat baits. For this analysis we use a Cox-proportional hazard regression model that accounts for various longevity of the bait (Andersen and Gill 1982). Models were implemented using the survival R package (Therneau 2022; Terry M. Therneau and Patricia M. Grambsch 2000). We compared model fits for several candidate models using second-order Akaike’s information criterion (AICc) with the AICcmodavg R package (Mazerolle 2020).

We have various deployments across sites and different seasons. Given that the experimental design has not been set-up to look at the crossed effects we should really only use one categorical group: Location

Given, that sample sizes are most balanced for region (GLCP, HKNP and WPNP) we will exclusively use this group for the analysis. However, it should be kept in mind that the regional effects may be confounded by season and habitat types. Based on ‘location’, rainfall and temperature, we use AIC-based model selection to compare and select a model to use in the analysis.

Show code
Temp <- bait_state$`24-hour temperature`
Precip <- bait_state$`cumrain`
par(mfrow = c(1,2))
hist(Temp, breaks = 10)
hist(Precip, breaks = 10)

Show code
# time varying model
baitcrr <- coxph(Surv(time2, time2 = time1, status) ~ `24-hour temperature` 
                 + `Cumulative Rain (sqrt)` 
                 + Location, data = bait_state, x = TRUE)

baitcrr1 <- coxph(Surv(time2, time2 = time1, status) ~ Location, data = bait_state)

baitcrr2 <- coxph(Surv(time2, time2 = time1, status) ~ `24-hour temperature` 
                 + `Cumulative Rain (sqrt)`, data = bait_state)

baitcrr3 <- coxph(Surv(time2, time2 = time1, status) ~ `Cumulative Rain (sqrt)` 
                 + Location, data = bait_state)

baitcrrNull <- coxph(Surv(time2, time2 = time1, status) ~ 1, data = bait_state)

model_list <- list(baitcrr, baitcrr1, baitcrr2, baitcrr3, baitcrrNull)
names(model_list) <- 1:5

# Compare models with AIC table
aic_table <- aictab(model_list)

# Add a column to the table with variable names
model_vars <- sapply(model_list, function(x) gsub(":", "*", paste(names(x$means), collapse = " + ")))
aic_table <- cbind(Modnames = names(model_vars), model_vars) %>%
  merge(aic_table, by="Modnames") %>%
  arrange(AICc)

aic_table %>% 
  kbl("html", caption = "Model comparison of models", digits = 2) %>%
  kable_styling(bootstrap_options = "striped")
Table 1: Model comparison of models
Modnames model_vars K AICc Delta_AICc ModelLik AICcWt LL Cum.Wt
1 24-hour temperature + Cumulative Rain (sqrt) + LocationHKNP + LocationWPNP 4 430.93 0.00 1.00 0.73 -211.45 0.73
4 Cumulative Rain (sqrt) + LocationHKNP + LocationWPNP 3 432.90 1.97 0.37 0.27 -213.44 1.00
2 LocationHKNP + LocationWPNP 2 471.54 40.61 0.00 0.00 -233.76 1.00
5 0 484.63 53.70 0.00 0.00 -242.32 1.00
3 24-hour temperature + Cumulative Rain (sqrt) 2 485.49 54.56 0.00 0.00 -240.74 1.00
Show code
lapply(model_list[1:4], cox.zph)
#> $`1`
#>                          chisq df       p
#> `24-hour temperature`     17.7  1 2.6e-05
#> `Cumulative Rain (sqrt)`  32.3  1 1.3e-08
#> Location                  45.8  2 1.1e-10
#> GLOBAL                    55.1  4 3.2e-11
#> 
#> $`2`
#>          chisq df       p
#> Location  32.3  2 9.7e-08
#> GLOBAL    32.3  2 9.7e-08
#> 
#> $`3`
#>                          chisq df       p
#> `24-hour temperature`     13.5  1 0.00024
#> `Cumulative Rain (sqrt)`  17.5  1 2.8e-05
#> GLOBAL                    22.5  2 1.3e-05
#> 
#> $`4`
#>                          chisq df       p
#> `Cumulative Rain (sqrt)`  21.2  1 4.2e-06
#> Location                  29.6  2 3.8e-07
#> GLOBAL                    29.8  3 1.5e-06

Based on the above checks, no models meet the assumptions of a cox-proportional hazards model. We can try and circumvent this issue by stratifying the location:

Show code
baitcrrS <- coxph(Surv(time2, time2 = time1, status) ~ `24-hour temperature` 
                  + `Cumulative Rain (sqrt)`  
                  + strata(Location), data = bait_state)

summary(baitcrrS)
#> Call:
#> coxph(formula = Surv(time2, time2 = time1, status) ~ `24-hour temperature` + 
#>     `Cumulative Rain (sqrt)` + strata(Location), data = bait_state)
#> 
#>   n= 1187, number of events= 68 
#> 
#>                                coef  exp(coef)   se(coef)  z Pr(>|z|)
#> `24-hour temperature`    -6.597e-02  9.362e-01  3.759e+06  0        1
#> `Cumulative Rain (sqrt)`         NA         NA  0.000e+00 NA       NA
#> 
#>                          exp(coef) exp(-coef) lower .95 upper .95
#> `24-hour temperature`       0.9362      1.068         0       Inf
#> `Cumulative Rain (sqrt)`        NA         NA        NA        NA
#> 
#> Concordance= 0.5  (se = 0 )
#> Likelihood ratio test= 0  on 1 df,   p=1
#> Wald test            = 0  on 1 df,   p=1
#> Score (logrank) test = 0  on 1 df,   p=1
Show code

cox.zph(baitcrrS)
#>                          chisq df p
#> `24-hour temperature` 4.54e-16  1 1
#> GLOBAL                4.54e-16  1 1

Based on the stratification it looks like the model has issues in estimate coeffieiceients for more than one parameter. It does not seem to be the solution we are after.

To circumvent this issue we are going to try a different approach

Bait fate analysis using logistic regression

Given the issues with the above approach in survival analysis and the violation of the cox-proportional hazards test it is not feasible to use a cox-proportional hazard model to evaluate the response of baits to various environmental factors. Another option is to use logistic regression with the time-varying components. This approach models the likelihood of a bait becoming unattractive as a function of environmental and other covariates for each given bait at each period the bait is still uncensored. For a vignette on the topic on how to apply this approach in brms see:

https://bookdown.org/content/ef0b28f7-8bdf-4ba7-ae2c-bc2b1f012283/extending-the-discrete-time-hazard-model.html#time-varying-predictors-1

We can assess the correlations between predictors with the following checks:

Show code
panel.cor <- function(x, y, digits=2, prefix="", cex.cor, ...)
{
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- abs(cor(x, y, use="pairwise.complete.obs"))
  txt <- format(c(r, 0.123456789), digits=digits)[1]
  txt <- paste0(prefix, txt)
  if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
  text(0.5, 0.5, txt, cex = cex.cor * r)
}

baitselectdata <- bait_state %>%
  dplyr::select(cumtemp, cumrain, time1, Location, rain24, temp24)

pairs(baitselectdata, lower.panel=panel.smooth, upper.panel=panel.cor)

Based on this we should not use ‘cumtemp’ and ‘time1’ (days) in the same model. We should also avoid ‘rain24’ and ‘cumrain’. We run several select models and compare the models using leave-one-out cross validation (LOO-CV) to select the best performing model (Vehtari et al. 2020; Vehtari, Gelman, and Gabry 2017). Models were implemented using Bayesian methods with the brms R package (Bürkner 2017, 2018, 2021). We used a weakly informative normal prior with a standard deviation of 4. 4 chains were run in parallel with 1,000 warmup iterations and 2,000 sampling iterations.

Show code
brms_bait_survival_model <- brm(data = bait_state,
      family = bernoulli(link = "logit"),
      formula = status ~ scale(cumtemp) + scale(sqrt(cumrain)) + Location  + (1|BaitID),
      prior(normal(0, 4), class = b),
      chains = 4, cores = 4, iter = 2000, warmup = 1000,
      seed = 12)

brms_bait_survival_model2 <- brm(data = bait_state,
      family = bernoulli(link = "logit"),
      formula = status ~ scale(cumtemp) + scale(sqrt(cumrain)) + (1|Location)  + (1|BaitID),
      prior(normal(0, 4), class = b),
      chains = 4, cores = 4, iter = 2000, warmup = 1000,
      seed = 12)

brms_bait_survival_model3 <- brm(data = bait_state,
      family = bernoulli(link = "logit"),
      formula = status ~ scale(time1) + scale(sqrt(cumrain)) + Location  + (1|BaitID),
      prior(normal(0, 4), class = b),
      chains = 4, cores = 4, iter = 2000, warmup = 1000,
      seed = 12)

brms_bait_survival_model4 <- brm(data = bait_state,
      family = bernoulli(link = "logit"),
      formula = status ~ scale(temp24) + scale(sqrt(cumrain)) + scale(time1) + Location  + (1|BaitID),
      prior(normal(0, 4), class = b),
      chains = 4, cores = 4, iter = 2000, warmup = 1000,
      seed = 12)

brms_bait_survival_model5 <- brm(data = bait_state,
      family = bernoulli(link = "logit"),
      formula = status ~ Location*scale(time1)  + (1|BaitID),
      prior(normal(0, 4), class = b),
      chains = 4, cores = 4, iter = 2000, warmup = 1000,
      seed = 12)

brms_bait_survival_model6 <- brm(data = bait_state,
      family = bernoulli(link = "logit"),
      formula = status ~ Location*scale(time1) + scale(temp24) + scale(sqrt(rain24)) + (1|BaitID),
      prior(normal(0, 4), class = b),
      chains = 4, cores = 4, iter = 2000, warmup = 1000,
      seed = 12)

brms_bait_survival_modelHS <- brm(data = bait_state,
      family = bernoulli(link = "logit"),
      formula = status ~ Location*scale(time1) + scale(temp24) + scale(sqrt(rain24)) + (1|BaitID),
      prior(horseshoe(df = 2, par_ratio = 1), class = b),
      chains = 4, cores = 4, iter = 2000, warmup = 1000,
      seed = 12)

brms_bait_survival_modelNull <- brm(data = bait_state,
      family = bernoulli(link = "logit"),
      formula = status ~ 1,
      chains = 4, cores = 4, iter = 2000, warmup = 1000,
      seed = 12)

model_list <- list(brms_bait_survival_model, 
            brms_bait_survival_model2, 
            brms_bait_survival_model3, 
            brms_bait_survival_model4, 
            brms_bait_survival_model5, 
            brms_bait_survival_model6, 
            brms_bait_survival_modelHS,
            brms_bait_survival_modelNull)

loo_compare <- loo_compare(loo(brms_bait_survival_model), 
            loo(brms_bait_survival_model2), 
            loo(brms_bait_survival_model3), 
            loo(brms_bait_survival_model4), 
            loo(brms_bait_survival_model5), 
            loo(brms_bait_survival_model6), 
            loo(brms_bait_survival_modelHS),
            loo(brms_bait_survival_modelNull))

saveRDS(loo_compare, "models/loo_compare.rds")
saveRDS(brms_bait_survival_model6, "models/brms_bait_survival_model6.rds", compress = "xz")
Show code
loo_compare <- readRDS("models/loo_compare.rds")
loo_compare
#>                              elpd_diff se_diff
#> brms_bait_survival_model6       0.0       0.0 
#> brms_bait_survival_modelHS     -2.0       1.2 
#> brms_bait_survival_model5      -2.8       3.2 
#> brms_bait_survival_model3      -7.0       6.6 
#> brms_bait_survival_model4      -8.0       6.5 
#> brms_bait_survival_model2     -10.9       7.8 
#> brms_bait_survival_model      -11.1       7.6 
#> brms_bait_survival_modelNull -117.7      17.7
Show code
brms_bait_survival_model6 <- readRDS("models/brms_bait_survival_model6.rds")
bait_survival_top_model <- brms_bait_survival_model6

Bait survival evaluation

Show code
bayesplot::mcmc_trace(bait_survival_top_model, facet_args = list(ncol = 2), regex_pars = c("time1", "Location", "temp24", "rain24", "lp"))

Bait survival results

Bait survival was primarily impacted by the location of the bait station and the time the bait was left out for; with some slight evidence suggesting rainfall may improve bait survival (reduce the rate of bait decay). The effect of location on bait survival suggests that there is unexplained variation driving bait decay; with GLCP appearing to have faster rates of decay.

Show code
summary(bait_survival_top_model)
#>  Family: bernoulli 
#>   Links: mu = logit 
#> Formula: status ~ Location * scale(time1) + scale(temp24) + scale(sqrt(rain24)) + (1 | BaitID) 
#>    Data: bait_state (Number of observations: 1187) 
#>   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#>          total post-warmup draws = 4000
#> 
#> Group-Level Effects: 
#> ~BaitID (Number of levels: 117) 
#>               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
#> sd(Intercept)     0.37      0.29     0.02     1.11 1.00     1052
#>               Tail_ESS
#> sd(Intercept)     1509
#> 
#> Population-Level Effects: 
#>                         Estimate Est.Error l-95% CI u-95% CI Rhat
#> Intercept                  -6.88      1.23    -9.49    -4.72 1.00
#> LocationHKNP               -4.21      2.17    -8.63    -0.11 1.00
#> LocationWPNP                2.77      1.29     0.36     5.35 1.00
#> scaletime1                  6.36      1.31     3.95     9.04 1.00
#> scaletemp24                -0.04      0.26    -0.57     0.46 1.00
#> scalesqrtrain24            -0.45      0.21    -0.87    -0.05 1.00
#> LocationHKNP:scaletime1    -0.37      1.53    -3.48     2.56 1.00
#> LocationWPNP:scaletime1    -4.28      1.31    -6.91    -1.82 1.00
#>                         Bulk_ESS Tail_ESS
#> Intercept                   1653     2259
#> LocationHKNP                2019     2231
#> LocationWPNP                1782     2379
#> scaletime1                  1630     2039
#> scaletemp24                 2851     3091
#> scalesqrtrain24             3734     3085
#> LocationHKNP:scaletime1     1627     2293
#> LocationWPNP:scaletime1     1634     2104
#> 
#> Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
Show code
bait_aggregated <-
  bait_state %>% 
  mutate(event = case_when(status == 1 ~ "event",
                           TRUE ~ "no_event")) %>% 
  group_by(Location) %>% 
  count(event, temp24, rain24, time1) %>% 
  ungroup() %>% 
  complete(Location, event, temp24, rain24, time1) %>% 
  pivot_wider(names_from = event,
              values_from = n) %>% 
  mutate(total = event + no_event) %>% 
  mutate(proportion = event / total) %>% 
  mutate(logit = log(proportion / (1 - proportion)))
Show code
scenarios <- data.frame(cond = c("Hot and Dry", 
                                 "Hot and Wet", 
                                 "Hot and Mid-precipitation",
                                 "Cold and Dry", 
                                 "Cold and Wet", 
                                 "Cold and Mid-precipitation",
                                 "Mid-temp and Dry", 
                                 "Mid-temp and Wet", 
                                 "Mid-temp and Mid-precipitation"), 
                        q1 = c(0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5), 
                        q2 = c(0.1, 0.9, 0.5, 0.1, 0.9, 0.5, 0.1, 0.9, 0.5))


newdata_env <- list()

for(i in 1:nrow(scenarios)) {
newdata_env[[i]] <- expand.grid(Type = scenarios$cond[i], 
                           rain24 = quantile(bait_state$rain24, scenarios$q1[i]), 
                           temp24 = quantile(bait_state$temp24, scenarios$q2[i]), 
                           Location = unique(bait_state$Location), 
                           time1 = as.numeric(1:14), 
                           # status = 1, 
                           Title = paste0("Temp = ", round(quantile(Temp, scenarios$q1[i]),1), "°C, Precip = ", round(quantile(Precip, scenarios$q2[i]),1), "mm")) %>%
  mutate(time2 = time1 - 1)

}

newdata_all <- bind_rows(newdata_env)

preds_bs <- cbind(newdata_all, predict(bait_survival_top_model, newdata = newdata_all, re_formula = NA)) %>%
  rowwise() %>%
  mutate(LCI = max(c(Estimate - Est.Error, 0)), 
         UCI = min(c(Estimate + Est.Error, 1))) %>%
  ungroup() %>%
  group_by(Location, time1, Type, Title) %>% 
  mutate_at(vars(Estimate, LCI, UCI), .funs = ~cumprod(1 - .))

ggplot(data = preds_bs) +
  geom_ribbon(aes(x = time1, ymin = LCI, ymax = UCI, fill = Location), alpha = 0.5) +
  geom_line(aes(x = time1, y = Estimate, colour = Location)) + 
  scale_fill_manual(values = unname(delwp_cols[1:3])) + 
  scale_colour_manual(values = unname(delwp_cols[1:3])) + 
  facet_wrap(~ Type + Title, nrow = 3) +
  scale_x_continuous(breaks = seq(0,14, by = 2), expand = expansion(0, 0), name = "Days") +
  ylab("Survival probability") +
  theme_bw() 
Survival of bits under different possible scenarios. Scenarios were derived from the 0.9, 0.5 and 0.1 quantiles of the data for temperature and precipitation values. Lines are the mean prediction and shaded area represents ± 1 standard error

Figure 1: Survival of bits under different possible scenarios. Scenarios were derived from the 0.9, 0.5 and 0.1 quantiles of the data for temperature and precipitation values. Lines are the mean prediction and shaded area represents ± 1 standard error

Based on the above scenarios we can generate the following table that shows the day when the probability of bait survival drops below 0.5.Note that in some cases survival probability does not drop below 0.5 within the 14 days.

Show code
preds_bs %>%
  group_by(Type, Location) %>%
  arrange(desc(Estimate)) %>%
  filter(Estimate <= 0.5) %>%
  slice(1) %>%
  select(Type, Conditions = Title, Location, Time = time1, `Survival Probability` = Estimate, `Survival Probability SE` = Est.Error) %>%
  kbl("html", digits = 2, caption = "Time at which bait survival probability falls below 0.5") %>%
  kable_styling(bootstrap_options = "striped") %>%
  kableExtra::collapse_rows(columns = 1:2)
Table 2: Time at which bait survival probability falls below 0.5
Type Conditions Location Time Survival Probability Survival Probability SE
Hot and Dry Temp = 20.2°C, Precip = 1.2mm GLCP 11 0.25 0.43
HKNP 14 0.21 0.40
WPNP 14 0.49 0.50
Hot and Wet Temp = 20.2°C, Precip = 15.2mm GLCP 11 0.27 0.44
HKNP 14 0.23 0.42
Hot and Mid-precipitation Temp = 20.2°C, Precip = 3.6mm GLCP 11 0.26 0.44
HKNP 14 0.21 0.41
Cold and Dry Temp = 13.5°C, Precip = 1.2mm GLCP 10 0.46 0.50
HKNP 13 0.39 0.49
WPNP 13 0.46 0.50
Cold and Wet Temp = 13.5°C, Precip = 15.2mm GLCP 10 0.48 0.50
HKNP 13 0.41 0.49
WPNP 13 0.49 0.50
Cold and Mid-precipitation Temp = 13.5°C, Precip = 3.6mm GLCP 10 0.47 0.50
HKNP 13 0.38 0.49
WPNP 13 0.47 0.50
Mid-temp and Dry Temp = 16.8°C, Precip = 1.2mm GLCP 11 0.17 0.37
HKNP 13 0.43 0.50
WPNP 14 0.39 0.49
Mid-temp and Wet Temp = 16.8°C, Precip = 15.2mm GLCP 11 0.20 0.40
HKNP 13 0.46 0.50
WPNP 14 0.40 0.49
Mid-temp and Mid-precipitation Temp = 16.8°C, Precip = 3.6mm GLCP 11 0.18 0.39
HKNP 13 0.45 0.50
WPNP 14 0.39 0.49

If we predict survival rates for mean levels of various environmental predictors (rainfall and temperature) we can obtain more general survival curves:

Show code
f <-
  fitted(bait_survival_top_model,
         newdata = newdata_all, re_formula = NA) %>% 
  data.frame() %>% 
  bind_cols(newdata_all)

# hazard (top panel)
p1 <-
  f %>% 
  group_by(Location, time1) %>% 
  summarise_all(mean) %>%
  ggplot(aes(x = time1, group = Location, color = Location, fill = Location)) +
  geom_ribbon(aes(ymin = Q2.5, ymax = Q97.5),
              alpha = 0.5, size = 0) +
  geom_line(aes(y = Estimate)) +
  scale_y_continuous("fitted hazard")

# survival (bottom panel)
p2 <-
  f %>%
  group_by(Location, time1) %>% 
  mutate_at(vars(Estimate, Q2.5, Q97.5), .funs = ~cumprod(1 - .)) %>% 
  summarise_all(mean) %>%
  ggplot(aes(x = time1, group = Location, color = Location, fill = Location)) +
  geom_ribbon(aes(ymin = Q2.5, ymax = Q97.5),
              alpha = 0.5, size = 0) +
  geom_line(aes(y = Estimate)) +
    geom_hline(yintercept = .5, color = "red", linetype = "dashed") +
  scale_y_continuous("fitted survival probability", limits = c(0, 1)) +
  theme(strip.background.x = element_blank(),
        strip.text.x = element_blank())

(
  (p1 / p2) & 
  scale_fill_manual(values = unname(delwp_cols[1:3])) & 
  scale_colour_manual(values = unname(delwp_cols[1:3])) & 
  theme_bw() &
  scale_x_continuous(breaks = seq(0,14, by = 2), expand = expansion(0, 0), name = "Days")
) +
  plot_layout(guides = "collect")
Hazard rates and survival probability of fitted models

Figure 2: Hazard rates and survival probability of fitted models

Bait encounter analysis

Due to the data structures provided a decent amount of data wrangling was needed in order to format the data into a suitable long format for binomial regression. The aim of this analysis is to determine the probability of a cat (or other animal) taking a bait, having encountered it.

Show code
melt_taken <- function(x, region, round) {
  df <- data.table::as.data.table(x) %>%
    melt(id.vars = "Station", variable.name = "Date") %>% 
    mutate(Date = as.Date(Date)) %>%
    group_by(Station) %>%
    arrange(Date) %>%
    tidyr::fill(Date, .direction = "down") %>%
    mutate(value = coalesce(value, "BP")) %>%
    filter(value != "BA") %>%
    mutate(StartDate = min(Date), 
           EndDate = max(Date)) %>%
    slice_tail(n = 1) %>%
    ungroup() %>%
    mutate(Date = as.Date(Date)) %>%
    mutate(Taken = case_when(value == "BP" ~ 0, 
                             TRUE ~ 1)) %>%
    rename(Species = value)
  
  df$Region <- region
  df$Round <- round
  
  return(df)
}

observations <- list()
takes <- list()

# BD
BD_data_R1 <- readxl::read_excel("data/FateData/BD_record_table_15min_BaitFate.xlsx", sheet = 1) %>%
  mutate(Region = "BD", 
         Round = 1)
BD_data_R2 <- readxl::read_excel("data/FateData/BD_record_table_15min_BaitFate.xlsx", sheet = 2) %>%
    mutate(Region = "BD", 
         Round = 2)
BD_taken_matrix <- openxlsx::read.xlsx("data/FateData/BD_record_table_15min_BaitFate.xlsx", sheet = 3, detectDates = T)
BD_take_R1 <- BD_taken_matrix[2:nrow(BD_taken_matrix),c(1,which(BD_taken_matrix[1,] == "R1"))]
BD_take_R2 <- BD_taken_matrix[2:nrow(BD_taken_matrix),c(1,which(BD_taken_matrix[1,] == "R2"))]
observations[["BD"]] <- bind_rows(BD_data_R1, BD_data_R2)

melt_BD_take_R1 <- melt_taken(BD_take_R1, region = "BD", round = 1)
melt_BD_take_R2 <- melt_taken(BD_take_R2, region = "BD", round = 2)
takes[["BD"]] <- bind_rows(melt_BD_take_R1, melt_BD_take_R2)

# BM 
BM_data_R1 <- readxl::read_excel("data/FateData/BM_record_table_15min_BaitFate.xlsx", sheet = 1)[,1:5] %>%
  mutate(Region = "BM", 
         Round = 1)

BM_taken_matrix <- openxlsx::read.xlsx("data/FateData/BM_record_table_15min_BaitFate.xlsx", sheet = 2, detectDates = T) %>%
  rename(Station = CamID)

observations[["BM"]] <- BM_data_R1

takes[["BM"]] <- melt_taken(BM_taken_matrix, region = "BM", round = 1)

#GLCP 
GLCP_data_R1 <- readxl::read_excel("data/FateData/GLCP_record_table_15min_BaitFate.xlsx", sheet = 1, 
                                   col_types = c("numeric", "text", "text", "date", "date"))[,1:5] %>%
  mutate(Region = "GLCP", 
         Round = 1, 
         Station = paste0("GLCP", Station))

GLCP_taken_matrix <- openxlsx::read.xlsx("data/FateData/GLCP_record_table_15min_BaitFate.xlsx", sheet = "BaitFate", detectDates = T) %>%
  rename(Station = `Site.ID`)

observations[["GLCP"]] <- GLCP_data_R1

takes[["GLCP"]] <- melt_taken(GLCP_taken_matrix, region = "GLCP", round = 1)

#HKNP
HKNP_data_R1 <- readxl::read_excel("data/FateData/HKNP_record_table_15min_BaitFate_2.xlsx", sheet = 1) %>%
  mutate(Region = "HKNP", 
         Round = 1)
HKNP_data_R2 <- readxl::read_excel("data/FateData/HKNP_record_table_15min_BaitFate_2.xlsx", sheet = 3) %>%
  mutate(Region = "HKNP", 
         Round = 2)

HKNP_take_R1 <- openxlsx::read.xlsx("data/FateData/HKNP_record_table_15min_BaitFate_2.xlsx", sheet = 2, detectDates = T)

HKNP_take_R2 <- openxlsx::read.xlsx("data/FateData/HKNP_record_table_15min_BaitFate_2.xlsx", sheet = 4, detectDates = T) %>%
  rename(Station = `Row.Labels`)

observations[["HKNP"]] <- bind_rows(HKNP_data_R1, HKNP_data_R2)

melt_HKNP_take_R1 <- melt_taken(HKNP_take_R1, region = "HKNP", round = 1)
melt_HKNP_take_R2 <- melt_taken(HKNP_take_R1, region = "HKNP", round = 2)
takes[["HKNP"]] <- bind_rows(melt_HKNP_take_R1, melt_HKNP_take_R2)

#WPNP
WPNP_data_R1 <- readxl::read_excel("data/FateData/WPNP_record_table_15min_BaitFate.xlsx", sheet = 1)[,1:5] %>%
  mutate(Region = "WPNP", 
         Round = 1, 
         Station = as.character(Station))

WPNP_taken_matrix <- openxlsx::read.xlsx("data/FateData/WPNP_record_table_15min_BaitFate.xlsx", sheet = "BaitFate", detectDates = T) 

observations[["WPNP"]] <- WPNP_data_R1

takes[["WPNP"]] <- melt_taken(WPNP_taken_matrix, region = "WPNP", round = 1) %>%
  mutate(Station = as.character(Station))

# TA Data 
TA_data <- readxl::read_excel("data/FateData/TA_record_table_15min_BaitFate.xlsx", sheet = 1)[,1:5] %>%
  mutate(Region = "TA", 
         Round = as.numeric(substr(Round,2,2)), 
         Station = as.character(Station), 
         Date = as.Date(DateTimeOriginal))

observations[["TA"]] <- TA_data

TA_taken_matrix <- openxlsx::read.xlsx("data/FateData/TA_record_table_15min_BaitFate.xlsx", sheet = 2, detectDates = T) %>%
  rename(Station = Date)
TA_take_R1 <- TA_taken_matrix[2:nrow(TA_taken_matrix),c(1,which(TA_taken_matrix[1,] == "R1"))]
TA_take_R2 <- TA_taken_matrix[2:nrow(TA_taken_matrix),c(1,which(TA_taken_matrix[1,] == "R2"))]


melt_TA_take_R1 <- melt_taken(TA_take_R1, region = "TA", round = 1)
melt_TA_take_R2 <- melt_taken(TA_take_R2, region = "TA", round = 2)
takes[["TA"]] <- bind_rows(melt_TA_take_R1, melt_TA_take_R2)

##############

observations_joined <- bind_rows(observations)
taken_joined <- bind_rows(takes)

unique_sp_df <- list()
unique_sp_df[["unique_obs_species"]] <- data.frame(`NameInData` = unique(observations_joined$Species) %>% sort(), `ConsistentStandardisedSpeciesName` = NA, GroupLabel = NA)
unique_sp_df[["unique_taken_species"]] <- data.frame(`NameInData` = unique(taken_joined$Species) %>% sort(), `ConsistentStandardisedSpeciesName` = NA, GroupLabel = NA)

# openxlsx::write.xlsx(unique_sp_df, "data/unique_sp_df.xlsx", overwrite = T)
Show code
matched_sp_data <- readxl::read_excel("data/matched_sp.xlsx", sheet = 1) %>% unique()
matched_sp_data2 <- readxl::read_excel("data/matched_sp.xlsx", sheet = 2) %>% unique()

taken_date <- taken_joined %>% 
  # filter(Taken == 1) %>%
  mutate(Date = coalesce(Date, as.Date("2050-01-01"))) %>%
  dplyr::select(Station, TakenDate = Date, Region, Round)

taken_by <- taken_joined %>%
  mutate(Taken = 1) %>%
  left_join(matched_sp_data2, by = c("Species" = "NameInData")) %>%
  dplyr::select(Station, Date, Region, Round, ConsistentStandardisedSpeciesName, Taken) 

StationStarts <- taken_joined %>%
  dplyr::select(Station, Region, Round, StartDate)

observations_anti <- observations_joined %>% 
  left_join(matched_sp_data, by = c("Species" = "NameInData")) %>%
  left_join(taken_date, by = c("Station", "Region", "Round")) %>%
  anti_join(taken_by, y = .) %>% filter(!is.na(ConsistentStandardisedSpeciesName))

write.csv(observations_anti, "data/non-matching-data.csv")

data_size <- observations_joined %>% 
  left_join(StationStarts, by = c("Station", "Region", "Round"))

observations_matched<- observations_joined %>% 
  left_join(StationStarts, by = c("Station", "Region", "Round")) %>%
  mutate(Day = as.numeric(as.Date(Date) - StartDate + 1)) %>%
  left_join(matched_sp_data, by = c("Species" = "NameInData")) %>%
  left_join(taken_date, by = c("Station", "Region", "Round")) %>%
  left_join(taken_by) %>%
  filter(Date <= TakenDate) %>%
  mutate(Taken = coalesce(Taken, 0)) %>%
  group_by(Station, Region, Round, Taken) %>%
  filter(Taken == 0 | Taken == 1 & DateTimeOriginal == min(DateTimeOriginal)) %>% ungroup() %>%
  dplyr::select(Station, Species, DateTimeOriginal, Date, Region, Round, StartDate, Day, ConsistentStandardisedSpeciesName, GroupLabel_1, GroupLabel_2, TakenDate, Taken) %>%
filter(!is.na(GroupLabel_2))

write_csv(observations_matched, "data/observations_matched.csv")

Bait encounter model

Given we are modelling the probability of an animal taking the bait, having already encountered it, we do not need to model factors affecting encounter rate (which is likely linked to abundance and behaviour). Instead, we model the probability an animal will take a bat, having encountered it based on the days a bait has been deployed and a random effect of region and station. ‘Days’ is modelled as an interaction with species group (cat or other). Below we run a binomial regression (logit-link) model with region and station as random effects, just region and a null model. We compare the models using leave-one-out cross validation (LOO-CV) to select the best performing model (Vehtari et al. 2020; Vehtari, Gelman, and Gabry 2017). Models were implemented using Bayesian methods with the brms R package (Bürkner 2017, 2018, 2021). We used a weakly informative student-t prior with 3 degrees of freedom. 3 chains were run in parallel with 1,000 warmup iterations and 2,000 sampling iterations.

Show code
brms_bait_model <- brm(formula = Taken ~ GroupLabel_2*scale(Day) + (1|Region), 
                       data = observations_matched, 
                       family = bernoulli(link = "logit"),
                       warmup = 1000, 
                      iter = 2000, 
                      chains = 3, 
                      inits= "0", 
                      cores=3,
                      seed = 123, silent = 2, refresh = 0)

brms_bait_model_st <- brm(formula = Taken ~ GroupLabel_2*scale(Day) + (1|Region) + (1|Station), 
                       data = observations_matched, 
                       family = bernoulli(link = "logit"),
                       warmup = 1000, 
                      iter = 2000, 
                      chains = 3, 
                      inits= "0", 
                      cores=3,
                      seed = 123, silent = 2, refresh = 0)

brms_bait_model_null <- brm(formula = Taken ~ 1, 
                       data = observations_matched, 
                       family = bernoulli(link = "logit"),
                       warmup = 1000, 
                      iter = 2000, 
                      chains = 3, 
                      inits= "0", 
                      cores=3,
                      seed = 123, silent = 2, refresh = 0)

LOO

We use leave-one-out cross validation to compare the written model against null model. Broadly, this tests the posterior predictive power of the given model against one with no predictors.

Show code
loo_compare(loo(brms_bait_model), loo(brms_bait_model_null), loo(brms_bait_model_st))
#>                      elpd_diff se_diff
#> brms_bait_model_st     0.0       0.0  
#> brms_bait_model      -12.1       5.6  
#> brms_bait_model_null -43.7      10.9

Based on the above comparison, our model with days and a region and station random effect performs better than a null model.

Show code
summary(brms_bait_model_st)
#>  Family: bernoulli 
#>   Links: mu = logit 
#> Formula: Taken ~ GroupLabel_2 * scale(Day) + (1 | Region) + (1 | Station) 
#>    Data: observations_matched (Number of observations: 1761) 
#>   Draws: 3 chains, each with iter = 2000; warmup = 1000; thin = 1;
#>          total post-warmup draws = 3000
#> 
#> Group-Level Effects: 
#> ~Region (Number of levels: 6) 
#>               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
#> sd(Intercept)     1.01      0.49     0.38     2.32 1.00      807
#>               Tail_ESS
#> sd(Intercept)     1392
#> 
#> ~Station (Number of levels: 280) 
#>               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
#> sd(Intercept)     1.23      0.23     0.78     1.70 1.01      616
#>               Tail_ESS
#> sd(Intercept)      749
#> 
#> Population-Level Effects: 
#>                            Estimate Est.Error l-95% CI u-95% CI Rhat
#> Intercept                      0.23      0.74    -1.20     1.69 1.00
#> GroupLabel_2Other             -2.86      0.58    -4.00    -1.71 1.00
#> scaleDay                       1.29      0.46     0.45     2.21 1.00
#> GroupLabel_2Other:scaleDay    -1.14      0.45    -2.06    -0.30 1.00
#>                            Bulk_ESS Tail_ESS
#> Intercept                      1497     1392
#> GroupLabel_2Other              2688     2262
#> scaleDay                       1459     1628
#> GroupLabel_2Other:scaleDay     1667     1544
#> 
#> Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).

Traceplots

We assess convergenve via inspection of traceplots and Rhat values. Traceplots show good evidence of convergence and no Rhat values are above 1.05.

Show code
bayesplot::mcmc_trace(brms_bait_model_st, facet_args = list(ncol = 3), regex_pars = c("GroupLabel_2", "Region", "Day", "lp"))
Traceplots for the binomial regression encounter model

Figure 3: Traceplots for the binomial regression encounter model

Model Results

From our model, we can generate predictions of the probability a cat or other species will take a bait, having encountered it. This probability will vary with the ‘days’ since the bait was deployed.

Show code
ce_plot <- conditional_effects(brms_bait_model_st, effects = "Day:GroupLabel_2")

plot(ce_plot, plot = FALSE)[[1]] +
  scale_fill_manual(values = unname(delwp_cols[1:2]), name = "Species") + 
  scale_colour_manual(values = unname(delwp_cols[1:2]), name = "Species") + 
  ylab("Probability of taking bait, having encountered it") +
  ggplot2::theme_bw()

From the results we see that the probability a cat will take a bait increases the longer a bait is deployed. This does not appear to be the case for other species. This then raises the question on the various probabilities a bait will be taken by a cat under different encounter rates (by cat and others). To investigate these scenarios we will use the distribution of encounter rates present in the original data:

Show code
encounter_scenarios <- list()
encounter_scenarios[["cat_1"]] <- data.frame(GroupLabel_2 = "Cat", 
                                  Day = 1:38)
encounter_scenarios[["cat_2"]] <- data.frame(GroupLabel_2 = "Cat", 
                                  Day = seq(2, 38, by = 2))
encounter_scenarios[["cat_week"]] <- data.frame(GroupLabel_2 = "Cat", 
                                  Day = seq(7, 35, by = 7))
# encounter_scenarios_cat_once <- data.frame(GroupLabel_2 = "Cat", 
#                                   Day = round(runif(1,1,38)))

encounter_scenarios[["other_1"]] <- data.frame(GroupLabel_2 = "Other", 
                                  Day = 1:38)
encounter_scenarios[["other_5"]] <- data.frame(GroupLabel_2 = "Other", 
                                  Day = rep(1:38, each = 5))
encounter_scenarios[["other_10"]] <- data.frame(GroupLabel_2 = "Other", 
                                  Day = rep(1:38, each = 10))

postdraws <- list()
pp_day <- list()

for(i in 1:length(encounter_scenarios)) {
  
postdraws[[i]] <- posterior_predict(brms_bait_model_st, newdata = encounter_scenarios[[i]], re_formula = NA, ndraws = 1000) 

pp_day[[i]] <- data.frame(DayTaken = encounter_scenarios[[i]]$Day[apply(postdraws[[i]], 1, which.max)])

colnames(pp_day[[i]]) <- encounter_scenarios[[i]]$GroupLabel_2[1]

}

bind_cols_scenarios <- function(df1, df2, name) {
  new_df <- bind_cols(df1, df2) %>%
    mutate(Draw = 1:n()) %>%
    rowwise() %>%
    mutate(Scenario = name, 
           DayTaken = min(Cat, Other), 
           SpeciesTaken = case_when(DayTaken == Cat & DayTaken !=Other ~ "Cat", 
                                    DayTaken != Cat & DayTaken ==Other ~ "Other", 
                                    DayTaken == Cat & DayTaken == Other ~ "Same Day")) %>%
    ungroup()
  
  return(new_df)
}

scenario_joined <- bind_cols_scenarios(pp_day[[2]], pp_day[[5]], "Cat encounter every second day, \nothers have 5 encounters a day")  %>%
  bind_rows(bind_cols_scenarios(pp_day[[3]], pp_day[[5]], "Cat encounter every 7 days, \nothers have 5 encounters a day")) %>%
  bind_rows(bind_cols_scenarios(pp_day[[2]], pp_day[[6]], "Cat encounter every second day, \nothers have 10 encounters a day"))

scenario_joined_summary <- scenario_joined %>%
    group_by(Scenario, SpeciesTaken) %>%
    summarise(`Proportion %` = 100*n()/1000)

scenario_joined_summary %>%
  kbl("html", digits = 2, caption = "Simulated bait taking rates based on different encounter volumes") %>%
  kable_styling(bootstrap_options = "striped") %>%
  collapse_rows(1:2)
Table 3: Simulated bait taking rates based on different encounter volumes
Scenario SpeciesTaken Proportion %
Cat encounter every 7 days, others have 5 encounters a day Cat 7.6
Other 90.1
Same Day 2.3
Cat encounter every second day, others have 10 encounters a day Cat 11.5
Other 80.8
Same Day 7.7
Cat encounter every second day, others have 5 encounters a day Cat 25.2
Other 65.7
Same Day 9.1

If other species encounter the baits a lot it will be less likely that cats encounter and take the baits first. Because probability of bait taking by cats, having encountered the bait is low early in deployment; even several encounters early on do not ensure bait taking.

Show code
rdraws <- sample(1:1000, 100) # change random sample number
ggplot(scenario_joined %>% filter(Draw %in% rdraws) %>% group_by(Scenario) %>% mutate(Draw = 1:100) %>% ungroup(), aes(x=Draw, y=DayTaken)) +
  geom_segment( aes(x=Draw, xend=Draw, y=0, yend=DayTaken, colour = SpeciesTaken)) +
  geom_point(aes(fill = SpeciesTaken), size=3, alpha=0.9, shape = 21) +
  theme_light() +
  coord_flip() +
  xlab("Bait number") +
  ylab("Days") +
  scale_fill_manual(values = c(Cat = "#e41a1c", `Same Day` = "#377eb8", Other = "#4daf4a")) + 
  scale_colour_manual(values = c(Cat = "#e41a1c", `Same Day` = "#377eb8", Other = "#4daf4a")) + 
  theme(
    panel.grid.major.y = element_blank(),
    panel.border = element_blank(),
    axis.ticks.y = element_blank(), legend.position = "bottom"
  ) + facet_wrap(~Scenario)
Simulation of 100 posterior draws showing the fait of 100 baits for various encounter volumes

Figure 4: Simulation of 100 posterior draws showing the fait of 100 baits for various encounter volumes

The above plot visualises the variability in probability that a given bait will be consumed by a cat or other species. Unless cat-bait encounter rates are high, the probability the bait will be consumed by a cat is low.

Andersen, Per Kragh, and Richard D Gill. 1982. “Cox’s Regression Model for Counting Processes: A Large Sample Study.” The Annals of Statistics, 1100–1120.
Bürkner, Paul-Christian. 2017. brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. https://doi.org/10.18637/jss.v080.i01.
———. 2018. “Advanced Bayesian Multilevel Modeling with the R Package brms.” The R Journal 10 (1): 395–411. https://doi.org/10.32614/RJ-2018-017.
———. 2021. “Bayesian Item Response Modeling in R with brms and Stan.” Journal of Statistical Software 100 (5): 1–54. https://doi.org/10.18637/jss.v100.i05.
Mazerolle, Marc J. 2020. AICcmodavg: Model Selection and Multimodel Inference Based on (q)AIC(c). https://cran.r-project.org/package=AICcmodavg.
Terry M. Therneau, and Patricia M. Grambsch. 2000. Modeling Survival Data: Extending the Cox Model. New York: Springer.
Therneau, Terry M. 2022. A Package for Survival Analysis in r. https://CRAN.R-project.org/package=survival.
Vehtari, Aki, Jonah Gabry, Mans Magnusson, Yuling Yao, Paul-Christian Bürkner, Topi Paananen, and Andrew Gelman. 2020. “Loo: Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models.” https://mc-stan.org/loo/.
Vehtari, Aki, Andrew Gelman, and Jonah Gabry. 2017. “Practical Bayesian Model Evaluation Using Leave-One-Out Cross-Validation and WAIC.” Statistics and Computing 27 (5): 1413–32.

References

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Citation

For attribution, please cite this work as

Cally (2022, Oct. 10). Justin's Code Blog: Feral Cat Bait Analysis. Retrieved from https://justincally.github.io/blog/posts/2022-10-10-feral-cat-bait-analysis/

BibTeX citation

@misc{cally2022feral,
  author = {Cally, Justin G},
  title = {Justin's Code Blog: Feral Cat Bait Analysis},
  url = {https://justincally.github.io/blog/posts/2022-10-10-feral-cat-bait-analysis/},
  year = {2022}
}