Overview

This vignette highlights the ability to generate coastwide indices for West Coast groundfish stocks for bottom trawl surveys using available functions in the indexwc package and the available configuration data object.

library(dplyr)
library(indexwc)
options(repos = c(CRAN = "https://cloud.r-project.org"))

Get data and settings

indexwc includes a maintained configuration file that contains default settings for index creation for West Coast groundfish species. The configuration is a data frame with entries by row for species, alternative model error structures (e.g., delta-lognormal, delta-gamma), formulas, and various bottom trawl surveys (e.g., West Coast Groundfish Bottom Trawl Survey [WCGBTS], etc.). The data frame includes various columns defining the commonly used model configuration parameters (e.g., spatiotemporal settings, anisotropy, etc.). Please see the configuration vignette for additional information.

In the below example, the configuration file will be used to pull and run a model for sablefish using data from the WCGBTS (aka “NWFSC.Combo” as defined in the nwfscSurvey) using a delta-gamma model error structure.

# Load from the rda file
configuration_to_run <- configuration |>
  dplyr::filter(species == "sablefish", source == "NWFSC.Combo", family == "sdmTMB::delta_gamma()") 

print(configuration_to_run)
#>     species                                                          fxn
#> 1 sablefish nwfscSurvey::pull_catch(common_name = species,survey=source)
#>        source                family                                formula
#> 1 NWFSC.Combo sdmTMB::delta_gamma() catch_weight ~ 0 + fyear + pass_scaled
#>   min_depth max_depth min_latitude max_latitude min_year max_year anisotropy
#> 1       -55     -1280         31.9           49     2003     2050       TRUE
#>   knots spatiotemporal1 spatiotemporal2 share_range used
#> 1   500             iid             iid       FALSE TRUE

The selected configuration is then passed to pull_and_format_data() that pulls and returns a data list with entries for model settings based on the configuration file.

my_data <- pull_and_format_data(
   configuration_to_run = configuration_to_run
)

Double check the items in the my_data list created above:

names(my_data)
#>  [1] "species"         "fxn"             "source"          "family"         
#>  [5] "formula"         "min_depth"       "max_depth"       "min_latitude"   
#>  [9] "max_latitude"    "min_year"        "max_year"        "anisotropy"     
#> [13] "knots"           "spatiotemporal1" "spatiotemporal2" "share_range"    
#> [17] "used"            "data_raw"        "data_filtered"
my_data$family
#> [1] "sdmTMB::delta_gamma()"
my_data$formula
#> [1] "catch_weight ~ 0 + fyear + pass_scaled"
my_data$knots
#> [1] 500
my_data$share_range
#> [1] FALSE
my_data$anisotropy
#> [1] TRUE
list(my_data$spatiotemporal1, my_data$spatiotemporal2)
#> [[1]]
#> [1] "iid"
#> 
#> [[2]]
#> [1] "iid"

Estimate the model

Running sdmTMB models can be done through run_sdmtmb() that has function arguments defining the model type and structure to run. run_sdmtmb() includes an argument for saving the fit object from sdmTMB, however, the default setting is dir = NULL where output will not be saved at this stage.

fit <- run_sdmtmb(
  data = my_data$data_filtered[[1]],
  family = my_data$family,
  formula = my_data$formula,
  n_knots = my_data$knots,
  share_range = my_data$share_range,
  anisotropy = my_data$anisotropy,
  spatiotemporal = list(my_data$spatiotemporal1, my_data$spatiotemporal2)
)

run_sdmtb() returns a list that contains all information from running the model. To see the estimated parameters for the model, type the assigned R object from run_sdmtmb().

fit

Diagnostics

A range of diagnostics are available and all diagnostics can be done using diagnose():

diag <- diagnose(
 fit = fit
)

A sanity check is performed to evaluate model convergence and the AIC is calculated and the fixed effects are evaluated. A prediction grid based upon the California Current is created within diagnose() that is then used to evaluate QQ, residuals, and model predicted density.

diagnose() has the following arguments available for use:

args(diagnose)

If a directory is not provided (dir = NULL), then no output is saved at this point.

Calculate an index of abundance

Finally, an index of abundance is estimated using calc_index_areas()

index <- calc_index_areas(
   data = fit$data,
   fit = fit,
   boundaries = "Coastwide"
)

calc_index_areas() has the following arguments available for use:

args(diagnose)

If a directory is not provided (dir = NULL), then no output is saved at this point.

Write out all files and figures

Now that the model has been fit and diagnostics and an index of abundance has been calculated, all of the output objects can be written using save_index_outputs():

save_index_outputs(
 fit = fit,
 diagnostics = diag,
 indices = index,
 dir = "C:/indices"
)

The directory structure created is:

 dir/
 └── species_name/
     └── survey_name/
         └── family_name/
             ├── data/
             │   ├── data.rdata
             │   └── fit.rds
             ├── diagnostics/
             │   ├── mesh.png
             │   ├── sanity_data_frame.csv
             │   ├── aic_nll.txt
             │   ├── run_diagnostics_and_estimates.rdata
             │   ├── qq.png
             │   ├── residuals_1_*.png
             │   ├── residuals_2_*.png (if delta model)
             │   ├── anisotropy.png
             │   ├── fixed_effects.png
             │   ├── density_*.png
             │   ├── data_with_residuals.rdata
             │   └── predictions.rdata
             └── index/
                 ├── est_by_area.csv
                 ├── index_coastwide.png (if Coastwide calculated)
                 └── index_all_areas.png