Create a state column based on the input column specified in source

getState(Pdata, source = c("AGENCY_CODE"), verbose = TRUE)

Arguments

Pdata

A data frame returned from PullBDS.PacFIN() containing biological samples. These data are stored in the Pacific Fisheries Information Network (PacFIN) data warehouse, which originated in 2014, and are pulled using SQL calls.

source

The name of the column in Pdata that holds the state information. See the function call for options; only the first value will be used.

verbose

A logical specifying whether output should be written to the screen. Printing is useful for testing and exploring your data but can be turned off when the output reports information that you already know. Printing to the screen does not affect any of the returned objects. The default, verbose = TRUE, always prints to the screen.

Value

The input data frame is returned with an additional column, state, which is filled with two-character values identifying the state, or with the three-character value UNK for rows that do not have an assigned state. All rows are returned, but users should pay attention to the warning that is issued for rows that have no state ID.
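The return value described above can be pictured with a small base R sketch. This is not the package's implementation: sketch_state() is a hypothetical stand-in, and the W/O/C to WA/OR/CA lookup is assumed from the example at the end of this page. It only illustrates that every row is kept and that unmatched codes become UNK.

```r
# Hypothetical stand-in for illustration only; not the code used by getState().
sketch_state <- function(x, source = "AGENCY_CODE") {
  # Lookup assumed from the documented example: W, O, C -> WA, OR, CA.
  lookup <- c(W = "WA", O = "OR", C = "CA")
  state <- unname(lookup[as.character(x[[source]])])
  # Codes without a match (or missing codes) are labeled UNK; no rows are dropped.
  state[is.na(state)] <- "UNK"
  x[["state"]] <- state
  x
}

# "X" is a made-up agency code used to trigger the UNK label.
demo <- data.frame(AGENCY_CODE = c("W", "O", "C", "X"), info = 1:4)
result <- sketch_state(demo)
table(result[["state"]])
```

Tabulating the new column, as in the last line, is a quick way to see how many rows ended up as UNK before continuing with an analysis.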

Details

Create a categorical column, state, based upon the AGENCY_CODE. As of February 14, 2021, it is no longer advisable to create states based on PSMFC_CATCH_AREA_CODE or PSMFC_ARID because areas are not mutually exclusive to a state. Previous code set areas 1[a-z] to Washington, 2[a-z] to Oregon, and 3[a-z] to California. The PacFIN documentation suggests that the following area codes can be assigned to the following states:

  • WA: 1C, 2A, 2B, 2C, 2E, 2F, 3A, 3B, 3C, 3N, 3S

  • OR: 1C, 2A, 2B, 2C, 2E, 2F, 3A, 3B, CS

  • CA: 1A, 1B, 1C

However, the look-up table for PacFIN Catch Area Code indicates that they should be:

  • WA: 3A, 3B, 3C, 3S

  • OR: 2A, 2B, 2C, 2E, 2F, 3A

  • CA: 1A, 1B, 1C

If you see the need to use PSMFC_CATCH_AREA_CODE to set states, please contact the package maintainer.
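The overlap that makes area codes unreliable for assigning states can be checked directly from the two lists above. The following is a minimal base R sketch; the object names areas_doc and areas_lookup and the helper shared_codes() are hypothetical and exist only for this illustration.

```r
# Area codes per state as listed from the PacFIN documentation (first list above).
areas_doc <- list(
  WA = c("1C", "2A", "2B", "2C", "2E", "2F", "3A", "3B", "3C", "3N", "3S"),
  OR = c("1C", "2A", "2B", "2C", "2E", "2F", "3A", "3B", "CS"),
  CA = c("1A", "1B", "1C")
)

# Area codes per state from the catch area look-up table (second list above).
areas_lookup <- list(
  WA = c("3A", "3B", "3C", "3S"),
  OR = c("2A", "2B", "2C", "2E", "2F", "3A"),
  CA = c("1A", "1B", "1C")
)

# Hypothetical helper: return the codes that are listed under more than one state.
shared_codes <- function(areas) {
  counts <- table(unlist(areas, use.names = FALSE))
  sort(names(counts)[counts > 1])
}

shared_doc <- shared_codes(areas_doc)
shared_lookup <- shared_codes(areas_lookup)
```

Under the first list, eight codes (1C, 2A, 2B, 2C, 2E, 2F, 3A, 3B) appear under more than one state; under the second list, 3A still appears under both WA and OR. Either way, an area code alone cannot always identify a single state.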

See also

cleanPacFIN calls getState.

Examples

data <- data.frame(
  AGENCY_CODE = rep(c("W", "O", "C"), each = 2),
  info = 1:6
)
testthat::expect_true(
  all(getState(data)[["state"]] == rep(c("WA", "OR", "CA"), each = 2))
)
#> 
#> There are 0 records for which the state (i.e., 'CA', 'OR', 'WA')
#> could not be assigned and were labeled as 'UNK'.
#> 
#> CA OR WA 
#>  2  2  2