| Title: | Quality Control Process for the Integrated Marine Observing System's Argos Location Data |
|---|---|
| Description: | An automated quality control process for Argos location data from satellite tags. Functions automatically download and collate data from one of several potential remote sources: a user-supplied URL, a user-supplied Google Drive link, a user-supplied Dropbox link, the SMRU server, or the Wildlife Computers Portal API. The package matches deployment data with user-supplied deployment metadata; projects location data from lon,lat to a user-supplied projection or a default projection; fits user-specified SSMs in 2 passes to estimate the most plausible locations; collates results by species & deployment program; generates diagnostic plots & maps; appends predicted locations at tag-measured event times to the tag manufacturer activity files such as CTD profiles, dive records, haulout records, and the Argos and (when present) GPS location files; saves activity files as .csv in one of several possible schemas (IMOS ATF, ATN, User-defined); pushes QC'd files to a user-specified server or saves to a local archive (zipfile). |
| Authors: | Ian Jonsen [aut, cre, cph] |
| Maintainer: | Ian Jonsen <[email protected]> |
| License: | CC BY 4.0 |
| Version: | 0.9-16 |
| Built: | 2026-05-30 09:15:38 UTC |
| Source: | https://github.com/ianjonsen/ArgosQC |
produces a map of all QC'd tracks and generates various diagnostics to assess QC run
diagnostics( fit, fit1, what = "p", cut, data, ssm, meta, lines = FALSE, obs = FALSE, mpath = NULL, dpath = NULL, QCmode = "nrt", tag_mfr = "wc", cid = NULL )
fit |
the final aniMotum fit object from QC process |
fit1 |
the initial aniMotum fit object from QC process |
what |
the SSM-estimated or rerouted locations to be used |
cut |
logical; should predicted locations be dropped if keep = FALSE - ie. in a large data gap |
data |
the standardized WC Locations or SMRU diag file (prior to truncation by metadata CTD start and end dates) |
ssm |
the ssm-annotated WC/SMRU tables |
meta |
metadata |
lines |
add track lines to map (default = FALSE) |
obs |
add observed locations to map (default = FALSE) |
mpath |
path to write map file |
dpath |
path to write all other diagnostic files |
QCmode |
specify whether QC is near real-time (nrt) or delayed-mode (dm); in the latter case the start and end of dive data are displayed rather than ctd data. |
tag_mfr |
the tag manufacturer. Currently, only |
cid |
SMRU campaign id (from config file). Ignored if WC data is used. |
Satellite tracking data are accessed from the SMRU data server,
or accessed from the Wildlife Computers Portal API via the source argument.
Data files are saved to the data.dir specified in the JSON config file.
SMRU tag data are currently downloaded as a single .mdb (Microsoft Access
Database) file. Wildlife Computers tag data are downloaded as a series of .CSV
files saved in tag-specific directories (uniquely named with WC UUID's).
For Wildlife Computers data, partial deployment metadata are also output as an R object.
download_data( dest = NULL, source = "smru", cid = NULL, user = NULL, pwd = NULL, wc.akey = NULL, wc.skey = NULL, owner.id = NULL, subset.ids = NULL, download = TRUE, ... )
dest |
destination path to save download |
source |
source type of data to be downloaded. Can be one of:
|
cid |
SMRU tag deployment campaign id(s) to download, eg. "ct180" |
user |
SMRU data server username as a quoted string |
pwd |
SMRU data server password as a quoted string |
wc.akey |
an Access Key issued by Wildlife Computers for their API |
wc.skey |
a Secret Key issued by Wildlife Computers for their API |
owner.id |
the Wildlife Computers uuid associated with the data owner |
subset.ids |
a single column .CSV file of WC UUID's to be included in the QC, with uuid as the variable name. |
download |
(logical) indicating if the data is to be downloaded from
the tag manufacturer's server. If the source is |
... |
additional arguments passed to |
## Not run:
## SMRU data download
download_data(
  dest = file.path(wd, config$setup$data.dir),
  source = "smru",
  cid = config$harvest$cid,
  user = config$harvest$smru.usr,
  pwd = config$harvest$smru.pwd,
  timeout = config$harvest$timeout
)
## Wildlife Computers data download & deployment metadata acquisition
wc.deploy.meta <- download_data(
  dest = file.path(wd, config$setup$data.dir),
  source = "wc",
  unzip = TRUE,
  wc.akey = config$harvest$wc.akey,
  wc.skey = config$harvest$wc.skey,
  subset.ids = config$harvest$tag.list,
  download = TRUE,
  owner.id = config$harvest$owner.id
)
## End(Not run)
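The example above reads its settings from a parsed config object. As a minimal sketch, the nested list below mirrors only the `setup` and `harvest` fields that the SMRU call references; the values are placeholders, and a real JSON config file will contain further blocks and parameters that are not shown here.

```r
# Hypothetical stand-in for a parsed JSON config; only the fields used by the
# SMRU download example are included, and all values are placeholders.
config <- list(
  setup = list(data.dir = "tagdata"),
  harvest = list(
    cid = "ct180",
    smru.usr = "my_user",
    smru.pwd = "my_password",
    timeout = 120
  )
)

# Working directory for this sketch; a real run would use the project directory.
wd <- tempdir()

# Destination path built the same way as in the download_data() example.
dest <- file.path(wd, config$setup$data.dir)
dir.create(dest, showWarnings = FALSE, recursive = TRUE)
```

Keeping credentials in the config object rather than in the calling script means the same script can be reused across campaigns by swapping the config file.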
downloads, restructures & formats metadata, appends dive/CTD start and end datetimes (for QC), & fills in missing required metadata - eg. release_datetime, release_longitude/latitude's with data from the GPS (if present) or Argos location file.
get_metadata( source = "smru", tag_mfr = "smru", tag_data = NULL, cid = NULL, user = NULL, pwd = NULL, dropIDs = NULL, file = NULL, meta.args, subset.ids = NULL, wc.meta = NULL )
source |
the source of the deployment metadata, current options are
|
tag_mfr |
the tag manufacturer, current options are |
tag_data |
a list of either |
cid |
SMRU campaign id must be provided when the tag_mfr is |
user |
SMRU data server username as a quoted string - to be used only if
metadata are to be built from SMRU server details ( |
pwd |
SMRU data server password as a quoted string - to be used only if
metadata are to be built from SMRU server details ( |
dropIDs |
SMRU refs or WC ids to be dropped from QC |
file |
path to metadata .csv file, if provided then metadata will be
read from the provided |
meta.args |
optional metadata fields to be passed from config file when downloading tag metadata from SMRU server. Typically used only when no metadata filepath is provided in the config file. |
subset.ids |
a character vector of comma-separated (no spaces) WC UUID's
to be included in the QC. Ignored if |
wc.meta |
an R data.frame of Wildlife Computers tag deployment metadata
obtained via |
map aniMotum-estimated locations and behavioural indices with coastline and projection options
map_QC( x, y = NULL, what = c("fitted", "predicted", "rerouted"), aes = aes_lst(), by.id = TRUE, by.date = FALSE, cut = FALSE, crs = NULL, ext.rng = c(0.05, 0.05), buffer = 10000, normalise = TRUE, group = FALSE, silent = FALSE )
x |
a |
y |
optionally, a |
what |
specify which location estimates to map: fitted, predicted or rerouted |
aes |
a list of map controls and aesthetics (shape, size, col, fill, alpha)
for each map feature (estimated locations, confidence ellipses, track lines,
observed locations, land masses, water bodies). Constructed by |
by.id |
when mapping multiple tracks, should locations be coloured by
id (logical; default = TRUE if |
by.date |
when mapping single tracks, should locations be coloured by date (logical; default = FALSE; ignored if behavioural index provided) |
cut |
logical; should predicted locations be dropped from mapping if keep = FALSE. default = FALSE. |
crs |
|
ext.rng |
proportion (can exceed 1) to extend the plot range in x and y dimensions |
buffer |
distance (in km) to buffer locations for subsetting land
polygons (default = 10000). If map extents are expanded by many factors then
the buffer distance may need to be increased, otherwise this should not be
used. Ignored if |
normalise |
logical; if output includes a move persistence estimate, should g (the move persistence index) be normalised to have minimum = 0 and maximum = 1 (default = TRUE). |
group |
logical; should g be normalised among individuals as a group, a 'relative g', or separately to highlight regions of lowest and highest move persistence along a track (default = FALSE). |
silent |
logical; generate maps silently (default = FALSE). |
a map as a ggplot2 object
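The `normalise` and `group` arguments describe a min-max rescaling of the move persistence index g. The sketch below illustrates that idea in base R under the assumption that `group = TRUE` pools all individuals and `group = FALSE` rescales each track separately; `normalise_g` is a hypothetical helper, not a function from the package.

```r
# Hypothetical helper: rescale g to [0, 1], either pooled across individuals
# (group = TRUE) or separately within each id (group = FALSE).
# Note: a track with constant g would give a zero range and NaN values.
normalise_g <- function(g, id, group = FALSE) {
  rescale <- function(v) (v - min(v)) / (max(v) - min(v))
  if (group) rescale(g) else ave(g, id, FUN = rescale)
}

g  <- c(0.2, 0.4, 0.6, 0.5, 0.7, 0.9)
id <- c("a", "a", "a", "b", "b", "b")

sep <- normalise_g(g, id)                # each track spans 0 to 1
grp <- normalise_g(g, id, group = TRUE)  # only the pooled extremes reach 0 and 1
```

Separate rescaling highlights the lowest and highest persistence along each track, while pooled rescaling keeps values comparable between individuals.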
apply SSM filter to diag data across multiple processors
multi_filter(x, vmax = 4, model = "rw", ts = 6, verbose = FALSE)
x |
|
vmax |
for prefilter |
model |
|
ts |
|
verbose |
turn on/off furrr::future_map progress indicator |
reads SMRU or WC tag datafiles & combines in a unified list
pull_local_data(path2data, cid = NULL, tag_mfr)
path2data |
path to local datafile(s) |
cid |
SMRU campaign id. Ignored if |
tag_mfr |
either "smru" or "wc" |
re-apply SSM filter to diag data for id's that failed to converge. parallelized
redo_multi_filter( fit, diag_sf, model = "crw", ts = 3, vmax = 2, ang = c(15, 25), distlim = c(1500, 5000), min.dt = 180, map = NULL, reroute = TRUE, dist = 500, barrier = NULL, verbose = TRUE, ... )
fit |
aniMotum fit object from first round of filtering |
diag_sf |
|
model |
model argument ("rw" or "crw") for |
ts |
time.step argument for |
vmax |
threshold travel speed (m/s) to apply during track pre-filtering |
ang |
sdafilter argument |
distlim |
sdafilter argument |
min.dt |
min.dt argument for |
map |
params to fix |
reroute |
(logical) should SSM-predicted locations be re-routed off of land (default is TRUE) |
dist |
the distance (in km) to buffer around predicted locations. This buffer allows a larger portion of coastline to be selected for rerouting any locations that are on land. More coastline polygon data can help rerouting, but too much will make computation very slow. |
barrier |
add a custom POLYGON/MULTIPOLYGON shapefile to use as a land
barrier. Default (NULL) reverts to the |
verbose |
turn on/off furrr::future_map progress indicator |
... |
additional arguments to |
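The `vmax` argument sets a travel-speed threshold in m/s for track pre-filtering. The sketch below shows only the speed component of such a check on projected coordinates in km; it is a simplified, single-pass illustration, not the speed-distance-angle filter controlled by `ang` and `distlim`, and `flag_fast` is a hypothetical helper.

```r
# Hypothetical helper: flag locations reached at an implied speed above vmax.
# x_km, y_km are projected coordinates in km; time is POSIXct.
flag_fast <- function(x_km, y_km, time, vmax = 2) {
  dist_m <- sqrt(diff(x_km)^2 + diff(y_km)^2) * 1000  # step length in metres
  dt_s   <- as.numeric(diff(time), units = "secs")    # step duration in seconds
  c(FALSE, dist_m / dt_s > vmax)                      # first location has no step
}

t0   <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
time <- t0 + 3600 * 0:3
x_km <- c(0, 5, 60, 65)   # the third location implies a 55 km jump in one hour
y_km <- c(0, 0, 0, 0)

fast <- flag_fast(x_km, y_km, time, vmax = 2)
```

A full pre-filter would typically remove flagged points iteratively and recompute speeds, which this sketch does not attempt.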
append SMRU tables so each event has SSM-derived lon, lat, x, y, x.se, y.se.
smru_append_ssm(smru, fit, what = "p", meta, cut = FALSE, dropIDs = NULL)
smru |
SMRU table file - output of |
fit |
final |
what |
choose which locations to use for annotating SMRU tables (default = "predicted") |
meta |
metadata used to truncate start of diag data for each individual |
cut |
drop predicted locations if keep = FALSE, ie. locations in a large data gap |
dropIDs |
SMRU refs to be dropped |
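Appending SSM-derived coordinates to event records amounts to evaluating the predicted track at each event time. The sketch below uses linear interpolation with `approx()` on projected x and y values; whether the package interpolates in exactly this way is an assumption, and the column names are illustrative only.

```r
t0 <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")

# Illustrative SSM-predicted track at a regular 6-h interval (projected km).
pred <- data.frame(
  date = t0 + 3600 * c(0, 6, 12),
  x = c(100, 160, 220),
  y = c(-50, -20, 40)
)

# Illustrative tag events (e.g. dives) recorded between prediction times.
events <- data.frame(date = t0 + 3600 * c(3, 9))

# Linearly interpolate x and y at each event time.
events$x <- approx(as.numeric(pred$date), pred$x, xout = as.numeric(events$date))$y
events$y <- approx(as.numeric(pred$date), pred$y, xout = as.numeric(events$date))$y
```

Interpolating in projected coordinates and back-transforming afterwards avoids distortions that can arise when interpolating longitude and latitude directly.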
restructures diag files, formats dates & lc's in preparation for SSM-filtering
smru_clean_diag(smru, dropIDs = NULL)
smru |
list of SMRU tables |
dropIDs |
SMRU refs to be dropped (eg. tags were turned on but not deployed) |
restructures diag files, formats dates & lc's; truncates start (and end for "nrt") of individual deployments using ctd dates; converts to sf geometry - all in preparation for SSM-filtering. Splits resulting truncated diag files by species.
smru_prep_loc(smru, meta, dropIDs = NULL, crs = NULL, QCmode = NULL)
smru |
list of SMRU tables |
meta |
metadata used to truncate start of diag data for each individual |
dropIDs |
SMRU refs to be dropped (eg. tags were turned on but not deployed) |
crs |
a proj4string to re-project diag locations from longlat. Default is NULL
which results in one of 4 possible projections applied automatically, based on
the centroid of the tracks. See |
QCmode |
specify whether QC is near real-time (nrt) or delayed-mode (dm), in latter case diag is not right-truncated & date of first dive is used for the track start date |
extracts specified tables from SMRU .mdb files, using Hmisc::mdb.get
smru_pull_tables( cids, path2mdb, tables = c("diag", "gps", "haulout", "ctd", "dive", "cruise", "summary"), p2mdbtools = NULL, verbose = FALSE )
cids |
SMRU campaign ids |
path2mdb |
path to SMRU .mdb file(s) |
tables |
specify which tables to extract, default is to extract all tables |
p2mdbtools |
path to mdbtools binaries. Specifying the path can avoid an error when calling from within RStudio, eg. on MacBook Pro M1 Pro with homebrew-installed mdbtools @ /opt/homebrew/Cellar/mdbtools/1.0.0/bin/ |
verbose |
turn on/off progress indicator |
Wrapper function that executes the complete SMRU QC workflow from data download to SSM-appended tag data files output as CSV files. All settings are specified in a JSON config file, including program - currently, IMOS, ATN or OTN. The program field determines the specific ArgosQC workflow functions called within the wrapper function.
smru_qc(wd, config)
wd |
the path to the working directory that contains: 1) the data directory
where tag data files are stored (if |
config |
a hierarchical JSON configuration file containing the following blocks, each with a set of block-specific parameters:
|
right-truncate all SSM-appended SMRU tables using CTD end date for given individuals, using CTD date-times from metadata
smru_truncate_ssm(smru_ssm, meta, refs)
smru_ssm |
SSM-appended SMRU file to use |
meta |
metadata used to truncate SSM-appended SMRU tables for each individual |
refs |
device_id's (SMRU ref's) to apply truncation |
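Right-truncation drops records dated after an individual's CTD end date, and only for the listed refs. A minimal base R sketch of that logic follows; the column names (`ref`, `date`, `device_id`, `ctd_end`) and the helper `truncate_right` are assumptions made for illustration.

```r
# Illustrative metadata with one CTD end date per device.
meta <- data.frame(
  device_id = c("ct180-001", "ct180-002"),
  ctd_end = as.POSIXct(c("2024-02-01", "2024-03-01"), tz = "UTC")
)

# Illustrative SSM-appended table with two records per device.
tab <- data.frame(
  ref = c("ct180-001", "ct180-001", "ct180-002", "ct180-002"),
  date = as.POSIXct(c("2024-01-15", "2024-02-10", "2024-02-10", "2024-03-05"),
                    tz = "UTC")
)

# Hypothetical helper: drop rows after the CTD end date for the given refs only.
truncate_right <- function(tab, meta, refs) {
  end  <- meta$ctd_end[match(tab$ref, meta$device_id)]
  drop <- tab$ref %in% refs & !is.na(end) & tab$date > end
  tab[!drop, , drop = FALSE]
}

out <- truncate_right(tab, meta, refs = "ct180-001")
```

Here only the late record for `ct180-001` is removed; the late record for `ct180-002` is kept because that device is not in `refs`.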
reconfigure annotated tables - subsample predicted locations to 6-h interval, write to .csv and zip by campaign id
smru_write_csv( smru_ssm, fit, what, meta, program = "imos", proj = NULL, test = TRUE, path = NULL, dropIDs = NULL, suffix = "_nrt" )
smru_ssm |
SSM-appended SMRU table file - output of |
fit |
final |
what |
specify whether predicted or rerouted locations are to be used |
meta |
metadata |
program |
Determines structure of output metadata. The |
proj |
the proj4string specified in the .JSON config file & used to project the location data prior to SSM fitting. It is passed in here to be added to the output metadata .CSV file |
test |
should variables be tested for standards compliance, default is TRUE.
Standards compliance is specific to the program. Currently, only program = |
path |
path to write .csv files |
dropIDs |
individual SMRU ids to be dropped |
suffix |
suffix to add to .csv files (_nrt, _dm, or _hist) |
Identify & mark SSM predicted & rerouted (if present) location estimates in track segments with data gaps of a specified minimum duration.
ssm_mark_gaps(ssm, min.gap = 24, mark = TRUE)
ssm |
the SSM fit object from |
min.gap |
the minimum data gap duration from which SSM estimates are removed (in hours) |
mark |
logical; should the SSM data be marked (TRUE; default), otherwise the function does no marking and returns the original SSM fit object |
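The marking step can be pictured as follows: wherever consecutive observations are separated by at least `min.gap` hours, predicted locations falling inside that gap are flagged with `keep = FALSE`. The sketch below implements that idea on plain time vectors; `mark_gaps` is a hypothetical helper and does not operate on an actual SSM fit object.

```r
# Hypothetical helper: return a logical 'keep' vector for predicted times,
# FALSE where a prediction falls strictly inside an observation gap of at
# least min.gap hours.
mark_gaps <- function(pred_time, obs_time, min.gap = 24) {
  obs_time <- sort(obs_time)
  gap_h <- as.numeric(diff(obs_time), units = "hours")
  keep <- rep(TRUE, length(pred_time))
  for (i in which(gap_h >= min.gap)) {
    keep[pred_time > obs_time[i] & pred_time < obs_time[i + 1]] <- FALSE
  }
  keep
}

t0 <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
obs_time  <- t0 + 3600 * c(0, 4, 8, 56, 60)   # 48-h gap between hours 8 and 56
pred_time <- t0 + 3600 * seq(0, 60, by = 6)   # 6-h predictions

keep <- mark_gaps(pred_time, obs_time, min.gap = 24)
```

Marking rather than deleting leaves the decision to drop gap locations to downstream functions, for example through their `cut` argument.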
append WC tag datafiles so each event has SSM-derived lon, lat, x, y, x.se, y.se.
wc_append_ssm( wc, fit, what = "p", meta, cut = FALSE, dropIDs = NULL, crs = "+proj=merc +units=km +ellps=WGS84 +no_defs" )
wc |
WC tag datafiles - output of |
fit |
final |
what |
choose which locations to use for annotating WC tag datafiles (default = "predicted") |
meta |
metadata used to truncate start of diag data for each individual |
cut |
drop predicted locations if keep = FALSE, ie. locations in a large data gap (currently, only used in DM QC mode) |
dropIDs |
WC DeploymentIDs to be dropped |
crs |
CRS to be applied when interpolating SSM-estimated locations and re-projecting back from Cartesian coords to longlat |
restructures Locations files, formats dates & lc's; truncates start (and end for "nrt") of individual deployments using ctd dates; converts to sf geometry - all in preparation for SSM-filtering. Splits resulting truncated diag files by species.
wc_prep_loc(wc, meta, dropIDs, crs = NULL, program = "atn", QCmode = "nrt")
wc |
list of WC datafiles |
meta |
metadata used to truncate start of diag data for each individual |
dropIDs |
WC DeploymentID's to be dropped (eg. tags were turned on but not deployed) |
crs |
a proj4string to re-project diag locations from longlat. Default is NULL
which results in one of 4 possible projections applied automatically, based on
the centroid of the tracks. See |
program |
specify the aniBOS program contributing data (currently: 'atn', 'irap') |
QCmode |
specify whether QC is near real-time (nrt) or delayed-mode (dm), in latter case wc is not right-truncated & date of first dive is used for the track start date. |
extracts data from X-Locations.csv (Argos), X-FastGPS.csv,
ECDHistos.csv, Histos.csv, MixLayer.csv, PDTs.csv, DSA.csv,
MinMaxDepth.csv, HaulOut.csv, and SST.csv files.
Extracted data are aggregated across individual tags and returned in a
single named list with the following data.frames:
Argos
FastGPS
ECDHistos_SCOUT_TEMP_361A
ECDHistos_SCOUT_DSA
Histos
Mixlayer
PDTs
DSA
MinMaxDepth
Haulout
SST
WC tag data files downloaded via download_data will be stored in separate,
tag-specific subdirectories. path2data should point to the outer directory.
wc_pull_data(path2data, subset.ids = NULL)
path2data |
path to all WC tag data files. |
subset.ids |
a single column .CSV file of WC UUID's to be included in the QC, with uuid as the variable name. |
Wrapper function that executes the complete workflow from data download to SSM-appended tag data files output as CSV files.
wc_qc(wd, config)
wd |
the path to the working directory that contains: 1) the data
directory where tag data files are stored (if source = |
config |
a hierarchical JSON configuration file containing the following blocks, each with a set of block-specific parameters:
|
subsample SSM-predicted locations to 6-h intervals, write annotated files to .csv
wc_write_csv( wc_ssm, fit, what, meta, program = "atn", path = NULL, dropIDs = NULL, suffix = "_nrt", pred.int = 6 )
wc_ssm |
SSM-appended WC tag datafiles - output of |
fit |
final SSM fit object |
what |
specify whether predicted or rerouted locations are to be used |
meta |
metadata |
program |
Determines structure of output metadata. Currently, either |
path |
path to write .csv files |
dropIDs |
individual WC DeploymentID's to be dropped |
suffix |
suffix to add to .csv files (_nrt, _dm, or _hist) |
pred.int |
prediction interval to use for sub-sampling predicted locations (default = 6 h) |
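Sub-sampling to `pred.int` keeps only those predicted locations that fall on multiples of the interval. A minimal sketch, assuming hourly predictions and an interval counted from the first prediction time (the package may anchor the interval differently); `subsample` is a hypothetical helper.

```r
t0 <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")

# Illustrative hourly predicted locations over one day.
pred <- data.frame(
  date = t0 + 3600 * 0:24,
  lon = 140 + 0.1 * 0:24
)

# Hypothetical helper: keep rows whose elapsed time since the first prediction
# is a whole multiple of pred.int hours.
subsample <- function(pred, pred.int = 6) {
  hrs <- as.numeric(difftime(pred$date, pred$date[1], units = "hours"))
  pred[hrs %% pred.int == 0, , drop = FALSE]
}

out <- subsample(pred, pred.int = 6)
```

With hourly input and `pred.int = 6`, the result retains the locations at hours 0, 6, 12, 18 and 24.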