Assign cells and timesteps to coordinates and dates
Source:R/snap_to_birdflow.R
snap_to_birdflow.Rd
This function aligns bird movement data that has both location and date information with a BirdFlow model's cells and timesteps. Optionally,it will aggregate multiple points from a track within the same week into a single point and date prior to determining the location ID.
Usage
snap_to_birdflow(
d,
bf,
x_col = "lon",
y_col = "lat",
date_col = "date",
id_cols = "id",
crs = "EPSG:4326",
aggregate = NULL
)
Arguments
- d
A data frame with bird movement data it must have the columns specified by
x_col
,y_col
,date_col
, andid_cols
. The default values for the first three are"lon"
,"lat"
, and"date"
.- bf
A BirdFlow model
- x_col
The name of the column that holds x or longitude coordinates. Default is
"lon"
.- y_col
The name of the y or latitude column, default is
"lat"
.- date_col
Name of the date column, defaults to
"date"
- id_cols
The name of identification columns. One or more columns that collectively define a unique track. If
aggregate
is notNULL
these along withtimestep
(calculated from thedate_col
) are used to define the groups. If aggregate isNULL
these columns are not used but are retained in the output.- crs
The coordinate reference system used by
x_col
andy_col
. The default"EPSG:4326"
corresponds to WGS 1984- aggregate
Leave the default
NULL
for no aggregation - each row ind
is processed separately and represented in the returned object. Ifaggregate
is set to one of the values below then the locations (rows) ind
that fall in the same week and track (seeid_cols
) will be aggregated together such that there is only one location per week for each track.The possible values for
aggregate
are:mean
The mean of the "x", "y", and "date" columns are used for the week.median
The median of the "x", "y", and "date" columns are used for the week.midweek
The observation that is closest to the middle of the week is used to represent the week. With ties the observation that occurs first is used.
Pending ideas, not yet implemented:
gmedian
geometric mediancentral
The point closest to the centroid of all the points is used to represent the week.
Value
A data frame with columns
<id_cols>
The columns identified with
id_cols
will be retained and be leftmost.- date
This column will have information from
date_col
but not retain its name or original formatting. If aggregate is NULL the input dates will be retained, if not the date will vary with the aggregation method but will represent the dates that went into the summary and not the mid-week date associated withtimestep
. For example withaggregate = "mean"
the date will be the average date of the points in the week.- timestep
The model timestep associated with
date
.- x, y
The x and y coordinates of the point. These will always be in
crs(bf)
and will represent the original or aggregated location and not the cell center.- i
The location index of the cell associated with the
x
andy
inbf
. Seei_to_xy()
.- n
The number of rows in
d
that contributed to each output row. If aggregate is NULL every value will be 1.- error
TRUE if there was an error.
- message
NA or the error message. The possible messages are:
"err_date" - The date could not be parsed with
lubridate::as_date()
"err_truncated" -
bf
is truncated and the date falls outside of portion of the year the model covers."err_coords"
- The coordinates could not be transformed intocrs(bf)
and thus likely are corrupt in some way."err_not_active"
- the location does not fall within an active cell as defined by the static mask."err_dynamic"
- the location does not fall within the dynamic mask on the associated date."err_sparse"
- the location falls within the dynamic mask but that location and date combination has been eliminated by sparsification.
The function will always return the error message that appears first on this list, even though in some cases multiple errors can be triggered.
With aggregation the first three errors prevent a row from being used and it will be dropped prior to aggregation with a warning.
Details
If aggregate
is NULL than than each row in the output will correspond with
the same row in the input. With aggregation rows can be lost in two ways
(1) if the coordinates or timestep can't be resolved (first three errors
on list) then the row is dropped prior to aggregation.
(2) via aggregation all the rows within a week for a track will be collapsed
to one. These aggregated locations can still trigger the other three
errors on the list.
A location id will only be assigned if the location is a valid location for the model on the associated date.
Examples
bf <- BirdFlowModels::rewbla |> add_dynamic_mask()
obs <- BirdFlowModels::rewbla_observations
a <- snap_to_birdflow(obs, bf, id_cols = "bird_id")