This function computes rolling sums over a specified window for selected columns in a Comex (Brazilian trade) dataset. It operates on columns that start with the specified prefixes.
Usage
comex_roll(
  data,
  x = c("qt_stat", "kg_net", "fob_", "freight_", "insurance_", "cif_"),
  k = 12
)
Arguments
- data
A data frame or tibble containing Comex data, with a date column.
- x
A character vector specifying the prefixes of the column names for which to calculate rolling sums. Defaults to c('qt_stat', 'kg_net', 'fob_', 'freight_', 'insurance_', 'cif_'), which captures common columns related to quantities, weights, and various costs.
- k
An integer specifying the window size (in months) for the rolling sum calculation. Defaults to 12.
Value
A modified version of the input data, with one new column added for each selected column.
The new column names have the format 'original_col_name_k', where k is the window size.
These new columns contain the rolling sums for the corresponding original columns. Rows with incomplete windows (fewer than k months of data) will have NA values in the new columns.
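The prefix matching and naming rule described above can be illustrated with a small base R sketch. It is not the package's implementation; the vectors `cols` and `prefixes` are hypothetical names used only for illustration.

```r
# Hypothetical column names of a Comex-like data frame (illustration only)
cols <- c("date", "direction", "qt_stat", "fob_usd", "cif_usd")

# Prefixes to match and window size, mirroring the x and k arguments
prefixes <- c("qt_stat", "fob_")
k <- 12

# Keep every column whose name starts with at least one of the prefixes
matches <- Reduce(`|`, lapply(prefixes, function(p) startsWith(cols, p)))
selected <- cols[matches]

# New columns follow the 'original_col_name_k' pattern
new_names <- paste0(selected, "_", k)
new_names
```

Here `cif_usd` is left out because no prefix in `prefixes` matches it, while `qt_stat` and `fob_usd` each get a companion column name ending in `_12`.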
Details
This function uses the slider package's slide_index_dbl function to efficiently calculate rolling sums.
The rolling sum for each date is calculated by summing the values from the current date back to k-1 months prior.
Since the .complete argument in slide_index_dbl is set to TRUE, the function only calculates rolling sums for dates whose full k-month window (the current month plus the k-1 preceding months) lies within the available data. Rows with incomplete windows will have NA values.
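The windowing behaviour described above can be reproduced on a single vector with a minimal sketch. It assumes the look-back is passed to slide_index_dbl as a lubridate period of k - 1 months; how comex_roll builds the window internally may differ, and `dates`, `values`, and `rolled` are illustrative names.

```r
library(slider)
library(lubridate)

# Six monthly observations (illustrative data)
dates  <- seq(ymd("2023-01-01"), ymd("2023-06-01"), by = "month")
values <- c(10, 20, 30, 40, 50, 60)
k <- 3

# Sum the current month plus the k - 1 preceding months.
# .complete = TRUE returns NA when the window would start before the first date.
rolled <- slide_index_dbl(
  .x = values,
  .i = dates,
  .f = sum,
  .before = months(k - 1),
  .complete = TRUE
)
rolled
```

With k = 3, the first two months have no complete window and come back as NA; from March onward each value is the sum of three consecutive months.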
Examples
# Create sample Comex data
set.seed(123)
library(lubridate)
#>
#> Attaching package: ‘lubridate’
#> The following objects are masked from ‘package:base’:
#>
#> date, intersect, setdiff, union
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
comex_data <- tibble::tibble(
  date = rep(seq(from = ymd('2023-01-01'), to = ymd('2023-12-01'), by = 'month'), each = 2),
  direction = rep(c("imp", "exp"), 12),
  qt_stat = rpois(24, lambda = 100),
  fob_usd = runif(24, min = 500, max = 2000)
)
# Example usage with default prefixes and window size of 12 months, grouped by direction:
rolled_data <- comex_data %>% group_by(direction) %>% comex_roll()
# Calculate 2-month rolling sums for columns starting with 'qt_' and 'fob_', grouped by direction:
rolled_data <- comex_roll(comex_data %>% group_by(direction), x = c('qt_', 'fob_'), k = 2)
rolled_data %>% arrange(direction, date) %>% filter(date <= "2023-03-01")
#> # A tibble: 6 × 6
#> # Groups: direction [2]
#> date direction qt_stat fob_usd qt_stat_2 fob_usd_2
#> <date> <chr> <int> <dbl> <dbl> <dbl>
#> 1 2023-01-01 exp 111 1498. NA NA
#> 2 2023-02-01 exp 101 1076. 212 2574.
#> 3 2023-03-01 exp 104 1722. 205 2798.
#> 4 2023-01-01 imp 94 1062. NA NA
#> 5 2023-02-01 imp 83 642. 177 1704.
#> 6 2023-03-01 imp 117 912. 200 1554.