This function computes rolling sums over a specified window for selected columns in a Comex (Brazilian trade) dataset. It operates on columns that start with the specified prefixes.

Usage

comex_roll(
  data,
  x = c("qt_stat", "kg_net", "fob_", "freight_", "insurance_", "cif_"),
  k = 12
)

Arguments

data

A data frame or tibble containing Comex data, with a date column.

x

A character vector of column-name prefixes. Rolling sums are calculated for every column whose name starts with one of these prefixes. Defaults to c("qt_stat", "kg_net", "fob_", "freight_", "insurance_", "cif_"), which covers the statistical quantity, net weight, and the FOB, freight, insurance, and CIF value columns.

k

An integer specifying the window size (in months) for the rolling sum calculation. Defaults to 12.

Value

The input data with one new column added for each selected column. New columns are named original_col_name_k, where k is the window size (for example, qt_stat_2 when k = 2), and contain the rolling sums of the corresponding original columns. Rows with incomplete windows (fewer than k months of data) have NA in the new columns.
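The prefix matching and naming rule described above can be sketched in base R. This is only an illustration, not the package's internal code: the variable names (cols, prefixes, is_match, selected, new_names) are hypothetical, and the column names are borrowed from the example data further down.

```r
# Hypothetical illustration of prefix selection and output naming;
# not the internal code of comex_roll().
cols <- c("date", "direction", "qt_stat", "fob_usd")
prefixes <- c("qt_", "fob_")
k <- 2

# A column is selected when its name starts with any of the prefixes
is_match <- vapply(cols, function(nm) any(startsWith(nm, prefixes)), logical(1))
selected <- cols[is_match]

# New columns take the original name plus "_<k>"
new_names <- paste0(selected, "_", k)
new_names
#> [1] "qt_stat_2" "fob_usd_2"
```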

Details

This function uses the slider package's slide_index_dbl function to efficiently calculate rolling sums.

The rolling sum for each date is the sum of the values from the current month back to k-1 months earlier, so each window spans k months. Because the .complete argument of slide_index_dbl is set to TRUE, a rolling sum is returned only for dates whose full k-month window (the current month plus the k-1 preceding months) lies within the data. Rows with incomplete windows have NA values.
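A minimal sketch of this windowing on a single series is shown below. It assumes the window is expressed as .before = months(k - 1) with a lubridate period, which is an assumption made for illustration; the actual call inside comex_roll() may differ. The values are the qt_stat figures for the "exp" rows in the example output below.

```r
library(slider)
library(lubridate)

# Monthly index and values for one group ("exp" rows of the example below)
dates <- seq(from = ymd("2023-01-01"), by = "month", length.out = 3)
values <- c(111, 101, 104)
k <- 2

# Assumed form of the call: sum the current month and the k - 1 months
# before it; .complete = TRUE yields NA when the window would start
# before the first date in the index.
roll <- slide_index_dbl(
  .x = values,
  .i = dates,
  .f = sum,
  .before = months(k - 1),
  .complete = TRUE
)
roll
#> [1]  NA 212 205
```

January has no preceding month in the data, so its window is incomplete and the result is NA; February and March each sum two months.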

Examples

# Create sample Comex data
set.seed(123)
library(lubridate)
#> 
#> Attaching package: ‘lubridate’
#> The following objects are masked from ‘package:base’:
#> 
#>     date, intersect, setdiff, union
library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
comex_data <- tibble::tibble(
  date = rep(seq(from = ymd("2023-01-01"), to = ymd("2023-12-01"), by = "month"), each = 2),
  direction = rep(c("imp", "exp"), 12),
  qt_stat = rpois(24, lambda = 100),
  fob_usd = runif(24, min = 500, max = 2000)
)
# Default prefixes and a 12-month window, grouped by direction:
rolled_data <- comex_data %>%
  group_by(direction) %>%
  comex_roll()

# 2-month rolling sums for columns starting with "qt_" and "fob_", grouped by direction:
rolled_data <- comex_data %>%
  group_by(direction) %>%
  comex_roll(x = c("qt_", "fob_"), k = 2)
rolled_data %>%
  arrange(direction, date) %>%
  filter(date <= "2023-03-01")
#> # A tibble: 6 × 6
#> # Groups:   direction [2]
#>   date       direction qt_stat fob_usd qt_stat_2 fob_usd_2
#>   <date>     <chr>       <int>   <dbl>     <dbl>     <dbl>
#> 1 2023-01-01 exp           111   1498.        NA       NA 
#> 2 2023-02-01 exp           101   1076.       212     2574.
#> 3 2023-03-01 exp           104   1722.       205     2798.
#> 4 2023-01-01 imp            94   1062.        NA       NA 
#> 5 2023-02-01 imp            83    642.       177     1704.
#> 6 2023-03-01 imp           117    912.       200     1554.