Download Comexstat Data from MDIC (Brazilian Ministry of Development, Industry, Commerce, and Services)
Source: R/comexstat_download_2.R
comex_download.Rd
This function downloads Comexstat (Brazilian trade statistics) data from the MDIC website for the specified years, directions (imports/exports), and types (NCM/HS4). It can also optionally download and manage auxiliary data tables. After the CSV files are downloaded and checked, the data is stored as Parquet files under a data directory to speed up later reads.
Arguments
- years
A numeric vector or integer specifying the years for which data should be downloaded. Defaults to the current year.
- directions
A character vector specifying the directions of trade: 'imp' (imports) and/or 'exp' (exports). Defaults to both.
- types
A character vector specifying the types of data: 'ncm' (Nomenclatura Comum do Mercosul) and/or 'hs4' (Harmonized System 4-digit). Defaults to both.
- cache
A logical value indicating whether to use cached files if they exist. Defaults to TRUE.
- .progress
A logical value indicating whether to display a progress bar during downloads. Defaults to TRUE.
- n_tries
The maximum number of download attempts before giving up. Defaults to 30.
- force_download_aux
A logical value indicating whether to force the download of auxiliary data tables (e.g., URF, VIA, country codes), even if they already exist in the cache. Defaults to FALSE. Auxiliary data is typically downloaded when a new trade data file is downloaded.
- timeout
The maximum time (in seconds) to wait for a download response. Defaults to 600 seconds (10 minutes).
- ...
Additional arguments to be passed to curl::multi_download, such as headers, handle, etc.
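As a hedged illustration of how the arguments above fit together, the call below requests two years of NCM import data with a smaller retry budget and a shorter timeout. The specific values are arbitrary choices for the example, not recommendations, and the guard assumes the package that provides comex_download() may or may not be attached in the current session.

```r
# Illustrative argument values only; every name used is a documented argument.
args <- list(
  years      = 2022:2023,  # two years instead of the default (current year)
  directions = "imp",      # imports only
  types      = "ncm",      # NCM-level data only
  cache      = TRUE,       # reuse files that were already downloaded
  n_tries    = 5,          # fewer attempts than the default of 30
  timeout    = 300         # seconds; the default is 600
)

# Run the download only when the function is available in this session.
if (exists("comex_download", mode = "function")) {
  do.call(comex_download, args)
}
```

Because the function returns invisible(NULL), the call is made for its side effect of writing files rather than for a return value.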
Value
invisible(NULL) if successful. The function primarily downloads data to the specified directories.
Details
This function performs the following steps:
File Structure: Ensures the necessary directories exist to store downloaded files.
URL Generation: Constructs URLs for Comexstat data files based on the specified years, directions, and types.
Download (with Retry): Downloads files using curl::multi_download with retry logic in case of failures.
Auxiliary Data: If there is new trade data or force_download_aux is TRUE, it downloads and manages auxiliary data tables.
Error Handling: Checks if any downloads failed or if the downloaded files are valid.
Write parquet files: Stores data as parquet files in order to speed up analyses.
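The first three steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's actual implementation: the base URL, the file-name pattern, and the helpers build_urls() and with_retry() are all hypothetical, and a simulated fetch function stands in for the real curl::multi_download call so the sketch runs offline.

```r
# Hypothetical base URL; the real Comexstat location is not given here.
base_url <- "https://example.org/comexstat"

# File structure: make sure a target directory exists (a temp dir in this sketch).
csv_dir <- file.path(tempdir(), "comexstat", "csv")
dir.create(csv_dir, recursive = TRUE, showWarnings = FALSE)

# URL generation: one row per year/direction/type combination.
# The file-name pattern below is an assumption made for illustration.
build_urls <- function(years, directions = c("imp", "exp"),
                       types = c("ncm", "hs4")) {
  grid <- expand.grid(year = as.integer(years), direction = directions,
                      type = types, stringsAsFactors = FALSE)
  grid$file <- sprintf("%s_%s_%d.csv", grid$direction, grid$type, grid$year)
  grid$url <- file.path(base_url, grid$type, grid$file)
  grid
}

# Download with retry: call fetch() until it succeeds or n_tries is exhausted.
with_retry <- function(fetch, n_tries = 30, wait = 0) {
  stopifnot(n_tries >= 1)
  for (attempt in seq_len(n_tries)) {
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    Sys.sleep(wait)  # pause before the next attempt
  }
  stop("download failed after ", n_tries, " attempts: ",
       conditionMessage(result))
}

# Simulated fetch that fails twice and then succeeds.
calls <- 0
flaky_fetch <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated network error")
  "ok"
}

urls <- build_urls(2022:2023, directions = "imp")
status <- with_retry(flaky_fetch, n_tries = 5)
```

Passing the fetch step in as a function keeps the retry loop independent of the transport, so the same loop could wrap a real download call.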