---
title: "Introduction"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
```{r setup}
library(ecdata)
```
The Executive Communications Dataset (ECD) is a dataset of executive communications across 41 different countries. The `ecdata` package is a minimal package for downloading data from the ECD repositories. It includes caching and data dictionaries.
## `load_ecd`
The default function for loading the ECD is `load_ecd`. It downloads data from our repositories and loads it into memory. You can load the full ECD by calling `load_ecd(full_ecd = TRUE)`. This can take a while because you are downloading a 1.9 GB parquet file.
```{r load-full-ecd, eval = FALSE}
full_ecd = load_ecd(full_ecd = TRUE)
```
If you want a specific country or set of countries, pass a character vector to the `country` argument.
```{r country-example, eval = FALSE}
load_ecd(country = 'Greece')
```
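
Since `country` accepts a character vector, several countries can be requested in one call. The chunk below is a minimal sketch: both country names are taken from the examples in this vignette, and it assumes the combined result comes back as a single object, which is not shown here.

```{r multi-country-example, eval = FALSE}
# Request two countries at once by passing a character vector.
# Assumption: the result for both countries is returned together.
greece_nigeria = load_ecd(country = c('Greece', 'Nigeria'))
```
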
The `country` argument tolerates some typos, common abbreviations, and common country names. If you want to load data based on the language of the statement, provide a character string or character vector of languages to the `language` argument.
```{r lang-example, eval=FALSE}
english = load_ecd(language = 'English')
polyglot = load_ecd(language = c('French', 'Italian', 'Korean'))
```
For a full list of accepted country names and abbreviations, inspect `ecd_country_dictionary`.
```{r}
ecd_country_dictionary |>
  head()
```
Note that download and load times vary considerably because the files differ in size.
## `lazy_load_ecd`
We also have a "lazy" option, which downloads the files and then uses `arrow::open_dataset` to open the dataset without reading it into memory.
```{r eval = FALSE}
nigeria = lazy_load_ecd(country = 'Nigeria')
```
To bring the dataset into memory, call `dplyr::collect()`:
```{r eval = FALSE}
nigeria |>
  dplyr::collect()
```
Lazy loading has some speed benefits when data wrangling. One thing to be aware of: if you have lazy loaded a dataset previously, a new call may bring in additional files from that earlier download. To prevent this behavior, run:
```{r eval = FALSE}
clear_cache()
```
Then restart your R session.
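
To illustrate where the speed benefit of lazy loading comes from, here is a small sketch that does not touch the ECD at all. It writes the built-in `mtcars` data to a temporary parquet file as a stand-in, opens it with `arrow::open_dataset`, and only reads rows when `dplyr::collect()` is called. The directory name `lazy-demo` and the object names are illustrative assumptions, not part of `ecdata`.

```{r lazy-sketch, eval = FALSE}
library(arrow)
library(dplyr)

# Stand-in data: write a built-in data frame to a temporary parquet file
parquet_dir = file.path(tempdir(), 'lazy-demo')
dir.create(parquet_dir, showWarnings = FALSE)
write_parquet(mtcars, file.path(parquet_dir, 'mtcars.parquet'))

# Open the directory lazily: no rows are read into memory yet
cars_lazy = open_dataset(parquet_dir)

# Build the query first; the work happens only at collect()
four_cyl = cars_lazy |>
  filter(cyl == 4) |>
  select(mpg, cyl, wt) |>
  collect()

nrow(four_cyl)
```

Filtering and selecting before `collect()` means only the matching rows and columns are brought into memory. Note also that `arrow::open_dataset` reads every file in the directory it is pointed at, which is one plausible reason leftover files from an earlier lazy load could show up in a later one.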