Subsetting and sorting observations and variables

Main goal of this exercise is to try subsetting and arranging observations and variables in the trade data in R. For doing that, use R package dplyr and data transformation related functions particularly commands:

Following the previous example, use the same working directory and download this data file to the folder 02-filter-ct-data within your working directory. This data file contains a compress subset of the WTO data for the years 2014-2016, for all countries available in the CT and for all most aggregated commodities listed in the Annex 1 of the WTO Agreement on Agriculture. We will be using this data in the current and other exercises.

To load this data into the R session use readr package and the function read_rds().

library(tidyverse)
library(readr)
ctData <- read_rds("02-filter-ct-data/subsetting-data.rds")

dplyr::filter()

Practicing with the dplyr filter() function.

Here you may need to use datasets available in the tradeAnalysis package together with the dplyr function for data filtering in order to generate the results that are.

Load tradeAnalysis package and explore built in datasets. Those are:

  1. wtoAgFood
  2. wtoAgFoodFull What are the difference between the previous one?
  3. classes
  4. reporters
  5. partners
  6. regimes
  7. units

If any of these datasets is not working try to access to the dataset using the namespace specification such as tradeAnalysis::units.

library(tradeAnalysis)
tradeAnalysis::units
## # A tibble: 13 x 3
##    Qty.Unit.Code         Unit
##            <int>        <chr>
##  1             1            -
##  2             2           m2
##  3             3     1000 kWh
##  4             4            m
##  5             5            u
##  6             6           2u
##  7             7            l
##  8             8           kg
##  9             9        1000U
## 10            10 U (jeu/pack)
## 11            11          12u
## 12            12           m3
## 13            13        carat
## # ... with 1 more variables: Unit.Description <chr>

Exercise 1.

Export a timeseries for wheat and barley (HS 1001 and 1003) import and export values only in 1000 USD from all EU countries to the world for all available years. Save the time series in an object and then in the .csv file.

To see what are the EU countries either use the full list of reporters reporters and manually record all EU countries of subtract EU country codes from the output of the function getEU() in the tradeAnalysis package.

dplyr::select()

Practicing with the dplyr select() function.

In order to make results from the exercise 1 more user friendly, we need to be able to select different set of columns for presentation using dplyr::select() and other “Select helpers” see ?everything().

Exercise 2.

Use the output of the previous exercise in order to create to objects tables:

  • one with the trade quantities expressed in 1000 MT.
  • second with the trade values expressed in 1000 USD.

dplyr::arrange()

Sometimes it is necessary to be able to arrange data in the user-friendly way. To do so, use function arrange() and desc() (to revert order withing the function arrange()). For more information see arrange().

Exercise 3.

Use the output ctSubsetQuant, to arrange all observations in the way that you can see data in the following way:

  1. one reporter, one Partner, one commodity, Export (Import), years in the reverse order (2016, 2015, 2014)
  2. one reporter, one Partner, one commodity,Import, years in the reverse order (2016, 2015, 2014)
  3. one reporter, one Partner, second commodity, Export (Import), years in the reverse order (2016, 2015, 2014)
  4. one reporter, one Partner, second commodity, Import, years in the reverse order (2016, 2015, 2014)
  5. one reporter, second Partner, one commodity, Export (Import), years in the reverse order (2016, 2015, 2014)
  6. one reporter, second Partner, one commodity, Import, years in the reverse order (2016, 2015, 2014)

dplyr::disitnct()

Sometimes it is very important to see what are the unique values existing in the data. For finding this, we can use function disitnct() from dplyr.

Exercise 4.

Find all unique partners of French (Reporter code 251) export and import of wheat and indicate, what are the differences.

Copyright © 2017 Eduard Bukin. All rights reserved.