Takes the output from Extract.wrap and cleans the data: applies labels, accounts for missing variables, and merges topical and core data. Outputs two datasets that differ in time resolution (monthly or 4-monthly).

Clean.Sipp(inpath = "~/Dropbox/research/mobility/data/SIPP",
  outpath = "~/git/migration/mig-pkg/data", TM.idx = list(p96 = c(3, 6,
  9, 12), p01 = c(3, 6, 9), p04 = c(3, 6), p08 = c(4, 7, 10)),
  agg.by = "age", verbose = TRUE)

Arguments

inpath

Path to the output from Extract.wrap. These files are called subsetxxxx.RData.

outpath

Path where the resulting dataset is saved to disk. The saved object is called merged.

TM.idx

List with one index vector per panel, giving the topical module (TM) waves to use. Name the list elements like "p96" (panel 96).

agg.by

List of variable names by which to aggregate. These should be time variables present in the dataset, such as qtr, year, or age.
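
To illustrate what aggregating by time variables means, here is a minimal sketch in base R. It is not the code used inside Clean.Sipp: the toy data, the column names (id, year, month, qtr, income), and the choice of the mean as the aggregation function are all assumptions made for illustration.

```r
# Toy monthly panel; all column names and values are hypothetical.
monthly <- data.frame(
  id     = rep(1:2, each = 6),
  year   = 2004,
  month  = rep(1:6, times = 2),
  income = c(2.0, 2.1, 2.2, 2.3, 2.4, 2.5,
             3.0, 3.0, 3.0, 3.3, 3.3, 3.3)
)

# Derive a quarter indicator from the month.
monthly$qtr <- (monthly$month - 1) %/% 3 + 1

# Stand-in for the agg.by argument: the time variables that
# define the coarser resolution.
agg.by <- c("year", "qtr")

# Average income within each person and each agg.by cell
# (using the mean is an assumption of this sketch).
quarterly <- aggregate(monthly["income"],
                       by  = monthly[c("id", agg.by)],
                       FUN = mean)
quarterly <- quarterly[order(quarterly$id, quarterly$qtr), ]
```

Swapping agg.by for "year" alone in this sketch would collapse the same data to one row per person and year.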

Value

NULL. Called for its side effect: saves two data.tables to disk (see outpath).

Details

Data are cleaned for inconsistencies across SIPP panels 1996-2008 and merged with house price indices by state. Dollar-denominated variables are deflated to 2012 as the base year using the US CPI, and all dollar values are expressed in thousands of US dollars. The SIPP can be cast at different time resolutions, i.e. you can look at monthly, quarterly, or annual data. You choose the level of aggregation by setting the argument agg.by.
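
The deflation step can be sketched as follows. This is a hedged illustration of the arithmetic only, not the function's actual code: the CPI numbers are made up (they are not real CPI figures), and the names cpi2012, wealth.nominal, and wealth.real are hypothetical.

```r
# Illustrative price index values only (NOT actual US CPI data).
cpi <- data.frame(year = c(2004, 2008, 2012),
                  cpi  = c(80, 90, 100))

# Rebase the index so that the base year 2012 equals 1.
cpi$cpi2012 <- cpi$cpi / cpi$cpi[cpi$year == 2012]

# Hypothetical nominal dollar amounts by year.
dat <- data.frame(year           = c(2004, 2008, 2012),
                  wealth.nominal = c(40000, 45000, 50000))

# Attach the rebased index to each observation by year.
dat <- merge(dat, cpi[c("year", "cpi2012")], by = "year")

# Deflate to 2012 dollars and express in thousands.
dat$wealth.real <- dat$wealth.nominal / dat$cpi2012 / 1000
```

With these made-up numbers, each nominal amount corresponds to 50 (thousand 2012 dollars), which shows how values from different years become comparable after deflation.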