Takes the output from Extract.wrap
and cleans the data: applies labels, accounts for
missing variables, and merges topical and core data.
Outputs two datasets that differ in time
resolution (monthly or 4-monthly).
Clean.Sipp(inpath = "~/Dropbox/research/mobility/data/SIPP", outpath = "~/git/migration/mig-pkg/data", TM.idx = list(p96 = c(3, 6, 9, 12), p01 = c(3, 6, 9), p04 = c(3, 6), p08 = c(4, 7, 10)), agg.by = "age", verbose = TRUE)
inpath | to output from |
---|---|
outpath | to save resulting dataset to disk. Object is called |
TM.idx | list with one index vector of Topic Module (TM) waves to use per panel. Name list elements like "p96" [panel 96] |
agg.by | list of variable names by which to aggregate. These should be time variables present in the dataset, such as qtr, year, or age |
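
As a small illustration of the TM.idx argument, the default value from the usage line can be built and inspected as an ordinary named list. This is only a sketch of the expected shape; the name check is an assumption about the "p" plus two-digit year naming pattern, not something the function is known to enforce.

```r
# One vector of Topic Module waves per panel, copied from the usage line.
TM.idx <- list(p96 = c(3, 6, 9, 12),
               p01 = c(3, 6, 9),
               p04 = c(3, 6),
               p08 = c(4, 7, 10))

# Assumed naming pattern: "p" followed by the two-digit panel year.
stopifnot(all(grepl("^p[0-9]{2}$", names(TM.idx))))

# Waves selected for the 2004 panel.
TM.idx[["p04"]]
```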
Returns NULL. Saves two data.tables to Dropbox.
Data is cleaned for inconsistencies across
SIPP panels 1996-2008, merged with house price
indices by state, and dollar-denominated variables
are deflated to 2012 as the base year using the
US CPI. All dollar values are denoted in 1000s of
US dollars. The SIPP can be cast at different
time resolutions, i.e. you can look at monthly,
quarterly, annual data, etc. You choose the level of
aggregation by setting the argument agg.by.
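
The deflation and aggregation steps described above can be sketched on toy data. This is not the code inside Clean.Sipp: the data frame, its column names, and the CPI levels are made up for illustration, base R is used instead of data.table to keep the example dependency-free, and averaging within each group is an assumption about how values are combined.

```r
# Toy monthly records; all values are hypothetical.
toy <- data.frame(
  year   = c(2010, 2010, 2012, 2012),
  age    = c(30, 30, 32, 32),
  income = c(3000, 3200, 3500, 3700)  # nominal US dollars
)

# Hypothetical CPI levels by year (not actual CPI figures).
cpi <- c("2010" = 95, "2012" = 100)

# Deflate to 2012 dollars, then express in 1000s of dollars.
cpi_year <- unname(cpi[as.character(toy$year)])
toy$real_income <- toy$income * cpi[["2012"]] / cpi_year / 1000

# Aggregate to the resolution named in agg.by (here: one row per age).
agg.by <- "age"
out <- aggregate(toy["real_income"], by = toy[agg.by], FUN = mean)
out
```

Passing a different time variable in agg.by, for example "year", changes the grouping without touching the deflation step.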