THE EUROPEAN MASTER IN OFFICIAL STATISTICS
Alexander Kowarik, Johannes Gussenbauer & Bernhard Meindl @Statistics Austria
In this EMOS webinar, Alexander Kowarik, Mag. Bernhard Meindl and DI Johannes Gussenbauer presented the basic workflow in R for applying statistical disclosure control methods to tabular data and microdata with the packages sdcTable, cellKey and sdcMicro. They also showed how to generate a synthetic data set from samples, census microdata and/or marginal tables with the R package simPop.
The speakers began by discussing the importance of statistical disclosure control (SDC) in protecting the confidentiality of sensitive data. They then introduced important SDC methods and showed how to implement them in R using the sdcTable, cellKey and sdcMicro packages. Afterwards, the topic shifted to synthetic data generation, an SDC method that creates a new data set that is statistically similar to the original data but does not contain any identifiable information. The webinar closed with a demonstration of how to generate synthetic data from samples, census microdata and/or marginal tables using the R package simPop.
This webinar is intended for researchers and practitioners who are interested in learning more about SDC and synthetic data generation in R.
Webinar details
The webinar aims to provide researchers and practitioners with an overview of statistical disclosure control (SDC) and synthetic data generation in R.
Prerequisites and further reading:
- Basic knowledge of R.
- https://cran.r-project.org/web/packages/cellKey/vignettes/introduction.html
- https://cran.r-project.org/web/packages/sdcTable/vignettes/sdcTable.html
- https://cran.r-project.org/web/packages/sdcMicro/vignettes/recordSwapping.html
- https://cran.r-project.org/web/packages/sdcMicro/vignettes/sdcMicro.html