Presentations by Cornell University NCRN node

This collection contains presentations made by Cornell NCRN node members at various conferences and meetings.

Recent Submissions

Now showing 1 - 10 of 38
  • Item
    Reproducibility Confidentiality Data Access
    Vilhuber, Lars (2018-05-01)
    Talk on Replicability given by Lars Vilhuber.
  • Item
    Large-scale Data Linkage from Multiple Sources: Methodology and Research Challenges
    Abowd, John M. (2017-10-27)
    Presentation on methods for record linkage
  • Item
    Confidentiality Protection and Physical Safeguards (LatAm version)
    Vilhuber, Lars (2017-06-07)
    Confidentiality protection is a multi-layered concept, involving statistical (cryptographic) methods and physical safeguards. When providing access to researchers (both internal to the agency and external academic), a tension arises between the level of trust vis-à-vis the researcher, the statistical disclosure limitation applied to the data visible to the researcher, and the physical access mechanisms used by the researcher. In this presentation, I (attempt to) review systems used by national and private research organizations around the world, putting them into the relevant legal and societal context.
  • Item
    Two Perspectives on Commuting: A Comparison of Home to Work Flows Across Job-Linked Survey and Administrative Files
    Green, Andrew S.; Kutzbach, Mark J.; Vilhuber, Lars (2017-04)
    Commuting flows and workplace employment data have a wide constituency of users including urban and regional planners, social science and transportation researchers, and businesses. The U.S. Census Bureau releases two national data products that give the magnitude and characteristics of home to work flows. The American Community Survey (ACS) tabulates households’ responses on employment, workplace, and commuting behavior. The Longitudinal Employer-Household Dynamics (LEHD) program tabulates administrative records on jobs in the LEHD Origin-Destination Employment Statistics (LODES). Design differences across the datasets lead to divergence in a comparable statistic: county-to-county aggregate commute flows. To understand differences in the public use data, this study compares ACS and LEHD source files, using identifying information and probabilistic matching to join person and job records. In our assessment, we compare commuting statistics for job frames linked on person, employment status, employer, and workplace, and we identify person and job characteristics as well as design features of the data frames that explain aggregate differences. We find a lower rate of within-county commuting and farther commutes in LODES. We attribute these greater distances to differences in workplace reporting and to uncertainty of establishment assignments in LEHD for workers at multi-unit employers. Minor contributing factors include differences in residence location and ACS workplace edits. The results of this analysis and the data infrastructure developed will support further work to understand and enhance commuting statistics in both datasets.
  • Item
    Excerpt: Usage and outcomes of the Synthetic Data Server
    Vilhuber, Lars; Abowd, John M. (2017-05-09)
    This is an excerpt from a prior presentation at the Society of Labor Economists (2016). The Synthetic Data Server (SDS) at Cornell University was set up to provide early access to new synthetic data products by the U.S. Census Bureau. These datasets are made available to interested researchers in a controlled environment, prior to a more generalized release. Over the past 5 years, 4 synthetic datasets were made available on the server, and over 100 users have accessed the server over that time period. This paper reports on interim outcomes of the activity: results of validation requests from a user perspective, functioning of the feedback loop due to validation and user input, and the role of the SDS as an access gateway to and educational tool for other mechanisms of accessing detailed person, household, establishment, and firm statistics.
  • Item
    Confidentiality of the SynLBD
    Vilhuber, Lars; Kinney, Saki (2017-05-09)
    We describe the confidentiality protection provided by the SynLBD. The presentation was originally prepared by Saki Kinney for the World Statistics Congress 2013.
  • Item
    SynLBD Inputs: Structure, Example
    Vilhuber, Lars; Drechsler, Jörg (2017-05-09)
    We describe the structure of inputs for the SynLBD, and discuss challenges in preparing them.
  • Item
    Overview: Synthetic Longitudinal Business Data International User Seminar
    Vilhuber, Lars; Kinney, Saki (2017-05-09)
    An overview of the content of the Synthetic Longitudinal Business Data International User Seminar, based in part on a presentation prepared by Saki Kinney for the 2013 World Statistics Congress (WSC2013).
  • Item
    Confidentiality Protection and Physical Safeguards
    Vilhuber, Lars (2017-02-09)
    Confidentiality protection is a multi-layered concept, involving statistical (cryptographic) methods and physical safeguards. When providing access to researchers (both internal to the agency and external academic), a tension arises between the level of trust vis-à-vis the researcher, the statistical disclosure limitation applied to the data visible to the researcher, and the physical access mechanisms used by the researcher. In this presentation, I (attempt to) review systems used by national and private research organizations around the world, putting them into the relevant legal and societal context.