Differentially Private Synthetic Medical Data Generation using Convolutional GANs
Deep learning models have demonstrated superior performance in several application domains, such as image classification and speech processing. However, building a deep learning model from health record data raises privacy challenges that are of particular concern to researchers working in this domain. One effective way to handle such private data is to generate realistic synthetic data whose quality, and the performance of models trained on it, is acceptable in practice. To this end, we develop a differentially private framework for synthetic data generation using Rényi differential privacy. Our approach builds on convolutional autoencoders and convolutional generative adversarial networks so that the generated synthetic data preserves some of the critical characteristics of the original data. Our model can also capture the temporal information and feature correlations that might be present in the original data. Using several publicly available benchmark medical datasets, we demonstrate that our model outperforms existing state-of-the-art models under the same privacy budget in both supervised and unsupervised settings.
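To make the notion of a privacy budget under Rényi differential privacy more concrete, the following minimal sketch tracks the budget of a repeated Gaussian mechanism and converts it to an (ε, δ) guarantee. It is an illustration under stated assumptions, not the accounting used in the paper: it ignores any amplification from subsampling, and the function names, the noise scale `sigma`, the number of steps, and the grid of orders `alphas` are all hypothetical choices. It relies only on two standard results: a Gaussian mechanism with sensitivity Δ and noise standard deviation σ satisfies (α, αΔ²/(2σ²))-RDP, and (α, ε)-RDP implies (ε + log(1/δ)/(α − 1), δ)-DP.

```python
import math


def gaussian_rdp(alpha, sigma, sensitivity=1.0):
    """RDP epsilon at order alpha for one Gaussian mechanism release.

    sigma is the standard deviation of the added noise (hypothetical name).
    """
    return alpha * sensitivity ** 2 / (2.0 * sigma ** 2)


def composed_rdp(alpha, sigma, steps, sensitivity=1.0):
    """RDP composes additively at a fixed order alpha (no subsampling assumed)."""
    return steps * gaussian_rdp(alpha, sigma, sensitivity)


def rdp_to_dp(alpha, rdp_eps, delta):
    """Convert an (alpha, rdp_eps)-RDP guarantee into an (eps, delta)-DP epsilon."""
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)


def best_epsilon(sigma, steps, delta, alphas):
    """Return (eps, alpha) with the smallest eps over the candidate orders."""
    return min(
        (rdp_to_dp(a, composed_rdp(a, sigma, steps), delta), a) for a in alphas
    )


if __name__ == "__main__":
    # Illustrative values only; none of these come from the paper.
    alphas = [1.5, 2, 3, 4, 8, 16, 32, 64]
    eps, alpha = best_epsilon(sigma=5.0, steps=100, delta=1e-5, alphas=alphas)
    print(f"epsilon = {eps:.3f} at order alpha = {alpha}")
```

Searching over several orders matters because the conversion term log(1/δ)/(α − 1) shrinks as α grows while the composed RDP term grows, so the tightest ε usually sits at an intermediate order.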
- Date of publication: December 22, 2020
- Cornell University
- Publication note: Amirsina Torfi, Edward A. Fox, Chandan K. Reddy: Differentially Private Synthetic Medical Data Generation using Convolutional GANs. CoRR abs/2012.11774 (2020)