Towards Shareable Data in Clinical Natural Language Processing: Generating Synthetic Electronic Health Records
Natural language processing has the potential to significantly improve healthcare research by uncovering information stored in unstructured clinical text; however, privacy concerns surround many clinical databases and make them difficult to share. We aim to investigate the extent to which we can generate wholly artificial clinical databases that would be anonymous by nature and could be readily used to develop natural language processing methods.