Using data science techniques to identify self-harm from free text
Previous work using CRIS created a large dataset of Emergency Department attendances following self-harm (CRIS project 14-026). To build it, human coders read the free-text assessments written by mental health staff in A&E and decided whether the reason for each attendance was self-harm. Self-harm is an important adverse outcome in mental health, and the dataset has been useful for several research projects; however, the process of creating it was very time-consuming. This project aims to use the large dataset of coded free text to develop data science techniques that can automatically detect the presence and type of self-harm in similar documents.