Date of Award


Degree Type


Degree Name

Master of Science (MS)


Computer Science


Substance misuse is a major problem worldwide. In 2014, drug overdoses caused 52,404 deaths in the US. In 2001, the monetary cost of drug misuse was estimated at 414 billion dollars. In this work, we explore the use of different machine learning algorithms to predict cocaine misuse from structured and unstructured data found in electronic health records. These records contain various attributes that can help with this prediction, including but not limited to chart text, previous diagnoses of certain diseases, and information about the area where the patient lives. Traditional models that use only one kind of data are compared with ensembles and neural networks. Finally, the models are evaluated using the area under the precision-recall curve (PR AUC).
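To illustrate the evaluation metric, the following is a minimal sketch of one common PR AUC estimator, average precision, which sums precision at each rank where a positive example is retrieved. The abstract does not say which estimator the thesis used, so this choice is an assumption, as are the function name `average_precision` and the toy labels and scores. The sketch also assumes binary 0/1 labels and distinct scores, with no special handling of ties.

```python
def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (average precision).

    y_true  : list of 0/1 labels (1 = positive class, e.g. misuse)
    y_score : list of model scores, higher means more likely positive
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank examples from highest to lowest score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)

    true_pos = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            # Precision at the rank where this positive is retrieved.
            precision_sum += true_pos / rank

    # Each positive contributes an equal recall step of 1 / total_pos.
    return precision_sum / total_pos


# Hypothetical toy data, not taken from the thesis.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(round(average_precision(labels, scores), 4))
```

PR AUC is often preferred over ROC AUC when the positive class is rare, because precision is computed only over the examples the model flags as positive and so is not inflated by the large number of true negatives.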

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.