A Comparison of Machine Learning Techniques for Taxonomic Classification of Teeth from the Family Bovidae
Document Type
Article
Publication Date
2018
Publication Title
Journal of Applied Statistics
Abstract
This study explores the performance of modern, accurate machine learning algorithms on the classification of fossil teeth in the Family Bovidae. Isolated bovid teeth are typically the most common fossils found in southern Africa and they often constitute the basis for paleoenvironmental reconstructions. Taxonomic identification of fossil bovid teeth, however, is often imprecise and subjective. Using modern teeth with known taxa, machine learning algorithms can be trained to classify fossils. Previous work by Brophy et al. (2014) uses elliptical Fourier analysis of the form (size and shape) of the outline of the occlusal surface of each tooth as features in a linear discriminant analysis framework. This manuscript expands on that previous work by exploring how different machine learning approaches classify the teeth and testing which technique is best for classification. Five machine learning techniques (linear discriminant analysis, neural networks, nuclear penalized multinomial regression, random forests, and support vector machines) were used to build the classification models. Support vector machines and random forests perform the best in terms of both log-loss and misclassification rate; both of these methods are improvements over linear discriminant analysis. With the identification and application of these superior methods, bovid teeth can be classified with higher accuracy.
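The two evaluation criteria named in the abstract, log-loss and misclassification rate, can be illustrated with a minimal sketch. It uses the standard definitions of both metrics and is not taken from the authors' code; the function names, taxon labels, and probabilities below are hypothetical and chosen only for illustration.

```python
import math


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # Clip so that a probability of exactly 0 or 1 does not produce log(0).
        p = min(max(probs.get(label, 0.0), eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)


def misclassification_rate(true_labels, predicted_probs):
    """Fraction of cases whose most probable class is not the true class."""
    errors = 0
    for label, probs in zip(true_labels, predicted_probs):
        predicted = max(probs, key=probs.get)
        if predicted != label:
            errors += 1
    return errors / len(true_labels)


# Hypothetical data: four teeth, three made-up taxon labels.
true_labels = ["taxon_a", "taxon_b", "taxon_a", "taxon_c"]
predicted_probs = [
    {"taxon_a": 0.7, "taxon_b": 0.2, "taxon_c": 0.1},
    {"taxon_a": 0.1, "taxon_b": 0.8, "taxon_c": 0.1},
    {"taxon_a": 0.3, "taxon_b": 0.5, "taxon_c": 0.2},  # misclassified case
    {"taxon_a": 0.2, "taxon_b": 0.2, "taxon_c": 0.6},
]

print("log-loss:", log_loss(true_labels, predicted_probs))
print("misclassification rate:", misclassification_rate(true_labels, predicted_probs))
```

Lower values are better for both metrics. Log-loss uses the full predicted probabilities and so penalizes confident wrong predictions heavily, whereas the misclassification rate only counts whether the top-ranked class is correct.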
Identifier
10.1080/02664763.2018.1441381
Recommended Citation
G. J. Matthews, J. K. Brophy, M. P. Luetkemeier, H. Gua, and G. K. Thiruvathukal, A comparison of machine learning techniques for taxonomic classification of teeth from the Family Bovidae, Journal of Applied Statistics (2018), https://doi.org/10.1080/02664763.2018.1441381
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.
Comments
This is an Accepted Manuscript of an article published by Taylor & Francis in the Journal of Applied Statistics in 2018, available online at doi.org/10.1080/02664763.2018.1441381.
The accepted manuscript is available from arxiv.org/abs/1802.05778.