Document Type

Working Paper

Publication Date

Winter 12-19-2022

Publication Title

GLO Discussion Paper

Abstract

Popular approaches to building data from unstructured text come with limitations in scalability, interpretability, replicability, and real-world applicability. These can be overcome with Context Rule Assisted Machine Learning (CRAML), a method and no-code suite of software tools that builds structured, labeled datasets that are accurate and reproducible. CRAML enables domain experts to access uncommon constructs within a document corpus in a low-resource, transparent, and flexible manner. CRAML produces document-level datasets for quantitative research and makes qualitative classification schemes scalable over large volumes of text. We demonstrate that the method is useful for bibliographic analysis, transparent analysis of proprietary data, and expert classification of any documents with any scheme. To demonstrate this process for building data from text with Machine Learning, we publish open-source resources: the software, a new public document corpus, and a replicable analysis to build an interpretable classifier of suspected “no poach” clauses in franchise documents.

Comments

https://glabor.org/platform/discussion-papers/

Recommended Citation

Meisenbacher, Stephen; Norlander, Peter (2022): Creating Data from Unstructured Text with Context Rule Assisted Machine Learning (CRAML), GLO Discussion Paper, No. 1214, Global Labor Organization (GLO), Essen

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.

Included in

Artificial Intelligence and Robotics Commons, Business Commons, Cataloging and Metadata Commons, Computational Linguistics Commons, Databases and Information Systems Commons, Labor and Employment Law Commons

School of Business: Faculty Publications and Other Works

Creating Data from Unstructured Text with Context Rule Assisted Machine Learning (CRAML)
