Computer Science: Faculty Publications and Other Works

PeaTMOSS: Mining Pre-Trained Models in Open-Source Software

Document Type

Data Set

Publication Date

10-5-2023

Abstract

Developing and training deep learning models is expensive, so software engineers have begun to reuse pre-trained deep learning models (PTMs) and fine-tune them for downstream tasks. Despite the widespread use of PTMs, we know little about the corresponding software engineering behaviors and challenges. To enable the study of software engineering with PTMs, we present the PeaTMOSS dataset: Pre-Trained Models in Open-Source Software. PeaTMOSS has three parts: a snapshot of (1) 281,638 PTMs, (2) 27,270 open-source software repositories that use PTMs, and (3) a mapping between PTMs and the projects that use them. We challenge PeaTMOSS miners to discover software engineering practices around PTMs. A demo and link to the full dataset are available at: https://github.com/PurdueDualityLab/PeaTMOSS-Demos.

Identifier

arXiv:2310.03620

Comments

https://github.com/PurdueDualityLab/PeaTMOSS-Demos

Recommended Citation

Jiang, W., Jones, J., Yasmin, J., Synovic, N., Sashti, R., Chen, S., Thiruvathukal, G.K., Tian, Y., & Davis, J.C. (2023). PeaTMOSS: Mining Pre-Trained Models in Open-Source Software. arXiv:2310.03620.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Included in

Artificial Intelligence and Robotics Commons, Software Engineering Commons
