Investigating The Reproducibility of NPM Packages

Danfeng (Daphne) Yao

Abstract

Node.js is widely used for web application development, partly because of its large software ecosystem of NPM (Node Package Manager) packages. When using open-source NPM packages, most developers download prebuilt packages from npmjs.com instead of building them from the available source, and implicitly trust what they download. However, it is unknown whether these blindly trusted prebuilt NPM packages are reproducible (i.e., whether there is always a verifiable path from source code to any published NPM package). Therefore, in this paper, we conducted an empirical study to examine the reproducibility of NPM packages and to understand why some packages are not reproducible. Specifically, we downloaded versions/releases of the 226 most popular NPM packages and then built each version from the available source on GitHub. Next, we applied a differencing tool to compare the versions we built against the versions downloaded from NPM, and further inspected any reported difference. Among the 3,390 versions of the 226 packages, only 2,087 are reproducible. Based on our manual analysis, multiple factors contribute to non-reproducibility, such as flexible versioning information in the package.json file and divergent behavior between distinct versions of the tools used in the build process. Our investigation reveals the challenges of verifying NPM reproducibility with existing tools, and provides insights for future verifiable build procedures.
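
The comparison step described in the abstract can be illustrated with a minimal sketch. This is not the differencing tool used in the study, which the abstract does not name. It assumes that the locally built package and the package downloaded from NPM have already been unpacked into two directories, and it treats a version as reproducible only when both directories contain the same files with byte-identical contents. The function names hash_tree, compare_packages, and is_reproducible are hypothetical.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 digest of its bytes."""
    digests: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_packages(built: Path, published: Path) -> dict[str, list[str]]:
    """Report files present on only one side, and files whose contents differ."""
    built_digests = hash_tree(built)
    published_digests = hash_tree(published)
    shared = built_digests.keys() & published_digests.keys()
    return {
        "only_in_built": sorted(built_digests.keys() - published_digests.keys()),
        "only_in_published": sorted(published_digests.keys() - built_digests.keys()),
        "different": sorted(
            name for name in shared
            if built_digests[name] != published_digests[name]
        ),
    }


def is_reproducible(built: Path, published: Path) -> bool:
    """Assumed criterion: no missing, extra, or differing files at all."""
    return not any(compare_packages(built, published).values())
```

A byte-level criterion like this one is deliberately strict: a difference reported here still needs the kind of manual inspection the abstract describes before it can be attributed to a cause such as a tool version change.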

Publication Details

Date of publication: November 19, 2020

Conference: IEEE International Conference on Software Maintenance and Evolution (ICSME)

Page number(s): 677-681

Publication Note: Pronnoy Goswami, Saksham Gupta, Zhiyuan Li, Na Meng, Danfeng Daphne Yao: Investigating The Reproducibility of NPM Packages. ICSME 2020: 677-681