This is a list of important publications in data science, generally organized by order of use in a data analysis workflow.
See the list of important publications in statistics for more research-based and fundamental publications; this list is more applied, business-oriented, and cross-disciplinary.
General article inclusion criteria are:
Some reasons why a particular publication might be regarded as important:
When possible, a reference is used to validate the inclusion of the publication in this list.
Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author)
50 Years of Data Science
The Composable Data Management System Manifesto
Tidy Data
Data Organization in Spreadsheets
Quantitative Graphics in Statistics: A Brief History
Hidden Technical Debt in Machine Learning Systems
A few useful things to know about machine learning
The Introductory Statistics Course: A Ptolemaic Curriculum