From: lexfridman

Whether machine learning is simply a form of computational statistics or a distinct discipline with its own features is a recurring question in computer science and statistics.

The Disagreement

In a conversation with Charles Isbell, Dean of the College of Computing at Georgia Tech, and Michael Littman, a computer science professor at Brown University, the question arose of how machine learning relates to computational statistics in nature and scope. Littman and Isbell explored the relationship between the two areas, highlighting both their overlaps and their distinctions.

Michael Littman

“Whether or not machine learning is computational statistics… it’s not. But it is. Well, it’s not. And in particular, more importantly, it is not just computational statistics.” [00:02:55]

Distinct Features

Isbell and Littman pointed out that while computational statistics and machine learning share a foundation in statistical methods, machine learning often includes computational elements that go beyond traditional statistical approaches. These include rules, symbols, and algorithms that are more characteristic of computer science than of traditional statistics.

Additional Elements

Machine learning is not solely about statistical methods; it also includes computational aspects, such as the algorithms and data structures that are central to computer science [00:03:29].

Machine Learning as Software Engineering

Isbell emphasized viewing machine learning within the broader context of software engineering and programming languages. He suggested treating the choice of hyperparameters, metrics, and loss functions as design decisions made during the modeling process, akin to the decisions made in software development.

Charles Isbell

“A lot of what AI and machine learning is or certainly should be as software engineering…” [00:06:24]
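
To make the software-engineering framing concrete, here is a minimal Python sketch, not something taken from the conversation. It assumes a toy task (fitting a line with per-example gradient descent), and every name in it (TrainingConfig, fit_line, mean_absolute_error, the two loss gradients) is hypothetical and chosen only for illustration. The point is that the loss function, the hyperparameters, and the evaluation metric are explicit, swappable choices made by the programmer, much like any other design decision in a program.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs


def squared_grad(pred: float, target: float) -> float:
    """Gradient of the squared loss (pred - target)**2 with respect to pred."""
    return 2.0 * (pred - target)


def absolute_grad(pred: float, target: float) -> float:
    """Subgradient of the absolute loss |pred - target| with respect to pred."""
    if pred > target:
        return 1.0
    if pred < target:
        return -1.0
    return 0.0


@dataclass
class TrainingConfig:
    """Modeling decisions gathered in one place (hypothetical structure)."""
    learning_rate: float                         # hyperparameter
    epochs: int                                  # hyperparameter
    loss_grad: Callable[[float, float], float]   # which loss to optimize


def fit_line(data: Dataset, config: TrainingConfig) -> Tuple[float, float]:
    """Fit y = w*x + b using per-example gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(config.epochs):
        for x, y in data:
            g = config.loss_grad(w * x + b, y)
            w -= config.learning_rate * g * x
            b -= config.learning_rate * g
    return w, b


def mean_absolute_error(data: Dataset, w: float, b: float) -> float:
    """Evaluation metric, chosen independently of the training loss."""
    return sum(abs(w * x + b - y) for x, y in data) / len(data)


# Toy, noise-free data generated from y = 2x + 1 (an assumption for the demo).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

config = TrainingConfig(learning_rate=0.05, epochs=500, loss_grad=squared_grad)
w, b = fit_line(data, config)

# On this data the fit should land close to w = 2 and b = 1.
print(f"w={w:.3f}, b={b:.3f}, MAE={mean_absolute_error(data, w, b):.6f}")
```

Swapping squared_grad for absolute_grad, or changing learning_rate and epochs, changes how the model behaves without touching fit_line. That separation of configuration from mechanism is the kind of decision the software-engineering view asks a practitioner to make deliberately.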

The Role of Data

Another distinction raised was the emphasis machine learning places on data. Data often carries more weight in machine learning than in computational statistics, where the primary focus tends to be on creating and testing mathematical models. Machine learning practitioners put significant effort into understanding the data, designing experiments, and interpreting the outcomes.

Emphasis on Data

In machine learning, the data often matters more than the algorithms themselves. This differs from traditional computational statistics, where the focus remains primarily on the modeling process [00:12:15].

Conclusion

The debate over whether machine learning is a subset of computational statistics or a distinct field remains unsettled among academics and practitioners. The two share statistical foundations, but in the view Isbell and Littman present, machine learning also draws on a broader range of computational techniques from software engineering and computer science. On that view, machine learning is not just computational statistics.