by Nick Elprin

7 data science predictions for 2021

Feb 03, 2021 | 6 mins
Analytics, Data Science, Machine Learning

After a trying 2020, signs point to data science becoming an enterprise-wide capability that impacts every line of business and functional department in the coming year.

After a year unlike any other, I am looking forward to 2021 with renewed hope and positivity. And I’m not just talking about the COVID-19 vaccines that promise to control the pandemic and bring life back to (somewhat) normal. I’m also hopeful for a renewed economy and global stability as the pandemic subsides and governments and businesses alike return to “business as usual.”

While virtually every market sector and segment has been impacted by the coronavirus pandemic, technology has seen an outsized disruption. From manufacturing to supply chain and logistics to retail and consumer demand, businesses have needed to pivot quickly not just once but often twice or three times. However, technology is already showing some positive indicators for the coming year and I see some clear trends in the data science space.

Here are seven that are already beginning to emerge.

Rather than decreasing investment in data science, the coronavirus pandemic will continue spurring companies to increase their investment.

Organizations are making dramatic budget cuts in many areas in an effort to overcome the effects of COVID-19 and keep their business viable. Yet, in 2021, I predict that many companies will sustain or even increase their investment in data science. COVID-19 accelerated the Fortune 500’s move to public cloud and modern data science tools in the rush to support remote workers. Reluctance to make that move was one of the last remaining barriers holding back outright data science investment. Now that the seal is broken, it’s easier for organizations to continue the investment, and many of the Fortune 500 are investing in building a core competency in machine learning and data science as a “big bet” to be faster, smarter, and better than their competitors.

As a corollary, the ongoing pandemic will accelerate development of model monitoring solutions.

COVID-19 has had an enormous impact on nearly every facet of business operations, and organizations that depend on artificial intelligence (AI) and machine learning (ML) to automate business decisions have been particularly vulnerable. One of the biggest issues that companies are experiencing is massive data drift: a change in model inputs that leads to performance degradation and inaccurate output, driven by the large-scale changes in human behavior since the pandemic began. The development of new and more robust model monitoring solutions that enable organizations to catch data drift will be a huge area of innovation and investment in 2021.
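
To make data drift concrete, here is a minimal sketch of one common check, the population stability index (PSI), which compares how a single model input is distributed at training time versus today. Everything in it is an assumption for illustration: the function name, the synthetic "pre-pandemic" and "post-pandemic" samples, and the bin count are hypothetical, and the widely quoted rule of thumb that a PSI above roughly 0.25 signals a significant shift is a convention, not a standard. Real monitoring products use a range of other techniques.

```python
import math
import random


def population_stability_index(baseline, current, bins=10):
    """Return the PSI between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the baseline range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    expected = bin_shares(baseline)
    actual = bin_shares(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Synthetic data only: a hypothetical feature before and after a behavior shift.
rng = random.Random(0)
pre_pandemic = [rng.gauss(100, 15) for _ in range(5000)]
same_regime = [rng.gauss(100, 15) for _ in range(5000)]
post_pandemic = [rng.gauss(130, 25) for _ in range(5000)]

psi_stable = population_stability_index(pre_pandemic, same_regime)
psi_shifted = population_stability_index(pre_pandemic, post_pandemic)
print(f"PSI, no shift:   {psi_stable:.3f}")
print(f"PSI, with shift: {psi_shifted:.3f}")
```

A monitoring job built on this idea would compute the index per feature on a schedule and alert when it crosses a threshold the team has agreed on.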

A new ‘sheriff’ is coming to town.

In 2021, Chief Analytics Officers (CAOs) and Chief Data and Analytics Officers (CDAOs) will be the new faces at the boardroom table, supplanting the traditional Chief Data Officer role as companies move from focusing on the “data” behind the models to the AI/ML models themselves.

Model risk will enter the mainstream.

Financial services firms have long used predictive models to drive decisions subject to regulatory scrutiny and have understood the risks associated with this approach. In 2021, we’ll see broader awareness of the legal implications and risks of automated decisions across a wide variety of industries. In addition, it’s likely that we’ll see public lawsuits related to discrimination or liability stemming from decisions made by models. Companies won’t be able to hide behind “The AI made me do it” excuses as they’re held accountable under increased scrutiny of their pricing and business practices.

IT organizations will be empowered to manage data science “shadow IT.”

Until now, data science has lacked both appropriate governance and a centralized platform within enterprises, leading to widespread “shadow IT” practices: data scientists downloading unapproved tools and data science packages and using unofficial infrastructure for storage and specialized compute power. The risk of these rogue systems, from both a security and an IT perspective, is no longer tenable as data science becomes increasingly pervasive and critical to every business function. We’ve seen companies that underwent dramatic organizational changes (COVID-related layoffs, for example) and then couldn’t figure out how to update their pricing, because a data science team had built the pricing model on its own systems. This coming year we’ll see more organizations treat models as material assets, motivating IT to take the reins and provide the infrastructure to support data science at scale.

Organizations will experience increased pressure to ensure transparency in the use of algorithms and predictive models.

While increased AI engineering capabilities provide greater structure and sophistication in how we bring models into production, rapidly evolving privacy standards (e.g., GDPR and California’s CCPA) will require that equal attention be paid to making AI models more transparent and secure. However, this won’t be easy. It will be a heavy lift involving ModelOps, DevOps, model risk management, explainable AI, and ethical AI, and it will require an evolution of both technology and processes.

Data science education programs will continue to grow.

Just as the dotcom boom spurred interest in computer science courses and majors, 2021 will see skyrocketing demand for classes and degrees in data science, as models increasingly drive every part of business and our economy. Data science will become the most popular freshman class at schools where it’s offered.

The net-net of these predictions is that data science, far from going away, is becoming more popular and more ingrained in business practices. Whereas data science has in the past been compartmentalized and siloed in organizations, 2021 is the year in which it will become an enterprise-wide capability that impacts every line of business and functional department.

I’ll be interested to look back at the end of the year and revisit these predictions to evaluate how many have come to fruition and how many were completely off-base. Personally, I’m betting on greater than 90 percent accuracy, but that depends on my model and the quality of the input data, of course.

And one more thing… If you have any predictions for data science in 2021 that I haven’t covered in this article, I’d love to hear them.

Nick Elprin is co-founder and CEO of Domino Data Lab.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.