If you want to squeeze the most value from your data, teach your employees Python and Excel instead of specialized programming languages.

Credit: Christina Morillo

In yet another installment of “everyone is doing it, but no one knows how,” a recent NewVantage Partners survey found that while 93.9% of executives surveyed expect to increase their data investments in 2023, just 23.9% of organizations characterize themselves as data-driven. Where is all that investment going, if not to change the way their companies operate?

What’s stopping these executives from imposing this vision of a glorious data future on their companies? People. The problem is always people. Of these same executives, 79% cite cultural issues as the biggest impediment to embracing a data-driven future. It turns out to be easy to say “data-driven” but much harder to implement it because people, not data, ultimately animate a business. The key, then, is to ensure that data enables and augments people rather than replaces them.

Python and friends

More than a decade ago, Gartner analyst Svetlana Sicular posited two fundamental truths about (big) data that we too often forget: “Organizations already have people who know their own data better than mystical data scientists” and “learning Hadoop is easier than learning the company’s business.”

One way to boost the intelligent use of data is by lowering the bar to programming literacy. As arcane as data tools can be, the much more valuable “tool” is an employee’s grasp of the company’s business because expert employees can ask more intelligent questions of the company’s data.

To that end, the focus for every enterprise should be to make data tools more accessible to a greater population of employees. Efforts to make Microsoft Excel a key component of data analytics should be encouraged, including recent attempts to use Excel for data transformation initiatives. There are far more people proficient with Excel than with, say, TensorFlow or Hugging Face models.
Helping them do more with a tool they already know is a big win.

Same with Python. Although R and other more specialized languages continue to be valuable, Python is the single-biggest driver of AI productivity for a swelling army of would-be data engineers. As I’ve written, if we accept Nick Elprin’s projection that data science will become an enterprisewide capability with far-reaching implications, then “the language most likely to dominate is the one that is most accessible to the broadest population within the enterprise.” Namely, Python.

And SQL, of course. It’s telling that a recent IEEE Spectrum analysis of programming language popularity found that Python and SQL are the two most popular languages right now. Python is on top with a lead that keeps widening. For employers looking to hire, SQL tops the list (with Python a close second). The two together are a solid combination given that both tap into skills that many employees already have rather than forcing people (and their employers) to learn new ways of dealing with data.

Generative AI (GenAI) is another way we’ll see more employees empowered to work with data. I’ve tried using GenAI tools like ChatGPT to automate some of the work my team does answering questions on our public forums, but the output is still not good enough: It takes more work to fix ChatGPT’s answers than to simply write a better answer to start with. (Beware of GenAI when it comes up with great prose at the expense of technical accuracy. Users may like it, as one recent analysis found, but that enthusiasm will dim when they try some of those AI-suggested answers in production.)

The point, however, isn’t the technology. It’s the people using it. This is where most companies continue to get things wrong.
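To make the Python-plus-SQL pairing a little more concrete, here is a minimal sketch of the kind of question a business expert might ask once a spreadsheet export has been loaded into a database. Everything in it is hypothetical and for illustration only: the `ORDERS` sample rows, the `orders` table, and the `revenue_by_region` and `revenue_share` helpers are not drawn from any survey or tool mentioned here. It relies only on Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical sample rows standing in for a spreadsheet export:
# (region, order_date, amount)
ORDERS = [
    ("East", "2023-01-05", 1200.0),
    ("East", "2023-01-19", 800.0),
    ("West", "2023-01-07", 1500.0),
    ("West", "2023-02-02", 500.0),
    ("North", "2023-02-11", 1000.0),
]


def revenue_by_region(rows):
    """Load rows into an in-memory SQLite table and total revenue per region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE orders (region TEXT, order_date TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        # SQL does the grouping and sorting; ties are broken by region name.
        cursor = conn.execute(
            "SELECT region, SUM(amount) AS total "
            "FROM orders GROUP BY region ORDER BY total DESC, region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


def revenue_share(totals):
    """Convert (region, total) pairs into each region's percentage of revenue."""
    grand_total = sum(total for _, total in totals)
    return {
        region: round(100 * total / grand_total, 1) for region, total in totals
    }


if __name__ == "__main__":
    totals = revenue_by_region(ORDERS)
    for region, share in revenue_share(totals).items():
        print(f"{region}: {share}%")
```

SQL handles the aggregation that many analysts would otherwise do with a pivot table, and a few lines of Python turn the result into something ready for a report. The sketch assumes at least one order with a nonzero amount; a real script would need to guard against an empty table.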
Power to the people

As the NewVantage report notes, every year “a great majority of respondents report that the principal challenges to becoming a data-driven organization are human (culture, people, process, or organization) rather than technological,” but each year the survey uncovers little progress toward overcoming these human issues. “Too much of the focus of data executives is on non-human issues” like “data modernization, data products, AI and ML, data quality, and various data architectures.” In other words, we seem to realize we have a people problem, yet we keep trying to fix it with tech.

I’ve mentioned a few technologies that let developers and others work with data using familiar tools. The alternative, imposing new technologies that force people to change how they work and think to conform to the strictures of the tool, is a losing strategy. The crowning asset of a company is the people who interpret the data, not the data itself. These people already work for you; the key is to figure out how to leverage data tools they already know or can easily learn.