AI can’t replace bad developers because it only works for good developers

Recognizing when code from an LLM will fail requires skill and experience.

“AI models currently shine at helping so-so coders get more stuff done that works in the time they have,” argues engineer David Showalter. But is that right? Showalter was responding to Santiago Valdarrama’s contention that large language models (LLMs) are untrustworthy coding assistants. Valdarrama says, “Until LLMs give us the same guarantees [as programming languages, which consistently get computers to respond to commands], they’ll be condemned to be eternal ‘cool demos,’ useless for most serious applications.” He is correct that LLMs are decidedly inconsistent in how they respond to prompts: the same prompt will yield different responses. And Showalter is quite possibly incorrect: AI models may “shine” at helping average developers generate more code, but that’s not the same as generating usable code.

The trick with AI and software development is knowing where the rough edges are. Many developers don’t, and they rely too much on an LLM’s output. As one HackerNews commentator puts it, “I wonder how much user faith in ChatGPT is based on examples in which the errors are not apparent … to a certain kind of user.” To use AI effectively in software development, you need sufficient experience to know when you’re getting garbage from the LLM.

No simple solutions

Even as I type this, plenty of developers will disagree. Just read through the many comments on the HackerNews thread referenced above. In general, the counterarguments boil down to “of course you can’t put complete trust in LLM output, just as you can’t completely trust code you find on Stack Overflow, your IDE, etc.” This is true, as far as it goes. But sometimes it doesn’t go quite as far as you’d hope.

For example, while it’s fair to say developers shouldn’t put absolute faith in their IDE, we can safely assume it won’t “prang your program.” And what about basic things like not screwing up Lisp brackets? ChatGPT may well get those wrong, but your IDE? Not likely. What about Stack Overflow code? Surely some developers copy and paste unthinkingly, but a savvy developer would more likely first check the votes and comments around the code. An LLM gives no such signals. You take it on faith. Or not.

As one developer suggests, it’s smart to “treat both [Stack Overflow and LLM output as] probably wrong [and likely written by an] inexperienced developer.” But even in error, such code can “at least move me in the right direction.” Again, this requires the developer to be skilled enough to recognize that the Stack Overflow sample or the LLM code is wrong. Or perhaps she needs to be wise enough to use it only for something like a “200-line chunk of boilerplate for something mundane like a big table in a React page.” Here, after all, “you don’t need to trust it, just test it after it’s done.”

In short, as one developer concludes, “Trust it in the same way I trust a junior developer or intern. Give it tasks that I know how to do, can confirm whether it’s done right, but I don’t want to spend time doing it. That’s the sweet spot.” The developers who get the most from AI are going to be those smart enough to recognize when its output is wrong but still somewhat beneficial.
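That “test it, don’t trust it” boilerplate is easy to picture. Below is a minimal sketch, in TypeScript, of the kind of mundane React table an assistant might hand back. The component name, props, and data shape are hypothetical illustrations rather than output from any particular LLM; the point is simply that code like this doesn’t need to be trusted, because rendering it and checking the result is cheap.

```tsx
// Hypothetical LLM-style boilerplate: a plain React table in TypeScript.
// The names (UserRow, UserTable) and data shape are illustrative only.
import React from "react";

interface UserRow {
  id: number;
  name: string;
  email: string;
  role: string;
}

interface UserTableProps {
  rows: UserRow[];
}

export function UserTable({ rows }: UserTableProps) {
  return (
    <table>
      <thead>
        <tr>
          <th>Name</th>
          <th>Email</th>
          <th>Role</th>
        </tr>
      </thead>
      <tbody>
        {rows.map((row) => (
          // A stable per-row key keeps React's list reconciliation predictable.
          <tr key={row.id}>
            <td>{row.name}</td>
            <td>{row.email}</td>
            <td>{row.role}</td>
          </tr>
        ))}
      </tbody>
    </table>
  );
}
```

Nothing here is clever, which is exactly why it sits in the sweet spot: a reviewer who knows React can confirm it at a glance, or spot where it goes wrong.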
You’re holding it wrong

Back to Datasette founder Simon Willison’s earlier contention that “getting the best results out of [AI] actually takes a whole bunch of knowledge and experience” because “a lot of it comes down to intuition.” He advises experienced developers to test the limits of different LLMs to gauge their relative strengths and weaknesses, and to work out how to use them effectively even when they don’t work.

What about more junior developers? Is there any hope for them to use AI effectively? Doug Seven, director of AI developer experiences at Amazon Web Services, believes so. As he told me, coding assistants such as Amazon Q Developer, formerly CodeWhisperer, can be helpful even for less experienced developers: “They’re able to get suggestions that help them figure out where they’re going, and they end up having to interrupt other people [e.g., to ask for help] less often.”

Perhaps the right answer is, as usual, “It depends.” And, importantly, the right answer in software development is generally not “write more code, faster.” Quite the opposite, as I’ve argued. The best developers spend less time writing code and more time thinking about the problems they’re trying to solve and the best way to approach them.

LLMs can help here, as Willison has suggested: “ChatGPT (and GitHub Copilot) save me an enormous amount of ‘figuring things out’ time. For everything from writing a for loop in Bash to remembering how to make a cross-domain CORS request in JavaScript—I don’t need to even look things up anymore, I can just prompt it and get the right answer 80% of the time.”

Knowing where to draw the line on that “80% of the time” is, as noted, a skill that comes with experience. But the practice of using LLMs to get a general idea of how to write something in, say, Scala, can be helpful to all. As long as you keep one critical eye on the LLM’s output.
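To make Willison’s example concrete, here is a minimal sketch of the cross-origin request he mentions, written in TypeScript against a hypothetical endpoint (the URL and function name are illustrative, not from the article’s sources). It is also a reminder of where that critical eye matters: CORS is enforced by the browser and configured on the server, so no client-side code can grant access the server doesn’t allow, a detail an LLM’s confident answer won’t necessarily flag.

```ts
// Minimal sketch of a cross-origin request from the browser.
// The URL is hypothetical; the server must send an appropriate
// Access-Control-Allow-Origin header or the browser will block the response.
async function fetchWidgets(): Promise<unknown> {
  const response = await fetch("https://api.example.com/widgets", {
    method: "GET",
    mode: "cors", // explicit, though "cors" is already the default for fetch
    headers: { Accept: "application/json" },
  });
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status}`);
  }
  return response.json();
}
```

An LLM will produce something like this on demand, and most of the time it will be right. Knowing what to check when it isn’t is the experience this whole discussion keeps circling back to.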