Hackers have infiltrated a tool your software development teams may be using to write code. Not a comfortable place to be.

Developers have long used sites like Stack Overflow as forums where they could get code examples and assistance. That community is rapidly being replaced by generative AI tools such as ChatGPT. Today, developers ask AI chatbots to help create sample code, translate from one programming language to another, and even write test cases. These chatbots have become full-fledged members of your development teams. The productivity gains they offer are, quite simply, impressive.

There’s only one problem. How did your generative AI chatbot team members learn to code? Invariably by reading billions of lines of open-source software, which is full of design errors, bugs, and hacker-inserted malware. Letting open source train your AI tools is like letting a bank-robbing getaway driver teach high school driver’s ed. It has a built-in bias to teach something bad.

There are well over a billion open-source contributions annually to various repositories. GitHub alone had more than 400 million in 2022. That’s a lot of opportunity to introduce bad code, and a huge “attack surface” to try to scan for issues.

Once open source has been used to train an AI model, the damage is done. Any code generated by the model will be influenced by what it learned. Code written by your generative AI chatbot and used by your developers can and should be closely inspected. Unfortunately, the times your developers are most likely to ask a chatbot for help are when they lack sufficient expertise to write the code themselves. That means they also lack the expertise to recognize whether the code produced contains an intentionally hidden back door or malware.

I asked LinkedIn how carefully people inspect the quality and security of the code produced by AI. A couple of thousand impressions later, the answers ranged from “very, very carefully” to “this is why I don’t use generative AI to generate code,” “too early to use,” and “[too much risk of] embedded malware and known design weakness.” But the fact remains that many companies are using generative AI to develop code, and more are jumping on the bandwagon.

So what should companies do? First, they need to carefully inspect and scan code written by generative AI. The types of scans used matter. Don’t assume that generative AI malware will match well-known malware signatures, because generated code changes each time it’s written. Instead, use static behavioral scans and software composition analysis (SCA) to see if generated software has design flaws or will do malicious things (a sketch of such a pre-merge check appears below).

Perhaps it goes without saying, but it isn’t a good idea to let the same generative AI that produces high-risk code write the test cases that check whether the code is risky. That’s like asking a fox to check the henhouse for foxes.

While the risks of generating bad code are real, so are the benefits of coding with generative AI. If you are going to trust generated code, the old adage applies: Trust, but verify.
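As a concrete illustration, here is a minimal sketch of a pre-merge gate that runs a static analysis scan and an SCA check over AI-generated changes before they land. The specific tools (Semgrep for static analysis, pip-audit for Python dependency SCA), the paths, and the flags are assumptions chosen for the example, not a prescription; substitute whatever scanners your security team has approved, and keep human review in the loop.

```python
#!/usr/bin/env python3
"""Pre-merge gate for AI-generated code: a minimal sketch, not a complete control.

Assumes Semgrep (static analysis) and pip-audit (software composition analysis
for Python dependencies) are installed and on PATH.
"""
import subprocess
import sys

# Illustrative paths: point these at the directory containing the AI-generated
# changes and at the dependency manifest those changes touch.
GENERATED_CODE_DIR = "src/"
REQUIREMENTS_FILE = "requirements.txt"

CHECKS = [
    # Static analysis: flags dangerous patterns and design flaws in the code
    # itself, rather than matching known-malware signatures.
    ["semgrep", "--config", "auto", "--error", GENERATED_CODE_DIR],
    # Software composition analysis: flags known-vulnerable dependencies that
    # generated code may have pulled in.
    ["pip-audit", "-r", REQUIREMENTS_FILE],
]


def main() -> int:
    failures = []
    for cmd in CHECKS:
        print(f"Running: {' '.join(cmd)}")
        # Both tools exit non-zero when they find problems, so a simple
        # return-code check is enough to block the merge.
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failures.append(cmd[0])
    if failures:
        print(f"Blocking merge: findings from {', '.join(failures)}")
        return 1
    print("No findings. Human review is still required before merge.")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Wired into CI, a gate like this gives generated code at least a baseline design-flaw and dependency check before a human reviewer has to decide whether to trust it.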
Lou Steinberg is founder and managing partner at CTM Insights, a cybersecurity research lab and incubator. Prior to launching CTM, he was CTO of TD Ameritrade, where he was responsible for technology innovation, platform architecture, engineering, operations, risk management, and cybersecurity.

Generative AI Insights provides a venue for technology leaders to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.