Matt Asay
Contributor

AI software resists open source

opinion
12 Aug 2024 (4 mins)
Artificial Intelligence, Generative AI, Open Source

A core group of smart and well-meaning people would dearly love all AI to be open source. If only developers cared.

For some reason, we keep thinking history won’t repeat itself. We expect perfect open source licenses and that everyone will geek out on peace, love, and freely shared code. Yet we’re 26 years into the term “open source” and, wildly popular though open source software absolutely is, the vast majority of the world’s software may contain open source but isn’t open-source licensed. Developers seem fine with this, including those active on Open Source Initiative (OSI) mailing lists. Most of them use a plethora of closed hardware and software. As a result, despite the groundswell of support for “open source AI” (whatever that means), we are very much on track for mostly proprietary licensing to dominate AI, just as the proprietary model has dominated software.

Do we care?

It’s nice to think we do, or that we should, but the history of software is an ongoing blend of proprietary and open source. It seems to work. There’s little reason to think AI will be any different—or that it should be any different.

Here we go again

Steven Vaughan-Nichols wrote a great synopsis of open source and AI. “Defining open source AI is a messy issue that has yet to be settled.” I’ll go one further: It’s not going to be settled. Not soon. Not ever.

I explored this idea a few weeks ago. “While the OSI and others are trying to committee their way to an updated Open Source Definition (OSD),” I suggested, “powerful participants like Meta are releasing industry-defining models, calling them ‘open source,’ and not remotely caring when some vocally chastise them for affixing a label that doesn’t seem to fit the OSD.” Despite earnest pleas for everyone to open source their AI models, “basically none of today’s models are ‘open source’ in the way we’ve traditionally considered the term.”

We don’t have to be happy about that, but we should get used to it. Open source has never been bigger, and yet it’s still a relative rounding error in terms of the software we use every day. Most software that we use is not licensed as open source, even if it has open source inside. Open source is an essential ingredient, for sure, but it’s rarely the finished product.

Given the likelihood that AI will increasingly permeate the software and systems we depend on, it’s fair but unrealistic to want those AI models to be open source. Vaughan-Nichols blames “top AI vendors [that] are unwilling to commit to open sourcing their programs and data sets,” suggesting that “businesses hope to gild their programs with open source’s positive connotations of transparency, collaboration, and innovation.” Maybe? Or maybe they don’t have the luxury of giving away all their code because that turns out to be really bad business. I know some like to lazily gesture at Red Hat as some classic example of what business success looks like, but it’s actually a terrible example when compared to Meta, AWS, etc. As Hugging Face’s Sasha Luccioni said at the United Nations OSPOs for Good Conference, “You can’t really expect all companies to be 100% open source as the open source license defines it. You can’t expect companies just to give up everything that they’re making money off of and do so in a way they’re comfortable with.”

Let’s just get the work done

Maybe we’d like reality to be different, but after decades of open source and proprietary software living comfortably together, why would we expect AI to be any different?

Just as with cloud and with on-premises software before that, most AI software will not be open source. Now, as then, most developers simply won’t care, because most developers care more about going to their kids’ soccer games after work than existential open source issues. For years we’ve fixated conversations about open source on the wrong things, and younger developers have mostly tuned it out. But whether young or old, developers care about getting stuff done. They care about the cost, speed, and performance gains of Mistral’s latest model, and not so much about its non-open source license. Ditto OpenAI, Meta’s Llama, etc.

All of which is not to say that open source doesn’t matter for AI. It’s one thing that matters, not the only thing. When we obsess about open source licensing, we lose sight of the tens of millions of developers who just need software to help them get their jobs done with a minimum of fuss.

Matt Asay runs developer relations at MongoDB. Previously, Asay was a Principal at Amazon Web Services and Head of Developer Ecosystem for Adobe. Prior to Adobe, Asay held a range of roles at open source companies: VP of business development, marketing, and community at MongoDB; VP of business development at real-time analytics company Nodeable (acquired by Appcelerator); VP of business development and interim CEO at mobile HTML5 start-up Strobe (acquired by Facebook); COO at Canonical, the Ubuntu Linux company; and head of the Americas at Alfresco, a content management startup. Asay is an emeritus board member of the Open Source Initiative (OSI) and holds a J.D. from Stanford, where he focused on open source and other IP licensing issues.
