Simon Bisson
Contributor

Get started with AI using ML.Net and Model Builder

analysis
Aug 06, 2019 | 6 mins
Analytics, Artificial Intelligence, Microsoft .NET

Microsoft’s .Net machine learning tooling makes it easy to add AI to your code


Machine learning is an important tool for modern application development. We’ve gone from the cold depths of an AI winter to an explosion of new neural networks and models, building on the hyperscale compute capabilities of the cloud and on the requirements of big data services. If you’re an AI researcher it’s an exciting time, with new discoveries and new tools arriving weekly.

But that’s only part of the story. Far more exciting is the democratization of machine learning. Research is a good thing, but it’s far better to put the results of that research into action and into the hands of developers. APIs like Microsoft’s Azure Cognitive Services are one way to do this, but not every application has a permanent connection to Azure, so it’s important to have ML tools that are built into our everyday development environments.

Introducing ML.Net

That’s where ML.Net comes in. It’s Microsoft’s open source, cross-platform machine learning framework for .Net and .Net Core, targeting .Net Standard and running on Windows, macOS, and Linux. It’s extensible, so it works not only with Microsoft’s own ML tooling but also with other frameworks such as Google’s TensorFlow and the ONNX cross-platform model export technology. By supporting as wide a selection of frameworks as possible, it lets you pick the ML models that are closest to your needs and fine-tune them to fit.

Getting started with ML.Net is easy. First download and install the ML.Net NuGet packages, then reference the appropriate namespaces at the top of your .Net source files with using statements. ML.Net handles both training and inference, so you can use it with both training and live data, using an iterative process to fine-tune your model and improve accuracy.

Training data is initially loaded into an IDataView object and then used to train your chosen ML algorithm. You build an ML pipeline from a selection of extension methods that implement statistical and machine learning algorithms, load the data, and use Fit() to train your model. Once that’s run, you need to evaluate the results and tune your model, iterating until you’re happy with its performance. Once the model is trained, save it as a binary. To use it, load it into an ITransformer object, create a prediction engine from it with CreatePredictionEngine(), and call the engine’s Predict() method.
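The train, save, load, and predict cycle described above can be sketched in a few lines. This is a language-neutral illustration in plain Python, not ML.Net code: the fit, predict, save, and load functions, the nearest-centroid model, the flight-style data, and the JSON file standing in for the binary model file are all assumptions made for the example.

```python
import json
import os
import tempfile

def fit(rows):
    """Train a nearest-centroid classifier: average the feature vectors per label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(model, features):
    """Return the label whose centroid is closest to the input features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))

def save(model, path):
    """Persist the trained model (JSON here, standing in for a binary file)."""
    with open(path, "w") as f:
        json.dump(model, f)

def load(path):
    """Load a previously saved model for inference."""
    with open(path) as f:
        return json.load(f)

# Hypothetical training data: ([departure delay in hours, wind speed], label)
training = [([0.1, 5.0], "on_time"), ([0.2, 8.0], "on_time"),
            ([2.5, 30.0], "late"), ([3.0, 25.0], "late")]

model = fit(training)                         # train
path = os.path.join(tempfile.mkdtemp(), "model.json")
save(model, path)                             # save the trained model
restored = load(path)                         # load it back for inference
print(predict(restored, [0.3, 6.0]))          # on_time
print(predict(restored, [2.8, 28.0]))         # late
```

The point of the sketch is the separation of concerns: training produces a model object, the saved file is the only thing the running application needs, and prediction works from the loaded copy rather than from the training code.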

Model evaluation is probably the most important part of the training process, and you need to prepare your training data before you do this. Once you’ve cleaned up your training data set, set aside some of the data to use as a test data set. You can then check your model against this held-out data, comparing its predictions with the known outputs.
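As a rough illustration of holding back test data, the following plain Python sketch splits a data set, trains a trivial model on one part, and scores it on the other. It does not use ML.Net: the train_test_split and accuracy helpers, the 20 percent test fraction, the synthetic data, and the threshold model are all hypothetical choices for the example.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold back a fraction as a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict_fn, test_rows):
    """Fraction of held-out rows where the prediction matches the known label."""
    correct = sum(1 for features, label in test_rows if predict_fn(features) == label)
    return correct / len(test_rows)

# Hypothetical data: one numeric feature; the label is True when the value exceeds 50.
rows = [([x], x > 50) for x in range(100)]
train, test = train_test_split(rows)

# "Train" a trivial threshold model using the training rows only:
# the threshold is the midpoint between the two class means.
positives = [f[0] for f, label in train if label]
negatives = [f[0] for f, label in train if not label]
threshold = (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2

# Evaluate on rows the model has never seen.
score = accuracy(lambda f: f[0] > threshold, test)
print(len(train), len(test), score)
```

Because the test rows play no part in choosing the threshold, the score is an estimate of how the model behaves on new data, which is what you iterate against when tuning.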

Off-the-shelf machine learning models and cloud-hosted ML APIs are the most popular ways to quickly add intelligence to your code, but they may not offer a solution for your particular problem. General purpose machine learning is useful for finding sentiment in Twitter feeds or categorizing the items in a real estate photograph, but it’s not appropriate if you’re trying to spot snow leopards in the rocky landscapes of the Himalayas or to differentiate between impurities in a bottle of beer and scratches on the recycled glass of the bottle.

Using Model Builder to create machine learning models

Although coding a machine learning model has its advantages, Microsoft offers additional tools to help simplify the process of building and training ML models. ML.Net Model Builder is a Visual Studio extension that turns your IDE into a tool for building custom ML models that can be dropped straight into your code, using Microsoft’s AutoML technology to help choose an appropriate ML algorithm for your specific application needs.

All you need to get started with Model Builder is a copy of Visual Studio and enough data to use as a training set. Download it from the Visual Studio Marketplace to install it in Visual Studio. Its AutoML tools will use a template based on a set of bundled scenarios to generate models based on your training data, with a focus on regression and classification-based prediction. The bundled scenario templates cover sentiment analysis, issue classification, price prediction, and custom analysis.

The scenario names might appear to lock you into specific implementations, but they’re a lot more flexible than that. For example, the sentiment analysis scenario is actually a tool for exploring binary classifications, where you’re asking “Is this a yes or a no?” That gives you the option of using it to build models that cover things like “is this spam?” or “is this a valid purchase?” or “is this person a good credit risk?”

Model Builder is currently in preview, so there are some limits on the size of the training set. Currently it’s limited to 1GB of file data or 100,000 rows from SQL Server. You can install it in Visual Studio 2017 or 2019, with a minimum requirement of the .Net Core 2.1 SDK.

Using AutoML for training

Once Model Builder is installed, you choose a scenario and connect it to a training data set. In the training set, pick the attribute you want to predict along with the inputs used to predict it. For example, if you’re trying to build a model that predicts whether a flight will be on time, mark the arrival time as your labelled attribute, with attributes like departure time, aircraft type, weather conditions, and passenger load used as inputs. Model Builder then trains a model that links your inputs to the predicted output. There’s no need to tune the model; it’s all controlled by AutoML. The process can take time: a 1GB data set will require around three hours to train.
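To make the idea of a labelled attribute and its inputs concrete, here is a small plain Python sketch that separates one column to predict from the columns used to predict it. It is not Model Builder output: the CSV contents, the column names (including arrived_on_time as a stand-in for a label derived from arrival time), and the split_columns helper are all hypothetical.

```python
import csv
import io

# Hypothetical flight records; a real training set would come from a CSV/TSV file
# or a SQL Server table.
CSV_TEXT = """departure_hour,aircraft_type,weather,passenger_load,arrived_on_time
8,A320,clear,0.72,yes
17,B737,storm,0.95,no
12,A320,rain,0.60,yes
"""

LABEL = "arrived_on_time"   # the attribute to predict
INPUTS = ["departure_hour", "aircraft_type", "weather", "passenger_load"]

def split_columns(text, label, inputs):
    """Separate each row into its input attributes and its label value."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(text)):
        features.append({name: row[name] for name in inputs})
        labels.append(row[label] == "yes")
    return features, labels

features, labels = split_columns(CSV_TEXT, LABEL, INPUTS)
print(features[0])
print(labels)
```

Everything in the label column is what the trained model learns to produce; everything in the input columns is what it is allowed to look at when doing so.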

With Model Builder you don’t need to worry about writing your application code; it’s all generated for you. Models are delivered as zip files, and all the code you need is added to your current Visual Studio solution, along with a console app you can use to test your model before adding it to your application.

One advantage of using Model Builder is that you don’t have to prepare your training data in advance. It’s able to connect to common data formats, initially to CSV and TSV files, as well as to SQL Server databases. Microsoft is promising more file formats and connectors in future releases, which should go a long way toward reducing the data preparation workload. Everything runs locally, so you don’t need to be online to build and test a model (though you may need a fairly powerful PC to get the most from it).

With ML.Net and Model Builder, Microsoft has gone a long way toward delivering a simple machine learning framework. Mixing an easy-to-use programming model with an automated model generator is a smart move, as is integrating them with familiar developer tools and environments. It’s a combination well worth investigating if you want to add AI to your applications.


Author of InfoWorld's Enterprise Microsoft blog, Simon Bisson prefers to think of “career” as a verb rather than a noun, having worked in academic and telecoms research, as well as having been the CTO of a startup, running the technical side of UK Online (the first national ISP with content as well as connections), before moving into consultancy and technology strategy. He’s built plenty of large-scale web applications, designed architectures for multi-terabyte online image stores, implemented B2B information hubs, and come up with next generation mobile network architectures and knowledge management solutions. In between doing all that, he’s been a freelance journalist since the early days of the web and writes about everything from enterprise architecture down to gadgets.
