PyTorch 1.10 is production ready, with a rich ecosystem of tools and libraries for deep learning, computer vision, natural language processing, and more. Here's how to get started with PyTorch.

PyTorch is an open source machine learning framework used for both research prototyping and production deployment. According to its source code repository, PyTorch provides two high-level features: tensor computation (like NumPy) with strong GPU acceleration, and deep neural networks built on a tape-based autograd system.

Originally developed at Idiap Research Institute, NYU, NEC Laboratories America, Facebook, and DeepMind Technologies, with input from the Torch and Caffe2 projects, PyTorch now has a thriving open source community. PyTorch 1.10, released in October 2021, has commits from 426 contributors, and the repository currently has 54,000 stars.

This article is an overview of PyTorch, including new features in PyTorch 1.10 and a brief guide to getting started with PyTorch. I’ve previously reviewed PyTorch 1.0.1 and compared TensorFlow and PyTorch. I suggest reading the review for an in-depth discussion of PyTorch’s architecture and how the library works.

The evolution of PyTorch

Early on, academics and researchers were drawn to PyTorch because it was easier to use than TensorFlow for model development with graphics processing units (GPUs). PyTorch defaults to eager execution mode, meaning that its API calls execute when invoked, rather than being added to a graph to be run later. TensorFlow has since improved its support for eager execution mode, but PyTorch is still popular in the academic and research communities.

At this point, PyTorch is production ready, allowing you to transition easily between eager and graph modes with TorchScript, and accelerate the path to production with TorchServe.
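
To make the eager-versus-graph distinction concrete, here is a minimal sketch, assuming a working PyTorch install. The function name scaled_relu and the sample values are my own illustration, not something taken from the PyTorch documentation. The same function is first called eagerly and then compiled with torch.jit.script, and the two results are compared.

```python
import torch


def scaled_relu(x: torch.Tensor, scale: float) -> torch.Tensor:
    # Hypothetical example function, chosen only for illustration.
    return torch.relu(x) * scale


# Eager mode: the call executes immediately and returns a concrete tensor.
x = torch.tensor([-1.0, 0.5, 2.0])
eager_out = scaled_relu(x, 2.0)

# Graph mode: torch.jit.script compiles the same Python function to TorchScript.
scripted = torch.jit.script(scaled_relu)
scripted_out = scripted(x, 2.0)

# Both execution paths should produce the same values.
same = torch.allclose(eager_out, scripted_out)
print(eager_out, scripted_out, same)
```

A scripted function can also be saved with its save method and loaded outside of the original Python source, which is one reason TorchScript matters for deployment. Treat this as a starting point rather than a recommended production workflow.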
The torch.distributed back end enables scalable distributed training and performance optimization in research and production, and a rich ecosystem of tools and libraries extends PyTorch and supports development in computer vision, natural language processing, and more. Finally, PyTorch is well supported on major cloud platforms, including Alibaba, Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. Cloud support provides frictionless development and easy scaling.

How to get started with PyTorch

Reading the version update release notes won’t tell you much if you don’t understand the basics of the project or how to get started using it, so let’s fill that in.

The PyTorch tutorial page offers two tracks: one for those familiar with other deep learning frameworks and one for newbs. If you need the newb track, which introduces tensors, datasets, autograd, and other important concepts, I suggest that you follow it and use the Run in Microsoft Learn option, as shown in Figure 1.

Figure 1. The “newb” track for learning PyTorch.

If you’re already familiar with deep learning concepts, then I suggest running the quickstart notebook shown in Figure 2. You can also click on Run in Microsoft Learn or Run in Google Colab, or you can run the notebook locally.

Figure 2. The advanced (quickstart) track for learning PyTorch.

PyTorch projects to watch

As shown on the left side of the screenshot in Figure 2, PyTorch has lots of recipes and tutorials. It also has numerous models and examples of how to use them, usually as notebooks. Three projects in the PyTorch ecosystem strike me as particularly interesting: Captum, PyTorch Geometric (PyG), and skorch.

Captum

As noted on this project’s GitHub repository, the word captum means comprehension in Latin.
As described on the repository page and elsewhere, Captum is “a model interpretability library for PyTorch.” It contains a variety of gradient and perturbation-based attribution algorithms that can be used to interpret and understand PyTorch models. It also has quick integration for models built with domain-specific libraries such as torchvision, torchtext, and others. Figure 3 shows all of the attribution algorithms currently supported by Captum.

Figure 3. Captum attribution algorithms in a table format.

PyTorch Geometric (PyG)

PyTorch Geometric (PyG) is a library that data scientists and others can use to write and train graph neural networks for applications related to structured data. As described on its GitHub repository page:

PyG offers methods for deep learning on graphs and other irregular structures, also known as geometric deep learning. In addition, it consists of easy-to-use mini-batch loaders for operating on many small and single giant graphs, multi GPU-support, distributed graph learning via Quiver, a large number of common benchmark datasets (based on simple interfaces to create your own), the GraphGym experiment manager, and helpful transforms, both for learning on arbitrary graphs as well as on 3D meshes or point clouds.

Figure 4 is an overview of PyTorch Geometric’s architecture.

Figure 4. The architecture of PyTorch Geometric.

skorch

skorch is a scikit-learn compatible neural network library that wraps PyTorch. The goal of skorch is to make it possible to use PyTorch with sklearn. If you are familiar with sklearn and PyTorch, you don’t have to learn any new concepts, and the syntax should be well known. Additionally, skorch abstracts away the training loop, making a lot of boilerplate code obsolete. A simple net.fit(X, y) is enough, as shown in Figure 5.

Figure 5. Defining and training a neural net classifier with skorch.

Conclusion

Overall, PyTorch is one of a handful of top-tier frameworks for deep neural networks with GPU support.
You can use it for model development and production, you can run it on-premises or in the cloud, and you can find many pre-built PyTorch models to use as a starting point for your own models.