Learn how to combine Python code, freeform text, mathematical formulas, and graphics in an interactive, shareable notebook.

Jupyter Notebooks let you combine code, comments, multimedia, and visualizations into an interactive document that can be shared, re-used, and re-worked. Originally developed for data science applications written in Python, R, and Julia, Jupyter Notebooks are useful in all kinds of ways for all kinds of projects. You can use Jupyter Notebooks to share Python code and its output with third parties, to run code with live interactive feedback, or to systematically track and document the progress of your work.

In this article, we’ll walk through setting up Jupyter Notebook for Python, working with Jupyter’s various features, and sharing the results with others, whether they have Jupyter installed or not.

Jupyter Notebook installation and setup

The easiest way to create and work with Jupyter Notebooks for Python is to set up an instance of the Anaconda distribution of Python. Anaconda was created to make it easy to work with Python and its galaxy of data science tools, and it includes the Jupyter Notebook software as a standard-issue pack-in.

In addition to making Jupyter Notebooks easy to start up and use, Anaconda provides by default many of the other packages you’re likely to use in conjunction with Jupyter: Pandas, NumPy, TensorFlow, Matplotlib, and so on. Anaconda also makes it easier to do workaday things like manage virtual environments, keep Python packages up-to-date, and find good documentation for everything you’re working with.

One potential drawback to using Anaconda: If you’ve already built up a large Python workflow, you’ll have to migrate the work to the new Anaconda instance. If you’re not married to your original setup, migrating is the better choice in the long run. But if you need to stick with the environment you already have, you’ll need to install the Jupyter Notebook packages manually.
Figure: The Anaconda distribution of Python comes preloaded with Jupyter Notebook. Running it is as easy as clicking an icon.

The good news is that a manual Jupyter Notebook setup is easy too. Use pip to add Jupyter to a Python installation:

    pip install jupyter

If you’re using Anaconda, the main Anaconda Navigator interface has a launcher for the Jupyter Notebook interface in the Home panel. Click it to start an instance of the Jupyter server, and your system’s default web browser will launch to access it.

If you’ve installed Jupyter manually with pip, you can launch it by typing jupyter notebook at a command line. Note that if you’re using Jupyter in a virtual environment for Python (a good idea if you’re not using Anaconda), you need to activate that environment before running that command.

Jupyter Notebook basics

When you first start Jupyter, you’ll see a file browser, typically for the files in the current user’s home directory. You can navigate using this browser to an existing notebook, or launch a new notebook using the “New” dropdown at the top right of the file list. Select “New / Notebook: Python 3” to do this.

Figure: Jupyter’s file browser. Notebooks marked with a green icon have been launched and are running.

A Jupyter Notebook has two essential components: a kernel and cells.

A kernel is a running instance of the runtime for the programming language used in the notebook. In our case, the kernel is a running Python instance (namely the IPython kernel), with its own namespace and memory allotment. A notebook can connect to only one kernel at a time, but can switch between kernels if needed.

Cells are akin to the cells in a spreadsheet. Each cell contains code, freeform text, or other content. You can start running code at any cell, and you can run each cell individually (even out of order!). You can run cells from a given point forward, or start from the top and run everything with a clean slate.
When cells are evaluated, they’re numbered to indicate the order in which they’ve been evaluated. The state of the code used in the notebook (its global variables, etc.) is preserved unless you specifically tell Jupyter otherwise. Cells can be cut, copied, pasted, and reordered by way of the toolbar icons.

Figure: A simple example of a Jupyter Notebook. The graph seen at bottom is generated by the code in the section marked In [9]:.

The most common commands used in a notebook are available either in the drop-down menus or in the clickable toolbar above the cell area. One good way to explore everything available is to click the keyboard icon in the toolbar. This opens the command palette, from which you can summon many commands by simply typing.

Figure: The Jupyter Notebook command palette. Type to search for any available command.

A good way to get the hang of working with notebooks is to play around with existing ones. Jupyter’s creators have curated a gallery of notebooks that cover a wide range of applications, and also show off many examples of what’s commonly done in notebooks, such as graphing or generating visuals from data.

Jupyter Notebook cell formats

Cells can contain four kinds of content:

Python code. There is no practical limit on the amount of code any one cell can contain, but for the sake of comprehension it’s a good idea to break up long sections of code into separate blocks. You don’t have to cut and paste: You can split a notebook cell by pressing Ctrl + Shift + -.

Markdown. Use text in the Markdown format for annotations, comments, inline images, and other elements that aren’t executable code. Markdown cells can even contain attached images, which are saved with the notebook. Use the notebook’s “Edit | Insert Image” menu option to add the image.

Heading. Legacy Jupyter Notebooks used special “heading” cells to partition notebooks into sections. Heading cells are no longer used in new notebooks; use Markdown headings instead.

Raw NBConvert.
Cells marked with “Raw NBConvert” are left exactly as-is and not processed or converted by Jupyter. This is useful if you want to insert text that needs to be processed by a third-party add-on, such as LaTeX for math formulas.

In cells with Python code, a line can be prefixed with % or an entire cell with %% to invoke magic commands. Magic commands are specific to the Python kernel running in Jupyter, and typically control how the kernel interfaces with the notebook and the surrounding system.

Figure: Different kinds of cells in a Jupyter Notebook. From top: a Markdown-format cell, a Python code cell, and a raw NBConvert cell.

Running Jupyter Notebooks

Code in a Jupyter Notebook can be run all at once or incrementally. When a code cell is in focus, click the “Run” button in the notebook’s command bar to execute the contents of the current cell and advance to the next cell. The process is akin to “single-stepping” a running program using a debugger, except here you’re going one cell at a time instead of one line of code at a time.

Note that whenever you run one or more cells, the state created by those cells is preserved by the kernel as long as it’s active. For instance, if you import a module, the module will remain loaded in the kernel, so any later import statement for that module will simply reuse the loaded copy, just as a repeated import does in a conventional Python script.

If you want to restart the kernel and reset the context for the notebook, but keep the results you have displayed in the notebook so far, click the “restart” button (the loop-arrow icon). If you want to run the whole notebook from the top with a newly launched kernel, erasing all previous results displayed in the notebook, click the double-arrow “run” button.

If you want to add more interactivity to a Jupyter Notebook, one way to do this is with the ipywidgets add-on. With ipywidgets, cells can generate basic GUI controls like sliders or input boxes.
The values supplied by the user from those controls can be used in other cells in the notebook.

Figure: Interactive controls can be inserted into a notebook by way of the ipywidgets library.

Saving and loading Jupyter Notebooks

Jupyter Notebooks can be saved and restored for future use. When you save a Jupyter Notebook, you save the textual contents of each cell, as well as the output of each cell from when it was last run. However, the state of the kernel used to run the notebook (the variables and data structures created when the notebook was run) is not saved. It is possible to save and restore the kernel state by using the third-party dill module in Python, but saving kernel state is not directly supported by Jupyter.

Jupyter Notebook also provides a simple mechanism for reverting the most recent set of changes. When you save a notebook, a “checkpoint” is made from the previous saved copy, and you can revert to that checkpoint by choosing “File | Revert to Checkpoint.” For anything more sophisticated, use an actual version control system like Git on the directory where you keep your notebook.
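To illustrate what a saved notebook does and does not contain, here is a small sketch that builds a notebook by hand and round-trips it through JSON, the text format used by .ipynb files. The field names follow the version 4 notebook layout as best understood here and should be checked against the nbformat documentation; the cell contents and the ids "intro" and "calc" are made up for the example.

```python
import json

# A hand-written minimal notebook. The layout is an assumption based on the
# version 4 notebook format; the cell contents are invented for this sketch.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "id": "intro",
            "metadata": {},
            "source": ["# Demo notebook"],
        },
        {
            "cell_type": "code",
            "id": "calc",
            "execution_count": 1,
            "metadata": {},
            "source": ["x = 6 * 7\n", "print(x)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["42\n"]}
            ],
        },
    ],
}

# Round-trip through JSON text, as saving and reopening the file would.
saved_text = json.dumps(notebook, indent=1)
loaded = json.loads(saved_text)

for cell in loaded["cells"]:
    # Each cell's source comes back as text.
    print(cell["cell_type"], "->", "".join(cell["source"]))
    # Code cells also carry the output captured when they last ran.
    for output in cell.get("outputs", []):
        print("  saved output:", "".join(output["text"]).strip())
```

The loaded notebook holds the source of each cell and the text "42" that the code cell printed, but nothing that represents the live variable x. To get x back, the cell has to be run again in a kernel.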