Invented for Linux kernel development, Git version control now powers millions of projects across the globe. You can use it with or without GitHub.
Git is a software platform mainly used by computer programmers for collaboration. At its core, Git keeps track of changes to files and allows multiple users to coordinate updates to those files. The most common use case for Git is developers working on source code files, but it could be used to manage updates to files of any type.
Git is also the version control standard for GitHub and other source code management systems, and it is widely used within devops to implement CI/CD. For developers deploying and managing their applications on Kubernetes or other cloud-native platforms, GitOps offers best practices for working with containerized clusters and applications.
Is Git a programming language?
Git is not a programming language, but it’s become incredibly important for computer programmers working in almost any language you can name. Today, Git is the de facto standard for what’s known as version control software. Programmers use version control to keep track of updates to large codebases, roll back to earlier versions if needed, and see any changes that were made, as well as who made them. It’s become an integral part of agile software development, and is a central feature of GitOps, which extends the agile devops philosophy to container-based systems.
Why is it called Git?
Git’s name is intimately tied to its history. Git was created by someone whose name you almost certainly know: Linus Torvalds, the creator of Linux. Git was created in 2005 specifically to help manage the development of the Linux kernel. Torvalds was dissatisfied with many other version control systems at the time, and BitKeeper, which was favored by some kernel developers, wasn’t open source. (It’s a testament to Torvalds’s impact on computing that a software platform as ubiquitous as Git is only his second-largest claim to fame.)
When the earliest version of Git was rolled out, Torvalds cheekily offered a variety of explanations for its name. The most likely explanation is that Git is a three-letter combination that was easy to pronounce and wasn’t already in use by another Unix command. The word also sounds like get—relevant because you can use Git to get source code from a server. The word git is also a mild term of abuse in British English—relevant if you’re getting mad at some software. Torvalds added that you might say it’s an abbreviation for “global information tracker” if you were in a good mood, and “goddamn idiot truckload of [rude word here]” if you were in a bad one.
Who owns Git?
As noted, Git was specifically created as an open source alternative to existing version control software, which means that no single person or entity controls it. A few months after its creation, Torvalds handed off maintenance duties to Junio Hamano, who had been a major contributor to the project up to that point. Hamano, who now works for Google, continues to be Git’s core maintainer today.
Git vs. GitHub
Git offers distributed version control functionality. You can use Git to manage your own private coding efforts on your computer alone, but it’s much more commonly used for multiple people on multiple computers who want to collaborate. In such projects, the canonical version of the source code lives on a server somewhere—a central repository in Git parlance—and individual users can upload and download updates from that repository.
Git allows you to use your own computer as a central repository for others or set one up elsewhere, but there are also many service providers who offer commercial Git hosting services. GitHub, founded in 2008 and purchased by Microsoft in 2018, is by far the most prominent, offering not just hosting services but a variety of other features. You can learn more about GitHub from InfoWorld, but the important thing to keep in mind for now is that, while GitHub is built around development with Git, you don’t need to use GitHub to use Git.
Version control with Git
We’ve covered some of the basics, so now let’s dive into more detail about how Git works and why it’s so popular. A full-blown Git tutorial is beyond the scope of this article, but we can look into the most important Git concepts and terminology to get you started.
Git repository
We’ve already touched on the concept of a repository. The repository is the conceptual space where all parts of your project live. If you’re working on a project by yourself, you likely need just a single repository, whereas on a collaborative project, you would likely be working from a central repository. The central repository would be hosted on a server or a central provider like GitHub, and each developer would also have their own repository on their own computer. (We’ll discuss how the code files in all those repositories get properly synced up in a moment.)
A Git repository is subdivided into two areas. There’s a staging area, where you can add and remove files that make up your project, and then there’s the commit history. Commits are at the heart of how Git works, so let’s discuss them next.
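To make the staging area concrete, here is a minimal sketch run in a throwaway directory. The file names app.py and notes.txt are made up for illustration; the Git commands themselves are standard.

```shell
set -e
# Work in a temporary directory so nothing else on your machine is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Two new files; Git treats them as untracked until they are staged.
echo "print('hello')" > app.py
echo "scratch notes" > notes.txt

git add app.py notes.txt       # add both files to the staging area
git rm -q --cached notes.txt   # take notes.txt out again (the file stays on disk)

# app.py is now listed as staged, notes.txt as untracked.
git status --short
```

Nothing has been recorded in the commit history at this point; staging only decides what the next commit will contain.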
Git commit
A commit can best be thought of as a snapshot of what your project looks like at a given moment in time. Once you’re satisfied with the files you’ve put in your staging area, you would issue the git commit command, which freezes in time the current state of those files. You can make further changes and new commits down the line, but you’ll always be able to revert back to a previous commit. You can also compare two commits to get a quick look at what’s changed in your project.
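As a rough sketch of that cycle, the following creates two commits, compares them, and then restores the file as it was in the first one. The file name, commit messages, and the name and email address are placeholders chosen for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity for this repo only
git config user.email "dev@example.com"

echo "version one" > readme.txt
git add readme.txt
git commit -q -m "First snapshot"

echo "version two" > readme.txt
git add readme.txt
git commit -q -m "Second snapshot"

git log --oneline        # lists both commits, newest first
git diff HEAD~1 HEAD     # shows what changed between the two snapshots

# Bring back readme.txt as it was one commit ago and record that as a new commit.
git checkout -q HEAD~1 -- readme.txt
git commit -q -m "Restore first version of readme"
cat readme.txt           # prints: version one
```

Note that going back here adds a third commit rather than erasing the second, so the history still shows everything that happened.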
An important thing to keep in mind is that creating a commit isn’t the same thing as putting code into production. A commit creates a version of your application that you can test, experiment with, and so on. A development team can rapidly iterate through commits as part of the process of getting an application into a production-ready state.
Git stash
Even though commits can be reverted, they do represent a certain amount of, well, commitment. If you’re working on files in your working directory or staging area and want to move on to something else, without actually committing your changes, you can use the git stash command to save them away for later use.
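A small sketch of stashing, assuming a repository with one commit and some unfinished edits. The file name and contents are illustrative.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

echo "stable" > feature.txt
git add feature.txt
git commit -q -m "Baseline"

echo "half-finished idea" >> feature.txt  # uncommitted work in progress
git stash -q                              # shelve it; the file returns to its committed state
cat feature.txt                           # prints: stable

git stash list                            # shows the one shelved entry
git stash pop -q                          # reapply the shelved changes and drop the entry
cat feature.txt                           # prints both lines again
```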
Git branch and git merge
So far, you might imagine commits as a linear series of snapshots of code evolving over time. But one of the really cool and powerful aspects of Git is that you can use it to work on different versions of your application in parallel, which is crucial for agile software development.
To understand Git branches and merging in practice, imagine you’ve got an application called CoolApp, with version 1.0 in production. You’re steadily working on CoolApp 2.0, with all sorts of fun new features, which you are developing in the form of a series of commits in your repository. But then you find out that CoolApp 1.0 has a serious security flaw and needs a patch right away. You can go back to your commit of CoolApp 1.0, make the patch, and send that code into production as CoolApp 1.1—all without disturbing or adding to the series of commits leading to CoolApp 2.0, which still have 1.0 as their parent. Versions 1.1 and 2.0 are now said to be on separate branches of your codebase. Because version 1.1 is in production while 2.0 is under development, we call 1.1 the main branch.
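The CoolApp scenario might look something like this on the command line. This is a sketch under assumptions: the branch names main and develop, the tag v1.0, and the file names are illustrative choices, not something Git requires.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # call the first branch "main" whatever Git's default is
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

echo "CoolApp 1.0" > app.txt
git add app.txt
git commit -q -m "Release 1.0"
git tag v1.0                              # remember which commit was shipped

# Work toward 2.0 happens on its own branch.
git checkout -q -b develop
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Start 2.0 feature work"

# The security fix starts again from the 1.0 commit, untouched by the 2.0 work.
git checkout -q main
echo "security patch" > patch.txt
git add patch.txt
git commit -q -m "Release 1.1: security fix"

git log --oneline --graph --all           # two lines of history that share the 1.0 commit
```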
Once CoolApp 2.0 is ready to roll out, you need to combine its new code and functionality with the security update from version 1.1. This process, called merging the two branches, is a key part of Git’s magic. Git tries to create a new commit out of two different “parents,” meaning, the most recent commits from the two branches. It creates the new commit by comparing its predecessors back to the point where the two branches split off, then consolidating all the changes made along both branches in the new, merged commit. If some piece of information—a specific block of code, say—was changed in both branches, in different ways, Git would punt the question of which version belonged in the new commit back to the developer.
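Here is a sketch of both outcomes described above: first a merge where the branches touched different files, then one where both changed the same line. Branch names, file names, and contents are assumptions made for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # call the first branch "main"
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

printf 'title: CoolApp\nstatus: stable\n' > app.txt
git add app.txt
git commit -q -m "Release 1.0"

git checkout -q -b develop
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add 2.0 feature"

git checkout -q main
echo "security patch" > patch.txt
git add patch.txt
git commit -q -m "Release 1.1: security fix"

# The changes do not overlap, so Git builds a merge commit with two parents.
git merge -q --no-edit develop
git log --oneline --graph

# Now change the same line differently on two branches.
git checkout -q -b experiment
printf 'title: CoolApp\nstatus: experimental\n' > app.txt
git commit -q -am "Mark as experimental"
git checkout -q main
printf 'title: CoolApp\nstatus: patched\n' > app.txt
git commit -q -am "Mark as patched"

# Git cannot choose between the two edits, so the merge stops and reports a conflict.
if ! git merge -q --no-edit experiment; then
  git diff --name-only --diff-filter=U    # lists the files that need a manual decision
fi
```

After the failed merge, app.txt contains both versions of the line between conflict markers. You would edit the file to keep what you want, then git add and git commit to finish the merge.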
Git checkout
Many large projects have multiple active branches under development at once, in parallel. The git checkout command is how you change which branch you’re actively working on. This process updates the files in the working directory to the latest versions for the branch you’re interested in; all your new commits will then be committed on that branch until you check out another one.
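A quick sketch of switching back and forth, with the same file holding different contents on each branch. The branch name feature and the file name are illustrative.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # call the first branch "main"
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

echo "main content" > file.txt
git add file.txt
git commit -q -m "Initial commit on main"

git checkout -q -b feature                # create a branch and switch to it in one step
echo "feature content" > file.txt
git commit -q -am "Change file on feature"

git checkout -q main
cat file.txt                              # prints: main content

git checkout -q feature
cat file.txt                              # prints: feature content
git rev-parse --abbrev-ref HEAD           # prints: feature
```

Git 2.23 and later also offer git switch, which handles only the branch-changing part of what git checkout does.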
Using Git for collaboration
So far, we’ve been talking about what happens in a Git repository as if you were the only one working on it. But Git is best known as a collaborative tool. Next, we’ll look at how Git concepts work in collaborative contexts.
Git clone
The easiest way to start collaborating with others on a project is by cloning a repository that already exists on another computer. Cloning downloads the entire contents of that repository into a repository on your own machine.
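Cloning normally points at a URL on a server, but it works the same way against a directory on disk, which makes for a self-contained sketch. The directory names original and copy are made up for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for a repository that already exists somewhere else.
git init -q original
cd original
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
echo "shared code" > lib.txt
git add lib.txt
git commit -q -m "Initial commit"
cd ..

# Clone it: files and full history are copied into a new repository.
git clone -q original copy
cd copy
git log --oneline                         # the history came along
git remote -v                             # "origin" points back at the source repository
```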
We’ve already discussed the concept of a central repository. It’s very common for projects to treat such a repository, hosted on GitHub or elsewhere, as the canonical “source of truth” about what a project’s codebase looks like. Let’s assume such an arrangement for the remainder of this article. Do note, however, that the question of which repository is the central one is a matter of convention agreed upon by project participants and isn’t enforced by Git itself. In theory, you could have various repositories exchanging code with no single repository being central.
Git pull and Git push
We’ve discussed how Git can reconcile two branches of commits on the same machine. It can do the same for two branches on separate machines, using basically the same techniques. The process by which one branch is moved between machines is called either a pull or a push, depending on how it’s initiated. If you’re bringing a branch from a remote server onto your machine, you’re pulling. If you’re sending a branch from your machine to another, you’re pushing.
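The sketch below simulates this on one machine: a bare repository plays the central server, and two clones named alice and bob play the developers. All of these names, and the file notes.txt, are assumptions for illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A bare repository holds history but no working files; it stands in for the server.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/main

# Alice clones the (still empty) central repository and pushes the first commit.
git clone -q central.git alice
cd alice
git symbolic-ref HEAD refs/heads/main     # make sure her first branch is "main"
git config user.name "Alice"              # placeholder identity
git config user.email "alice@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin main
cd ..

# Bob clones after that push, so he starts with Alice's first commit.
git clone -q central.git bob

# Alice pushes a second commit.
cd alice
echo "second line" >> notes.txt
git commit -q -am "Extend notes"
git push -q origin main
cd ..

# Bob pulls to bring his copy up to date.
cd bob
git pull -q --ff-only origin main
cat notes.txt                             # prints both lines
```

The --ff-only flag tells Git to update only if Bob has no commits of his own that would need merging, which keeps the sketch free of merge prompts.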
Git pull request
Pushing your code onto another machine—or onto the central repository that the whole project depends on—may seem kind of, well, pushy. A more common scenario, which is key to the collaborative nature of Git development, is the pull request. Let’s say you’ve finalized the code for a new feature, and you want it integrated into your project’s codebase. You’d issue a pull request, which formally asks the project managers to pull your new code onto the central repository.
The pull request not only gives the project managers the chance to accept or reject your contribution, it also creates a mini-discussion forum on the central repository where all project members can chime in about the request. This is a key way that developers can hash out changes to a project’s codebase, especially in open source projects where Git might be the primary place where contributors interact.
Git fork
A branch is meant to be a temporary departure from the main codebase, which will ultimately be merged back into it. A fork, on the other hand, is a more permanent departure. For open source projects in particular, a fork happens when a developer decides they want to take an existing open source codebase and develop it for their own goals, which may be different from those of the project’s current maintainers. GitHub makes it particularly easy to fork from existing Git repositories; with a single click you can clone an existing repository and start working on it on your own terms.
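On a hosting service the fork itself is created on the server, but the idea can be approximated locally: an independent copy that goes its own way while still being able to see what the original project does. In this sketch the directory names upstream-project and my-fork, and the remote name upstream, are conventions chosen for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the original open source project.
git init -q upstream-project
cd upstream-project
git symbolic-ref HEAD refs/heads/main     # call the first branch "main"
git config user.name "Maintainer"         # placeholder identity
git config user.email "maintainer@example.com"
echo "original code" > core.txt
git add core.txt
git commit -q -m "Initial release"
cd ..

# Your independent copy; the original is kept as a remote called "upstream".
git clone -q upstream-project my-fork
cd my-fork
git remote rename origin upstream
git config user.name "Fork Author"        # placeholder identity
git config user.email "fork@example.com"
echo "my own direction" > fork-notes.txt
git add fork-notes.txt
git commit -q -m "Take the project somewhere new"
cd ..

# Meanwhile the original project keeps moving.
cd upstream-project
echo "upstream change" >> core.txt
git commit -q -am "Upstream improvement"
cd ..

# The fork can still look at upstream's work and decide whether to take it.
cd my-fork
git fetch -q upstream
git log --oneline main..upstream/main     # commits upstream has that the fork does not
```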
Git with Windows
As noted earlier, Git was developed originally for Linux kernel development, and it takes the form of a series of command-line utilities. Its structure and command syntax are very much based on Unix, which means it runs more or less natively on Unix-like operating systems such as Linux and macOS. Porting Git to Windows is a little trickier, and relies on Git Bash, a Bash shell emulation for Windows that’s built into Git for Windows.
GUI and IDE integration
Of course, many Windows developers are accustomed to using a GUI, and so Git for Windows also includes a graphical user interface. Users of macOS and Linux shouldn’t feel left out, either: there are plenty of GUIs to go around. Cross-platform GUIs also exist and offer various bells and whistles.
You can also integrate Git into your favorite IDEs, including Eclipse and Microsoft Visual Studio.
Git tutorial: How to use Git and GitHub
Are you ready to learn more about using Git and Git commands? To start, we recommend the comprehensive and easy-to-follow tutorial from Atlassian. Do note that Atlassian makes this tutorial available in the context of Bitbucket, which is Atlassian’s competitor to GitHub, but it’s still a great introduction to Git basics.
If you want to learn how to use GitHub, InfoWorld’s own Martin Heller has a great tutorial for you. And if you want to better understand the technical details of how Git works under the covers—particularly how it stores the various components of your project—check out “Commits are snapshots, not diffs” on the GitHub blog.