What is Git (version control system) and why you need to use it for your projects (any project with files and content, not just code)

When you work on something for more than a couple of hours, across several days or months, you are bound to try new things: ways of writing, code, images and so on. While doing so, you will probably start creating many files with different names for the same content so that you can keep versions, such as Text-1.txt, texts-1-1.txt, project-a-final.docx, project-a-final-2.docx, and so on. If you have a more structured mindset, you may even put a timestamp in your file names, in the form of project-a-2022-11-12-1211.docx or similar. After a couple of tries, you have tens or even hundreds of files and no longer know which one was the correct one. This gets even more complicated if you also want to collaborate with others without depending on Google Docs (or similar), or if your project cannot be done in such tools.

This is where a version control system (VCS) comes in. Instead of manually creating files, naming them, saving them somewhere and then trying to find out which one you really wanted to use (or which parts of which files), you may as well use a VCS, in this case git, to manage the whole thing. If you use it the right way, you will even have a nice history of your changes.

A VCS, sometimes also known as SCM (source code management) or RCS (revision control system), is a system which tracks changes to a file or a set of files. Usually, developers use a VCS to track changes to code and collaborate with others, but software development is not the only use case.

Types of version control systems

There are several kinds of VCSs, and they are broadly categorized as centralized or distributed.

Centralized VCS

A centralized VCS works on a client-server model, where a central master code base lives on a server, and a developer or collaborator can lock (check out) a piece of this code (version) and work on it. In this case everything is controlled by the server.

The best known examples are CVS and Subversion (open source) and IBM ClearCase (commercial).

Distributed VCS

A distributed VCS works on a peer-to-peer model, where a code base or project is distributed among the individual developers' devices, and the entire history as well as the versions are mirrored on each system.

In this case the emphasis is on changes instead of versions, so that any version (branch) is a combination of many sets of changes (commits).

The main examples here are Git and Mercurial.

In fact, one of git's slogans is "everything is local", which emphasizes the distributed part.
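To make "everything is local" a bit more concrete, here is a minimal sketch, assuming git is installed. The folder names (original, copy), the file notes.txt and the author identity are placeholders invented for this illustration. It creates a small repository in a temporary folder, clones it, and checks that the copy carries the same history.

```shell
set -e
work=$(mktemp -d)          # scratch folder, so nothing real is touched
cd "$work"

# Create a small repository with two commits.
git init -q original
cd original
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add first draft of notes"
echo "second draft" > notes.txt
git commit -q -am "Revise notes"

# Clone it. The copy carries the complete history, not just the latest files.
cd "$work"
git clone -q original copy
original_commits=$(git -C original rev-list --count HEAD)
copy_commits=$(git -C copy rev-list --count HEAD)
echo "original: $original_commits commits, copy: $copy_commits commits"

# The history can be read from the copy without contacting the original.
git -C copy --no-pager log --oneline
```

Both counts should be the same, because a clone is a full mirror and not a snapshot of the latest version.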

Git (the system we will use)

We will discuss Git, which is a distributed version control system.

"Git (/ɡɪt/) is free and open source software for distributed version control: tracking changes in any set of files, usually used for coordinating work among programmers collaboratively developing source code during software development." - Wikipedia

What is git in simple words?

Git is a version control system (VCS) which allows you to track changes to a file or set of files over time.

In other words, Git is a tool or system that allows you to add files to a tracked repository (a folder which was initialized or configured as a Git Repository), commit changes to the repository with a message stating which changes were made and, optionally, push those changes to a centralized server to collaborate with others and pull them to get the changes others made.
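The add, commit, push and pull cycle described above can be sketched as follows. This is only an illustration under a few assumptions: a bare repository in a temporary folder stands in for the central server, and the collaborator names (alice, bob), the file book.md and the email addresses are made up for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the shared server in this sketch.
git init -q --bare server.git

# First collaborator: clone, add a file, commit, push.
# (Git may warn that the cloned repository is empty, which is expected here.)
git clone -q server.git alice
cd alice
git config user.name "Alice Example"          # placeholder identity
git config user.email "alice@example.com"
echo "Chapter 1" > book.md
git add book.md
git commit -q -m "Add first chapter"
branch=$(git rev-parse --abbrev-ref HEAD)     # main or master, depending on your setup
git push -q origin "$branch"

# Second collaborator: a fresh clone already contains Alice's commit.
cd "$work"
git clone -q server.git bob

# Alice keeps working and pushes a second commit.
cd "$work/alice"
echo "Chapter 2" >> book.md
git commit -q -am "Add second chapter"
git push -q origin "$branch"

# Bob pulls to receive the change Alice made.
cd "$work/bob"
git pull -q --ff-only
bob_commits=$(git rev-list --count HEAD)
echo "bob now has $bob_commits commits"
```

On a real project the bare repository would typically be replaced by the URL of a repository on a hosting platform, but the commands stay the same.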

Within this repository you can have different branches (as the branches of a tree), each of which can have a specific version (combination of changes) of a file or set of files. This means you can have a main or master branch (think of this as the tree trunk) from which several branches (or none) can grow (or be created).

The commits and their messages build the log, journal or history of a repository and each of its branches.
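As a small sketch of how that history looks, assuming git is installed (the repository name journal, the file book.md and the commit messages are invented for this example):

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q journal
cd journal
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

# Three changes, each recorded with its own message.
echo "# Outline" > book.md
git add book.md
git commit -q -m "Add outline"
echo "Chapter 1" >> book.md
git commit -q -am "Draft chapter 1"
echo "Chapter 2" >> book.md
git commit -q -am "Draft chapter 2"

# The log lists one line per commit, newest first.
git --no-pager log --oneline

latest_message=$(git log -1 --format=%s)     # subject of the newest commit
commit_count=$(git rev-list --count HEAD)
echo "$commit_count commits, latest: $latest_message"
```

The clearer the commit messages, the more useful this journal becomes when you look for a specific change months later.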

Usually what you want is to have as few branches as possible: only as many as the projects, ideas or issues you are currently working on.

You can think of it like the branches of a bush or garden tree. The more the branches grow, and in more directions, the more complicated it is to prune the tree and maintain a shape. The same goes for repositories: the more branches you have, the more they differ from each other, making it more complicated to get back to the master branch.

Each branch will grow or stagnate independently, and the way to get the changes from one branch to another is to merge them. On a web platform, you usually merge two branches using a merge request (called a pull request on GitHub), where you can check the differences and changes to each file between two branches.

Once you are OK with the merge request, you can approve it, meaning you will mix (merge) the content of each file into a new version of the file on the destination branch (usually master or develop). Alternatively, you can reject the merge request (aka MR) and just discard the changes. You may even remove the whole branch or leave it there for historical purposes.
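Merge requests are a feature of the web platforms, but the same idea can be tried locally with plain git. The following is a minimal sketch, assuming git is installed; the repository name story, the branch names and the file contents are invented for this example. It reviews the differences with git diff, accepts one branch with git merge, and rejects another by deleting it.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q story
cd story
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "Once upon a time." > story.txt
git add story.txt
git commit -q -m "Start the story"
trunk=$(git rev-parse --abbrev-ref HEAD)     # main or master, depending on your setup

# Try an idea on its own branch.
git checkout -q -b alternative-ending
echo "They lived happily ever after." >> story.txt
git commit -q -am "Add a happy ending"

# Back on the trunk: review the differences, then accept them by merging.
git checkout -q "$trunk"
git --no-pager diff "$trunk" alternative-ending
git merge -q alternative-ending
git branch -q -d alternative-ending          # optional: remove the merged branch

# Try another idea and reject it: the branch is deleted without merging.
git checkout -q -b sad-ending
echo "Everyone was miserable." >> story.txt
git commit -q -am "Try a sad ending"
git checkout -q "$trunk"
git branch -q -D sad-ending                  # -D discards the unmerged commit

remaining_branches=$(git branch --format='%(refname:short)')
```

After this, story.txt on the trunk contains the happy ending only, and the trunk is the single remaining branch.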

Use cases

As explained before, a VCS is a way to track files (not only code) and you can use it for almost anything. Videos and really big files might be a bad idea, although there is a solution for that called Git Large File Storage, or Git LFS.

I have used git to track the changes of several books written in markdown and asciidoc, this site (written in markdown, reStructuredText and Python), as well as personal and professional programming projects and notes. It can even be used as an alternative to Dropbox (check git-annex). There are ways for designers to use git to track changes to designs, parsers for office files, and many others.

The main point here is that git is not only for tracking code.

How to use Git

You may use git on the command line, through a desktop graphical user interface or through a web-based management platform. There are several web platforms to choose from, but the most popular ones are:

  • GitHub (SaaS, now part of Microsoft)
  • GitLab (SaaS and open source)
  • Gitea (open source, self-hosted)
  • Codeberg (Gitea as non-profit SaaS, backed by the non-profit Codeberg e.V. in Germany)
  • Bitbucket (commercial)

The web-based management platforms also add several features which are not part of git (or any vcs) but are really useful (mostly for software projects but also for other kinds of projects).

For example they usually include:

  • Issue tracking (integrated with git)
  • Continuous Integration / Continuous Delivery (CI/CD)
  • Merge requests and comments
  • Approval or rejection of merge requests
  • ACLs or access control lists for repositories based on roles and permissions
  • A web editor for code with syntax-highlighting

Depending on which platform you use to create and manage your repository, the way to do the same thing may differ, and git itself is really flexible.

Important: You don’t need any web platform to use git, as it can run completely locally.

Git on the command line

There are many commands in git, but with the following you can get started:

# initialize a repository (folder)
git init

# add files for the repository
git add file

# add all files in this folder to the repository
git add .

# commit changes for the added files
git commit -m "short message describing the changes"
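To see how these commands fit together, here is a short session as a sketch, assuming git is installed. It adds two commands that are not in the list above, git status and git diff, which show what state your files are in. The folder my-project, the file readme.txt and the author identity are placeholders for this example.

```shell
set -e
work=$(mktemp -d)
cd "$work"
mkdir my-project
cd my-project

git init -q
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "Hello" > readme.txt
git status --short          # "?? readme.txt": the file is not tracked yet
git add readme.txt
git status --short          # "A  readme.txt": the file is staged for the next commit
git commit -q -m "Add readme"

echo "Hello, world" > readme.txt
git status --short          # " M readme.txt": modified, but not staged
git --no-pager diff         # shows the exact lines that changed
git add readme.txt
git commit -q -m "Expand greeting in readme"

status_after=$(git status --short)           # empty when everything is committed
commit_count=$(git rev-list --count HEAD)
echo "$commit_count commits so far"
```

Running git status often is a cheap habit: it tells you what the next git add or git commit would pick up.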

Git on GitHub

Go to https://github.com and create a new account to get started.

Git on GitLab

Go to https://gitlab.com and create a new account to get started.

Git workflow: GitFlow

There are many ways of using git (workflows or branching strategies) which one could follow. Two of the most popular ones are Git-Flow and trunk-based development.

There are already excellent comparisons which you can refer to if you want to learn more. Check https://www.devbridge.com/articles/branching-strategies-git-flow-vs-trunk-based-development/ and https://www.flagship.io/git-branching-strategies/.

I personally use Git-Flow.
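As a rough idea of what Git-Flow looks like in practice, here is a simplified sketch using plain git commands, assuming git is installed. It keeps only the core of the model: a long-lived develop branch, a short-lived feature branch merged back with a merge commit, and a tagged release on the trunk. The release and hotfix branches of the full model are left out, and the repository name site, the branch feature/about-page, the tag v1.0 and the file names are invented for this example.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q site
cd site
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.com"

echo "Home" > index.md
git add index.md
git commit -q -m "Add home page"
trunk=$(git rev-parse --abbrev-ref HEAD)     # main or master, depending on your setup

# develop is the long-lived branch where finished work is collected.
git checkout -q -b develop

# Each piece of work gets its own feature branch, started from develop.
git checkout -q -b feature/about-page
echo "About" > about.md
git add about.md
git commit -q -m "Add about page"

# Finish the feature: merge it back into develop with an explicit merge commit.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/about-page into develop" feature/about-page
git branch -q -d feature/about-page

# Release (simplified): bring develop into the trunk and tag the result.
git checkout -q "$trunk"
git merge -q --no-ff -m "Release 1.0" develop
git tag v1.0

git --no-pager log --oneline --graph
```

The --no-ff option forces a merge commit even when git could simply fast-forward, so the log keeps a visible record of where each feature started and ended.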

Article updates

  • 2022-12-23: added links for git web platforms, added syntax highlighting for bash and links for git workflows and social share image
