Git: Learn Everything About Git in a Single Post

Hi! I’m Sourav, the founder of SmartSourav.com. I hope this post gives you a clear idea of what Git is. Before getting started with Git, you need to know a few terms: version control system (VCS), distributed VCS (DVCS), centralized VCS (CVCS), and source code. If you get stuck or have any questions about Git, just get in touch with me or leave a comment below this post, and I’ll help you out. So, let’s get started!

What’s a version control system?

A version control system, or VCS, keeps track of every modification to the code in a special kind of database as people and teams collaborate on projects together. As the project evolves, teams can run tests, fix bugs, and contribute new code with the confidence that any version can be recovered at any time.

Put simply, a version control system is a category of software tools that helps a software team manage changes to source code over time. If a mistake is made, developers can turn back the clock and compare earlier versions of the code to help fix the mistake while minimizing disruption to all team members.

There are two types of version control system:

1. Distributed/decentralized version control system (DVCS)

2. Centralized version control system (CVCS)

Distributed version control

In software development, distributed version control (also known as distributed revision control) is a form of version control where the complete codebase – including its full history – is mirrored on every developer’s computer. This allows branching and merging to be managed automatically, speeds up most operations (except pushing and pulling), improves the ability to work offline, and does not rely on a single location for backups.

In 2010, software development author Joel Spolsky described DVCS as “possibly the biggest advance in software development technology in the past ten years.”

Differences between distributed and centralized version control

Compared with a centralized system, a DVCS has some drawbacks:

1. Initial checkout of a repository is slower than a checkout in a centralized version control system, because all branches and the full revision history are copied to the local machine by default.
2. It lacks the locking mechanisms that are part of most centralized VCSs. Locking still plays an important role for non-mergeable binary files such as graphics assets, or for complex single-file binary or XML packages (e.g. office documents, PowerBI files, SQL Server Data Tools BI packages, etc.).
3. Additional storage is required for every user to have a complete copy of the codebase history.

A centralized version control system (CVCS) uses a central server to store all files and enable team collaboration. The major drawback of a CVCS is its single point of failure: the central server. If the central server goes down for an hour or two, no one can collaborate at all during that time. In the worst case, if the disk of the central server gets corrupted and no proper backup has been taken, you lose the entire history of the project. This is where the distributed version control system (DVCS) comes into the picture.

DVCS clients not only check out the latest snapshot of the directory but also fully mirror the repository. If the server goes down, the repository from any client can be copied back to the server to restore it. Every checkout is a full backup of the repository. You can commit changes, create branches, view logs, and perform other operations while you are offline. You need a network connection only to publish your changes and to fetch the latest changes from others.
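To make the "every checkout is a full backup" idea concrete, here is a minimal sketch that runs entirely on one machine. The directory names server and client, and the user name and email, are placeholders I made up for illustration; a real setup would have the server on another host.

```shell
# Sketch only: "server" and "client" are placeholder directories on one machine.
work=$(mktemp -d)
cd "$work"

# Create a stand-in for the shared server repository with two commits.
git init -q server
git -C server config user.name "Example User"
git -C server config user.email "user@example.com"
echo "first line" > server/notes.txt
git -C server add notes.txt
git -C server commit -q -m "First commit"
echo "second line" >> server/notes.txt
git -C server commit -q -am "Second commit"

# Cloning mirrors the full history, not just the latest snapshot.
git clone -q server client
client_commits=$(git -C client rev-list --count HEAD)
echo "commits in the clone: $client_commits"

# Simulate losing the server, then restore it from the client's copy.
rm -rf server
git clone -q client server
restored_commits=$(git -C server rev-list --count HEAD)
echo "commits after restore: $restored_commits"
```

Both counts should come out as 2, since the clone carries every commit with it.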

Advantages of DVCS include:

1. Allows users to work productively when not connected to a network.
2. Communication is only necessary when sharing changes with other peers.
3. Allows private work, so users can use their changes even for early drafts they do not want to publish.
4. Common operations (such as viewing history, committing, and reverting changes) are faster because there is no need to communicate with a central server.
5. Working copies effectively function as remote backups, which avoids relying on one physical machine as a single point of failure.
6. Allows various development models to be used, such as using development branches or a Commander/Lieutenant model.
7. Permits centralized control of the “release version” of the project.
8. On FOSS projects, it is much easier to create a project fork from a project that is stalled because of leadership conflicts or design disagreements.
9. The distributed model is generally better suited for large projects with partly independent developers, such as the Linux kernel project, because developers can work independently and submit their changes for merge (or rejection).

The distributed model flexibly allows adopting custom source code contribution workflows. The integrator workflow is the most widely used.

In the centralized model, developers must serialize their work, to avoid problems with different versions.

Source Code

In computing, the source code is any collection of code, possibly with comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the work of computer programmers, who specify the actions to be performed by a computer mostly by writing source code. The source code is often transformed by an assembler or compiler into binary machine code understood by the computer. The machine code might then be stored for execution at a later time. Alternatively, source code may be interpreted and thus immediately executed.

Most application software is distributed in a form that includes only executable files. If the source code were included it would be useful to a user, programmer or a system administrator, any of whom might wish to study or modify the program.

Definitions
The Linux Information Project defines source code as:

Source code (also referred to as source or code) is the version of the software as it is originally written (i.e., typed into a computer) by a human in plain text (i.e., human readable alphanumeric characters).

The notion of the source code may also be taken more broadly, to include machine code and notations in graphical languages, neither of which are textual in nature.

What is Git?

Git is a distributed version control system for tracking changes in source code during software development. It is designed for coordinating work among programmers, but it can be used to track changes in any set of files. Its goals include speed, data integrity, and support for distributed, non-linear workflows.

As with most other distributed version-control systems, and unlike most client-server systems, every Git directory on every computer is a full-fledged repository with a complete history and full version-tracking abilities, independent of network access or a central server.

Git is free and open-source software distributed under the terms of the GNU General Public License version 2.

Why Git?

According to a recent survey, more than 70 percent of developers use Git, making it the most-used VCS in the world. Git is commonly used for both open source and commercial software development, with significant benefits for individuals, teams, and businesses.

Git lets developers work in every time zone. With a DVCS like Git, collaboration can happen any time while maintaining source code integrity. Using branches, developers can safely propose changes to production code.

Developers can see the entire timeline of their changes, decisions, and progression of any project in one place. From the moment they access the history of a project, the developer has all the context they need to understand it and start contributing.

Businesses using Git can break down communication barriers between teams and keep them focused on doing their best work. Plus, Git makes it possible to align experts across a business to collaborate on major projects.

How does Git work?

When you perform actions in Git, nearly all of them only add data to the Git database. Git has three main states that your files can reside in: committed, modified, and staged.

  • Committed means that the data is safely stored in your local database.
  • Modified means that you have changed the file but have not committed it to your database yet.
  • Staged means that you have marked a modified file in its current version to go into your next commit snapshot.
This leads us to the three main sections of a Git project: the Git directory, the working tree, and the staging area.
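You can watch a file move through the three states with git status. The sketch below uses a throwaway repository; the file name, user name, and email are placeholders. The --porcelain option prints a short two-column code per file, where the first column describes the staging area and the second the working tree.

```shell
# Throwaway repository; all names here are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Committed: the file matches the last commit, so status prints nothing.
echo "version one" > file.txt
git add file.txt
git commit -q -m "Add file.txt"
committed=$(git status --porcelain)

# Modified: changed in the working tree but not staged (" M").
echo "version two, now edited" > file.txt
modified=$(git status --porcelain)

# Staged: marked to go into the next commit ("M ").
git add file.txt
staged=$(git status --porcelain)

printf 'committed: [%s]\nmodified:  [%s]\nstaged:    [%s]\n' \
    "$committed" "$modified" "$staged"
```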

[Figure: Git workflow]

The Git directory is where Git stores the metadata and object database for your project. This is the most important part of Git, and it is what is copied when you clone a repository from another computer.

The working tree is a single checkout of one version of the project. These files are pulled out of the compressed database in the Git directory and placed on disk for you to use or modify.

The staging area is a file, generally contained in your Git directory, that stores information about what will go into your next commit. Its technical name in Git parlance is the “index”, but the phrase “staging area” works just as well.
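If you want to see these three sections for yourself, the following sketch pokes at a fresh repository. The file name readme.txt is just a placeholder, and the paths printed assume the commands run at the top level of the repository.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Working tree: an ordinary file on disk.
echo "hello" > readme.txt

# Staging area: after git add, the file is recorded in the index.
git add readme.txt
staged_files=$(git ls-files)

# Git directory: where the object database and the index file live.
git_dir=$(git rev-parse --git-dir)
if [ -f "$git_dir/index" ]; then index_present=yes; else index_present=no; fi

echo "Git directory: $git_dir"
echo "Index file present: $index_present"
echo "Files in the staging area: $staged_files"
```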

The basic Git workflow goes like this:

  1. You modify files in your working tree.
  2. You selectively stage just those changes you want to be part of your next commit, which adds only those changes to the staging area.
  3. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.
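Here is a small sketch of that workflow, with the focus on step 2: only what you stage ends up in the commit. The file names a.txt and b.txt, and the user name and email, are placeholders for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# 1. Modify files in the working tree (here: create two new files).
echo "ready to share" > a.txt
echo "still a draft" > b.txt

# 2. Stage only the change you want in the next commit.
git add a.txt

# 3. Commit: the snapshot contains what was staged, nothing else.
git commit -q -m "Add a.txt"

in_commit=$(git ls-tree --name-only HEAD)
left_over=$(git status --porcelain)
echo "files in the commit: $in_commit"
echo "not yet committed:   $left_over"
```

The draft file b.txt stays in the working tree as an untracked file (shown as ??) until you decide to stage it.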

How do you check the version of Git on your machine?

Simply run this command on your system:

git --version

If a version number is returned, Git is successfully installed on your computer. If not, you can download it from this link.
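If you would rather have a script do the check, one possible sketch is below. It relies on the shell's command -v lookup to see whether a git executable is on your PATH; the variable name git_status is my own.

```shell
# Report whether a git executable can be found on the PATH.
if command -v git >/dev/null 2>&1; then
    git_status="installed: $(git --version)"
else
    git_status="not installed"
fi
echo "$git_status"
```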

Setting config values:

Now you need to set the global configuration variables, which are very important, especially if you are working with other developers.

git config --global user.name "Sourav Mandal"
git config --global user.email "srvmndl9217@mail.com"
git config --list
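The commands above write to your global configuration. As a related sketch, if you leave out --global inside a repository, the value is stored for that repository only, which is handy when one project needs a different identity. The name and email below are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the values are stored in this repository's .git/config.
git config user.name "Example User"
git config user.email "user@example.com"

# Read single values back.
name=$(git config user.name)
email=$(git config user.email)
echo "$name <$email>"
```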

Help Command:

If you need help in any step, just type:

git help config
git config --help

Both of these commands do the same thing: they open the manual page for git config. This is useful for discovering more advanced capabilities of Git.

What’s a repository and how to create it?

A Git repository is a virtual storage of your project. It allows you to save versions of your code, which you can access when needed.

If you have a local project directory that you want to convert into a Git repository so that you can start tracking it, run the command below within the project directory.

git init

Done! Just like that, you have converted your project into a local git repository. If you open up the project folder you will see that a new directory called .git has been created.
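You can confirm this with a quick before-and-after check. This is only a sketch on a temporary directory, which stands in for your own project folder; the file name app.txt is a placeholder.

```shell
# A temporary directory stands in for an existing project folder.
project=$(mktemp -d)
cd "$project"
echo "some existing work" > app.txt

if [ -d .git ]; then before=yes; else before=no; fi
git init -q
if [ -d .git ]; then after=yes; else after=no; fi

# Ask Git itself whether we are now inside a working tree.
inside=$(git rev-parse --is-inside-work-tree)

echo ".git before init: $before"
echo ".git after init:  $after"
echo "inside a work tree: $inside"
```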

A quick note: git init and git clone can be easily confused. At a high level, they can both be used to “initialize a new git repository.” However, git clone depends on git init. git clone is used to create a copy of an existing repository. Internally, git clone first calls git init to create a new repository. It then copies the data from the existing repository and checks out a new set of working files.

Basic Git commands:

To use Git, developers use specific commands to copy, create, change, and combine code. These commands can be executed directly from the command line or by using an application like GitHub Desktop or GitKraken. Here are some common commands for using Git:

  • git init initializes a brand new Git repository and begins tracking an existing directory. It adds a hidden subfolder within the existing directory that houses the internal data structure required for version control.
  • git clone creates a local copy of a project that already exists remotely. The clone includes all the project’s files, history, and branches.
  • git add stages a change. Git tracks changes to a developer’s codebase, but it’s necessary to stage and take a snapshot of the changes to include them in the project’s history. This command performs staging, the first part of that two-step process. Any changes that are staged will become a part of the next snapshot and a part of the project’s history. Staging and committing separately gives developers complete control over the history of their project without changing how they code and work.
  • git commit saves the snapshot to the project history and completes the change-tracking process. In short, commit functions like taking a photo. Anything that’s been staged with git add will become a part of the snapshot with git commit.
  • git status shows the status of changes as untracked, modified, or staged.
  • git branch shows the branches being worked on locally.
  • git merge merges lines of development together. This command is typically used to combine changes made on two distinct branches. For example, a developer would merge when they want to combine changes from a feature branch into the master branch for deployment.
  • git pull updates the local line of development with updates from its remote counterpart. Developers use this command if a teammate has made commits to a branch on a remote, and they would like to reflect those changes in their local environment.
  • git push updates the remote repository with any commits made locally to a branch.
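To see several of these commands work together, here is a rough end-to-end sketch. It assumes a local bare repository named remote.git standing in for a hosted remote, and two made-up developers, Alice and Bob, each with their own clone. The default branch name differs between Git setups, so the script reads it instead of hard-coding it.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the remote (a hosted one in real life).
git init -q --bare remote.git

# Alice clones the empty remote (the warning about that is hidden),
# then commits and pushes.
git clone -q remote.git alice 2>/dev/null
git -C alice config user.name "Alice Example"
git -C alice config user.email "alice@example.com"
echo "line 1" > alice/notes.txt
git -C alice add notes.txt
git -C alice commit -q -m "Add notes"
main_branch=$(git -C alice symbolic-ref --short HEAD)
git -C alice push -q origin "$main_branch"

# Bob clones the project and sees Alice's first commit.
git clone -q remote.git bob
bob_before=$(git -C bob rev-list --count HEAD)

# Alice works on a feature branch, merges it back, and pushes again.
git -C alice checkout -q -b feature
echo "line 2" >> alice/notes.txt
git -C alice commit -q -am "Add line 2"
git -C alice checkout -q "$main_branch"
git -C alice merge -q feature
git -C alice branch
git -C alice push -q origin "$main_branch"

# Bob pulls to bring his local branch up to date.
git -C bob pull -q --ff-only
bob_after=$(git -C bob rev-list --count HEAD)
echo "Bob's commits before pull: $bob_before, after pull: $bob_after"
```

Bob starts with one commit and ends with two, because the merge of feature was a simple fast-forward and his pull picked it up from the remote.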

I will add more about Git as soon as possible, so check back in a day or two. Bookmark this page so that you can come back to it in the future. Thanks!