Difference between revisions of "Source Control"

From Coder Merlin
(Changed invigilator to guide.)
m (Editorial review and minor corrections)
 
(7 intermediate revisions by 3 users not shown)
Line 8: Line 8:


== Introduction ==
== Introduction ==
'''Source control''' enables us to ''track'' and ''manage'' changes to our code. This functionality becomes increasingly critical as the size of our projects grow both in terms of the lines of code and the number of coders. Source control shows us ''who'' changed the code and ''when''. We're able to ''compare'' one '''revision''' of code to another. And, when necessary, we can '''rollback''' changes to a previous revision. Source control can be a tremendous help to you (and your team) when you want to easily recover from accidentally damaging a project and as such, provides you with the freedom to experiment without fear. However, you must use the source control system regularly and often or it won't be able to help you. In this experience we'll configure our source control system, learn (a bit) about how to use it, and then place our journals under source control.
'''Source control''' enables us to ''track'' and ''manage'' changes to our code. This functionality becomes increasingly critical as the size of our projects grows both in terms of the lines of code and the number of coders. Source control shows us ''who'' changed the code and ''when''. We're able to ''compare'' one '''revision''' of code to another. And, when necessary, we can '''roll back''' changes to a previous revision. Source control can be a tremendous help to you (and your team) when you want to easily recover from accidentally damaging a project and, as such, provides you with the freedom to experiment without fear. However, you must use the source control system regularly and often or it won't be helpful to you. In this experience, we'll configure our source control system, learn (a bit) about how to use it, and then place our journals under source control.


== Git ==
== Git ==
Many options are available to choose from when selecting a source control system. We'll be using one called {{GlossaryReference|Git|git}}, created by Linus Torvalds in 2005. It's a '''distributed version-control system''', meaning that every Git directory on every computer is a repository with a complete history and full version-tracking abilities.
Many options are available to choose from when selecting a source control system. We'll be using one called {{GlossaryReference|Git|Git}}, created by Linus Torvalds in 2005. It's a '''distributed version-control system''', meaning that every Git directory on every computer is a repository with a complete history and full version-tracking abilities.


Let's say you're a graphic artist specializing in photo restoration, and working for the director of a museum. You've received an old photo that you're responsible to restore. Every step requires a lot of work and, being human, sometimes you make a mistake. Even if you didn't make a mistake, your director might not agree with the choices that you've made. You (smartly) decide that just like every other project you've done, you'll track every version of your file in git.
Let's say you're a graphic artist specializing in photo restoration and working for the director of a museum. You've received an old photo that you're responsible to restore. Every step requires a lot of work and, being human, sometimes you make a mistake. Even if you didn't make a mistake, your director might not agree with the choices that you've made. You (smartly) decide that just like every other project you've done, you'll track every version of your file in Git.


Let's consider your progress through this process:
Let's consider your progress through this process:
[[File:Alan-Turing-Enhancement.png|link=]]
[[File:Alan-Turing-Enhancement.png|link=]]


You '''commit''' each version of your project into git. Git considers each ''commit'' to be a '''snapshot''' of the current project. Each snapshot includes every current version of every file in your project that you've added to git, so it becomes a simple matter to move back in time to any version. While you're able to attach comments to each commit, internally, git uses a something called an '''SHA-1 hash''' to uniquely identify each commit. The ''hash'' is a 40-character string generated from the content of the files and directory structure. The hash will look something like this: ''24b9da6552252987aa493b52f8696cd6d3b00373''. You'll see this type of string in the git log.   
You '''commit''' each version of your project into Git. Git considers each ''commit'' to be a '''snapshot''' of the current project. Each snapshot includes every current version of every file in your project that you've added to Git, so it becomes a simple matter to move back in time to any version. While you're able to attach comments to each commit, internally, Git uses something called an '''SHA-1 hash''' to uniquely identify each commit. The ''hash'' is a 40-character string generated from the content of the files and directory structure. The hash looks something like this: ''24b9da6552252987aa493b52f8696cd6d3b00373''. You'll see this type of string in the Git log.


{{Hint|The SHA-1 hash is a bit long and unwieldy to type. In most cases git will accept the first few characters of a hash, so you don't need to type the whole thing.
{{Hint|The SHA-1 hash is a bit long and unwieldy to type. In most cases, Git accepts the first few characters of a hash, so you don't need to type the whole thing.
}}
}}


Line 26: Line 26:


== Configuration ==
== Configuration ==
As a first step, we'll need to let Git know about our name and email address. ''Be sure to change your name and email address appropriately. You'll need to be able to receive email at the address you specify in order to complete the setup process.''
As a first step, we'll need to let Git know about our name and email address. ''Be sure to change your name and email address appropriately. To complete the setup process, you'll need to be able to receive email at the address you specify.''


{{ConsoleLine|jane-williams@codermerlin:~$|git config --global user.email "jane@williams.org"}}
{{ConsoleLine|jane-williams@codermerlin:~$|git config --global user.email "jane@williams.org"}}
Line 34: Line 34:


== Initialization ==
== Initialization ==
Let's setup a new directory for all of our experiences and within, a directory for this experience:
Let's set up a new directory for all of our experiences and within that, a directory for this experience:


{{ConsoleLine|jane-williams@codermerlin:~$|mkdir Experiences}}
{{ConsoleLine|jane-williams@codermerlin:~$|mkdir Experiences}}
Line 42: Line 42:
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|}}
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|}}


One can consider the structure of a project, consisting of directories, files, and the associated content, as two dimensional. The '''repository''' can then be considered a three-dimensional structure formed by storing every committed version of the project structure across time. In order to initialize the repository, we issue the '''init''' command in the root of our project:
One can consider the structure of a project, consisting of directories, files, and the associated content, as two dimensional. The '''repository''' can then be considered a three-dimensional structure formed by storing every committed version of the project structure across time. To initialize the repository, we issue the '''init''' command in the root of our project:


{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git init}}
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git init}}
{{ConsoleLine|Initialized empty Git repository in /home/jane-williams/Experiences/W1006/.git/|}}
{{ConsoleLine|Initialized empty Git repository in /home/jane-williams/Experiences/W1006/.git/|}}


Note that this repository exists locally, alongside your other files for the project. ''There is no central server repository.''
Note that this repository exists locally, alongside your other files for the project. ''There is no central server repository.''


{{Hint|
{{Hint|
Even though we're using ''git init'' on an empty directory, there's no requirement that we do so. It's perfectly fine to initialize git in a directory in which we've already begun work.}}
Even though we're using ''git init'' on an empty directory, there's no requirement that we do so. It's perfectly fine to initialize Git in a directory in which we've already begun work.}}


== Add a File and Check Status ==
== Add a File and Check Status ==
Line 68: Line 68:
}}
}}


Note that "file1.txt" is displayed in red under the title "Untracked files"Git is telling us that this file isn't currently being tracked. If we want to track it we'll need to tell Git to do so:
Note that "file1.txt" is displayed in red under the title "Untracked files." Git is telling us that this file isn't being tracked. If we want to track it, we'll need to tell Git to do so:
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git add file1.txt}}
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git add file1.txt}}


Line 83: Line 83:
}}
}}


Note that "file1.txt" is displayed in green under the title "Changes to be committed"Git is telling us that this file will be included in the next commit.
Note that "file1.txt" is displayed in green under the title "Changes to be committed." Git is telling us that this file will be included in the next commit.


{{GoingDeeper|
{{GoingDeeper|
A git repository contains a set of '''commit objects''', where each ''commit object'' in itself contains:
A Git repository contains a set of '''commit objects''', where each ''commit object'' in itself contains:
* the set of files representing a project at the instant of a particular commit
* The set of files representing a project at the instant of a commit
* references to '''parent commit objects'''
* References to '''parent commit objects'''
* a string of characters that uniquely identifies that particular commit object   
* A string of characters that uniquely identifies that commit object   


The very first commit won't have any parents.
The very first commit won't have any parents.


Each git repository is essentially a directed graph of commit objects.
Each Git repository is essentially a directed graph of commit objects.
}}
}}


Line 99: Line 99:


== Git States ==
== Git States ==
There are three states that a file in your project could be in:
A file in your project could be in one of three states:
* '''modified''' - a ''tracked'' file has been modified, but hasn't yet been staged
* '''Modified''' - a ''tracked'' file has been modified but hasn't yet been staged
* '''staged''' - the current version of the file's data has been marked for inclusion in the next commit snapshot
* '''Staged''' - the current version of the file's data has been marked for inclusion in the next commit snapshot
* '''committed''' - the file's data has been safely stored in the ''repository''
* '''Committed''' - the file's data has been safely stored in the ''repository''


Consequently, there are three main sections of a git project:
Consequently, a Git project has three main sections:
* '''working directory''' - These are the files of your project that you interact (work) with
* '''Working directory''' - These are the files of your project that you interact (work) with
* '''staging area''' - a list of objects (directories and files) that will go into the next commit
* '''Staging area''' - a list of objects (directories and files) that will go into the next commit
* '''repository''' - this is where git stores all of the information, i.e. '''metadata''' (data that describes other data) and '''objects''' (directories and files) associated with your project
* '''Repository''' - this is where Git stores all the information, i.e., '''metadata''' (data that describes other data) and '''objects''' (directories and files) associated with your project


[[File:Git-Sections.png|link=|Git Sections]]
[[File:Git-Sections.png|link=|Git Sections]]
Line 113: Line 113:


== Committing Changes ==
== Committing Changes ==
In order to create a new snapshot of all staged objects, we use the '''commit''' command. As part of the commit, it's helpful to explain (both to our fellow programmers and to our future selves) what it is that we changed and why. To facilitate this, git will open emacs so that we can edit our commit comments.
To create a new snapshot of all staged objects, we use the '''commit''' command. As part of the commit, it's helpful to explain (both to our fellow programmers and to our future selves) what it is that we changed and why. To facilitate this, Git opens emacs so that we can edit our commit comments.
   
   
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git commit}}
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git commit}}


Emacs will now open.  We'll see something similar to:
Emacs now opens.  We'll see something similar to:
{{ConsoleLines|
{{ConsoleLines|
   &nbsp;1 {{Bar}}<br/>
   &nbsp;1 {{Bar}}<br/>
Line 141: Line 141:


{{Caution|
{{Caution|
Meaningful commit comments are very important, and likely required by your guide. Some excellent descriptions of such messages can be found below:
Meaningful commit comments are very important and likely required by your guide. Some excellent descriptions of such messages are below:
* [https://chris.beams.io/posts/git-commit/ {{Cyan|How to Write a Git Commit Message}}] (Chris Beams)
* [https://chris.beams.io/posts/git-commit/ {{Cyan|How to Write a Git Commit Message}}] (Chris Beams)
* [https://dev.to/jacobherrington/how-to-write-useful-commit-messages-my-commit-message-template-20n9 {{Cyan|Useful Commit Messages}}] (Jacob Herrington)
* [https://dev.to/jacobherrington/how-to-write-useful-commit-messages-my-commit-message-template-20n9 {{Cyan|Useful Commit Messages}}] (Jacob Herrington)
}}
}}


Git will then let us know that the commit was successful with a message similar to the following:
Git then lets us know that the commit was successful with a message similar to the following:
{{ConsoleLines|
{{ConsoleLines|
[master (root-commit) 3fe8239] Adding our first file in this tutorial<br/>
[master (root-commit) 3fe8239] Adding our first file in this tutorial<br/>
Line 157: Line 157:
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git log}}
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git log}}


# What is git communicating to you?
# What is Git communicating to you?
# What do you see that is common between this '''git log''' command and the previous '''git commit'''?
# What do you see that is common between this '''git log''' command and the previous '''git commit'''?
# Why do you think this is?
# Why do you think this is?
Line 163: Line 163:


== More Changes ==
== More Changes ==
Let's add some additional text to file1.txt and also create a new file, file2.txt. Use emacs to:
Let's add some additional text to file1.txt and create a new file, file2.txt. Use emacs to do the following:
# Add a new line to the end of "file1.txt" with the text, "This is a new line." {{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|emacs file1.txt}}
* Add a new line to the end of "file1.txt" with the text, "This is a new line."  
# Add a new file, "file2.txt" with the text, "This is a new line in a new file." {{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|emacs file2.txt}}
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|emacs file1.txt}}
* Add a new file, "file2.txt" with the text, "This is a new line in a new file."  
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|emacs file2.txt}}


Then, exit emacs, and take a look at the status provided by git.  {{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git status}}
Then, exit emacs and take a look at the status provided by Git.  {{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git status}}




{{Observe|Section 2|
{{Observe|Section 2|
# What do you notice about file1.txt and file2.txt? How are they displayed in '''git status'''?
# What do you notice about file1.txt and file2.txt? How are they displayed in '''git status'''?
# Are they both displayed in the same section? If not, why not?
# Are they both displayed in the same section? If not, why not?
}}
}}


Line 181: Line 183:


{{Observe|Section 3|
{{Observe|Section 3|
# What do you notice about file1.txt and file2.txt? How are they displayed in '''git status'''?
# What do you notice about file1.txt and file2.txt? How are they displayed in '''git status'''?
# Compare and contrast the manner in which the two files are displayed.
# Compare and contrast the manner in which the two files are displayed.
}}
}}


Before we commit our changes, let's remind ourselves of what's changed. We can compare our working directory to the repository with '''git diff'''. To compare the staged area to the repository, we add the '''--cached''' flag: {{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git diff --cached}}
Before we commit our changes, let's remind ourselves of what's changed. With '''git diff''', we can compare our working directory to the staging area; when nothing has been staged, this is the same as comparing it to the last commit in the repository. To compare the staged area to the repository, we add the '''--cached''' flag: {{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git diff --cached}}


{{Observe|Section 4|
{{Observe|Section 4|
# How many files are listed as having been changed?
# How many files are listed as having been changed?
# What are the specific differences listed for each file? In what color is the difference displayed?
# What are the specific differences listed for each file? In what color is the difference displayed?
}}
}}


Line 195: Line 197:


{{Observe|Section 5|
{{Observe|Section 5|
# Execute the commands {{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git diff}} and also {{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git diff --cached}}  What does git tell you has changed?  Why?}}
# Execute the commands  
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git diff}}
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|git diff --cached}}   
What does Git tell you has changed?  Why?}}


== Create Journal Repository ==
== Create Journal Repository ==
We'll create a repository for all of our journals.  First, we'll temporarily move to our journal directory:
We'll create a repository for all of our journals.  First, we'll temporarily move to our journal directory:
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|pushd ~/Journals}}
{{ConsoleLine|jane-williams@codermerlin:~/Experiences/W1006$|pushd ~/"Digital Portfolio"/CS-I/Journals}}


As such, the general workflow is as follows:
As such, the general workflow is as follows:
# Initialize the repository. This need be done **only once**.  {{ConsoleLine|jane-williams@codermerlin:~/Journals$|git init}}
* Initialize the repository. This needs to be done '''only once'''.
# Create and/or modify files in your working directory
{{ConsoleLine|jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$|git init}}
# Add the files to the staging area {{ConsoleLine|jane-williams@codermerlin:~/Journals$|git add J1002.html}}
* Create or modify files in your working directory
# Commit ''all'' of the staged changes to the repository {{ConsoleLine|jane-williams@codermerlin:~/Journals$|git commit}}
* Add the files to the staging area  
{{ConsoleLine|jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$|git add J1002.html}}
* Commit ''all'' of the staged changes to the repository  
{{ConsoleLine|jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$|git commit}}




Now, return from the Journals directory:
Now, return from the Journals directory:
{{ConsoleLine|jane-williams@codermerlin:~/Journals$|popd}}
{{ConsoleLine|jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$|popd}}




{{Hint|Rather than add files to the staging individually, it's possible to add them all on the same command line. '''Note that this is an example for future use, don't execute it now.'''
{{Hint|Rather than add files to the staging area individually, it's possible to add them all on the same command line. '''Note that this is an example for future use; don't execute it now.'''
* To add multiple, named files:
* To add multiple, named files:
{{ConsoleLine|jane-williams@codermerlin:~/Journals$|git add J1006.html J1007.html J1008.html}}
{{ConsoleLine|jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$|git add J1006.html J1007.html J1008.html}}
* To add everything that has changed:
* To add everything that has changed:
{{ConsoleLine|jane-williams@codermerlin:~/Journals$|git add .}}
{{ConsoleLine|jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$|git add .}}
}}
}}


== Ignoring Unimportant Files ==
== Ignoring Unimportant Files ==
Emacs produces some files that are temporary in nature and should be excluded from git. However, unless we specify which files to ignore, '''git status''' will show these temporary files in addition to the files which are important to us. At best, this can be annoying. In order to instruct git to ignore these temporary files, create a special file in your git directory named '''.gitignore''' (the leading period is significant). For example:
Emacs produces some files that are temporary in nature and should be excluded from Git. However, unless we specify which files to ignore, '''git status''' will show these temporary files in addition to the files that are important to us. At best, this can be annoying. To instruct Git to ignore these temporary files, create a special file named '''.gitignore''' (the leading period is significant) in the root directory of your repository. For example:
{{ConsoleLine|john-williams@codermerlin:~/Journals$ |emacs .gitignore}}
{{ConsoleLine|jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$ |emacs .gitignore}}


To this file, add the following two lines:
To this file, add the following three lines:
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
*~
*~
\#*\#
\#*\#
.merlin/
</syntaxhighlight>
</syntaxhighlight>


The '''*''' character is known as a '''wildcard''' character and it will match any standard character, instructing git to ignore files such as '''story.txt~''' and '''#story.txt#'''.
The '''*''' character is known as a '''wildcard''' character; it matches any sequence of characters (other than a slash), instructing Git to ignore files such as '''story.txt~''' and '''#story.txt#'''.  The '''.merlin/''' line ignores directories special to the {{CM}} platform.


Save the file and exit emacs. Repeat this process for any new git repository that you create.
Save the file and exit emacs. Repeat this process for any new Git repository that you create. Or, better yet, because these files are particular to the editor and platform you are using, move this file to your configuration directory using the following command:
 
<syntaxhighlight lang="bash">
mkdir -p ~/.config/git   # make sure the configuration directory exists first
mv .gitignore ~/.config/git/ignore
</syntaxhighlight>
 
After executing this command, it will no longer be necessary to create a new .gitignore file to exclude the files and directories listed.
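To check which files the three patterns actually exclude, a minimal sketch along the following lines may help. It assumes a throwaway repository created with <code>mktemp</code> and sample file names chosen purely for illustration; <code>git check-ignore</code> prints each path that matches an ignore rule.

```shell
# Work in a scratch repository so no real project is affected.
project="$(mktemp -d)"
cd "$project"
git init --quiet

# The same three patterns described above.
printf '%s\n' '*~' '\#*\#' '.merlin/' > .gitignore

# Hypothetical sample files: two Emacs leftovers and one real document.
touch story.txt 'story.txt~' '#story.txt#'

# Only the paths that match an ignore rule are printed.
ignored="$(git check-ignore 'story.txt~' '#story.txt#' story.txt)"
echo "$ignored"
```

The two temporary files should be listed while <code>story.txt</code> is not, which suggests the patterns are doing what we intend.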


== Oopsies... or How to Revert Changes ==
== Oopsies... or How to Revert Changes ==
Line 245: Line 260:
* We can '''roll back''' changes to a previous revision.
* We can '''roll back''' changes to a previous revision.
* Source control provides the ''freedom to experiment without fear''.
* Source control provides the ''freedom to experiment without fear''.
* '''git''' was created by Linus Torvalds in 2005.  
* '''Git''' was created by Linus Torvalds in 2005.  
* '''git''' is a '''distributed version-control system'''; every git directory on every computer is a repository with a ''complete history'' and ''full version-tracking abilities''.
* '''Git''' is a '''distributed version-control system'''; every Git directory on every computer is a repository with a ''complete history'' and ''full version-tracking abilities''.
* git considers each commit to be a '''snapshot''' of the current project. Each snapshot includes every current version of every file in the project that's been added to git.
* Git considers each commit to be a '''snapshot''' of the current project. Each snapshot includes every current version of every file in the project that's been added to Git.
* The three states that file in the project may be in include:
* A Git project has three main sections:
** '''working directory''' contains the files that have been created or modified in the project
** '''Working directory''' contains the files that have been created or modified in the project
** '''staging area''' contains a list of files that will go into the next commit
** '''Staging area''' contains a list of files that will go into the next commit
** '''repository''' is where git stores all of the committed information
** '''Repository''' is where Git stores all the committed information
* The general workflow for using git is:
* The general workflow for using Git is as follows:
** '''init''' to initialize the repository; we generally only do this once per project
** '''init''' to initialize the repository; we generally only do this once per project
** '''diff''' to show us the differences between the working directory and the repository
** '''diff''' to show us the differences between the working directory and the repository
Line 262: Line 277:
== Exercises ==
== Exercises ==
{{Exercises|
{{Exercises|
* {{Assignment|J1006}} Create a journal and answer all questions. Be sure to include all sections of the journal, properly formatted.
* {{Assignment|J1006}} Create a journal and answer all questions. Be sure to include all sections of the journal, properly formatted.
* Enter your Journals directory, then:
* Enter your Journals directory, then do the following:
** Initialize a git project in the directory.
** Initialize a Git project in the directory.
** Add all of your journals to git and commit these changes.
** Add all of your journals to Git and commit these changes.
** Going forward, it will be your responsibility to '''always add every journal to git'''. This includes any updates to your journals. Also, note that all content must be present '''within''' the git repository. ''Substituting links (e.g. Google Docs) for actual content is not acceptable.
** Going forward, it will be your responsibility to '''always add every journal to Git'''. This includes any updates to your journals. Also note that all content must be present '''within''' the Git repository. ''Substituting links (e.g., Google Docs) for actual content is not acceptable.''
<hr/>
<hr/>
''After completing W1008:''
''After completing W1008:''

Latest revision as of 17:33, 25 September 2023

Within these castle walls be forged Mavens of Computer Science ...
— Merlin, The Coder

Curriculum

 Coder Merlin™  Computer Science Curriculum Data

Unit: Lab basics

Experience Name: Source Control (W1006)

Next Experience: ()

Knowledge and skills:

  • §10.231 Demonstrate proficiency in using a source control system for single-users

Topic areas: Source control systems

Classroom time (average): 60 minutes

Study time (average): 30 minutes

Successful completion requires knowledge: understand the purpose of a source control system

Successful completion requires skills: ability to use a source control system to add, delete, and move documents; ability to use a source control system to commit changes; ability to use a source control system to checkout previous versions; ability to view a log of changes

Background

Introduction

Source control enables us to track and manage changes to our code. This functionality becomes increasingly critical as the size of our projects grows both in terms of the lines of code and the number of coders. Source control shows us who changed the code and when. We're able to compare one revision of code to another. And, when necessary, we can roll back changes to a previous revision. Source control can be a tremendous help to you (and your team) when you want to easily recover from accidentally damaging a project and, as such, provides you with the freedom to experiment without fear. However, you must use the source control system regularly and often or it won't be helpful to you. In this experience, we'll configure our source control system, learn (a bit) about how to use it, and then place our journals under source control.

Git

Many options are available to choose from when selecting a source control system. We'll be using one called Git, created by Linus Torvalds in 2005. It's a distributed version-control system, meaning that every Git directory on every computer is a repository with a complete history and full version-tracking abilities.

Let's say you're a graphic artist specializing in photo restoration and working for the director of a museum. You've received an old photo that you're responsible to restore. Every step requires a lot of work and, being human, sometimes you make a mistake. Even if you didn't make a mistake, your director might not agree with the choices that you've made. You (smartly) decide that just like every other project you've done, you'll track every version of your file in Git.

Let's consider your progress through this process:

[Figure: Alan-Turing-Enhancement.png]

You commit each version of your project into Git. Git considers each commit to be a snapshot of the current project. Each snapshot includes every current version of every file in your project that you've added to Git, so it becomes a simple matter to move back in time to any version. While you're able to attach comments to each commit, internally, Git uses something called an SHA-1 hash to uniquely identify each commit. The hash is a 40-character string generated from the content of the files and directory structure. The hash looks something like this: 24b9da6552252987aa493b52f8696cd6d3b00373. You'll see this type of string in the Git log.

Helpful Hint
The SHA-1 hash is a bit long and unwieldy to type. In most cases, Git accepts the first few characters of a hash, so you don't need to type the whole thing.
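As a rough illustration of abbreviated hashes, the sketch below builds a throwaway repository (the temporary directory and the commit message are assumptions made for this example) and shows that the short form resolves to the same commit as the full hash.

```shell
# Work in a scratch repository so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Jane Williams"
git config user.email "jane@williams.org"
echo "This file isn't empty." > file1.txt
git add file1.txt
git commit --quiet -m "Add file1.txt"

full_hash="$(git rev-parse HEAD)"            # full hash (40 characters with SHA-1)
short_hash="$(git rev-parse --short HEAD)"   # abbreviated form chosen by Git
echo "full:  $full_hash"
echo "short: $short_hash"

# The abbreviated hash resolves to the same commit as the full one.
resolved="$(git rev-parse "$short_hash")"
echo "resolved: $resolved"
```

The actual hash values will differ on every run, because they depend on the author, the time, and the content of the commit.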

Git enables you to experiment fearlessly because (as long as you've been diligent with your commits) you can always go back to a previous (working) version.

Configuration

As a first step, we'll need to let Git know about our name and email address. Be sure to change your name and email address appropriately. To complete the setup process, you'll need to be able to receive email at the address you specify.

jane-williams@codermerlin:~$ git config --global user.email "jane@williams.org"

jane-williams@codermerlin:~$ git config --global user.name "Jane Williams"

Note: These commands, if successful, will complete silently.
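Because the commands complete silently, it can be reassuring to read the values back. The following is a minimal sketch; redirecting HOME to a scratch directory is an assumption made here only so that the example doesn't overwrite real settings, and you would omit that line to inspect your own account.

```shell
# Assumption for this sketch: use a scratch HOME so real global settings
# are left alone. Omit this line to work with your own configuration.
export HOME="$(mktemp -d)"

git config --global user.email "jane@williams.org"
git config --global user.name "Jane Williams"

# Read the stored values back; each command prints one value.
configured_email="$(git config --global --get user.email)"
configured_name="$(git config --global --get user.name)"
echo "$configured_name <$configured_email>"
```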

Initialization

Let's set up a new directory for all of our experiences and within that, a directory for this experience:

jane-williams@codermerlin:~$ mkdir Experiences

jane-williams@codermerlin:~$ cd Experiences

jane-williams@codermerlin:~/Experiences$ mkdir W1006

jane-williams@codermerlin:~/Experiences$ cd W1006

jane-williams@codermerlin:~/Experiences/W1006$ 

One can consider the structure of a project, consisting of directories, files, and the associated content, as two dimensional. The repository can then be considered a three-dimensional structure formed by storing every committed version of the project structure across time. To initialize the repository, we issue the init command in the root of our project:

jane-williams@codermerlin:~/Experiences/W1006$ git init

Initialized empty Git repository in /home/jane-williams/Experiences/W1006/.git/ 

Note that this repository exists locally, alongside your other files for the project. There is no central server repository.

Helpful Hint
Even though we're using git init on an empty directory, there's no requirement that we do so. It's perfectly fine to initialize Git in a directory in which we've already begun work.

Add a File and Check Status

Let's create a small file that we can add to our project:

jane-williams@codermerlin:~/Experiences/W1006$ echo "This file isn't empty." > file1.txt

Let's find out what Git knows about this file:

jane-williams@codermerlin:~/Experiences/W1006$ git status

...

Untracked files:

  (use "git add <file>..." to include in what will be committed)



  file1.txt

...

Note that "file1.txt" is displayed in red under the title "Untracked files." Git is telling us that this file isn't being tracked. If we want to track it, we'll need to tell Git to do so:

jane-williams@codermerlin:~/Experiences/W1006$ git add file1.txt

Let's check the status now:

jane-williams@codermerlin:~/Experiences/W1006$ git status

...

Changes to be committed:

  (use "git rm --cached <file>..." to unstage)



  new file: file1.txt

...

Note that "file1.txt" is displayed in green under the title "Changes to be committed." Git is telling us that this file will be included in the next commit.
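
The same before-and-after comparison can be seen in a more compact form with git status --short, which prints a two-character code in front of each file name instead of colored sections. The sketch below repeats the steps in a scratch directory (the temporary directory is an assumption of the example, so that it doesn't interfere with W1006).

```shell
# Work in a throwaway directory so the real project is left alone.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet

echo "This file isn't empty." > file1.txt

# Before staging: "??" marks a file that Git isn't tracking.
before_add=$(git status --short)
echo "$before_add"

git add file1.txt

# After staging: "A" marks a new file that will be part of the next commit.
after_add=$(git status --short)
echo "$after_add"
```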

Going Deeper

A Git repository contains a set of commit objects, where each commit object in itself contains:

  • The set of files representing a project at the instant of a commit
  • References to parent commit objects
  • A string of characters that uniquely identifies that commit object

The very first commit won't have any parents.

Each Git repository is essentially a directed graph of commit objects.
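
To make the idea of a commit object more concrete, the sketch below creates two commits in a scratch repository and prints the newest one with git cat-file -p. The file name and commit messages are invented for the example. The output should contain a tree line (the snapshot of files), a parent line pointing at the earlier commit, and then the author, committer, and message.

```shell
# Scratch repository with two commits (example data only).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Jane Williams"
git config user.email "jane@williams.org"

echo "first" > file1.txt
git add file1.txt
git commit --quiet -m "First commit"
first_hash=$(git rev-parse HEAD)

echo "second" >> file1.txt
git add file1.txt
git commit --quiet -m "Second commit"

# Print the contents of the newest commit object.
git cat-file -p HEAD

# The parent recorded in the second commit is the hash of the first commit.
parent_hash=$(git cat-file -p HEAD | grep '^parent ' | cut -d' ' -f2)

# The very first commit has no "parent" line at all.
root_parent_count=$(git cat-file -p "$first_hash" | grep -c '^parent ' || true)

echo "parent of second commit: $parent_hash"
echo "parent lines in first commit: $root_parent_count"
```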

The previous status output mentioned a command to be used to unstage. What's a stage?

Git States

A file in your project could be in one of three states:

  • Modified - a tracked file has been modified but hasn't yet been staged
  • Staged - the current version of the file's data has been marked for inclusion in the next commit snapshot
  • Committed - the file's data has been safely stored in the repository

Consequently, a Git project has three main sections:

  • Working directory - the files of your project that you interact (work) with
  • Staging area - a list of objects (directories and files) that will go into the next commit
  • Repository - where Git stores all the information associated with your project, i.e., metadata (data that describes other data) and objects (directories and files)

Git Sections
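
The three states can be observed directly. The sketch below, run in a scratch repository with made-up file contents, walks one tracked file through modified, staged, and committed, capturing git status --porcelain at each step. In that output, the first column describes the staging area and the second column describes the working directory.

```shell
# Scratch repository with one tracked, committed file (example data only).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Jane Williams"
git config user.email "jane@williams.org"

echo "line one" > file1.txt
git add file1.txt
git commit --quiet -m "Start tracking file1.txt"

# Modified: changed in the working directory but not yet staged.
echo "line two" >> file1.txt
state_modified=$(git status --porcelain)

# Staged: the current version is marked for the next commit.
git add file1.txt
state_staged=$(git status --porcelain)

# Committed: stored in the repository, so nothing is left to report.
git commit --quiet -m "Add a second line"
state_committed=$(git status --porcelain)

printf 'modified:  [%s]\n' "$state_modified"
printf 'staged:    [%s]\n' "$state_staged"
printf 'committed: [%s]\n' "$state_committed"
```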


Committing Changes

To create a new snapshot of all staged objects, we use the commit command. As part of the commit, it's helpful to explain (both to our fellow programmers and to our future selves) what it is that we changed and why. To facilitate this, Git opens emacs so that we can edit our commit comments.

jane-williams@codermerlin:~/Experiences/W1006$ git commit

Emacs now opens. We'll see something similar to:

    1 |

    2 | # Please enter the commit message for your changes. Lines starting

    3 | # with '#' will be ignored, and an empty message aborts the commit.

    4 | #

    5 | # On branch master

    6 | #

    7 | # Initial commit

    8 | #

    9 | # Changes to be committed:

  10 | # new file: file1.txt

  11 | #

Let's add a helpful comment:

    1 |Adding our first file in this tutorial

    2 | # Please enter the commit message for your changes. Lines starting

We then save the file and exit emacs as usual: CONTROL-x CONTROL-s CONTROL-x CONTROL-c

Caution

Meaningful commit comments are very important and likely required by your guide. Some excellent descriptions of such messages are below:

Git then lets us know that the commit was successful with a message similar to the following:

[master (root-commit) 3fe8239] Adding our first file in this tutorial

 1 file changed, 1 insertion(+)

 create mode 100644 file1.txt
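
For short messages, Git also accepts the comment on the command line through the -m option, in which case no editor opens. This experience uses emacs, so treat the following only as a sketch of an alternative; it runs in a scratch repository created just for the example.

```shell
# Scratch repository with one staged file (example data only).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Jane Williams"
git config user.email "jane@williams.org"

echo "This file isn't empty." > file1.txt
git add file1.txt

# -m supplies the commit message directly, so no editor opens.
git commit -m "Adding our first file in this tutorial"

# Read back just the subject line of the most recent commit.
last_subject=$(git log -1 --format=%s)
echo "$last_subject"
```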

Observe, Ponder, and Journal: Section 1

Execute the command git log:

jane-williams@codermerlin:~/Experiences/W1006$ git log

  1. What is Git communicating to you?
  2. What do you see that is common between this git log command and the previous git commit?
  3. Why do you think this is?

More Changes

Let's add some additional text to file1.txt and create a new file, file2.txt. Use emacs to do the following:

  • Add a new line to the end of "file1.txt" with the text, "This is a new line."

jane-williams@codermerlin:~/Experiences/W1006$ emacs file1.txt

  • Add a new file, "file2.txt" with the text, "This is a new line in a new file."

jane-williams@codermerlin:~/Experiences/W1006$ emacs file2.txt

Then, exit emacs and take a look at the status provided by Git.

jane-williams@codermerlin:~/Experiences/W1006$ git status


Observe, Ponder, and Journal: Section 2
  1. What do you notice about file1.txt and file2.txt? How are they displayed in git status?
  2. Are they both displayed in the same section? If not, why not?

Let's stage the new versions of both of these files:

jane-williams@codermerlin:~/Experiences/W1006$ git add file1.txt file2.txt

Then, have a look at the status again:

jane-williams@codermerlin:~/Experiences/W1006$ git status


Observe, Ponder, and Journal: Section 3
  1. What do you notice about file1.txt and file2.txt? How are they displayed in git status?
  2. Compare and contrast the manner in which the two files are displayed.

Before we commit our changes, let's remind ourselves of what's changed. We can compare our working directory to the staging area with git diff, which shows changes that haven't been staged yet. To compare the staging area to the most recent commit in the repository, we add the --cached flag:

jane-williams@codermerlin:~/Experiences/W1006$ git diff --cached
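
The difference between the two forms is easiest to see when a file has both staged and unstaged changes. The sketch below sets up that situation in a scratch repository; the file name notes.txt and its contents are invented for the example. Plain git diff reports only what has not been staged, while git diff --cached reports only what has.

```shell
# Scratch repository with one committed file (example data only).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
git config user.name "Jane Williams"
git config user.email "jane@williams.org"

echo "first line" > notes.txt
git add notes.txt
git commit --quiet -m "Start notes"

# Make one change and stage it...
echo "staged line" >> notes.txt
git add notes.txt

# ...then make a further change and leave it unstaged.
echo "unstaged line" >> notes.txt

# Working directory compared with the staging area.
unstaged_changes=$(git diff)

# Staging area compared with the most recent commit.
staged_changes=$(git diff --cached)

echo "--- git diff ---"
echo "$unstaged_changes"
echo "--- git diff --cached ---"
echo "$staged_changes"
```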

Observe, Ponder, and Journal: Section 4
  1. How many files are listed as having been changed?
  2. What are the specific differences listed for each file? In what color is the difference displayed?

To conclude this section, let's commit our changes.

jane-williams@codermerlin:~/Experiences/W1006$ git commit

Observe, Ponder, and Journal: Section 5
  1. Execute the commands

jane-williams@codermerlin:~/Experiences/W1006$ git diff

jane-williams@codermerlin:~/Experiences/W1006$ git diff --cached

What does Git tell you has changed? Why?

Create Journal Repository

We'll create a repository for all of our journals. First, we'll temporarily move to our journal directory:

jane-williams@codermerlin:~/Experiences/W1006$ pushd ~/"Digital Portfolio"/CS-I/Journals

The general workflow is as follows:

  • Initialize the repository. This must be done only once.

jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$ git init

  • Create or modify files in your working directory
  • Add the files to the staging area

jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$ git add J1002.html

  • Commit all of the staged changes to the repository

jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$ git commit


Now, return from the Journals directory:

jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$ popd


Helpful Hint
Rather than add files to the staging area individually, it's possible to add them all on the same command line. Note that this is an example for future use; don't execute it now.
  • To add multiple, named files:

jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$ git add J1006.html J1007.html J1008.html

  • To add everything that has changed:

jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$ git add .
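
To try both forms safely, the sketch below uses a scratch directory rather than the real Journals directory; the three journal files are empty stand-ins created only for the example. It lists what is staged after each git add by asking for the names in git diff --cached.

```shell
# Scratch repository containing three stand-in journal files.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet
touch J1006.html J1007.html J1008.html

# Stage two files by name; the third stays untracked.
git add J1006.html J1007.html
staged_by_name=$(git diff --cached --name-only)

# Stage everything else that is new or changed in this directory.
git add .
staged_by_dot=$(git diff --cached --name-only)

echo "after adding by name:"
echo "$staged_by_name"
echo "after adding everything:"
echo "$staged_by_dot"
```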

Ignoring Unimportant Files

Emacs produces some files that are temporary in nature and should be excluded from Git. However, unless we specify which files to ignore, git status will show these temporary files in addition to the files that are important to us. At best, this can be annoying. To instruct Git to ignore these temporary files, create a special file named .gitignore (the leading period is significant) in the top-level directory of your repository. For example:

jane-williams@codermerlin:~/Digital Portfolio/CS-I/Journals$ emacs .gitignore

To this file, add the following three lines:

*~
\#*\#
.merlin/

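The * character is known as a wildcard character and it will match any sequence of characters (other than /), instructing Git to ignore files such as story.txt~ and #story.txt#. The backslashes in the second line keep Git from treating the # character as the start of a comment. The .merlin/ line will ignore directories special to the Coder Merlin™ platform.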

Save the file and exit emacs. Repeat this process for any new Git repository that you create. Or, better yet, because these files are particular to the editor and platform you are using, move this file to your configuration directory using the following command (the mkdir -p portion creates the directory first if it doesn't already exist):

mkdir -p ~/.config/git && mv .gitignore ~/.config/git/ignore

After executing this command, it will no longer be necessary to create a new .gitignore file to exclude the files and directories listed.
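
To check that ignore patterns behave as expected, git check-ignore reports which paths match. The sketch below writes the same three patterns into a .gitignore in a scratch repository and creates stand-in files named after the examples above; the scratch directory and the empty files are assumptions of the example.

```shell
# Scratch repository (example data only).
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init --quiet

# The same three patterns as above, written without opening an editor.
printf '%s\n' '*~' '\#*\#' '.merlin/' > .gitignore

# Stand-ins for an Emacs backup file, an auto-save file, and a real document.
touch 'story.txt~' '#story.txt#' story.txt

# Ignored files no longer appear in the status output.
visible_files=$(git status --porcelain)
echo "$visible_files"

# check-ignore prints each given path that matches an ignore pattern.
ignored_files=$(git check-ignore 'story.txt~' '#story.txt#')
echo "$ignored_files"
```

Only .gitignore and story.txt should be listed by git status, while both temporary files should be reported by git check-ignore.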

Oopsies... or How to Revert Changes

Coming Soon

Section on How to Revert Changes

Key Concepts

  • Source control enables us to track and manage changes to our code.
  • Source control shows us who changed the code and when.
  • We're able to compare one revision to another.
  • We can roll back changes to a previous revision.
  • Source control provides the freedom to experiment without fear.
  • Git was created by Linus Torvalds in 2005.
  • Git is a distributed version-control system; every Git directory on every computer is a repository with a complete history and full version-tracking abilities.
  • Git considers each commit to be a snapshot of the current project. Each snapshot includes every current version of every file in the project that's been added to Git.
  • A file in a project can be in one of three states (modified, staged, or committed), and a Git project has three corresponding sections:
    • Working directory contains the files that have been created or modified in the project
    • Staging area contains a list of files that will go into the next commit
    • Repository is where Git stores all the committed information
  • The general workflow for using Git is as follows:
    • init to initialize the repository; we generally only do this once per project
    • diff to show us the differences between the working directory and the staging area
    • diff --cached to show us the differences between the staging area and the most recent commit in the repository
    • add to add new files or their most recent modifications
    • commit to save the snapshot from the staging area into the repository

Exercises

  • J1006 Create a journal and answer all questions. Be sure to include all sections of the journal, properly formatted.
  • Enter your Journals directory, then do the following:
    • Initialize a Git project in the directory.
    • Add all of your journals to Git and commit these changes.
    • Going forward, it will be your responsibility to always add every journal to Git. This includes any updates to your journals. Also note that all content must be present within the Git repository. Substituting links (e.g., Google Docs) for actual content is not acceptable.

After completing W1008:

  • M1006-31 Complete Merlin Mission Manager Mission M1006-31.


Experience Metadata

  • Experience ID: W1006
  • Next experience ID:
  • Unit: Lab basics
  • Knowledge and skills: §10.231
  • Topic areas: Source control systems
  • Classroom time: 60 minutes
  • Study time: 30 minutes
  • Acquired knowledge: understand the purpose of a source control system
  • Acquired skill:
    • ability to use a source control system to add, delete, and move documents
    • ability to use a source control system to commit changes
    • ability to use a source control system to checkout previous versions
    • ability to view a log of changes
  • Additional categories: