03-VersionControl

____________________________________________________





Introduction: three problems


1. How to keep past versions of your stuff?

For example, in a day's work you may produce:

~> ls
draft.py
final.py
final_real.py
final_real_real.py
actually_done.py
actually_done_v1.py
actually_done_v2.py
actually_done_v2.1.py
actually_done_v2.1-2019-12-10.py
...

Ok, I guess I should use Git...


Sub-problem: You had a working version of your code at the beginning of the day, but by the end of the day's work, it's broken.




2. How to collaborate by making copies of a document or code, and then re-integrating the changes?

For example:

  • How can thousands of people write code together when everyone wants to work at once?
  • How can lots of people re-write or draft a document (e.g., a constitutional amendment) at the same time?

3. How to back up your code?

In case of fire: git commit, git push, leave the building!



Version control, the git that keeps on giving


  • https://en.wikipedia.org/wiki/Distributed_version_control
    • A form of version control in which the complete code-base, including its full history, is mirrored on every developer's computer.
    • This enables automatic management of branching and merging, speeds up most operations (except pushing and pulling), improves the ability to work offline, and does not rely on a single location for backups.
    • In general, distributed systems are more robust and favorable for end-users, when compared to centralized systems.

Version control is the Git...
Git is one byte short of a four-letter word.


  • https://en.wikipedia.org/wiki/Git
    • Is a distributed VCS written by the original author of the Linux kernel, https://en.wikipedia.org/wiki/Linus_Torvalds
    • Torvalds sarcastically quipped about the name Git (which means unpleasant person in British English slang): "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'." ...
    • The man page describes Git as "the stupid content tracker".
    • The read-me file of the source code elaborates further:
      • random three-letter combination that is pronounceable, and not actually used by any common UNIX command.
      • Git: stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
      • "global information tracker"
      • "goddamn idiotic truckload of..."

Git

https://git-scm.com/
https://en.wikipedia.org/wiki/Git

  • Git is software. It exists locally on your machine and on other developers' machines.
  • GitHub, GitLab, and Bitbucket are websites (servers) that interface with end-users' Git software.
    • They run their own Git-compatible server software, which hosts Git repositories and talks to local Git processes.
  • Quite ironically, unlike GitHub (now a Microsoft product), GitLab's server-side software is actually https://en.wikipedia.org/wiki/Open_source, so anyone can host their own GitLab website/server.
    • GitLab itself also has a rich, positive development community.
  • MST IT hosts two installations of GitLab server-side software, on two different servers residing on campus (cool!!):

  • This is actually a good approach for now, if you break your "repo" but still have your code...
  • Later, you will want to learn branching and conflict handling better.

Demo 1

#!/bin/bash

# Make a repository at https://git-classes.mst.edu
# Show the gitlab view of it

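git clone <REPO_URL>   # the URL of the repository you just made
cd <REPO_NAME>         # the folder that clone created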
vim README.md
vim hello_world.py
# write, save, quit
git add .
git commit -m "my first repo!"
git push -u origin master

# show web interface

# edit a file locally
git push #?
git pull #?

# edit something in web, then, what happens?

git pull


Demo 2

Check out some real repositories


Extra background


Tracking changes


Git version control?

  • Keeps track of changes to your code.
  • You don't have to worry about accidentally losing or deleting code.
  • You can experiment with changes to your code, and then reset to a known good state.
  • Makes collaborating with others easier.

How does Git work?

  • Distributed - everything is kept on your local machine and on your collaborators' local machines, not primarily or necessarily in the cloud.
  • Repository - a collection of code and its history; a.k.a. "repo"
  • Commit - a chunk of saved changes, like a snapshot in time, similar to a VM snapshot, but only for a particular folder (a git repo).
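
To make "repository" and "commit" concrete, here is a minimal sketch you can paste into a terminal. It assumes Git is installed; the scratch directory, the file name, and the "Demo User" identity are placeholders made up for this example.

```shell
set -e
# Hypothetical scratch directory; nothing here touches a real project.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, stored only in this repo
git config user.email "demo@example.com"

echo "print('hello')" > hello_world.py
git add hello_world.py
git commit -q -m "first snapshot"
echo "print('hello again')" >> hello_world.py
git commit -q -am "second snapshot"

# The whole history lives in the local .git folder; no server is involved.
commit_count=$(git rev-list --count HEAD)
object_type=$(git cat-file -t HEAD)
has_git_dir=no; [ -d .git ] && has_git_dir=yes
echo "commits: $commit_count, HEAD is a $object_type, .git present: $has_git_dir"
```

Deleting the scratch directory deletes the repo and its history along with it, which is exactly why problem 3 (backups) needs a remote.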

Distributed

Distributed version control


Snapshots

Snapshots (commits) include all files


Storage landscape

Three places where edits exist


Gitting Started...

  • Actually demo some of these in class

Pre-use configuration: these are just for meta-data, not login, etc.

  • $ git config --global user.name "<YOUR NAME>"
  • $ git config --global user.email <EMAIL>
  • $ git config --global core.editor vim
    • or your choice of text editor

Basic local use:

  • $ git init Makes a new empty git repository out of your current working directory and its sub-directories.
  • $ git add <FILENAME> Stages FILENAME (or the changes to FILENAME) for the next commit. The argument can also be a directory or a wildcard, like . or *
  • $ git commit -m "some message" Takes a snapshot (commit) with any staged (added) changes.
    • Note: don't skip the -m "message" or you may end up stuck in vim; if so, just hit 'i', type something, hit 'esc', then type ':wq!'
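
The add-then-commit steps above can be sketched as follows. This is only an illustration under assumptions: the scratch directory, file names, and identity are invented for the example. It shows that a commit contains only what was staged.

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "notes" > README.md
echo "scratch" > scratch.txt

git add README.md                         # stage only README.md
git commit -q -m "add README"             # the snapshot holds only what was staged

tracked=$(git ls-files)                   # files Git is tracking
untracked=$(git ls-files --others)        # files present but not tracked
echo "tracked: $tracked"
echo "untracked: $untracked"
```

scratch.txt stays on disk but is not part of any snapshot until you add it.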

THESE SHOULD BE YOUR CONSTANT GO-TO:

  • $ git status Shows the status of the repository.
  • $ git diff Shows what you have changed since your last snapshot and have not yet added (staged)
    • $ git diff fileofinterest.py
  • $ git log --all --graph Shows a nice history
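
A small sketch of what these three commands report right after an edit. The repository, file, and identity below are throwaway assumptions for illustration only.

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo one > README.md
git add README.md
git commit -q -m "a message"
echo two >> README.md                     # edited, but not yet staged

short_status=$(git status --short)        # compact form of git status
changed_files=$(git diff --name-only)     # which files differ from the staging area
last_subject=$(git log -1 --format=%s)    # subject line of the newest commit
echo "status: [$short_status]"
echo "changed: $changed_files"
echo "last commit: $last_subject"
```

In the short status, an M in the second column means "modified in the working folder, not staged".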


commit

$ echo hey >>README.md
$ git add README.md
$ git commit -m "a message"
$ echo hey >>README.md
$ git commit -am "b message"
$ echo hey >>README.md
$ git commit -am "c message"
$ git log -p --all --graph


++++++++++++++++++++++++++++
Cahoot-02c.1
https://mst.instructure.com/courses/58101/quizzes/55183



branch

May the forks be with you!


$ git branch new-branch
$ git checkout new-branch

or

$ git checkout -b new-branch

then


$ git log -p --all --graph
$ echo hey >>README.md
$ git commit -am "d message"
$ git log -p --all --graph
$ git checkout master
$ git log -p --all --graph
$ git checkout -b another
$ git log -p --all --graph
$ echo hey >>README.md
$ git commit -am "e message"
$ git log -p --all --graph


diff for branches

$ git diff branch1..branch2


merge

Incorporates changes from the named commits (since the time their histories diverged from the current branch) into the current branch.
$ git merge new-branch
$ git log -p --all --graph
$ git checkout master
$ git merge another
$ git log -p --all --graph
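
Here is a self-contained sketch of branching and merging when two branches have diverged. The file names and identity are made up for the example, and the symbolic-ref line is only there so the first branch is called master regardless of local Git defaults.

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo hey > README.md
git add README.md
git commit -q -m "a message"

git checkout -q -b new-branch             # create and switch to new-branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "d message"

git checkout -q master                    # back to master, which lacks feature.txt
echo "other" > other.txt
git add other.txt
git commit -q -m "e message"

# The histories diverged, so this merge records a commit with two parents.
git merge -q --no-edit new-branch
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
merged_ok=no; [ -f feature.txt ] && [ -f other.txt ] && merged_ok=yes
echo "parents of merge commit: $parents, both files present: $merged_ok"
```

If master had not moved since the branch was created, Git would instead "fast-forward" master to the branch tip without making a merge commit.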



Merge conflicts (oh Fork! ...)


CONFLICT (content): Merge conflict in the-file.txt
Automatic merge failed; fix conflicts and then commit the result.


In the-file.txt:


<<<<<<< HEAD
The current branch's contents
=======
Stuff from the branch you're merging
>>>>>>> new-branch


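Edit the-file.txt to keep the lines you want, delete the <<<<<<<, =======, and >>>>>>> marker lines, then:

$ git add the-file.txt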
$ git commit -m "message"
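
To see a conflict safely, the sketch below manufactures one in a throwaway repository and then resolves it. Everything except the Git commands themselves (directory, identity, the "agreed-upon" text) is an assumption for illustration.

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "original line" > the-file.txt
git add the-file.txt
git commit -q -m "base"

git checkout -q -b new-branch
echo "Stuff from the branch you're merging" > the-file.txt
git commit -q -am "change on new-branch"

git checkout -q master
echo "The current branch's contents" > the-file.txt
git commit -q -am "change on master"

# Both branches changed the same line, so the merge stops with a conflict.
git merge new-branch > /dev/null 2>&1 || echo "merge stopped: conflict"
marker_count=$(grep -c '^<<<<<<<' the-file.txt)

# Resolve by hand: here we simply overwrite the file with one agreed-upon line.
echo "The agreed-upon contents" > the-file.txt
git add the-file.txt
git commit -q -m "merge new-branch, resolve conflict"
unmerged=$(git ls-files --unmerged | wc -l | tr -d ' ')
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "markers seen: $marker_count, unmerged entries left: $unmerged, parents: $parents"
```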



++++++++++++++++++++++++++++
Cahoot-02c.2
https://mst.instructure.com/courses/58101/quizzes/55184



Exploration


Looking at stuff

$ git status shows summary data


$ git log Show a log of commits

--graph Neat ASCII graph
--all Shows all branches
-p Show what changed in each commit


$ git show firstfourofhashofcommit


$ git diff Show un-added, un-committed changes for all files
$ git diff firstfourofhashofcommit
$ git diff --cached shows diff with added but not committed changes
$ git diff branch1..branch2
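
The difference between the diff variants is easiest to see with one staged and one un-staged edit to the same file. A minimal sketch, assuming a throwaway repository and placeholder identity:

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo one > README.md
git add README.md
git commit -q -m "a message"

echo two >> README.md
git add README.md                         # "two" is now staged
echo three >> README.md                   # "three" is edited but not staged

unstaged=$(git diff | grep -c '^+three')        # working folder vs staging area
staged=$(git diff --cached | grep -c '^+two')   # staging area vs last commit
both=$(git diff HEAD | grep -c '^+t')           # working folder vs last commit
echo "unstaged additions: $unstaged, staged additions: $staged, total vs HEAD: $both"
```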



Git happens

Now, how to clean up a mess?


Restore a single file to its last staged (or committed) version, discarding your edits

$ git checkout file.py


reverting changes

$ git help revert Shows the documentation for git revert, which makes a new commit that undoes an earlier one


Undoing stuff since a commit

To discard all local changes to tracked files that have not been added to the staging area (untracked files/folders are left alone), type:
$ git checkout .


To un-stage changes that were added, but not committed (the edits stay in your files):
$ git reset .
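
A sketch of both undo commands in a throwaway repository, so you can watch what each one does and does not touch. The file contents and identity are invented for the example.

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "print('good')" > file.py
git add file.py
git commit -q -m "known good state"

echo "print('oops')" >> file.py
git add file.py                           # staged mistake
echo "print('worse')" >> file.py          # un-staged mistake on top

git reset -q .                            # un-stage; the edits remain on disk
staged_after_reset=$(git diff --cached --name-only | wc -l | tr -d ' ')
lines_after_reset=$(wc -l < file.py | tr -d ' ')

git checkout -q .                         # now throw the edits away
lines_after_checkout=$(wc -l < file.py | tr -d ' ')
echo "staged files after reset: $staged_after_reset"
echo "lines after reset: $lines_after_reset, after checkout: $lines_after_checkout"
```

Note that git checkout . cannot be undone: edits that were never committed are gone for good.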



Remote repositories


Working with remotes

$ git clone <REPO_URL>

  • makes a copy of a repository.

git push <REMOTE> <name-of-branch>

$ git push

  • Pushes changes from your current branch to the remote branch it tracks. (On Git versions older than 2.0, you may need to run $ git config --global push.default simple; newer versions use simple by default.)

For example, to push your local commits to the master branch of the origin remote:

$ git push origin master


git pull <REMOTE> <name-of-branch>

$ git pull

  • Pulls changes from the remote branch and merges them into your current branch

$ git remote -v

  • To view your remote repositories

$ git remote add <REMOTE_NAME> <REPO_URL>

  • adds a remote to an existing repository.
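
You can try the remote commands without any server: a "bare" repository in a local folder can stand in for one. In this sketch, server.git, work, and copy are hypothetical names, and the identity is a placeholder.

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"

# A bare repository plays the role of the hosted repo.
git init -q --bare server.git
git --git-dir=server.git symbolic-ref HEAD refs/heads/master

git init -q work
cd work
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo hey > README.md
git add README.md
git commit -q -m "first commit"

git remote add origin "$demo/server.git"  # attach a remote to an existing repo
git push -q -u origin master              # -u records origin/master as the upstream
remote_lines=$(git remote -v | wc -l | tr -d ' ')   # one fetch line plus one push line

cd ..
git clone -q server.git copy              # a fresh copy, history included
cloned_subject=$(git -C copy log -1 --format=%s)
echo "remote lines: $remote_lines, newest commit in the clone: $cloned_subject"
```

With a real server, only the URL changes; the commands are the same.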

For projects you work on:

a 'git pull' a day, keeps the conflicts away


Note: commit before pull

$ git commit -am "always commit before pull"
$ git pull

++++++++++++++++++++++++++++
Cahoot-02c.3
https://mst.instructure.com/courses/58101/quizzes/55185



Working with others


Collaboration

  • You and your co-workers are working on a project simultaneously
  • You clone the company's repository:

$ git clone https://git.company.com/project.git

  • $ git checkout -b dougs-branch
    • to create your own development branch
  • Modify files, $ git add <FILENAME> to stage them, and
    • $ git commit when they are in a working state.
  • Ready to merge with mainline?
    • $ git checkout master and
    • $ git merge dougs-branch
  • Your work is now merged with your local master branch (but not on the company's repo).
  • Question: which branch is HEAD now pointing to?
  • Meanwhile, your co-workers might have made changes!
  • First, $ git pull to fetch and merge their changes
  • Rectify merge conflicts (if any),
    • test the code, then
    • $ git add <FILENAME> to stage, and
    • then $ git commit when in a working state
  • Only after pulling and merging the most recent changes should you
    • $ git push
  • Your work is merged with that of your co-workers, and now resides on the company repo
  • Take a break.
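
The workflow above can be rehearsed on one machine. In this sketch a local bare repository stands in for the company's server, and the folders seed, doug, and coworker are hypothetical clones. The pull uses --no-rebase explicitly because recent Git versions ask you to choose a strategy when the histories have diverged.

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"

# A bare repository stands in for the company's server.
git init -q --bare project.git
git --git-dir=project.git symbolic-ref HEAD refs/heads/master

# Seed it with one commit so both people clone the same starting point.
git clone -q project.git seed 2>/dev/null
( cd seed
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Seed"; git config user.email "seed@example.com"
  echo "project" > README.md
  git add README.md
  git commit -q -m "initial commit"
  git push -q origin master )

git clone -q project.git doug
git clone -q project.git coworker

# The co-worker pushes a change while you are busy on your branch.
( cd coworker
  git config user.name "Co Worker"; git config user.email "co@example.com"
  echo "their work" > theirs.txt
  git add theirs.txt
  git commit -q -m "co-worker's change"
  git push -q )

cd doug
git config user.name "Doug"; git config user.email "doug@example.com"
git checkout -q -b dougs-branch           # your own development branch
echo "my work" > mine.txt
git add mine.txt
git commit -q -m "doug's change"
git checkout -q master
git merge -q dougs-branch                 # local master now has your work
git pull -q --no-rebase --no-edit         # fetch and merge the co-worker's change
git push -q                               # only now publish the combined result
file_count=$(git ls-tree --name-only origin/master | wc -l | tr -d ' ')
echo "files on the server's master: $file_count"
```

Here the two changes touch different files, so the pull merges cleanly; if both had edited the same lines, you would resolve the conflict before pushing.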

Blaming your collaborators

When you need a scapegoat for that critical mistake in your code-base...
$ git help blame


Commit early, commit often!

A tip for version control, not for relationships...



++++++++++++++++++++++++++++
Cahoot-02c.4
https://mst.instructure.com/courses/58101/quizzes/55186



Final Git Tips

  • Unlike GCC/G++, Git actually gives good error messages!
    • If something went wrong, it often tells you exactly what to do.
    • Actually read Git's error messages!!!!
  • Make your commit messages descriptive.
  • Only $ git commit when the code works.
  • Don't add generated files (like a.out) to your repo.
  • You can ignore certain files by putting their names in a .gitignore file in your repository.
  • When collaborating, work on separate branches and merge as you go along.
  • $ git help COMMAND will show you documentation.
    • $ git COMMAND --help usually works too.
    • $ man git-COMMAND often does too.
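
As a sketch of the .gitignore tip, the example below ignores a.out and object files. The patterns and file names are illustrative assumptions; pick ones that match what your own build produces.

```shell
set -e
demo=$(mktemp -d)                         # hypothetical scratch directory
cd "$demo"
git init -q

printf 'a.out\n*.o\n' > .gitignore        # one pattern per line
touch a.out main.o main.cpp               # two generated files, one source file

git add .                                 # ignored files are skipped
staged=$(git ls-files | tr '\n' ' ')
ignored=$(git check-ignore a.out main.o | tr '\n' ' ')
echo "staged: $staged"
echo "ignored: $ignored"
```

Commit the .gitignore file itself so your collaborators get the same rules.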


Time to Git-er-done: Continuous testing and integration


  • Continuous testing was originally proposed as a way of reducing waiting time for feedback to developers by introducing development environment-triggered tests as well as more traditional developer/tester-triggered tests.
  • Continuous testing is the process of executing automated tests as part of the software delivery pipeline to obtain immediate feedback on the business risks associated with a software release candidate.
  • For Continuous testing, the scope of testing extends from validating bottom-up requirements or user stories to assessing the system requirements associated with overarching business goals.
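
At its simplest, an automated test step is a script whose exit status says pass or fail. The sketch below is not how any particular CI system is configured; the add function and its checks are hypothetical stand-ins for a real project's test suite.

```shell
# Hypothetical code under test; a real project would call its own programs here.
add() { echo $(( $1 + $2 )); }

run_tests() {
    failures=0
    [ "$(add 2 3)" = "5" ] || failures=$((failures + 1))
    [ "$(add -1 1)" = "0" ] || failures=$((failures + 1))
    echo "failures: $failures"
    [ "$failures" -eq 0 ]                 # the exit status is what a pipeline looks at
}

if run_tests; then result=pass; else result=fail; fi
echo "pipeline result: $result"
```

Running a script like this automatically on every push is what gives the "immediate feedback" described above.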



Note:

Your https://git-classes.mst.edu unit tests are built into the git CI framework!
Check out how we do it:


Remember, in learning to code, and trying new projects:

Fork it until you make it!


