03-VersionControl
____________________________________________________
Previous: 01-02-LinuxBash
Introduction: three problems
1. How to keep past versions of your stuff?
~> ls
draft.py final.py final_real.py final_real_real.py actually_done.py actually_done_v1.py actually_done_v2.py actually_done_v2.1.py actually_done_v2.1-2019-12-10.py ...

2. How to collaborate by making copies of a document or code, and then re-integrating those changes?
For example:
- How to write code among thousands of people while everyone wants to work at once.
- How to rewrite or draft a document (e.g., a constitutional amendment) with lots of people at once.
3. How to back up your code?
Version control, the git that keeps on giving
- https://en.wikipedia.org/wiki/Version_control
- Like track changes in an MS Word document, but better, with many more features, and for source code!
- Comes in many flavors.
- https://en.wikipedia.org/wiki/Distributed_version_control
- A form of version control in which the complete code-base, including its full history, is mirrored on every developer's computer.
- This enables automatic management of branching and merging, speeds up most operations (except pushing and pulling), improves the ability to work offline, and does not rely on a single location for backups.
- In general, distributed systems are more robust and favorable for end-users, when compared to centralized systems.
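To make the "full history on every machine" point concrete, here is a minimal sketch. It assumes a scratch directory from mktemp and made-up names (original, mirror, notes.txt, and the "Demo User" identity are all hypothetical). It builds a tiny repository, clones it, and counts the commits in each copy.

```shell
#!/bin/bash
# Sketch: a clone carries the complete history, not just the latest files.
set -e
work=$(mktemp -d)                         # scratch directory for the demo
cd "$work"

# A small "original" repository with two commits.
git init -q original
cd original
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email "demo@example.com"
echo "first" > notes.txt
git add notes.txt
git commit -q -m "first commit"
echo "second" >> notes.txt
git commit -q -am "second commit"
cd ..

# Cloning copies every commit into the new directory.
git clone -q original mirror
original_count=$(git -C original rev-list --count HEAD)
mirror_count=$(git -C mirror rev-list --count HEAD)
echo "original has $original_count commits, mirror has $mirror_count"
```

Because the mirror holds its own copy of the history, commands like git log keep working there even if the original disappears.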
Git is one byte short of a four-letter word.
- https://en.wikipedia.org/wiki/Git
- Git is a distributed VCS written by the original author of the Linux kernel, https://en.wikipedia.org/wiki/Linus_Torvalds
- Torvalds sarcastically quipped about the name Git (which means unpleasant person in British English slang): "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'." ...
- The man page describes Git as "the stupid content tracker".
- The read-me file of the source code elaborates further:
- random three-letter combination that is pronounceable, and not actually used by any common UNIX command.
- Git: stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
- "global information tracker"
- "goddamn idiotic truckload of..."
Git
https://git-scm.com/
https://en.wikipedia.org/wiki/Git
- Git is software. It exists locally on your machine and on other developers' machines.
- GitHub, GitLab, and Bitbucket are websites (servers) that interface with end-users' Git software.
- They run their own Git-compatible server software that hosts Git repositories and talks to local Git processes.
- Quite ironically, unlike GitHub (now a Microsoft product), GitLab's server-side software is actually https://en.wikipedia.org/wiki/Open_source, so anyone can host their own GitLab website/server.
- GitLab itself also has a rich, positive development community.
- MST IT hosts two installations of GitLab server-side software, on two different servers residing on campus (cool!!):
- https://git.mst.edu (permanent code, like lab code or personal projects)
- https://git-classes.mst.edu (class code, which gets deleted every now and then)
- This is actually a good approach for now, if you break your "repo" but still have your code...
- Later, you will want to learn branching and conflict handling better.
Demo 1
#!/bin/bash
# Make a repository at https://git-classes.mst.edu
# Show the gitlab view of it
git clone <REPO_URL>
vim README.md
vim hello_world.py
# write, save, quit
git add .
git commit -m "my first repo!"
git push -u origin master
# show web interface
# edit a file locally
git push #?
git pull #?
# edit something in web, then, what happens?
git pull
Demo 2
Check out some real repositories
- https://github.com/explore
- For example:
- https://gitlab.com/explore
Extra background
- Reading about version control and Git. Read these roughly in order.
- https://www.atlassian.com/git/tutorials/what-is-version-control
- https://git-scm.com/
- https://git-scm.com/videos
- https://git-scm.com/docs/gittutorial
- https://git-scm.com/book/en/v2 (read at least chapters 1 and 2)
- https://docs.gitlab.com/ce/gitlab-basics/README.html
- https://marklodato.github.io/visual-git-guide/index-en.html (great after you have completed intro reading)
- https://learnxinyminutes.com/docs/git/
- http://think-like-a-git.net/
- https://www.dangitgit.com/
- https://learngitbranching.js.org/ (advanced tutorial)
- http://git.rocks (dead link, used to be a great demo.... find it again?)
- ../tools-for-computer-scientists.pdf Appendix E, Chapter 1
- ./03-version_control.pdf (my old slides)
- Cheat sheets:
- https://rogerdudler.github.io/git-guide/
- https://github.com/hbons/git-cheat-sheet/raw/master/git-cheat-sheet.pdf
- http://wall-skills.com/wp-content/uploads/2013/12/git-Cheat-Sheet_Wall-Skills1.pdf
- https://rogerdudler.github.io/git-guide/files/git_cheat_sheet.pdf
- https://www.atlassian.com/git/tutorials/atlassian-git-cheatsheet
- https://about.gitlab.com/images/press/git-cheat-sheet.pdf
Tracking changes
Git version control?
- Keeps track of changes to your code.
- You don't have to worry about accidentally losing or deleting code.
- You can experiment with changes to your code, and then reset to a known good state.
- Makes collaborating with others easier.
How does Git work?
- Distributed - everything is kept on your and your collaborators' local machines, not primarily or necessarily in the cloud.
- Repository - a collection of code and history; a.k.a. "repo"
- Commit - a chunk of saved changes, like a snapshot in time, similar to a VM snapshot, but only for a particular folder (a git repo).
Distributed
Distributed version control
Snapshots
Snapshots (commits) include all files
Storage landscape
Three places where edits exist
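The three places are usually called the working directory, the staging area, and the repository. A minimal sketch follows one edit through all three. The scratch directory, file name, and "Demo User" identity are assumptions made up for the demo.

```shell
#!/bin/bash
# Sketch: follow one edit through working directory -> staging area -> repository.
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"       # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "v1"

echo "v2" > file.txt                   # 1. the edit lives only in the working directory
in_working_dir=$(git diff --name-only)           # working directory vs. staging area

git add file.txt                       # 2. the edit is copied into the staging area
in_staging=$(git diff --cached --name-only)      # staging area vs. last commit

git commit -q -m "v2"                  # 3. the edit is saved in the repository history
last_message=$(git log -1 --format=%s)

echo "working dir: $in_working_dir | staged: $in_staging | committed: $last_message"
```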
Gitting Started...
- Actually demo some of these in class
Pre-use configuration: these are just for meta-data, not login, etc.
- $ git config --global user.name "<YOUR NAME>"
- $ git config --global user.email <EMAIL>
- $ git config --global core.editor vim
- or your choice of text editor
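If you want to try the settings without touching your real global configuration, the sketch below sets them for one throwaway repository only (no --global) and reads them back. The name and email are placeholders.

```shell
#!/bin/bash
# Sketch: set and read back identity settings for a single scratch repository.
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q

# Without --global, the values are stored in this repo's .git/config only.
git config user.name "Demo User"           # placeholder name
git config user.email "demo@example.com"   # placeholder email

name=$(git config --get user.name)
email=$(git config --get user.email)
echo "commits made here will be recorded as: $name <$email>"
```

These values are only metadata stamped on commits; they are not used to log in to a server.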
Basic local use:
- $ git init Makes a new empty git repository out of your current working directory and its sub-directories.
- $ git add <FILENAME> Adds FILENAME or changes to FILENAME to the next commit. Addable thing can be a wildcard, like . or *
- $ git commit -m "some message" Takes a snapshot (commit) with any staged (added) changes.
- Note: don't skip the -m "message" or you may end up stuck in vim; if so, just hit 'i', type something, hit 'esc', then type ':wq!' and hit enter.
THESE SHOULD BE YOUR CONSTANT GO-TO:
- $ git status Shows the status of the repository.
- $ git diff Shows the changes you have made since your last snapshot that are not yet staged (added)
- $ git diff fileofinterest.py
- $ git log --all --graph Shows a nice history
commit
$ echo hey >>README.md
$ git add README.md
$ git commit -m "a message"
$ echo hey >>README.md
$ git commit -am "b message"
$ echo hey >>README.md
$ git commit -am "c message"
$ git log -p --all --graph
++++++++++++++++++++++++++++
Cahoot-02c.1
https://mst.instructure.com/courses/58101/quizzes/55183
branch
May the forks be with you!
$ git branch new-branch
$ git checkout new-branch
Or, to create the branch and switch to it in one step:
$ git checkout -b new-branch
$ git log -p --all --graph
$ echo hey >>README.md
$ git commit -am "d message"
$ git log -p --all --graph
$ git checkout master
$ git log -p --all --graph
$ git checkout -b another
$ git log -p --all --graph
$ echo hey >>README.md
$ git commit -am "e message"
$ git log -p --all --graph
diff for branches
$ git diff branch1..branch2
merge
Incorporates changes from the named commits (since the time their histories diverged from the current branch) into the current branch.
$ git merge new-branch
$ git log -p --all --graph
$ git checkout master
$ git merge another
$ git log -p --all --graph
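As a self-contained sketch of the simplest case, the script below makes one commit on new-branch and merges it back into master, counting commits on master before and after. The scratch directory and "Demo User" identity are assumptions, and the symbolic-ref line is only there so the first branch is called master regardless of local Git defaults.

```shell
#!/bin/bash
# Sketch: commit on a branch, then merge that branch into master.
set -e
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, as in these notes
git config user.name "Demo User"           # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "base" > README.md
git add README.md
git commit -q -m "base"

git checkout -q -b new-branch              # create and switch to the branch
echo hey >> README.md
git commit -q -am "d message"

git checkout -q master                     # back to master, which lacks "d message"
before=$(git rev-list --count HEAD)
git merge -q new-branch                    # bring the branch's commit into master
after=$(git rev-list --count HEAD)

echo "commits on master: $before before the merge, $after after"
```

Since master had no new commits of its own here, Git can simply move master forward to the branch's commit.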
Merge conflicts (oh Fork! ...)
CONFLICT (content): Merge conflict in the-file.txt
Automatic merge failed; fix conflicts and then commit the result.
In the-file.txt:
<<<<<<< HEAD
The current branch's contents
=======
Stuff from the branch you're merging
>>>>>>> new-branch
Edit the file to keep what you want, delete the marker lines, then:
$ git add the-file.txt
$ git commit -m "message"
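To see a conflict safely, here is a sketch that manufactures one in a scratch repository and then resolves it. The identity and the choice to keep both lines are assumptions for the demo; a real resolution depends on what the code should do.

```shell
#!/bin/bash
# Sketch: create a merge conflict on purpose, inspect it, and resolve it.
set -e
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master, as in these notes
git config user.name "Demo User"           # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "original line" > the-file.txt
git add the-file.txt
git commit -q -m "base"

git checkout -q -b new-branch
echo "Stuff from the branch you're merging" > the-file.txt
git commit -q -am "change on new-branch"

git checkout -q master
echo "The current branch's contents" > the-file.txt
git commit -q -am "change on master"

# Both branches changed the same line, so the merge stops with a conflict.
# '|| true' keeps this script running despite the non-zero exit status.
git merge new-branch || true
cat the-file.txt                           # shows the conflict marker lines
markers=$(grep -c '^<<<<<<< ' the-file.txt)

# Resolve by hand: in this demo we keep both lines and drop the markers.
printf '%s\n' "The current branch's contents" \
              "Stuff from the branch you're merging" > the-file.txt
git add the-file.txt
git commit -q -m "merge new-branch, keeping both lines"
parents=$(git log -1 --format=%p | wc -w)  # a merge commit has two parents
echo "conflict blocks seen: $markers, parents of final commit: $parents"
```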
++++++++++++++++++++++++++++
Cahoot-02c.2
https://mst.instructure.com/courses/58101/quizzes/55184
Exploration
Looking at stuff
$ git status shows summary data
$ git log Show a log of commits
--all Shows all branches
-p Show what changed in each commit
$ git show firstfourofhashofcommit
$ git diff Show un-added, un-committed changes for all files
$ git diff firstfourofhashofcommit
$ git diff --cached shows diff with added but not committed changes
$ git diff branch1..branch2
Git happens
Now, how to clean up a mess?
Restore a single file to its last committed state (discarding un-staged edits)
$ git checkout -- file.py
Reverting changes (undoing an earlier commit by making a new commit)
$ git help revert
Undoing stuff since a commit
To discard all local changes to tracked files that have not been added to the staging area, and leave untracked files/folders alone, type:
$ git checkout .
To un-stage the most recently added, but not committed, changes to files/folders (your edits stay in the working directory):
$ git reset .
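The difference between the two commands is easiest to see by running them. This sketch (scratch repository, hypothetical file names and identity) stages an experimental edit, un-stages it with git reset, and then throws it away with git checkout.

```shell
#!/bin/bash
# Sketch: un-stage an edit with 'git reset .', then discard it with 'git checkout .'.
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"       # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "good" > file.py
git add file.py
git commit -q -m "known good state"

echo "experiment" >> file.py           # an edit we will regret
git add file.py                        # ...and have already staged

git reset -q .                         # un-stage; the edit is still in file.py
staged_after_reset=$(git diff --cached --name-only)    # nothing staged now
modified_after_reset=$(git diff --name-only)           # file.py still differs

echo "scratch" > untracked.txt         # a new file Git has never been told about
git checkout .                         # discard un-staged edits to tracked files
contents=$(cat file.py)                # back to the committed version

echo "file.py now contains: $contents"
ls                                     # untracked.txt is still there
```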
Remote repositories
Working with remotes
$ git clone <REPO_URL>
- makes a copy of a repository.
$ git push <remote> <name-of-branch>
- Pushes your local commits on the named branch to that branch on the remote. (You may need to run $ git config --global push.default simple.)
For example, to push your local commits to the master branch of the origin remote:
$ git push origin master
$ git pull <remote> <name-of-branch>
- Pulls changes from the remote branch and merges them into your current branch
$ git remote -v
- To view your remote repositories
$ git remote add <REMOTE_NAME> <REPO_URL>
- adds a remote to an existing repository.
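You can practice these commands without a server. The sketch below assumes a local bare repository (server.git, a made-up name) standing in for a hosted remote, adds it as origin, pushes to it, and clones it again. The directory names and identity are hypothetical.

```shell
#!/bin/bash
# Sketch: use a local bare repository as a stand-in for a hosted remote.
set -e
work=$(mktemp -d)                          # scratch directory for the demo
cd "$work"

# The pretend server: a bare repository (history only, no working files).
git init -q --bare server.git
git -C server.git symbolic-ref HEAD refs/heads/master

# An existing local project that does not have a remote yet.
git init -q laptop
cd laptop
git symbolic-ref HEAD refs/heads/master    # name the first branch master, as in these notes
git config user.name "Demo User"           # hypothetical identity, local to this repo
git config user.email "demo@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "first commit"

git remote add origin "$work/server.git"   # attach the remote under the name origin
git remote -v                              # list remotes and their URLs
git push -q -u origin master               # upload master and remember the pairing
cd ..

# Anyone who can reach the remote can now clone it.
git clone -q server.git second-copy
cloned=$(cat second-copy/README.md)
echo "second-copy/README.md says: $cloned"
```

With a real GitLab or GitHub project, the only change would be the URL passed to git remote add and git clone.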
For projects you work on:
Note: commit before pull
$ git commit -am "always commit before pull"
$ git pull
++++++++++++++++++++++++++++
Cahoot-02c.3
https://mst.instructure.com/courses/58101/quizzes/55185
Working with others
Collaboration
- You and your co-workers are working on a project simultaneously
- You clone the company's repository:
- $ git clone <REPO_URL>
- Then create your own development branch:
- $ git checkout -b dougs-branch
- Modify files, $ git add <FILENAME> to stage them, and
- $ git commit when they are in a working state.
- Ready to merge with mainline?
- $ git checkout master and
- $ git merge dougs-branch
- Your work is now merged with your local master branch (but not on the company's repo).
- Question: which branch is HEAD now pointing to?
- Meanwhile, your co-workers might have made changes!
- First, $ git pull to fetch and merge their changes
- Rectify merge conflicts (if any),
- test the code, then
- $ git add <FILENAME> to stage, and
- then $ git commit when in a working state
- Only after pulling and merging the most recent changes should you
- $ git push
- Your work is merged with that of your co-workers, and now resides on the company repo
- Take a break.
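The whole workflow above can be rehearsed on one machine. This is only a sketch under stated assumptions: company.git is a local bare repository standing in for the company's server, the doug and coworker directories play the two people, and the identify helper, file names, and identities are made up. The --no-rebase and --no-edit flags on git pull are added so the demo merges and runs unattended.

```shell
#!/bin/bash
# Sketch: two people sharing one remote, following branch -> merge -> pull -> push.
set -e
work=$(mktemp -d)                          # scratch directory for the demo
cd "$work"

identify() {                               # hypothetical helper: set a local identity
    git config user.name "$1"
    git config user.email "$2"
}

# The "company" repository, simulated by a local bare repo with one commit.
git init -q --bare company.git
git -C company.git symbolic-ref HEAD refs/heads/master
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
identify "Seed" "seed@example.com"
echo "project" > README.md
git add README.md
git commit -q -m "initial commit"
git push -q "$work/company.git" master
cd ..

# You and a co-worker both clone it.
git clone -q company.git doug
git clone -q company.git coworker

# You: work on your own branch, then merge it into your local master.
cd doug
identify "Doug" "doug@example.com"
git checkout -q -b dougs-branch
echo "doug's feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git checkout -q master
git merge -q dougs-branch
cd ..

# Meanwhile, the co-worker pushes a change of their own.
cd coworker
identify "Coworker" "coworker@example.com"
echo "coworker's fix" > fix.txt
git add fix.txt
git commit -q -m "add fix"
git push -q origin master
cd ..

# You: pull (fetch and merge) their work first, and only then push.
cd doug
git pull -q --no-rebase --no-edit origin master
git push -q origin master
cd ..

files_on_company_repo=$(git -C company.git ls-tree --name-only master)
echo "$files_on_company_repo"
```

Swapping the last two steps (pushing before pulling) would make Git reject the push, because the company repo has a commit your local master has not seen.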
Blaming your collaborators
When you need a scapegoat for that critical mistake in your code-base...
$ git help blame
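A small sketch of what blame reports, using a scratch repository and two invented authors (Alice and Bob, set per commit with git -c so no real configuration is touched):

```shell
#!/bin/bash
# Sketch: two authors edit one file; git blame shows who last touched each line.
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q

echo "line one" > code.py
git add code.py
git -c user.name="Alice" -c user.email="alice@example.com" \
    commit -q -m "alice writes line one"

echo "line two" >> code.py
git -c user.name="Bob" -c user.email="bob@example.com" \
    commit -q -am "bob adds line two"

git blame code.py                      # one row per line: commit, author, date, text

# Pull out just the author of line 2 from the machine-readable format.
culprit=$(git blame --line-porcelain -L 2,2 code.py | sed -n 's/^author //p')
echo "line 2 was last changed by: $culprit"
```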
Commit early, commit often!
A tip for version control, not for relationships...
++++++++++++++++++++++++++++
Cahoot-02c.4
https://mst.instructure.com/courses/58101/quizzes/55186
Final Git Tips
- Unlike GCC/G++, Git actually gives good error messages!
- If something went wrong, it often tells you exactly what to do.
- Actually read Git's error messages!!!!
- Make your commit messages descriptive.
- Only $ git commit when the code works.
- Don't add generated files (like a.out) to your repo.
- You can ignore certain files by putting their names in a .gitignore file in your repository.
- When collaborating, work on separate branches and merge as you go along.
- $ git help COMMAND will show you documentation.
- $ git COMMAND --help will usually work too.
- $ man git-COMMAND often does too.
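For the .gitignore tip above, a minimal sketch: a scratch repository with one source file and one generated file (the file contents are placeholders), where a.out is listed in .gitignore and therefore skipped by git add.

```shell
#!/bin/bash
# Sketch: keep a generated file out of the repository with .gitignore.
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q

echo 'print("hi")' > hello_world.py    # source file we do want to track
echo "binary junk" > a.out             # generated file we do not want to track
echo "a.out" > .gitignore              # one file name or pattern per line

git add .                              # stages everything that is not ignored
staged=$(git diff --cached --name-only)
git status --short                     # a.out does not appear at all
echo "staged files:"
echo "$staged"
```

Committing the .gitignore file itself is a common choice, so collaborators get the same rules.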
Time to Git-er-done: Continuous testing and integration
- https://en.wikipedia.org/wiki/Continuous_testing
- https://en.wikipedia.org/wiki/Continuous_integration
- https://en.wikipedia.org/wiki/Deployment_environment
- Continuous testing was originally proposed as a way of reducing waiting time for feedback to developers by introducing development environment-triggered tests as well as more traditional developer/tester-triggered tests.
- Continuous testing is the process of executing automated tests as part of the software delivery pipeline to obtain immediate feedback on the business risks associated with a software release candidate.
- For Continuous testing, the scope of testing extends from validating bottom-up requirements or user stories to assessing the system requirements associated with overarching business goals.
Note:
Check out how we do it:
- https://about.gitlab.com/ci-cd/
- https://docs.gitlab.com/ee/ci/
- https://about.gitlab.com/product/continuous-integration/
Remember, in learning to code, and trying new projects: