Git and GitHub - setup, workflow and learnings

Git has arrived and is here to stay. The learning curve is steep and frustrating, but the results are rewarding, to say the least. I transitioned not too long ago from the simplicity (and accompanying inflexibility) of CVS. I can still hear myself groaning at the cruel change in terminology (commit is local... aaargh) and at the lack of any direct way to apply CVS concepts in git. Sidenote: do not bother looking for git equivalents of CVS commands. I groaned until the moment I saw the light.

The CodeChix ONF driver project collaboration would not have been easy without the power of git. But git in itself wasn't sufficient for our purposes - we also needed an online hosting service for the repository. We chose GitHub.

GitHub defines additional mechanisms for managing collaboration on its hosted service, which can be yet another source of frustration if not understood well - more on that later.

Specifically, we used these features extensively:
1. Fork
2. Pull request for code review and merge
3. Pull from 'upstream'

The alternative to the above workflow is to clone directly from the project and push directly into it. I did not favor this approach as it does not allow for an intermediate step of reviewing the code. Pull requests are built for code reviews and explicit merges by the repo manager.

Here's the setup in great detail:
1. The main repo has a topic branch (called 'dev-onf-driver') in addition to master. This main repo with its 2 branches is our 'upstream'. You can set one or the other as the 'default' branch via the GitHub web interface.

2. Fork - also executed online - creates a copy of the upstream in the collaborator's online GitHub account. This is the collaborator's 'origin' and has both branches.

3. git clone <path to origin>
Each collaborator clones the origin to their local development machine.

4. git checkout -b <branch name> --track <remote branch>
This step creates a local branch that tracks an additional branch from the origin; 'git clone' checks out only the default branch. The names of all remote branches are listed by 'git branch -a'.

5. git remote add upstream <path to upstream>
Necessary for pulling latest changes from upstream. The upstream (as noted in #1 above) is the main repo to which all collaborators will merge their changes via pull requests.

This completes the setup.
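
The setup steps above can be tried end to end without touching GitHub. The following is only a minimal sketch under stated assumptions: the directories upstream.git and origin.git are local stand-ins for the hosted repos, the fork is imitated with a bare clone (on GitHub it is done in the web interface), and the commit identity is a made-up placeholder.

```shell
#!/bin/sh
# Sketch of setup steps 1-5 using throwaway local repositories.
# upstream.git and origin.git are local stand-ins for the GitHub-hosted repos.
set -eu

WORK=$(mktemp -d)
cd "$WORK"

# Placeholder identity for the demo commit (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Step 1: the main repo with master and the dev-onf-driver topic branch.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master   # name the initial branch 'master'
echo "driver" > README
git add README
git commit -q -m "Initial commit"
git branch dev-onf-driver
cd ..
git clone -q --bare seed upstream.git

# Step 2: the "fork" (a bare clone here, the web interface on GitHub).
git clone -q --bare upstream.git origin.git

# Step 3: clone the origin to the development machine.
git clone -q origin.git local
cd local

# Step 4: create a local branch that tracks the remote topic branch.
git checkout -q -b dev-onf-driver --track origin/dev-onf-driver

# Step 5: register the main repo as the 'upstream' remote.
git remote add upstream "$WORK/upstream.git"

# Inspect the result: two remotes, and both branches locally and remotely.
git remote -v
git branch -a
```

With real repositories, the paths passed to 'git clone' and 'git remote add' would be the URLs of the fork and of the main repo respectively.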

The typical workflow with this setup is:
1. Merging changes to upstream:
    Each collaborator does the following to merge to upstream:
      a. A series of 'git commit' followed by 'git push' when ready to merge. The changes are now updated in the collaborator's 'origin'.
      b. Log in to the online origin repo and start a 'pull request'. Edit the repo:branch combination to select the correct upstream and the correct origin branch. After confirming the changes displayed on the page, initiate the pull request.

2. Code review:
    Every time a pull request is generated, it gives the other collaborators the opportunity to review and comment on the code. The pull request can be cancelled or updated with changes.

3. Merge to upstream:
    Once the code review is complete, the pull request is merged to upstream.

4. Pull changes to all collaborators' repos:
git pull upstream <branch name>
e.g.: git pull upstream dev-onf-driver
This is possible only after stashing or committing the changes in the repo. Once the local repo is updated, the origin also needs to be brought in sync with the upstream by:
git push
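
Step 4 of the workflow can be sketched the same way. This is an illustration under assumptions, not our actual repos: upstream.git and origin.git are local stand-ins for the hosted repos, the merged pull request is imitated by a second clone (called 'other' here, a hypothetical name) pushing straight into the upstream stand-in, and the commit identity is a placeholder.

```shell
#!/bin/sh
# Sketch of workflow step 4: pull from upstream, then push to origin.
set -eu

WORK=$(mktemp -d)
cd "$WORK"

# Placeholder identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Condensed setup: upstream, the fork ("origin"), and a local clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "driver" > seed/README
git -C seed add README
git -C seed commit -q -m "Initial commit"
git -C seed branch dev-onf-driver
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git origin.git
git clone -q origin.git local
git -C local checkout -q -b dev-onf-driver --track origin/dev-onf-driver
git -C local remote add upstream "$WORK/upstream.git"

# Imitate another collaborator's pull request being merged into upstream.
git clone -q --branch dev-onf-driver upstream.git other
echo "fix" > other/fix.txt
git -C other add fix.txt
git -C other commit -q -m "Change merged via pull request"
git -C other push -q "$WORK/upstream.git" dev-onf-driver

# Step 4: set uncommitted work aside, pull from upstream, restore the work.
cd local
echo "work in progress" >> README
git stash
git pull -q upstream dev-onf-driver
git stash pop

# Bring the origin in sync with the upstream.
git push -q
```

After the final push, the dev-onf-driver branch points at the same commit in the local repo, the origin and the upstream, while the uncommitted edit to README is still waiting in the working tree.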

Our experiences:
1. Pull requests are *not* very intuitive. The 'pull' here is a request for the upstream owner to pull (merge) your branch into theirs. In other git contexts, 'pull' means updating a local repo with code from a remote such as upstream. Getting pull requests right in concept and in practice is a struggle and the cause of many a mistake.

2. Collaborative merge permissions can be dangerous. It is best for the merges (from pull requests) to be controlled by one owner. It is terribly easy to open or merge a pull request with incorrect base and origin branches/repos. Reverting this is not as easy.

3. The only way to update the online 'origin' repo is by doing a 'git pull upstream <branch>' followed by 'git push'. At the time of writing, there is no online mechanism to achieve that. This can be annoying, but if the workflow is strictly established so that changes travel in one direction, it is not a problem. In our case, the graph edges were always unidirectional: upstream -> local -> origin -> upstream

4. git push to upstream. Like all things git, this too is possible but in a workflow like ours, dreaded!
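
On point 2, here is a rough sketch of what undoing a mistaken merge can look like. It is not the procedure we followed, just one common approach, and it makes several assumptions: the bad merge produced a merge commit, it is the latest commit on master, and the branch name 'wrong-branch' and the commit identity are hypothetical.

```shell
#!/bin/sh
# Sketch: undo a mistaken merge with 'git revert -m 1'.
set -eu

WORK=$(mktemp -d)
cd "$WORK"

# Placeholder identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master   # name the initial branch 'master'
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# A branch that should not have been merged (hypothetical name).
git checkout -q -b wrong-branch
echo "oops" > oops.txt
git add oops.txt
git commit -q -m "Work from the wrong branch"

# The mistaken merge; --no-ff forces a merge commit.
git checkout -q master
git merge -q --no-ff -m "Merge wrong-branch" wrong-branch

# Add a new commit that reverses the merge relative to its first parent,
# i.e. relative to master as it was before the merge.
git revert -m 1 --no-edit HEAD

# The merge and its revert both remain in the history.
git log --oneline --graph
```

The revert removes the merged changes from the tree but not from the history, so merging the same branch again later will not bring those changes back unless the revert commit is itself reverted. That is a large part of why reverting is not as easy as merging.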

What we may change next time:
Evaluate other means of code review. Gerrit and Jenkins will be tested for ease of use and cost. Pull requests were the cause of many lost hours of productivity and will be avoided if possible.