git-subtree

Sometimes projects have complex relationships. Perhaps a project develops a utility or library internally that ends up becoming its own separate project. Perhaps a project uses a particular library, but adds some features to it in its own repository. How are these commits reconciled? One method is to use the git-subtree script. It is a very handy tool that associates a repository with a directory, or subtree, of a different repository while still allowing commits to be transferred in both directions. In Arch Linux, git-subtree is in the AUR as git-subtree-git, so it can be installed with yaourt by running yaourt -S git-subtree-git.
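Before installing anything, it may be worth checking whether git can already find a subtree command. Git resolves "git subtree" by looking for an executable named git-subtree in its exec path and then in PATH, so a rough check along these lines should tell you. The function name have_git_subtree is hypothetical and not part of git or git-subtree.

```bash
#!/bin/bash
# Hypothetical helper: succeeds if git can find a git-subtree executable.
# Both places git searches for subcommands are checked: the exec path
# reported by "git --exec-path", then the directories in PATH.
have_git_subtree() {
    [ -x "$(git --exec-path 2>/dev/null)/git-subtree" ] && return 0
    command -v git-subtree > /dev/null 2>&1
}

if have_git_subtree; then
    echo "git-subtree: available"
else
    echo "git-subtree: missing"
fi
```

If the check reports "missing", install the package as described above and run it again.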

Some time ago, I started working on a project that involved using XBee ZigBee radios from Digi. Unfortunately, the software provided is Windows-only and lacks some very important features, like packet generation and decoding. As a part of the project, I worked on a cross-platform, open source re-implementation of a ZigBee terminal that can talk to the XBee radios in packet mode and create and decode packets with a nice interface. The application evolved to the point where it would be nice to split it off into its own repository. So, I installed git-subtree and went to work. Or not. The documentation is really bad if you're not a git expert. After consulting the oracle of Google, I discovered a very good blog post on the matter. Please see the link for the details, as Jakub does a much better job of explaining it than I do.

Before the split, the repository looked like this:

  • cansat-2012
    • admin
    • AUTHORS
    • docs
    • firmware
    • ground-station
    • MCAD
    • pcb
    • README
    • zigbee-terminal

I wanted to extract the zigbee-terminal tree and all of its commits into its own repository. First, I split the commits into a new branch like so:

git subtree split -P zigbee-terminal -b export
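To get a feel for what the split produces before running it on a real repository, it can be tried in a throwaway directory. The sketch below assumes git-subtree is installed. All names in it (bigrepo, tool, docs, export, demo_split) are hypothetical stand-ins. It builds a tiny repository, splits the tool directory onto a branch, and then lists the files and the commit count on that branch.

```bash
#!/bin/bash
# Scratch demo of "git subtree split". Every name here (bigrepo, tool,
# docs, export) is hypothetical and lives in a throwaway temp directory.
demo_split() {
    local work status
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        git init -q bigrepo && cd bigrepo || exit 1
        git config user.name "Demo"
        git config user.email "demo@example.com"

        mkdir tool docs
        echo "main" > tool/main.c
        echo "notes" > docs/notes.txt
        git add . && git commit -q -m "initial import"

        echo "more" >> tool/main.c
        git commit -q -am "tool: extend main.c"

        echo "more notes" >> docs/notes.txt
        git commit -q -am "docs: extend notes"

        # Rewrite the history of tool/ onto a new branch named export.
        git subtree split -P tool -b export > /dev/null 2>&1 || exit 1

        # Files on export sit at the top level, without the tool/ prefix.
        git ls-tree -r --name-only export
        # Number of commits that ended up on export.
        git rev-list --count export
    )
    status=$?
    rm -rf "$work"
    return "$status"
}
```

Calling demo_split should list main.c without the tool/ prefix, and the commit count should leave out the commit that only touched docs.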

Then I created a new repository and imported the branch:

cd ~/Projects
mkdir zigbee-terminal
cd zigbee-terminal
git init
git fetch ../cansat-2012 export
git checkout -b master FETCH_HEAD
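The import works because git fetch, when given a path and a branch name, records the tip it fetched in FETCH_HEAD, and git checkout -b then creates a local branch from it. The sketch below reproduces the import in a temp directory and compares the imported tip with the source branch. The names source-repo, new-repo, and demo_import are hypothetical.

```bash
#!/bin/bash
# Scratch demo: import one branch from a sibling repository via FETCH_HEAD.
# The names source-repo, new-repo and export are hypothetical.
demo_import() {
    local work status
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        git init -q source-repo && cd source-repo || exit 1
        git config user.name "Demo"
        git config user.email "demo@example.com"
        echo "hello" > file.txt
        git add file.txt && git commit -q -m "first commit"
        git branch export
        src_tip=$(git rev-parse export)

        cd "$work" || exit 1
        git init -q new-repo && cd new-repo || exit 1
        # Fetch a single branch by path; its tip is recorded in FETCH_HEAD.
        git fetch -q ../source-repo export || exit 1
        # Create the local branch from whatever was just fetched.
        git checkout -q -b master FETCH_HEAD || exit 1

        if [ "$(git rev-parse HEAD)" = "$src_tip" ]; then
            echo "imported tip matches source"
        else
            echo "tip mismatch"
        fi
    )
    status=$?
    rm -rf "$work"
    return "$status"
}
```

Running demo_import reports whether the new repository's HEAD is the same commit as the export branch in the source repository.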

Then I pushed the commits to the new repository:

git remote add github git@github.com:alexforencich/zigbee-terminal.git
git push github master

After that, I went back to the old repository and removed the old folder:

git rm -rf zigbee-terminal
git commit -m "Removed zigbee-terminal"

Then I added the remote to the original repository as well:

git remote add zbt git@github.com:alexforencich/zigbee-terminal.git

And finally I added the project back as a subtree:

git subtree add -P zigbee-terminal -m "Added zigbee-terminal as a subproject" zbt/master

Now, to push commits back to the zigbee-terminal repository, I would run:

git subtree split -P zigbee-terminal -b zbtbackport
git push zbt zbtbackport:master

And to pull commits from the zigbee-terminal repository, I would run:

git fetch zbt
git subtree merge -P zigbee-terminal -m "merged changes in zigbee-terminal" zbt/master
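The push direction can be rehearsed safely with a local bare repository standing in for GitHub. The sketch below assumes git-subtree is installed. Every name in it (seed, upstream.git, project, lib, up, upbackport, demo_roundtrip) is hypothetical. It adds the stand-in upstream as a subtree, commits a change inside the subtree directory, splits, pushes, and then prints the commit subjects that arrived upstream.

```bash
#!/bin/bash
# Scratch round trip: add a subtree, commit inside it, push the change back.
# upstream.git stands in for the GitHub repository; all names are hypothetical.
demo_roundtrip() {
    local work status
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1

        # Seed a bare "upstream" repository with one commit.
        git init -q seed && cd seed || exit 1
        git config user.name "Demo"
        git config user.email "demo@example.com"
        echo "v1" > lib.txt
        git add lib.txt && git commit -q -m "lib: v1"
        branch=$(git symbolic-ref --short HEAD)
        cd "$work" || exit 1
        git clone -q --bare seed upstream.git || exit 1

        # The larger project that embeds the library under lib/.
        git init -q project && cd project || exit 1
        git config user.name "Demo"
        git config user.email "demo@example.com"
        echo "readme" > README
        git add README && git commit -q -m "project: initial commit"
        git remote add up ../upstream.git
        git fetch -q up || exit 1
        git subtree add -P lib "up/$branch" > /dev/null 2>&1 || exit 1

        # Change the library from inside the project, then send it upstream.
        echo "v2" >> lib/lib.txt
        git commit -q -am "lib: v2 from inside project"
        git subtree split -P lib -b upbackport > /dev/null 2>&1 || exit 1
        git push -q up "upbackport:$branch" || exit 1

        # Show the subjects of the commits now present upstream.
        git --git-dir=../upstream.git log --format=%s "$branch"
    )
    status=$?
    rm -rf "$work"
    return "$status"
}
```

If everything works, demo_roundtrip prints the subject of the commit made inside the project above the original upstream commit, which shows that the split history lines up with upstream and the push is a fast-forward.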

Slick. Now that that's taken care of, time to get back to the real engineering…

Helper Script

As I have been using the commands listed here quite often, I decided to put together a helper script for managing these subtrees. Take the script below, modify it to match your repository configuration, and include it in your repository alongside the subtree. Then you can call it with no arguments or with a single argument of 'pull' to pull from the remote subtree repository, or with 'push' to push to the remote subtree repository. It also accepts 'add', which it selects on its own when the subtree directory does not exist yet. As is, the script performs the exact commands listed in the above section, aside from splitting off the subdirectory when first creating a separate repository.
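The way the script picks its action can be summarized in a few lines: the default is pull, a missing subtree directory turns that into add, and an explicit argument overrides both. The function below is only a sketch of that rule for illustration. The name choose_action is hypothetical and does not appear in the script itself.

```bash
#!/bin/bash
# Sketch of the action-selection rule; choose_action is a hypothetical name.
# usage: choose_action SUBDIR [ACTION]
choose_action() {
    local subdir=$1 requested=$2
    local action="pull"                          # default action
    [ -d "$subdir" ] || action="add"             # no subtree directory yet
    [ -n "$requested" ] && action="$requested"   # explicit argument wins
    echo "$action"
}

scratch=$(mktemp -d)
choose_action "$scratch/zigbee-terminal"         # prints: add
mkdir "$scratch/zigbee-terminal"
choose_action "$scratch/zigbee-terminal"         # prints: pull
choose_action "$scratch/zigbee-terminal" push    # prints: push
rm -rf "$scratch"
```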

#!/bin/bash
 
# Git subtree manager
# Alex Forencich <alex@alexforencich.com>
# This script facilitates easy management of subtrees
# included in larger repositories as this script can
# be included in the repository itself.
 
# Note:
# Requires git-subtree
# https://github.com/apenwarr/git-subtree
 
# Settings
# uncomment to use --squash
#squash="yes"
# Remote repository
repo="git@github.com:alexforencich/zigbee-terminal.git"
# Remote name
remote="zbt"
# Subdirectory to store code in
subdir="zigbee-terminal"
# Remote branch
branch="master"
# Backport branch name (only used for pushing)
backportbranch="${remote}backport"
# Add commit message
addmsg="added ${subdir} as a subproject"
# Merge commit message
mergemsg="merged changes in ${subdir}"
 
# Usage
# add - adds subtree
# pull - default, pulls from remote
# push - pushes to remote
 
squashflag=""
 
if [ -n "$squash" ]; then
  squashflag="--squash"
fi
 
action="pull"
 
# run from the top of the working tree; bail out if not inside a repository
cd "$(git rev-parse --show-toplevel)" || exit 1
 
if [ ! -d "$subdir" ]; then
    action="add"
fi
 
if [ -n "$1" ]; then
    action="$1"
fi
 
# array contains value
# usage: contains array value
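function contains() {
    local n=$#
    # the last argument is the value to look for
    local value=${!n}
    local i
    # compare every earlier argument against it
    for ((i = 1; i < n; i++)); do
        if [ "${!i}" == "${value}" ]; then
            echo "y"
            return 0
        fi
    done
    echo "n"
    return 1
}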
 
case "$action" in
  add)
    if [ "$(contains $(git remote) "$remote")" != "y" ]; then
      git remote add "$remote" "$repo"
    fi
    git fetch "$remote"
    git subtree add -P "$subdir" $squashflag -m "$addmsg" "$remote/$branch"
    ;;
  pull)
    if [ "$(contains $(git remote) "$remote")" != "y" ]; then
      git remote add "$remote" "$repo"
    fi
    git fetch "$remote"
    git subtree merge -P "$subdir" $squashflag -m "$mergemsg" "$remote/$branch"
    ;;
  push)
    if [ "$(contains $(git remote) "$remote")" != "y" ]; then
      git remote add "$remote" "$repo"
    fi
    git subtree split -P "$subdir" -b "$backportbranch"
    git push "$remote" "$backportbranch:$branch"
    ;;
  *)
    echo "Error: unknown action '$action' (expected add, pull, or push)" >&2
    exit 1
    ;;
esac
 
exit 0
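All three branches of the case statement repeat the same "add the remote if it is missing" check. One possible way to factor that out, shown here only as a sketch, is to ask git for the remote's configured URL instead of scanning the output of git remote. The function name ensure_remote is hypothetical. The sketch relies on git config --get exiting with a non-zero status when the key is not set.

```bash
#!/bin/bash
# Hypothetical helper: add a remote only if it is not configured yet.
# usage: ensure_remote NAME URL
ensure_remote() {
    local name=$1 url=$2
    # "git config --get" exits non-zero when remote.NAME.url is not set.
    if ! git config --get "remote.$name.url" > /dev/null; then
        git remote add "$name" "$url"
    fi
}
```

With a helper like this, each branch could start with ensure_remote "$remote" "$repo". Calling it a second time does nothing, because the URL is already configured.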