On the face of it, Git submodules seem like a great idea. They allow you to include a Git repository inside another Git repository. This is good for maintaining shared code across multiple applications, or for including third-party libraries in your project. However, there are some trip hazards to be aware of when using them, and the source of most of the problems is this:
Submodules point to specific commits rather than branches.
The primary advantage of Git submodules is precise version control. Unlike package managers that might automatically update dependencies, submodules point to specific commits in the external repository. This ensures that your web application, AI slop generator or robot litter tray firmware maintains consistent behavior regardless of upstream changes until you explicitly choose to update.
Using Git Submodules
To add a git repository into your project as a submodule, recite the following incantations:
- From the root of your project, run `git submodule add https://github.com/littertron/disco-mode.git src/libraries/led/disco-mode` to clone the desired Git repository into the `src/libraries/led/disco-mode` directory. The submodule configuration will be added to the `.gitmodules` file in the project root, and that change will be staged for commit.
- Then run `git submodule init` to add the new submodule to the project’s Git configuration. This command reads the `.gitmodules` file and updates the `.git/config` file. `git submodule add` normally does this for you already, so the step mainly matters after a fresh clone of the outer project.
- Finally, run `git submodule update` to download the contents of the submodule so you can get on with coding some slick pixel LED effects for the Littertron 9000-F. Straight after `git submodule add` the code is already in place, so here too the command earns its keep on fresh clones.
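The steps above can be tried out safely in a throwaway sandbox. The sketch below is an illustration only: it swaps the GitHub URL for a local stand-in repository, and the `g` wrapper, the `sandbox` directory and the `effects.txt` file are all made up for the demo. It assumes Git 2.28 or later (for `git init -b`), and it passes `protocol.file.allow=always` because recent Git versions refuse to clone submodules from local paths by default.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical wrapper: throwaway identity, plus permission to clone a
# submodule from a local path (only sensible inside a sandbox like this).
g() {
  git -c user.name=Demo -c user.email=demo@example.com \
      -c protocol.file.allow=always "$@"
}

# Stand-in for the remote disco-mode repository.
g init -q -b main disco-mode
echo "chase" > disco-mode/effects.txt
g -C disco-mode add effects.txt
g -C disco-mode commit -q -m "Initial effects"

# The outer project.
g init -q -b main littertron
cd littertron
g commit -q --allow-empty -m "Initial commit"

# Clone the library into place and register it as a submodule.
g submodule --quiet add "$sandbox/disco-mode" src/libraries/led/disco-mode

# Inspect what was written and staged, then commit it.
cat .gitmodules
g status --short
g commit -q -m "Add Disco Mode submodule"
```

The `cat` and `status` calls are there so you can see that both `.gitmodules` and the submodule path end up staged by `git submodule add` itself.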
You’ve been working on Disco Mode for the Littertron 9000-F for a while, and it’s time to push your changes.
- From within `src/libraries/led/disco-mode`, run `git add -A` to stage your changes, and `git commit -m "Added new chase effects"` to commit them. Then run `git push` to push your changes to the remote repository.
- Back in the outer project, run `git add src/libraries/led/disco-mode` to stage the submodule update, and `git commit -m "Updated Disco Mode submodule"` to commit it. Finally, run `git push`.
The outer project’s repository does not store the contents of the submodule. Only the commit hash of the submodule is tracked and committed to the outer project.
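To see that pointer for yourself, the following sketch runs the two-step push workflow in a sandbox and then inspects what the outer project actually recorded. It is a demonstration under assumptions: the remote is a local bare repository standing in for the real one, the `g` wrapper and file names are hypothetical, and Git 2.28 or later is assumed.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical wrapper: throwaway identity, local-path submodules allowed.
g() {
  git -c user.name=Demo -c user.email=demo@example.com \
      -c protocol.file.allow=always "$@"
}

# A bare stand-in for the library's remote, seeded with one commit.
g init -q -b main seed
echo "chase" > seed/effects.txt
g -C seed add effects.txt
g -C seed commit -q -m "Initial effects"
g clone -q --bare seed disco-mode.git

# The outer project, with the library added as a submodule.
g init -q -b main littertron
cd littertron
sub=src/libraries/led/disco-mode
g submodule --quiet add "$sandbox/disco-mode.git" "$sub"
g commit -q -m "Add Disco Mode submodule"

# Step 1: commit and push from inside the submodule.
echo "strobe" >> "$sub/effects.txt"
g -C "$sub" add -A
g -C "$sub" commit -q -m "Added new chase effects"
g -C "$sub" push -q origin HEAD:main

# Step 2: record the new submodule commit in the outer project.
g add "$sub"
g commit -q -m "Updated Disco Mode submodule"

# The outer tree holds a single entry of type "commit" (mode 160000),
# not the submodule's files.
entry=$(g ls-tree HEAD "$sub")
mode=$(echo "$entry" | awk '{print $1}')
recorded=$(echo "$entry" | awk '{print $3}')
actual=$(g -C "$sub" rev-parse HEAD)
echo "$entry"
g ls-files
```

`git ls-files` lists the submodule path as one entry, which is why cloning the outer project alone never brings the library code with it.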
Potential Problems
Submodules are not automatically updated when you pull changes to the outer project
Your colleague, who is working on the Littertron 9000-F’s sensor logic, sees your changes. They run git pull and get the impression they’re up to date but they are not. Their copy of your lighting control submodule has not been updated. The result? The light sensor is improperly calibrated, and Disco Mode causes an integer overflow in the sensor reading. The 9000-F forcefully ejects its contents as a safety measure.
Team members must use `git pull --recurse-submodules` in the outer repository to update everything. (Or `git pull` followed by `git submodule update --init --recursive`.)
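The stale-submodule trap is easy to reproduce. The sketch below plays both roles in a sandbox: `you` pushes a submodule change, and `colleague` runs a plain `git pull`. Everything here is a stand-in (local bare repositories instead of real remotes, a hypothetical `g` wrapper, made-up file contents), and Git 2.28 or later is assumed.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical wrapper: throwaway identity, local-path submodules allowed.
g() {
  git -c user.name=Demo -c user.email=demo@example.com \
      -c protocol.file.allow=always "$@"
}
sub=src/libraries/led/disco-mode

# Bare stand-in for the library remote.
g init -q -b main lib-seed
echo "chase v1" > lib-seed/effects.txt
g -C lib-seed add effects.txt
g -C lib-seed commit -q -m "Initial effects"
g clone -q --bare lib-seed disco-mode.git

# Your copy of the outer project, plus a bare stand-in for its remote.
g init -q -b main you
g -C you submodule --quiet add "$sandbox/disco-mode.git" "$sub"
g -C you commit -q -m "Add Disco Mode submodule"
g clone -q --bare you littertron.git
g -C you remote add origin "$sandbox/littertron.git"

# Your colleague clones everything, submodule included.
g clone -q --recurse-submodules "$sandbox/littertron.git" colleague

# You change the submodule, push it, then push the new pointer.
echo "chase v2" >> "you/$sub/effects.txt"
g -C "you/$sub" commit -q -am "Added new chase effects"
g -C "you/$sub" push -q origin HEAD:main
g -C you add "$sub"
g -C you commit -q -m "Updated Disco Mode submodule"
g -C you push -q origin main

# A plain pull moves the outer project, but not the submodule checkout.
g -C colleague pull -q
want=$(g -C colleague rev-parse "HEAD:$sub")
stale=$(g -C "colleague/$sub" rev-parse HEAD)
g -C colleague status --short

# The missing step: check out the commits the outer project records.
g -C colleague submodule --quiet update --init --recursive
fixed=$(g -C "colleague/$sub" rev-parse HEAD)
```

After the plain pull, `git status` in the colleague's copy reports the submodule path as modified, which is the only hint that something is out of step.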
Submodules track commits, not branches
You’re tasked with designing some scaled-back lighting effects for an OTA update to the Littertron 9000 Mini. You switch to the appropriate branch in the outer project, run `git pull --recurse-submodules` and start working. But when you flash the firmware to your prototype, you find that Disco Mode starts a literal disco inferno. It turns out the battery management submodule should have been updated to the latest commit on its main branch to support your sick RGB chases.
To add a submodule that tracks a branch, use `git submodule add -b <branch> <repository> <path>`.
Running `git submodule update --remote` in the outer project will then update every submodule that tracks a branch to the latest commit on that branch. Adding `--merge` will merge those changes into your local submodule.
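Here is a sandboxed sketch of branch tracking in action: the submodule is added with `-b main`, the stand-in upstream gains a commit, and `git submodule update --remote` moves the submodule to it. The `g` wrapper, paths and file contents are hypothetical, and Git 2.28 or later is assumed.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical wrapper: throwaway identity, local-path submodules allowed.
g() {
  git -c user.name=Demo -c user.email=demo@example.com \
      -c protocol.file.allow=always "$@"
}
sub=src/libraries/led/disco-mode

# Stand-in upstream repository.
g init -q -b main disco-mode
echo "chase v1" > disco-mode/effects.txt
g -C disco-mode add effects.txt
g -C disco-mode commit -q -m "Initial effects"

# Outer project: add the submodule and record the branch to follow.
g init -q -b main littertron
cd littertron
g submodule --quiet add -b main "$sandbox/disco-mode" "$sub"
g commit -q -m "Add Disco Mode submodule tracking main"

# The branch name is stored in .gitmodules.
branch_cfg=$(g config -f .gitmodules --get "submodule.$sub.branch")
echo "tracking branch: $branch_cfg"

# Upstream moves on.
echo "chase v2" >> "$sandbox/disco-mode/effects.txt"
g -C "$sandbox/disco-mode" commit -q -am "Added new chase effects"
upstream=$(g -C "$sandbox/disco-mode" rev-parse HEAD)

# Move the submodule to the tip of the tracked branch.
g submodule --quiet update --remote
current=$(g -C "$sub" rev-parse HEAD)

# The outer project now sees a changed pointer; it still has to be committed.
g status --short
g add "$sub"
g commit -q -m "Updated Disco Mode submodule"
```

Note the last two commands: `--remote` changes what is checked out, but the outer project only records the new commit once you stage and commit the submodule path.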
Submodules can be a pain to debug
LitterCorp use a CI/CD pipeline to deploy the firmware for the Littertron 9000-F. The pipeline runs the tests, builds the firmware, and packages it for deployment. But the pipeline fails with a message about ‘invalid submodules’. You spend hours trying to figure out what’s wrong, only to discover that one submodule was cloned into place before being added as a submodule. Running git submodule add didn’t produce any errors, and the code is all up to date… But it turns out the code is actually being tracked by the outer project, not the submodule; the submodule is indeed invalid.
Always use `git submodule add` to add submodules to your project. If you’ve cloned a repo into place by mistake, remove it from the project and then re-add it as a submodule.
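One way this kind of breakage can show up is a nested repository that was staged with a plain `git add`: the outer project records a commit pointer, but `.gitmodules` knows nothing about it. The sketch below reproduces that variant in a sandbox, then removes and re-adds the repository as described above. The `unregistered_gitlinks` helper is a hypothetical check written for this example, not a Git command, and it may not match the exact failure LitterCorp's pipeline hit. It assumes paths without spaces and Git 2.28 or later.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical wrapper: throwaway identity, local-path submodules allowed.
g() {
  git -c user.name=Demo -c user.email=demo@example.com \
      -c protocol.file.allow=always "$@"
}
sub=src/libraries/led/disco-mode

# Stand-in library repository.
g init -q -b main disco-mode
echo "chase" > disco-mode/effects.txt
g -C disco-mode add effects.txt
g -C disco-mode commit -q -m "Initial effects"

# The mistake: clone into place, then stage it like any other directory.
g init -q -b main littertron
cd littertron
g clone -q "$sandbox/disco-mode" "$sub"
g add "$sub" 2>/dev/null   # Git warns about an embedded repository here
g commit -q -m "Add Disco Mode (incorrectly)"

# Hypothetical check: print gitlink entries (mode 160000) in the index
# that have no matching "path" entry in .gitmodules.
unregistered_gitlinks() {
  g ls-files -s | awk '$1 == "160000" {print $4}' | while read -r path; do
    g config -f .gitmodules --get-regexp '\.path$' 2>/dev/null |
      awk '{print $2}' | grep -qxF "$path" || echo "$path"
  done
}
before=$(unregistered_gitlinks)
echo "unregistered before fix: $before"

# The fix: remove it from the project, then re-add it as a submodule.
g rm -q --cached "$sub"
rm -rf "$sub"
g submodule --quiet add "$sandbox/disco-mode" "$sub"
g commit -q -m "Re-add Disco Mode as a proper submodule"
after=$(unregistered_gitlinks)
echo "unregistered after fix: $after"
```

A check along these lines could run early in a pipeline so the failure names the offending path instead of leaving you to guess.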
What about Git Subtrees?
Git subtrees merge the contents of an external repository directly into a subdirectory of your main repository. Unlike submodules, the external code becomes part of your repository’s history. To quote the highest-voted Stack Overflow answer on the subject:
submodule is link;
subtree is copy
Changes to subtree code are tracked within your main repository, not as references to another repository. This can make it easier to work with them, and it is still possible to push changes back to the source repository.
- `git subtree add --prefix=<path> <repository> <ref>`: Adds a subtree
- `git subtree pull --prefix=<path> <repository> <ref>`: Updates a subtree
- `git subtree push --prefix=<path> <repository> <branch>`: Pushes changes back to the original repository
<ref> can be a branch, tag, or commit hash.
So when would you use a subtree over a submodule? If you want to include a third-party library in your project and you intend to heavily customise it to that specific project, a subtree is a good choice. If you want to include a re-usable module that you might want to update independently of the current project, a submodule is the way to go.
TLDR
Adding submodules:
- To add submodules to a project always use `git submodule add` rather than simply cloning the repo.
- Submodules generally point to specific commits in the external repository, not branches.
- To add a submodule that tracks a specific branch, use `git submodule add -b <branch> <repository> <path>`
Updating submodules:
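- To pull your codebase and any submodules at the same time, use `git pull --recurse-submodules`. This checks out the commit recorded by the outer project in each submodule on a detached HEAD, so local submodule commits are left behind rather than merged in.
- To pull only submodules use `git submodule update --remote`. This also checks out a detached HEAD, this time at the latest commit on each submodule’s remote branch, and again leaves local commits behind.
- To pull only submodules, keeping local changes use `git submodule update --remote --merge`. This command:
  - Fetches the latest commits from the submodule’s remote.
  - Merges those changes into your local submodule.
  - Moves the submodule to the merged commit. Stage and commit the submodule path in your parent repo to record the new pointer.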