Using Git Submodules for Private Content
My website has been open source for as long as it has existed. Originally, it was a WordPress site, but only the layout was out there for everyone to see since the data was saved in a database. Once I moved to Gatsby, I kept all the images and posts in a
content directory. This was way better, as my content is all conveniently stored in one easy-to-save folder and the posts are all in beautiful markdown.
However, people often like my layout and want to use it, so they clone and deploy this site. Sometimes they will just leave up all the posts and images and update the name and image. Although I subscribe to the Zenhabits Uncopyright philosophy towards content - my content is out there for the world to see and do what they want with it, and it doesn't bother me - I don't think I should make it quite so easy to just clone everything I've written in a moment. If you're going to plagiarize, you should at least have to do a bit of work.
So I decided to store my content in a private git submodule. If you go to the repo for this site now, you'll see a folder that looks like
content @ <hash>. If you click on it, you'll be taken to a 404 page. If I click on it, I'll be taken to a separate, private repo that contains all my images and posts.
A lot of people have asked me how to use private git submodules, so I'll go over it here. Note that this is not a deep-dive into submodules, but just the basics of adding, updating, and cloning a repo with submodules.
Git submodules allow you to keep a git repository as a subdirectory of another git repository.
This could be useful if you have a lot of projects within a project. One example of this is the Dracula code theme repo. Every folder is a git submodule. This allows people to add a new theme for a new program by creating their own repo, and the owner of the parent repository only needs to reference the child repos. You can tell they're all submodules because of the
@ <hash> after each subdirectory name.
Before doing anything with submodules, I would recommend running this command to set submodule.recurse to true, which makes commands such as git pull and git checkout update submodules automatically. Note that git clone ignores this setting, so you still need to pass --recurse-submodules when cloning.
git config --global submodule.recurse true
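If you would rather not touch your global config, the same setting can be applied to a single repository. Here is a minimal sketch, assuming a throwaway repo in a temp directory (the `demo` variable is just a name made up for this example):

```shell
set -eu

# Throwaway repo for the demo; nothing outside this temp directory is touched
demo="$(mktemp -d)"
git init -q "$demo"

# Without --global, the setting is written to this repo's own .git/config
git -C "$demo" config submodule.recurse true

# Read the value back to confirm it was stored
git -C "$demo" config --get submodule.recurse
```

A repo-level value takes precedence over the global one, so this is also a way to opt a single project in or out.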
||Add a submodule within a repository|
||Update existing submodules within a repository (add
||Initialize local submodules file (only necessary if repo not cloned with
Let's imagine that you want a public blog, located on the
blog repo, to contain a submodule with all the posts, located in the
posts repo. So it will look like:
- A public repo at
- A private repo at
I'm just using GitHub as an example; it doesn't matter where the repo is hosted. Git submodules can be used for both private and public repos.
First you can
add the submodule. From the root of
blog, you would run this command.
git submodule add https://github.com/you/posts
This would clone the
posts repo into a folder in
Cloning into '/Users/you/blog/posts'...
You will now have two new entries in the
blog repo, a
.gitmodules file, and the new
Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   .gitmodules
        new file:   posts
.gitmodules will look like this:
[submodule "posts"]
    path = posts
    url = https://github.com/you/posts
At this point, you have a reference to the
posts repo as a submodule, so the directory structure will look like this:
.git
.gitmodules
posts/
As a note, if you
.git, you'll see a
modules directory. This will contain a folder called
posts, and this is where git is storing references and other data about your submodules.
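To try the add step without touching a real project, here is a rough sketch that uses two throwaway local repos in place of the GitHub ones. The temp directory, the demo identity, and the file names are all assumptions made for this example. The `-c protocol.file.allow=always` option is only there because the "remote" is a local path, which newer Git versions refuse for submodules by default; you shouldn't need it with an https or SSH URL.

```shell
set -eu

work="$(mktemp -d)"
cd "$work"

# Identity used only for the throwaway commit below
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the private posts repo
git init -q posts
echo "# Hello" > posts/hello.md
git -C posts add hello.md
git -C posts commit -q -m "First post"

# Stand-in for the public blog repo
git init -q blog

# Add posts as a submodule of blog, using the local path as its URL
git -C blog -c protocol.file.allow=always submodule add "$work/posts" posts

# The two new entries: the .gitmodules file and the posts gitlink
cat blog/.gitmodules
git -C blog ls-files --stage posts   # mode 160000 marks a submodule entry

# The submodule's own git data lives under the parent's .git/modules
ls blog/.git/modules
```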
To update submodule content, you'll pull in any changes made to the remote submodule repo with the
update command. Since you would be updating content from a remote location, you'll add the
--remote flag. From the root of the
blog repo, you would run the command:
git submodule update --remote
It's important to note that when working with submodules, you shouldn't work on or commit your local version of the submodule repo. If you made any changes locally, your version would now be out-of-sync with the submodule repo.
You just want to treat a submodule as an entirely separate repo, but linked. This is much like code found in node_modules for a Node project, where the references to the projects are listed in package.json and you know any local changes you make to a dependency in node_modules will not be persisted.
If you make changes locally and run a git status, you will see modified content next to the modified submodule.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
  (commit or discard the untracked or modified content in submodules)
        modified:   posts (modified content)
If you make changes to the submodule and bring those commits in properly and run a git status, you will see new commits next to the modified submodule.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   posts (new commits)
If you see
(new commits), you can commit those changes on the parent repo. You can also check this by viewing the diff.
diff --git a/posts b/posts
index abc..def 160000
--- a/posts
+++ b/posts
@@ -1 +1 @@
-Subproject commit abc...
+Subproject commit def...
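The whole update cycle can be rehearsed locally. The sketch below is not my actual setup, just an illustration with throwaway repos: the temp directory, demo identity, and file names are invented, and `-c protocol.file.allow=always` is only needed because the submodule URL is a local path.

```shell
set -eu

work="$(mktemp -d)"
cd "$work"

# Identity used only for the throwaway commits below
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Set up posts, then blog with posts as a submodule
git init -q posts
echo "one" > posts/one.md
git -C posts add one.md
git -C posts commit -q -m "First post"

git init -q blog
git -C blog -c protocol.file.allow=always submodule add "$work/posts" posts
git -C blog commit -q -m "Add posts submodule"

# A new post lands in the submodule's own repo
echo "two" > posts/two.md
git -C posts add two.md
git -C posts commit -q -m "Second post"

# Pull the new commit into the parent's copy of the submodule
git -C blog -c protocol.file.allow=always submodule update --remote

# The parent now sees posts as modified with new commits
git -C blog status --short
git -C blog diff

# Record the new submodule commit in the parent repo
git -C blog add posts
git -C blog commit -q -m "Update posts submodule"
```

The parent repo only stores the commit hash that the submodule points at, which is why the final commit is needed even though no files in blog itself changed.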
If you clone an existing repository and it has submodules within it, you'll have to
update to pull in all the submodule content.
git clone https://github.com/you/blog
cd blog && git submodule init && git submodule update
You can skip the second step by cloning with the --recurse-submodules flag (the submodule.recurse setting does not apply to git clone):
git clone --recurse-submodules https://github.com/you/blog
This will clone the directory along with all submodule content.
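To see the difference between the two ways of cloning, here is a small sketch with throwaway local repos. The directory names `plain` and `full`, the temp directory, and the demo identity are all made up for this example, and `-c protocol.file.allow=always` is only needed because the submodule URL is a local path. It uses `git submodule update --init`, which combines the init and update steps into one command.

```shell
set -eu

work="$(mktemp -d)"
cd "$work"

# Identity used only for the throwaway commits below
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Set up posts, then blog with posts as a submodule
git init -q posts
echo "one" > posts/one.md
git -C posts add one.md
git -C posts commit -q -m "First post"

git init -q blog
git -C blog -c protocol.file.allow=always submodule add "$work/posts" posts
git -C blog commit -q -m "Add posts submodule"

# Plain clone: the posts directory exists but has nothing in it yet
git clone -q "$work/blog" plain
ls -A plain/posts

# Initialize and fetch the submodule in one step
git -C plain -c protocol.file.allow=always submodule update --init
ls -A plain/posts

# Recursive clone: the submodule content arrives with the clone itself
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/blog" full
ls -A full/posts
```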
For my site, I've made the
content submodule private. If you have a Netlify site and want to know what to do to allow Netlify to pull from the private repo, here is an overview of the steps.
- Generate a deploy key from Netlify.
- Add the key as a read-only deploy key on the settings for your private repo (found at
- Netlify will now have permissions to fetch the submodules that it reads from your
Here are the main points from the article:
- Submodules are used when a subdirectory in a repo should consist of all the data from another repo.
- You can add a submodule to a project with
git submodule add <submodule-repo>.
- You can update submodules within a project with
git submodule update --remote.
- You should clone a project that has submodules with --recurse-submodules, and set submodule.recurse in your config so that commands such as git pull update submodules by default.
- You should not work on any submodule files directly within the parent repo. The submodule directory should be treated only as a reference to another existing repo.
My current process for updating the site looks like:
- Make changes to
- Commit changes to
content and push to the private submodule repo hosted on GitHub:
git commit && git push.
- Pull new updates into local
git submodule update --remote.
- Commit the new submodule changes and push to public GitHub repo:
git commit && git push.
- Netlify deploys the new site.