How do I clone a subdirectory only of a Git repository?
How to Clone a Subdirectory Only of a Git Repository
Learn how to efficiently clone only a specific subdirectory from a large Git repository, saving bandwidth and disk space using Git's sparse-checkout feature.
Cloning an entire Git repository, especially a large one, can be time-consuming and resource-intensive if you only need a small portion of its content. This article walks through checking out only a subdirectory by combining a shallow clone with Git's sparse-checkout feature, so you can work with a subset of files without downloading the entire repository history.
Understanding Sparse Checkout
Git's sparse-checkout is a powerful feature that allows you to define a working directory that does not contain all the files in the repository. Instead, it includes only a specified subset of files and directories. This is particularly useful for monorepos or repositories with many unrelated projects, where you only need to interact with a specific component.
Sparse checkout allows you to selectively include directories from a large repository.
Step-by-Step Guide to Cloning a Subdirectory
To clone a specific subdirectory, you'll need to initialize a shallow clone, enable sparse-checkout, and then specify the subdirectory path. The shallow clone avoids downloading older history, and sparse-checkout keeps files you don't need out of your working directory.
1. Create a shallow clone of the repository without checking out any files. This downloads only the latest commit, saving time.
2. Navigate into the newly created repository directory.
3. Enable sparse-checkout for your repository. This tells Git to populate only files matching your patterns.
4. Specify the subdirectory (or subdirectories) you wish to include in your working copy.
5. Check out the repository content. Git will now populate only the specified subdirectory.
git clone --no-checkout --depth 1 <repository-url> <repo-name>
cd <repo-name>
git sparse-checkout init --cone
git sparse-checkout set <subdirectory-path>
git checkout
Commands to perform a sparse checkout of a subdirectory.
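The five commands above can be wrapped in a small shell function so the sequence is repeatable. This is a minimal sketch rather than anything Git provides: the name sparse_clone and its three positional arguments are assumptions made for illustration.

```shell
# sparse_clone is a hypothetical helper name, not a Git command.
# Usage: sparse_clone <repository-url> <target-dir> <subdirectory-path>
sparse_clone() {
    url=$1
    target=$2
    subdir=$3

    # Shallow clone without populating the working directory.
    git clone --no-checkout --depth 1 "$url" "$target" &&
    (
        # The subshell keeps the caller's current directory unchanged.
        cd "$target" &&
        git sparse-checkout init --cone &&
        git sparse-checkout set "$subdir" &&
        git checkout
    )
}
```

Calling sparse_clone <repository-url> <repo-name> <subdirectory-path> runs the same steps in order and stops at the first command that fails.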
The --cone mode for sparse-checkout simplifies pattern definition. It includes every file under the directories you specify, along with the files in the repository root. If you need more granular control (e.g., specific files within a directory), you might need to use the legacy sparse-checkout mode without --cone.
Example: Cloning Only the docs Subdirectory
Let's assume you have a repository my-monorepo with the structure: my-monorepo/src/, my-monorepo/docs/, my-monorepo/tests/. You only need the docs directory.
git clone --no-checkout --depth 1 https://github.com/user/my-monorepo.git my-monorepo-docs
cd my-monorepo-docs
git sparse-checkout init --cone
git sparse-checkout set docs/
git checkout
Cloning only the docs/ subdirectory from my-monorepo.
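As a variation on this example, suppose only the Markdown files directly inside docs/ were wanted. Cone mode works on whole directories, so the sketch below switches to non-cone mode and passes a gitignore-style pattern instead. The function name clone_docs_markdown and the pattern /docs/*.md are assumptions chosen for illustration, and the exact behavior of non-cone mode may vary between Git versions.

```shell
# clone_docs_markdown is a hypothetical helper name, not a Git command.
# Usage: clone_docs_markdown <repository-url> <target-dir>
clone_docs_markdown() {
    url=$1
    target=$2

    git clone --no-checkout --depth 1 "$url" "$target" &&
    (
        cd "$target" &&
        # Non-cone mode accepts gitignore-style patterns instead of directories.
        git sparse-checkout init --no-cone &&
        # Leading slash anchors the pattern at the repository root;
        # '*' does not cross '/', so nested directories under docs/ are skipped.
        git sparse-checkout set '/docs/*.md' &&
        git checkout
    )
}
```

The pattern is quoted so the shell passes it to Git unexpanded.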
After you enable sparse-checkout, commands like git status and git add will only operate on the files within your sparse checkout definition. Be mindful that files outside this definition are still part of the repository's history but are not present in your working directory.
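To see that difference in a given clone, it can help to compare what the commit records with what is actually on disk. The following is a rough sketch: sparse_report is a hypothetical helper name, and it assumes the path passed in is a repository with sparse-checkout already enabled.

```shell
# sparse_report is a hypothetical helper name, not a Git command.
# Usage: sparse_report <path-to-repository>
sparse_report() {
    repo=$1

    echo "Sparse-checkout definition:"
    git -C "$repo" sparse-checkout list

    echo "Paths recorded in HEAD (the whole repository):"
    git -C "$repo" ls-tree -r --name-only HEAD

    echo "Files present in the working directory:"
    # Skip Git's own metadata directory while listing files.
    (cd "$repo" && find . -path ./.git -prune -o -type f -print)
}
```

Paths that appear in the second list but not the third are tracked by the repository yet excluded from the working directory by the sparse-checkout definition.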