How do I clone a subdirectory only of a Git repository?

How to Clone a Subdirectory Only of a Git Repository

Learn how to check out only a specific subdirectory of a large Git repository by combining a shallow clone with Git's sparse-checkout feature, which cuts download time and keeps your working tree small.

Cloning an entire Git repository, especially a large one, can be time-consuming and resource-intensive if you only need a small portion of its content. This article walks through cloning only a subdirectory: a shallow clone avoids downloading the full repository history, and sparse-checkout limits the working tree to the files you actually need.

Understanding Sparse Checkout

Git's sparse-checkout feature lets you define a working directory that contains only a specified subset of the repository's files and directories. The repository still tracks every path; sparse-checkout only changes which files are written to your working tree. This is particularly useful for monorepos or repositories with many unrelated projects, where you only need to interact with a specific component.

[Figure: a full repository containing project-a, project-b, and docs (left) and a sparse checkout derived from it that contains only project-b (right).]

Sparse checkout allows you to selectively include directories from a large repository.
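To make the idea concrete, here is a minimal sketch that builds a throwaway repository with the three directories from the figure and narrows its working tree to project-b. The file names, the temporary directory, and the Demo identity are all made up for illustration, and the script assumes a Git version that ships the sparse-checkout command (2.25 or newer).

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir project-a project-b docs
echo 'a' > project-a/a.txt
echo 'b' > project-b/b.txt
echo 'd' > docs/index.md
git add .
git -c user.name=Demo -c user.email=demo@example.com commit -q -m 'Add three directories'

# Restrict the working tree to project-b.
git sparse-checkout init --cone
git sparse-checkout set project-b

# Only the selected directory remains on disk...
sparse_view=$(ls)
echo "Working tree: $sparse_view"

# ...while the commit itself still records all three directories.
echo "Tracked in HEAD:"
git ls-tree --name-only HEAD

# Turning sparse-checkout off restores the full working tree.
git sparse-checkout disable
```

The point of the sketch is that sparse-checkout changes what is on disk, not what the repository tracks, so the full tree can be restored at any time.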

Step-by-Step Guide to Cloning a Subdirectory

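To clone a specific subdirectory, create a shallow clone without checking out any files, enable sparse-checkout, and then specify the subdirectory path. The shallow clone skips older history, and sparse-checkout keeps unneeded directories out of your working tree. Note that a clone made with --depth 1 still downloads the contents of every file in the latest commit; sparse-checkout controls only which of those files appear on disk.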

1. Create a shallow clone without checking out any files. The --depth 1 option downloads only the latest commit, and --no-checkout leaves the working tree empty for now.

2. Navigate into the newly created repository directory.

3. Enable sparse-checkout in cone mode. This tells Git to populate only the directories you list, plus the files in the repository root, which cone mode always includes.

4. Specify the subdirectory (or subdirectories) you wish to include in your working copy.

5. Run git checkout to populate the working tree. Git writes only the specified subdirectory and any top-level files to disk.

git clone --no-checkout --depth 1 <repository-url> <repo-name>
cd <repo-name>
git sparse-checkout init --cone
git sparse-checkout set <subdirectory-path>
git checkout

Commands to perform a sparse checkout of a subdirectory.
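If you also want to reduce how much file content is downloaded, one option is to combine the steps above with a partial clone. The sketch below wraps them in a hypothetical shell function named sparse_clone (it is not a Git command) and adds --filter=blob:none, which asks the server to hold back file contents until checkout needs them. It assumes Git 2.25 or newer and a server that supports partial clone; when the server does not, Git is expected to warn and fall back to an ordinary shallow clone.

```shell
#!/bin/sh
# sparse_clone <repository-url> <target-dir> <subdirectory>...
# Hypothetical helper: shallow, blob-less clone limited to the given directories.
sparse_clone() {
    if [ "$#" -lt 3 ]; then
        echo "usage: sparse_clone <repository-url> <target-dir> <subdirectory>..." >&2
        return 2
    fi
    url=$1
    target=$2
    shift 2   # the remaining arguments are the directories to include

    # --depth 1 limits history to the latest commit;
    # --filter=blob:none requests that file contents be fetched on demand.
    git clone --no-checkout --depth 1 --filter=blob:none "$url" "$target" || return 1

    # Run the remaining steps in a subshell so the caller's directory is unchanged.
    (
        cd "$target" || exit 1
        git sparse-checkout init --cone &&
        git sparse-checkout set "$@" &&
        git checkout
    )
}
```

With the function defined, a call such as sparse_clone https://github.com/user/my-monorepo.git my-monorepo-docs docs would perform the same five steps in one go.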

Example: Cloning Only the docs Subdirectory

Let's assume you have a repository my-monorepo with the structure: my-monorepo/src/, my-monorepo/docs/, my-monorepo/tests/. You only need the docs directory.

git clone --no-checkout --depth 1 https://github.com/user/my-monorepo.git my-monorepo-docs
cd my-monorepo-docs
git sparse-checkout init --cone
git sparse-checkout set docs/
git checkout

Cloning only the docs/ subdirectory from my-monorepo.
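The GitHub URL above is a placeholder, so the following sketch reproduces the example end to end against a local stand-in repository that you can safely experiment with. The directory layout mirrors my-monorepo, but the file names, the README.md in the root, and the Demo identity are assumptions added for illustration.

```shell
#!/bin/sh
set -eu

# Scratch area; every path and name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Build a tiny stand-in for my-monorepo with src/, docs/, and tests/.
git init -q origin
mkdir origin/src origin/docs origin/tests
echo 'int main(void) { return 0; }' > origin/src/main.c
echo '# Guide' > origin/docs/guide.md
echo 'placeholder' > origin/tests/test.txt
echo 'Top-level readme' > origin/README.md
git -C origin add .
git -C origin -c user.name=Demo -c user.email=demo@example.com commit -q -m 'Initial commit'

# Same steps as in the example, pointed at the local repository.
# A file:// URL is used because Git ignores --depth for plain local paths.
git clone -q --no-checkout --depth 1 "file://$work/origin" my-monorepo-docs
cd my-monorepo-docs
git sparse-checkout init --cone
git sparse-checkout set docs
git checkout -q

# Inspect the result.
git sparse-checkout list   # directories currently included
ls                         # top-level entries in the working tree
```

After the script runs, the working tree should contain docs/ along with README.md, since cone mode keeps files in the repository root, while src/ and tests/ are absent. If you later need another directory, git sparse-checkout add tests extends the selection without re-cloning.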