Add "--filter" flag to enable git partial clone #982
k8s-ci-robot merged 3 commits into kubernetes:master
Welcome @kane8n!
single commit. Setting this to 0 will sync the full history of the
repo.

--filter <string>, $GITSYNC_FILTER
Please keep these flags alphabetically sorted
Use partial clone with the specified filter. This can reduce
the amount of data transferred when cloning large repositories.
Common values are 'blob:none' (omit all blobs, fetch on demand)
and 'tree:0' (omit all trees and blobs). This is most effective
Can you reference git fetch docs so users can learn about the syntax? E.g. See docs for "git fetch --filter" for more information.
single commit. Setting this to 0 will sync the full history of the
repo.

--filter <string>, $GITSYNC_FILTER
You also need to update README.md, which has these docs embedded (you can delete the last section, then just run `git-sync --man >> README.md`)
…reference, and regenerate README.md
@thockin
PTAL when you get a chance.

Thanks! /lgtm
e2e fails for |
# Test filter (partial clone) with blob:none
##############################################
function e2e::filter_partial_clone_blob_none() {
    echo "${FUNCNAME[0]}" > "$REPO/file"
Tests begin with the funcname already in the file, which makes this redundant
    assert_file_eq "$ROOT/link/file" "${FUNCNAME[0]}"

    # Verify the repo is a partial clone
    if ! git -C "$ROOT/link" config --get remote.origin.promisor >/dev/null 2>&1; then
Can you explain in comments what this block is doing? It doesn't seem to work - remote.origin.partialclonefilter fails (which you are hiding with || true)
    echo "!/*" > "$WORK/sparseconfig"
    echo "!/*/" >> "$WORK/sparseconfig"
    echo "file2" >> "$WORK/sparseconfig"
    echo "${FUNCNAME[0]}" > "$REPO/file"
…one verification
- Remove redundant `echo "${FUNCNAME[0]}" > "$REPO/file"` in
filter_partial_clone_blob_none and filter_with_sparse_checkout;
init_repo already initializes $REPO/file with the funcname.
- Replace the silently-failing `remote.origin.partialclonefilter`
check (git-sync fetches by URL, not via the "origin" remote, so
the config keys are never set) with a check for the `.promisor`
pack marker that `git fetch --filter` produces.
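
The `.promisor` pack-marker check described above can be sketched as a small shell helper. This is a minimal sketch, not the test code from this PR: the function name `has_promisor_pack` and its single argument are hypothetical, and it assumes the caller passes the directory that contains the repository's `objects/` folder.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not taken from the PR): succeed if the git directory
# given as $1 holds at least one pack with a ".promisor" marker file, which
# git writes next to packs received from a filtered (partial) fetch.
has_promisor_pack() {
    local gitdir="$1"
    local marker
    for marker in "$gitdir"/objects/pack/*.promisor; do
        # When the glob matches nothing it stays literal, so -e fails
        # and the loop falls through to the "not found" return.
        if [[ -e "$marker" ]]; then
            return 0
        fi
    done
    return 1
}
```

Unlike a `git config --get remote.origin.*` lookup, this inspects what the fetch actually left on disk, so it does not depend on which remote name (if any) was used for the fetch.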
@thockin

Thanks! /lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kane8n, thockin
Summary

Closes #981

- Add `--filter` flag (`$GITSYNC_FILTER`) to support git partial clone
- Pass `--filter` to `git fetch`, enabling filters like `blob:none` or `tree:0`
- Combine with `--depth` and `--sparse-checkout-file` to minimize data transfer at every stage (commits, trees, and blobs)

Details

For large monorepos where users only need a subset of files, combining `--filter=blob:none` + `--depth` + `--sparse-checkout-file` ensures:

- `--depth` limits commit and tree objects
- `--filter=blob:none` eliminates blob transfer at fetch time
- `sparse-checkout` fetches only needed blobs on demand at checkout
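
To make the combination concrete, here is a rough equivalent using plain git commands rather than git-sync itself. It is only a sketch under stated assumptions: the helper name `partial_sparse_clone` is hypothetical, it uses `git clone` and `git sparse-checkout` (git 2.25 or newer) instead of git-sync's own fetch logic, and it assumes the server permits filtered fetches.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of git-sync.
# usage: partial_sparse_clone <url> <dest> <dir>...
partial_sparse_clone() {
    local url="$1" dest="$2"
    shift 2
    # --depth=1           fetch only the tip commit
    # --filter=blob:none  transfer no blobs up front (partial clone)
    # --sparse            start with only top-level files checked out
    git clone --quiet --depth=1 --filter=blob:none --sparse "$url" "$dest" || return 1
    # Widen the sparse checkout to the requested directories; blobs for
    # these paths are fetched on demand during this step.
    git -C "$dest" sparse-checkout set "$@"
}
```

A call such as `partial_sparse_clone "$url" ./work docs` would leave only top-level files and the `docs` directory in the working tree, while the other blobs stay on the server.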