diff options
Diffstat (limited to 'Documentation/technical/partial-clone.txt')
-rw-r--r-- | Documentation/technical/partial-clone.txt | 46 |
1 files changed, 19 insertions, 27 deletions
diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt index 210373e..cd948b0 100644 --- a/Documentation/technical/partial-clone.txt +++ b/Documentation/technical/partial-clone.txt @@ -3,7 +3,7 @@ Partial Clone Design Notes The "Partial Clone" feature is a performance optimization for Git that allows Git to function without having a complete copy of the repository. -The goal of this work is to allow Git better handle extremely large +The goal of this work is to allow Git to better handle extremely large repositories. During clone and fetch operations, Git downloads the complete contents @@ -32,7 +32,7 @@ if/when needed. A remote that can later provide the missing objects is called a promisor remote, as it promises to send the objects when -requested. Initialy Git supported only one promisor remote, the origin +requested. Initially Git supported only one promisor remote, the origin remote from which the user cloned and that was configured in the "extensions.partialClone" config option. Later support for more than one promisor remote has been implemented. @@ -79,7 +79,7 @@ Design Details upload-pack negotiation. + This uses the existing capability discovery mechanism. -See "filter" in Documentation/technical/pack-protocol.txt. +See "filter" in linkgit:gitprotocol-pack[5]. - Clients pass a "filter-spec" to clone and fetch which is passed to the server to request filtering during packfile construction. @@ -171,23 +171,19 @@ additional flag. Fetching Missing Objects ------------------------ -- Fetching of objects is done using the existing transport mechanism using - transport_fetch_refs(), setting a new transport option - TRANS_OPT_NO_DEPENDENTS to indicate that only the objects themselves are - desired, not any object that they refer to. -+ -Because some transports invoke fetch_pack() in the same process, fetch_pack() -has been updated to not use any object flags when the corresponding argument -(no_dependents) is set. +- Fetching of objects is done by invoking a "git fetch" subprocess. - The local repository sends a request with the hashes of all requested - objects as "want" lines, and does not perform any packfile negotiation. + objects, and does not perform any packfile negotiation. It then receives a packfile. -- Because we are reusing the existing fetch-pack mechanism, fetching +- Because we are reusing the existing fetch mechanism, fetching currently fetches all objects referred to by the requested objects, even though they are not necessary. +- Fetching with `--refetch` will request a complete new filtered packfile from + the remote, which can be used to change a filter without needing to + dynamically fetch missing objects. Using many promisor remotes --------------------------- @@ -249,8 +245,7 @@ remote in a specific order. repository and can satisfy all such requests. - Repack essentially treats promisor and non-promisor packfiles as 2 - distinct partitions and does not mix them. Repack currently only works - on non-promisor packfiles and loose objects. + distinct partitions and does not mix them. - Dynamic object fetching invokes fetch-pack once *for each item* because most algorithms stumble upon a missing object and need to have @@ -261,7 +256,7 @@ remote in a specific order. - Dynamic object fetching currently uses the existing pack protocol V0 which means that each object is requested via fetch-pack. The server will send a full set of info/refs when the connection is established. - If there are large number of refs, this may incur significant overhead. + If there are a large number of refs, this may incur significant overhead. Future Work @@ -270,7 +265,7 @@ Future Work - Improve the way to specify the order in which promisor remotes are tried. + -For example this could allow to specify explicitly something like: +For example this could allow specifying explicitly something like: "When fetching from this remote, I want to use these promisor remotes in this order, though, when pushing or fetching to that remote, I want to use those promisor remotes in that order." @@ -280,9 +275,6 @@ to use those promisor remotes in that order." The user might want to work in a triangular work flow with multiple promisor remotes that each have an incomplete view of the repository. -- Allow repack to work on promisor packfiles (while keeping them distinct - from non-promisor packfiles). - - Allow non-pathname-based filters to make use of packfile bitmaps (when present). This was just an omission during the initial implementation. @@ -330,7 +322,7 @@ Footnotes [a] expensive-to-modify list of missing objects: Earlier in the design of partial clone we discussed the need for a single list of missing objects. - This would essentially be a sorted linear list of OIDs that the were + This would essentially be a sorted linear list of OIDs that were omitted by the server during a clone or subsequent fetches. This file would need to be loaded into memory on every object lookup. @@ -350,26 +342,26 @@ Related Links [0] https://crbug.com/git/2 Bug#2: Partial Clone -[1] https://public-inbox.org/git/20170113155253.1644-1-benpeart@microsoft.com/ + +[1] https://lore.kernel.org/git/20170113155253.1644-1-benpeart@microsoft.com/ + Subject: [RFC] Add support for downloading blobs on demand + Date: Fri, 13 Jan 2017 10:52:53 -0500 -[2] https://public-inbox.org/git/cover.1506714999.git.jonathantanmy@google.com/ + +[2] https://lore.kernel.org/git/cover.1506714999.git.jonathantanmy@google.com/ + Subject: [PATCH 00/18] Partial clone (from clone to lazy fetch in 18 patches) + Date: Fri, 29 Sep 2017 13:11:36 -0700 -[3] https://public-inbox.org/git/20170426221346.25337-1-jonathantanmy@google.com/ + +[3] https://lore.kernel.org/git/20170426221346.25337-1-jonathantanmy@google.com/ + Subject: Proposal for missing blob support in Git repos + Date: Wed, 26 Apr 2017 15:13:46 -0700 -[4] https://public-inbox.org/git/1488999039-37631-1-git-send-email-git@jeffhostetler.com/ + +[4] https://lore.kernel.org/git/1488999039-37631-1-git-send-email-git@jeffhostetler.com/ + Subject: [PATCH 00/10] RFC Partial Clone and Fetch + Date: Wed, 8 Mar 2017 18:50:29 +0000 -[5] https://public-inbox.org/git/20170505152802.6724-1-benpeart@microsoft.com/ + +[5] https://lore.kernel.org/git/20170505152802.6724-1-benpeart@microsoft.com/ + Subject: [PATCH v7 00/10] refactor the filter process code into a reusable module + Date: Fri, 5 May 2017 11:27:52 -0400 -[6] https://public-inbox.org/git/20170714132651.170708-1-benpeart@microsoft.com/ + +[6] https://lore.kernel.org/git/20170714132651.170708-1-benpeart@microsoft.com/ + Subject: [RFC/PATCH v2 0/1] Add support for downloading blobs on demand + Date: Fri, 14 Jul 2017 09:26:50 -0400 |