summaryrefslogtreecommitdiff
path: root/commit-graph.h
diff options
context:
space:
mode:
authorPatrick Steinhardt <ps@pks.im>2021-08-09 08:12:03 (GMT)
committerJunio C Hamano <gitster@pobox.com>2021-08-09 16:51:12 (GMT)
commitf559d6d45e7e58ae1f922213948723de77ea77bd (patch)
tree5db9c2b4540075ffb8f27aaab1f5d4ba489ca9e8 /commit-graph.h
parent809ea28f809e52d3204b597637b2f5e072c140f8 (diff)
downloadgit-f559d6d45e7e58ae1f922213948723de77ea77bd.zip
git-f559d6d45e7e58ae1f922213948723de77ea77bd.tar.gz
git-f559d6d45e7e58ae1f922213948723de77ea77bd.tar.bz2
revision: avoid hitting packfiles when commits are in commit-graph
When queueing references in git-rev-list(1), we try to optimize parsing of commits via the commit-graph. To do so, we first look up the object's type, and if it is a commit we call `repo_parse_commit()` instead of `parse_object()`. This is quite inefficient though given that we're always uncompressing the object header in order to determine the type. Instead, we can opportunistically search the commit-graph for the object ID: in case it's found, we know it's a commit and can directly fill in the commit object without having to uncompress the object header. Expose a new function `lookup_commit_in_graph()`, which tries to find a commit in the commit-graph by ID, and convert `get_reference()` to use this function. This provides a big performance win in cases where we load references in a repository with lots of references pointing to commits. The following has been executed in a real-world repository with about 2.2 million refs: Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev Time (mean ± σ): 4.458 s ± 0.044 s [User: 4.115 s, System: 0.342 s] Range (min … max): 4.409 s … 4.534 s 10 runs Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev Time (mean ± σ): 3.089 s ± 0.015 s [User: 2.768 s, System: 0.321 s] Range (min … max): 3.061 s … 3.105 s 10 runs Summary 'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran 1.44 ± 0.02 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'commit-graph.h')
-rw-r--r--commit-graph.h8
1 files changed, 8 insertions, 0 deletions
diff --git a/commit-graph.h b/commit-graph.h
index 96c24fb..04a94e1 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -41,6 +41,14 @@ int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
int parse_commit_in_graph(struct repository *r, struct commit *item);
/*
+ * Look up the given commit ID in the commit-graph. This will only return a
+ * commit if the ID exists both in the graph and in the object database such
+ * that we don't return commits whose object has been pruned. Otherwise, this
+ * function returns `NULL`.
+ */
+struct commit *lookup_commit_in_graph(struct repository *repo, const struct object_id *id);
+
+/*
* It is possible that we loaded commit contents from the commit buffer,
* but we also want to ensure the commit-graph content is correctly
* checked and filled. Fill the graph_pos and generation members of