From 654ad400c27653a138b028cdd35a09b0c69c7ef0 Mon Sep 17 00:00:00 2001 From: Michael Haggerty Date: Thu, 24 May 2012 14:16:50 +0200 Subject: Avoid sorting if references are added to ref_cache in order The old code allowed many references to be efficiently added to a single directory, because it just appended the references to the containing directory unsorted without doing any searching (and therefore without requiring any intermediate sorting). But the old code was inefficient when a large number of subdirectories were added to a directory, because the directory always had to be searched to see if the new subdirectory already existed, and this search required the directory to be sorted first. The same was repeated for every new subdirectory, so the time scaled like O(N^2), where N is the number of subdirectories within a single directory. In practice, references are often added to the ref_cache in lexicographic order, for example when reading the packed-refs file. So build some intelligence into add_entry_to_dir() to optimize for the case of references and/or subdirectories being added in lexicographic order: if the existing entries were already sorted, and the new entry comes after the last existing entry, then adjust ref_dir::sorted to reflect the fact that the ref_dir is still sorted. Thanks to Peff for pointing out the performance regression that inspired this change. Signed-off-by: Michael Haggerty Signed-off-by: Junio C Hamano diff --git a/refs.c b/refs.c index 09322fe..98f6425 100644 --- a/refs.c +++ b/refs.c @@ -208,6 +208,12 @@ static void add_entry_to_dir(struct ref_dir *dir, struct ref_entry *entry) { ALLOC_GROW(dir->entries, dir->nr + 1, dir->alloc); dir->entries[dir->nr++] = entry; + /* optimize for the case that entries are added in order */ + if (dir->nr == 1 || + (dir->nr == dir->sorted + 1 && + strcmp(dir->entries[dir->nr - 2]->name, + dir->entries[dir->nr - 1]->name) < 0)) + dir->sorted = dir->nr; } /* -- cgit v0.10.2-6-g49f6