summaryrefslogtreecommitdiff
path: root/diffcore-rename.c
AgeCommit message (Collapse)Author
2020-01-31sha1-file: pass git_hash_algo to hash_object_file()Matheus Tavares
Allow hash_object_file() to work on arbitrary repos by introducing a git_hash_algo parameter. Change callers which have a struct repository pointer in their scope to pass on the git_hash_algo from the said repo. For all other callers, pass on the_hash_algo, which was already being used internally at hash_object_file(). This functionality will be used in the following patch to make check_object_signature() be able to work on arbitrary repos (which, in turn, will be used to fix an inconsistency at object.c:parse_object()). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-15Merge branch 'ew/hashmap'Junio C Hamano
Code clean-up of the hashmap API, both users and implementation. * ew/hashmap: hashmap_entry: remove first member requirement from docs hashmap: remove type arg from hashmap_{get,put,remove}_entry OFFSETOF_VAR macro to simplify hashmap iterators hashmap: introduce hashmap_free_entries hashmap: hashmap_{put,remove} return hashmap_entry * hashmap: use *_entry APIs for iteration hashmap_cmp_fn takes hashmap_entry params hashmap_get{,_from_hash} return "struct hashmap_entry *" hashmap: use *_entry APIs to wrap container_of hashmap_get_next returns "struct hashmap_entry *" introduce container_of macro hashmap_put takes "struct hashmap_entry *" hashmap_remove takes "const struct hashmap_entry *" hashmap_get takes "const struct hashmap_entry *" hashmap_add takes "struct hashmap_entry *" hashmap_get_next takes "const struct hashmap_entry *" hashmap_entry_init takes "struct hashmap_entry *" packfile: use hashmap_entry in delta_base_cache_entry coccicheck: detect hashmap_entry.hash assignment diff: use hashmap_entry_init on moved_entry.ent
2019-10-07OFFSETOF_VAR macro to simplify hashmap iteratorsEric Wong
While we cannot rely on a `__typeof__' operator being portable to use with `offsetof'; we can calculate the pointer offset using an existing pointer and the address of a member using pointer arithmetic for compilers without `__typeof__'. This allows us to simplify usage of hashmap iterator macros by not having to specify a type when a pointer of that type is already given. In the future, list iterator macros (e.g. list_for_each_entry) may also be implemented using OFFSETOF_VAR to save hackers the trouble of using container_of/list_entry macros and without relying on non-portable `__typeof__'. v3: use `__typeof__' to avoid clang warnings Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: introduce hashmap_free_entriesEric Wong
`hashmap_free_entries' behaves like `container_of' and passes the offset of the hashmap_entry struct to the internal `hashmap_free_' function, allowing the function to free any struct pointer regardless of where the hashmap_entry field is located. `hashmap_free' no longer takes any arguments aside from the hashmap itself. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: use *_entry APIs to wrap container_ofEric Wong
Using `container_of' can be verbose and choosing names for intermediate "struct hashmap_entry" pointers is a hard problem. So introduce "*_entry" APIs inspired by similar linked-list APIs in the Linux kernel. Unfortunately, `__typeof__' is not portable C, so we need an extra parameter to specify the type. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_get_next returns "struct hashmap_entry *"Eric Wong
This is a step towards removing the requirement for hashmap_entry being the first field of a struct. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_add takes "struct hashmap_entry *"Eric Wong
This is less error-prone than "void *" as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_get_next takes "const struct hashmap_entry *"Eric Wong
This is less error-prone than "const void *" as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_entry_init takes "struct hashmap_entry *"Eric Wong
C compilers do type checking to make life easier for us. So rely on that and update all hashmap_entry_init callers to take "struct hashmap_entry *" to avoid future bugs while improving safety and readability. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-02diffcore_rename(): use a stable sortJohannes Schindelin
During Git's rename detection, the file names are sorted. At the moment, this job is performed by `qsort()`. As that function is not guaranteed to implement a stable sort algorithm, this can lead to inconsistent and/or surprising behavior: a rename might be detected differently depending on the platform where Git was run. The `qsort()` in MS Visual C's runtime does _not_ implement a stable sort algorithm, and it even leads to an inconsistency leading to a test failure in t3030.35 "merge-recursive remembers the names of all base trees": a different code path than on Linux is taken in the rename detection of an ambiguous rename between either `e` to `a` or `a~Temporary merge branch 2_0` to `a` during a recursive merge, unexpectedly resulting in a clean merge. Let's use the stable sort provided by `git_stable_qsort()` to avoid this inconsistency. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-09Merge branch 'jk/oidhash'Junio C Hamano
Code clean-up to remove hardcoded SHA-1 hash from many places. * jk/oidhash: hashmap: convert sha1hash() to oidhash() hash.h: move object_id definition from cache.h khash: rename oid helper functions khash: drop sha1-specific map types pack-bitmap: convert khash_sha1 maps into kh_oid_map delta-islands: convert island_marks khash to use oids khash: rename kh_oid_t to kh_oid_set khash: drop broken oid_map typedef object: convert create_object() to use object_id object: convert internal hash_obj() to object_id object: convert lookup_object() to use object_id object: convert lookup_unknown_object() to use object_id pack-objects: convert locate_object_entry_hash() to object_id pack-objects: convert packlist_find() to use object_id pack-bitmap-write: convert some helpers to use object_id upload-pack: rename a "sha1" variable to "oid" describe: fix accidental oid/hash type-punning
2019-06-20hashmap: convert sha1hash() to oidhash()Jeff King
There are no callers left of sha1hash() that do not simply pass the "hash" member of a "struct object_id". Let's get rid of the outdated sha1-specific function and provide one that operates on the whole struct (even though the technique, taking the first few bytes of the hash, will remain the same). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-13cleanup: fix possible overflow errors in binary search, part 2René Scharfe
Calculating the sum of two array indexes to find the midpoint between them can overflow, i.e. code like this is unsafe for big arrays: mid = (first + last) >> 1; Make sure the intermediate value stays within the boundaries instead, like this: mid = first + ((last - first) >> 1); The loop condition of the binary search makes sure that 'last' is always greater than 'first', so this is safe as long as 'first' is not negative. And that can be verified easily using the pre-context of each change, except for name-hash.c, so add an assertion to that effect there. The unsafe calculations were found with: git grep '(.*+.*) *>> *1' This is a continuation of 19716b21a4 (cleanup: fix possible overflow errors in binary search, 2017-10-08). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19Merge branch 'nd/the-index'Junio C Hamano
Various codepaths in the core-ish part learn to work on an arbitrary in-core index structure, not necessarily the default instance "the_index". * nd/the-index: (23 commits) revision.c: reduce implicit dependency the_repository revision.c: remove implicit dependency on the_index ws.c: remove implicit dependency on the_index tree-diff.c: remove implicit dependency on the_index submodule.c: remove implicit dependency on the_index line-range.c: remove implicit dependency on the_index userdiff.c: remove implicit dependency on the_index rerere.c: remove implicit dependency on the_index sha1-file.c: remove implicit dependency on the_index patch-ids.c: remove implicit dependency on the_index merge.c: remove implicit dependency on the_index merge-blobs.c: remove implicit dependency on the_index ll-merge.c: remove implicit dependency on the_index diff-lib.c: remove implicit dependency on the_index read-cache.c: remove implicit dependency on the_index diff.c: remove implicit dependency on the_index grep.c: remove implicit dependency on the_index diff.c: remove the_index dependency in textconv() functions blame.c: rename "repo" argument to "r" combine-diff.c: remove implicit dependency on the_index ...
2018-09-21diff.c: reduce implicit dependency on the_indexNguyễn Thái Ngọc Duy
diff and textconv code has so widespread use that it's hard to simply update their api and all call sites at once because it would result in a big patch. For now reduce the_index references to two places: diff_setup() and fill_textconv(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29convert "oidcmp() != 0" to "!oideq()"Jeff King
This is the flip side of the previous two patches: checking for a non-zero oidcmp() can be more strictly expressed as inequality. Like those patches, we write "!= 0" in the coccinelle transformation, which covers by isomorphism the more common: if (oidcmp(E1, E2)) As with the previous two patches, this patch can be achieved almost entirely by running "make coccicheck"; the only differences are manual line-wrap fixes to match the original code. There is one thing to note for anybody replicating this, though: coccinelle 1.0.4 seems to miss the case in builtin/tag.c, even though it's basically the same as all the others. Running with 1.0.7 does catch this, so presumably it's just a coccinelle bug that was fixed in the interim. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16object-store: move object access functions to object-store.hStefan Beller
This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-15Merge branch 'po/object-id'Junio C Hamano
Conversion from uchar[20] to struct object_id continues. * po/object-id: sha1_file: rename hash_sha1_file_literally sha1_file: convert write_loose_object to object_id sha1_file: convert force_object_loose to object_id sha1_file: convert write_sha1_file to object_id notes: convert write_notes_tree to object_id notes: convert combine_notes_* to object_id commit: convert commit_tree* to object_id match-trees: convert splice_tree to object_id cache: clear whole hash buffer with oidclr sha1_file: convert hash_sha1_file to object_id dir: convert struct sha1_stat to use object_id sha1_file: convert pretend_sha1_file to object_id
2018-01-30sha1_file: convert hash_sha1_file to object_idPatryk Obara
Convert the declaration and definition of hash_sha1_file to use struct object_id and adjust all function calls. Rename this function to hash_object_file. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-22Use MOVE_ARRAYSZEDER Gábor
Use the helper macro MOVE_ARRAY to move arrays. This is shorter and safer, as it automatically infers the size of elements. Patch generated by Coccinelle and contrib/coccinelle/array.cocci in Travis CI's static analysis build job. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-03diffcore-rename: make diff-tree -l0 mean -l<large>Jonathan Tan
In the documentation of diff-tree, it is stated that the -l option "prevents rename/copy detection from running if the number of rename/copy targets exceeds the specified number". The documentation does not mention any special handling for the number 0, but the implementation before commit 9f7e4bfa3b ("diff: remove silent clamp of renameLimit", 2017-11-13) treated 0 as a special value indicating that the rename limit is to be a very large number instead. The commit 9f7e4bfa3b changed that behavior, treating 0 as 0. Revert this behavior to what it was previously. This allows existing scripts and tools that use "-l0" to continue working. The alternative (to have "-l0" suppress rename detection) is probably much less useful, since users can just refrain from specifying -M and/or -C to have the same effect. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-15diff: remove silent clamp of renameLimitElijah Newren
In commit 0024a5492 (Fix the rename detection limit checking; 2007-09-14), the renameLimit was clamped to 32767. This appears to have been to simply avoid integer overflow in the following computation: num_create * num_src <= rename_limit * rename_limit although it also could be viewed as a hardcoded bound on the amount of CPU time we're willing to allow users to tell git to spend on handling renames. An upper bound may make sense, but unfortunately this upper bound was neither communicated to the users, nor documented anywhere. Although large limits can make things slow, we have users who would be ecstatic to have a small five file change be correctly cherry picked even if they have to manually specify a large limit and wait ten minutes for the renames to be detected. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-15progress: fix progress meters when dealing with lots of workElijah Newren
The possibility of setting merge.renameLimit beyond 2^16 raises the possibility that the values passed to progress can exceed 2^32. Use uint64_t, because it "ought to be enough for anybody". :-) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01diff: make struct diff_flags members lowercaseBrandon Williams
Now that the flags stored in struct diff_flags are being accessed directly and not through macros, change all struct members from being uppercase to lowercase. This conversion is done using the following semantic patch: @@ expression E; @@ - E.RECURSIVE + E.recursive @@ expression E; @@ - E.TREE_IN_RECURSIVE + E.tree_in_recursive @@ expression E; @@ - E.BINARY + E.binary @@ expression E; @@ - E.TEXT + E.text @@ expression E; @@ - E.FULL_INDEX + E.full_index @@ expression E; @@ - E.SILENT_ON_REMOVE + E.silent_on_remove @@ expression E; @@ - E.FIND_COPIES_HARDER + E.find_copies_harder @@ expression E; @@ - E.FOLLOW_RENAMES + E.follow_renames @@ expression E; @@ - E.RENAME_EMPTY + E.rename_empty @@ expression E; @@ - E.HAS_CHANGES + E.has_changes @@ expression E; @@ - E.QUICK + E.quick @@ expression E; @@ - E.NO_INDEX + E.no_index @@ expression E; @@ - E.ALLOW_EXTERNAL + E.allow_external @@ expression E; @@ - E.EXIT_WITH_STATUS + E.exit_with_status @@ expression E; @@ - E.REVERSE_DIFF + E.reverse_diff @@ expression E; @@ - E.CHECK_FAILED + E.check_failed @@ expression E; @@ - E.RELATIVE_NAME + E.relative_name @@ expression E; @@ - E.IGNORE_SUBMODULES + E.ignore_submodules @@ expression E; @@ - E.DIRSTAT_CUMULATIVE + E.dirstat_cumulative @@ expression E; @@ - E.DIRSTAT_BY_FILE + E.dirstat_by_file @@ expression E; @@ - E.ALLOW_TEXTCONV + E.allow_textconv @@ expression E; @@ - E.TEXTCONV_SET_VIA_CMDLINE + E.textconv_set_via_cmdline @@ expression E; @@ - E.DIFF_FROM_CONTENTS + E.diff_from_contents @@ expression E; @@ - E.DIRTY_SUBMODULES + E.dirty_submodules @@ expression E; @@ - E.IGNORE_UNTRACKED_IN_SUBMODULES + E.ignore_untracked_in_submodules @@ expression E; @@ - E.IGNORE_DIRTY_SUBMODULES + E.ignore_dirty_submodules @@ expression E; @@ - E.OVERRIDE_SUBMODULE_CONFIG + E.override_submodule_config @@ expression E; @@ - E.DIRSTAT_BY_LINE + E.dirstat_by_line @@ expression E; @@ - E.FUNCCONTEXT + E.funccontext @@ expression E; @@ - E.PICKAXE_IGNORE_CASE + E.pickaxe_ignore_case @@ expression E; @@ - E.DEFAULT_FOLLOW_RENAMES + E.default_follow_renames Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01diff: remove DIFF_OPT_TST macroBrandon Williams
Remove the `DIFF_OPT_TST` macro and instead access the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_TST(&E, fld) + E.flags.fld @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_TST(ptr, fld) + ptr->flags.fld Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-19progress: simplify "delayed" progress APIJunio C Hamano
We used to expose the full power of the delayed progress API to the callers, so that they can specify, not just the message to show and expected total amount of work that is used to compute the percentage of work performed so far, the percent-threshold parameter P and the delay-seconds parameter N. The progress meter starts to show at N seconds into the operation only if we have not yet completed P per-cent of the total work. Most callers used either (0%, 2s) or (50%, 1s) as (P, N), but there are oddballs that chose more random-looking values like 95%. For a smoother workload, (50%, 1s) would allow us to start showing the progress meter earlier than (0%, 2s), while keeping the chance of not showing progress meter for long running operation the same as the latter. For a task that would take 2s or more to complete, it is likely that less than half of it would complete within the first second, if the workload is smooth. But for a spiky workload whose earlier part is easier, such a setting is likely to fail to show the progress meter entirely and (0%, 2s) is more appropriate. But that is merely a theory. Realistically, it is of dubious value to ask each codepath to carefully consider smoothness of their workload and specify their own setting by passing two extra parameters. Let's simplify the API by dropping both parameters and have everybody use (0%, 2s). Oh, by the way, the percent-threshold parameter and the structure member were consistently misspelled, which also is now fixed ;-) Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30hashmap.h: compare function has access to a data fieldStefan Beller
When using the hashmap a common need is to have access to caller provided data in the compare function. A couple of times we abuse the keydata field to pass in the data needed. This happens for example in patch-ids.c. This patch changes the function signature of the compare function to have one more void pointer available. The pointer given for each invocation of the compare function must be defined in the init function of the hashmap and is just passed through. Documentation of this new feature is deferred to a later patch. This is a rather mechanical conversion, just adding the new pass-through parameter. However while at it improve the naming of the fields of all compare functions used by hashmaps by ensuring unused parameters are prefixed with 'unused_' and naming the parameters what they are (instead of 'unused' make it 'unused_keydata'). Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'ab/free-and-null'Junio C Hamano
A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro. * ab/free-and-null: *.[ch] refactoring: make use of the FREE_AND_NULL() macro coccinelle: make use of the "expression" FREE_AND_NULL() rule coccinelle: add a rule to make "expression" code use FREE_AND_NULL() coccinelle: make use of the "type" FREE_AND_NULL() rule coccinelle: add a rule to make "type" code use FREE_AND_NULL() git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-16coccinelle: make use of the "type" FREE_AND_NULL() ruleÆvar Arnfjörð Bjarmason
Apply the result of the just-added coccinelle rule. This manually excludes a few occurrences, mostly things that resulted in many FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-05diffcore-rename: use is_empty_blob_oidBrandon Williams
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02diff: convert fill_filespec to struct object_idBrandon Williams
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-17Merge branch 'tk/diffcore-delta-remove-unused'Junio C Hamano
Code cleanup. * tk/diffcore-delta-remove-unused: diffcore-delta: remove unused parameter to diffcore_count_changes()
2016-11-14diffcore-delta: remove unused parameter to diffcore_count_changes()Tobias Klauser
The delta_limit parameter to diffcore_count_changes() has been unused since commit ba23bbc8e ("diffcore-delta: make change counter to byte oriented again.", 2006-03-04). Remove the parameter and adjust all callers. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-29use QSORTRené Scharfe
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01pass constants as first argument to st_mult()René Scharfe
The result of st_mult() is the same no matter the order of its arguments. It invokes the macro unsigned_mult_overflows(), which divides the second parameter by the first one. Pass constants first to allow that division to be done already at compile time. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28diff: rename struct diff_filespec's sha1_valid memberbrian m. carlson
Now that this struct's sha1 member is called "oid", update the comment and the sha1_valid member to be called "oid_valid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1_valid + o.oid_valid @@ struct diff_filespec *p; @@ - p->sha1_valid + p->oid_valid Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28diff: convert struct diff_filespec to struct object_idbrian m. carlson
Convert struct diff_filespec's sha1 member to use a struct object_id called "oid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1 + o.oid.hash @@ struct diff_filespec *p; @@ - p->sha1 + p->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-29Merge branch 'sg/diff-multiple-identical-renames' into maintJunio C Hamano
"git diff -M" used to work better when two originally identical files A and B got renamed to X/A and X/B by pairing A to X/A and B to X/B, but this was broken in the 2.0 timeframe. * sg/diff-multiple-identical-renames: diffcore: fix iteration order of identical files during rename detection
2016-04-13Merge branch 'sg/diff-multiple-identical-renames'Junio C Hamano
"git diff -M" used to work better when two originally identical files A and B got renamed to X/A and X/B by pairing A to X/A and B to X/B, but this was broken in the 2.0 timeframe. * sg/diff-multiple-identical-renames: diffcore: fix iteration order of identical files during rename detection
2016-03-30diffcore: fix iteration order of identical files during rename detectionSZEDER Gábor
If the two paths 'dir/A/file' and 'dir/B/file' have identical content and the parent directory is renamed, e.g. 'git mv dir other-dir', then diffcore reports the following exact renames: renamed: dir/B/file -> other-dir/A/file renamed: dir/A/file -> other-dir/B/file While technically not wrong, this is confusing not only for the user, but also for git commands that make decisions based on rename information, e.g. 'git log --follow other-dir/A/file' follows 'dir/B/file' past the rename. This behavior is a side effect of commit v2.0.0-rc4~8^2~14 (diffcore-rename.c: simplify finding exact renames, 2013-11-14): the hashmap storing sources returns entries from the same bucket, i.e. sources matching the current destination, in LIFO order. Thus the iteration first examines 'other-dir/A/file' and 'dir/B/file' and, upon finding identical content and basename, reports an exact rename. Other hashmap users are apparently happy with the current iteration order over the entries of a bucket. Changing the iteration order would risk upsetting other hashmap users and would increase the memory footprint of each bucket by a pointer to the tail element. Fill the hashmap with source entries in reverse order to restore the original exact rename detection behavior. Reported-by: Bill Okara <billokara@gmail.com> Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22use st_add and st_mult for allocation size computationJeff King
If our size computation overflows size_t, we may allocate a much smaller buffer than we expected and overflow it. It's probably impossible to trigger an overflow in most of these sites in practice, but it is easy enough convert their additions and multiplications into overflow-checking variants. This may be fixing real bugs, and it makes auditing the code easier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-10Merge branch 'jk/diffcore-rename-duplicate'Junio C Hamano
A corrupt input to "git diff -M" can cause us to segfault. * jk/diffcore-rename-duplicate: diffcore-rename: avoid processing duplicate destinations diffcore-rename: split locate_rename_dst into two functions
2015-02-27diffcore-rename: avoid processing duplicate destinationsJeff King
The rename code cannot handle an input where we have duplicate destinations (i.e., more than one diff_filepair in the queue with the same string in its pair->two->path). We end up allocating only one slot in the rename_dst mapping. If we fill in the diff_filepair for that slot, when we re-queue the results, we may queue that filepair multiple times. When the diff is finally flushed, the filepair is processed and free()d multiple times, leading to heap corruption. This situation should only happen when a tree diff sees duplicates in one of the trees (see the added test for a detailed example). Rather than handle it, the sanest thing is just to turn off rename detection altogether for the diff. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-27diffcore-rename: split locate_rename_dst into two functionsJeff King
This function manages the mapping of destination pathnames to filepairs, and it handles both insertion and lookup. This makes the return value a bit confusing, as we return a newly created entry (even though no caller cares), and have no room to indicate to the caller that an entry already existed. Instead, let's break this up into two distinct functions, both backed by a common binary search. The binary search will use our normal "return the index if we found something, or negative index minus one to show where it would have gone" semantics. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18diff.c: allow to pass more flags to diff_populate_filespecNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-07hashmap: add simplified hashmap_get_from_hash() APIKarsten Blees
Hashmap entries are typically looked up by just a key. The hashmap_get() API expects an initialized entry structure instead, to support compound keys. This flexibility is currently only needed by find_dir_entry() in name-hash.c (and compat/win32/fscache.c in the msysgit fork). All other (currently five) call sites of hashmap_get() have to set up a near emtpy entry structure, resulting in duplicate code like this: struct hashmap_entry keyentry; hashmap_entry_init(&keyentry, hash(key)); return hashmap_get(map, &keyentry, key); Add a hashmap_get_from_hash() API that allows hashmap lookups by just specifying the key and its hash code, i.e.: return hashmap_get_from_hash(map, hash(key), key); Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-07hashmap: factor out getting a hash code from a SHA1Karsten Blees
Copying the first bytes of a SHA1 is duplicated in six places, however, the implications (the actual value would depend on the endianness of the platform) is documented only once. Add a properly documented API for this. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18Merge branch 'dd/use-alloc-grow'Junio C Hamano
Replace open-coded reallocation with ALLOC_GROW() macro. * dd/use-alloc-grow: sha1_file.c: use ALLOC_GROW() in pretend_sha1_file() read-cache.c: use ALLOC_GROW() in add_index_entry() builtin/mktree.c: use ALLOC_GROW() in append_to_tree() attr.c: use ALLOC_GROW() in handle_attr_line() dir.c: use ALLOC_GROW() in create_simplify() reflog-walk.c: use ALLOC_GROW() replace_object.c: use ALLOC_GROW() in register_replace_object() patch-ids.c: use ALLOC_GROW() in add_commit() diffcore-rename.c: use ALLOC_GROW() diff.c: use ALLOC_GROW() commit.c: use ALLOC_GROW() in register_commit_graft() cache-tree.c: use ALLOC_GROW() in find_subtree() bundle.c: use ALLOC_GROW() in add_to_ref_list() builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()
2014-03-14Merge branch 'nd/i18n-progress'Junio C Hamano
Mark the progress indicators from various time-consuming commands for i18n/l10n. * nd/i18n-progress: i18n: mark all progress lines for translation
2014-03-03diffcore-rename.c: use ALLOC_GROW()Dmitry S. Dolzhenko
Use ALLOC_GROW() instead of open-coding it in locate_rename_dst() and register_rename_src(). Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>