summaryrefslogtreecommitdiff
path: root/reftable/stack_test.c
AgeCommit message (Collapse)Author
8 daysreftable: make the compaction factor configurablePatrick Steinhardt
When auto-compacting, the reftable library packs references such that the sizes of the tables form a geometric sequence. The factor for this geometric sequence is hardcoded to 2 right now. We're about to expose this as a config option though, so let's expose the factor via write options. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 daysreftable: pass opts as constant pointerPatrick Steinhardt
We sometimes pass the refatble write options as value and sometimes as a pointer. This is quite confusing and makes the reader wonder whether the options get modified sometimes. In fact, `reftable_new_writer()` does cause the caller-provided options to get updated when some values aren't set up. This is quite unexpected, but didn't cause any harm until now. Adapt the code so that we do not modify the caller-provided values anymore. While at it, refactor the code to code to consistently pass the options as a constant pointer to clarify that the caller-provided opts will not ever get modified. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
8 daysreftable: consistently refer to `reftable_write_options` as `opts`Patrick Steinhardt
Throughout the reftable library the `reftable_write_options` are sometimes referred to as `cfg` and sometimes as `opts`. Unify these to consistently use `opts` to avoid confusion. While at it, touch up the coding style a bit by removing unneeded braces around one-line statements and newlines between variable declarations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 daysMerge branch 'ps/reftable-write-optim'Junio C Hamano
Code to write out reftable has seen some optimization and simplification. * ps/reftable-write-optim: reftable/block: reuse compressed array reftable/block: reuse zstream when writing log blocks reftable/writer: reset `last_key` instead of releasing it reftable/writer: unify releasing memory reftable/writer: refactorings for `writer_flush_nonempty_block()` reftable/writer: refactorings for `writer_add_record()` refs/reftable: don't recompute committer ident reftable: remove name checks refs/reftable: skip duplicate name checks refs/reftable: perform explicit D/F check when writing symrefs refs/reftable: fix D/F conflict error message on ref copy
2024-04-16Merge branch 'jt/reftable-geometric-compaction'Junio C Hamano
The strategy to compact multiple tables of reftables after many operations accumulate many entries has been improved to avoid accumulating too many tables uncollected. * jt/reftable-geometric-compaction: reftable/stack: use geometric table compaction reftable/stack: add env to disable autocompaction reftable/stack: expose option to disable auto-compaction
2024-04-09Merge branch 'ps/pack-refs-auto'Junio C Hamano
"git pack-refs" learned the "--auto" option, which is a useful addition to be triggered from "git gc --auto". Acked-by: Karthik Nayak <karthik.188@gmail.com> cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com> * ps/pack-refs-auto: builtin/gc: pack refs when using `git maintenance run --auto` builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs t6500: extract objects with "17" prefix builtin/gc: move `struct maintenance_run_opts` builtin/pack-refs: introduce new "--auto" flag builtin/pack-refs: release allocated memory refs/reftable: expose auto compaction via new flag refs: remove `PACK_REFS_ALL` flag refs: move `struct pack_refs_opts` to where it's used t/helper: drop pack-refs wrapper refs/reftable: print errors on compaction failure reftable/stack: gracefully handle failed auto-compaction due to locks reftable/stack: use error codes when locking fails during compaction reftable/error: discern locked/outdated errors reftable/stack: fix error handling in `reftable_stack_init_addition()`
2024-04-09reftable: remove name checksPatrick Steinhardt
In the preceding commit we have disabled name checks in the "reftable" backend. These checks were responsible for verifying multiple things when writing records to the reftable stack: - Detecting file/directory conflicts. Starting with the preceding commits this is now handled by the reftable backend itself via `refs_verify_refname_available()`. - Validating refnames. This is handled by `check_refname_format()` in the generic ref transacton layer. The code in the reftable library is thus not used anymore and likely to bitrot over time. Remove it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08reftable/stack: use geometric table compactionJustin Tobler
To reduce the number of on-disk reftables, compaction is performed. Contiguous tables with the same binary log value of size are grouped into segments. The segment that has both the lowest binary log value and contains more than one table is set as the starting point when identifying the compaction segment. Since segments containing a single table are not initially considered for compaction, if the table appended to the list does not match the previous table log value, no compaction occurs for the new table. It is therefore possible for unbounded growth of the table list. This can be demonstrated by repeating the following sequence: git branch -f foo git branch -d foo Each operation results in a new table being written with no compaction occurring until a separate operation produces a table matching the previous table log value. Instead, to avoid unbounded growth of the table list, the compaction strategy is updated to ensure tables follow a geometric sequence after each operation by individually evaluating each table in reverse index order. This strategy results in a much simpler and more robust algorithm compared to the previous one while also maintaining a minimal ordered set of tables on-disk. When creating 10 thousand references, the new strategy has no performance impact: Benchmark 1: update-ref: create refs sequentially (revision = HEAD~) Time (mean ± σ): 26.516 s ± 0.047 s [User: 17.864 s, System: 8.491 s] Range (min … max): 26.447 s … 26.569 s 10 runs Benchmark 2: update-ref: create refs sequentially (revision = HEAD) Time (mean ± σ): 26.417 s ± 0.028 s [User: 17.738 s, System: 8.500 s] Range (min … max): 26.366 s … 26.444 s 10 runs Summary update-ref: create refs sequentially (revision = HEAD) ran 1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~) Some tests in `t0610-reftable-basics.sh` assert the on-disk state of tables and are therefore updated to specify the correct new table count. Since compaction is more aggressive in ensuring tables maintain a geometric sequence, the expected table count is reduced in these tests. In `reftable/stack_test.c` tests related to `sizes_to_segments()` are removed because the function is no longer needed. Also, the `test_suggest_compaction_segment()` test is updated to better showcase and reflect the new geometric compaction behavior. Signed-off-by: Justin Tobler <jltobler@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08reftable/stack: expose option to disable auto-compactionJustin Tobler
The reftable stack already has a variable to configure whether or not to run auto-compaction, but it is inaccessible to users of the library. There exist use cases where a caller may want to have more control over auto-compaction. Move the `disable_auto_compact` option into `reftable_write_options` to allow external callers to disable auto-compaction. This will be used in a subsequent commit. Signed-off-by: Justin Tobler <jltobler@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05Merge branch 'ps/pack-refs-auto' into jt/reftable-geometric-compactionJunio C Hamano
* ps/pack-refs-auto: builtin/gc: pack refs when using `git maintenance run --auto` builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs t6500: extract objects with "17" prefix builtin/gc: move `struct maintenance_run_opts` builtin/pack-refs: introduce new "--auto" flag builtin/pack-refs: release allocated memory refs/reftable: expose auto compaction via new flag refs: remove `PACK_REFS_ALL` flag refs: move `struct pack_refs_opts` to where it's used t/helper: drop pack-refs wrapper refs/reftable: print errors on compaction failure reftable/stack: gracefully handle failed auto-compaction due to locks reftable/stack: use error codes when locking fails during compaction reftable/error: discern locked/outdated errors reftable/stack: fix error handling in `reftable_stack_init_addition()`
2024-04-01Merge branch 'ps/reftable-unit-test-nfs-workaround'Junio C Hamano
A unit test for reftable code tried to enumerate all files in a directory after reftable operations and expected to see nothing but the files it wanted to leave there, but was fooled by .nfs* cruft files left, which has been corrected. * ps/reftable-unit-test-nfs-workaround: reftable: fix tests being broken by NFS' delete-after-close semantics
2024-03-25reftable/stack: gracefully handle failed auto-compaction due to locksPatrick Steinhardt
Whenever we commit a new table to the reftable stack we will end up invoking auto-compaction of the stack to keep the total number of tables at bay. This auto-compaction may fail though in case at least one of the tables which we are about to compact is locked. This is indicated by the compaction function returning `REFTABLE_LOCK_ERROR`. We do not handle this case though, and thus bubble that return value up the calling chain, which will ultimately cause a failure. Fix this bug by ignoring `REFTABLE_LOCK_ERROR`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25reftable/error: discern locked/outdated errorsPatrick Steinhardt
We currently throw two different errors into a similar-but-different error code: - Errors when trying to lock the reftable stack. - Errors when trying to write to the reftable stack which has been modified concurrently. This results in unclear error handling and user-visible error messages. Create a new `REFTABLE_OUTDATED_ERROR` so that those error conditions can be clearly told apart from each other. Adjust users of the old `REFTABLE_LOCK_ERROR` to use the new error code as required. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-21reftable: fix tests being broken by NFS' delete-after-close semanticsPatrick Steinhardt
It was reported that the reftable unit tests in t0032 fail with the following assertion when running on top of NFS: running test_reftable_stack_compaction_concurrent_clean reftable/stack_test.c: 1063: failed assertion count_dir_entries(dir) == 2 Aborted Setting a breakpoint immediately before the assertion in fact shows the following list of files: ./stack_test-1027.QJBpnd ./stack_test-1027.QJBpnd/0x000000000001-0x000000000003-dad7ac80.ref ./stack_test-1027.QJBpnd/.nfs000000000001729f00001e11 ./stack_test-1027.QJBpnd/tables.list Note the weird ".nfs*" file? This file is maintained by NFS clients in order to emulate delete-after-last-close semantics that we rely on in the reftable code [1]. Instead of unlinking the file right away and keeping it open in the client, the NFS client will rename it to ".nfs*" and then delete that temporary file when the last reference to it gets dropped. Quoting the NFS FAQ: > D2. What is a "silly rename"? Why do these .nfsXXXXX files keep > showing up? > > A. Unix applications often open a scratch file and then unlink it. > They do this so that the file is not visible in the file system name > space to any other applications, and so that the system will > automatically clean up (delete) the file when the application exits. > This is known as "delete on last close", and is a tradition among > Unix applications. > > Because of the design of the NFS protocol, there is no way for a > file to be deleted from the name space but still remain in use by an > application. Thus NFS clients have to emulate this using what > already exists in the protocol. If an open file is unlinked, an NFS > client renames it to a special name that looks like ".nfsXXXXX". > This "hides" the file while it remains in use. This is known as a > "silly rename." Note that NFS servers have nothing to do with this > behavior. This of course throws off the assertion that we got exactly two files in that directory. The test in question triggers this behaviour by holding two open file descriptors to the "tables.list" file. One of the references is because we are about to append to the stack, whereas the other reference is because we want to compact it. As the compaction has just finished we already rewrote "tables.list" to point to the new contents, but the other file descriptor pointing to the old version is still open. Thus we trigger the delete-after-last-close emulation. Furthermore, it was reported that this behaviour only triggers with 4f36b8597c (reftable/stack: fix race in up-to-date check, 2024-01-18). This is expected as well because it is the first point in time where we actually keep the "tables.list" file descriptor open for the stat cache. Fix this bug by skipping over any files that start with a leading dot when counting files. While we could explicitly check for a prefix of ".nfs", other network file systems like SMB for example do the same trickery but with a ".smb" prefix. In any case though, this loosening of the assertion should be fine given that the reftable library would never write files with leading dots by itself. [1]: https://nfs.sourceforge.net/#faq_d2 Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-05reftable/record: convert old and new object IDs to arraysPatrick Steinhardt
In 7af607c58d (reftable/record: store "val1" hashes as static arrays, 2024-01-03) and b31e3cc620 (reftable/record: store "val2" hashes as static arrays, 2024-01-03) we have converted ref records to store their object IDs in a static array. Convert log records to do the same so that their old and new object IDs are arrays, too. This change results in two allocations less per log record that we're iterating over. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 8,068,495 allocs, 8,068,373 frees, 401,011,862 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-12Merge branch 'ps/reftable-styles'Junio C Hamano
Code clean-up in various reftable code paths. * ps/reftable-styles: reftable/record: improve semantics when initializing records reftable/merged: refactor initialization of iterators reftable/merged: refactor seeking of records reftable/stack: use `size_t` to track stack length reftable/stack: use `size_t` to track stack slices during compaction reftable/stack: index segments with `size_t` reftable/stack: fix parameter validation when compacting range reftable: introduce macros to allocate arrays reftable: introduce macros to grow arrays
2024-02-09Merge branch 'en/header-cleanup' into maint-2.43Junio C Hamano
Remove unused header "#include". * en/header-cleanup: treewide: remove unnecessary includes in source files treewide: add direct includes currently only pulled in transitively trace2/tr2_tls.h: remove unnecessary include submodule-config.h: remove unnecessary include pkt-line.h: remove unnecessary include line-log.h: remove unnecessary include http.h: remove unnecessary include fsmonitor--daemon.h: remove unnecessary includes blame.h: remove unnecessary includes archive.h: remove unnecessary include treewide: remove unnecessary includes in source files treewide: remove unnecessary includes from header files
2024-02-06reftable/stack: index segments with `size_t`Patrick Steinhardt
We use `int`s to index into arrays of segments and track the length of them, which is considered to be a code smell in the Git project. Convert the code to use `size_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-26reftable/stack: adjust permissions of compacted tablesPatrick Steinhardt
When creating a new compacted table from a range of preexisting ones we don't set the default permissions on the resulting table when specified by the user. This has the effect that the "core.sharedRepository" config will not be honored correctly. Fix this bug and add a test to catch this issue. Note that we only test on non-Windows platforms because Windows does not use POSIX permissions natively. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16Merge branch 'ps/reftable-fixes-and-optims'Junio C Hamano
More fixes and optimizations to the reftable backend. * ps/reftable-fixes-and-optims: reftable/merged: transfer ownership of records when iterating reftable/merged: really reuse buffers to compute record keys reftable/record: store "val2" hashes as static arrays reftable/record: store "val1" hashes as static arrays reftable/record: constify some parts of the interface reftable/writer: fix index corruption when writing multiple indices reftable/stack: do not auto-compact twice in `reftable_stack_add()` reftable/stack: do not overwrite errors when compacting
2024-01-08Merge branch 'en/header-cleanup'Junio C Hamano
Remove unused header "#include". * en/header-cleanup: treewide: remove unnecessary includes in source files treewide: add direct includes currently only pulled in transitively trace2/tr2_tls.h: remove unnecessary include submodule-config.h: remove unnecessary include pkt-line.h: remove unnecessary include line-log.h: remove unnecessary include http.h: remove unnecessary include fsmonitor--daemon.h: remove unnecessary includes blame.h: remove unnecessary includes archive.h: remove unnecessary include treewide: remove unnecessary includes in source files treewide: remove unnecessary includes from header files
2024-01-03reftable/record: store "val1" hashes as static arraysPatrick Steinhardt
When reading ref records of type "val1", we store its object ID in an allocated array. This results in an additional allocation for every single ref record we read, which is rather inefficient especially when iterating over refs. Refactor the code to instead use an embedded array of `GIT_MAX_RAWSZ` bytes. While this means that `struct ref_record` is bigger now, we typically do not store all refs in an array anyway and instead only handle a limited number of records at the same point in time. Using `git show-ref --quiet` in a repository with ~350k refs this leads to a significant drop in allocations. Before: HEAP SUMMARY: in use at exit: 21,098 bytes in 192 blocks total heap usage: 2,116,683 allocs, 2,116,491 frees, 76,098,060 bytes allocated After: HEAP SUMMARY: in use at exit: 21,098 bytes in 192 blocks total heap usage: 1,419,031 allocs, 1,418,839 frees, 62,145,036 bytes allocated Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11reftable/stack: perform auto-compaction with transactional interfacePatrick Steinhardt
Whenever updating references or reflog entries in the reftable stack, we need to add a new table to the stack, thus growing the stack's length by one. The stack can grow to become quite long rather quickly, leading to performance issues when trying to read records. But besides performance issues, this can also lead to exhaustion of file descriptors very rapidly as every single table requires a separate descriptor when opening the stack. While git-pack-refs(1) fixes this issue for us by merging the tables, it runs too irregularly to keep the length of the stack within reasonable limits. This is why the reftable stack has an auto-compaction mechanism: `reftable_stack_add()` will call `reftable_stack_auto_compact()` after its added the new table, which will auto-compact the stack as required. But while this logic works alright for `reftable_stack_add()`, we do not do the same in `reftable_addition_commit()`, which is the transactional equivalent to the former function that allows us to write multiple updates to the stack atomically. Consequentially, we will easily run into file descriptor exhaustion in code paths that use many separate transactions like e.g. non-atomic fetches. Fix this issue by calling `reftable_stack_auto_compact()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11reftable/stack: verify that `reftable_stack_add()` uses auto-compactionPatrick Steinhardt
While we have several tests that check whether we correctly perform auto-compaction when manually calling `reftable_stack_auto_compact()`, we don't have any tests that verify whether `reftable_stack_add()` does call it automatically. Add one. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-11reftable: handle interrupted writesPatrick Steinhardt
There are calls to write(3P) where we don't properly handle interrupts. Convert them to use `write_in_full()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-20Merge branch 'ep/maint-equals-null-cocci'Junio C Hamano
Introduce and apply coccinelle rule to discourage an explicit comparison between a pointer and NULL, and applies the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
2022-05-02tree-wide: apply equals-null.cocciJunio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16Merge branch 'hn/reftable-coverity-fixes'Junio C Hamano
Problems identified by Coverity in the reftable code have been corrected. * hn/reftable-coverity-fixes: reftable: add print functions to the record types reftable: make reftable_record a tagged union reftable: remove outdated file reftable.c reftable: implement record equality generically reftable: make reftable-record.h function signatures const correct reftable: handle null refnames in reftable_ref_record_equal reftable: drop stray printf in readwrite_test reftable: order unittests by complexity reftable: all xxx_free() functions accept NULL arguments reftable: fix resource warning reftable: ignore remove() return value in stack_test.c reftable: check reftable_stack_auto_compact() return value reftable: fix resource leak blocksource.c reftable: fix resource leak in block.c error path reftable: fix OOB stack write in print functions
2022-01-20reftable: ignore remove() return value in stack_test.cHan-Wen Nienhuys
If the cleanup fails, there is nothing we can do. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-20reftable: check reftable_stack_auto_compact() return valueHan-Wen Nienhuys
Fixes a problem detected by Coverity. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-23reftable: support preset file mode for writingHan-Wen Nienhuys
Create files with mode 0666, so umask works as intended. Provides an override, which is useful to support shared repos (test t1301-shared-repo.sh). Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: implement stack, a mutable database of reftable files.Han-Wen Nienhuys
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>