summaryrefslogtreecommitdiff
path: root/builtin-fsck.c
AgeCommit message (Collapse)Author
2009-11-16Check the format of more printf-type functionsTarmigan Casebolt
We already have these checks in many printf-type functions that have prototypes which are in header files. Add these same checks to some more prototypes in header functions and to static functions in .c files. cc: Miklos Vajna <vmiklos@frugalware.org> Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-20fsck: default to "git fsck --full"Junio C Hamano
Linus and other git developers from the early days trained their fingers to type the command, every once in a while even without thinking, to check the consistency of the repository back when the lower core part of the git was still being developed. Developers who wanted to make sure that git correctly dealt with packfiles could deliberately trigger their creation and checked them after they were created carefully, but loose objects are the ones that are written by various commands from random codepaths. It made some technical sense to have a mode that checked only loose objects from the debugging point of view for that reason. Even for git developers, there no longer is any reason to type "git fsck" every five minutes these days, worried that some newly created objects might be corrupt due to recent change to git. The reason we did not make "--full" the default is probably we trust our filesystems a bit too much. At least, we trusted filesystems more than we trusted the lower core part of git that was under development. Once a packfile is created and we always use it read-only, there didn't seem to be much point in suspecting that the underlying filesystems or disks may corrupt them in such a way that is not caught by the SHA-1 checksum over the entire packfile and per object checksum. That trust in the filesystems might have been a good tradeoff between fsck performance and reliability on platforms git was initially developed on and for, but it may not be true anymore as we run on many more platforms these days. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-22Merge branch 'cc/replace'Junio C Hamano
* cc/replace: t6050: check pushing something based on a replaced commit Documentation: add documentation for "git replace" Add git-replace to .gitignore builtin-replace: use "usage_msg_opt" to give better error messages parse-options: add new function "usage_msg_opt" builtin-replace: teach "git replace" to actually replace Add new "git replace" command environment: add global variable to disable replacement mktag: call "check_sha1_signature" with the replacement sha1 replace_object: add a test case object: call "check_sha1_signature" with the replacement sha1 sha1_file: add a "read_sha1_file_repl" function replace_object: add mechanism to replace objects found in "refs/replace/" refs: add a "for_each_replace_ref" function
2009-07-06Merge branch 'tr/die_errno'Junio C Hamano
* tr/die_errno: Use die_errno() instead of die() when checking syscalls Convert existing die(..., strerror(errno)) to die_errno() die_errno(): double % in strerror() output just in case Introduce die_errno() that appends strerror(errno) to die()
2009-06-27Use die_errno() instead of die() when checking syscallsThomas Rast
Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27Convert existing die(..., strerror(errno)) to die_errno()Thomas Rast
Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-21Fix various sparse warnings in the git source codeLinus Torvalds
There are a few remaining ones, but this fixes the trivial ones. It boils down to two main issues that sparse complains about: - warning: Using plain integer as NULL pointer Sparse doesn't like you using '0' instead of 'NULL'. For various good reasons, not the least of which is just the visual confusion. A NULL pointer is not an integer, and that whole "0 works as NULL" is a historical accident and not very pretty. A few of these remain: zlib is a total mess, and Z_NULL is just a 0. I didn't touch those. - warning: symbol 'xyz' was not declared. Should it be static? Sparse wants to see declarations for any functions you export. A lack of a declaration tends to mean that you should either add one, or you should mark the function 'static' to show that it's in file scope. A few of these remain: I only did the ones that should obviously just be made static. That 'wt_status_submodule_summary' one is debatable. It has a few related flags (like 'wt_status_use_color') which _are_ declared, and are used by builtin-commit.c. So maybe we'd like to export it at some point, but it's not declared now, and not used outside of that file, so 'static' it is in this patch. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-01environment: add global variable to disable replacementChristian Couder
This new "read_replace_refs" global variable is set to 1 by default, so that replace refs are used by default. But reachability traversal and packing commands ("cmd_fsck", "cmd_prune", "cmd_pack_objects", "upload_pack", "cmd_unpack_objects") set it to 0, as they must work with the original DAG. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-25parse-opts: prepare for OPT_FILENAMEStephen Boyd
To give OPT_FILENAME the prefix, we pass the prefix to parse_options() which passes the prefix to parse_options_start() which sets the prefix member of parse_opts_ctx accordingly. If there isn't a prefix in the calling context, passing NULL will suffice. Signed-off-by: Stephen Boyd <bebarino@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-09Merge branch 'jc/maint-1.6.0-keep-pack' into maintJunio C Hamano
* jc/maint-1.6.0-keep-pack: pack-objects: don't loosen objects available in alternate or kept packs t7700: demonstrate repack flaw which may loosen objects unnecessarily Remove --kept-pack-only option and associated infrastructure pack-objects: only repack or loosen objects residing in "local" packs git-repack.sh: don't use --kept-pack-only option to pack-objects t7700-repack: add two new tests demonstrating repacking flaws is_kept_pack(): final clean-up Simplify is_kept_pack() Consolidate ignore_packed logic more has_sha1_kept_pack(): take "struct rev_info" has_sha1_pack(): refactor "pretend these packs do not exist" interface git-repack: resist stray environment variable Conflicts: t/t7700-repack.sh
2009-02-28has_sha1_pack(): refactor "pretend these packs do not exist" interfaceJunio C Hamano
Most of the callers of this function except only one pass NULL to its last parameter, ignore_packed. Introduce has_sha1_kept_pack() function that has the function signature and the semantics of this function, and convert the sole caller that does not pass NULL to call this new function. All other callers and has_sha1_pack() lose the ignore_packed parameter. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-31fsck: check loose objects from alternate object stores by defaultJunio C Hamano
"git fsck" used to validate only loose objects that are local and nothing else by default. This is not just too little when a repository is borrowing objects from other object stores, but also caused the connectivity check to mistakenly declare loose objects borrowed from them to be missing. The rationale behind the default mode that validates only loose objects is because these objects are still young and more unlikely to have been pushed to other repositories yet. That holds for loose objects borrowed from alternate object stores as well. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-31fsck: HEAD is part of refsJunio C Hamano
By default we looked at all refs but not HEAD. The only thing that made fsck not lose sight of commits that are only reachable from a detached HEAD was the reflog for the HEAD. This fixes it, with a new test. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18Merge branch 'ap/clone-into-empty'Junio C Hamano
* ap/clone-into-empty: Allow cloning to an existing empty directory add is_dot_or_dotdot inline function
2009-01-18Merge branch 'maint-1.6.0' into maintJunio C Hamano
* maint-1.6.0: builtin-fsck: fix off by one head count Documentation: let asciidoc align related options githooks.txt: add missing word builtin-commit.c: do not remove COMMIT_EDITMSG
2009-01-18builtin-fsck: fix off by one head countChristian Couder
According to the man page, if "git fsck" is passed one or more heads, it should verify connectivity and validity of only objects reachable from the heads it is passed. However, since 5ac0a20 (Make builtin-fsck.c use parse_options., 2007-10-15) the command behaved as if no heads were passed, when given only one argument. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-11add is_dot_or_dotdot inline functionAlexander Potashev
A new inline function is_dot_or_dotdot is used to check if the directory name is either "." or "..". It returns a non-zero value if the given string is "." or "..". It's applicable to a lot of Git source code. Signed-off-by: Alexander Potashev <aspotashev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-11Merge branch 'maint'Junio C Hamano
* maint: fsck: reduce stack footprint make sure packs to be replaced are closed beforehand
2008-12-11fsck: reduce stack footprintLinus Torvalds
The logic to mark all objects that are reachable from tips of refs were implemented as a set of recursive functions. In a repository with a deep enough history, this can easily eat up all the available stack space. Restructure the code to require less stackspace by using an object array to keep track of the objects that still need to be processed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-05Make some of fwrite/fclose/write/close failures visibleAlex Riesen
So that full filesystem conditions or permissions problems won't go unnoticed. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-06Teach fsck and prune that tmp_obj_ file names may not be 14 bytes longBrandon Casey
As Shawn pointed out, not all temporary file creation routines can ensure that the generated temporary file is of a certain length. e.g. Java's createTempFile(prefix, suffix). So just depend on the prefix 'tmp_obj_' for detection. Update prune, and fix the "fix" introduced by a08c53a1 :) Signed-off-by: Brandon "appendixless" Casey <casey@nrlssc.navy.mil> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-27fsck: Don't require tmp_obj_ file names are 14 bytes in lengthShawn O. Pearce
Not all temporary file creation routines will ensure 14 bytes are used to generate the temporary file name. In C Git this may be true, but alternate implementations such as jgit are not always able to generate a temporary file name with a specific prefix and also ensure the file name length is 14 bytes long. Since temporary files in a directory we are fsck'ing should be uncommon (as they are short lived only long enough for an active writer to finish writing the file and rename it) we shouldn't see these show up very often. Always using a prefixcmp() call and ignoring the length opens up room for other implementations to use different name generation schemes. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-25Teach fsck and prune about the new location of temporary objectsBrandon Casey
Since 5723fe7e, temporary objects are now created in their final destination directories, rather than in .git/objects/. Teach fsck to recognize and ignore the temporary objects it encounters, and teach prune to remove them. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-13Make usage strings dash-lessStephan Beyer
When you misuse a git command, you are shown the usage string. But this is currently shown in the dashed form. So if you just copy what you see, it will not work, when the dashed form is no longer supported. This patch makes git commands show the dash-less version. For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh generates a dash-less usage string now. Signed-off-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-25move show_pack_info() where it belongsNicolas Pitre
This is called when verify_pack() has its verbose argument set, and verbose in this context makes sense only for the actual 'git verify-pack' command. Therefore let's move show_pack_info() to builtin-verify-pack.c instead and remove useless verbose argument from verify_pack(). Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-02Merge branch 'mk/maint-parse-careful'Junio C Hamano
* mk/maint-parse-careful: receive-pack: use strict mode for unpacking objects index-pack: introduce checking mode unpack-objects: prevent writing of inconsistent objects unpack-object: cache for non written objects add common fsck error printing function builtin-fsck: move common object checking code to fsck.c builtin-fsck: reports missing parent commits Remove unused object-ref code builtin-fsck: move away from object-refs to fsck_walk add generic, type aware object chain walker Conflicts: Makefile builtin-fsck.c
2008-02-26builtin-fsck: move common object checking code to fsck.cMartin Koegler
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-26builtin-fsck: reports missing parent commitsMartin Koegler
parse_commit ignores parent commits with certain errors (eg. a non commit object is already loaded under the sha1 of the parent). To make fsck reports such errors, it has to compare the nummer of parent commits returned by parse commit with the number of parent commits in the object or in the graft/shallow file. Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-26builtin-fsck: move away from object-refs to fsck_walkMartin Koegler
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-12Merge branch 'lt/in-core-index'Junio C Hamano
* lt/in-core-index: lazy index hashing Create pathname-based hash-table lookup into index read-cache.c: introduce is_racy_timestamp() helper read-cache.c: fix a couple more CE_REMOVE conversion Also use unpack_trees() in do_diff_cache() Make run_diff_index() use unpack_trees(), not read_tree() Avoid running lstat(2) on the same cache entry. index: be careful when handling long names Make on-disk index representation separate from in-core one
2008-02-04git-fsck: report missing author/commit line in a commit as an errorMartin Koegler
A zero commit date could be caused by: * a missing author line * a missing commiter line * a malformed email address in the commiter line * a malformed commit date Simply reporting it as zero commit date is missleading. Additionally, it upgrades the message to an error (instead of an printf). Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-21Make on-disk index representation separate from in-core oneLinus Torvalds
This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-16Make 'git fsck' complain about non-commit branchesLinus Torvalds
Since having non-commits in branches is a no-no, and just means you cannot commit on them, let's make fsck tell you when a branch is bad. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-30Fixed a command line option type for builtin-fsck.cEmil Medve
The typo was introduced by 5ac0a2063e8f824f6e8ffb4d18de74c55aae7131 (Make builtin-fsck.c use parse_options.) Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Acked-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-30Make builtin-fsck.c use parse_options.Pierre Habouzit
Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-07-22fsck --lost-found: write blob's contents, not their SHA-1Johannes Schindelin
When looking for a lost blob, it is much nicer to be able to grep through .git/lost-found/other/* than to write an inefficient loop over the file names. So write the contents of the dangling blobs, not their object names. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-15Make every builtin-*.c file #include "builtin.h"Peter Hagervall
Make every builtin-*.c file #include "builtin.h". Also takes care of some declaration/definition mismatches. Signed-off-by: Peter Hagervall <hager@cs.umu.se> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-03git-fsck: add --lost-found optionJohannes Schindelin
With this option, dangling objects are not only reported, but also written to .git/lost-found/commit/ or .git/lost-found/other/. This option implies '--full' and '--no-reflogs'. 'git fsck --lost-found' is meant as a replacement for git-lost-found. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07War on whitespaceJunio C Hamano
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-05git-fsck: learn about --verboseJohannes Schindelin
With --verbose, it gets really chatty now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-05-30Ensure the pack index is opened before accessShawn O. Pearce
In this particular location of fsck the index should have already been opened by verify_pack, which is called just before we get here and loop through the object names. However, just in case a future version of that function does not use the index file we'll double-check its open before we access the num_objects field. Better safe now than sorry later. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-22rename dirlink to gitlink.Martin Waitz
Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' | xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-29Merge branch 'maint'Junio C Hamano
* maint: http.c: Fix problem with repeated calls of http_init Add missing reference to GIT_COMMITTER_DATE in git-commit-tree documentation Fix import-tars fix. Update .mailmap with "Michael" Do not barf on too long action description Catch empty pathnames in trees during fsck Don't allow empty pathnames in fast-import import-tars: be nice to wrong directory modes git-svn: Added 'find-rev' command git shortlog documentation: add long options and fix a typo
2007-04-29Catch empty pathnames in trees during fsckShawn O. Pearce
Released versions of fast-import have been able to create a tree that contains files or subtrees that contain no name. Unfortunately these trees aren't valid, but people may have actually tried to create them due to bugs in import-tars.perl or their own fast-import frontend. We now look for this unusual condition and warn the user if at least one of their tree objects contains the problem. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22Merge branch 'lt/gitlink'Junio C Hamano
* lt/gitlink: Tests for core subproject support Expose subprojects as special files to "git diff" machinery Fix some "git ls-files -o" fallout from gitlinks Teach "git-read-tree -u" to check out submodules as a directory Teach git list-objects logic to not follow gitlinks Fix gitlink index entry filesystem matching Teach "git-read-tree -u" to check out submodules as a directory Teach git list-objects logic not to follow gitlinks Don't show gitlink directories when we want "other" files Teach git-update-index about gitlinks Teach directory traversal about subprojects Fix thinko in subproject entry sorting Teach core object handling functions about gitlinks Teach "fsck" not to follow subproject links Add "S_IFDIRLNK" file mode infrastructure for git links Add 'resolve_gitlink_ref()' helper function Avoid overflowing name buffer in deep directory structures diff-lib: use ce_mode_from_stat() rather than messing with modes manually
2007-04-22Merge branch 'np/pack'Junio C Hamano
* np/pack: (27 commits) document --index-version for index-pack and pack-objects pack-objects: remove obsolete comments pack-objects: better check_object() performances add get_size_from_delta() pack-objects: make in_pack_header_size a variable of its own pack-objects: get rid of create_final_object_list() pack-objects: get rid of reuse_cached_pack pack-objects: clean up list sorting pack-objects: rework check_delta_limit usage pack-objects: equal objects in size should delta against newer objects pack-objects: optimize preferred base handling a bit clean up add_object_entry() tests for various pack index features use test-genrandom in tests instead of /dev/urandom simple random data generator for tests validate reused pack data with CRC when possible allow forcing index v2 and 64-bit offset treshold pack-redundant.c: learn about index v2 show-index.c: learn about index v2 sha1_file.c: learn about index version 2 ...
2007-04-12Merge branch 'maint'Junio C Hamano
* maint: GIT 1.5.1.1 cvsserver: Fix handling of diappeared files on update fsck: do not complain on detached HEAD. (encode_85, decode_85): Mark source buffer pointer as "const".
2007-04-11fsck: do not complain on detached HEAD.Junio C Hamano
Detached HEAD is just a normal state of a repository. Do not say anything about it. Do not give worrying "error:" messages when we let the user know that the HEAD points at nothing (i.e. yet to be born branch), nor we do not have any default refs to start following the objects chain. Reword them as "notice:". Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10Teach "fsck" not to follow subproject linksLinus Torvalds
Since the subprojects don't necessarily even exist in the current tree, much less in the current git repository (they are totally independent repositories), we do not want to try to follow the chain from one git repository to another through a gitlink. This involves teaching fsck to ignore references to gitlink objects from a tree and from the current index. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10get rid of num_packed_objects()Nicolas Pitre
The coming index format change doesn't allow for the number of objects to be determined from the size of the index file directly. Instead, Let's initialize a field in the packed_git structure with the object count when the index is validated since the count is always known at that point. While at it let's reorder some struct packed_git fields to avoid padding due to needed 64-bit alignment for some of them. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>