summaryrefslogtreecommitdiff
path: root/fast-import.c
AgeCommit message (Collapse)Author
2007-05-24Merge branch 'maint'Junio C Hamano
* maint: Fix possible coredump with fast-import --import-marks Refactor fast-import branch creation from existing commit fast-import: Fix crash when referencing already existing objects fast-import: Fix uninitialized variable Documentation: fix git-config.xml generation
2007-05-24Fix possible coredump with fast-import --import-marksShawn O. Pearce
When e8438420bb7d368bec3647b90c557b9931582267 allowed us to reload the marks table on subsequent runs of fast-import we really broke things, as we set pack_id to MAX_PACK_ID for any objects we imported into the marks table. Creating a branch from that mark should fail as we attempt to read the object through a non-existant packed_git pointer. Instead we have to use the normal Git object system to locate the older commit, as we ourselves do not have a reference to the packed_git it resides in. This bug only occurred because t9300 was not complete enough. When we added the --import-marks feature we didn't actually test its implementation enough to verify the function worked as intended. I have corrected that, and included the changes as part of this fix. Prior versions of fast-import fail the new test(s); this commit allows them to pass. Credit for this bug find goes to Simon Hausmann <simon@lst.de> as he recently identified a similiar bug in the tree lazy-loading path. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24Refactor fast-import branch creation from existing commitShawn O. Pearce
To resolve a corner case uncovered by Simon Hausmann I need to reuse the logic for the SHA-1 expression version of the 'from ' command within the mark version of the 'from ' command. This change doesn't alter any functionality, but is merely breaking the common code out to a function that I can reuse. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24fast-import: Fix crash when referencing already existing objectsSimon Hausmann
Commit a5c1780a0355a71b9fb70f1f1977ce726ee5b8d8 sets the pack_id of existing objects to MAX_PACK_ID. When the same object is referenced later again it is found in the local object hash. With such a pack_id fast-import should not try to locate that object in the newly created pack(s). Signed-off-by: Simon Hausmann <simon@lst.de> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-24fast-import: Fix uninitialized variableSimon Hausmann
Fix uninitialized last_object->no_free variable that is accessed in store_object. Signed-off-by: Simon Hausmann <simon@lst.de> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-10git-update-ref: add --no-deref option for overwriting/detaching refSven Verdoolaege
git-checkout is also adapted to make use of this new option instead of the handcrafted command sequence. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-02Create pack-write.c for common pack writing codeDana L. How
Include a generalized fixup_pack_header_footer() in this new file. Needed by git-repack --max-pack-size feature in a later patchset. [sp: Moved close(pack_fd) to callers, to support index-pack, and changed name to better indicate it is for packfiles.] Signed-off-by: Dana L. How <danahow@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-04-29Merge branch 'maint'Junio C Hamano
* maint: http.c: Fix problem with repeated calls of http_init Add missing reference to GIT_COMMITTER_DATE in git-commit-tree documentation Fix import-tars fix. Update .mailmap with "Michael" Do not barf on too long action description Catch empty pathnames in trees during fsck Don't allow empty pathnames in fast-import import-tars: be nice to wrong directory modes git-svn: Added 'find-rev' command git shortlog documentation: add long options and fix a typo
2007-04-29Don't allow empty pathnames in fast-importShawn O. Pearce
riddochc on #git noticed corruption caused by import-tars. This was fixed in the prior commit by Dscho, but fast-import was wrong to have allowed a tree to be created with an empty string as the filename. No operating system allows this, and Git itself doesn't accept this into the index. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-04-24fast-import: size_t vs ssize_tSami Farin
size_t is unsigned, so (n < 0) is never true. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-04-20Don't repack existing objects in fast-importShawn O. Pearce
Some users of fast-import have been trying to use it to rewrite commits and trees, an activity where the all of the relevant blobs are already available from the existing packfiles. In such a case we don't want to repack a blob, even if the frontend application has supplied us the raw data rather than a mark or a SHA-1 name. I'm intentionally only checking the packfiles that existed when fast-import started and am always ignoring all loose object files. We ignore loose objects because fast-import tends to operate on a very large number of objects in a very short timespan, and it is usually creating new objects, not reusing existing ones. In such a situtation the majority of the objects will not be found in the existing packfiles, nor will they be loose object files. If the frontend application really wants us to look at loose object files, then they can just repack the repository before running fast-import. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-31Rename warn() to warning() to fix symbol conflicts on BSD and Mac OSTheodore Ts'o
This fixes a problem reported by Randal Schwartz: >I finally tracked down all the (albeit inconsequential) errors I was getting >on both OpenBSD and OSX. It's the warn() function in usage.c. There's >warn(3) in BSD-style distros. It'd take a "great rename" to change it, but if >someone with better C skills than I have could do that, my linker and I would >appreciate it. It was annoying to me, too, when I was doing some mergetool testing on Mac OS X, so here's a fix. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: "Randal L. Schwartz" <merlyn@stonehenge.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-25make it more obvious that temporary files are temporary filesNicolas Pitre
When some operations are interrupted (or "die()'d" or crashed) then the partial object/pack/index file may remain around. Make it more obvious in their name that those files are temporary stuff and can be cleaned up if no operation is in progress. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-12Remove unnecessary casts from fast-importShawn O. Pearce
Jeff King pointed out that these casts are quite unnecessary, as the compiler should be doing them anyway, and may cause problems in the future if the size of the argument for to_atom were to ever be increased. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-12Merge branch 'maint'Shawn O. Pearce
* maint: fast-import: grow tree storage more aggressively
2007-03-12fast-import: grow tree storage more aggressivelyJeff King
When building up a tree for a commit, fast-import dynamically allocates memory for the tree entries. When more space is needed, the allocated memory is increased by a constant amount. For very large trees, this means re-allocating and memcpy()ing the memory O(n) times. To compound this problem, releasing the previous tree resource does not free the memory; it is kept in a pool for future trees. This means that each of the O(n) allocations will consume increasing amounts of memory, giving O(n^2) memory consumption. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-08Merge branch 'master' of git://repo.or.cz/git/fastimportJunio C Hamano
* 'master' of git://repo.or.cz/git/fastimport: Allow fast-import frontends to reload the marks table Use atomic updates to the fast-import mark file Preallocate memory earlier in fast-import
2007-03-07Allow fast-import frontends to reload the marks tableShawn O. Pearce
I'm giving fast-import a lesson on how to reload the marks table using the same format it outputs with --export-marks. This way a frontend can reload the marks table from a prior import, making incremental imports less painful. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07Use atomic updates to the fast-import mark fileShawn O. Pearce
When we allow fast-import frontends to reload a mark file from a prior session we want to let them use the same file as they exported the marks to. This makes it very simple for the frontend to save state across incremental imports. But we don't want to lose the old marks table if anything goes wrong while writing our current marks table. So instead of truncating and overwriting the path specified to --export-marks we use the standard lockfile code to write the current marks out to a temporary file, then rename it over the old marks table. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07Preallocate memory earlier in fast-importShawn O. Pearce
I'm about to teach fast-import how to reload the marks file created by a prior session. The general approach that I want to use is to immediately parse the marks file when the specific argument is found in argv, thereby allowing the caller to supply multiple marks files, as the mark space can be sparsely populated. To make that work out we need to allocate our object tables before we parse the command line options. Since none of these tables depend on the command line options, we can easily relocate them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-07Use off_t in pack-objects/fast-import when we mean an offsetShawn O. Pearce
Always use an off_t value in pack-objects anytime we are dealing with an offset to some data within a packfile. Also fixed a minor uintmax_t that was incorrectly defined before. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07Use off_t when we really mean a file offset.Shawn O. Pearce
Not all platforms have declared 'unsigned long' to be a 64 bit value, but we want to support a 64 bit packfile (or close enough anyway) in the near future as some projects are getting large enough that their packed size exceeds 4 GiB. By using off_t, the POSIX type that is declared to mean an offset within a file, we support whatever maximum file size the underlying operating system will handle. For most modern systems this is up around 2^60 or higher. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07General const correctness fixesShawn O. Pearce
We shouldn't attempt to assign constant strings into char*, as the string is not writable at runtime. Likewise we should always be treating unsigned values as unsigned values, not as signed values. Most of these are very straightforward. The only exception is the (unnecessary) xstrdup/free in builtin-branch.c for the detached head case. Since this is a user-level interactive type program and that particular code path is executed no more than once, I feel that the extra xstrdup call is well worth the easy elimination of this warning. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-05Merge branch 'maint'Shawn O. Pearce
* maint: fast-import: Fail if a non-existant commit is used for merge fast-import: Avoid infinite loop after reset [sp: Minor evil merge to deal with type_names array moving to be private in 'master'.]
2007-03-05fast-import: Fail if a non-existant commit is used for mergeShawn O. Pearce
Johannes Sixt noticed during one of his own imports that fast-import did not fail if a non-existant commit is referenced by SHA-1 value as an argument to the 'merge' command. This allowed the user to unknowingly create commits that would fail in fsck, as the commit contents would not be completely reachable. A side effect of this bug was that a frontend process could mark any SHA-1 object (blob, tree, tag) as a parent of a merge commit. This should also fail in fsck, as the commit is not a valid commit. We now use the same rule as the 'from' command. If a commit is referenced in the 'merge' command by hex formatted SHA-1 then the SHA-1 must be a commit or a tag that can be peeled back to a commit, the commit must already exist, and must be readable by the core Git infrastructure code. This requirement means that the commit must have existed prior to fast-import starting, or the commit must have been flushed out by a prior 'checkpoint' command. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-03-05fast-import: Avoid infinite loop after resetShawn O. Pearce
Johannes Sixt noticed that a 'reset' command applied to a branch that is already active in the branch LRU cache can cause fast-import to relink the same branch into the LRU cache twice. This will cause the LRU cache to contain a cycle, making unload_one_branch run in an infinite loop as it tries to select the oldest branch for eviction. I have trivially fixed the problem by adding an active bit to each branch object; this bit indicates if the branch is already in the LRU and allows us to avoid trying to add it a second time. Converting the pack_id field into a bitfield makes this change take up no additional memory. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-27convert object type handling from a string to a numberNicolas Pitre
We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27formalize typename(), and add its reverse type_from_string()Nicolas Pitre
Sometime typename() is used, sometimes type_names[] is accessed directly. Let's enforce typename() all the time which allows for validating the type. Also let's add a function to go from a name to a type and use it instead of manual memcpy() when appropriate. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-21prefixcmp(): fix-up mechanical conversion.Junio C Hamano
Previous step converted use of strncmp() with literal string mechanically even when the result is only used as a boolean: if (!strncmp("foo", arg, 3)) ==> if (!(-prefixcmp(arg, "foo"))) This step manually cleans them up to read: if (!prefixcmp(arg, "foo")) Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-21Mechanical conversion to use prefixcmp()Junio C Hamano
This mechanically converts strncmp() to use prefixcmp(), but only when the parameters match specific patterns, so that they can be verified easily. Leftover from this will be fixed in a separate step, including idiotic conversions like if (!strncmp("foo", arg, 3)) => if (!(-prefixcmp(arg, "foo"))) This was done by using this script in px.perl #!/usr/bin/perl -i.bak -p if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) { s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|; } if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) { s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|; } and running: $ git grep -l strncmp -- '*.c' | xargs perl px.perl Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-21Merge branch 'maint'Junio C Hamano
* maint: Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c.
2007-02-21Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c.Jason Riedy
Thanks to Simon 'corecode' Schubert <corecode@fs.ei.tum.de> for the clean-up. Defining the C99 standard PRIuMAX when necessary replaces UM_FMT and the awkward UM10_FMT. There are no direct C99 translations for other uses of NO_C99_FORMAT in git, alas. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20Merge branch 'maint'Junio C Hamano
* maint: Obey NO_C99_FORMAT in fast-import.c. Add a compat/strtoumax.c for Solaris 8. git-clone: Sync documentation to usage note.
2007-02-20Obey NO_C99_FORMAT in fast-import.c.Jason Riedy
Define UM_FMT and UM10_FMT and use in place of %ju and %10ju, respectively. Both format as unsigned long long, so this assumes the compiler supports long long. Signed-off-by: Jason Riedy <jason@acm.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-14Merge branch 'jc/merge-base' (early part)Junio C Hamano
This contains an evil merge to fast-import, in order to resolve in_merge_bases() update.
2007-02-12fast-import: Support reusing 'from' and brown paper bag fix reset.Shawn O. Pearce
It was suggested on the mailing list that being able to use `from` in any commit to reset the current branch is useful in some types of importers, such as a darcs importer. We originally did not permit resetting an existing branch with a new `from` command during a `commit` command, but this restriction was only to help debug the hacked up cvs2svn that Jon Smirl was developing in parallel with git-fast-import. It is probably more of a problem to disallow it than to allow it. So now we permit a `from` during any `commit`. While making the changes required to permit multiple `from` commands on the same branch, I discovered we no longer needed the last_commit field to be set to 0 during a reset, so that was removed. (Reset was originally setting the field to 0 to signal cmd_from() that it was OK to execute on the branch.) While poking around in this section of fast-import I also realized the `reset` command was not working as intended if the corresponding `from` command was omitted (as allowed by the BNF grammar and the code). If `from` was omitted we cleared out the tree but we left the tree SHA-1 and parent commit SHA-1 intact. This is not what the user intended in this case. Instead they would be trying to reset the branch to have no parent and to have no tree, making the branch look new-born during the next commit. We now clear these SHA-1 values during `reset`, ensuring the branch looks new-born if `from` does not get supplied. New test cases for these were also added. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-12fast-import: Hide the pack boundary commits by default.Shawn O. Pearce
Most users don't need the pack boundary information that fast-import was printing to standard output, especially if they were calling it with --quiet. Those users who do want this information probably want it captured so they can go back and use it to repack the imported repository. So dumping the boundary commits to a log file makes more sense then printing them to standard output. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-07fast-import: Fix compile warningsJohannes Schindelin
Not on all platforms are size_t and unsigned long equivalent. Since I do not know how portable %z is, I play safe, and just cast the respective variables to unsigned long. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-07Don't crash fast-import if the marks cannot be exported.Shawn O. Pearce
Apparently fast-import used to die a horrible death if we were unable to open the marks file for output. This is slightly less than ideal, especially now that we dump the marks as part of the `checkpoint` command. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-07Dump all refs and marks during a checkpoint in fast-import.Shawn O. Pearce
If the frontend asks us to checkpoint (via the explicit checkpoint command) its probably because they are afraid the current import will crash/fail/whatever and want to make sure they can pickup from the last checkpoint. To do that sort of recovery, we will need the current tip of every branch and tag available at the next startup. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-07Teach fast-import how to sit quietly in the corner.Shawn O. Pearce
Often users will be running fast-import from within a larger frontend process, and this may be a frequent periodic tool such as a future edition of `git-svn fetch`. We don't want to bombard users with our large stats output if they won't be interested in it, so `--quiet` is now an option to make gfi be more silent. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-07Teach fast-import how to clear the internal branch content.Shawn O. Pearce
Some frontends may not be able to (easily) keep track of which files are included in the branch, and which aren't. Performing this tracking can be tedious and error prone for the frontend to do, especially if its foreign data source cannot supply the changed path list on a per-commit basis. fast-import now allows a frontend to request that a branch's tree be wiped clean (reset to the empty tree) at the start of a commit, allowing the frontend to feed in all paths which belong on the branch. This is ideal for a tar-file importer frontend, for example, as the frontend just needs to reformat the tar data stream into a gfi data stream, which may be something a few Perl regexps can take care of. :) Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06S_IFLNK != 0140000Junio C Hamano
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06Don't do non-fastforward updates in fast-import.Shawn O. Pearce
If fast-import is being used to update an existing branch of a repository, the user may not want to lose commits if another process updates the same ref at the same time. For example, the user might be using fast-import to make just one or two commits against a live branch. We now perform a fast-forward check during the ref updating process. If updating a branch would cause commits in that branch to be lost, we skip over it and display the new SHA1 to standard error. This new default behavior can be overridden with `--force`, like git-push and git-fetch. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06Support RFC 2822 date parsing in fast-import.Shawn O. Pearce
Since some frontends may be working with source material where the dates are only readily available as RFC 2822 strings, it is more friendly if fast-import exposes Git's parse_date() function to handle the conversion. This way the frontend doesn't need to perform the parsing itself. The new --date-format option to fast-import can be used by a frontend to select which format it will supply date strings in. The default is the standard `raw` Git format, which fast-import has always supported. Format rfc2822 can be used to activate the parse_date() function instead. Because fast-import could also be useful for creating new, current commits, the format `now` is also supported to generate the current system timestamp. The implementation of `now` is a trivial call to datestamp(), but is actually a whole whopping 3 lines so that fast-import can verify the frontend really meant `now`. As part of this change I have added validation of the `raw` date format. Prior to this change fast-import would accept anything in a `committer` command, even if it was seriously malformed. Now fast-import requires the '> ' near the end of the string and verifies the timestamp is formatted properly. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06Remove unnecessary null pointer checks in fast-import.Shawn O. Pearce
There is no need to check for a NULL pointer before invoking free(), the runtime library automatically performs this check anyway and does nothing if a NULL pointer is supplied. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06Correct minor style issue in fast-import.Shawn O. Pearce
Junio noticed that I was using a different style in fast-import for returned pointers than the rest of Git. Before merging this code into the main git.git tree I'd like to make it consistent, as this style variation was not intentional. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06Correct compiler warnings in fast-import.Shawn O. Pearce
Junio noticed these warnings/errors in fast-import when compiling with `-Werror -ansi -pedantic`. A few changes are to reduce compiler warnings, while one (in cmd_merge) is a bug fix. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06Remove --branch-log from fast-import.Shawn O. Pearce
The --branch-log option and its associated code hasn't been used in several months, as its not really very useful for debugging fast-import or a frontend. I don't plan on supporting it in this state long-term, so I'm killing it now before it gets distributed to a wider audience. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-06Don't support shell-quoted refnames in fast-import.Shawn O. Pearce
The current implementation of shell-style quoted refnames and SHA-1 expressions within fast-import contains a bad memory leak. We leak the unquoted strings used by the `from` and `merge` commands, maybe others. Its also just muddling up the docs. Since Git refnames cannot contain LF, and that is our delimiter for the end of the refname, and we accept any other character as-is, there is no reason for these strings to support quoting, except to be nice to frontends. But frontends shouldn't be expecting to use funny refs in Git, and its just as simple to never quote them as it is to always pass them through the same quoting filter as pathnames. So frontends should never quote refs, or ref expressions. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>