path: root/archive.c
Age | Commit message | Author
2014-10-08Merge branch 'nd/archive-pathspec'Junio C Hamano
"git archive" learned to filter what gets archived with pathspec. * nd/archive-pathspec: archive: support filtering paths with glob
2014-09-22archive: support filtering paths with globNguyễn Thái Ngọc Duy
This patch fixes two problems with using :(glob) (or even "*.c" without ":(glob)"). The first one is that we forgot to turn on the 'recursive' flag in struct pathspec. Without that, tree_entry_interesting() will not mark potential directories "interesting" so that it can confirm whether those directories have anything matching the pathspec. Marking directories interesting has a side effect: we need to walk inside a directory to realize that there's nothing interesting in there. By that time, the 'archive' code has already written the (empty) directory down. That means lots of empty directories in the resulting archive. This problem is fixed by lazily writing directories down when we know they are actually needed. There is a theoretical bug in this implementation: we can't write empty trees/directories that match the pathspec. path_exists() is also made stricter in order to detect non-matching pathspecs, because when the 'recursive' flag is on, we most likely match some directories. The easiest way is to not consider any directories "matched". Noticed-by: Peter Wu <> Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
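The lazy directory writing described above can be sketched as follows. This is a minimal illustration, not the code from archive.c: the names `lazy_writer`, `dir_frame`, `enter_dir`, `leave_dir` and `write_file` are hypothetical, and the "archive" is just an in-memory list of paths. A directory is only emitted once a file beneath it is actually written, so directories that turn out to contain nothing of interest never reach the output.

```c
#include <stddef.h>

#define SKETCH_MAX_DEPTH 16
#define SKETCH_MAX_OUT 64

/* One directory we are currently inside, and whether it was emitted yet. */
struct dir_frame {
	const char *path;
	int written;
};

/* Hypothetical writer state: a stack of open directories plus the output. */
struct lazy_writer {
	struct dir_frame stack[SKETCH_MAX_DEPTH];
	int depth;
	const char *out[SKETCH_MAX_OUT];
	int nr_out;
};

/* Record one archive entry (stands in for the real tar/zip writer). */
static void emit(struct lazy_writer *w, const char *path)
{
	if (w->nr_out < SKETCH_MAX_OUT)
		w->out[w->nr_out++] = path;
}

/* Entering a directory only remembers it; nothing is written yet. */
static int enter_dir(struct lazy_writer *w, const char *path)
{
	if (w->depth >= SKETCH_MAX_DEPTH)
		return -1;
	w->stack[w->depth].path = path;
	w->stack[w->depth].written = 0;
	w->depth++;
	return 0;
}

/* Leaving a directory drops it, whether or not it was ever written. */
static void leave_dir(struct lazy_writer *w)
{
	if (w->depth > 0)
		w->depth--;
}

/* Writing a file first flushes every enclosing directory not yet emitted. */
static void write_file(struct lazy_writer *w, const char *path)
{
	int i;

	for (i = 0; i < w->depth; i++) {
		if (!w->stack[i].written) {
			emit(w, w->stack[i].path);
			w->stack[i].written = 1;
		}
	}
	emit(w, path);
}
```

The sketch also shows the limitation the message mentions: an empty directory that does match the pathspec is never emitted, because nothing triggers the flush.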
2014-08-07archive.c: replace `git_config()` with `git_config_get_bool()` familyTanay Abhra
Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <> Reviewed-by: Matthieu Moy <> Signed-off-by: Junio C Hamano <>
2014-03-18Merge branch 'rm/strchrnul-not-strlen'Junio C Hamano
* rm/strchrnul-not-strlen: use strchrnul() in place of strchr() and strlen()
2014-03-10use strchrnul() in place of strchr() and strlen()Rohit Mani
Avoid scanning strings twice, once with strchr() and then with strlen(), by using strchrnul(). Helped-by: Junio C Hamano <> Signed-off-by: Rohit Mani <> Signed-off-by: Junio C Hamano <>
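To illustrate the difference, here is a small sketch comparing the two approaches. strchrnul() is a GNU extension, so the sketch uses a hypothetical stand-in, `sketch_strchrnul`, with the same contract; the helper names `first_component_len_twopass` and `first_component_len` are also made up for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the GNU strchrnul() extension: return a
 * pointer to the first occurrence of c in s, or to the terminating
 * NUL when c does not occur.
 */
static const char *sketch_strchrnul(const char *s, int c)
{
	while (*s && *s != (char)c)
		s++;
	return s;
}

/* Two passes on a miss: strchr() scans the string, then strlen() scans it again. */
static size_t first_component_len_twopass(const char *path)
{
	const char *slash = strchr(path, '/');

	return slash ? (size_t)(slash - path) : strlen(path);
}

/* One pass: the returned pointer is usable whether or not '/' was found. */
static size_t first_component_len(const char *path)
{
	return (size_t)(sketch_strchrnul(path, '/') - path);
}
```

Both helpers return the same length; the second simply avoids the conditional and the extra scan.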
2014-02-28add uploadarchive.allowUnreachable optionScott J. Goldman
In commit ee27ca4, we started restricting remote git-archive invocations to only accessing reachable commits. This matches what upload-pack allows, but does restrict some useful cases (e.g., HEAD:foo). We loosened this in 0f544ee, which allows `foo:bar` as long as `foo` is a ref tip. However, that still doesn't allow many useful things, like: 1. Commits accessible from a ref, like `foo^:bar`, which are reachable 2. Arbitrary sha1s, even if they are reachable. We can do a full object-reachability check for these cases, but it can be quite expensive if the client has sent us the sha1 of a tree; we have to visit every sub-tree of every commit in the worst case. Let's instead give site admins an escape hatch, in case they prefer the more liberal behavior. For many sites, the full object database is public anyway (e.g., if you allow dumb walker access), or the site admin may simply decide the security/convenience tradeoff is not worth it. This patch adds a new config option to disable the restrictions added in ee27ca4. It defaults to off, meaning there is no change in behavior by default. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2013-10-16archive.c: have SP around arithmetic operatorsJunio C Hamano
Signed-off-by: Junio C Hamano <>
2013-07-15archive: convert to use parse_pathspecNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2013-07-15move struct pathspec and related functions to pathspec.[ch]Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2013-03-11archive: handle commits with an empty treeJeff King
git-archive relies on get_pathspec to convert its argv into a list of pathspecs. When get_pathspec is given an empty argv list, it returns a single pathspec, the empty string, to indicate that everything matches. When we feed this to our path_exists function, we typically see that the pathspec turns up at least one item in the tree, and we are happy. But when our tree is empty, we erroneously think it is because the pathspec is too limited, when in fact it is simply that there is nothing to be found in the tree. This is a weird corner case, but the correct behavior is almost certainly to produce an empty archive, not to exit with an error. This patch teaches git-archive to create empty archives when there is no pathspec given (we continue to complain if a pathspec is given, since it obviously is not matched). It also confirms that the tar and zip writers produce sane output in this instance. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2012-12-18Add directory pattern matching to attributesJean-Noël AVILA
The manpage of gitattributes says: "The rules how the pattern matches paths are the same as in .gitignore files" and the gitignore pattern matching has a pattern ending with / for directory matching. This rule is specifically relevant for the 'export-ignore' rule used for git archive. Signed-off-by: Jean-Noel Avila <> Signed-off-by: Junio C Hamano <>
2012-08-22Reduce translations by using same terminologiesNguyễn Thái Ngọc Duy
Somewhere in help usage, we use both "message" and "msg", "command" and "cmd", "key id" and "key-id". This patch makes all help text from parseopt use the first form. Clearer and 3 fewer strings for translators. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2012-08-20i18n: archive: mark parseopt strings for translationNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2012-05-23Merge branch 'rs/archive-tree-in-tip-simplify'Junio C Hamano
By René Scharfe * rs/archive-tree-in-tip-simplify: archive-tar: keep const in checksum calculation archive: simplify refname handling
2012-05-18archive: simplify refname handlingRené Scharfe
There is no need to build a copy of the relevant part of the string just to make sure we have a NUL-terminated string. We can simply pass the length of the interesting part to dwim_ref() instead. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
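The idea of passing a length instead of copying can be sketched like this. The ref table and the functions `sketch_ref_exists` and `treeish_names_known_ref` are hypothetical and only stand in for dwim_ref(), whose real signature is not shown here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ref table; a real repository would consult its ref store. */
static const char *sketch_refs[] = { "HEAD", "master", "v1.0" };

/*
 * Counted-string lookup: exactly len bytes of name are compared, so the
 * caller does not need a NUL terminator at name[len].
 */
static int sketch_ref_exists(const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_refs) / sizeof(sketch_refs[0]); i++) {
		if (strlen(sketch_refs[i]) == len &&
		    !memcmp(sketch_refs[i], name, len))
			return 1;
	}
	return 0;
}

/* Check only the part before ':' without building a temporary copy. */
static int treeish_names_known_ref(const char *treeish)
{
	const char *colon = strchr(treeish, ':');
	size_t len = colon ? (size_t)(colon - treeish) : strlen(treeish);

	return sketch_ref_exists(treeish, len);
}
```

With this shape, "HEAD:Documentation" is checked as the four bytes "HEAD" directly inside the original string.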
2012-05-03archive: delegate blob reading to backendNguyễn Thái Ngọc Duy
archive-tar.c and archive-zip.c now perform conversion check, with help of sha1_file_to_archive() from archive.c This gives backends more freedom in dealing with (streaming) large blobs. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2012-01-13Merge branch 'jk/maint-upload-archive'Junio C Hamano
* jk/maint-upload-archive: archive: re-allow HEAD:Documentation on a remote invocation
2012-01-12archive: re-allow HEAD:Documentation on a remote invocationCarlos Martín Nieto
The tightening done in (ee27ca4a: archive: don't let remote clients get unreachable commits, 2011-11-17) went too far and disallowed HEAD:Documentation as it would try to find "HEAD:Documentation" as a ref. Only DWIM the "HEAD" part to see if it exists as a ref. Once we're sure that we've been given a valid ref, we follow the normal code path. This still disallows attempts to access commits which are not branch tips. Signed-off-by: Carlos Martín Nieto <> Signed-off-by: Junio C Hamano <>
2011-12-14Merge branch 'jk/maint-upload-archive'Junio C Hamano
* jk/maint-upload-archive: archive: don't let remote clients get unreachable commits
2011-11-21Merge branch 'jk/maint-1.6.2-upload-archive' into jk/maint-upload-archiveJunio C Hamano
* jk/maint-1.6.2-upload-archive: archive: don't let remote clients get unreachable commits Conflicts: archive.c archive.h builtin-archive.c builtin/upload-archive.c t/
2011-11-21archive: don't let remote clients get unreachable commitsJeff King
Usually git is careful not to allow clients to fetch arbitrary objects from the database; for example, objects received via upload-pack must be reachable from a ref. Upload-archive breaks this by feeding the client's tree-ish directly to get_sha1, which will accept arbitrary hex sha1s, reflogs, etc. This is not a problem if all of your objects are publicly reachable anyway (or at least public to anybody who can run upload-archive). Or if you are making the repo available by dumb protocols like http or rsync (in which case the client can read your whole object db directly). But for sites which allow access only through smart protocols, clients may be able to fetch trees from commits that exist in the server's object database but are not referenced (e.g., because history was rewound). This patch tightens upload-archive's lookup to use dwim_ref rather than get_sha1. This means a remote client can only fetch the tip of a named ref, not an arbitrary sha1 or reflog entry. This also restricts some legitimate requests, too: 1. Reachable non-tip commits, like: git archive --remote=$url v1.0~5 2. Sub-trees of reachable commits, like: git archive --remote=$url v1.7.7:Documentation Local requests continue to use get_sha1, and are not restricted at all. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2011-09-28archive.c: use OPT_BOOL()Junio C Hamano
The list variable (which is OPT_BOOLEAN) is initialized to 0 and only checked against 0 in the code, so it is safe to use OPT_BOOL(). The worktree_attributes variable (which is OPT_BOOLEAN) is initialized to 0 and later assigned to a field with the same name in struct archive_args, which is a bitfield of width 1. It is safe and even more correct to use OPT_BOOL() here; the new test in 5001 demonstrates why using OPT_COUNTUP is wrong. Signed-off-by: Junio C Hamano <>
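The following sketch shows one way a counting option can misbehave when its value ends up in a 1-bit field. It is an assumption about the kind of failure involved, not a reproduction of the test in 5001; `sketch_args`, `parse_countup`, `parse_bool` and `store_flag` are hypothetical names and do not use the parse-options API.

```c
/* Hypothetical argument struct with a 1-bit flag, as described above. */
struct sketch_args {
	unsigned worktree_attributes : 1;
};

/* Counter-style parsing: every occurrence of the option adds one. */
static int parse_countup(int occurrences)
{
	return occurrences;
}

/* Boolean-style parsing: any number of occurrences yields exactly 1. */
static int parse_bool(int occurrences)
{
	return occurrences > 0;
}

/* Store the parsed value in the bitfield; only the lowest bit survives. */
static unsigned store_flag(int parsed)
{
	struct sketch_args args;

	args.worktree_attributes = parsed;
	return args.worktree_attributes;
}
```

Giving the option twice produces 2 with the counter; stored in the unsigned 1-bit field that becomes 0, silently turning the flag off, whereas the boolean form stays 1.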
2011-08-04Rename git_checkattr() to git_check_attr()Michael Haggerty
Suggested by: Junio Hamano <> Signed-off-by: Michael Haggerty <> Signed-off-by: Junio C Hamano <>
2011-06-22upload-archive: allow user to turn off filtersJeff King
Some tar filters may be very expensive to run, so sites do not want to expose them via upload-archive. This patch lets users configure tar.<filter>.remote to turn them off. By default, gzip filters are left on, as they are about as expensive as creating zip archives. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2011-06-22archive: refactor file extension format-guessingJeff King
Git-archive will guess a format from the output filename if no format is explicitly given. The current function just hardcodes "zip" to the zip format, and leaves everything else NULL (which will default to tar). Since we are about to add user-specified formats, we need to be more flexible. The new rule is "if a filename ends with a dot and the name of a format, it matches that format". For the existing "tar" and "zip" formats, this is identical to the current behavior. For new user-specified formats, this will do what the user expects if they name their formats appropriately. Because we will eventually start matching arbitrary user-specified extensions that may include dots, the strrchr search for the final dot is not sufficient. We need to do an actual suffix match with each extension. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
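The suffix rule can be sketched as below. The function names `sketch_match_extension` and `sketch_guess_format` are hypothetical, and the list of formats is passed in by the caller rather than read from configuration. The point is that each format name is compared against the end of the filename, so an extension containing a dot (such as "tar.gz") still matches, which a search for the final dot alone would miss.

```c
#include <stddef.h>
#include <string.h>

/* True when filename ends with "." followed by ext. */
static int sketch_match_extension(const char *filename, const char *ext)
{
	size_t flen = strlen(filename);
	size_t elen = strlen(ext);

	if (flen <= elen)	/* no room for the dot before the extension */
		return 0;
	if (filename[flen - elen - 1] != '.')
		return 0;
	return !strcmp(filename + flen - elen, ext);
}

/* Return the first format whose name matches as a suffix, or NULL. */
static const char *sketch_guess_format(const char *filename,
				       const char **formats, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (sketch_match_extension(filename, formats[i]))
			return formats[i];
	return NULL;
}
```

A NULL result corresponds to "no format guessed", which the message says falls back to tar.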
2011-06-22archive: move file extension format-guessing lowerJeff King
The process for guessing an archive output format based on the filename is something like this: a. parse --output in cmd_archive; check the filename against a static set of mapping heuristics (right now it just matches ".zip" for zip files). b. if found, stick a fake "--format=zip" at the beginning of the arguments list (if the user did specify a --format manually, the later option will override our fake one) c. if it's a remote call, ship the arguments to the remote (including the fake), which will call write_archive on their end d. if it's local, ship the arguments to write_archive locally There are two problems: 1. The set of mappings is static and at too high a level. The write_archive level is going to check config for user-defined formats, some of which will specify extensions. We need to delay lookup until those are parsed, so we can match against them. 2. For a remote archive call, our set of mappings (or formats) may not match the remote side's. This is OK in practice right now, because all versions of git understand "zip" and "tar". But as new formats are added, there is going to be a mismatch between what the client can do and what the remote server can do. To fix (1), this patch refactors the location guessing to happen at the write_archive level, instead of the cmd_archive level. So instead of sticking a fake --format field in the argv list, we actually pass a "name hint" down the callchain; this hint is used at the appropriate time to guess the format (if one hasn't been given already). This patch leaves (2) unfixed. The name_hint is converted to a "--format" option as before, and passed to the remote. This means the local side's idea of how extensions map to formats will take precedence. Another option would be to pass the name hint to the remote side and let the remote choose. This isn't a good idea for two reasons: 1. There's no room in the protocol for passing that information. We can pass a new argument, but older versions of git on the server will choke on it. 2. Letting the remote side decide creates a silent inconsistency in user experience. Consider the case that the locally installed git knows about the "tar.gz" format, but a remote server doesn't. Running "git archive -o foo.tar.gz" will use the tar.gz format. If we use --remote, and the local side chooses the format, then we send "--format=tar.gz" to the remote, which will complain about the unknown format. But if we let the remote side choose the format, then it will realize that it doesn't know about "tar.gz" and output uncompressed tar without even issuing a warning. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2011-06-22archive: pass archiver struct to write_archive callbackJeff King
The current archivers are very static; when you are in the write_tar_archive function, you know you are writing a tar. However, to facilitate runtime-configurable archivers that will share a common write function we need to tell the function which archiver was used. As a convenience, we also provide an opaque data pointer in the archiver struct so that individual archivers can put something useful there when they register themselves. Technically they could just use the "name" field to look in an internal map of names to data, but this is much simpler. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2011-06-22archive: refactor list of archive formatsJeff King
Most of the tar and zip code was nicely split out into two abstracted files which knew only about their specific formats. The entry point to this code was a single "write archive" function. However, as these basic formats grow more complex (e.g., by handling multiple file extensions and format names), a static list of the entry point functions won't be enough. Instead, let's provide a way for the tar and zip code to tell the main archive code what they support by registering archiver names and functions. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
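A registration scheme of this kind might look roughly like the sketch below. Everything here is hypothetical (the `sketch_archiver` struct, the fixed-size table, and the function names); it only shows the general shape of format code announcing itself to a central list that is later searched by name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical archiver description: a name plus an entry point. */
struct sketch_archiver {
	const char *name;
	int (*write_archive)(void);
};

#define SKETCH_MAX_ARCHIVERS 8

/* Central table filled in at startup by the individual format files. */
static const struct sketch_archiver *sketch_archivers[SKETCH_MAX_ARCHIVERS];
static size_t sketch_nr_archivers;

/* Add an archiver to the table; returns -1 when the table is full. */
static int sketch_register_archiver(const struct sketch_archiver *ar)
{
	if (sketch_nr_archivers >= SKETCH_MAX_ARCHIVERS)
		return -1;
	sketch_archivers[sketch_nr_archivers++] = ar;
	return 0;
}

/* Find a registered archiver by name, or NULL if none is known. */
static const struct sketch_archiver *sketch_lookup_archiver(const char *name)
{
	size_t i;

	for (i = 0; i < sketch_nr_archivers; i++)
		if (!strcmp(sketch_archivers[i]->name, name))
			return sketch_archivers[i];
	return NULL;
}

/* Placeholder writers; real ones would produce tar or zip output. */
static int sketch_write_tar(void) { return 0; }
static int sketch_write_zip(void) { return 0; }

static const struct sketch_archiver sketch_tar = { "tar", sketch_write_tar };
static const struct sketch_archiver sketch_zip = { "zip", sketch_write_zip };
```

The main archive code then only needs the lookup function, and new names can be added without touching a hardcoded list of entry points.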
2011-06-15archive: reorder option parsing and config readingJeff King
The archive command does three things during its initialization phase: 1. parse command-line options 2. setup the git directory 3. read config During phase (1), if we see any options that do not require a git directory (like "--list"), we handle them immediately and exit, making it safe to abort step (2) if we are not in a git directory. Step (3) must come after step (2), since the git directory may influence configuration. However, this leaves no possibility of configuration from step (3) impacting the command-line options in step (1) (which is useful, for example, for supporting user-configurable output formats). Instead, let's reorder this to: 1. setup the git directory, if it exists 2. read config 3. parse command-line options 4. if we are not in a git repository, die This should have the same external behavior, but puts configuration before command-line parsing. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2011-03-25Convert read_tree{,_recursive} to support struct pathspecNguyễn Thái Ngọc Duy
This patch changes the behavior of the two functions. Previously they did prefix matching only. Now they can also do wildcard matching. All callers are updated. Some gain wildcard matching (archive, checkout), others reset pathspec_item.has_wildcard to retain the old behavior (ls-files, ls-tree, as they are plumbing). Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2010-11-15archive: improve --verbose descriptionRené Scharfe
Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2010-11-15add description parameter to OPT__VERBOSERené Scharfe
Allows better help text to be defined than "be verbose". Also make use of the macro in places that already had a different description. No object code changes intended. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2010-10-08Use angles for placeholders consistentlyŠtěpán Němec
Signed-off-by: Štěpán Němec <> Acked-by: Jonathan Nieder <> Signed-off-by: Junio C Hamano <>
2010-07-27archive: abbreviate substituted commit ids againJonathan Nieder
Given a file with: (define archive-id "$Format:%ct|%h|a$") and an export-subst attribute, the "%h" results in a full 40-digit object name instead of the expected 7-digit one. The export-subst feature requests unabbreviated object names because that is the low-level default. The effect was not observable until v1.7.1.1~17^2~3 (2010-05-03), which taught log --format=%h to respect the --abbrev option. Reported-by: Eli Barzilay <> Tested-by: Eli Barzilay <> Signed-off-by: Jonathan Nieder <> Signed-off-by: Junio C Hamano <>
2010-01-21Merge branch 'jc/conflict-marker-size'Junio C Hamano
* jc/conflict-marker-size: rerere: honor conflict-marker-size attribute rerere: prepare for customizable conflict marker length conflict-marker-size: new attribute rerere: use ll_merge() instead of using xdl_merge() merge-tree: use ll_merge() not xdl_merge() xdl_merge(): allow passing down marker_size in xmparam_t xdl_merge(): introduce xmparam_t for merge specific parameters git_attr(): fix function signature Conflicts: builtin-merge-file.c ll-merge.c xdiff/xdiff.h xdiff/xmerge.c
2010-01-17git_attr(): fix function signatureJunio C Hamano
The function took (name, namelen) as its arguments, but all the public callers wanted to pass a full string. Demote the counted-string interface to an internal API status, and allow public callers to just pass the string to the function. Signed-off-by: Junio C Hamano <>
2009-12-30archive: complain about path specs that don't match anythingRené Scharfe
Verify that all path specs match at least one path in the specified tree and reject those that don't. This would have made the bug fixed by 782a0005 easier to find. This implementation is simple to the point of being stupid. It walks the full tree for each path spec until it matches something. It's short and seems to be fast enough, though. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
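The check can be sketched as a nested loop: for each path spec, walk the list of paths until something matches. This is an illustration under simplifying assumptions, not the path_exists() implementation: the tree is a flat array of path strings, matching is plain leading-directory prefix matching, and the names `sketch_spec_matches` and `sketch_first_unmatched` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Prefix matching on path components: a spec matches a path when it is
 * equal to it or names one of its leading directories. An empty spec is
 * treated as matching everything (an assumption of this sketch).
 */
static int sketch_spec_matches(const char *spec, const char *path)
{
	size_t len = strlen(spec);

	if (!len)
		return 1;
	if (strncmp(spec, path, len))
		return 0;
	return path[len] == '\0' || path[len] == '/' || spec[len - 1] == '/';
}

/* Return the first spec that matches no path, or NULL if all match. */
static const char *sketch_first_unmatched(const char **specs, size_t nr_specs,
					  const char **paths, size_t nr_paths)
{
	size_t i, j;

	for (i = 0; i < nr_specs; i++) {
		int found = 0;

		/* Walk the whole "tree" again for each spec, stopping at a hit. */
		for (j = 0; j < nr_paths && !found; j++)
			found = sketch_spec_matches(specs[i], paths[j]);
		if (!found)
			return specs[i];
	}
	return NULL;
}
```

The cost is one walk per path spec, which matches the message's description of the approach as simple but fast enough.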
2009-10-20Refactor pretty_print_commit arguments into a structThomas Rast
pretty_print_commit() has a bunch of rarely-used arguments, and introducing more of them requires yet another update of all the call sites. Refactor most of them into a struct to make future extensions easier. The ones that stay "plain" arguments were chosen on the grounds that all callers put real arguments there, whereas some callers have 0/NULL for all arguments that were factored into the struct. We declare the struct 'const' to ensure none of the callers are bitten by the changed (no longer call-by-value) semantics. Signed-off-by: Thomas Rast <> Signed-off-by: Junio C Hamano <>
2009-10-09Merge branch 'rs/maint-archive-prefix'Junio C Hamano
* rs/maint-archive-prefix: Git archive and trailing "/" in prefix
2009-10-09Git archive and trailing "/" in prefixRené Scharfe
With --prefix=string that does not end with a slash, the top-level entries are written out with the specified prefix as expected, but no paths in the directories are added. Fix this by adding the prefix in write_archive_entry() instead of leaving it to the get_pathspec() and read_tree_recursive() pair; they are designed to only handle prefixes that are path components. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
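Treating the prefix as a plain string that is prepended to each entry name, rather than as a directory component, can be sketched as follows. The helper `sketch_entry_name` is hypothetical and is not the code in write_archive_entry(); it only shows why a prefix without a trailing slash works once it is applied per entry.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the name stored in the archive by prepending prefix verbatim.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int sketch_entry_name(char *out, size_t outlen,
			     const char *prefix, const char *path)
{
	int n = snprintf(out, outlen, "%s%s", prefix, path);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With the prefix "proj-" the entry "dir/file.c" becomes "proj-dir/file.c", and with "proj/" it becomes "proj/dir/file.c"; both cases go through the same code path.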
2009-09-13git-archive: add '-o' as an alias for '--output'Dmitry Potapov
The '-o' option is commonly used in many tools to specify the output file. Typing '--output' every time is a bit too long to be a practical alternative to redirecting output. But specifying the output name has the advantage of making it possible to guess the desired output format from the filename extension. Signed-off-by: Dmitry Potapov <> Signed-off-by: Junio C Hamano <>
2009-05-25parse-opts: prepare for OPT_FILENAMEStephen Boyd
To give OPT_FILENAME the prefix, we pass the prefix to parse_options() which passes the prefix to parse_options_start() which sets the prefix member of parse_opts_ctx accordingly. If there isn't a prefix in the calling context, passing NULL will suffice. Signed-off-by: Stephen Boyd <> Signed-off-by: Junio C Hamano <>
2009-04-18archive: do not read .gitattributes in working directoryNguyễn Thái Ngọc Duy
The old behaviour still remains with --worktree-attributes, and it is always on for the legacy "git tar-tree". Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2009-03-08archive: use parseopt for local-only optionsRené Scharfe
Replace the hand-rolled parsers that find and remove --remote and --exec by a parseopt parser that also handles --output. All three options only have a meaning if no remote server is used or on the local side. They must be rejected by upload-archive and should not be sent to the server by archive. We can't use a single parser for both remote and local side because the remote end possibly understands a different set of options than the local side. A local parser would then wrongly accuse options valid on the other side as being incorrect. This patch implements a very forgiving parser that understands only the three options mentioned above. All others are passed to the normal, complete parser in archive.c (running either locally in archive, or remotely in upload-archive). This normal parser definition contains dummy entries for the three options, in order for them to appear in the help screen. The parseopt parser allows multiple occurrences of --remote and --exec unlike the previous one; the one specified last wins. This looseness is acceptable, I think. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2009-03-04git-archive: add --output=<file> to send output to a fileCarlos Manuel Duclos Vergara
When archiving a repository there is no way to specify a file as output. This patch adds a new option "--output" that redirects the output to a file instead of stdout. Signed-off-by: Carlos Manuel Duclos Vergara <> Signed-off-by: Junio C Hamano <>
2009-02-07tree.c: allow read_tree_recursive() to traverse gitlink entriesLars Hjemli
When the callback function invoked from read_tree_recursive() returns the value `READ_TREE_RECURSIVE` for a gitlink entry, the traversal will now continue into the tree connected to the gitlinked commit. This functionality can be used to allow inter-repository operations, but since the current users of read_tree_recursive() do not yet support such operations, they have been modified where necessary to make sure that they never return READ_TREE_RECURSIVE for gitlink entries (hence no change in behaviour should be introduced by this patch alone). Signed-off-by: Lars Hjemli <> Signed-off-by: Junio C Hamano <>
2008-10-26Merge branch 'maint'Junio C Hamano
* maint: add -p: warn if only binary changes present git-archive: work in bare repos git-svn: change dashed git-config to git config
2008-10-26git-archive: work in bare reposCharles Bailey
This moves the call to git_config to a place where it doesn't break the logic for using git archive in a bare repository but retains the fix to make git archive respect core.autocrlf. Tests are by René Scharfe. Signed-off-by: Charles Bailey <> Tested-by: Deskin Miller <> Signed-off-by: Junio C Hamano <>
2008-10-12Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializerBrandon Casey
Many call sites use strbuf_init(&foo, 0) to initialize local strbuf variable "foo" which has not been accessed since its declaration. These can be replaced with a static initialization using the STRBUF_INIT macro which is just as readable, saves a function call, and takes up fewer lines. Signed-off-by: Brandon Casey <> Signed-off-by: Shawn O. Pearce <>
2008-10-03archive.c: make archiver staticNanako Shiraishi
This variable is not used anywhere outside. Signed-off-by: Nanako Shiraishi <> Signed-off-by: Shawn O. Pearce <>