author Jeff King <peff@peff.net> 2014-08-27 17:01:28 (GMT)
committer Junio C Hamano <gitster@pobox.com> 2014-08-27 17:42:16 (GMT)
commit a8722750985a53cc502a66ae3d68a9e42c7fdb98
tree 31892a12522ed3f217be7a3ce96c4e5d2c3173be
parent 6c4ab27f2378ce67940b4496365043119d7ffff2
teach fast-export an --anonymize option
Sometimes users want to report a bug they experience on
their repository, but they are not at liberty to share the
contents of the repository. It would be useful if they could
produce a repository that has a similar shape to its history
and tree, but without leaking any information. This
"anonymized" repository could then be shared with developers
(assuming it still replicates the original problem).

This patch implements an "--anonymize" option to
fast-export, which generates a stream that can recreate such
a repository. Producing a single stream makes it easy for
the caller to verify that they are not leaking any useful
information. You can get an overview of what will be shared
by running a command like:

  git fast-export --anonymize --all |
  perl -pe 's/\d+/X/g' |
  sort -u |
  less

which will show every unique line we generate, modulo any
numbers (each anonymized token is assigned a number, like
"User 0", and we replace it consistently in the output).

In addition to anonymizing, this produces test cases that
are relatively small (compared to the original repository)
and fast to generate (compared to using filter-branch, or
modifying the output of fast-export yourself). Here are
numbers for git.git:

  $ time git fast-export --anonymize --all \
         --tag-of-filtered-object=drop >output
  real    0m2.883s
  user    0m2.828s
  sys     0m0.052s

  $ gzip output
  $ ls -lh output.gz | awk '{print $5}'
  2.9M

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
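The overview pipeline from the commit message can be wrapped in a small shell function so that it is easy to rerun before sharing a stream. This is a minimal sketch, not part of the patch: the name `summarize_stream` is hypothetical, and `sed -E` stands in for the `perl -pe` step on the assumption that only ASCII digits need to be collapsed.

```shell
# Hypothetical helper (not part of git): read a fast-export stream on stdin,
# collapse every run of ASCII digits to "X", and print each distinct
# resulting line once, in byte order.
summarize_stream() {
	sed -E 's/[0-9]+/X/g' | LC_ALL=C sort -u
}

# Intended use, assuming a git that supports --anonymize:
#   git fast-export --anonymize --all | summarize_stream | less
```

Because numbered tokens such as "User 0" and "User 12" both collapse to "User X", the summary stays short enough to read in full, which is the point of the check.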
Diffstat (limited to 'Documentation/git-fast-export.txt')
-rw-r--r-- Documentation/git-fast-export.txt | 6
1 file changed, 6 insertions, 0 deletions
diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt
index 221506b..52831fa 100644
--- a/Documentation/git-fast-export.txt
+++ b/Documentation/git-fast-export.txt
@@ -105,6 +105,12 @@ marks the same across runs.
in the commit (as opposed to just listing the files which are
different from the commit's first parent).
+--anonymize::
+ Replace all refnames, paths, blob contents, commit and tag
+ messages, names, and email addresses in the output with
+ anonymized data, while still retaining the shape of history and
+ of the stored tree.
+
--refspec::
Apply the specified refspec to each ref exported. Multiple of them can
be specified.
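The commit message describes each anonymized token as being assigned a number (like "User 0") that is then reused consistently in the output. The idea can be illustrated with a toy mapping in shell and awk. This is only a sketch of the concept, not git's implementation; the function name `number_tokens` and the one-token-per-line input format are assumptions made for illustration.

```shell
# Toy illustration (hypothetical, not how git implements --anonymize):
# map each distinct input line to "User N", where N is assigned in order
# of first appearance and reused whenever the same line is seen again.
number_tokens() {
	awk '{
		# First sighting of this line: hand out the next number.
		if (!($0 in seen)) seen[$0] = n++
		print "User " seen[$0]
	}'
}
```

For example, `printf 'alice\nbob\nalice\n' | number_tokens` prints `User 0`, `User 1`, `User 0`: the repeated name gets the same replacement, so relationships between tokens survive even though the original text does not.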