summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorJeff King <peff@peff.net>2014-08-28 12:32:58 (GMT)
committerJunio C Hamano <gitster@pobox.com>2014-08-28 17:32:52 (GMT)
commit75d3d6573ee809f35dd87ee4569e49b068ab4731 (patch)
treecbfd2347e1553b074749cd1f0e4ff25750e6231c /Documentation
parenta8722750985a53cc502a66ae3d68a9e42c7fdb98 (diff)
downloadgit-75d3d6573ee809f35dd87ee4569e49b068ab4731.zip
git-75d3d6573ee809f35dd87ee4569e49b068ab4731.tar.gz
git-75d3d6573ee809f35dd87ee4569e49b068ab4731.tar.bz2
docs/fast-export: explain --anonymize more completely
The original commit made mention of this option, but not why one might want it or how they might use it. Let's try to be a little more thorough, and also explain how to confirm that the output really is anonymous. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/git-fast-export.txt63
1 files changed, 59 insertions, 4 deletions
diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt
index 52831fa..dbe9a46 100644
--- a/Documentation/git-fast-export.txt
+++ b/Documentation/git-fast-export.txt
@@ -106,10 +106,9 @@ marks the same across runs.
different from the commit's first parent).
--anonymize::
- Replace all refnames, paths, blob contents, commit and tag
- messages, names, and email addresses in the output with
- anonymized data, while still retaining the shape of history and
- of the stored tree.
+ Anonymize the contents of the repository while still retaining
+ the shape of the history and stored tree. See the section on
+ `ANONYMIZING` below.
--refspec::
Apply the specified refspec to each ref exported. Multiple of them can
@@ -147,6 +146,62 @@ referenced by that revision range contains the string
'refs/heads/master'.
+ANONYMIZING
+-----------
+
+If the `--anonymize` option is given, git will attempt to remove all
+identifying information from the repository while still retaining enough
+of the original tree and history patterns to reproduce some bugs. The
+goal is that a git bug which is found on a private repository will
+persist in the anonymized repository, and the latter can be shared with
+git developers to help solve the bug.
+
+With this option, git will replace all refnames, paths, blob contents,
+commit and tag messages, names, and email addresses in the output with
+anonymized data. Two instances of the same string will be replaced
+equivalently (e.g., two commits with the same author will have the same
+anonymized author in the output, but bear no resemblance to the original
+author string). The relationship between commits, branches, and tags is
+retained, as well as the commit timestamps (but the commit messages and
+refnames bear no resemblance to the originals). The relative makeup of
+the tree is retained (e.g., if you have a root tree with 10 files and 3
+trees, so will the output), but their names and the contents of the
+files will be replaced.
+
+If you think you have found a git bug, you can start by exporting an
+anonymized stream of the whole repository:
+
+---------------------------------------------------
+$ git fast-export --anonymize --all >anon-stream
+---------------------------------------------------
+
+Then confirm that the bug persists in a repository created from that
+stream (many bugs will not, as they really do depend on the exact
+repository contents):
+
+---------------------------------------------------
+$ git init anon-repo
+$ cd anon-repo
+$ git fast-import <../anon-stream
+$ ... test your bug ...
+---------------------------------------------------
+
+If the anonymized repository shows the bug, it may be worth sharing
+`anon-stream` along with a regular bug report. Note that the anonymized
+stream compresses very well, so gzipping it is encouraged. If you want
+to examine the stream to see that it does not contain any private data,
+you can peruse it directly before sending. You may also want to try:
+
+---------------------------------------------------
+$ perl -pe 's/\d+/X/g' <anon-stream | sort -u | less
+---------------------------------------------------
+
+which shows all of the unique lines (with numbers converted to "X", to
+collapse "User 0", "User 1", etc into "User X"). This produces a much
+smaller output, and it is usually easy to quickly confirm that there is
+no private data in the stream.
+
+
Limitations
-----------