path: root/Documentation/technical/api-diff.txt
diff options
Diffstat (limited to 'Documentation/technical/api-diff.txt')
1 files changed, 166 insertions, 0 deletions
diff --git a/Documentation/technical/api-diff.txt b/Documentation/technical/api-diff.txt
new file mode 100644
index 0000000..822609b
--- /dev/null
+++ b/Documentation/technical/api-diff.txt
@@ -0,0 +1,166 @@
+diff API
+The diff API is for programs that compare two sets of files (e.g. two
+trees, one tree and the index) and present the found difference in
+various ways. The calling program is responsible for feeding the API
+pairs of files, one from the "old" set and the corresponding one from
+"new" set, that are different. The library called through this API is
+called diffcore, and is responsible for two things.
+* finding total rewrites (`-B`), renames (`-M`) and copies (`-C`), and
+ changes that touch a string (`-S`), as specified by the caller.
+* outputting the differences in various formats, as specified by the
+ caller.
+Calling sequence
+* Prepare `struct diff_options` to record the set of diff options, and
+ then call `diff_setup()` to initialize this structure. This sets up
+ the vanilla default.
+* Fill in the options structure to specify desired output format, rename
+ detection, etc. `diff_opt_parse()` can be used to parse options given
+ from the command line in a way consistent with existing git-diff
+ family of programs.
+* Call `diff_setup_done()`; this inspects the options set up so far for
+ internal consistency and make necessary tweaking to it (e.g. if
+ textual patch output was asked, recursive behaviour is turned on).
+* As you find different pairs of files, call `diff_change()` to feed
+ modified files, `diff_addremove()` to feed created or deleted files,
+ or `diff_unmerged()` to feed a file whose state is 'unmerged' to the
+ API. These are thin wrappers to a lower-level `diff_queue()` function
+ that is flexible enough to record any of these kinds of changes.
+* Once you finish feeding the pairs of files, call `diffcore_std()`.
+ This will tell the diffcore library to go ahead and do its work.
+* Calling `diffcore_flush()` will produce the output.
+Data structures
+* `struct diff_filespec`
+This is the internal representation for a single file (blob). It
+records the blob object name (if known -- for a work tree file it
+typically is a NUL SHA-1), filemode and pathname. This is what the
+`diff_addremove()`, `diff_change()` and `diff_unmerged()` synthesize and
+feed `diff_queue()` function with.
+* `struct diff_filepair`
+This records a pair of `struct diff_filespec`; the filespec for a file
+in the "old" set (i.e. preimage) is called `one`, and the filespec for a
+file in the "new" set (i.e. postimage) is called `two`. A change that
+represents file creation has NULL in `one`, and file deletion has NULL
+in `two`.
+A `filepair` starts pointing at `one` and `two` that are from the same
+filename, but `diffcore_std()` can break pairs and match component
+filespecs with other filespecs from a different filepair to form new
+filepair. This is called 'rename detection'.
+* `struct diff_queue`
+This is a collection of filepairs. Notable members are:
+ An array of pointers to `struct diff_filepair`. This
+ dynamically grows as you add filepairs;
+ The allocated size of the `queue` array;
+ The number of elements in the `queue` array.
+* `struct diff_options`
+This describes the set of options the calling program wants to affect
+the operation of diffcore library with.
+Notable members are:
+ The output format used when `diff_flush()` is run.
+ Number of context lines to generate in patch output.
+`break_opt`, `detect_rename`, `rename-score`, `rename_limit`::
+ Affects the way detection logic for complete rewrites, renames
+ and copies.
+ Number of hexdigits to abbrevate raw format output to.
+ A constant string (can and typically does contain newlines to
+ look for a block of text, not just a single line) to filter out
+ the filepairs that do not change the number of strings contained
+ in its preimage and postmage of the diff_queue.
+ This is mostly a collection of boolean options that affects the
+ operation, but some do not have anything to do with the diffcore
+ library.
+ Affects the way how a file that is seemingly binary is treated.
+ Tells the patch output format not to use abbreviated object
+ names on the "index" lines.
+ Tells the diffcore library that the caller is feeding unchanged
+ filepairs to allow copies from unmodified files be detected.
+ Output should be colored.
+ Output is a colored word-diff.
+ Tells diff-files that the input is not tracked files but files
+ in random locations on the filesystem.
+ Tells output routine that it is Ok to call user specified patch
+ output routine. Plumbing disables this to ensure stable output.
+ Do not show any output.
+ Tells the library that the calling program is feeding the
+ filepairs reversed; `one` is two, and `two` is one.
+ For communication between the calling program and the options
+ parser; tell the calling program to signal the presense of
+ difference using program exit code.
+ Internal; used for optimization to see if there is any change.
+ Affects if diff-files shows removed files.
+ Tells if tree traversal done by tree-diff should recursively
+ descend into a tree object pair that are different in preimage
+ and postimage set.