git-send-pack
=============
 
Overall operation
-----------------
 
. Connects to the remote side and invokes git-receive-pack.
 
. Learns what refs the remote has and which commits they point
  at.  Matches them to the refspecs we are pushing.
 
. Checks if there are non-fast-forwards.  Unlike fetch-pack,
  the repository send-pack runs in is supposed to be a superset
  of the recipient in fast-forward cases, so there is no need
  for want/have exchanges, and the fast-forward check can be
  done locally.  Reports the result to the other end.
 
. Calls pack_objects() which generates a packfile and sends it
  over to the other end.
 
. If the remote side is new enough (v1.1.0 or later), waits for
  the unpack and hook status from the other end.
 
. Exits with appropriate error codes.
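
The local fast-forward check in the third step amounts to asking
whether the commit the remote ref currently points at is an
ancestor of the commit being pushed.  A minimal Python sketch over
a toy commit graph follows; the `parents` mapping, the `history`
data and the name `is_fast_forward` are illustrative assumptions,
not send-pack's actual data structures.

```python
from collections import deque


def is_fast_forward(parents, old, new):
    """Return True if `old` is reachable from `new` via parent links.

    `parents` maps a commit id to a list of its parent ids.  It is a
    hypothetical stand-in for the local object database.
    """
    seen = set()
    queue = deque([new])
    while queue:
        commit = queue.popleft()
        if commit == old:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        queue.extend(parents.get(commit, []))
    return False


# Toy history: A <- B <- C on one line, and D branching off A.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"]}
```

With this history, moving a remote ref from B to C is a
fast-forward, while moving it from C to D is not, because C cannot
be reached from D.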
 
 
Pack_objects pipeline
---------------------
 
This function gets one file descriptor (`out`) which is either a
socket (over the network) or a pipe (local).  What's written to
this fd goes to git-receive-pack to be unpacked.
 
    send-pack ---> fd ---> receive-pack
 
It forks once, but does not wait for the child.  The reason for
this is unclear.
 
The forked child calls rev_list_generate() with that file
descriptor (while the parent closes `out` -- the child will be
the one that writes the packfile to the other end).
 
    send-pack
       |
       rev-list-generate ---> fd ---> receive-pack
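
The hand-off described above, where the parent closes `out` and the
forked child becomes the only writer, can be sketched in Python as
follows.  This is an illustration under assumptions, not the actual
C code: a local pipe stands in for the connection to receive-pack,
and `fork_writer` and `demo` are hypothetical names.  Unlike the
code described here, the sketch reaps its child.

```python
import os


def fork_writer(out_fd, payload):
    """Fork a child that writes `payload` to `out_fd`; return its pid.

    The parent closes its own copy of `out_fd`, so the child is the
    only writer and the reader sees EOF once the child exits.
    """
    pid = os.fork()
    if pid == 0:
        # Child: the one process that writes to the other end.
        try:
            os.write(out_fd, payload)
        finally:
            os._exit(0)
    # Parent: give up the descriptor.
    os.close(out_fd)
    return pid


def demo():
    # A local pipe stands in for the fd connected to receive-pack.
    read_end, write_end = os.pipe()
    pid = fork_writer(write_end, b"packfile bytes")
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break  # EOF: every copy of the write end is closed
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)  # the sketch waits for the child
    return b"".join(chunks)
```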
 
 
Then rev-list-generate creates a pipe and forks; the child
will become a pipeline "rev-list --stdin | pack-objects", which
is the rev_list() function, while the parent feeds that pipeline
the list of refs.
 
    send-pack
       |
       rev-list-generate ---> fd ---> receive-pack
          | ^ (pipe)
          v |
         rev-list
 
The child process, before calling rev-list, rearranges the file
descriptors:
 
. what it reads from rev-list-generate via the pipe becomes its
  stdin; this feeds the upstream end of the pipeline, which will
  be the git-rev-list process.
 
. what it writes to its stdout goes to the fd connected to
  receive-pack.
 
On the other hand, the parent process, before starting to feed
the child pipeline, closes the reading side of the pipe and the
fd connected to receive-pack.
 
    send-pack
       |
       rev-list-generate
          |
          v [0]
         rev-list [1] ---> receive-pack
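
The descriptor rearrangement above can be sketched with
`os.dup2()`.  This is a hedged illustration, not the original C
code: a small Python one-liner (`CHILD_PROGRAM`, which upper-cases
its input) stands in for the rev-list side, a second pipe stands in
for the connection to receive-pack, and `feed_child` is a
hypothetical name.

```python
import os
import sys

# Stand-in for the rev-list side: copies stdin to stdout in upper case.
CHILD_PROGRAM = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def feed_child(refs):
    """Fork a child with rearranged fds, feed it `refs`, return its output."""
    pipe_r, pipe_w = os.pipe()  # parent -> child (the list of refs)
    out_r, out_w = os.pipe()    # child -> stand-in for receive-pack

    pid = os.fork()
    if pid == 0:
        try:
            os.dup2(pipe_r, 0)  # what the parent writes becomes stdin
            os.dup2(out_w, 1)   # stdout goes towards "receive-pack"
            for fd in (pipe_r, pipe_w, out_r, out_w):
                if fd > 2:
                    os.close(fd)
            os.execv(sys.executable, [sys.executable, "-c", CHILD_PROGRAM])
        finally:
            os._exit(127)  # reached only if exec failed

    # Parent: close the reading side of the pipe and the output fd.
    os.close(pipe_r)
    os.close(out_w)
    os.write(pipe_w, refs)
    os.close(pipe_w)  # lets the child see EOF on its stdin

    # The demo also plays the receiving end so the result is visible.
    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(out_r)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

The sketch writes the refs in one `os.write()` call, which is
adequate for a short list but would need a loop for input larger
than the pipe buffer.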
 
The parent then writes to the pipe and later closes it.  There
is a commented-out waitpid that would wait for the rev-list side
before it exits; again, the reason is unclear.
 
The rev-list function further sets up a pipe and forks to run
git-rev-list piped to git-pack-objects.  The child side, before
exec'ing git-pack-objects, rearranges the file descriptors:
 
. what it reads from the pipe becomes the stdin; this gets the
  list of objects from the git-rev-list process.
 
. its stdout is already connected to receive-pack, so what it
  generates goes there.
 
The parent process arranges its file descriptors before exec'ing
git-rev-list:
 
. its stdout is sent to the pipe to feed git-pack-objects.
 
. its stdin is already connected to rev-list-generate and will
  read the set of refs from it.
 
 
    send-pack
       |
       rev-list-generate
          |
          v [0]
          git-rev-list [1] ---> [0] git-pack-objects [1] ---> receive-pack
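
The final shape, with one process feeding refs into a two-stage
pipeline, can be sketched with Python's `subprocess` module.  This
is an illustration under assumptions: `run_pipeline` is a
hypothetical helper, and `FAKE_REV_LIST` and `FAKE_PACK_OBJECTS`
are small stand-in programs so the example runs without a
repository.  Unlike the code described above, where the parent and
child exec the two programs themselves, the sketch spawns both
stages from one supervising process.

```python
import subprocess
import sys


def run_pipeline(upstream_argv, downstream_argv, refs):
    """Run `upstream | downstream`, feed `refs` in, return the final output."""
    upstream = subprocess.Popen(
        upstream_argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    downstream = subprocess.Popen(
        downstream_argv, stdin=upstream.stdout, stdout=subprocess.PIPE)
    # Drop our copy of the connecting pipe so that downstream sees
    # EOF as soon as upstream exits.
    upstream.stdout.close()

    upstream.stdin.write(refs)
    upstream.stdin.close()

    output = downstream.stdout.read()
    downstream.stdout.close()
    upstream.wait()
    downstream.wait()
    return output


# Stand-ins: the first maps each input line to a fake object name,
# the second joins everything it reads into one fake "pack".
FAKE_REV_LIST = [
    sys.executable, "-c",
    "import sys\nfor line in sys.stdin: sys.stdout.write('obj-of-' + line)",
]
FAKE_PACK_OBJECTS = [
    sys.executable, "-c",
    "import sys; sys.stdout.write('PACK:' + ','.join(sys.stdin.read().split()))",
]
```

In a real repository the two argument lists would name
git-rev-list and git-pack-objects instead of the stand-ins.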