2.1.3. Some related Unix tools
You may have wondered what the hashes that git uses are about.
In general, a hash function takes data of any size as input and produces a fixed-size hash value from it. You can try this yourself, e.g. by running the following command on Unix:
$ sha1sum with.py
4f2fb68a29c3a1f9978be115a1798371a57e9ae9  with.py
Here, we run the command sha1sum, which calculates the SHA-1 hash of the file with.py. If you don’t have sha1sum, you can try e.g. md5sum (or possibly md5 on macOS). Hash functions can have the following properties:
- A small change to the input (e.g. adding one byte) may result in a completely different hash
- The hash function may be cryptographically secure, i.e. it is computationally infeasible to modify the input data such that the resulting hash stays the same
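The first property can be tried out in Python as well. The following is a minimal sketch using the standard hashlib module; the input bytes are made up for illustration and are not the actual contents of with.py.

```python
import hashlib

# Illustrative input data (an assumption, not the real with.py)
original = b"with open('test.txt', 'w') as f:\n"
modified = original + b"\n"  # the same data with one extra byte appended

h1 = hashlib.sha1(original).hexdigest()
h2 = hashlib.sha1(modified).hexdigest()

# Count the positions at which the two hex digests differ
differing = sum(a != b for a, b in zip(h1, h2))

print(h1)
print(h2)
print(differing, "of", len(h1), "hex digits differ")
```

Although the inputs differ by a single byte, the two digests should have very little in common.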
In general, if you know the hash of a file, you can recalculate the hash later and compare the two values to check whether the file has been modified or corrupted. git uses hashes to uniquely identify commits and to protect against data corruption.
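Such a check could look roughly like the sketch below. The helper names sha1_of_file and is_unmodified are hypothetical and chosen only for this example; the file is read in chunks so that large files do not have to fit in memory.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path, expected_hex):
    """Return True if the file's current SHA-1 hash matches the known one."""
    return sha1_of_file(path) == expected_hex.lower()
```

For example, is_unmodified("with.py", "4f2fb68a29c3a1f9978be115a1798371a57e9ae9") would return True only as long as with.py still has the contents that were hashed with sha1sum above.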
Exercise: Look up the definition of the SHA-1 hash function online. You can e.g. find an implementation in pseudocode.
2.1.3.1. diff and patch
“git diff” shows the difference between a file before and after a change in a practical format:
$ git diff
diff --git a/with.py b/with.py
index f61db97..d63b0bf 100644
--- a/with.py
+++ b/with.py
@@ -1,3 +1,3 @@
 with open('test.txt', 'w') as f:
     for i in xrange(5):
-        f.write("%f %f\n" % (0.2, 0.5))
+        f.write("%f %f\n" % (0.0, 1.0))
In general, you can diff any two files by running the utility “diff”. Conventionally, the switch “-u” is used to display the output in unified format, which is also what git uses by default:
$ diff -u with2.py with.py
--- with2.py 2018-03-25 22:34:47.530840487 +0200
+++ with.py 2018-03-25 22:05:25.477035716 +0200
@@ -1,3 +1,3 @@
 with open('test.txt', 'w') as f:
     for i in xrange(5):
-        f.write("%f %f\n" % (0.0, 1.0))
+        f.write("%f %f %f\n" % (0.0, 0.5, 1.0))
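The unified format can also be produced from within Python. The sketch below uses difflib.unified_diff from the standard library; the two file contents are typed in as strings here (an assumption for the sake of a self-contained example) rather than read from with2.py and with.py on disk.

```python
import difflib

# Contents of the two files, embedded as raw strings for illustration
old_text = r'''with open('test.txt', 'w') as f:
    for i in xrange(5):
        f.write("%f %f\n" % (0.0, 1.0))
'''
new_text = r'''with open('test.txt', 'w') as f:
    for i in xrange(5):
        f.write("%f %f %f\n" % (0.0, 0.5, 1.0))
'''

# unified_diff expects lists of lines that keep their line endings
diff_lines = list(difflib.unified_diff(
    old_text.splitlines(keepends=True),
    new_text.splitlines(keepends=True),
    fromfile="with2.py",
    tofile="with.py",
))

print("".join(diff_lines), end="")
```

Apart from the timestamps in the header, which are omitted here, the printed output should have the same structure as the output of “diff -u”: lines starting with a space are unchanged context, lines starting with “-” are removed, and lines starting with “+” are added.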
It can be useful to redirect the diff output to a file (e.g. by running “diff -u with2.py with.py > with.diff”). There’s another utility called patch which takes the output from diff and uses it to actually make the changes to a file, i.e. to patch it. Let’s say someone sent us the above diff output as the file with.diff and we had our file with.py which we wanted to patch:
$ patch -p0 < with.diff
patching file with.py
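Here, “patch” will modify our with.py according to the diff. The switch “-p0” tells patch to use the file names given in the diff as they are, without stripping any leading directory components.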