20 December, 2015 (published)
22 December, 2015 (last modified)

Prevent and detect source code tampering

These days security professionals are upset about two malicious vulnerabilities found in the firewalls running ScreenOS, a product of Juniper. Wild speculation abounds as to who did it, when it was done, how it was done and where it was done. Both problems appear to be backdoors: exactly the type of backdoor that security agencies like the NSA want US software companies to install.

In this post we present a simple practice that would help prevent and detect tampering with source code in SVN/Git repositories. In addition, the practice will protect the software company, and its developers, in legal disputes. Our suggestion is certainly not a panacea, but it would narrow down the possibilities for attack.

Open or closed source

The mantra in the world of software developers is that code should be open source. Only then can it be audited independently. We do not agree. We have written about the problems with open source and about its alternative, open functionality. We are not going to repeat the arguments here. Our suggestion in the present post would - if implemented - allow for immediate detection of changes to source code and would also indicate which files were modified, be it closed or open source.

Attackers cannot beat hashes

Sophisticated attackers, often supported by governments, are extremely capable. Changing files on an SVN server without changing the revision number is kindergarten level for them. In addition, there is the threat that one of the people with access to the file system is a bad guy. However smart attackers are, there is one thing nobody can do in practice: change a file without changing the hash of that file, provided a cryptographic hash function is used.
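
As a small illustration of this point, the sketch below hashes two versions of a line of code. SHA-256 from Python's standard library is our assumption here, since the post does not prescribe a hash function, and the two code lines are invented for the example.

```python
import hashlib

# Two invented versions of the same source line: the second one contains
# a hypothetical backdoor condition.
original = b"if (password_ok(user, pw)) grant_access();\n"
tampered = b"if (password_ok(user, pw) || backdoor(pw)) grant_access();\n"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# The two digests differ, so the modification is visible to anyone
# who holds the digest of the original.
print(sha256_hex(original))
print(sha256_hex(tampered))
```

Even a one-byte change yields a completely different digest, which is what makes a stored or published hash usable as a tamper sensor.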

Hashing of source files

We therefore suggest hashing all your source files and keeping - or publishing - these hashes. The procedure we always use is:

  1. Analyze all source files and classify each of them as either ascii or binary. This analysis likely needs to be done only once in the lifetime of the project
  2. Do a fresh checkout of the relevant revision number
  3. Convert all ascii files to files with Unix-style linefeeds (first replace all "\r\n" by "\n", then all "\r" by "\n"). Beware that many file types that look binary, png files for instance, still suffer from linefeed replacements. Treat them as ascii, as otherwise hashes on Linux will differ from hashes on Windows
  4. Sort all files by their full file paths
  5. Hash the content of each file
  6. Hash the concatenated content of all files, in sorted order, to obtain a single "grand hash" for the revision
  7. Write all the file paths and their hashes to a single file
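
Steps 3 to 7 could be sketched in Python roughly as follows. This is a minimal sketch, not our production script: SHA-256, the suffix-based ascii classification in ASCII_SUFFIXES, the skipping of .svn/.git directories and the layout of the output file are all assumptions made for the illustration.

```python
import hashlib
from pathlib import Path

# Hypothetical result of step 1: file suffixes classified as ascii.
ASCII_SUFFIXES = {".c", ".h", ".py", ".txt", ".xml"}
# Assumption: version control metadata is not part of the source.
SKIPPED_DIRS = {".svn", ".git"}


def normalize_linefeeds(data: bytes) -> bytes:
    # Step 3: replace CRLF first, then any remaining lone CR.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def hash_checkout(root, ascii_suffixes=ASCII_SUFFIXES):
    """Return ([(relative path, file hash), ...], grand hash) for a checkout."""
    base = Path(root)
    rel_paths = []
    for p in base.rglob("*"):
        rel = p.relative_to(base)
        if p.is_file() and not SKIPPED_DIRS & set(rel.parts):
            # Forward slashes keep the sort order identical on all platforms.
            rel_paths.append(rel.as_posix())
    rel_paths.sort()  # step 4

    grand = hashlib.sha256()
    entries = []
    for rel in rel_paths:
        data = (base / rel).read_bytes()
        if Path(rel).suffix.lower() in ascii_suffixes:
            data = normalize_linefeeds(data)
        entries.append((rel, hashlib.sha256(data).hexdigest()))  # step 5
        grand.update(data)  # step 6: hash of the concatenated content
    return entries, grand.hexdigest()


def write_hash_file(entries, grand_hash, out_path):
    # Step 7: one "<hash>  <path>" line per file, grand hash on the last line.
    with open(out_path, "w", encoding="utf-8", newline="\n") as f:
        for rel, digest in entries:
            f.write(f"{digest}  {rel}\n")
        f.write(f"{grand_hash}  *grand hash*\n")
```

A call such as `entries, grand = hash_checkout("checkout_dir")` followed by `write_hash_file(entries, grand, "hashes.txt")` would produce the hash file, where both names are placeholders. The grand hash here covers file contents only, following step 6 literally; file paths are recorded in the per-file list.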

We have found that this procedure generates stable and identical hashes whatever platform is used to do the checkout. We use several Windows, OSX and Linux systems and always get the same hashes.

Publishing of hashes

The full hash file should be saved outside the repository. It could well be published. At the very least the "grand hash" of the full revision should be published. From then on, any change in the source files can be detected. This check for tampering can be automated with a cron job.
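
The automated check could look roughly like the sketch below, which compares a trusted hash file with a freshly generated one and reports which files differ. The line format "<hash>  <path>", the function names and the sample data are assumptions made for this illustration.

```python
def parse_hash_file(text: str) -> dict:
    """Parse lines of the assumed form '<hash>  <path>' into {path: hash}."""
    result = {}
    for line in text.splitlines():
        if line.strip():
            digest, path = line.split(None, 1)
            result[path] = digest
    return result


def compare_hashes(trusted: dict, fresh: dict) -> dict:
    """Report paths whose hash changed, plus added and removed paths."""
    common = trusted.keys() & fresh.keys()
    return {
        "modified": sorted(p for p in common if trusted[p] != fresh[p]),
        "added": sorted(fresh.keys() - trusted.keys()),
        "removed": sorted(trusted.keys() - fresh.keys()),
    }


# Invented sample data; real hashes would be full-length digests.
trusted_text = "aaa111  src/auth.c\nbbb222  src/main.c\nccc333  README.txt\n"
fresh_text = "aaa999  src/auth.c\nbbb222  src/main.c\nddd444  src/extra.c\n"

report = compare_hashes(parse_hash_file(trusted_text), parse_hash_file(fresh_text))
print(report)
```

A cron job could run such a comparison after each fresh checkout and raise an alert whenever any of the three lists is non-empty.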


In addition, later disputes or accusations about the content of closed source can be settled by showing the source code together with the previously published hashes. This defense should carry considerable weight in court.


For large and complex code bases - including open source projects - the hashing procedure would be a quick and automatic sensor to detect that something fishy has taken place.

