Dependency Management in Version Control

Hello everyone,

A few months ago I came up with a pretty great way to manage third-party dependencies in projects and I’d like to share it with the world. In my experience, dependency management has been pretty straightforward. If I am working on a project that uses three different open source libraries, normally I would compile them into binaries and commit the headers and the libraries into version control. There are several drawbacks to this approach:

  1. It consumes an enormous amount of storage space on the server. On a standard HDD, this opens the door for more fragmentation and could actually affect the performance of your version control system.
  2. It consumes an outrageous amount of bandwidth. Whether you all have your own tube or you share one, this is bad: checking in new libraries or updating existing ones could mean a 30-minute update that blocks your entire team.
  3. Depending on the version control system you use, it could slow down the performance of your working copy. For example, updates could become slower because there are more directories to recurse and larger files to process. With Subversion, the extra recursion depth can really kill your update speed locally, not just in bandwidth.
  4. If you are targeting multiple compilers or platforms, you will have to build and maintain a separate set of binaries for each third-party library, for every compiler and platform you support. This is a maintenance nightmare for a team.

Some teams have solved a few of these issues by creating a compressed archive of all of the pre-built third-party libraries the project depends on and placing it on a server separate from the version control system. This creates a few issues of its own, however, such as complicating the checkout process. Ideally, a checkout should be as close to one step as possible. It also doesn’t solve some of the more annoying issues, such as point #4 in the list above.

I believe I’ve come up with a solution that addresses all of the issues above, and in my opinion it is the best way to maintain third-party dependencies for a project. In the simple case, write a script that will download, build, and install each individual third-party library your project depends on. Here, “install” means copying each library’s headers and binaries into the project, arranged in the directory layout the project expects. If you write this script in a portable language such as Python, the same script can build those libraries appropriately for every platform you are going to support.
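As a rough illustration, such a script might look like the sketch below. It is not the HareSVN script. The `Dependency` record, the `third_party/<name>/include` and `lib` layout, the `.deps_cache` folder, and the assumption that each library ships as a single source archive are all hypothetical choices made for the example.

```python
"""Sketch of a download/build/install script for third-party dependencies."""
import platform
import shutil
import subprocess
import urllib.request
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Dependency:
    name: str         # hypothetical library name
    version: str
    url: str          # location of the source archive (placeholder)
    build_cmds: dict  # platform key -> command list, run inside the source tree
    headers: str      # public header directory, relative to the source tree
    libs: tuple       # glob patterns for built binaries, relative to the source tree


def platform_key() -> str:
    """Return a lowercase OS name such as 'windows', 'linux', or 'darwin'."""
    return platform.system().lower()


def fetch(dep: Dependency, work_dir: Path) -> Path:
    """Download the source archive (unless cached) and unpack it."""
    work_dir.mkdir(parents=True, exist_ok=True)
    archive = work_dir / f"{dep.name}-{dep.version}.tar.gz"
    if not archive.exists():
        urllib.request.urlretrieve(dep.url, archive)
    src = work_dir / f"{dep.name}-{dep.version}"
    shutil.unpack_archive(archive, src)
    return src


def build(dep: Dependency, src: Path) -> None:
    """Run the build command registered for the host platform."""
    cmd = dep.build_cmds.get(platform_key())
    if cmd is None:
        raise RuntimeError(f"no build recipe for {dep.name} on {platform_key()}")
    subprocess.run(cmd, cwd=src, check=True)


def install(dep: Dependency, src: Path, project_root: Path) -> Path:
    """Copy headers and binaries into third_party/<name>/{include,lib}."""
    dest = project_root / "third_party" / dep.name
    if dest.exists():
        shutil.rmtree(dest)  # start clean so stale files never linger
    shutil.copytree(src / dep.headers, dest / "include")
    (dest / "lib").mkdir(parents=True, exist_ok=True)
    for pattern in dep.libs:
        for binary in src.glob(pattern):
            shutil.copy2(binary, dest / "lib")
    return dest


def bootstrap(deps, project_root: Path) -> None:
    """Fetch, build, and install every dependency for the host platform."""
    work_dir = project_root / ".deps_cache"
    for dep in deps:
        src = fetch(dep, work_dir)
        build(dep, src)
        install(dep, src, project_root)
```

A project would keep only this script and its list of `Dependency` entries in version control, and a fresh checkout would call `bootstrap(deps, Path("."))` once. Because the build command is looked up per platform, supporting another compiler or operating system means adding one more entry to `build_cmds` rather than committing another set of binaries.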

I have implemented such a script in my HareSVN project and it works pretty well. For now, however, it only works on Windows, as I do not have the resources available to test on other platforms. But if anyone is curious to see it in action, feel free to do a checkout and test it for yourself!