The Inner Guts of Bitbucket

Posted on August 11, 2014

Our teammate and Bitbucket engineer Erik Van Zijst recently presented at EuroPython 2014 in Berlin. Check out this video of his session, the Inner Guts of Bitbucket, for a detailed overview of our current architecture at every layer, from Gunicorn and Django through Celery and HAProxy to NFS.

Besides the inside scoop on Bitbucket’s inner workings, the video covers some war stories and shows that we too sometimes have to learn things the hard way.

  • Clive

    “Sometimes we too have to learn the hard way”…? What, you mean even brilliant, world-class and modest geniuses also have to learn the hard way?!

  • Robert Reiz (http://www.versioneye.com/)

    Thanks for the presentation. Very interesting to see how Bitbucket works. What’s the reason for running real hardware instead of virtual machines?

  • Jardel Weyrich

    Great presentation Erik! I like the simplicity of your architecture ;-)

    Your cache solution to minimize the impact of bcrypt-ing passwords all the time seems reasonable, although, IMHO, it does not tackle the underlying problem: the password shouldn’t be used to authenticate multiple times in a short period, especially when hashing it is an expensive operation. You partially solve that with API rate-limiting (which I think you already have); a decorator backed by a NoSQL store would do. Sites rely on cookies and sessions, and REST APIs can too. But there’s another, similar approach: once an account is authenticated, you could hand the client a time-limited authentication token based on HMAC. From that point on, the client can rely solely on this token for any operation that requires authentication, until it expires. You may implement a renewal process as well. Anyway, if a server gets compromised, the intruder can always modify the authentication API to log or save plaintext passwords. Can’t fight that.
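
The time-limited HMAC token suggested in the comment above could look roughly like the sketch below. It is only an illustration of the general idea, not how Bitbucket authenticates requests: the names `SECRET_KEY`, `issue_token` and `verify_token`, the `payload.signature` token layout, and the one-hour default lifetime are all assumptions made for this example. It uses only the Python standard library.

```python
import base64
import hashlib
import hmac
import time

# Hypothetical server-side secret; a real deployment would load it from secure config.
SECRET_KEY = b"replace-with-a-real-secret"


def _sign(payload):
    """Hex-encoded HMAC-SHA256 of the payload under the server secret."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(username, ttl_seconds=3600, now=None):
    """Build 'payload.signature' after the expensive password check has passed."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    payload = base64.urlsafe_b64encode(f"{username}:{expires}".encode()).decode()
    return f"{payload}.{_sign(payload)}"


def verify_token(token, now=None):
    """Return the username if the token is authentic and unexpired, else None."""
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None  # not exactly two dot-separated parts
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return None
    try:
        decoded = base64.urlsafe_b64decode(payload.encode()).decode()
        username, expires = decoded.rsplit(":", 1)
        expires = int(expires)
    except (ValueError, UnicodeDecodeError):
        return None
    current = time.time() if now is None else now
    return username if current < expires else None


if __name__ == "__main__":
    token = issue_token("erik", ttl_seconds=60)
    print(verify_token(token))        # username while the token is still valid
    print(verify_token(token + "0"))  # None: signature no longer matches
```

With this scheme the server runs bcrypt once per login and afterwards only computes a cheap HMAC per request. Renewal could be as simple as calling `issue_token` again for a token that still verifies. As the comment notes, none of this helps if the server itself is compromised.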