Bitbucket downtime for a hardware upgrade

To badly paraphrase everyone’s favorite Wall Crawler series, with great success comes great responsibility. Bitbucket has grown fast – faster than we were ready for.

We're aware that there have been on-going stability and performance issues. That is why we’re happy to announce that on Monday, August 30, 01:00 GMT we’ll be moving off the Amazon EC2 system to a dedicated server deployment, professionally managed at Contegix.

The current Amazon EC2 setup looks like this:

2 x m1.small
2 x c1.xlarge
2 x m2.4xlarge

Many of the problems that we have are related to disk I/O and memory which is why we’ve chosen to move to a physical machine setup.

When we switch to Contegix, we’ll be switching to:

5 x Dell R610, 32Gb ram, 16 core
Storage, Dell MD1120 DAS array
22 600Gb 10krpm disks RAID10 (~2.4tb final storage)
Redundant backbone providers

Expected Downtime

Over the last month we’ve been putting together a plan that limits downtime, which should be limited to 1 hour of downtime.

The main part of that is moving the database, and everyone’s repositories will be moved over gradually. Your repositories will still be available during the transition, however while each individual repository is being migrated, they will be in read-only mode. This should only be for 10-60 seconds, even for the largest ones. Chances are you may not even notice it.

We'd like to thank everyone for their patience in helping us get this far.

Be sure to check back soon for some very exciting updates, and look forward to a more stable, faster Bitbucket!

Expected Downtime

Moving server

Outage incident and our new monitoring setup

Downtime Postmortem