Smart mirroring: the cure for poor Git performance

Posted on February 9, 2016

<This is a cross-post from Atlassian Blogs>

Are you on one of those teams that finds all kinds of ways to stretch the limits of its development tools? If you’re at a big company, working on big projects stored in big repositories – possibly repos that are shared with teammates across multiple continents – the answer is probably “yes”.

Using Git at massive scale can be so inefficient that it poisons your team’s productivity. So I want to bring you up to speed on the antidote we’ve developed. It’s called smart mirroring, and it’s now available in Bitbucket Data Center.

Does your team need smart mirroring?

Some teams do, some don’t. But the teams who do need it have a few things in common.

For starters, we’re talking about teams with hundreds, if not thousands, of developers. A few thousand coders will tax any repository server, even if that server is sitting just a few feet away from you. And as important as the performance of Bitbucket itself is, there are other factors in play.

We’ve noticed that an increasing number of large development teams using Git are geographically distributed, with little or no control over the network performance between themselves and their Bitbucket instance. These teams suffer from high latency, and every operation they perform competes for limited bandwidth. On top of that, these same teams often need to work with large repositories for a variety of reasons (sometimes even good reasons!).

All these factors conspire to rob developers of valuable time, making them wait long periods – often hours – to clone a large repository from across the globe. It can get so bad that people have resorted to sending portable drives around the world via mail. That kinda sucks.

If this sounds like you, smart mirroring will help.

How smart mirroring improves Git performance

Bitbucket Data Center has always been able to run multiple application nodes in a local cluster to help serve all those users and build bots that demand performance and availability. Smart mirroring takes the performance improvements a step further for Git read operations in a way that’s tailored for distributed teams working with large repositories.

It works by setting up one or more active mirror servers to operate with read-only copies of repositories in remote locations, automatically kept up-to-date from the primary Bitbucket instance. A mirror can host all of your primary instance’s repositories, or just a subset. Mirror servers delegate user authorization and authentication to the primary server, so no additional user management is required. And you can connect as many mirrors to your Bitbucket Data Center instance as you need at no additional cost.

Aside from dramatically improved Git performance, developers are automatically presented with alternate clone locations in the Bitbucket interface, so administrators don’t have to provide extra training. Once set up, the mirrors are fully self-serve.
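
Because mirrors hold read-only copies, a working copy can fetch from a nearby mirror while still pushing to the primary instance. Here is a minimal sketch of that split using Git's standard `git remote set-url` command. The hostnames and repository path are made up for illustration, and your Bitbucket setup may handle some of this for you, so treat it as one possible approach rather than the official procedure.

```python
import shlex


def mirror_remote_commands(mirror_url, primary_url, remote="origin"):
    """Build git commands that fetch from a mirror but push to the primary.

    Nothing is executed here; the function only returns argument lists.
    """
    return [
        # Reads (clone, fetch, pull) go to the nearby read-only mirror.
        ["git", "remote", "set-url", remote, mirror_url],
        # Writes (push) still go to the primary Bitbucket instance.
        ["git", "remote", "set-url", "--push", remote, primary_url],
    ]


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    commands = mirror_remote_commands(
        mirror_url="https://mirror-sydney.example.com/scm/proj/big-repo.git",
        primary_url="https://bitbucket.example.com/scm/proj/big-repo.git",
    )
    for argv in commands:
        print(shlex.join(argv))
```

Run inside an existing clone, the two printed commands leave fetch traffic on the mirror and push traffic on the primary.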

Predicting performance improvements

The performance gains you can expect depend on network bandwidth and repository size. What it basically comes down to is how slow remote cloning is for you today. In a simple test, we saw that a 5GB repository took over an hour to clone between San Francisco and Sydney. But with smart mirroring, that time dropped by a factor of 25 to just a few minutes.
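
To get a feel for those numbers, here is a back-of-the-envelope sketch. It assumes clone time is simply repository size divided by effective throughput, and it rounds "over an hour" down to exactly 60 minutes. Real clones also depend on latency, compression and server load, so treat the output as a rough guide only. The function names are ours, not part of any Bitbucket tooling.

```python
def implied_throughput_mbps(repo_gb, clone_minutes):
    """Effective throughput (megabits/s) implied by an observed clone time."""
    megabits = repo_gb * 8 * 1000  # assumes 1 GB = 1000 MB
    return megabits / (clone_minutes * 60)


def estimated_clone_minutes(repo_gb, throughput_mbps):
    """Rough clone time in minutes at a given effective throughput."""
    megabits = repo_gb * 8 * 1000
    return megabits / throughput_mbps / 60


if __name__ == "__main__":
    # Assumption: the 5GB clone took exactly 60 minutes over the remote link.
    remote = implied_throughput_mbps(5, 60)
    # Assumption: the mirror delivers 25 times the effective throughput.
    mirror = remote * 25
    for label, mbps in (("remote", remote), ("mirror", mirror)):
        minutes = estimated_clone_minutes(5, mbps)
        print(f"{label}: {mbps:.1f} Mbit/s, about {minutes:.1f} minutes")
```

Under those assumptions an hour-long clone shrinks to 60 / 25 = 2.4 minutes, which matches the "just a few minutes" we saw in the test. Plug in your own repository size and today's clone time to estimate what a mirror might save you.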

One customer told us about a remote user whose clone took 9 hours. (9 f’ing hours!) A team like that could expect an even bigger improvement – basically, a whole working day given back to each developer who clones a large repo.

Imagine: the mobile team in Bangalore, that web team in London, the secret project team working from a lab in Thailand… all able to reap the benefits of Git, without suffering from the tyranny of distance.

Whether you’re just adopting Git or already a guru, your distributed teams should be able to make cool stuff without unnecessary delays. If you’re ready to talk to a real live human about smart mirroring and how it can help your team, get in touch with one of our customer advocates using our handy contact form. If you’ve temporarily lost your ability to speak due to banging your head against your desk while wondering if that 8GB clone is ever going to complete (or you’re busy hand-delivering a hard copy of it to your teammate in Poland), you can get more information about our Data Center offerings online.

Interested in learning more? Join Roger Barnes, Senior Bitbucket Product Manager, for a webinar on March 3rd at 11:00am PST, CET, and AEST.

Register for Webinar

Did you find this post helpful? Please share it on your social network of choice and help your fellow Git users end their performance woes!

2 Comments

  • Posted March 24, 2016 at 1:04 am | Permalink

    I find it really slow to check out repos just to search through the code. Please fix: https://bitbucket.org/site/master/issues/2874/ability-to-search-source-code-bb-39

    • Raj Sarkar
      Posted March 25, 2016 at 7:22 am | Permalink

      We just launched code search for Bitbucket Server. We’re excited that we’ve finally delivered code search to Bitbucket Server customers, and we know our cloud customers are also interested in the functionality. Thanks for the feedback, and we’ll share an update as soon as we have one!