The Future of Performance: Scaling

Fri Jan 9 04:23:43 UTC 2009

Steve Wendt wrote on 1/8/09 10:20 PM:
> On 1/8/2009 5:52 PM, Max Kanat-Alexander wrote:
> 
>>> Besides being MySQL specific, the privileges required for running
>>> that are probably not what you want your web processes running with.
>>> I've thought about trying to do something like this myself
>>> (completely unrelated to Bugzilla), and never came up with a good
>>> solution.
>>
>>     I think Eclipse has a cron script or something that routinely
>> checks the slave replication delay and switches the shadowdb parameters
>> over to match the master DB if the replication lag is longer than 30
>> seconds or so.
> 
> Which means you have to make sure that the cron script runs "frequently
> enough" - it's not very elegant, because it is completely independent of
> actual traffic.  I suppose you could have something from the web side
> trigger it to run, but you'd have to be careful about file permissions
> so you didn't get yourself into trouble...

What I'm in the middle of setting up at Mozilla is this:

We have one master and two slave databases.  The slaves are behind a
load balancer.  Bugzilla only knows the IPs of the master database and
the load balancer.  The load balancer's status check, which verifies
that the host it's sending traffic to is actually there, also checks
the replication lag, and will pull that host out of the rotation if it
gets more than a few minutes behind.  This in turn takes load off that
server, allowing it to catch up, and then it'll get put back in.  I
imagine it'll probably switch back and forth a lot, as replication
blocks when people run queries, and people frequently run queries on
bugzilla.mozilla.org that take 3 or 4 minutes to run.
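
The decision such a status check makes could be sketched roughly as
follows.  This is only an illustration, not the actual Mozilla setup:
the function names and the 180-second threshold (standing in for "a few
minutes") are assumptions, and the input is assumed to be one row of
MySQL's SHOW SLAVE STATUS as a dict, where Seconds_Behind_Master is
NULL (None here) when replication isn't running.

```python
from typing import Mapping, Optional

# Assumption: "a few minutes" is taken to mean 180 seconds.
MAX_LAG_SECONDS = 180


def replica_in_rotation(status: Optional[Mapping[str, object]],
                        max_lag_seconds: int = MAX_LAG_SECONDS) -> bool:
    """Decide whether a slave should keep receiving traffic.

    `status` is one row of SHOW SLAVE STATUS as a dict, or None if the
    host could not be reached at all.
    """
    if status is None:
        return False  # host is not there
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        return False  # replication is stopped or broken
    return int(lag) <= max_lag_seconds


def health_check_http_status(status: Optional[Mapping[str, object]],
                             max_lag_seconds: int = MAX_LAG_SECONDS) -> int:
    """Map the decision to what an HTTP-style status check would return."""
    return 200 if replica_in_rotation(status, max_lag_seconds) else 503
```

A load balancer polling something like this would drop the host on a
503 and re-add it once the lag falls back under the threshold, which is
also why it may flap when long queries hold replication up.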

Currently, during busy times of day, b.m.o is often in the 10 to 15
minute range on replication lag (there's only one slave live at the
moment; the second one is something new I'm setting up, more for
redundancy reasons than capacity).

Although, thinking about it...  searches are the main thing that take
lots of time and cause replication lag.  What if Bugzilla itself knew
about two distinct slave servers, and used one for bug searches, and the
other for everything else?  Then the "everything else" one would almost
always have almost zero lag because the rest of the queries Bugzilla
runs are so quick.
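
As a rough sketch of that idea (in Python for brevity rather than
Bugzilla's Perl, and not anything Bugzilla does today), the routing
could be as simple as a lookup by query type.  The host names and the
QueryKind categories below are made up for illustration.

```python
from enum import Enum


class QueryKind(Enum):
    WRITE = "write"            # anything that modifies data
    SEARCH = "search"          # bug searches, the slow ones
    OTHER_READ = "other_read"  # every other read-only query


# Hypothetical host names; a real install would take these from config.
ROUTES = {
    QueryKind.WRITE: "master-db",
    QueryKind.SEARCH: "slave-search",
    QueryKind.OTHER_READ: "slave-general",
}


def pick_database(kind: QueryKind) -> str:
    """Return the database host a query of this kind should go to."""
    return ROUTES[kind]
```

With this split, a 3 or 4 minute search would only hold up replication
on the search slave, and the general slave would see nothing but the
quick queries.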

-- 
Dave Miller                                   http://www.justdave.net/
System Administrator, Mozilla Corporation      http://www.mozilla.com/
Project Leader, Bugzilla Bug Tracking System  http://www.bugzilla.org/