The Future of Performance: Scaling

Fri Jan 9 05:41:30 UTC 2009

On Thu, Jan 8, 2009 at 10:23 PM, David Miller <justdave at bugzilla.org> wrote:
> We have one master and two slave databases.  The slaves are behind a
> load balancer.  Bugzilla only knows the master database and the load
> balancer's IPs.  The status check on the load balancer that checks to
> see if the host that it's sending traffic to is actually there also
> checks the replication lag, and will pull it out of the rotation if it
> gets more than a few minutes behind.  This in turn takes load off that
> server, allowing it to catch up, and then it'll get put back in.  And I
> imagine it'll probably switch back and forth a lot, as replication
> blocks when people run queries, and people frequently run queries on
> bugzilla.mozilla.org that take 3 or 4 minutes to run.

And have the sorry server be the main db server, or something? I've
thought about doing something like that at a previous job, but never
done it. The other option is to only drop a server if it's more than a
certain amount behind the other server (or the most up-to-date server
if there are more than two), rather than behind the master.
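
To make that second option concrete, here is a rough Python sketch of
the check I have in mind. It is only an illustration, not anything
Bugzilla or the load balancer actually ships: the function name, the
replica names and the 180-second threshold are all made up, and I'm
assuming the lag figure comes from something like Seconds_Behind_Master
in SHOW SLAVE STATUS (with None standing in for a replica that is down
or not replicating).

```python
from typing import Dict, Optional, Set


def replicas_in_rotation(
    lag_seconds: Dict[str, Optional[float]],
    max_extra_lag: float = 180.0,  # assumed threshold, in seconds
) -> Set[str]:
    """Return the replicas that should stay behind the load balancer.

    lag_seconds maps a (hypothetical) replica name to its lag behind the
    master in seconds, or None if the replica is unreachable or has
    stopped replicating.
    """
    # Ignore replicas that cannot report a lag at all.
    healthy = {name: lag for name, lag in lag_seconds.items() if lag is not None}
    if not healthy:
        # Nothing usable; the caller could fall back to the master here.
        return set()

    # Compare each replica with the most up-to-date replica, not the master.
    best = min(healthy.values())
    return {name for name, lag in healthy.items() if lag - best <= max_extra_lag}
```

With both replicas 12 to 13 minutes behind the master, for example
replicas_in_rotation({"slave1": 720.0, "slave2": 800.0}), both stay in
rotation, whereas a fixed "a few minutes behind the master" rule would
drop both of them at once.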

> Currently, during busy times of day, b.m.o is often in the 10 to 15
> minute range on replication lag (there's only one slave live at the
> moment, the second one is something new I'm setting up, more for
> redundancy reasons than capacity).

Any idea why? With InnoDB, read queries shouldn't block replication.

Bradley