Why quicksearch sucks
Andreas Franke
afranke at ags.uni-sb.de
Sat Nov 16 05:04:56 UTC 2002
Hi Bradley, hi all,
> The main problem is that quicksearch entries are inherently slow. We do
> a substring search on the description OR status whiteboard for each
> word. This means that the best we can get away with is a linear search
> of all the bugs in the database.
That's true. Note that quicksearch has been designed for an
existing bugzilla backend, with the goal of
(1) getting a good approximation of "what the b.m.o. user wants"
(2) within a "reasonable" amount of time on b.m.o.
At the time of the design, and with respect to the typical load
of b.m.o at that time, typical requests with 1 - 3 words were
served within an acceptable time interval.
If this situation has changed (e.g. backend code changed,
number of users / queries / bugs has gone up, typical
server load has increased), then there are always two
alternative approaches:
(a) adapt quicksearch to the new situation, or
(b) change the situation so that quicksearch still works.
In our case, the bugzilla backend code seems to be the only
parameter of the situation that we can change; and on the
quicksearch side I think changing the search semantics
is the only parameter we can tune. Let's keep these two
alternatives in mind.
> However, we also do a substring match on the product/components. Before
> the upgrade, this was just an extra string search. After the upgrade +
> bug 43600, this is now a string search on the product/component table,
> and an integer equality test on the bugs table, via a JOIN.
> [...]
> So, how do we fix it? The correct fix is to use a subselect for the
> product id:
> [...]
> That then gets us back to the previous search, but probably slightly
> faster because of the number checks.
>
> Which is great, except that mysql doesn't support SUBSELECTs.
>
> So the alternate solution is to do what was done in bug 21700, and
> look up stuff in joined tables first. I want to do this via some sort of
> wrapper func which would use subselects for database servers, and a
> separate select + IN for mysql.
I think this is a good solution. In my experience with a (non-bugzilla)
mysql app, the app<->db communication overhead is hardly a problem.
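To make the two code paths concrete, here is a minimal sketch of such
a wrapper. It is only an illustration, not the actual Bugzilla code:
it uses Python with an in-memory SQLite database so that it runs
standalone, and the table and column names (products.id, products.name,
bugs.bug_id, bugs.product_id), the function names and the
supports_subselect flag are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: products(id, name) and bugs(bug_id, product_id, short_desc).
PRODUCT_FILTER = "SELECT id FROM products WHERE instr(lower(name), lower(?)) > 0"


def matching_bug_ids(conn, needle, supports_subselect):
    """Return ids of bugs whose product name contains `needle`."""
    if supports_subselect:
        # One round trip: the product lookup is a subselect.
        sql = (
            "SELECT bug_id FROM bugs "
            f"WHERE product_id IN ({PRODUCT_FILTER}) ORDER BY bug_id"
        )
        return [row[0] for row in conn.execute(sql, (needle,))]

    # Fallback for servers without subselects: look up the product ids
    # first, then run a second query with an explicit IN list.
    product_ids = [row[0] for row in conn.execute(PRODUCT_FILTER, (needle,))]
    if not product_ids:
        return []  # an empty IN () list would be a syntax error
    placeholders = ", ".join("?" * len(product_ids))
    sql = (
        "SELECT bug_id FROM bugs "
        f"WHERE product_id IN ({placeholders}) ORDER BY bug_id"
    )
    return [row[0] for row in conn.execute(sql, product_ids)]


def demo_connection():
    """Build a tiny in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE bugs (
            bug_id INTEGER PRIMARY KEY, product_id INTEGER, short_desc TEXT);
        INSERT INTO products VALUES (1, 'Bugzilla'), (2, 'Browser');
        INSERT INTO bugs VALUES
            (10, 1, 'quicksearch is slow'),
            (11, 2, 'crash on startup'),
            (12, 1, 'query page layout');
        """
    )
    return conn
```

Both branches should return the same bug ids; the only difference is
whether the product lookup happens inside the database or as a
separate query issued by the application.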
Also, if it is too easy to submit queries with a 50000 bugs result
set by accident, then it may make sense to define a default limit
for a result set (maybe somewhere in the range of 2000-5000, possibly
as a per-user pref), so that you need to explicitly ask for the
_complete_ result set if you really want it, something like
&all_results=1 . I'm not sure whether this would obsolete the
need for temp tables on disk, but if it does it may be worth
considering.
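A rough sketch of how such a default limit could be decided, again in
Python and only as an illustration. The all_results parameter is the
one suggested above; the function names, the default of 2000 and the
idea of appending a LIMIT clause to the generated SQL are assumptions
for the example.

```python
from urllib.parse import parse_qs

DEFAULT_LIMIT = 2000  # assumed default; could come from a per-user pref


def result_limit(query_string, default_limit=DEFAULT_LIMIT):
    """Return the row limit for a request, or None for 'no limit'."""
    params = parse_qs(query_string)
    # Only an explicit all_results=1 lifts the limit.
    if params.get("all_results", ["0"])[-1] == "1":
        return None
    return default_limit


def apply_limit(sql, limit):
    """Append a LIMIT clause unless the complete result set was requested."""
    if limit is None:
        return sql
    return f"{sql} LIMIT {int(limit)}"
```

For example, apply_limit("SELECT bug_id FROM bugs", result_limit(qs))
leaves the statement unchanged only when qs contains all_results=1.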
So this would be a possible solution on the bugzilla backend side,
dealing with the performance difference for some kinds of queries
caused by the latest code changes. On the other side, maybe as a
short-term solution, we could adapt quicksearch by just changing
the semantics to _not_ include product / component by default,
so that you'd have to explicitly specify the "area prefix" (":")
if you want to do substring matches in product and component names,
and only there. This makes it a bit less intuitive for new users
who are not familiar with the actual product/component hierarchy
used by mozilla.org, but most of the queries would still find
most of the relevant bugs, so having quicksearch available again
with that changed semantics may be an improvement over it being
disabled completely. And when the real solution is well tested
and checked in, the semantics can always be changed back if so
desired.
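The difference between the two semantics can be sketched as follows.
This is not the code from quicksearch.js, just a small Python model
that covers plain words and the ":" prefix and emits the same
field/type/value parameters that appear in the query URLs at the end
of this message. The function and constant names are made up for the
example.

```python
from urllib.parse import urlencode

# Fields searched for a plain word under the original and the changed semantics.
OLD_DEFAULT_FIELDS = ["product", "component", "short_desc", "status_whiteboard"]
NEW_DEFAULT_FIELDS = ["short_desc", "status_whiteboard"]
# Fields searched for a word with the ":" area prefix (same in both versions).
AREA_FIELDS = ["product", "component"]


def chart_params(query, default_fields):
    """Translate a quicksearch string into (name, value) chart parameters.

    Each word gets its own chart number; the fields tried for one word
    share that chart and row and differ only in the last index.
    """
    params = []
    for chart, word in enumerate(query.split()):
        if word.startswith(":"):
            fields, value = AREA_FIELDS, word[1:]
        else:
            fields, value = default_fields, word
        for column, field in enumerate(fields):
            key = f"{chart}-0-{column}"
            params.append((f"field{key}", field))
            params.append((f"type{key}", "substring"))
            params.append((f"value{key}", value))
    return params


if __name__ == "__main__":
    # Print the chart part of the query string for the "expert" example.
    print(urlencode(chart_params(":bugzilla quicksearch", NEW_DEFAULT_FIELDS)))
```

With NEW_DEFAULT_FIELDS a plain word no longer produces any product
or component clause, which is exactly the change proposed here.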
I compared the behavior of b.m.o. for a specific example query:
1) the original semantics:
http://www.ps.uni-sb.de/~afranke/moz0/quicksearch.html
2) the new semantics:
http://www.ps.uni-sb.de/~afranke/moz1/quicksearch.html
The patch for quicksearch.js (as a unified diff) is here:
http://www.ps.uni-sb.de/~afranke/quicksearch.js.diff
1a) currently on b.m.o., the search for
bugzilla quicksearch
takes about 17-19 seconds and returns 14 bugs.
1b) the slightly more expert query
:bugzilla quicksearch
takes about 14-19 seconds and returns 13 bugs.
This is almost the same
as the previous result set, missing only bug 159451
("Bugzilla QuickSearch broken in default Chimera install")
which is in a non-bugzilla product because it is probably
caused by the browser, not the bugzilla code; so we may
tolerate this inaccuracy.
2a) with the semantics changed to not search
the product and component names by default,
the first query
bugzilla quicksearch
takes only 4-5 seconds, but returns only a single
bug (159451) as a result, since "bugzilla" does
not occur in the summary (etc.) in the other bugs.
2b) the second query
:bugzilla quicksearch
takes 11-12 seconds and returns the same 13 bugs
as 1b).
So with this change, all the non-expert quicksearch queries
where the user just enters some word fragments should be
as fast as before the backend code change, but with slightly
different semantics. But I suspect only very few people will
notice the difference.
And even "expert" queries using the ":" prefix to trigger product
and component name matching will be faster because product and
component name matching is avoided for all the other search terms.
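One way to see this is to count how many substring clauses in each
query touch product or component, i.e. the fields that need the join
after the backend change. The following Python sketch does that for
the query strings [1b] and [2b] listed at the end of this message;
the function name and the JOINED_FIELDS set are assumptions for the
example, and the counts say nothing about actual run time.

```python
from urllib.parse import parse_qsl

# Fields that are matched via the joined product/component tables.
JOINED_FIELDS = {"product", "component"}

STATUS = (
    "bug_status=UNCONFIRMED&bug_status=NEW"
    "&bug_status=ASSIGNED&bug_status=REOPENED"
)

# Chart part of [1b], original semantics, ":bugzilla quicksearch".
QUERY_1B = STATUS + (
    "&field0-0-0=product&type0-0-0=substring&value0-0-0=bugzilla"
    "&field0-0-1=component&type0-0-1=substring&value0-0-1=bugzilla"
    "&field1-0-0=product&type1-0-0=substring&value1-0-0=quicksearch"
    "&field1-0-1=component&type1-0-1=substring&value1-0-1=quicksearch"
    "&field1-0-2=short_desc&type1-0-2=substring&value1-0-2=quicksearch"
    "&field1-0-3=status_whiteboard&type1-0-3=substring&value1-0-3=quicksearch"
)

# Chart part of [2b], changed semantics, ":bugzilla quicksearch".
QUERY_2B = STATUS + (
    "&field0-0-0=product&type0-0-0=substring&value0-0-0=bugzilla"
    "&field0-0-1=component&type0-0-1=substring&value0-0-1=bugzilla"
    "&field1-0-0=short_desc&type1-0-0=substring&value1-0-0=quicksearch"
    "&field1-0-1=status_whiteboard&type1-0-1=substring&value1-0-1=quicksearch"
)


def clause_counts(query_string):
    """Return (joined, direct): clauses on joined fields vs. all others."""
    joined = direct = 0
    for name, value in parse_qsl(query_string):
        if not name.startswith("field"):
            continue  # skip bug_status, type... and value... parameters
        if value in JOINED_FIELDS:
            joined += 1
        else:
            direct += 1
    return joined, direct


if __name__ == "__main__":
    print("1b:", clause_counts(QUERY_1B))
    print("2b:", clause_counts(QUERY_2B))
```

For [1b] this counts four product/component clauses, for [2b] only
two, because the plain word "quicksearch" is no longer matched
against product and component names.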
Is this sufficient improvement to make this change and enable
quicksearch on b.m.o again? Or are we going for the backend
fix only?
Cheers,
Andreas
==============================================================
Here are the query urls for the above examples, taken from
http://www.ps.uni-sb.de/~afranke/moz0/quicksearchhack.html and
http://www.ps.uni-sb.de/~afranke/moz1/quicksearchhack.html
using the "preview query url as page" button:
[1a] original query for "bugzilla quicksearch":
http://bugzilla.mozilla.org/buglist.cgi
?bug_status=UNCONFIRMED
&bug_status=NEW
&bug_status=ASSIGNED
&bug_status=REOPENED
&field0-0-0=product
&type0-0-0=substring
&value0-0-0=bugzilla
&field0-0-1=component
&type0-0-1=substring
&value0-0-1=bugzilla
&field0-0-2=short_desc
&type0-0-2=substring
&value0-0-2=bugzilla
&field0-0-3=status_whiteboard
&type0-0-3=substring
&value0-0-3=bugzilla
&field1-0-0=product
&type1-0-0=substring
&value1-0-0=quicksearch
&field1-0-1=component
&type1-0-1=substring
&value1-0-1=quicksearch
&field1-0-2=short_desc
&type1-0-2=substring
&value1-0-2=quicksearch
&field1-0-3=status_whiteboard
&type1-0-3=substring
&value1-0-3=quicksearch
[1b] original query for ":bugzilla quicksearch":
http://bugzilla.mozilla.org/buglist.cgi
?bug_status=UNCONFIRMED
&bug_status=NEW
&bug_status=ASSIGNED
&bug_status=REOPENED
&field0-0-0=product
&type0-0-0=substring
&value0-0-0=bugzilla
&field0-0-1=component
&type0-0-1=substring
&value0-0-1=bugzilla
&field1-0-0=product
&type1-0-0=substring
&value1-0-0=quicksearch
&field1-0-1=component
&type1-0-1=substring
&value1-0-1=quicksearch
&field1-0-2=short_desc
&type1-0-2=substring
&value1-0-2=quicksearch
&field1-0-3=status_whiteboard
&type1-0-3=substring
&value1-0-3=quicksearch
[2a] new query for "bugzilla quicksearch":
http://bugzilla.mozilla.org/buglist.cgi
?bug_status=UNCONFIRMED
&bug_status=NEW
&bug_status=ASSIGNED
&bug_status=REOPENED
&field0-0-0=short_desc
&type0-0-0=substring
&value0-0-0=bugzilla
&field0-0-1=status_whiteboard
&type0-0-1=substring
&value0-0-1=bugzilla
&field1-0-0=short_desc
&type1-0-0=substring
&value1-0-0=quicksearch
&field1-0-1=status_whiteboard
&type1-0-1=substring
&value1-0-1=quicksearch
[2b] new query for ":bugzilla quicksearch":
http://bugzilla.mozilla.org/buglist.cgi
?bug_status=UNCONFIRMED
&bug_status=NEW
&bug_status=ASSIGNED
&bug_status=REOPENED
&field0-0-0=product
&type0-0-0=substring
&value0-0-0=bugzilla
&field0-0-1=component
&type0-0-1=substring
&value0-0-1=bugzilla
&field1-0-0=short_desc
&type1-0-0=substring
&value1-0-0=quicksearch
&field1-0-1=status_whiteboard
&type1-0-1=substring
&value1-0-1=quicksearch