Towards faster keyword searches

Donald Lindsay lindsay at arista.com
Mon Sep 26 20:55:35 UTC 2016


Here are two examples of IN lists, that I see in our MySQL slow-query log:

DELETE FROM `dist_abuild_rpm` WHERE `abuild_id` IN (1370624,
1315692, 1315697, 1315705, 1315708, 1315709, 1315718,
1315718,1315723, 1315735, 1315747, 1315751, 1315752,
1315768, 1315774, 1315784, 1315834, 791557, 1315865,
1355269, 1370631, 1315894, 1315922, 1315955, 1315981,
1315999, 1377469, 1316063, 1316170, 1355324, 1316209,
1316213, 1355329, 1316240, 1316248, 1355332, 1316262,
1316272, 1316281, 791998, 1316290, 792008, 1316304,
792017, 792019, 1355344, 1316331, 1316348, 792062, 792072,
1316371, 792093, 1316384, 792100, 792117, 792119, 1316413,
1316414, 792130, 792132, 792133, 1316427, 1316448, 792162,
792187, 792201, 792203, 1316512, 1316535, 1316536, 792256,
792260, 792266, 792275, 792282, 792288, 1355387, 792293,
792296, 792308, 792319, 792323, 792324, 792338, 792345,
792350, 792359, 1316649, 792366, 792375, 792377, 1316671,
792393, 792397, 792415, 1363507, 1355411, 792444, 1316738,
1355414, 792455);


SELECT AutoTest.dut.dutspec as dutspec, propertyName,
value, startTime, endTime FROM rdam.dut
JOIN rdam.dut_property on dut_id=rdam.dut.id
JOIN AutoTest.dut ON AutoTest.dut.dutspec
LIKE CONCAT( '%', rdam.dut.name ) WHERE AutoTest.dut.dutspec
IN ( 'rdam://ck414', 'rdam://ck417', 'rdam://in343',
'rdam://ck411', 'rdam://ckp302', 'rdam://ol391',
'rdam://ckp203', 'rdam://in351', 'rdam://hs209',
'rdam://yr306', 'rdam://hs211', 'rdam://hs228',
'rdam://up486', 'rdam://hs207', 'rdam://snp104',
'rdam://hs205', 'rdam://snp105', 'rdam://ckp329',
'rdam://ckp304', 'rdam://nv443', 'rdam://ckp201',
'rdam://ckp323', 'rdam://ckp320', 'rdam://cd273',
'rdam://ckp308', 'rdam://ckp204', 'rdam://up301',
'rdam://gb305', 'rdam://nv406', 'rdam://hs106',
'rdam://fm393', 'rdam://bn102', 'rdam://fm202',
'rdam://ht103', 'rdam://in353', 'rdam://ck408',
'rdam://do411', 'rdam://do401', 'rdam://do466',
'rdam://do402', 'rdam://ol161', 'rdam://hs103',
'rdam://ckp328', 'rdam://lf327', 'rdam://ckp303',
'rdam://pts101', 'rdam://tg274', 'rdam://ckp322',
'rdam://wa401', 'rdam://hs212', 'rdam://hs214',
'rdam://lf120', 'rdam://wa444', 'rdam://ckp202',
'rdam://fm204', 'rdam://ol421', 'rdam://tg422',
'rdam://tcp106', 'rdam://yr302', 'rdam://cd343',
'rdam://cp131', 'rdam://cd369', 'rdam://psp103',
'rdam://hs112', 'rdam://psp101', 'rdam://ht238',
'rdam://lf244', 'rdam://nv210', 'rdam://in472',
'rdam://lp110', 'rdam://ht422', 'rdam://tcp105',
'rdam://in474' );

If anyone feels like speeding them up, I'm interested.

On Sun, Sep 25, 2016 at 9:05 PM, Dylan Hardison <dylan at mozilla.com> wrote:

>
> > On Sep 25, 2016, at 23:51, Jeff Fearn <jfearn at redhat.com> wrote:
> >
> >> There are a whole class of these -- anything that is a 'multiselect'
> type search. I believe someone suggested embedding these queries (at least
> as an option)
> >> as a sub select and in general I think that's a good idea. An even
> nicer idea is to just let elastic search do the searching.
> >> One of these approaches I hope to explore with an outreachy intern.
> >
> > The effect of using ID lists instead of sub selects is wider than just
> > searching, as the above patch demonstrates.
> >
> Noted :-)
>
>
> >
> > I ran this on a copy of our DB and it seems to perform well. I did have
> > to chop off the order by as it's not leagl in Pg to have a group by
> > clause and then order by things not in the group by clause or an
> > aggregate function.
>
> I would hope that offending code is only generated by the (mysql) parts of
> Bugzilla::DB::*. I'll verify that before getting that ready for review.-
> To view or change your list settings, click here:
> <https://lists.bugzilla.org/cgi-bin/mj_wwwusr?user=lindsay@arista.com>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.bugzilla.org/pipermail/developers/attachments/20160926/48eab42d/attachment.html>


More information about the developers mailing list