Custom fields schema

Bradley Baetz bbaetz at acm.org
Mon Dec 20 07:18:18 UTC 2004


On Sun, Dec 19, 2004 at 06:38:43PM -0500, Sean McAfee wrote:
> Well, here's some hard data.
> 
> 1 row in set (0.40 sec)
> 
> 1 row in set (0.03 sec)
> 

0.4 seconds is not hard data - you're going to be hitting cache only.
You need something that takes at least 10 seconds, and hits disk.

> So the distributed-tables solution is clearly much faster.

It can usually never be faster, since there are additional lookups to
do. Where it can be faster is where indexes can't be used, and you can
scan all of a smaller table plus some of a large table, rather than all
of a larger table - thats what you're doing here with |instr|. Noone
expends instr to be fast, though.

Try what you had, but using a small set of strings (eg products, before
I moved it over to product_id), and some selects on some values that
appear often, and some that are rare.

However, the biggest issue you run into is MySQL's use of at most one
index per table per query. Try it again with product and component, and
requirements on both, (joining the main table twice to the secondary
table) and you'll see the difference. Then try it with a 'text table' and an
'integer table' (so that you really have multiple tables).

For added fun, do it on postgres (>= 7.4, remembering to VACUUM after
the initial data population) and compare the times then...

Separate tables are definately cleaner, though, and my prefered solution.

Bradley



More information about the developers mailing list