external searching bugzilla database, robots.txt

Jason Remillard jremillardshop at letterboxes.org
Sun Feb 20 05:02:56 UTC 2005


Recently I was interested in a new feature in bugzilla. I used google and 
other search engines to see if people were working on it already, if it 
was being planned, etc. I got no hits. It was not until I had resigned 
myself to being the only person who wanted the feature that I opened a 
bugzilla account. Just for kicks I did a bugzilla search, and to my 
surprise the exact item I was looking for was present in the database, 
with a patch! I learned two things. First, much of the activity of the 
bugzilla developer community is in the bugzilla database, not on this 
list, and not on any news group. Second, no search engine is indexing 
any of this valuable information.

Now that I have a clue, I see that there is a bug (81920) filed against 
bugzilla not letting in search engines. It is marked as "will not fix". 
I am not sure whether it was marked that way because letting in search 
engines was considered a bad idea, because the solution suggested in the 
bug was bad, or because it was not going to be done for the current 
release (2.16).

Obviously I think that it is a good thing to unlock the wealth of 
information in the bugzilla database for the search engines. Also, many 
companies have internal search spiders, so I think the value of indexing 
holds up for both public and private installations. I am sure it would 
have saved me time if the public bugzilla database had been indexed.

Q: Would a patch to allow indexing by search engines be rejected 
because of a policy decision against external indexing? Or is it simply 
a matter of getting an implementation that would work well?

Getting it implemented well seems to be a bit tricky. Google says that 
they don't want a special search-only page built just to help them index.


The best way I see this working is to add a link somewhere that would 
bring up a series of "browse" pages. The browse pages would form a 
hierarchy (perhaps off of the advanced search page).

Product list (or Products) -> Bugs opened, by year-week number -> Links 
to the bug numbers

To the search engines this would look like a real browse interface, and 
hopefully it would not kill the server, because the nested pages allow 
the queries to be broken up into lots of small chunks. I think this 
would be pretty easy to put together, but I don't want to start on it if 
everybody is dead against the idea of indexing.
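
To make the hierarchy concrete, here is a rough sketch of the three 
levels. It is in Python rather than bugzilla's Perl, purely for 
illustration, and it is not a proposed patch. The page name browse.cgi, 
its "product" and "week" parameters, and the sample bugs are all made up 
for the example; only show_bug.cgi is an existing bugzilla page.

```python
from collections import defaultdict
from datetime import date
from urllib.parse import urlencode

# Hypothetical sample data: (bug id, product, date opened).
# A real version would read these from the bugzilla database.
BUGS = [
    (101, "ProductA", date(2005, 1, 3)),
    (102, "ProductA", date(2005, 1, 5)),
    (103, "ProductA", date(2005, 2, 14)),
    (201, "ProductB", date(2005, 1, 4)),
]


def year_week(opened):
    """Return an ISO year-week label such as '2005-W01'."""
    iso_year, iso_week, _ = opened.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


def build_browse_tree(bugs):
    """Group bug ids as product -> year-week -> sorted list of bug ids."""
    tree = defaultdict(lambda: defaultdict(list))
    for bug_id, product, opened in bugs:
        tree[product][year_week(opened)].append(bug_id)
    return {
        product: {week: sorted(ids) for week, ids in weeks.items()}
        for product, weeks in tree.items()
    }


def product_links(tree):
    """Level 1: one link per product (browse.cgi is hypothetical)."""
    return ["browse.cgi?" + urlencode({"product": p}) for p in sorted(tree)]


def week_links(tree, product):
    """Level 2: one link per year-week in which the product had new bugs."""
    return [
        "browse.cgi?" + urlencode({"product": product, "week": w})
        for w in sorted(tree[product])
    ]


def bug_links(tree, product, week):
    """Level 3: links to the individual bug pages."""
    return [f"show_bug.cgi?id={bug_id}" for bug_id in tree[product][week]]
```

Each level only needs one small query (the products, the weeks for one 
product, or the bugs for one product and week), which is the "lots of 
small chunks" idea. A spider that starts at the product list can reach 
every bug by following ordinary links.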

