external searching bugzilla database, robots.txt

Cunningham, Tom tcunningham at vfa.com
Mon Feb 21 16:11:17 UTC 2005


Could indexing be a preference? There are definitely situations where
allowing Google and other search engines to index bugs would help
projects (cutting down the number of duplicates, allowing people to find
patches quickly). If I'm working with an open source project, I
usually search Google for an issue before I search a bug tracker.

I would definitely suggest implementing rel="nofollow" on links as part
of this patch if you do go ahead and try this. Blogs have had problems
with people and automated systems posting comments to improve their
PageRank, and if indexing were allowed in a Bugzilla installation, I
could foresee that spammers might try to file spurious bugs in order to
get links back to their sites.
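
As a rough illustration of the nofollow idea, here is a minimal sketch in
Python (not Bugzilla's own Perl). It assumes comments are stored as plain
text and turned into links at display time; the function name
linkify_nofollow and the URL pattern are made up for this example and are
not part of Bugzilla.

```python
import html
import re

# Hypothetical, deliberately simple URL pattern; a real linkifier would
# need to handle trailing punctuation and other schemes.
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def linkify_nofollow(comment: str) -> str:
    """Escape a plain-text comment and wrap URLs in rel="nofollow" links."""
    parts = []
    last = 0
    for match in URL_RE.finditer(comment):
        # Escape the text between the previous URL and this one.
        parts.append(html.escape(comment[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        # rel="nofollow" asks search engines not to credit the target.
        parts.append(f'<a href="{url}" rel="nofollow">{url}</a>')
        last = match.end()
    parts.append(html.escape(comment[last:]))
    return "".join(parts)
```

With something like this in place, a link dropped into a spurious bug
would still be clickable for readers, but search engines that honor
nofollow would not count it toward the target site's ranking.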

-----Original Message-----
From: developers-owner at bugzilla.org
[mailto:developers-owner at bugzilla.org] On Behalf Of Jason Remillard
Sent: Sunday, February 20, 2005 12:03 AM
To: developers at bugzilla.org
Subject: external searching bugzilla database, robots.txt

Hi,

Recently I was interested in a new feature in Bugzilla. I used Google and
other search engines to see if people were working on it already, if it
was being planned, etc. I got no hits. It was not until I resigned myself
to the fact that I must be the only person who wanted the feature that I
opened a Bugzilla account. Just for kicks I did a Bugzilla search, and
to my surprise the exact item I was looking for was present in the
database, with a patch! I just learned two things. First, much of the
activity of the Bugzilla developer community is in the Bugzilla database,
not on this list, and not on any newsgroup. Second, no search engine is
indexing any of this valuable information.

Now that I have a clue, I see that there is a bug (81920) filed against
search engines not being let in. It is marked as "will not fix". I am not
sure if it was marked that way because letting in search engines was
considered a bad idea, because the solution suggested in the bug was
bad, or because it was not going to be done for the current release
(2.16).

Obviously I think that it is a good thing to unlock the wealth of
information in the Bugzilla database for the search engines. Also, many
companies have internal search spiders, so I think the value in indexing
holds up for both public and private installations. I am sure it would
have saved me time if the public Bugzilla database were being indexed.

Q: Would a patch to allow indexing by search engines be rejected
because of a policy decision against external indexing? Or is it simply a
matter of getting an implementation that would work well?

Getting it implemented well seems to be a bit tricky. Google says that
they don't want a special search-only page to help them index.

http://www.google.com/intl/en/webmasters/

The best way I see this working is to add a link somewhere that would 
bring up a series of "browse" pages. The browse pages would form a 
hierarchy (perhaps off of the advanced search page).

Product List or Products -> Bugs opened by year-week number -> Links to 
the bug numbers

This would look like a real browse to the search engines, and hopefully
not kill the server because it would be a set of nested pages, allowing
the queries to be broken up into lots of small chunks. I think this
would be pretty easy to put together, but I don't want to start on it if
everybody is dead against the idea of indexing.
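
To make the hierarchy above concrete, here is a minimal sketch in Python
(not Bugzilla's own Perl) of how bugs could be grouped into product ->
year-week -> bug number. The function name build_browse_index, the
(bug_id, product, opened) tuple layout, and the sample data are all
assumptions for illustration; a real implementation would run one small
query per page instead of grouping everything in memory.

```python
from collections import defaultdict
from datetime import date


def build_browse_index(bugs):
    """Group (bug_id, product, opened) tuples as
    product -> 'YYYY-Www' (ISO year and week) -> sorted list of bug ids."""
    index = defaultdict(lambda: defaultdict(list))
    for bug_id, product, opened in bugs:
        iso_year, iso_week = opened.isocalendar()[:2]
        week_key = f"{iso_year}-W{iso_week:02d}"
        index[product][week_key].append(bug_id)
    # Convert to plain, sorted dicts so each level can be rendered as a page.
    return {
        product: {week: sorted(ids) for week, ids in sorted(weeks.items())}
        for product, weeks in sorted(index.items())
    }


# Made-up sample data: bug ids and product names are hypothetical.
sample_bugs = [
    (101, "ProductA", date(2005, 2, 20)),
    (102, "ProductA", date(2005, 2, 21)),
    (103, "ProductA", date(2005, 2, 22)),
    (201, "ProductB", date(2005, 2, 21)),
]
browse_index = build_browse_index(sample_bugs)
```

Each level maps to one page: the top-level keys are the product list,
browse_index["ProductA"] lists that product's weeks, and each week holds
the links to individual bugs, so no single page has to scan the whole
database.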

Thanks
Jason.
