external searching bugzilla database, robots.txt
Cunningham, Tom
tcunningham at vfa.com
Mon Feb 21 16:11:17 UTC 2005
Could indexing be a preference? There are definitely situations where
allowing Google and other search engines to index bugs would help
projects (cutting down numbers of duplicates, allowing people to find
patches quickly). If I'm working with an open source project, I
usually search Google for an issue before I search a bug tracker.
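To make the preference idea concrete, here is a minimal sketch of how a per-installation switch could drive the contents of robots.txt. The function name and the `allow_indexing` flag are hypothetical and not part of Bugzilla; the only thing relied on is the standard robots.txt convention that an empty `Disallow:` permits everything and `Disallow: /` blocks everything.

```python
def render_robots_txt(allow_indexing: bool) -> str:
    """Return robots.txt content for a hypothetical 'allow_indexing' preference.

    An empty Disallow line lets crawlers fetch everything;
    'Disallow: /' asks them to stay out entirely.
    """
    rule = "Disallow:" if allow_indexing else "Disallow: /"
    return "User-agent: *\n" + rule + "\n"


if __name__ == "__main__":
    # Show both variants of the file.
    print(render_robots_txt(True))
    print(render_robots_txt(False))
```

A real installation would probably want something finer-grained than all-or-nothing, for example keeping crawlers away from expensive query pages while exposing individual bugs.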
I would definitely suggest implementing nofollow (the rel="nofollow"
link attribute) as part of this patch if you do go ahead and try this.
Blogs have had problems with people and automated systems posting
comments to improve their PageRank, and if indexing were allowed in a
Bugzilla installation, I could foresee that spammers might try to log
spurious bugs in order to get links back to their sites.
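As a rough illustration of the nofollow suggestion, the sketch below escapes a plain-text comment and wraps any URLs in links carrying rel="nofollow", so search engines do not credit the link target. The function name and the simple URL pattern are assumptions made for this example; this is not how Bugzilla itself renders comments.

```python
import html
import re

# Deliberately simple URL pattern for illustration; real linkifiers are stricter.
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def linkify_nofollow(comment: str) -> str:
    """Escape a plain-text comment and turn URLs into rel="nofollow" links."""
    parts = []
    last = 0
    for match in URL_RE.finditer(comment):
        # Escape the text between the previous URL and this one.
        parts.append(html.escape(comment[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        parts.append(f'<a href="{url}" rel="nofollow">{url}</a>')
        last = match.end()
    # Escape whatever follows the last URL.
    parts.append(html.escape(comment[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(linkify_nofollow("See http://example.com/patch for details"))
```

Escaping first and linking second keeps user-supplied markup from being rendered, which matters just as much as the nofollow attribute once bug pages are publicly indexed.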
-----Original Message-----
From: developers-owner at bugzilla.org
[mailto:developers-owner at bugzilla.org] On Behalf Of Jason Remillard
Sent: Sunday, February 20, 2005 12:03 AM
To: developers at bugzilla.org
Subject: external searching bugzilla database, robots.txt
Hi,
Recently I was interested in a new feature in bugzilla. I used google and
other search engines to see if people were working on it already, if it
was being planned, etc. I got no hits. It was not until I resigned myself
to the fact that I must be the only person who wanted the feature that I
opened a bugzilla account. Just for kicks I did a bugzilla search, and
to my surprise the exact item I was looking for was present in the
database with a patch! I just learned two things. First, much of the
activity of the bugzilla developer community is in the bugzilla database,
not on this list, and not on any news group. Second, no search engine is
indexing any of this valuable information.
Now that I have a clue, I see that there is a bug (81920) against not
letting in search engines. It is marked as "will not fix". I am not sure
if it was marked that way because letting in search engines was
considered a bad idea, because the solution suggested in the bug was
bad, or because it was not going to be done for the current release
(2.16).
Obviously I think that it is a good thing to unlock the wealth of
information in the bugzilla database to the search engines. Also, many
companies have internal search spiders, so I think the value in indexing
holds up for both public and private installations. I am sure it would
have saved me time if the public bugzilla database was being indexed.
Q: Would a patch to allow indexing by search engines not get accepted
because of a policy decision against external indexing? Or is it simply
a matter of getting an implementation that would work well?
Getting it implemented well seems to be a bit tricky. Google says that
they don't want a special search only page to help them index.
http://www.google.com/intl/en/webmasters/
The best way I see this working is to add a link somewhere that would
bring up a series of "browse" pages. The browse pages would form a
hierarchy (perhaps off of the advanced search page).
Product List or Products -> Bugs opened by year-week number -> Links to
the bug numbers
This would look like a real browse to the search engines, and hopefully
not kill the server because it would be a set of nested pages, allowing
the queries to be broken up into lots of small chunks. I think this
would be pretty easy to put together, but I don't want to start on it if
everybody is dead against the idea of indexing.
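The Product -> year-week -> bug numbers hierarchy described above could be sketched roughly as follows. The function name and the (bug_id, product, opened_date) tuple layout are assumptions for illustration only; a real implementation would query the bugzilla database one level at a time rather than hold everything in memory.

```python
from collections import defaultdict
from datetime import date


def build_browse_tree(bugs):
    """Group bugs as product -> ISO year-week -> list of bug ids.

    'bugs' is an iterable of (bug_id, product, opened_date) tuples,
    a made-up shape standing in for rows from the bugs table.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for bug_id, product, opened in bugs:
        iso = opened.isocalendar()
        year, week = iso[0], iso[1]
        tree[product][f"{year}-W{week:02d}"].append(bug_id)
    return tree


if __name__ == "__main__":
    sample = [
        (1, "Bugzilla", date(2005, 2, 20)),
        (2, "Bugzilla", date(2005, 2, 21)),
        (3, "Bugzilla", date(2005, 2, 22)),
    ]
    for product, weeks in build_browse_tree(sample).items():
        print(product)
        for week, ids in sorted(weeks.items()):
            print("  ", week, ids)
```

Each level of the tree maps to one small browse page, so a spider walking the links only ever triggers a query bounded by a single product and a single week.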
Thanks
Jason.