Regarding Duplicate Bug Report Detection in Bugzilla

Dylan Hardison dylan at
Sat Oct 15 03:28:58 UTC 2016

> On Oct 14, 2016, at 21:03, amar.budhiraja1 at wrote:
> Hi,
> I am working on a research project on automatic detection of duplicate bug reports. 
> I am using the last 10 years of Mozilla bug reports (~750K) for this task, in order to make things easier for triagers by getting the duplicate bug report into the top 10. 
> The results look promising quantitatively and we want to publish them in a tier 1 conference.
> For this, we need Mozilla's help. We have about 600 sets of 10 words and we request Mozilla to help us do the quantitative evaluation on those. Basically, for each set of 10 words, someone will have to say whether these words belong to the same topic. 
> We would appreciate it if Mozilla could help by asking its community to help with the labeling. 
> Hoping to hear back.

This is very interesting. How will the labeling work? An online questionnaire or a Google Form?
Let me know exactly what's expected and I'll see what I can do. If we can find 60 volunteers, that's only ten sets of ten words
each -- but I might be misunderstanding how you'd need to collect the answers
(and how you control for humans reporting incorrectly).
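
To make the arithmetic and the error-control question concrete, here is a rough
sketch, not a description of how the study actually plans to collect answers.
It assumes each set is shown to three different volunteers and that
disagreements are settled by majority vote. The redundancy of 3 and the names
assign_sets and majority_vote are hypothetical. With that redundancy, 60
volunteers would each label 30 sets rather than 10.

```python
from collections import Counter


def assign_sets(num_sets, num_volunteers, labels_per_set):
    """Round-robin assignment (hypothetical scheme): every set goes to
    `labels_per_set` distinct volunteers, with an even load per volunteer."""
    if labels_per_set > num_volunteers:
        raise ValueError("need at least as many volunteers as labels per set")
    assignments = {v: [] for v in range(num_volunteers)}
    for s in range(num_sets):
        for r in range(labels_per_set):
            # Consecutive slots map to consecutive volunteers, so the
            # volunteers chosen for one set are always distinct.
            v = (s * labels_per_set + r) % num_volunteers
            assignments[v].append(s)
    return assignments


def majority_vote(labels):
    """Return the most common answer among the labels for one set.
    With an odd number of yes/no labels there can be no tie."""
    return Counter(labels).most_common(1)[0][0]


# Figures from this thread: 600 sets, 60 volunteers.
# The redundancy of 3 is an assumption, not something the study specified.
assignments = assign_sets(num_sets=600, num_volunteers=60, labels_per_set=3)
print(len(assignments[0]))                   # 30 sets per volunteer
print(majority_vote([True, True, False]))    # True
```

Whether three labels per set is enough depends on how noisy the answers turn
out to be; that number is only a placeholder.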

Kind regards,

Dylan Hardison.
