Regarding Duplicate Bug Report Detection in Bugzilla

amar.budhiraja1 at gmail.com
Sat Oct 15 01:43:54 UTC 2016


Hi Emma,

Thank you for such a prompt reply. 
Please find my replies in-line.



On Saturday, October 15, 2016 at 6:45:34 AM UTC+5:30, Emma Humphries wrote:
> Hi Amar,
> 
> Thanks for your inquiry. I'm CC'ing the Firefox community manager Mike Hoye
> on this in case he has volunteer resources available to help.
> 
> To clarify your request, do you want someone to verify if a group of words
> describes the same topic?

Yes. To add to that, I'd like to know whether most of the words fall under the same topic; they need not describe a topic. The rating would be binary (0 or 1).
 
> What level of experience/expertise will be needed to do this verification?
> Does the verify-er need to understand the product/component organization of
> bugzilla, understand Firefox source code, have programming experience?


Some knowledge of the product/component organization of Firefox is necessary. For example, one set of words that I encountered had words like "email" and "reply" alongside "thunderbird". Programming experience is necessary to understand code jargon. No knowledge of the source code is required.


> Do you have an estimate of how long this task would take, and do you have a
> sample of the data you need reviewed and verified?



In the worst-case scenario, all the words in a 10-word set could be unknown to the verifier, who would ideally look them up online. In such a case, it could take up to 2-3 minutes per 10-word set. At that rate, the roughly 600 sets would need at most about 20 to 30 hours in total.

A sample of 10 such 10-word sets is here:

https://drive.google.com/file/d/0BwFxxPd1ZJkrbks3TUtoZkI4OTg/view?usp=sharing



Thanks,
Amar




> Thanks,
> 
> Emma Humphries
> Bugmaster
> 
> On Fri, Oct 14, 2016 at 6:03 PM, <amar.budhiraja1 at gmail.com> wrote:
> 
> > Hi,
> > I am working on a research project on automatic detection of duplicate bug
> > reports.
> > I am using the last 10 years of Mozilla bug reports (~750K) to do the task,
> > in order to make it easier for the triager by getting the duplicate bug
> > report in the top 10.
> > The results look promising quantitatively and we want to publish the
> > result in a tier 1 conference.
> >
> > For the same, we need Mozilla's help. We have about 600 10-word sets
> > and we request Mozilla to help us do the quantitative evaluation on those.
> > Basically, for each 10-word set, someone will have to say whether these
> > words belong to the same topic.
> >
> > We would appreciate it if Mozilla could help by asking its community to help
> > with the labeling.
> >
> > Hoping to hear back.
> >
> > Thanks,
> > Amar Budhiraja
> > Data Science and Analytics Centre
> > IIIT-Hyderabad
> > _______________________________________________
> > dev-apps-bugzilla mailing list
> > dev-apps-bugzilla at lists.mozilla.org
> > https://lists.mozilla.org/listinfo/dev-apps-bugzilla
> >


_______________________________________________
dev-apps-bugzilla mailing list
dev-apps-bugzilla at lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-apps-bugzilla


