Rewriting improved gnats converter

Daniel Berlin dberlin at
Sat Apr 12 18:31:46 UTC 2003

Just so no one else attempts it (or if anyone wants the WIP), I'm  
rewriting my vastly improved gnats converter.
The current perl version is in gcc's cvs. See

I'm rewriting it in python, for a few reasons:
1. The perl version leaks memory like a sieve, with no internally caused  
memory leaks visible.  I've run all the perl memory leak checkers, and  
nothing is marked as leaking.  Yet the process grows to roughly half  
the size of the gnats db it's converting.  For GCC's 600 meg gnats  
database, this means it grows to 300 meg, quite quickly, and stays  
around there.  Every variable is undef'd when it's done, every file is  
closed.  Still, no dice.  The python version just doesn't leak at all.

2.  It is badly in need of cleanup, and it's hard to modularize/OOify  
it in a nice way in perl.
The python version has two main classes, GNATSbug and Bugzillabug. It  
builds the GNATSbug from a file, then creates a Bugzillabug from it (the  
BugzillaBug constructor does the conversion), then writes out the  
The perl version has all these pieces mixed in together.
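
A minimal sketch of the two-class layout described in point 2, not the
actual converter code. Only the two class names and the idea that the
BugzillaBug constructor does the conversion come from the message; the
field names, the ">Field: value" parsing and the status mapping are
assumptions made for illustration.

```python
import re

# Hypothetical sketch: only the class names come from the message above.
FIELD_RE = re.compile(r"^>([A-Za-z-]+):\s*(.*)$")


class GNATSbug:
    """Holds the fields parsed from one GNATS PR (assumed '>Field: value' layout)."""

    def __init__(self, text):
        self.fields = {}
        current = None
        for line in text.splitlines():
            match = FIELD_RE.match(line)
            if match:
                # Start of a new field; remember its name for continuation lines.
                current = match.group(1)
                self.fields[current] = match.group(2)
            elif current is not None:
                # Continuation line of a multi-line field.
                self.fields[current] += "\n" + line

    @classmethod
    def from_file(cls, path):
        with open(path, errors="replace") as handle:
            return cls(handle.read())


class BugzillaBug:
    """The constructor performs the GNATS -> Bugzilla conversion."""

    # Assumed state mapping, for illustration only.
    STATUS_MAP = {"open": "NEW", "analyzed": "ASSIGNED", "closed": "RESOLVED"}

    def __init__(self, gnats_bug):
        fields = gnats_bug.fields
        self.bug_id = int(fields["Number"])
        self.summary = fields.get("Synopsis", "").strip()
        self.status = self.STATUS_MAP.get(fields.get("State", "").strip(), "NEW")
        self.description = fields.get("Description", "").strip()
```

With this split, parsing and conversion can be tested separately: a
GNATSbug can be built from a string or a file, and the BugzillaBug only
ever sees the parsed fields.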

3. The python version is actually 2x-3x faster (overall) than the perl  
version (which was ~10x faster than the original converter bugzilla  
comes with) because it's 2x-3x faster (average) in parsing the GNATS  
bugs.   The code is the same in both versions (this part is a direct  
copy/paste/convert) if you account for language syntax differences.
The gnats parsing is bounded by the speed of string concatenation in  
both python and perl, and the python version is just faster at it.
One 21 meg PR takes 19 seconds in Perl to parse, and 2 seconds in Python.
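
To illustrate what "bounded by the speed of string concatenation" means
here, a hypothetical sketch of the inner loop of such a parser: each
line of a multi-line field is appended to the value collected so far.
The function name and the synthetic sample data are assumptions, not
code from the converter, and the timing it prints says nothing about
the figures quoted above.

```python
import time


def accumulate_field(lines):
    """Build one multi-line field value by repeated concatenation."""
    value = ""
    for line in lines:
        # One concatenation per input line; this dominates for large PRs.
        value += line + "\n"
    return value


if __name__ == "__main__":
    # Synthetic stand-in for a large PR body; the size is arbitrary.
    sample = ["x" * 70] * 100_000
    start = time.perf_counter()
    body = accumulate_field(sample)
    elapsed = time.perf_counter() - start
    print(f"{len(body)} characters concatenated in {elapsed:.2f}s")
```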

The whole 600 meg, 10000 PR gnats db takes 3 minutes to convert with  
the python script.

4. I'm a python person (have been for a long time), so I've just been  
meaning to do this anyway for a while.

If anyone wants the WIP, let me know.

