control characters and Util::clean_text()
Dennis Melentyev
dennis.melentyev at infopulse.com.ua
Wed Dec 21 15:42:40 UTC 2005
ASCII 127 is a *valid* Russian character in cp1251 (thanks to M$).
Also, what should we do with UTF-8 input?
On Wed, 21/12/2005 at 14:09 +0100, Frédéric Buclin wrote:
> Hello!
>
> bug 238780 added a new method Util::clean_text($str) whose goal is to
> remove control characters from the string $str (ASCII 0 through 31 and
> ASCII 127). The idea was to prevent newlines and such characters in
> fields such as the product version (bug 238780), the target milestone
> (bug 177773) and the bug summary (bug 101380), among others.
>
> As far as I know, only comments should allow such characters (well,
> apart from newlines (ASCII 10 and 13) and maybe horizontal tabs (ASCII
> 9), I don't see why we should allow other control characters in
> comments). This brings us to the following problem: if we want to filter
> *all* fields using clean_text(), we would have to change a large part of
> the code, replacing most trim() calls with clean_text() (clean_text(),
> in its updated version, returns the trimmed string already). This is
> clearly not something I'm going to do or approve (6 patches are in my
> review queue about such changes, including one for the 2.16 branch!).
> So why not update trim() to automatically remove such characters everywhere?
> This solution would be much less invasive.
>
> If nobody objects to my suggestion, that's what I would like to
> see implemented. I could even imagine trick_taint() doing this kind of
> cleanup itself.
>
> Comments?
>
> LpSolit
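For reference, the filter described in the quoted message (drop ASCII 0 through 31 and ASCII 127, then trim) could be sketched roughly as below. This is Python rather than Bugzilla's Perl, and both function names are hypothetical stand-ins, not the actual Util::clean_text() code. The message does not say whether control characters are deleted or replaced by a space; this sketch assumes deletion. The byte-level variant is there for the UTF-8 question: every byte of a multi-byte UTF-8 sequence is 0x80 or higher, so a filter limited to 0-31 and 127 should not touch them. Byte 127 is dropped regardless of encoding, so the cp1251 concern is not addressed by this.

```python
import re

# Control characters as described in the quoted message:
# ASCII 0 through 31, plus ASCII 127 (DEL).
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def clean_text(text: str) -> str:
    """Hypothetical stand-in for Util::clean_text().

    Assumption: control characters are deleted (not replaced by a space),
    and the result is trimmed, as the quoted message says the updated
    version does.
    """
    return _CONTROL_CHARS.sub("", text).strip()


def clean_bytes(raw: bytes) -> bytes:
    """Same filter applied to undecoded bytes (hypothetical helper).

    Only bytes 0-31 and 127 are dropped; bytes >= 0x80, which carry all
    multi-byte UTF-8 sequences, pass through unchanged.
    """
    return bytes(b for b in raw if b > 31 and b != 127)


if __name__ == "__main__":
    # A version string with an embedded newline and a stray DEL.
    print(repr(clean_text("  2.0\r\nbeta\x7f ")))
    # Cyrillic text encoded as UTF-8, with a trailing tab.
    print(clean_bytes("Привет\t".encode("utf-8")).decode("utf-8"))
```

Applied inside trim(), a filter like this would also strip newlines and tabs from comments, so comments would need a separate path that keeps ASCII 9, 10 and 13.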