control characters and Util::clean_text()
Tom Emerson
tree at basistech.com
Wed Dec 21 16:08:48 UTC 2005
Dennis Melentyev writes:
> ASCII 127 is a *correct* Russian symbol in cp1251 (thanks to M$).
> Also, what to do with UTF-8 input?
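What about UTF-8? It won't matter. 0x7F is a control character
(U+007F, DELETE) in Unicode, and in UTF-8 it only ever appears as that
single-byte character: every byte of a multi-byte sequence has its high
bit set. Stripping it won't hurt anything.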
-tree
--
Tom Emerson Basis Technology Corp.
Software Architect http://www.basistech.com
"You can't fake quality any more than you can fake a good meal." (W.S.B.)