2009-03-28
I noticed a few malformed characters in ...
I noticed a few malformed characters in the RSS feed of my homepage that weren't there in the original database entries and showed ok in the web version. Again, utf-8 problems showing, although all data (postgres - script - xml - browser) should be utf-8. Lots of testing and searching, finally I found The Perl UTF-8 and utf8 Encoding Mess by Jeremy Zawodny. He is right: it is a mess. And the post itself demonstrates it by being filled with � characters.
So to make sure everything in the RSS generating process understand that what comes out of PostgreSQL is valid utf-8 and should be imported in the XML::RSS module as the same valid utf-8, I need to recode it to utf-8. Uh.. ok. The bit of code:my $body = Encode::decode('UTF-8', $row[1]);And now I can use ÜTF-8 çħáräćtërs!