Ruby 1.9 made me remember how I hate the concept of encodings
Update: You can follow the discussion on Hacker News.
I guess I won’t make a lot of friends by saying this, but my first impression of Ruby 1.9 was awful. Since we had to configure a new server, we thought it was the perfect occasion to install Ruby 1.9 and see how our current web applications behaved. I can’t comment on any of the new features in 1.9 because we got bored after a while and decided to switch back to 1.8.
There is one huge problem with 1.9, and it is how it manages encodings. Sure, every Ruby fanatic will tell you that it is “cleaner”, “more robust”, “safer”, “clever” or whatever… but it breaks working applications! So in my book this is a problem. I know it is normal to refactor some aspects of your code when a new version of a programming language comes out, but… this? All this pain for what, encodings? What’s the benefit, again? No, seriously, tell me, because I wake up every morning having to remember what they are, what purpose they serve and why application developers still have to worry about them after all those years.
What I do know is that with 1.8 you could mix different encodings in the same string instance, and the worst that could happen was some weird-looking characters in the resulting web page. Ruby 1.9 does things differently: it throws an exception (Encoding::CompatibilityError) in your face. Here is a great article that explains what is happening with string encodings in 1.9. Beware, though: although the article says there is a solution, it is a hypothetical one. Here is the excerpt:
Even better, Ruby already has a mechanism that is mostly designed for this purpose. In Ruby 1.9, setting Encoding.default_internal tells Ruby to encode all Strings crossing the barrier via its IO system into that preferred encoding. All we’d need, then, is for maintainers of database drivers to honor this convention as well.
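To make the excerpt a bit more concrete, here is a minimal sketch of what Encoding.default_internal does for data that does go through Ruby's IO system. The file name prefix and the sample word are my own placeholders, and a temporary file stands in for whatever external source you read from; a database driver that bypasses IO gets none of this.

```ruby
require "tempfile"

# Write "café" as ISO-8859-1 bytes: the é is the single byte 0xE9.
latin1_bytes = [0x63, 0x61, 0x66, 0xE9].pack("C*")
file = Tempfile.new("latin1_sample") # hypothetical file name prefix
file.binmode
file.write(latin1_bytes)
file.close

# Ask Ruby to transcode strings read through IO into UTF-8.
previous_internal = Encoding.default_internal
Encoding.default_internal = Encoding::UTF_8

# Only the external encoding is declared here; the target encoding
# is taken from Encoding.default_internal.
text = File.open(file.path, "r:ISO-8859-1") { |f| f.read }

puts text.encoding  # UTF-8
puts text.bytesize  # 5, because é now takes two bytes

# Restore the previous setting and clean up.
Encoding.default_internal = previous_internal
file.unlink
```

The string comes out of File#read already converted, so application code never sees the Latin-1 bytes. That is the convention the excerpt hopes database drivers will follow.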
So, unless I’m just not getting something (which is highly possible because encodings always confuse the heck out of me), there is no real solution other than to wait for database driver developers to honor the Encoding.default_internal setting of Ruby 1.9.
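For the record, here is a small sketch of the exception and of the manual workaround you are left with in the meantime. The "driver" is hypothetical: I simulate it with raw bytes tagged as binary, and I assume those bytes are really ISO-8859-1. In a real application you would have to know what your database actually returns.

```ruby
# A UTF-8 string, as it might come from a template or a form.
utf8_text = "caf\u00E9"

# The same word as raw Latin-1 bytes tagged as binary, standing in for
# what an encoding-unaware driver might hand back (an assumption).
driver_bytes = [0x63, 0x61, 0x66, 0xE9].pack("C*")

# Mixing the two raises instead of producing weird characters.
begin
  utf8_text + driver_bytes
rescue Encoding::CompatibilityError => e
  error_message = e.message
end
puts error_message

# Manual workaround: say what the bytes really are, then transcode.
fixed = driver_bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")
combined = utf8_text + " / " + fixed

puts combined.encoding         # UTF-8
puts combined.valid_encoding?  # true
```

It works, but you have to repeat it at every spot where driver data enters the application, which is exactly the kind of babysitting I was hoping to avoid.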
A small rant about encodings
I dream of a world without war or hunger, where I don’t have to care about character encodings when I’m programming! Why on earth do we still have all those different encodings in 2010? Why not make one huge encoding table, UTF-16384, containing every single character in the universe, so we can forget about this crazy concept of different encodings and pretend that it never existed? Would one big, fat, unique encoding table cause huge performance issues everywhere? I might be mistaken, but I really doubt it would.