Monday, February 22, 2010

How to fix movie subtitle (and other text) encoding issues

Summary: Fixing subtitle encoding in DivX videos is easy... once you know how to do it.

I have been hunting for a copy of Moi Ivan, toi Abraham (AKA "Ivan and Abram", "Я - Иван, ты - Абрам") since I saw the movie on cable in the mid-90s. The movie has not been released on DVD, and I do not have a VHS player, but fortunately, I got a decent DivX version of the movie with Russian subtitles (the movie is mostly in Yiddish).

Unfortunately, instead of legitimate Cyrillic, the subtitle captions displayed garbage (accented characters). As I later found out, the subtitle file was encoded in the Windows-1251 (Cyrillic) code page, while my system assumed a Western code page (such as Windows-1252), so the subtitles appear fine only on a Russian version of Windows. So, what's a girl to do? I ran a few Google searches and found some posts from people running into a similar problem, but none of them contained any answers. I thought I would write a post explaining how I fixed the problem (really easy), hoping that it would help someone.

First, a quick intro to subtitles in DivX. Well, I do not really know much about this, but this is how much you -- a typical movie viewer -- need to know (if I misstate or omit something important, feel free to correct me). A typical DivX (AVI) file does not contain embedded subtitles. Subtitles normally come from a separate file, such as SRT, SUB, SSA/ASS. Normally, a subtitle file has the same name as the DivX file, with a different extension. For example, this would be a pair of DivX (AVI) and subtitle (SRT) files:
Moi Ivan, Moi Abraham.avi
Moi Ivan, Moi
There is nothing magic about a subtitle file: it's just a text file, which conforms to a certain data format. Here is the format of the SubRip (SRT) subtitle file (directly from Wikipedia):
Subtitle number
Start time --> End time
Text of subtitle (one or more lines)
Blank line
Here is an example:
00:00:18,700 --> 00:00:21,889
<i>Говорят по-цыгански</i>

00:03:16,190 --> 00:03:21,760
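Because an SRT file is just structured text, it is also easy to read with a script. Here is a minimal Python sketch (not something I used for the fix itself) that splits SRT text into entries following the four-part layout described above. The names Cue, to_ms, and parse_srt are made up for illustration, and the second caption in the sample is a placeholder.

```python
import re
from dataclasses import dataclass

# Timing line, e.g. "00:00:18,700 --> 00:00:21,889"
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    """One subtitle entry (hypothetical helper type)."""
    number: int
    start_ms: int
    end_ms: int
    text: str


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content):
    """Parse SRT text: number, timing line, caption lines, blank line."""
    cues = []
    # Drop an optional byte order mark, then split on blank lines.
    for block in re.split(r"\n\s*\n", content.lstrip("\ufeff").strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue
        match = TIMING.fullmatch(lines[1].strip())
        if not lines[0].strip().isdigit() or match is None:
            continue  # skip entries that do not follow the layout
        fields = match.groups()
        cues.append(
            Cue(
                number=int(lines[0]),
                start_ms=to_ms(*fields[:4]),
                end_ms=to_ms(*fields[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


sample = """1
00:00:18,700 --> 00:00:21,889
<i>Говорят по-цыгански</i>

2
00:03:16,190 --> 00:03:21,760
Placeholder caption
"""

for cue in parse_srt(sample):
    print(cue.number, cue.start_ms, cue.end_ms, cue.text)
```

The sketch silently skips entries that lack a number or a valid timing line; a real tool would probably want to report them instead.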
Many popular video players (KMPlayer, VLC, etc.), as well as DVD players, will automatically load and display the default subtitles from the file that has the same name as the DivX file and sits in the same folder, but you can also load additional subtitle files manually (e.g. you may have subtitles translated into several languages). In my favorite KMPlayer, you can load non-default subtitles via the Subtitles - Load Subtitle menu.

The original subtitle file I got looked like this:
00:00:18,700 --> 00:00:21,889
<i>Ãîâîðÿò ïî-öûãàíñêè</i>

00:03:16,190 --> 00:03:21,760
Although this text looks like garbage, it's not useless: it just needs to be re-encoded from one code page to another (and preferably, to something non-code-page-specific, e.g. to Unicode). But how do you do it?

Help comes from Mozilla Firefox (and I suspect from any other web browser). If you need to fix the encoding of a subtitle file (or any other text file), here is what you need to do (you can use a similar approach to recover text in other types of documents, such as email, text files, and so on).
  1. Launch Firefox (or your favorite web browser).
  2. Open the subtitle file. To locate the file in Firefox 3.5, use the File - Open File menu; in IE 8, use the File - Open menu, and click the Browse button; in Google Chrome 4.0, press the CTRL + O keys (when using Google Chrome, you need to change the extension of the subtitle file to .TXT before opening the file; otherwise, it will launch the default program associated with the original file extension instead of displaying the file text in the browser).
  3. Once the browser opens the file, it may automatically adjust the encoding. If you still see garbage, select a different encoding option until the text appears correctly. To change the encoding in Firefox 3.5, select the appropriate encoding from the View - Character Encoding menu (the Auto-Detect menu for the appropriate language can be helpful); in IE 8, use the View - Encoding menu; in Google Chrome, click the Control the current page toolbar button and pick the appropriate option from the Encoding menu (again, the Auto detect option may help).
  4. Once you select the correct encoding option and verify that the text is displayed correctly, highlight all text (you can use CTRL + A), and copy the selected text to the clipboard (press CTRL + C).
  5. Open Notepad (or your favorite plain text editor, such as Notepad++, PSPad, etc), create a new file (File - New menu option in Notepad) and paste the contents of the clipboard in the new file (press CTRL + V).
  6. Save the text file as the new subtitle file. If you decide to overwrite the original subtitle file, make sure that you first make a backup in case something goes wrong. When saving the file, you will most likely be prompted to change the default ANSI encoding, so pick the Unicode encoding.
  7. Close the newly created subtitle file in Notepad (or your text editor), and reopen it to verify that encoding is still intact and text appears correctly, and if so, use it as a new subtitle file.
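What the browser does when you pick a different encoding can be sketched in a couple of lines of Python. This is only an illustration under the assumption that the text was written in Windows-1251 and wrongly displayed as Windows-1252; the function name fix_mojibake is made up. The sample string is the garbled caption shown earlier.

```python
garbled = "<i>Ãîâîðÿò ïî-öûãàíñêè</i>"


def fix_mojibake(text, wrong="cp1252", right="cp1251"):
    """Reverse a wrong decoding (hypothetical helper).

    Encoding with the wrong code page recovers the original bytes;
    decoding those bytes with the right code page recovers the text.
    """
    return text.encode(wrong).decode(right)


fixed = fix_mojibake(garbled)
print(fixed)  # <i>Говорят по-цыгански</i>
```

The round trip only works while the garbled text still contains every original character; if some program already replaced unknown characters with question marks, those are lost.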
Now, if you need the Unicode version of the Russian subtitle file for Moi Ivan, toi Abraham, you can download it from here:
Moi Ivan, toi
UPDATE: As I recently found out, the process of correcting code-page-related issues in subtitles can be even easier, assuming that you have the free text editor Notepad++ installed. What you need to do is:
  1. Back up the subtitles file (just in case something goes wrong).
  2. Open the subtitles file in Notepad++.
  3. From the Encoding menu, select the Character sets option.
  4. Under the character set, select the appropriate language family and then the code page (you may need to try a few code pages if you don't know which one to use).
  5. When you see the characters appearing in the correct format, select the Convert to UTF-8 option under the Encoding menu.
  6. Save the file.
That should be it.
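If you have many files to fix, the same conversion can be scripted. Below is a minimal Python sketch, not a polished tool: it previews the first characters of a file under a few candidate code pages (my guesses for Russian text; adjust for other languages) so you can eyeball which one is right, then rewrites the file as UTF-8. The function names and the candidate list are assumptions made for illustration.

```python
from pathlib import Path

# Candidate code pages to try; an assumed list for Russian subtitles.
CANDIDATES = ("cp1251", "koi8_r", "cp866", "cp1252")


def preview_decodings(raw, candidates=CANDIDATES, length=60):
    """Return the start of the text under each candidate code page.

    A value of None means the bytes are not valid in that code page.
    """
    previews = {}
    for name in candidates:
        try:
            previews[name] = raw.decode(name)[:length]
        except UnicodeDecodeError:
            previews[name] = None
    return previews


def to_utf8(raw, source_encoding):
    """Decode bytes from the source code page and re-encode as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src, dst, source_encoding):
    """Write a UTF-8 copy of src to dst, leaving the original untouched."""
    raw = Path(src).read_bytes()
    Path(dst).write_bytes(to_utf8(raw, source_encoding))
```

A typical session would call preview_decodings(Path("movie.srt").read_bytes()), pick the code page whose preview reads correctly, and then call convert_file("movie.srt", "movie.utf8.srt", "cp1251"); the file names here are placeholders. Writing to a separate file keeps the original as a backup.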
See also:
The 3 Best Subtitle Sites For Your Movies & TV Series
How To Add Subtitles To A Movie Or TV Series
SubDownloader: Fast and Easy Subtitle Downloader
DivX Subtitles
DivXLand Media Subtitler Embeds Subtitles into Movie Files
Sublight Labs: Searching subtitles has never been this easy
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Friday, February 19, 2010

Acer Aspire Revo 3610 review

Summary: First look at the Acer Revo 3610 HTPC.

After a long search for a home theater PC (HTPC) and entertaining various options from building my own system (what was I thinking!) to buying something similar to Sony VAIO - VGX-TP1, I finally settled on Acer Aspire Revo AR3610-U9022 Desktop PC:
I chose Revo 3610 for a number of reasons, including:
Now, Revo 3610 has a few limitations, but none of them are critical to me:
  • Lack of optical drive
    While it is a major limitation, it is also one reason why the system is so cheap, and here is my take on it. I will mostly use Revo 3610 for watching streamed media or media stored on a hard drive. I do not care about Blu-ray at this point, but when I do, I'll buy a Blu-ray player anyway (these things are getting cheaper each month). And for watching regular DVDs/CDs, I already have a DVD player. I will need an external DVD writer, though (to burn system backup disks, etc.).
  • Small hard drive
    By current standards, the 160GB hard drive is small, but I have a bunch of external hard drives, which I can plug in, so no biggie here.
  • Small RAM
    The 2 GB RAM is hardly enough to run a 64-bit OS, so I'm planning to upgrade memory to 4 GB when I find DDR2 800 SDRAM on sale (here is a video explaining how to take Revo apart), or more likely, I'll just plug in a ReadyBoost flash drive (find ReadyBoost flash drives on Amazon).
  • Form factor
    I'm not a big fan of the design. Wish the system looked more like a DVD player and were entirely black and/or silver (the "dark blue" color is not offensive, though; it's very close to black). In the vertical position (on the stand), it looks weird.
Once I decided to buy Revo 3610, I could not find it on sale anywhere in the USA. I wasn't sure if it was discontinued, but after a couple of weeks of searching and waiting, I finally managed to snap one up from Amazon (when it became available for a few days). I got the system delivered about 10 days after placing the order. Here are my first impressions.

Where is the video cable? Duh!
The system supports video connection via the HDMI or the VGA port, but comes without cables. So how do you connect it to a TV? It would be nice if it came with an HDMI cable (seriously, a stock 6-foot HDMI cable costs what: $5?). If you decide to get Revo 3610, make sure that you have either a VGA or an HDMI cable ready (the cheap HDMI cable I bought at Meritline works very well; see HDMI cable deals).

Documentation? What documentation?
Once I connected Revo 3610 to my 32" Toshiba HDTV, I could not figure out how to make the supplied wireless keyboard and mouse work. I was expecting to find instructions in the documentation, but there were none. The provided pamphlet includes IKEA-like diagrams and some information explaining how to proceed with OS setup, but there is not a single sentence to explain how to enable the keyboard and mouse. The Revo 3610 documentation is a joke. I'm not kidding. Here is a quote from the User Guide I found on the hard drive after I completed the setup:
"Insert the startup disk you created during Windows setup into the floppy drive and press Ctrl + Alt + Del to restart your computer."
Floppy drive? Anyway, to complete the setup I just plugged in my old USB keyboard and mouse, but I needed to figure out how to actually make the wireless mouse and keyboard work, so I decided to try customer support.

Customer support! What customer support? Ah, that customer support!
I first submitted a question via the web form. It has been over a month, and I haven't received a response yet. I then tried the online chat, but was told that it only covers Aspire One (whatever it is). I got the support number though (1-800-571-2237). The official version was that the phone support is available Mon-Fri 7 am-9 pm CST and Sat-Sun 8 am-5 pm CST (excluding holidays), but the person I talked to first said that it's available 24 hours. The support technician explained to me that the USB receiver is hidden inside the battery compartment of the wireless mouse (the receiver is really small, so I did not notice it when I inserted the battery). Once I plugged the receiver into one of the Revo's USB ports and turned on the mouse's power switch (it's nice that it has one), the mouse started to work, but I still struggled with the keyboard. The support technician told me to push the Connect switch on the bottom of the keyboard (apparently, the keyboard uses the same receiver as the mouse), but I tried and tried and it did not seem to work. After struggling for a few minutes, I finally managed to connect the keyboard, but I'm still not sure what I did differently. So finally I got the devices working.

VESA mount... no mount...
I was planning to mount Revo on the back side of my TV, but the holes in the supplied VESA mount did not match the holes in the TV. I suspect that the VESA mount is intended for smaller monitors. Even if the holes matched I would still be confused by the "instruction" explaining how to use the VESA mount.

Software, badware, goodware.
Revo 3610 comes with the 64-bit edition of Windows 7, which is not the optimal choice. Not only does the 64-bit OS require more resources (such as RAM), but since an HTPC is intended primarily for media viewing and most of the popular media software (Flash, media players, browsers) only comes as 32-bit applications, this seems like a waste of resources. They do work, though, but what's the point! Anyway, like any other system, Revo comes with some bloatware, which took me about two hours to uninstall. I then tried to get the latest updates (OS, NVIDIA drivers, etc.), and I screwed up by not following instructions (I did not uninstall Flash before installing Flash 10.1 Beta, forgot to install the latest NVIDIA chipset driver, and made a couple of other goofs, causing a few apps -- mostly web browsers showing videos -- to crash). I finally followed Paul J Roberts' instructions to the letter, and voila: it now works, and seems pretty stable.

One thing that disappointed me was that Revo did not recognize DVD+R in my external DVD writer. It worked fine for burning backup CDs, though. I suspect this is an issue with the DVD writer (I had problems with it before), so I'm currently shopping for another drive (I kinda like Samsung SE-S084C/RSBN, but keep my options open [got the Samsung SE-S084C/RSBN from Amazon for $49.99]).
I do not like the Revo's power switch. Every time I press it, I'm afraid it'll break. I'd rather have a conventional power button.

I connected a 1.5 TB Western Digital Elements USB drive for additional storage, but I did not figure out how to access this drive from my other laptops (running Windows XP and Vista).
Although my display is set to show larger fonts (150%), I wish I could make the fonts even bigger. I'm quite satisfied with HD video quality. Streaming video works fine, but at this point, I'm constrained by the speed limitations of my wireless 802.11b/g router (I have been waiting for the price of Netgear Rangemax WNDR3700 to drop), so the quality of streaming video varies. Programs like The Daily Show and Frontline play fine, but HD content is choppy (most likely, the router issue). Local HD videos look great. [Update: I have since bought ASUS RT-N13U Wireless-N Router, Access Point, and Repeater; see my review here.]

I have been using Revo 3610 for just a couple of weeks, so I'm still learning the software (Windows 7 quirks, Windows Media Center, Boxee, etc), but so far, I like it. I'm planning to post an update once I get a better grip on it and get more devices (in addition to wireless-N router, I'm also considering a TV tuner, and maybe a remote).

UPDATE (Feb 22, 2010): Just discovered an issue with x264-encoded (H.264) MKV files: the Revo plays them, but the video is really slow and gets out of sync with the audio. I suspect that this is an issue with the codec, so I'm looking for a solution. Will post an update once I get it fixed. Also noticed a couple of crashes when watching Flash in full-screen mode (this started to happen after a recent Windows update). Will keep an eye on this one as well. [See update below.]

UPDATE (Feb 23, 2010): The x264 encoding/MKV issue seems to be solved, thanks to recommendations I found at the RevoUser board. Now, I'm too cheap to buy the CoreAVC Professional codec, so I somewhat altered the recommendations. First, I installed the latest release of the ffdshow codec pack. Because ffdshow is available in a 32-bit version, I also installed the 32-bit version of Media Player Classic Home Cinema. Other than checking a couple of audio settings in the ffdshow options, I did not make any other changes. I opened a couple of x264-encoded MKV files (1080p) in Media Player Classic, and they ran flawlessly. And the quality was just breathtaking even on my cheap 32" 720p Toshiba LCD. I'm very pleased with the results.

UPDATE (Nov 22, 2010): I have been using Revo 3610 for over 9 months now, and I'm quite happy with it. The only complaint I have is the performance of the wireless-N adapter for streaming HD video, but I'm not sure if this is an issue with my wireless setup (distance to the router, walls), or with the adapter. I don't stream HD video that often, so not a big deal for now. On another note, I have recently been helping some friends buy HTPCs and noticed a few models that appear to be newer and better alternatives to Revo 3610. These were not available at the time I bought it, but if I were shopping for an HTPC now, I would seriously consider the following models (Amazon prices seem reasonable at the time of this post):

Find more HTPCs on sale.

UPDATE (Jan 15, 2011): A few weeks ago, the mouse started to act up (the left button kept getting stuck), so I called Acer and received a free replacement within a few days. No hassle whatsoever.

UPDATE (Jan 25, 2011): This week I decided to jump on the Windows Media Center wagon (so to speak), and had all kinds of issues. The major problem was that Windows 7 WMC could not play DVDs (from virtual drives mapped to ISO images, as well as from VIDEO_TS folders; I kept getting the error: "Files needed to display video are not installed or are not working correctly."). After 3 nights of investigation and a futile effort to reinstall the OS, I finally found the solution at Windows Client TechCenter (see post by CGTracy). It involves 4 steps:
  1. In Windows Media Center, navigate to Settings > DVD > Audio.
  2. Check the Auto volume box.
  3. Click Save.
  4. Reboot computer.
I'll write a post on WMC migration, in which I'll explain how to make it work and play all kinds of media formats.

See also:
Acer Aspire Revo Review
Acer Aspire Revo 3610 Atom ION 330 Review
Why It's Finally Time To Get a Home Theater PC
My Media Center Setup
Guide to Building a HD HTPC
Diary Of My Switch To Internet TV - Part 2
Diary Of My Switch To Internet TV - Part 4 (Revo user forum)
Beginner's Guide What is an HTPC?
How I Built the Media Center of My Dreams for Under $500

Thursday, February 18, 2010

Going Chrome... Google Chrome

Summary: If you haven't checked out Google Chrome lately, this may be the time.

Well, boys and girls, it looks like I'm switching to Google Chrome.

I have been using Firefox since version 1.5, but lately Firefox's performance has turned into a major hassle. It takes me from 15 seconds to over a minute to launch the browser, and it's just plain silly. I tried every suggestion I could find -- reducing the number of add-ons, deleting and recreating a user profile, etc. -- but since nothing helped, I have been looking for a browser replacement.

I briefly flirted with every version of IE that came out of Microsoft. IE7, IE8... did not like any of them. I tried a couple of the Chrome betas when they just came out, and did not like them mostly due to missing features, such as lack of extension support. I installed the current version (v.4.0.x) a couple of weeks ago, and the more I use it, the more I like it.

First of all, I'm blown away by the performance. Compared to Firefox, everything from initial browser launch to page loads seems blazingly fast.

Chrome's user interface is very pleasing. Many minor details illustrate a lot of thought put into the GUI. For example, I like that the status bar only appears when status is changing (this leaves more useful space for the web content). It's nice that the menu options are located under a single toolbar button (again giving more space and reducing the UI clutter).

Finally, I found out that all but a few of my favorite Firefox add-ons had been ported to Chrome extensions, which made my transition easier. I would not have made the switch to Google Chrome, had the following extensions not been available:
  • IE Tab Multi (alternative: IE Tab)
    Displays web pages using IE rendering engine hosted inside of a Chrome tab.
  • LastPass
    Password manager, form filler, and more.
  • Xmarks Bookmarks Sync
    Synchronizes my bookmarks across IE, Firefox, Chrome and multiple computers.
I could've lived without the following extensions, but they sure make my web browsing experience better:
And here are several extensions that developers will appreciate (courtesy of SitePoint Design View #69):
  • Pendule
    Extends the features of the built-in Developer Tools (available via Ctrl+Shift+I).
  • Firebug Lite
    A somewhat crippled version of the most popular Firefox add-on.
  • Resolution Test
    Changes the size of the browser window for developers to preview their websites in different screen resolutions (see also Window Resizer).
  • Eye Dropper
    Allows you to pick color from any webpage.
These extensions and a handful of bookmarklets should take care of most of my browser needs. (By the way, in case you did not know: you do not need to restart Chrome after installing a new extension. Sweet.)

Too bad I could not find replacements for the following Firefox add-ons:
  • DownThemAll
    Downloads all or selected files pointed by hyperlinks or image references in a web page.
  • FireFTP
    In-browser FTP client.
So, I'll keep Firefox just in case I need to use it (as well as IE), but for most of my needs, it looks like Chrome will serve me pretty well. Thanks, Google.