I’ve been working on getting our site upgraded to EE 2.9. I’ve finally managed to get the upgrade scripts to run properly and the site to more or less come up, but I’m still having a problem: entries that contain an em dash (—) are truncated at the em dash character.
My original 1.7.3 database was already in utf-8, and this caused just about all entries with extended characters to be similarly truncated during the ud_200.php conversion stage, in both content and templates. I edited that script to bypass the utf-8 conversion, which seems to have solved the problem for all of the other extended-character entries, but not for the ones containing an em dash. Unfortunately, those make up a large share of our entries from the past 12 years.
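
One way to narrow this down is to check how the em dash is actually stored in the affected rows. The Python sketch below is only an illustration, not something from EE or its upgrade scripts. It assumes the raw bytes of an entry have been exported somehow, and the helper name `describe_em_dash` is hypothetical. The guess behind it, which may not hold here, is that some em dashes were saved as the single Windows-1252 byte 0x97 rather than the three-byte UTF-8 sequence, which would make those rows invalid UTF-8 even in a database declared as utf-8.

```python
# Byte forms of the em dash (U+2014) in the two encodings being compared.
EM_DASH_UTF8 = "\u2014".encode("utf-8")     # b'\xe2\x80\x94'
EM_DASH_CP1252 = "\u2014".encode("cp1252")  # b'\x97'


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def describe_em_dash(raw: bytes) -> str:
    """Hypothetical helper: guess how an em dash is stored in raw entry bytes."""
    # Check UTF-8 validity first, because 0x97 can also appear as a
    # legitimate continuation byte inside a valid UTF-8 sequence.
    if is_valid_utf8(raw):
        return "utf-8" if EM_DASH_UTF8 in raw else "no em dash found"
    if EM_DASH_CP1252 in raw:
        return "cp1252"
    return "unknown"


if __name__ == "__main__":
    # Made-up sample strings, not real entry data.
    print(describe_em_dash("Caf\u00e9 \u2014 bar".encode("utf-8")))
    print(describe_em_dash("Cafe \u2014 bar".encode("cp1252")))
```

If the affected entries come back as `cp1252`, the data is probably mixed-encoding and the em dashes would need to be re-encoded before or after the upgrade. If they come back as `utf-8`, the truncation is more likely happening at the connection or column character set level.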