HTML entities do not show up correct after upgrading to EE 2.1.3

#16 / Mar 18, 2011 2:16pm

Stefan Rechsteiner
442 posts

Any news on this?
#17 / Mar 18, 2011 2:18pm

Simon Balz
34 posts

Hi John and all

What’s going on here? I really need a solution now! Our project is already delayed because of this bug, and it is generating additional costs for us because we can’t migrate.

Thanks
Simon
#18 / Mar 19, 2011 7:06pm

Greg Salt
3988 posts

Hi Simon,

Apologies for the delay. I have chased this up. Thank you for your patience whilst this is investigated.

Cheers

Greg
#19 / Mar 21, 2011 6:11am

Simon Balz
34 posts

Hi All

In the meantime I tried updating to EE 1.7.0 first and then migrating to EE 2.1.3, but it didn’t make any difference.

Hope to get input soon.

Thanks
#20 / Mar 21, 2011 5:03pm

Brandon Jones
5500 posts

Hi Simon,

Have you tried the Find and Replace utility, for example to convert the ASCII code to ö?

Not ideal, but that might get you up and running in the meantime?
#21 / Mar 21, 2011 5:28pm

Simon Balz
34 posts

Hey Brandon

Thanks for this proposal, but this isn’t a real solution, since ö was only one example of the special characters that aren’t displayed correctly.
Since there are around 100 special ascii characters in more than 250 fields across around 10’000 entries, it’s too risky for us to do such a data modification.

Sorry..

Simon
#22 / Mar 22, 2011 12:19pm

Sue Crocker
26054 posts

The good news is that the search and replace will do all of one kind at a time. You’d do the replaces for each umlauted vowel, and for whatever other characters you need to replace. Are new entries behaving correctly?
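
As a rough illustration of the one-kind-at-a-time approach, the search/replace pairs for the umlauted vowels could be listed as below. This is a Python sketch, not EE's Find and Replace utility; the `REPLACEMENTS` table and the `replace_one_kind` helper are hypothetical names used only for this example.

```python
# Hypothetical search/replace pairs for the umlauted vowels; extend as needed.
REPLACEMENTS = {
    "&auml;": "ä",
    "&ouml;": "ö",
    "&uuml;": "ü",
    "&Auml;": "Ä",
    "&Ouml;": "Ö",
    "&Uuml;": "Ü",
}


def replace_one_kind(text, entity):
    """Replace every occurrence of a single entity, leaving all others alone."""
    return text.replace(entity, REPLACEMENTS[entity])


sample = "Z&uuml;rich, K&ouml;ln"
step1 = replace_one_kind(sample, "&ouml;")  # only &ouml; is touched
step2 = replace_one_kind(step1, "&uuml;")   # then &uuml; in a second pass
```

Each pass changes exactly one kind of entity, which makes the effect of every replace easy to check before moving on to the next.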
#23 / Mar 22, 2011 5:04pm

Simon Balz
34 posts

Yes, in new entries umlauts and special characters are handled correctly.
But I would still rather have EE handle them correctly than have to search and replace all the chars…
Is there any chance of getting a bugfix?
It would be nice if you could communicate a bit more about how things are going with this issue, just to bring some light into the dark…

A small side question: Is it true that EE 2.1 doesn’t encode umlauts and special characters anymore? Maybe because utf-8 is used as the collation?
#24 / Mar 23, 2011 9:57am

Sue Crocker
26054 posts

Simon - it is true that EE2.x doesn’t encode umlauts any longer. See the attached screen shots.

As far as a fix goes, I’ll see what the status is on the development team side of things.
#25 / Mar 29, 2011 6:06pm

Robin Sowell
13255 posts

Sorry for the delay, Simon.

The problem is- we really don’t want to convert entities during the upgrade process. It’s completely valid/desirable to use them in some cases. Likewise, we don’t want to alter their display in the text fields- for pretty much the same reason.

For a new 2.x install, this works as desired. You put in an entity? It stays, and displays, as an entity. And with the forced encoding, there’s no longer a need to convert high ascii to entities- so you put in an umlaut, it stays an umlaut.

However, this is less desirable when high ascii was converted in 1.x- and those are updated to 2.x. There’s no good way to know what’s intended to be entities and what’s intended to be ascii.

I don’t think we want to change the behavior of the updater- I’m sure we don’t. But the code to do the actual conversion is pretty simple. Let me see what I can do w/a bit of standalone code. It would be a pretty much ‘all or nothing’ conversion of channel data, but for those in a situation where that’s desirable, it should be workable. And a standalone wouldn’t be confusing for the majority of folks who don’t need it.

Make sense what’s going on and why? And let me go poke some code, see what I can come up with for you.
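
For illustration, an ‘all or nothing’ conversion of the kind described above might look roughly like the following. This is a minimal Python sketch, not the standalone code mentioned in this post and not EE's own PHP; the `rows` dictionary and the `convert_entities` helper are hypothetical stand-ins for channel data, and `html.unescape` is the decoder from Python's standard library.

```python
import html


def convert_entities(value):
    """Decode every HTML entity in a stored field value ('all or nothing')."""
    return html.unescape(value)


# Hypothetical rows standing in for channel data: entry_id -> field text.
rows = {
    1: "Z&uuml;rich &amp; Gen&egrave;ve",
    2: "Plain text without entities",
}

# Every entity is decoded, including ones that may have been intentional.
converted = {entry_id: convert_entities(text) for entry_id, text in rows.items()}
```

Note that `&amp;` is decoded along with the umlaut, which is exactly the ‘all or nothing’ trade-off: entities that were entered on purpose are converted too.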
#26 / Mar 30, 2011 9:11am

Simon Balz
34 posts

Robin, thanks for your answer. This really helps me understand my issue.
I also understand that you don’t want to let EE convert entities during the upgrade process.

So in fact, it’s not a bug but a feature how you handle high ascii chars when upgrading from 1.x to 2.x, right? 😊
You’re saying the majority of folks don’t want to get their ascii converted, so do I have an unusual setup in my EE 1.x installation, converting all ascii chars automatically to entities? What would be best practice? Haven’t you received any feedback similar to mine? I simply can’t believe we’re the only ones running such a setup…

Before you spend time writing code, can you please explain what the risks would be for us in converting entities back to ascii chars? If there’s a big chance of breaking something, or of getting more trouble than the satisfaction of correctly displayed chars in the backend is worth, we should rather accept things as they are…
#27 / Mar 30, 2011 10:19am

Robin Sowell
13255 posts

😉 Yep- a feature! Well, really a ‘there is not a great way to handle this’ sort of feature.

I wouldn’t say your setup is odd- the ‘convert ascii’ was a setting in 1.x (it’s not in 2.x due to forcing utf8), and it definitely saw use. But the negative with it is there’s no way to know whether the stored data uses entities in a given case because entities are desired (I used them for code examples sometimes), or because it was necessary at the time for proper display (the umlaut). Best practice would have been to run utf8 and not convert high ascii- but that wasn’t always feasible- hence the setting.

It hasn’t come up yet that I’ve seen w/the upgrade- but it has come up before among the developers as a potential issue. It’s part of why we removed the high ascii setting in 2.x- well, it was no longer needed and confusing people (which has come up). But we also knew it was potentially problematic for 1.x upgrades that had been using the setting. Weighing the pros/cons, we decided to remove it; we decided the updater was not where we wanted to address any 1.x issues; and we shot around the idea of doing something in the Find/Replace utility or in an add-on. We just hadn’t done it yet.

To be clear, there are two issues with the removal of high ascii conversion for those who used to use it:
1. The display in text fields - which may be irritating but doesn’t hurt functionality. Irritating is bad though- particularly if it’s a client who is irritated;
2. The issue that concerns me more is it can affect searches. If half the data was converted to entities and half not (cause it was entered in 2.x)- say you search for a company name that includes a copyright symbol. Half the time the name was stored using entities, half not. NOW there’s a functional problem- because only half of the results are going to match.
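
The search problem in point 2 can be sketched as follows. This is an illustrative Python example rather than EE's search code; the `stored` values and both helper functions are hypothetical, and decoding with `html.unescape` before comparing is just one assumed way to make both storage forms match.

```python
import html

# Hypothetical stored values: one saved under 1.x with conversion on,
# one entered in 2.x and stored as-is.
stored = [
    "Acme &copy; Widgets",  # copyright symbol stored as an entity
    "Acme © Gadgets",       # copyright symbol stored as a character
]


def naive_search(term, values):
    """Plain substring search over the raw stored values."""
    return [v for v in values if term in v]


def normalized_search(term, values):
    """Decode entities before comparing, so both storage forms match."""
    return [v for v in values if term in html.unescape(v)]


raw_hits = naive_search("Acme ©", stored)
all_hits = normalized_search("Acme ©", stored)
```

The naive search finds only the entry entered in 2.x, while the normalized search finds both.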

So- we really do need to provide some way of addressing the issue. I’d say the backend display is not really a big problem and could be approached in a couple of ways. The search issue to me is more of a potential problem, at least in some cases (i.e., when searches on words/phrases that have been converted are likely). The risk to doing it is you could convert some things you don’t mean to- like I say, I purposely used entities sometimes- I would not want them converted. If that is NOT the case on a given site- then as long as the data are backed up before converting, there’s not much risk. And with a backup of the channel_data table- you can always roll back.

All of which is a long way of saying- it’s a bit site dependent (and client dependent- having the entities display in text fields might drive some folks to distraction). For me- if I thought there was much chance of having the converted characters in there affecting search results? I’d convert it.

Hrm- wondering if I should code in an option- convert all, convert code characters, convert other characters- code characters being &,<,>,”,’,-. Or options to include each of those independently. But that might get into being confusing again.
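
To make the three options concrete, here is one way such a switch could be sketched. This is a hypothetical Python illustration, not the converter discussed in this thread; the `convert` function, its `mode` values, and the `CODE_CHARS` set (the code characters listed above, taken as plain ASCII) are all assumptions.

```python
import html
import re

# Matches named, decimal, and hexadecimal entities such as &ouml; &#246; &#xF6;
ENTITY_RE = re.compile(r"&(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")

# The "code characters" from the list above, as plain ASCII (an assumption).
CODE_CHARS = set("&<>\"'-")


def convert(text, mode="all"):
    """Decode entities according to mode: 'all', 'code', or 'other'."""
    if mode not in ("all", "code", "other"):
        raise ValueError(f"unknown mode: {mode}")

    def decode(match):
        entity = match.group(0)
        char = html.unescape(entity)
        is_code = char in CODE_CHARS
        # 'code' decodes only code characters; 'other' decodes everything else.
        if mode == "all" or (mode == "code") == is_code:
            return char
        return entity  # leave this entity untouched

    return ENTITY_RE.sub(decode, text)


sample = "&lt;b&gt;Z&uuml;rich&lt;/b&gt; &amp; more"
only_other = convert(sample, "other")
only_code = convert(sample, "code")
everything = convert(sample, "all")
```

With `mode="other"`, the umlaut is decoded while `&lt;`, `&gt;`, and `&amp;` stay as entities, which is the case where entities were used on purpose for code examples.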

Babbling- I need more caffeine. Let’s just say- I’m going to poke at a converter today regardless- the issue needs a way for us to address it. Whether it’s something that really needs to be done to a given site is going to vary depending on the data and whether there are a lot of instances where entities really were the desired option and you want to keep them.

That help clarify at all?
#28 / May 14, 2011 6:07pm

Simon Balz
34 posts

Robin,

thanks for your explanations. Did you get something like a converter in the meantime?

I’d like to treat you to some delicious Swiss Nespresso coffee if we are ever able to meet 😉

But yes, your post did clarify all my addressed issues.

Cheers
Simon
#29 / May 15, 2011 9:44am

Greg Salt
3988 posts

Hi Simon,

Good stuff. Robin will be back during the week so I’ll leave this thread open.

Cheers

Greg
#30 / Jul 11, 2011 12:40pm

Simon Balz
34 posts

Hi All

Any news so far?

Thanks
Simon
