ExpressionEngine CMS

This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Problem with EE

May 14, 2011 6:56am

  • #1 / May 14, 2011 6:56am

    MINDSCREEN

    218 posts

    Editing text which has been copied+pasted from PDFs and the internet

    1. I have noticed that articles which appear on MercatorNet (and BioEdge) sometimes appear to contain gross spelling errors. When I examined them, nearly all turned out to be cases of omitted spaces, i.e., “an apple” becomes “anapple” or “is thinking” becomes “isthinking”.

    2. The problem only becomes evident AFTER the text has been edited and saved in ExpressionEngine. For some reason, the spaces do not close up until after saving. So it sometimes happens that the article looks OK when you examine it in draft form. However, when you save and exit, the spaces close up.

    3. This only happens with text which has been copied and pasted from PDFs and from HTML text from the internet.

    4. The cause of the problem is that there appear to be two symbols for spaces—a normal space and something which is ... not normal
    Microsoft Word.jpg

    The way to fix this is to select the symbol in Microsoft Word and replace it with normal spaces.

    5. Having said all this, the problem only happens sometimes, not always. But it is good to check, because it is very tricky.

    Please check this issue and reply soon.

    Regards
    Debasish

  • #2 / May 14, 2011 9:14am

    handyman

    509 posts

    ANY text which is going to be put into a CMS (or anywhere else) should first be placed in a text editor and/or cleaner (BBEdit, etc.) and cleaned up.
    Otherwise, you are asking for trouble with line breaks, strange characters, etc.

  • #3 / May 15, 2011 3:18pm

    Greg Salt

    3988 posts

    Hi MINDSCREEN,

    Craig is quite correct - any text that you copy and paste from an external source (Microsoft Word is a common problem) may contain hidden or unusual characters, and really the only way to remove them is to initially copy the text into a plain text editor. Doing that will remove any non-standard control or formatting characters and should show you if there is anything you need to edit before posting it.

    Cheers

    Greg
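
For anyone who would rather script the cleanup than do it by hand, here is a minimal sketch in Python. It assumes the "not normal" space is a no-break space (U+00A0) or one of a few similar Unicode space characters; the thread never identifies the actual character, so the set below is a guess, and the function names are hypothetical. The first function only reports what it finds, so you can check before changing anything.

```python
import unicodedata

# Assumption: the "not normal" space is one of these characters.
# The thread does not say which character is actually involved.
UNUSUAL_SPACES = {
    "\u00a0",  # no-break space
    "\u2002",  # en space
    "\u2003",  # em space
    "\u2009",  # thin space
    "\u202f",  # narrow no-break space
    "\u3000",  # ideographic space
}


def find_unusual_spaces(text):
    """Return (index, Unicode name) for each unusual space found in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in UNUSUAL_SPACES
    ]


def normalize_spaces(text):
    """Replace each unusual space with a plain ASCII space."""
    return "".join(" " if ch in UNUSUAL_SPACES else ch for ch in text)


# Example: text as it might arrive from a PDF or web page.
pasted = "an\u00a0apple is\u2009thinking"
print(find_unusual_spaces(pasted))
print(normalize_spaces(pasted))
```

Running the cleaned text through `normalize_spaces` before pasting it into the entry form should leave only ordinary spaces, which is the same result as the find-and-replace in Word described in post #1.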
