Editing entry via CP kills . (dot) in url_title
Posted: 11 January 2005 02:58 PM   [ Ignore ]  
Research Assistant
Total Posts:  768
Joined  03-16-2002

It seems a . (dot) isn’t allowed in the url_title?

I have some titles like entry123.html in the EE database, generated by a specialized import function I had to write for an outdated weblog system. This lets me use mod_rewrite to simulate the old URLs (which are present in Google and in a lot of bookmarks). Everything works as expected ... until I change something in such an imported entry. When I do, EE silently removes the dot from the url_title - leaving me with entry123html ... and my rewrite rules in .htaccess no longer match that entry.

I already found cp.publish.php, where the url_title seems to be stripped of special characters - but I don’t want to damage EE by hacking the back end. Therefore, two questions arise:

1. Is there a specific reason not to allow a dot in the url_title? If not, it would sometimes be useful to simulate xxx.html endings, especially for imported old entries.

2. Is it a bug or a feature that editing an entry via the CP can change the url_title automatically, with no way of revoking the change? (Okay, I’ve used phpMyAdmin ;) ) I would assume that I can change the url_title manually, but that EE won’t touch it by default?

(In addition: is there an easy way to stop EE from removing my dots, using EE 1.2 251104? ;) )

Posted: 11 January 2005 03:20 PM   [ Ignore ]   [ # 1 ]  
Moderator
Total Posts:  32895
Joined  05-14-2004

The URL Title feature ensures that all url-titles are made up of “valid url’s” only - that is, things all browsers will read as a URL.  The period, in a URL, is really meant to denote a part of a domain and should not be used outside of it - so the url title strips it.

You can change the url-title via the cp and it won’t modify it if you go in later and edit it; but I think it wouldn’t accept a . if you did it via the cp in the first place.  Mucking around in the database can bring up thangs like thas. ;)

And no, I don’t believe there is a way to stop this behavior….

Posted: 11 January 2005 03:47 PM   [ Ignore ]   [ # 2 ]  
Research Assistant
Total Posts:  768
Joined  03-16-2002

Thanks, LisaJill. But I don’t think a dot in the url_title makes the URL invalid in a technical sense. I have:
www.server.com/index.php/weblog/comments/entry123.html
or www.server.com/index.php/weblog/comments/entry123.html/
or even www.server.com/index/weblog/comments/entry123.html
and I can’t see any technical difference from the URL EE forces, without the dot:
www.server.com/index.php/weblog/comments/entry123html

Of course, I’m aware there are special characters that shouldn’t be used in EE url_titles (like slashes), or that would need an escaping mechanism, but I was wondering why the dot is eliminated. As I stated, these titles were given in the old system and I imported them without seeing a problem. And I didn’t run into one when testing - EE behaved as expected as far as I have noticed ... I just ran into the effect that EE modifies the url_title without my knowledge when I edit such an entry, which is rarely done.

And of course, I’m aware that 99.99% of EE users will never run into such a problem. But “mucking around in the database” is what one has to do when importing 1200+ entries from another system not supported by the default import filters *g*. And I didn’t find a reference in the docs or the knowledge blog stating that dots aren’t permitted in the url_title field.

As stated, there is a way of stopping this behaviour, provided nothing requires url_titles to be free of dots - hacking the back end file mentioned. Still, I think feedback is at least lacking when EE changes my data in the database “automatically on the fly” without notification. It’s okay when the live url feature generates a new url_title, but not when I edit an entry that already has a url_title.
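
A minimal sketch of the behaviour asked for here, written in Python rather than EE's PHP and not based on the actual code in cp.publish.php. All function names are hypothetical, and the cleanup rule is only an assumed stand-in for whatever EE really does. The idea: generate a cleaned url_title only when none is stored yet, and on edit keep the stored value, returning a notice instead of silently rewriting it.

```python
import re


def generate_url_title(title):
    """Assumed cleanup rule: lowercase, whitespace to underscores,
    drop everything except a-z, 0-9, underscore and hyphen."""
    text = re.sub(r"\s+", "_", title.strip().lower())
    return re.sub(r"[^a-z0-9_\-]", "", text)


def url_title_for_save(title, existing_url_title=None):
    """Return (url_title, notice).

    New entry: derive the url_title from the title.
    Edited entry: keep the stored url_title untouched and report
    if the cleanup rule would have changed it.
    """
    if not existing_url_title:
        return generate_url_title(title), None

    notice = None
    if generate_url_title(existing_url_title) != existing_url_title:
        notice = (
            "url_title '%s' contains characters the cleanup would "
            "remove; kept as stored" % existing_url_title
        )
    return existing_url_title, notice


# Editing an imported entry keeps the dot and produces a notice.
print(url_title_for_save("Old entry", "entry123.html"))
# A new entry gets a freshly generated url_title.
print(url_title_for_save("Some post title."))
```

With this split, the rewrite rules for imported entries keep matching after an edit, while new entries still follow the usual conventions.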

But thanks for the input and all the time and work you spend helping EE users here!
-Markus

Posted: 11 January 2005 04:28 PM   [ Ignore ]   [ # 3 ]  
Moderator
Total Posts:  32895
Joined  05-14-2004

Well, I see what you’re trying to do, but this gets cleaned up so that people don’t have things like comments/hi.i.am.a.title/ which would not be technically valid since those aren’t domains or extensions.

What is interesting is that the import, however you did it, didn’t clean those up.  And that is causing some issues, it appears. But ONLY when you edit. 

The problem here is I don’t have a solution for you - I see what you’re doing, I see why it’s a problem, but I am not a programmer just support - I was trying to explain something that I now think you already knew *winks* which is always a little silly on my part!

So the problem isn’t with the entries already there, but when you edit one and it goes and changes it.  You could use some mod rewrite and update them all so it’s not a problem.  Otherwise you would need to (either yourself, or with instructions from someone smarter than I) find a way to disable that part of the URL cleanup in the url-titles routine.  And only on edited posts, probably…

I wish I could help more. :(

Posted: 11 January 2005 06:28 PM   [ Ignore ]   [ # 4 ]  
Research Assistant
Total Posts:  768
Joined  03-16-2002

> I wish I could help more. :(
Nothing to worry about, LJ :)

I just wish I had a statement from the programmers (Rick?) on whether there is a deeper (structural) reason not to use dots in the url_title. I know slashes are used to separate parameters, and I’ve noticed a couple of minor optimization possibilities for the live url “cleanup” (e.g. instead of dropping umlauts completely, converting them either to their base character [ä -> a] or to the transliteration [ä -> ae]).
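
To illustrate the two umlaut options mentioned above, here is a small Python sketch (not EE code; the function names and the mapping table are assumptions made for this example). The first function reduces accented letters to their base character using Unicode decomposition, the second applies the usual German transliteration.

```python
import unicodedata

# Assumed mapping for the German transliteration variant.
TRANSLITERATION = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}


def to_base_characters(text):
    """Variant 1: ä -> a, by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterate(text):
    """Variant 2: ä -> ae, using the mapping table above."""
    # Normalize to composed form first so 'a' + combining diaeresis
    # is seen as the single character 'ä'.
    composed = unicodedata.normalize("NFC", text)
    return "".join(TRANSLITERATION.get(ch, ch) for ch in composed)


print(to_base_characters("Bär"))
print(transliterate("Bär"))
```

Either result could then be passed through the normal url_title cleanup, so the letter survives in some form instead of disappearing.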

Writing my own import functions gave me the possibility to use whatever I liked as the URL for the old entries ... all have the same format (“entry000.html”) - exactly the same as in the old system. It seems I can use these non-conformant url_titles without breaking the EE dynamics. It works (or seems to work?) as long as I’m not editing them. New entries will of course have “descriptive” url_titles that respect the EE conventions. I might only run into trouble if dots in the url_title are either used or reserved for some special function. Then a future update might break my installation :-(

Perhaps I should have named the thread “WHY does editing an entry via CP kill . (dot) in url_title?” ;)

Thanks again!
-Markus

Posted: 11 January 2005 08:11 PM   [ Ignore ]   [ # 5 ]  
Lab Assistant
Total Posts:  281
Joined  11-30-2002

I don’t have the answer to this, but I do want to point out something. In the default RSS templates, the template name has a dot in it - rss_2.0 - and those dots in the url work fine. So I wonder too why dots aren’t allowed in the url title.

Posted: 11 January 2005 08:15 PM   [ Ignore ]   [ # 6 ]  
Moderator
Total Posts:  32895
Joined  05-14-2004

The dots don’t break it, it’s just not best practice, as I understand it. =)

Posted: 11 January 2005 11:28 PM   [ Ignore ]   [ # 7 ]  
Administrator
Total Posts:  2541
Joined  12-21-2001

It was a design decision.  Your needs are quite unique; most people don’t intend to mimic HTML files.  Think about what most people submit as their post title: it’s a sentence, which might have a period at the end:

Some post title.

This example would become:

some_post_title

It would look awfully odd to leave the period there, plus it serves no purpose.  Again, there are exceptions to every rule, and you’re it.  For 99.9% of people, though, periods make no sense.
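
The conversion in the example above can be sketched roughly as follows. This is an illustration in Python, not EE's actual routine; `make_url_title` and its `keep_dots` switch are hypothetical. The switch shows one possible middle ground: dots inside the title survive (as in entry123.html), while a sentence-ending period is still dropped.

```python
import re


def make_url_title(title, keep_dots=False):
    """Hypothetical cleanup: lowercase, whitespace to underscores,
    remove every character outside the allowed set."""
    text = re.sub(r"\s+", "_", title.strip().lower())
    if keep_dots:
        text = re.sub(r"[^a-z0-9_\-.]", "", text)
        # Interior dots stay, leading and trailing ones are trimmed.
        return text.strip("._-")
    text = re.sub(r"[^a-z0-9_\-]", "", text)
    return text.strip("_-")


print(make_url_title("Some post title."))
print(make_url_title("entry123.html"))
print(make_url_title("entry123.html", keep_dots=True))
print(make_url_title("Some post title.", keep_dots=True))
```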

Posted: 12 January 2005 10:10 AM   [ Ignore ]   [ # 8 ]  
Research Assistant
Total Posts:  768
Joined  03-16-2002

Thanks a lot for the clarification, Rick & LJ! That’s what I hoped: the dot will not break anything internally :)
-Markus
