Google Sitemap verification
Posted: 05 September 2005 10:21 AM
Grad Student
Total Posts:  84
Joined  04-17-2004

I created a Google Sitemap on both my blogs in accordance with the template on the eewiki - http://www.eewiki.com/wiki?title=Google_Sitemaps

Both sitemaps were crawled by Google and accepted. More recently they have introduced a feature which requires you to verify your sitemap so that they can provide you with more information. The instructions for verification are as follows:

1.  Create a verification file

Create an empty file named GOOGLE234234234etc.html (changed from the original provided to me). This file uniquely identifies you for Google. You can create this file in any text editor. The file should be empty, since we are only checking that it exists at the same location as your Sitemap and we aren’t going to read the contents. You can read more about this file here.

2. Upload the verification file

Once you have created the verification file, place it on your server at http://www.oracleappsblog.com/index.php/weblog/sitemap/.

Since the directories mentioned don’t actually exist, can anyone give me some guidance on where I should upload this verification file? Has anyone actually done this verification process yet?

 Signature 

Richard Byrom
Information Technology Consultant, Speaker and Author
Web: http://www.richardbyrom.com

Posted: 05 September 2005 11:09 AM   [ # 1 ]
Research Scientist
Total Posts:  9853
Joined  06-19-2002

Good grief.  Make it difficult for you, why don’t they…

You’re correct; that directory doesn’t really exist.  My suggestion would be to use a .htaccess rule (redirect or mod_rewrite) to tell the server to serve “/GOOGLE234234234etc.html” whenever “/index.php/weblog/sitemap/GOOGLE234234234etc.html” is requested.
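
A rewrite rule along those lines might look like the sketch below. It is untested and assumes Apache with mod_rewrite enabled and a .htaccess file in the document root. The filename is the placeholder used in this thread, and how URLs with extra path segments after index.php are matched can vary between Apache versions and configurations.

```apache
# Serve the verification file from the document root when it is requested
# under the virtual sitemap path. The filename is the thread's placeholder.
RewriteEngine On
RewriteRule ^index\.php/weblog/sitemap/(GOOGLE234234234etc\.html)$ /$1 [L]
```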

Of course, it would also probably be good to contact Google about this, explain the situation, and ask them how they suggest you handle it.  EE certainly isn’t the only system to use “virtual” directories by a long shot and this requirement of Google’s would impact every one of them.

 Signature 

Chris Curtis
chriscurtis.org

Posted: 05 September 2005 11:13 AM   [ # 2 ]
Grad Student
Total Posts:  84
Joined  04-17-2004

Think I’ll go for the redirect option. I’ll also contact Google and see what they have to say.

Posted: 13 September 2005 05:29 PM   [ # 3 ]
Grad Student
Total Posts:  84
Joined  04-17-2004

I haven’t been able to sort this out using a redirect, so I’m going to mail Google to see what they suggest.

Posted: 13 September 2005 07:26 PM   [ # 4 ]
Moderator
Total Posts:  1083
Joined  08-01-2002

Have you tried simply creating a template named google91093283.html in the template group “weblog”?

Posted: 10 October 2005 04:57 PM   [ # 5 ]
Lab Assistant
Total Posts:  116
Joined  06-02-2005

Have Google changed this?  I put my file in the root directory as they requested and it seems to work.

Posted: 05 November 2005 10:02 AM   [ # 6 ]
Lab Assistant
Total Posts:  117
Joined  12-10-2004

Hi,

Following this thread…

I just placed the file in my root directory (pMachine-hosted server), and when I tried to verify with Google, I received the following message and explanation:

We’ve detected that your 404 (file not found) error page returns a status of 200 (OK) in the header.
This configuration presents a security risk for site verification and therefore, we can’t verify your site. If your web server is configured to return a status of 200 in the header of 404 pages, and we enabled you to verify your site with this configuration, others would be able to take advantage of this and verify your site as well. This would allow others to see your site statistics. To ensure that no one can take advantage of this configuration to view statistics to sites they don’t own, we only verify sites that return a status of 404 in the header of 404 pages.

Please modify your web server configuration to return a status of 404 in the header of 404 pages. Note that we do a HEAD request (and not a GET request) when we check for this. Once your web server is configured correctly, try to verify the site again. If your web server is configured this way and you receive this error, click Check Status again and we’ll recheck your configuration.
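
To see roughly what Google’s check would see, you can send a HEAD request for a URL that should not exist and look at the status line. This is a minimal PHP sketch, assuming PHP 7.1 or later (for the context argument of get_headers) and allow_url_fopen enabled. The names parse_status_code and head_status are illustrative and not part of EE.

```php
<?php
// Extract the numeric status code from a status line such as
// "HTTP/1.1 404 Not Found". Returns null if the line is not a status line.
function parse_status_code($statusLine)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Send a HEAD request (no redirects followed) and return the status code
// of the response, or null if the request failed.
function head_status($url)
{
    $context = stream_context_create(array(
        'http' => array(
            'method' => 'HEAD',
            'follow_location' => 0,
        ),
    ));
    $headers = @get_headers($url, false, $context);
    if ($headers === false || !isset($headers[0])) {
        return null;
    }
    return parse_status_code($headers[0]);
}
```

Calling head_status() on an address that does not exist on your site should give 404 on a correctly configured server. A result of 200 would reproduce the problem described in the message above.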

Posted: 16 November 2005 04:07 AM   [ # 7 ]
Summer Student
Total Posts:  3
Joined  02-05-2004

The problem is that the custom 404 template doesn’t send a 404 HTTP response. It responds with a 200 OK status code.

If you try to send the response manually, using embedded PHP:

<?php header("HTTP/1.0 404 Not Found"); ?>

it doesn’t work, because the header must be sent before any output, and the EE script has already output something by that point (it also doesn’t work if you enable PHP’s output_buffering).
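
One way to find out where the output starts is headers_sent(), which can report the file and line that produced the first output. This is a rough sketch: send_not_found is a made-up helper name, and it only diagnoses the problem rather than changing how EE behaves.

```php
<?php
// Hypothetical helper: try to send a 404 status line and report whether
// that was still possible. Returns false if output has already started.
function send_not_found()
{
    $file = '';
    $line = 0;
    if (headers_sent($file, $line)) {
        // Too late: something already produced output. Log where it began.
        error_log("Cannot send 404: output started in $file on line $line");
        return false;
    }
    header('HTTP/1.0 404 Not Found');
    return true;
}
```

If the function returns false, the logged file and line show which script emitted output before the header call.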

Please give me a solution, or implement an option that makes the error template respond with an HTTP 404 code. It’s the standard, and it would add a lot of value to EE. For example:

- Search engines could delete nonexistent pages from their databases, which would fix the current erratic behavior caused by the lack of semantically correct error handling.

Posted: 17 November 2005 05:47 AM   [ # 8 ]
Summer Student
Total Posts:  3
Joined  02-05-2004

Here is a workaround for users who cannot use .htaccess rules.

The main problem is that Google Sitemaps doesn’t accept URLs that lie outside the directory level of the sitemap’s own URL.

For example, if your sitemap is at http://www.yoursite.com/index.php/weblog/sitemap/, URLs like http://www.yoursite.com/index.php/weblog/, http://www.yoursite.com/index.php/members/, and http://www.yoursite.com/ won’t be accepted.

Because of this, the ideal location for the sitemap is the root path, without any “folder” (counting index.php as a “folder”).

EE doesn’t let you do this, but it can be done with a simple PHP script.

1-. Create a template, for example at /weblog/sitemap, as explained here

2-. The URL of your sitemap will be http://www.yoursite.com/index.php/weblog/sitemap/ or similar.

3-. Create a sitemap.php file and put it in your root path; the URL of that file will be http://www.yoursite.com/sitemap.php

4-. The sitemap.php file must contain:

<?php

// Prevent the response from being cached

header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");  // Expiry date in the past
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT"); // Always reported as just modified

// Tell the user agent that the content is XML, encoded as UTF-8

header('Content-Type: text/xml; charset=UTF-8');

// Fetch the rendered template over HTTP and output it (needs allow_url_fopen).
// If the fetch fails, report an error status instead of an empty 200 response.

if (@readfile('http://www.yoursite.com/index.php/weblog/sitemap/') === false) {
    header('HTTP/1.0 503 Service Unavailable');
}
?>

5-. Now the URL http://www.yoursite.com/sitemap.php is a clone of your template; use it in Google Sitemaps.

Notes: Loading the template with the “readfile” function would be inefficient on very, very large sites; on medium sites it is light, because the script runs only about once a day. In fact, the template loaded by “readfile” is always “local”, even though it is fetched over the “http” protocol (the PHP interpreter and your site are normally on the same machine or network cluster).

If you experience load problems because your site is very, very large, you could try this:

In the “/weblog/sitemap/” template, write PHP code that writes the content to a sitemap.xml file placed in the root path. (If you write that code, please post it for the whole community :)

That is a very efficient method, but you must run the template regularly, for example via the cron daemon or the Windows scheduled tasks equivalent.
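
A rough sketch of that idea follows, written as a standalone script that cron could run rather than as code inside the template itself. The function name write_static_sitemap and the target path are assumptions, and fetching over HTTP needs allow_url_fopen enabled.

```php
<?php
// Fetch the rendered sitemap template and save it as a static file.
// The content is written to a temporary file first and then renamed, so
// a failed fetch never replaces a good sitemap.xml with an empty one.
function write_static_sitemap($sourceUrl, $targetPath)
{
    $xml = @file_get_contents($sourceUrl);
    if ($xml === false || trim($xml) === '') {
        return false; // keep the previous file if the fetch failed
    }
    $tmp = $targetPath . '.tmp';
    if (file_put_contents($tmp, $xml) === false) {
        return false;
    }
    return rename($tmp, $targetPath);
}
```

From cron it could be called as write_static_sitemap('http://www.yoursite.com/index.php/weblog/sitemap/', '/path/to/webroot/sitemap.xml'), where the second argument is a placeholder for your own root path.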

If you can’t access the cron daemon or similar, you can use web services like:

http://www.webcron.org/

http://www.manucorp.com/

http://www.cronjob.de/

Please excuse my terrible English :)

Posted: 11 December 2005 03:41 AM   [ # 9 ]
Grad Student
Total Posts:  87
Joined  08-06-2004

Has anybody managed to solve this problem with the verification?
I can submit my sitemap, but I can’t solve the problem with the verification.

dpotter2 - 05 November 2005 10:02 AM

We’ve detected that your 404 (file not found) error page returns a status of 200 (OK) in the header.
This configuration presents a security risk for site verification and therefore, we can’t verify your site. If your web server is configured to return a status of 200 in the header of 404 pages, and we enabled you to verify your site with this configuration, others would be able to take advantage of this and verify your site as well. This would allow others to see your site statistics. To ensure that no one can take advantage of this configuration to view statistics to sites they don’t own, we only verify sites that return a status of 404 in the header of 404 pages.

Please modify your web server configuration to return a status of 404 in the header of 404 pages. Note that we do a HEAD request (and not a GET request) when we check for this. Once your web server is configured correctly, try to verify the site again. If your web server is configured this way and you receive this error, click Check Status again and we’ll recheck your configuration.

 Signature 

YachtPanorama.com

Posted: 14 January 2006 11:53 PM   [ # 10 ]
Lab Assistant
Total Posts:  105
Joined  07-09-2002

It appears we will have to wait for some brave soul to write a plug-in for EE. I have been struggling with this SEO issue like many others here. Recently I was shown a plugin for another wordy CMS that makes me so envious… because it’s automatic and runs very quickly, since it handles all URLs internally. Once EE gets something like that it will rule the world!

 Signature 

It’s the strangest thing. Yesterday, it was hard, today, it is easy. Just a good night’s sleep, and yesterday’s mysteries are today’s masteries.

Posted: 15 January 2006 04:55 AM   [ # 11 ]
Lab Technician
Total Posts:  1397
Joined  01-15-2005

So maybe you wish to add your voice to this FR

 Signature 

EE Duration Tags | {view_count_total}

Posted: 15 January 2006 11:50 AM   [ # 12 ]
Lab Assistant
Total Posts:  105
Joined  07-09-2002

Yes, I would lobby someone in the pMachine community (or Rick and Co.) to write a proper plug-in that:

a. Generates sitemap.xml.
b. Generates the sitemap.xml.gz.
c. Notifies google of the change.
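
As a rough idea of what steps a and b involve, here is a small PHP sketch that builds sitemap XML from a list of URLs and gzips it. The function names are made up, the namespace is the one from the current sitemaps.org 0.9 protocol (which is newer than this thread), and gzencode needs PHP’s zlib extension. Step c is left out because the notification address is not given in this thread.

```php
<?php
// Build a sitemap document from entries of the form
// array('loc' => URL, 'priority' => optional number between 0.0 and 1.0).
function build_sitemap_xml(array $entries)
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($entries as $entry) {
        $xml .= "  <url>\n";
        // Escape &, <, > and quotes so the URL is valid inside XML.
        $xml .= '    <loc>' . htmlspecialchars($entry['loc'], ENT_QUOTES, 'UTF-8') . "</loc>\n";
        if (isset($entry['priority'])) {
            $xml .= '    <priority>' . number_format((float) $entry['priority'], 1, '.', '') . "</priority>\n";
        }
        $xml .= "  </url>\n";
    }
    return $xml . "</urlset>\n";
}

// Compress the XML for a sitemap.xml.gz file (requires the zlib extension).
function gzip_sitemap($xml)
{
    return gzencode($xml, 9);
}
```

The two return values could then be written to sitemap.xml and sitemap.xml.gz in the root path with file_put_contents.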

I really like the Google Sitemap Generator for WordPress (http://www.arnebrachhold.de/2005/06/05/google-sitemaps-generator-v2-final); I would love to see something exactly like it for EE.

Google is not going to change their ways, so we need to adapt to them.

Posted: 16 January 2006 11:43 AM   [ # 13 ]
Lab Assistant
Total Posts:  124
Joined  12-31-2005

My experience has been that the Google verification file only needs to be a blank page uploaded to your server at the URL that Google gives you.

It’s NOT something that is created in EE, but something created in a normal text editor and FTP’d to your server. Alternatively, if you have a “File Manager” in your SERVER control panel, you can create a new file of that name there.

I’ve had no problems doing it this way.

Posted: 16 January 2006 03:32 PM   [ # 14 ]
Lab Assistant
Total Posts:  105
Joined  07-09-2002

Is there a way to stop EE from sending non 404 output on a 404 request?

Posted: 16 January 2006 09:05 PM   [ # 15 ]
Lab Assistant
Total Posts:  272
Joined  05-14-2004
tulkul - 16 January 2006 11:43 AM

My experience has been that the Google verification file only needs to be a blank page uploaded to your server at the URL that Google gives you.

It’s NOT something that is created in EE, but something created in a normal text editor and FTP’d to your server. Alternatively, if you have a “File Manager” in your SERVER control panel, you can create a new file of that name there.

I’ve had no problems doing it this way.

Me neither. I’ve done it on two separate installs of EE on two separate hosts. I just created a blank .htm page, uploaded it to the root of the site, and was good to go. Google gave me the option of using the root or the virtual URL EE supplied; I just picked the root.

 Signature 

RT2Photo

Posted: 16 January 2006 10:04 PM   [ # 16 ]
Lab Assistant
Total Posts:  105
Joined  07-09-2002

Guys,

It’s not as simple as just submitting a blank verification file; it is about providing Google with up-to-date sitemap.xml file(s) so that it can prioritize your links/categories/pages accordingly. Even if Google crawls your site, the lack of an up-to-date sitemap.xml will not help you. The purpose of the sitemap.xml is to help Google understand your site’s layout and prioritize the content it crawls, because you tell it what to prioritize. If it were as simple as posting a verification file, none of us would be asking for a plug-in.

To get a better understanding of the issue, look at this plugin/module for WordPress: http://www.arnebrachhold.de/2005/06/05/google-sitemaps-generator-v2-final

and this article/interview:
http://blog.searchenginewatch.com/blog/050602-195224

Posted: 19 January 2006 11:19 AM   [ # 17 ]
Summer Student
Total Posts:  1
Joined  03-22-2005

FYI: I was able to add the verification file to the root and google was able to verify the site.

I also created a file in the root, “google_sitemap.php”, that calls the Google sitemap template as per the instructions written by Antbe above.

The sitemap URL that I submitted to Google was www.mysite.com/google_sitemap.php, and that solved the problem.

It does sound like DPotter2 has a separate problem with the 404 error.

Posted: 03 February 2006 06:01 PM   [ # 18 ]
Lab Technician
Total Posts:  1695
Joined  05-13-2004
John Stence - 19 January 2006 11:19 AM

FYI: I was able to add the verification file to the root and google was able to verify the site.

I also created a file in the root, “google_sitemap.php”, that calls the Google sitemap template as per the instructions written by Antbe above.

The sitemap URL that I submitted to Google was www.mysite.com/google_sitemap.php, and that solved the problem.

Me too…

I saw this on the wiki and came here. Anyhow, now that I have a sitemap, I’m not sure what to do with it. Is it only for Google, or can I style it somehow so that readers/visitors can use it?

 Signature 

CreateSean Web Design
CreateSean - My journey to pro web designer
I am the poster formally known as The Linguist.
