ExpressionEngine CMS
Open, Free, Amazing

This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Weekly clockwork failure of EE since 2.7.2 upgrade

November 06, 2013 3:41pm

  • #1 / Nov 06, 2013 3:41pm

    Gusto

    31 posts

    Here’s a strange one that’s driving me crazy…

    Four weeks ago I upgraded from EE 2.6 to 2.7.2. All seemed to go well. I also added Playa to the mix after the upgrade. Nothing else changed. The OS is CentOS 5.8, with MySQL 5.0.95 and PHP 5.2.17, on a dedicated server with a few other sleepy domains that I control and haven’t touched in months.

    The next week, on Wednesday morning, my site stopped serving content and instead I got the dreaded white screen on both my home page and control panel. I checked my HTTP server and Apache was fine. It would serve static content from my home directory, and my Linux web-based control panel (Webmin) was serving fine. To make sure, I restarted Apache, and still no luck with EE. In desperation I did a reboot (the first one in 16 months) and the site was back live. Whew! I did some reading and allocated additional memory (256M) to PHP, thinking perhaps Playa or the newer version of EE needed more resources.

    The next Wednesday the entire event took place again. And the next Wednesday, and again today. (Groundhog Day anyone?)

    I have no cron jobs scheduled on the server around the time of failure (appears to be 12 PM Eastern, give or take) and I don’t have the EE Cron module. The only cron job I’ve configured on the server is a backup, which I haven’t modified and which runs at 1 AM each morning. I tried looking for PHP errors in the Apache log but could find none.

    The Wednesday thing seems too much of a coincidence to ignore. There must be some service or process that launches around this time and wreaks havoc with EE. Does anyone have an idea on this one?

  • #2 / Jan 08, 2014 2:36pm

    Gusto

    31 posts

    Seeing as I haven’t heard anything on this, after a few more weeks of doing a weekly restart I gathered some clues. To make a long story short, I knew it was something with my MySQL server (restarting MySQL brings the site up). Finally today I realized that each week this query is running on my server: DELETE FROM `exp_security_hashes` WHERE `date` < 1389195450 (Obviously the date parameter changes each week.) That sent me into the db to look at the exp_security_hashes table, and I found about 50 million entries. After doing more sleuthing I discovered others having this issue a few versions back. Then I discovered that this is called once each week to clear the table: ee()->security->garbage_collect_xids(). That’s what results in the aforementioned SQL DELETE query.

    Now the issue is what do I do next? I turned off secure forms for now and am manually running the DELETE query. My site hasn’t gone down, but it sure is slow and the query is taking a long time to complete. (I’m on an 8-core dedicated server.) I’m not sure why the garbage collection takes my site down, yet when I run the query manually it stays up (although it’s very slow).

    It appears at least one entry is added to the exp_security_hashes table with every page view. Any thoughts as to a long-term solution on this? Perhaps I’m doing something wrong…
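
One way to clear a table this large without a single huge DELETE is to remove old rows in small batches, committing after each one so other queries get a turn. Below is a minimal sketch of that idea, not EE's own garbage collection. It uses Python's built-in sqlite3 as a stand-in for MySQL so it can run anywhere. The two-column schema is a simplified assumption, and purge_old_hashes and make_demo_db are hypothetical helper names. On MySQL itself the same loop could repeat DELETE FROM `exp_security_hashes` WHERE `date` < cutoff LIMIT n until no rows are affected.

```python
import sqlite3
import time


def make_demo_db(rows):
    """Build an in-memory stand-in table (simplified, assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE exp_security_hashes ('
        'hash_id INTEGER PRIMARY KEY, "date" INTEGER, hash TEXT)'
    )
    # An index on the date column keeps each batch lookup cheap.
    conn.execute('CREATE INDEX idx_date ON exp_security_hashes ("date")')
    conn.executemany(
        'INSERT INTO exp_security_hashes ("date", hash) VALUES (?, ?)',
        [(i, "hash%d" % i) for i in range(rows)],
    )
    conn.commit()
    return conn


def purge_old_hashes(conn, cutoff, batch_size=10000, pause=0.0):
    """Delete rows with date < cutoff in batches; return the total deleted."""
    total = 0
    while True:
        cur = conn.execute(
            'DELETE FROM exp_security_hashes WHERE rowid IN ('
            'SELECT rowid FROM exp_security_hashes WHERE "date" < ? LIMIT ?)',
            (cutoff, batch_size),
        )
        conn.commit()  # end the transaction so each batch holds locks briefly
        if cur.rowcount == 0:
            break  # nothing older than the cutoff is left
        total += cur.rowcount
        time.sleep(pause)  # optional breather between batches
    return total


if __name__ == "__main__":
    demo = make_demo_db(25)
    print("deleted:", purge_old_hashes(demo, cutoff=20, batch_size=7))
```

The batch size and pause are knobs to tune against the live server; smaller batches take longer overall but should keep each individual query short.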
