Man, this sucker has taken forever- and it still needs a lot more work. Like, a WHOLE lot. But, folks have been asking about it and I’ve got a release that’s safe enough for government work.
It’s basically a feed aggregator- uses magpierss, stores feed/article info in the db, can be displayed or used solely as a personal aggregator.
This is pre-pre-beta. I don’t expect it to scale beyond 300-400 feeds as the way updates are handled is cracked out. Also- I suck with instructions. So, this one is only recommended for folks who:
1. Are familiar with EE (i.e., know what a ‘tag’, ‘parameter’, ‘variable pair’ is- or you’ll never figure out the manual).
2. Can run cron jobs. Feed updates are handled via cron (or you can update a feed on ping, but I doubt you’ll handle all updates that way). And by ‘can run cron jobs’- I mean that I don’t have to try and explain how to set one up. There’s a url for cron- if you know what to do with it, you’re good to go. Otherwise, not so much.
3. Are not bothered by the lack of polish. I’ve messed with this until my eyes bled, but count on finding lots of little errors.
Like I say, some of the logic on this one still needs massive amounts of thought. But in general, it does what it’s supposed to.
Unfortunately, no. Well, yes- but it has the same problem as trying to run batch email via the cron plugin. The function takes way too long to complete and would unacceptably bog down page load times.
At some point I’m going to sit down and see if the cron plugin can be used to send a ping to start the function and then not wait around for a response, but until I do, I wouldn’t recommend the plugin for triggering this mod.
But it installs and adds and displays feeds fine without the need for the cron, so you could still play with it. It just won’t update the feeds- i.e., go out and grab new articles every hour- which is what you’d want it to do if you were using it for real.
And thanks for adding it to the wiki! It’s a fairly specialized mod, so I don’t know how many folks will need it, but the more feedback I get on this initial version, the better.
Description: You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near 'CURRENT_TIMESTAMP, `accept_ping` varchar(250) NOT NULL de
Query: CREATE TABLE IF NOT EXISTS `exp_eefeed_channels` (
  `feed_id` int(11) NOT NULL auto_increment,
  `url` varchar(250) NOT NULL default '',
  `title` varchar(250) NOT NULL default '',
  `custom_title` varchar(250) NOT NULL default '',
  `custom_description` varchar(250) NOT NULL default '',
  `link` varchar(250) NOT NULL default '',
  `description` varchar(250) NOT NULL default '',
  `feed_version` varchar(250) NOT NULL default '',
  `feed_type` varchar(250) NOT NULL default '',
  `page_url` varchar(250) NOT NULL default '',
  `generator` varchar(250) NOT NULL default '',
  `copyright` varchar(250) NOT NULL default '',
  `lastbuilddate` varchar(250) NOT NULL default '',
  `tagline` varchar(250) NOT NULL default '',
  `modified` varchar(250) NOT NULL default '',
  `language` varchar(250) NOT NULL default '',
  `pubdate` varchar(250) NOT NULL default '',
  `creator` varchar(250) NOT NULL default '',
  `date` varchar(250) NOT NULL default '',
  `rights` varchar(250) NOT NULL default '',
  `image_title` varchar(250) NOT NULL default '',
  `image_url` varchar(250) NOT NULL default '',
  `image_link` varchar(250) NOT NULL default '',
  `image_description` varchar(250) NOT NULL default '',
  `image_width` varchar(250) NOT NULL default '',
  `image_height` varchar(250) NOT NULL default '',
  `mytimestamp` timestamp NOT NULL default CURRENT_TIMESTAMP,
  `accept_ping` varchar(250) NOT NULL default '',
  `ping_name` varchar(250) NOT NULL default '',
  `translate` int(3) NOT NULL default '0',
  `encoding` varchar(250) NOT NULL default '',
  PRIMARY KEY (`feed_id`)
)
Fatal error: Cannot redeclare MagpieRSS::$ERROR in C:\Program Files\Apachefriends\xampp\htdocs\pmachine\system\modules\eeaggregator\magpie.php on line 82
I just found out that the magpie.php included in the eeaggregator zip file is not the right version!
No- the magpie version used by the aggregator has been hacked in order to pull in categories (and eventually enclosures). So you’ll lose some functions using the existing one. I’m digging through to try and figure out what’s causing the ‘redeclare’ issue, and changing up the way I call the class. Hopefully, that will take care of the problem.
The install issue is fixed, but I want to wait and fix both issues before I upload a new version. Hopefully today it will be up.
No worries- it was a good call, actually. I may give Paul a yell and see about adding category and enclosure support to the existing version of magpie, then calling it. It would make more sense. Why have duplicate files/classes when there’s no need?
Also- Paul’s version would be MUCH cleaner than the hack job I did. Except I just remembered, I hacked the fetch as well, because I needed conditional gets working when cache is off, since I’m using the db to store data rather than the cache. Hm. Crap. Still may talk to Paul about it- might could add an additional function or variable and still run both off of one magpie class.
I started to figure out how I could use the EE cron with your plugin. I just installed the EE cron but in your ‘External Updates’ tab I have to add the “Update feeds cron url” in order to use the cron with the RSS feeds.
I don’t know which URL I have to use here. Can you help me?
You really don’t want to use the EE cron with this- the update function takes too long and would bog down page load times whenever the function was triggered.
I’ve briefly talked with Paul about the possibility of having the EE cron trigger a ‘ping’- the ping starts whatever function is needed, but returns before the function has completed executing (crap- not sure of any of the vocab there!). IF that worked, then the EE crons could be used to trigger these longer running functions- like batch mail or the feed update. But, Paul wasn’t sure whether it would work, and I haven’t tried to test it out yet.
As it stands, you really need to be able to run true cron jobs.
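For anyone wondering what that ‘ping and don’t wait’ idea might look like, here’s a rough sketch in Python rather than PHP. It’s only an illustration of the general fire-and-forget pattern, not how the EE cron plugin or this module works. The function names are made up, and whether the server keeps running the update after the caller hangs up depends entirely on server-side settings (in PHP, something along the lines of `ignore_user_abort()` would be needed), which is exactly the untested part.

```python
import socket


def build_ping_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request for the trigger URL (hypothetical helper)."""
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def send_ping(host: str, path: str, port: int = 80, timeout: float = 2.0) -> None:
    """Send the request, then hang up without reading the response.

    Assumption: the server carries on with the long-running job after the
    client disconnects. That is NOT guaranteed and would need testing.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_ping_request(host, path))
    # No recv() call: the caller returns as soon as the request has been sent.
```

The page that triggers the ping would only pay for the connect and send, not for the whole feed update.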
Thanks for the info!
I don’t know much about how it actually works but how is it possible that these update functions (using EE cron) take so long?
If I install your RSS aggregator, the update is really fast! I don’t understand why it isn’t possible to use cron jobs... or is it just me?
When you add a new feed, it just has to go fetch one file- the url for the rss feed you’re adding. So, it’s speedy. But the cron should update ALL feeds- and I figure most folks will have 100 or so (and I’d like that to scale into the thousands). Things will start to slow down, even if you don’t run into problems with a feed not being returned- and that WILL happen, not infrequently.
You’ll see the magpie plugin bog down when it hits a feed that’s 404, as it waits a while for a response before giving up. Imagine trying to fetch hundreds of feeds, processing all of that data, and then adding it to the db. So while the EE cron will only call the function once an hour or so, the poor person who hits the page when it actually triggers the function? They could be waiting a long time for the page to load!
Today I read somewhere on this forum that “time out” errors will occur when a feed/website is causing some problems. Yeah, this is a pain in the ass….
Is it possible to run an update (not a cron job) every time you enter the control panel as well?
OK- still pre-beta, but I’ve tweaked a bit - EEAggregator 0.2. Hopefully it installs correctly for everyone. The ‘redeclare’ issue I’m still not sure about- I changed things around and renamed the Magpie class, so hopefully that took care of it. I couldn’t replicate either problem.
The db has changed, so you’ll need to deinstall from the cp, upload the new files, then reinstall. I’m not going to bother with upgrade scripts until I get a beta out- the backend will be in flux probably every version until then.
As to the cron issues… For now, you could actually just paste the url for the cron job in your browser and refresh. Shouldn’t really be able to do that, but while I’m debugging it’s very useful. You can tell if feeds have been checked by the date in the ‘Manage Feeds’ section- and it will (hopefully) give you a response code. 304 means it was successful, but nothing had changed since the last update. 200 is also a success- if anything new was returned, it should give you a count.
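In case the 304/200 business is unfamiliar: that’s a conditional GET. A rough sketch of the idea in Python (not the module’s actual PHP code; the function names and message wording here are made up for illustration) looks something like this:

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def conditional_headers(etag: Optional[str], last_modified: Optional[str]) -> dict:
    """Headers asking the server to send the feed only if it has changed."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def check_feed(url: str, etag: Optional[str] = None,
               last_modified: Optional[str] = None,
               timeout: float = 10.0) -> Tuple[int, bytes]:
    """Fetch a feed, returning (status code, body)."""
    request = urllib.request.Request(url, headers=conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urllib reports 304 (and 4xx/5xx) as HTTPError; keep the status code.
        return err.code, b""


def describe_result(status: int, new_items: int = 0) -> str:
    """Turn a status code into the kind of note shown next to a feed."""
    if status == 304:
        return "checked, nothing new"
    if status == 200:
        return f"checked, {new_items} new item(s)"
    return f"failed with HTTP {status}"
```

The ETag and Last-Modified values would come from whatever was stored for the feed on the previous successful fetch.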
For long range, you’re really going to need a true cron. I suppose if you really want to run the module and your host doesn’t offer cron, you could check out something like Web Based Cron. I’ve never used it, but it might be worth checking out.
Today I read somewhere on this forum that “time out” errors will occur when a feed/website is causing some problems. Yeah, this is a pain in the ass….
Heh- yep. The regular EE cron plugin just won’t work well with this sort of thing- way too likely to bog down.
As always- give a yell when you run into trouble, or if you have ideas for improvement. I’m still mostly focused on slimming that update function down. Chris suggested running it in batches, and that’s probably what I’ll dink with next. In short, nothing sexy. But after that, I’ll work on the OPML import. Once that’s in place, I’ll start cleaning it up for a real beta.
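The batch idea, very roughly: each cron run only updates the handful of feeds that have gone longest without a check, instead of all of them at once. A minimal sketch in Python, assuming each feed has a stored ‘last checked’ timestamp (the function name and data layout are hypothetical, not what the module does today):

```python
from typing import Dict, List


def next_batch(last_checked: Dict[str, float], batch_size: int) -> List[str]:
    """Return the URLs of the feeds that have waited longest for an update."""
    oldest_first = sorted(last_checked, key=lambda url: last_checked[url])
    return oldest_first[:batch_size]


# Hypothetical feeds mapped to the time they were last checked.
feeds = {
    "http://a.example/rss": 300.0,
    "http://b.example/rss": 100.0,
    "http://c.example/rss": 200.0,
}
print(next_batch(feeds, 2))  # ['http://b.example/rss', 'http://c.example/rss']
```

With a smaller batch and a more frequent cron, a single slow or dead feed only holds up its own batch rather than the whole update.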
Thanks for the Update Rob1! You’re FAST!!!!!!
I have two other questions, as I can’t find anything about them in the “User Guide”...
What’s the “Category Group” in “Configuration” used for?
And how do I know what I have to choose in the “Translate encoding” option when I add a feed?
Been busy, and now the magpie-rss cvs has a developer’s version that should parse Atom 1.0 feeds- so I’ve been going through it trying to understand the changes. Then I’ll need to go back through and make my own mods to it and incorporate them into the aggregator. Figure there’s no point in getting a new one up until I’ve done that.
So still working on it, but since it’s in my spare time, things are slow.