Possible to recache all pages automatically after clearing the cache?

Tuesday 11 November 2003 9:17:23 am - 7 replies

Hopefully this is a simple question. My site runs pretty fast once cached, but from time to time I need to clear the whole cache during development. Is there a way (perhaps a script) to load all pages on the site back into the cache?

Also, I've noticed that pages seem to fall out of the cache after some time. Is there an expiration time for the cache, and where do I set this parameter so that my pages stay cached longer? My guess is the expiry.php in /var/cache/, but I'm not sure which values to insert to ensure permanent caching until the cache is cleared manually.

By the way, I wouldn't recommend v.3.2-3; I ran into too many bugs and went back to v.3.2-2. Looking forward to 3.3.x!!

Modified on Tuesday 11 November 2003 9:23:17 am by Valentin Svelland

Tuesday 11 November 2003 3:15:47 pm

You may want to try running a link checker or spider against the site, as it will go through and hit every page, assuming you don't have any that aren't linked to. You can also check out programs like HTTrack or Wget, which allow you to download an entire site locally. You can set the programs up to not actually download anything, so they will go through the site and hit every page as well.
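
For what it's worth, a minimal sketch of the Wget approach might look like the following. The URL is a placeholder and warm_cache is just a name picked for this example; the options themselves are standard Wget ones. It uses --delete-after rather than --spider, so each page is requested in full and the local copy is thrown away afterwards.

```shell
#!/bin/sh
# Sketch only: request every linked page once so the server rebuilds its cache.
# SITE_URL is a placeholder; replace it with the real front page.
SITE_URL="http://www.example.com/index.php"

warm_cache() {
    # --recursive       follow links found in each page
    # --level=inf       no limit on link depth
    # --no-parent       never climb above the starting directory
    # --delete-after    delete each local copy once it has been fetched
    # --no-directories  do not recreate the site's directory tree locally
    # --wait=1          pause one second between requests
    # --no-verbose      print one short line per request
    wget --recursive --level=inf --no-parent \
         --delete-after --no-directories \
         --wait=1 --no-verbose \
         "$1"
}
```

Running warm_cache "$SITE_URL" from a scratch directory starts the crawl. Recursive Wget stays on the starting host by default and honors robots.txt, so pages excluded there (or not linked from anywhere) will not be requested.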


Thursday 13 November 2003 2:59:01 pm

Thanks for your reply. Perhaps you could suggest a good web spider or similar free service online?

Modified on Friday 14 November 2003 9:17:59 am by Valentin Svelland

Thursday 13 November 2003 3:24:25 pm

Either of the two that I mentioned above would work. You might also want to look into using Xenu.


Friday 14 November 2003 9:17:36 am

I'm trying to rebuild the whole cache by running wget through my site. However, I'm new to this command and not quite sure how to use it properly. I run the command below, but it fails because there is no index.html file... The eZ front page is index.php...

wget --spider


Modified on Friday 14 November 2003 9:18:16 am by Valentin Svelland

Friday 14 November 2003 3:16:48 pm

That is odd. The fact that your index file is PHP shouldn't make a difference... What is the actual command you are using?

Is your site hosted on a third-party server, or is it possible that another developer/sysadmin could have set up the server to block spidering scripts?


Friday 14 November 2003 7:13:55 pm

Well, my site is hosted on a third-party server with wget preinstalled, and it could be that there are some sysadmin limits on how the wget command can be used. I'll check that on Monday.

How do I run the command? Well, actually just like I wrote in the previous posting, only with a different URL of course. As I mentioned, I've never used wget before, so it could be I'm just not using it correctly:

wget --spider
wget --spider
=> `index.html.1'
Connecting to
Connection to refused.

By the way: I've achieved somewhat better performance by including my custom-made views in the site.ini.append of my siteaccess (sitemap2, sitemap3 and so on). Still, it seems like the caching of my site only lasts for so long... I wish I could set this caching to be permanent unless an article is republished, and then of course be able to do the wget thing to recache now and then if necessary.

Thanks for your feedback, Alex.. It's useful!


Modified on Friday 14 November 2003 7:29:24 pm by Valentin Svelland

Friday 14 November 2003 8:20:46 pm

Sorry Valentin, I missed your earlier posting of the command. I'm not sure why that wouldn't work. The return you posted seems to indicate that the server is indeed blocking access. Perhaps you could try spoofing the user agent in Wget to see if that works. Try adding something like:

--user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5)"

This will tell the server that Wget is Mozilla running on Windows XP. The value contains spaces and parentheses, so it needs to be enclosed in quotes on the command line. If this works, you can experiment with setting the user agent to whatever you want. This will also help you filter out Wget when analyzing your traffic logs.
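
As a rough sketch of that last idea, the user agent can be kept in a variable and passed as one quoted argument. CacheWarmer/1.0 is a made-up identifier and fetch_as is a name invented for this example; --spider and --user-agent are regular Wget options.

```shell
#!/bin/sh
# Sketch only: send a recognizable user agent so these requests are easy
# to spot (or filter out) in the traffic logs.
# CACHE_WARMER_AGENT is a hypothetical identifier; any string works.
CACHE_WARMER_AGENT="CacheWarmer/1.0 (wget)"

fetch_as() {
    # $1 = user agent string, $2 = URL to check
    # The double quotes keep the spaces and parentheses in one argument.
    # --spider checks the URL without saving the response to disk.
    wget --spider --user-agent="$1" "$2"
}
```

For example, fetch_as "$CACHE_WARMER_AGENT" followed by the front page URL shows whether the server accepts the request under that agent string.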



