Caching output in PHP

Caching of output in PHP is made easier by the use of the output buffering functions built in to PHP 4 and above.

You'll need to use two files to set up a caching system for your site. The first, "begin_caching.php" in this case, will run before any other PHP on your site. The second, "end_caching.php" in this case, runs after normal scripts have run. The two scripts effectively wrap around your current site.

You can achieve this wrapping effect in one of two ways. The first is to add the two files manually, using include(), to every script you run. This takes some time, but is arguably more portable than the alternative.

The alternative relies on adding the following two lines (modified to reflect the correct path to the two PHP files) to your .htaccess file. This is my preferred method, because it requires no modification to existing scripts and can be turned off quickly by commenting out the relevant lines in the .htaccess file. Note that it only works where the host allows "php_value" in .htaccess files, which generally means PHP running as an Apache module; elsewhere these lines cause a server error.

php_value auto_prepend_file /full/path/to/begin_caching.php
php_value auto_append_file /full/path/to/end_caching.php

Next, we move on to the scripts that do the work. There are several stages to caching a document:

  1. Receive request for page
  2. Check for the existence of a cached version of that page
  3. Check the cached copy is still valid
    • If it is, send the cached copy
    • If not, create a new cached copy and send it
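
The decision in steps 2 and 3 can be sketched as a small function. This is only an illustration of the logic, not the code used in the scripts below: the name cache_action and its parameters are made up for the example, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the article's scripts):
// decide whether a cached copy can be served or must be rebuilt.
function cache_action(string $cachefile, int $lifetime, int $now): string
{
    clearstatcache();                 // make sure file checks are not stale
    if (!file_exists($cachefile)) {
        return 'regenerate';          // step 2: no cached copy exists
    }
    $created = filemtime($cachefile);
    if ($created === false || $now - $lifetime >= $created) {
        return 'regenerate';          // step 3: cached copy is too old
    }
    return 'serve';                   // cached copy is still valid
}
```

With a 600-second lifetime, a file last written 500 seconds ago gives 'serve', and one written 600 or more seconds ago gives 'regenerate'.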

To begin with, the script below contains a few basic settings. Here, you can set the directory you want to save cached files to (I would recommend keeping that directory outside your web root directory, or at least protecting it from view through a normal browser). The script needs to be able to create files in this directory, so you must set the directory's permissions to allow that. The permissions needed depend on your server setup, so you may want to start by setting them to 777 while testing the script, and then reduce them to the lowest level that still works.

You can also set the time, in seconds, a cached file should be considered valid for after creation, and set the file extension for saved files. It would be wise to not name them ".php", just for safety's sake.

<?php

// Settings
$cachedir = '../cache/'; // Directory to cache files in (keep outside web root)
$cachetime = 600; // Seconds to cache files for
$cacheext = 'cache'; // Extension to give cached files (usually cache, htm, txt)

// Ignore List
$ignore_list = array(
    'addedbytes.com/rss.php',
    'addedbytes.com/search/'
);

// Script
$page = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI']; // Requested page
$cachefile = $cachedir . md5($page) . '.' . $cacheext; // Cache file to either load or create

// Mark the page as ignored if any ignore-list entry appears in the URL
$ignore_page = false;
for ($i = 0; $i < count($ignore_list); $i++) {
    $ignore_page = (strpos($page, $ignore_list[$i]) !== false) ? true : $ignore_page;
}

// Time the cache file was last written, or 0 if there is no file or the page is ignored
$cachefile_created = ((@file_exists($cachefile)) and ($ignore_page === false)) ? @filemtime($cachefile) : 0;
@clearstatcache();

// Show file from cache if still valid
if (time() - $cachetime < $cachefile_created) {

    //ob_start('ob_gzhandler');
    @readfile($cachefile);
    //ob_end_flush();
    exit();

}

// If we're still here, we need to generate a cache file

ob_start();

?>

The file starts by generating an MD5 hash of the complete requested URL. The hash is a 32-character hexadecimal string that is, for practical purposes, unique to each URL, and it is used as the name of the cache file. The script then checks whether that file exists.

If the file exists, it checks to see when it was last updated. If the file is older than the allowed time, it acts as though no cache existed (carrying on and generating a new file). If the file is still valid, it simply displays it.

There is also, in the settings, a list of pages to ignore when caching. This can be search results, comments pages, a news page or news feed - anything that should always be up to date. Simply add anything you do not want cached into here, and it will not be cached. You can add directories, or parts of URLs - the above simply searches for a text string. In the example above, I have left out the "http://www" portion of the URL, as this can be missed out by some visitors.
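
The ignore-list check is a plain substring search, which can also be written as a small function. This is a sketch rather than the script's own code: the function name is_ignored is an assumption for the example, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: true when any ignore-list entry occurs in the URL.
function is_ignored(string $page, array $ignore_list): bool
{
    foreach ($ignore_list as $fragment) {
        // Skip empty entries: depending on the PHP version, an empty
        // search string either raises a warning or matches every URL.
        if ($fragment !== '' && strpos($page, $fragment) !== false) {
            return true;
        }
    }
    return false;
}
```

With the list from the settings above, 'http://www.addedbytes.com/search/?q=php' is ignored whether or not the visitor typed "www", while 'http://addedbytes.com/articles/' is not.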

Finally, the two lines on either side of the readfile() call above, ob_start('ob_gzhandler') and ob_end_flush(), are both commented out. You can, if you like, uncomment these, and that will use output buffering to gzip your content before sending it to users, making your site even faster for them. Please note, though, that output buffering with gz encoding is not available in versions of PHP previous to 4.0.5.

Which brings us to the second file, "end_caching.php". At the end of the first file, if no cache exists, we start output buffering. This means that rather than send the page to the user, we are saving it for use later. In the second script below, we take the contents of the output buffer, and write it to a file.

<?php

// Now the script has run, generate a new cache file
$fp = @fopen($cachefile, 'w');

// save the contents of output buffer to the file
@fwrite($fp, ob_get_contents());
@fclose($fp);

ob_end_flush();

?>
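
The buffering pattern that the two files share can be seen in isolation in the sketch below. It is an illustration only: the function render_and_capture is invented for the example, it stands in for "begin_caching.php, then your page, then end_caching.php", and it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: run $render, keep a copy of its output,
// and still send that output on to the visitor.
function render_and_capture(callable $render): string
{
    ob_start();                  // what begin_caching.php does last
    $render();                   // the normal page script runs here
    $html = ob_get_contents();   // end_caching.php reads the buffer...
    ob_end_flush();              // ...and then releases it to the visitor
    return $html;                // this copy is what gets written to the cache file
}
```

The point of using ob_get_contents() followed by ob_end_flush(), rather than discarding the buffer, is that the first visitor to an uncached page still receives the page normally.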

Important: If you do not have "register_globals" set to off in php.ini, make sure you add the following to the beginning of "end_caching.php" (straight after the "<?php" line) to aid security. This will ensure that an attacker cannot visit "end_caching.php" directly and overwrite an important file on your site (or read its contents).

$cachedir = '../cache/'; // Directory to cache files in (keep outside web root)
$cacheext = 'cache'; // Extension to give cached files (usually cache, htm, txt)
$page = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI']; // Requested page
$cachefile = $cachedir . md5($page) . '.' . $cacheext; // Cache file to either load or create

And there we have it. If a cached document exists, it is shown to the user, and if not, one is created.

Finally, you need to make sure the cache remains reasonably clean. Over time, out of date or redundant files could build up, and these should be removed regularly. For this reason, I usually set up an automated script to delete all cache files once a week (or less often, depending on the traffic of the site), but this will depend greatly upon the server software you are using.

The script below is one example of a script to delete all cache files. You will need to set the cache directory at the beginning before running the script. You can either use this manually, visiting the page through your browser whenever you want to empty the cache, or run it automatically. An example of a CRON job used to run this script automatically is below the script (the " >/dev/null 2>&1" bit at the end of the crontab prevents the server emailing me every time the script runs). Please note that this last script will be cached too, unless you specify otherwise!

<?php

// Settings
$cachedir = '../cache/'; // Directory to cache files in (keep outside web root)

if ($handle = @opendir($cachedir)) {
    while (false !== ($file = @readdir($handle))) {
        if ($file != '.' and $file != '..') {
            echo $file . ' deleted.<br>';
            @unlink($cachedir . '/' . $file);
        }
    }
    @closedir($handle);
}

?>

curl http://www.your_domain.com/empty_caching.php >/dev/null 2>&1
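
Deleting every file also throws away cached pages that are still valid. A gentler variant, sketched below, removes only files older than a chosen age. This is not the script I run myself: the function name purge_old_cache_files and its parameters are assumptions for the example, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: delete regular files in $cachedir that are older
// than $max_age seconds. Returns the number of files removed.
function purge_old_cache_files(string $cachedir, int $max_age, int $now): int
{
    $removed = 0;
    $handle = @opendir($cachedir);
    if ($handle === false) {
        return 0;                      // directory missing or unreadable
    }
    while (false !== ($file = readdir($handle))) {
        $path = rtrim($cachedir, '/') . '/' . $file;
        if (!is_file($path)) {
            continue;                  // skips '.', '..' and subdirectories
        }
        if ($now - filemtime($path) > $max_age && @unlink($path)) {
            $removed++;
        }
    }
    closedir($handle);
    return $removed;
}
```

Calling purge_old_cache_files('../cache/', 7 * 24 * 60 * 60, time()) would remove files that have not been rewritten for a week.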

87 comments

Manuel
Spain #1: August 1, 2004
Your cache system doesn't create a cache file if virtual(script_name.php) is called in PHP.

It doesn't cache MySQL output either.

Is there a solution?
Unfortunately, the virtual() function, when called, sends the contents of the buffer to the browser. In order to use the caching script, you will need to change the call to virtual (perhaps using include).

The script will cache the page output to the user, but will not cache MySQL result sets directly - it only caches the complete page that would be sent to the user. If the page is requested again before it expires, then no connection to the database is needed.
Frank
Germany #3: November 8, 2004
Hello,

tried your script, with Include:

<?php
include ("begin_caching.php");
include ("end_caching.php");
?>

But it always saves a blank file, so the next time I see nothing.

If I add the lines to .htaccess I get server errors ?!

Please mail me ;)
kunstinfo at gmx.de

Regards from Germany
Frank
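You have to add "begin_caching.php" to the start of the file and "end_caching.php" to the end of the file to get it to work. In your example there is nothing between the two includes, so the output buffer is empty and a blank file is saved.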
Hi! This script is exactly what I've been looking for - simple, clear, effective, and easy to implement. However, I'm having trouble getting it to play nicely with WordPress. The problem is very odd: the page looks totally fine, except story content isn't being output. But other dynamic, WP-generated content is fine, such as the post titles. Only the post content is suppressed. I even looked at the WP code and it looks to me like it's all using pretty much the same syntax. Any ideas? THANKS!

See my test file:
http://flaxfamily.com/index_cached.php

Avi Flax
Jerusalem
Works so well!! Only a little problem: when I try to use auto_append_file and auto_prepend_file I get a 500 internal server error :(
But if I include the files manually it works very well.
Congratulations and many thanks
Hi Dapuzz - with some hosts the use of "php_value" is impossible, and it sounds like that's what has happened with you. Glad it's working fine with normal includes though!
Ashton
United States #8: January 1, 2005
Hi,
I am putting together a web site and i used your cache script. Thanks for that.
I tried to find a way to delete files that are older than a certain period of time but I couldn't (I am very new to PHP). Can you please help me with that?
I also changed the permissions of the cache directory and the script didn't work (it has to be 777). I tried putting the cache directory in the same directory as the begin and end files and set the permissions of the cache dir to be writable only by the group, and the script didn't work either. How can I protect the files inside the cache directory?

Thanks
I use your script, and thanks for it. I use it in a page which dynamically parses an .ics file and displays the calendar events. After implementing it, I feel the page is not being cached. Does it work with parsed and dynamic pages?
thank you very very much!

works fine! :-)
Just some FYI. Your script made a great foundation to work from. I did have some issues where part of the page needed to remain dynamic, but that was easily dealt with.

Just one point you may want to consider. Comment the portion of the begin_caching.php code where the ternary operator is. I've used them a great deal, but I still get a little tripped up when I see them. Maybe even provide a commented if / else equivalent so readers can understand what's going on. I guess you could consider this a kindness to the noobs.

Thanks a million and keep it up!
Thanks for this, exactly what I needed!
Jim
United States #13: March 20, 2005
I think there's a solution here somewhere...

Here's the situation.

We subscribe to newsfeeds that are delivered by js.

Sometimes there is a delay with the feeds that causes our home page to hang, waiting for the js to display.

I've gone around and around with a javascript solution (i.e. launching the feeds after the page load (onload), etc.) and I don't see that as an option.

I'm thinking this.

A few times a day, I'd like to launch the newsfeed js and write the resulting url's to a file. Then, we'll maintain our own static copy.

Make sense? If so, does anyone have an example of what I need?
Thanks!
Jim

growthtrac@yahoo.com
First, thanx for sharing that great solution.
I'm using it on my site, and it works as expected, but I'm trying to work out how to solve a little problem. I like to use an include on all my files that simply tracks visitor hits. If I disable caching it runs each time a page is viewed, but when I enable caching, the stats are only updated when a new cached file is generated.
The question is, is there any way to append or prepend an include to the cached output?

Thnx in advance.
Hi Alex,

If you add your include to the beginning of "begin_caching.php", it will run every time a page is viewed, regardless of whether or not the user is served a cached version.
Thanks for the tip, it works as desired. Now I'm trying to improve the script's functionality to compare the original file's creation time with the cached file's creation time, and refresh the cached copy if the original file is newer. Any suggestions?
Justin
United States #17: March 30, 2005
Forgive me if this is dense, but is the md5 hash generated using the url itself, or the actual page existing at said url? Basically what I need to know is, is the age of the cachefile what determines whether or not to re-cache?

Am I correct in assuming that using this wrapper in conjunction with creative mod_rewrite rules can essentially make damn near any dynamic script more or less static?
The MD5 hash is generated from the URL, not the content. In order to work out if the content had changed, the script would need to be run, and any benefit gained from caching would evaporate.

The age of the cachefile is what is used to determine whether or not the page should be regenerated.
Vladimir
Serbia And Montenegro #19: April 19, 2005
Fantastic article! Just what I needed. Thanks a bunch.
Nice article.

I use a similar version based on what Simon Willison wrote, and was wondering if you wouldn't find the first comment here useful to add to your script:
http://simon.incutio.com/archive/2003/05/05/cachingWithPHP#comment1
Thanks for the great bit of code, I was looking for a way to do this on our site now that we are serving up more and more pages.

Its not quite perfect for us as there are quite a few dynamic sections all over the page but its given me a step in the right direction.
bleu
France #22: June 13, 2005
I have a php mysql search function that has 90 odd pages, but none are cached by SE as the data is hidden in the mysql. By adding this will my site look bigger to the SEs showing more pages than the main ones and the one search page?
Hi Bleu.

Using MySQL does not hide data from search engines by itself. Search engines cannot fill out forms, and that's probably why your site is not being indexed. Caching, done like this, will speed up your site, not allow search engines to index pages they normally could not.
Really nice script Daniel. I love it. The only problem I had was with those

php_value auto_prepend_file /begin_caching.php
php_value auto_append_file /end_caching.php

but I figured a way you could make it still be working. Here is how.

The user makes a page and calls it loader.php (for example).

In his .htaccess, he uses the following: RewriteRule ^([a-z]+)$ loader.php?p=$1

This passes the loader a variable containing the page name. In the loader, you use the following:

<?PHP
require_once("begin_caching.php");
require_once($_REQUEST["p"]);
require_once("end_caching.php");
?>

This way, each time people visit ANY page, the loader adds the begin and end cache files and loads the page between them, without using php_value. Hope I helped some of you guys :)
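
One caution about the loader above: passing a request parameter straight to require_once() lets a visitor choose which file is executed. A common safeguard is to accept only page names from a fixed list. The sketch below is an assumption-laden illustration, not part of the original comment: the function resolve_page, the allow-list and the ".php" suffix are all invented for the example, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: map a requested page name to a file path,
// or return null when the name is not on the allow-list.
function resolve_page(string $requested, array $allowed, string $dir): ?string
{
    // Only exact names from the list are accepted, which rejects
    // '../' sequences, absolute paths and URLs.
    if (!in_array($requested, $allowed, true)) {
        return null;
    }
    return rtrim($dir, '/') . '/' . $requested . '.php';
}
```

In loader.php, the result would be passed to require_once() only when it is not null, with a 404 response sent otherwise.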
Nice technique. But I see two major problems that prevent me from using this type of scheme:
1. You can no longer take browser differences into account (for example, say a PC user with IE has just generated the cached page; now a Mac OS X user with Safari downloads the same page, which Safari may not handle the right way).
2. You cannot handle credentials-based dynamic sites. Say an admin logs into the site at the homepage, which is generated and cached. Now he logs out, and an anonymous visitor downloads the cached homepage. He then sees private information.
/*--------
1. You can no longer take browser differences into account (for example, say a PC user with IE has just generated the cached page; now a Mac OS X user with Safari downloads the same page, which Safari may not handle the right way).
-------*/

You can create cache file based on user agent
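
As a rough sketch of that idea, the cache file name can be derived from a browser family plus the URL. Everything here is an assumption for illustration: the function cache_key, the three buckets and the substrings tested are invented, and real user-agent strings overlap (many browsers mention "Safari"), so the buckets would need tuning. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: build a cache file name that differs per browser family.
function cache_key(string $page, string $user_agent): string
{
    // Very coarse buckets, chosen only for illustration.
    if (stripos($user_agent, 'MSIE') !== false) {
        $family = 'ie';
    } elseif (stripos($user_agent, 'Safari') !== false) {
        $family = 'safari';
    } else {
        $family = 'other';
    }
    return md5($family . '|' . $page);
}
```

Hashing a family name rather than the raw user-agent string keeps the number of cache files small; the raw string varies so much between visitors that almost nothing would be served from the cache.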
Bira
United Kingdom #27: September 20, 2005
Hi,

Many thanks for the script and the very clear instructions.

I would like to ask you for further advice, please:

How would I go about applying cache to only part of a page?

The top of our page is always dynamic (checks cookies, identifies user, decides whether to serve ads or not, etc), while the rest of the page should ideally be cached.

Any suggestions how to go about this?

Many thanks,

Bira
Excellent document. There are a couple of things I should mention:

* You have commented out the ob_start('ob_gzhandler'); and the related ob_flush commands, probably because it caused the page to refresh continually. Simply use ob_gzhandler without the flush, and it'll work with gzip encoding just fine.

* This script does not take into account $_POST variables that may be passed to the script. I suggest this modification:

$page = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'].serialize($_GET).serialize($_POST);

This will ensure that if the $_POST or $_GET variables change, it won't load an incorrect cached document.
I had to make this program for a client a couple of months ago. I built it to my client's wishes after they said no to my suggestions of what they needed. The system was very heavy on the server (one page used over 10 heavy queries). I really did like the coding but I had no choice. I added a cache system like this and it was fine; a cache is extremely useful, especially on popular sites. However, if you aren't getting that many hits it probably isn't worth the effort.
A very clear and concise tutorial - thank you.
Another article I found useful is here:

http://www.devshed.com/c/a/PHP/Output-Caching-with-PHP/

Between them, they make a great reference for caching.

P.S. Thanks for all your cheat sheets as well, by the way. Invaluable!
i've modified the begin_caching.php file to allow for a 'search engine food' folder. now it's not perfect, as similarly-named files will be overwritten in the cache (only one 'index.htm' will exist):


$cachedir = './cachebox/'; // Directory to cache files in (keep outside web root)
$cachetime = 1; // Minutes to cache file for
$cacheext = 'htm'; // Extension to give cached files (usually cache, htm, txt)

// Ignore List
$ignore_list = array(
''
);


$cachetime = ($cachetime * 60);



$path = "$PHP_SELF";

$path = substr($path, 0, strrpos($path, '/'));




// Script
$page = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI']; // Requested page
$cachefile = $cachedir . $path . '.' . $cacheext; // Cache file to either load or create

$ignore_page = false;
for ($i = 0; $i < count($ignore_list); $i++) {
$ignore_page = (strpos($page, $ignore_list[$i]) !== false) ? true : $ignore_page;
}

$cachefile_created = ((@file_exists($cachefile)) and ($ignore_page === false)) ? @filemtime($cachefile) : 0;
@clearstatcache();

// Show file from cache if still valid
if (time() - $cachetime < $cachefile_created) {

ob_start('ob_gzhandler');
@readfile($cachefile);
ob_end_flush();
exit();

}

// If we're still here, we need to generate a cache file

ob_start();


i added a sum to make it cache by minutes instead of seconds, substituted the md5 hash with $PHP_SELF, trimmed off the php extension and added .htm

i don't quite know how i'm going to display a link for search engines yet, i might play with the robots.txt file and try to redirect them to the cachebox, who knows, hope this helps someone.


nice script mate.
sorry, there's a mistake in my previous code,

find & replace:

$path = substr($path, 0, strrpos($path, '/'));

with:

$path = substr($path, 0, strrpos($path, '.'));
Maybe this will be useful to someone...

For many years I have used this caching scheme for dynamic pages, with a fixed cache lifetime.

Example:

file: .htaccess

Options -MultiViews

RewriteEngine on

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} \.gif$
RewriteRule ^(.+) empty.gif [L]

RewriteCond %{REQUEST_FILENAME} !-f [OR]
RewriteCond %{REQUEST_FILENAME} !\.html$
RewriteCond %{REQUEST_FILENAME} !\.css$
RewriteCond %{REQUEST_FILENAME} !\.js$
RewriteCond %{REQUEST_FILENAME} !\.php$
RewriteCond %{REQUEST_FILENAME} !\.png$
RewriteCond %{REQUEST_FILENAME} !\.xml$
RewriteCond %{REQUEST_FILENAME} !\.gif$
RewriteCond %{REQUEST_FILENAME} !\.jpg$
RewriteCond %{REQUEST_FILENAME} !\.exe$
RewriteCond %{REQUEST_FILENAME} !\.zip$
RewriteCond %{REQUEST_FILENAME} !\.flt$
RewriteCond %{REQUEST_FILENAME} !\.txt$
RewriteRule ^(.+) index.html [L]

DirectoryIndex index.html

<Files *.html>
ForceType application/x-httpd-php
</Files>

<Files *.css>
ForceType application/x-httpd-php
</Files>

<Files *.js>
ForceType application/x-httpd-php
</Files>


file: /includes/__cache
<?php
function __cache(){
    global $__DIR_caches_,$__CACHE_on_,$__HTTP_uri_,$__args,$__dir;
    if(!isset($__dir[$__args[0]]['cache'])||!$__CACHE_on_||$__HTTP_uri_=='')
        return;
    // Save the buffered page, collapsing runs of blank lines
    if($f=fopen($__DIR_caches_.$__HTTP_uri_,'w')){
        fwrite($f,preg_replace("/(\r?\n)+/","\n",ob_get_contents()));
        fclose($f);
    }
    ob_end_flush();
}
if($__CACHE_on_){
    if($__HTTP_uri_=='')
        $__HTTP_uri_='index';
    $__cache=$__DIR_caches_.$__HTTP_uri_;
    // Serve the cached copy if it exists and is younger than $__CACHE_expire_ days
    if(file_exists($__cache)&&(@filemtime($__cache)+$__CACHE_expire_*86400)>time()&&
        @readfile($__cache)!==false)
        exit();
    ob_start();
}
?>

file: config
<?php
error_reporting(0);

define('__BR_',"\r\n");
define('__ON_',true);
define('__OFF_',false);

$__HTTP_host_=str_replace('www.','',$_SERVER['HTTP_HOST']);
$__HTTP_dir_=substr($_SERVER['SCRIPT_NAME'],0,1-strlen(strrchr($_SERVER['SCRIPT_NAME'],'/')));
$__HTTP_base_='http://www.'.$__HTTP_host_.$__HTTP_dir_;
$__HTTP_query_=&$_SERVER['REDIRECT_QUERY_STRING'];
$__HTTP_uri_=substr($_SERVER['REQUEST_URI'],strlen($__HTTP_dir_),strpos($_SERVER['REQUEST_URI'],'.html')-1);
$__HTTP_id_=substr($__HTTP_host_,0,strpos($__HTTP_host_,'.'));

$__DIR_templates_='templates/';
$__DIR_includes_='includes/';
$__DIR_files_='files/';
$__DIR_scripts_='scripts/';
$__DIR_languages_='languages/';
$__DIR_caches_='caches/';
$__DIR_styles_='styles/';
$__DIR_images_='images/';
$__DIR_databases_='databases/';

$__CACHE_on_=__ON_;
$__CACHE_expire_=8; # days

$__LANG_id_='en';

?>


file: index.html
<?
require_once('config');
require_once($__DIR_includes_.'__cache');
require_once($__DIR_includes_.'__prepare');

/*
code generation
*/

$page['description']=$__info['description'];
$page['info']=$__info;
require_once($__DIR_languages_.$__LANG_id_.$page['type']);
require_once($__DIR_includes_.$page['type']);
require_once($__DIR_templates_.$page['type']);
__cache();
?>
A huge thank you! This script is just what the doctor ordered and will come in very handy.
Thanks for this useful and well-written article.

One concern, though: Having your empty_caching.php script accessible via a browser could be bad news. Should its URL ever become known, some idiot could call it up periodically to annoy you -- or worse, a search engine may index it and visit it regularly.

Maybe you have no other choice but if you do, you could put it outside your web root and call it via cron.
Or not use a php script at all -- use a crontab one-liner along the lines of

00 4 * * 1 find /path/to/cachedir/ -type f -ctime +7 -exec rm -f {} \;

Example cleans cache dir at 4 AM every Monday
Hi, i've developed a webportal.
I've tried to use this cache mechanism, and it worked perfectly except for one thing.

I'm using sessions in the web portal.
When I cache the pages, if the user is logged on, the cached page contains his session variables. For example, if someone logs on and I then open his page, I get his session :\

Does anyone know how I can work around this?
Please email me at xcrap@paranoias.org if you have any idea.
Thanks a lot for your excellent site. However, on this one I have a problem. My site says hello to all users based on the cookies from their last visit. This, I assume, is not possible with the caching approach?
Any hints welcome. Thx
Jamie
United States #38: May 9, 2006
This was a great help. Worked first time with my page.

Thanks!
fadi
Saudi Arabia #39: May 30, 2006
thanks
Great code snippet man... thanks
phpfunk
United States #41: June 6, 2006
For the comments I saw about sessions and cookies, why not just do a simple regular expression to replace the values needed. Or simply anywhere you need dynamic data like "Hello NAME", you could write this to the cache file.

Hello {NAME}

Then when the cache file is outputted, simply do a find and replace for {NAME} with $_COOKIE['name'].

You could do that easily for any session or cookie data that needs to stay dynamic.
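
A minimal sketch of that find-and-replace step follows. The function fill_placeholders and the {NAME}-style marker format are assumptions for the example, and it assumes PHP 7 or later. Values are escaped before insertion because cookie data is supplied by the visitor.

```php
<?php
// Hypothetical helper: fill {KEY}-style placeholders in cached HTML.
function fill_placeholders(string $cached_html, array $values): string
{
    $replacements = array();
    foreach ($values as $key => $value) {
        // Escape each value so that visitor-supplied data cannot inject markup.
        $replacements['{' . $key . '}'] = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    }
    return strtr($cached_html, $replacements);
}
```

In begin_caching.php, the readfile() call would then be replaced by something like echo fill_placeholders(file_get_contents($cachefile), array('NAME' => $name)), where $name comes from the cookie or session.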
Lance
United Kingdom #42: November 30, 2006
Excellent post, thanks. I have added it to my site and wow what an improvement!
Very good tutorial.
I was searching for articles about caching dynamic pages and I found this one.
Thank You
Very useful caching script. I have posted an article on my website telling my members about it. I expect you will see a significant boost in hits once it gets indexed and syndicated. I am using your script to significantly reduce the CPU and I/O load on my server by caching my RSS feeds. The website itself is cached via Smarty. Love your blog. I LoveJackDaniels too!

JT McNaught
Very good tutorial, thanks. I have a .htaccess rewrite rule that transform php files in nice url addresses like http://vremea.ido.ro/Timis.htm
I've noticed that these URLs do not get cached, and they are the main reason I need this script.

Also, I have added an exclusion rule 'vremea.ido.ro/satelit.php' in the hope that it will not cache dynamic URLs like http://vremea.ido.ro/satelit.php?statia=Ramnicu-Valcea,Romania but these URLs are cached!

Do you have any suggestions?

Thanks!
Marinel: Excluded urls are still cached in the file cache, but not served.
In order to prevent cache files from being created, you would need to create a variable in the header, or re-check for excluded files in the footer.
To prevent excluded urls from being cached just change the footer code to this

if ($ignore_page === false){
$fp = @fopen($cachefile, 'w');
@fwrite($fp, ob_get_contents());
@fclose($fp);
}
ob_end_flush();
Thanks a lot people!

Does anybody know how to cache html urls like http://vremea.ido.ro/Cluj.htm ?
It's a common situation on sites using mod_rewrite; the one above is in fact script.php?id=x&f=34534&another=1...

Could be this problem because I'm using a gzip script on the php file?
Amazing...

It was very easy to integrate using the include method. Works fine from the 1st time without any special server settings. Now my RSS-parsing pages are running times faster.

Thank you, Dave!
Good luck!

p.s. - I want to buy you a Jack... :)
Rafael, USA
United States #50: March 7, 2007
Instead of having an "ignore list", how could I get an "include list" ?

Thanks!
You can use a hierarchy of directories to store your cache files (useful if your file system slows down with a huge number of files in one directory).

To do that, just add this function:

function GetFileDir($md5) {
    $path = "";
    $levels = 3;
    for ($i = 0; $i < $levels; $i++) $path .= $md5[$i] . "/";
    return $path;
}

Then, in both files (begin_caching.php and end_caching.php), change:

$cachefile = $cachedir . md5($page) . '.' . $cacheext;

to

$cachefile = $cachedir . GetFileDir(md5($page)) . md5($page) . '.' . $cacheext;

That's it.

And don't forget to create the directory tree first. The shell script that ships with the PHP source can do it (it is located at /path/to/php_src/ext/session/mod_files.sh):

just run sh mod_files.sh /path/to/cache/ 3 3
for 3 level dirs


enjoy
george
United Kingdom #52: June 4, 2007
Thanks for the scripts! They worked fine apart from a small problem that I may need your help with. The caching works fine for 5 out of the 6 PHP scripts that I am using. They are all identical, each working on a different MySQL table. The one that is not caching works on a MySQL table with 7200 rows at the moment, and it will grow in the future. Is there any limit on time or size?

Thanks a lot!!!!!
Thanks for these wonderful scripts. They work great with minimal work for us. :)
Great Script. Thank you! My site was taking up to 25 seconds to load pages the other day..the pages without caching still are.. but im working on that.. anyways as for the others they are taking less than 2 seconds now.
Lukas
Denmark #55: July 31, 2007
I have managed to make the script write to my cache folder. Although I am not sure whether it loads the files from the cache once the cached file is requested. Are there any means of testing that?

I tried to change the content of the file to be cached, and request it before the caching timeout. Browser output corresponds to the updated file right away.

I have also tried to change the content of the cache file, and this change is ignored.

I have noticed, that the server time is two hours behind my browser, but I can't see how this should affect the cache.
Lukas
Denmark #56: August 1, 2007
Problem solved
Tried to improve the script with:
$page = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'].serialize($_GET).serialize($_POST);
as mentioned by Thomas in #28.

Removed this and things got solved.
Thx again for really smooth approach to caching.
marc
United Kingdom #57: August 17, 2007
I am trying to use this nice script, but with no success. everything is fine on the begin_caching.php I can echo $_SERVER['HTTP_HOST'] and other variables and they have values, but no value is available on the end_caching.php. $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'] are both null. As a consequence only one file http:// is cached in my cache dir.

I would appreciate any comments and advice on this.
marc
United Kingdom #58: August 17, 2007
just an update:I used $HTTP_SERVER_VARS['REQUEST_URI'].serialize($_GET).serialize($_POST); and it works. but now I have to find a way to bring the different users name and email on those cache files.
Thanks for the script, pretty easy to use. I did find huge numbers of files being created and saving the file as a name instead of md5 helped to track down problems (eg. id a links are added as a separate file and don't need to be).
Still around 15,000 files in the cache dir tho, anyone know of a way of using Ilyas method with non-md5 names?
londonhogfan
United States #60: October 26, 2007
instead of having an "ignore list" would it be possible to have an include list... there are only a few pages on my site I want to cache.

Thanks - works very nicely.
londonhogfan

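technically all you need is to replace the !== with === in
$ignore_page = (strpos($page, $ignore_list[$i]) !== false)

but if a human reads it, it won't make much sense ... Note also that this only behaves as an include list when the list has a single entry; with several entries a page is cached only if its URL contains every one of them.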

to make it more complete can also incorporate chris's comment #47

Great code dave ... thanks
Very good tutorial matey :D
Graham O'Shea
United Kingdom #63: December 13, 2007
Excellent Script.

I have made some modifications to make it more search engine friendly by sending last modified and etag headers.

<?php
// Settings
$cachedir = '/var/www/dhcache/'; // Directory to cache files in (keep outside web root)
$cachetime = 12*60*60; // hours to cache files for
$cacheext = 'cache'; // Extension to give cached files (usually cache, htm, txt)
// Ignore List

$ignore_list = array(

//'example.com/rss.php',
//'example.com/search/'

);

// Script
$page = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'].serialize($_GET).serialize($_POST); // Requested page
$hash = md5($page);
$cachefile = $cachedir . md5($page) . '.' . $cacheext; // Cache file to either load or create
$ignore_page = false;
for ($i = 0; $i < count($ignore_list); $i++) {

$ignore_page = (strpos($page, $ignore_list[$i]) !== false) ? true : $ignore_page;
}
$cachefile_created = ((@file_exists($cachefile)) and ($ignore_page === false)) ? @filemtime($cachefile) : 0;
@clearstatcache();
// Show file from cache if still valid
if (time() - $cachetime < $cachefile_created) {
ob_start('ob_gzhandler');

header("Last-Modified: ".gmdate("D, d M Y H:i:s",$cachefile_created) . " GMT");
header("ETag: \"{$hash}\"");
header("Accept-Ranges: bytes");
// No Content-Length header: ob_gzhandler may compress the output, so the size of the file on disk would not match the bytes actually sent


@readfile($cachefile);

exit();
}
// If we're still here, we need to generate a cache file
ob_start();?>
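The headers above only pay off if the script also answers conditional requests. A minimal sketch of that check follows, assuming the `$hash` and `$cachefile_created` values from the code above; `client_copy_is_current` is a hypothetical helper, and it ignores weak validators and comma-separated ETag lists.

```php
<?php
// Hypothetical helper: does the client already hold the current copy?
// $server is expected to be $_SERVER (passed in so the function is easy to test).
function client_copy_is_current($etag, $last_modified, array $server)
{
    // If-None-Match takes precedence over If-Modified-Since when both are sent
    if (isset($server['HTTP_IF_NONE_MATCH'])) {
        return trim($server['HTTP_IF_NONE_MATCH']) === '"' . $etag . '"';
    }
    if (isset($server['HTTP_IF_MODIFIED_SINCE'])) {
        $since = strtotime($server['HTTP_IF_MODIFIED_SINCE']);
        return $since !== false && $since >= $last_modified;
    }
    return false;
}

// Assumed usage, just before the cached file is sent:
// if (client_copy_is_current($hash, $cachefile_created, $_SERVER)) {
//     header('HTTP/1.1 304 Not Modified');
//     exit();
// }
```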
It worked great!
It would be nice if there was a way to avoid caching certain areas on a page. Like latest searches or popular products.
Tien Manh
Unknown #65: January 26, 2008
<?php
$cacheDir = dirname(__FILE__) . '/cache/home/';
if (isset($_GET['page'])) {
// basename() keeps a crafted "page" value from escaping the cache directory
$cacheFile = $cacheDir . '_' . basename($_GET['page']) . '.html';
} else {
$cacheFile = $cacheDir . 'index.html';
}
if (file_exists($cacheFile))
{
header("Content-Type: text/html");
readfile($cacheFile);
exit;
}

// No cached copy yet: start buffering before the page is generated
ob_start();
include('library/home.php');

// Send the page and save a copy of it for the next request
$buffer = ob_get_contents();
ob_end_flush();
$fp = @fopen($cacheFile, "w");
if ($fp) {
fwrite($fp, $buffer);
fclose($fp);
}
?>

Demo : [Link Removed]
Thanks a lot for this script. It made life much easier

I love the design.

Create two files. Include one above and one below.

Watch the magic!

I have a small site cooking at http://www.kaffenyheder.dk - it's a basic RSS aggregation site for Danish news.

Applying the script made it FAST!

Naturally, when the cache needs to be updated the page will hang a bit, but that is just for one unlucky user every 5 minutes or so.

For the rest of the users it's just nice.

Again. Thank you.

Kind regards

Kristian
How does PHP handle simultaneous requests? Can there be issues with two threads writing to the same cache file at the same time? Or does PHP only process 1 request at a time?
Han
Belgium #68: March 25, 2008
I often get empty pages in the cached files. I think I'm having issues when two visitors simultaneously open a non-cached page, resulting in a bad/empty cache page.
Is this possible? Does anyone have any suggestions?
After emptying the cache everything works fine again.
Has anybody done any performance testing with this??

Obviously the results would be heavily influenced by the time taken to produce the original page. It would still be interesting to know what factor of gain we could expect to see by caching pages.
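For anyone who wants to measure this on their own pages, a rough timing harness could look like the sketch below. `average_seconds` is a hypothetical helper, and the callbacks you pass in (one that builds the page, one that reads the cached copy) are up to you; the numbers will depend entirely on your server and pages.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $callback over $runs runs.
function average_seconds($callback, $runs = 20)
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        call_user_func($callback);
    }
    return (microtime(true) - $start) / $runs;
}

// Assumed usage, with your own functions for the two cases:
// $uncached = average_seconds('build_page');
// $cached   = average_seconds('read_cached_page');
// echo 'Speed-up factor: ' . ($uncached / $cached);
```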
csjoe
Malaysia #70: April 14, 2008
I encountered an Internal Server Error after placing both files in my root directory and creating the cache folder. May I know what went wrong? Has anyone else faced the same issue? Thank you.
 United States #71: April 23, 2008
Thanks for this starting point, Dave. I tweaked it a smidgen for my purposes. I was able to get it added to my main site template and away we go.

@Steve Exley, before I was cruising at about 80% CPU capacity on my server during peak hours, and now I'm down about half that (steady traffic: 1.5 million page views a month). I changed the time variable to 30 minutes. Scientific, no? : ]
Dave
United States #72: April 29, 2008
This looks very useful, however I think you should check if the method is POST, and then skip caching altogether. Someone else suggested putting the $_POST variables inside the md5sum, but that's not correct with respect to HTTP - you may be posting the same variables, but you don't want the request to be ignored, the side effects of the request still have to happen. (Maybe there are some variables in $_POST that change with each request anyway, but that's rather hacky to rely on). Instead, you should probably wrap begin_caching.php and end_caching.php in a check for the request method. Like this:

if ($_SERVER['REQUEST_METHOD'] == 'GET') {
/* rest of begin_caching.php here */
}

And similarly for end_caching.php.
Dave
United States #73: April 29, 2008
Also, someone brought up a good point that there could be more than one thread (usually, a "process" though, not a thread) writing to a file at the same time. You should probably write the cached contents to a temporary file and then do an atomic rename to the cached file.
Dave
United States #74: April 29, 2008
Just wanted to correct/clarify my previous post, the problem of multiple access would be twofold:

1.) Processes might read a partially written output file if they send a request before the output from a previous request has finished writing
2.) Processes might request the same file and write to the same output file simultaneously. This might be ok but I'm not sure - both processes should end up writing the same data at the same places if the request is the same, but there might be some ordering issues if you just changed the file and the old version is in the middle of being cached, for example.

So I think you definitely want to do atomic rename..
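A minimal sketch of the temporary-file-plus-rename approach, assuming PHP 5 or later (for file_put_contents) and a cache directory on a single filesystem. `write_cache_atomically` is a hypothetical helper, not part of the original scripts.

```php
<?php
// Hypothetical helper: write $contents to a temporary file next to $cachefile,
// then rename it into place so readers never see a half-written file.
function write_cache_atomically($cachefile, $contents)
{
    // Create the temp file in the same directory so rename() stays on one filesystem
    $tmp = tempnam(dirname($cachefile), 'tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $contents) === false) {
        @unlink($tmp);
        return false;
    }
    // On POSIX systems, rename() within one filesystem replaces the target in a single step
    if (!@rename($tmp, $cachefile)) {
        @unlink($tmp);
        return false;
    }
    return true;
}
```

In end_caching.php this would stand in for the fopen/fwrite/fclose calls, for example `write_cache_atomically($cachefile, ob_get_contents());`. Two processes may still both regenerate the page, but each reader gets either the complete old copy or the complete new one.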
I use this script and I love it .. but recently my site keeps getting hacked. When I take the script out there is no problem

my pages typically begin with

require_once('begin_caching.php');
require_once('inc_session.php');

[05-Jun-2008 08:07:29] PHP Warning: session_start() [<a href='function.session-start'>function.session-start</a>]: Cannot send session cookie - headers already sent by (output started at /home/illustra/public_html/index.php:2) in /home/illustration/public_html/inc_session.php on line 1
[05-Jun-2008 08:07:29] PHP Warning: session_start() [<a href='function.session-start'>function.session-start</a>]: Cannot send session cache limiter - headers already sent (output started at /home/illustra/public_html/index.php:2) in /home/illustration/public_html/inc_session.php on line 1
[05-Jun-2008 08:13:58] PHP Warning: session_start() [<a href='function.session-start'>function.session-start</a>]: Cannot send session cookie - headers already sent by (output started at /home/illustra/public_html/index.php:2) in /home/illustration/public_html/inc_session.php on line 1
 United States #76: June 6, 2008
Thanks for this, man, you've got a great site here (I've made use of your regex cheat sheet a few times, thanks for that too, you're awesome)...

An extra tip here, and in the case of the project I've been working on... I've got most requests tunneling through a particular PHP file, that includes the config, header, appropriate content, etc. If there's a file not found error and begin_caching.php is called before the 404 header is sent, I believe (assume, pretty sure, haven't tested) that it will send a 200/OK! So be sure to make some provision to send the headers first, in my case the cache was wrapped just before and after the template files themselves, at the cost of making a database connection on each page load.
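Another way to guard against caching error pages, sketched under the assumption of PHP 5.4 or later (where http_response_code() exists), is to skip the cache write unless the status is 200. `response_is_cacheable` is a hypothetical helper.

```php
<?php
// Hypothetical helper: only plain 200 responses should be written to the cache.
// http_response_code() returns false when no status is available (for example on the command line),
// so anything other than the integer 200 is treated as not cacheable.
function response_is_cacheable($status)
{
    return $status === 200;
}

// Assumed usage in end_caching.php:
// if (response_is_cacheable(http_response_code())) {
//     ...write the cache file...
// }
// ob_end_flush();
```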
I love this. It's simple and powerful, and just what I needed.

I had the same concern as some of the above posts regarding file locking. What I came up with was to fopen the file using 'x' and then flock it. Any other scripts that are running will just fetch their results from the database until the cache file has finished being written.
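Roughly, the 'x' plus flock idea could be sketched like this. `write_cache_exclusive` is a hypothetical helper; it assumes an expired cache file is deleted before a rebuild, because 'x' refuses to open a file that already exists.

```php
<?php
// Hypothetical helper: only the first process to create the file gets to write it.
function write_cache_exclusive($cachefile, $contents)
{
    // 'x' creates the file and fails if it already exists
    $fp = @fopen($cachefile, 'x');
    if ($fp === false) {
        return false; // another process is writing (or has already written) this cache file
    }
    $ok = false;
    if (flock($fp, LOCK_EX)) {
        $ok = fwrite($fp, $contents) !== false;
        fflush($fp);
        flock($fp, LOCK_UN);
    }
    fclose($fp);
    return $ok;
}
```

A process that gets `false` back simply serves the freshly generated page without caching it.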
 Hong Kong #78: July 3, 2008
In my case, running the dynamic file itself took about 220 msec.

I implemented your caching and voila, I measured loading times as low as 0.2 msec and on average about 2 msec.

Thanks for the tutorial!
@Dave: Quite right, POST should be ignored. In my version on this site, I had actually made that change, but hadn't updated this post. I'll review this post over the weekend and update it.

@Nate: Your site is sending content before the session_start is called in index.php. I suggest you check for whitespace outside of the <?php ... ?> tags in the begin_caching.php script.

@dan: Thanks. Always nice to be called "awesome" :). I believe that if you clear the output cache, and then send a 404, you should avoid that problem. Not tested that though.

@ojuicer: Good point. I've been thinking about this, and my feeling is it's unlikely to be a major problem. Would you agree?

@P: You're very welcome. Glad it's helped drop your loading times.
 Hong Kong #80: July 5, 2008
I read the comments above, and almost all of you talk about some issues related to your site's architecture.

In my case, my site is run from a single "file" so I placed Dave's caching header and footer inside this "file", in critically chosen spots. I didn't have to do anything with .htaccess. Everything worked very smoothly.

Even method POST is not being handled by my "file", but some other file instead.

However, I am stuck with one -- can't seem to figure out why the caching doesn't work on my RSS feeds.
Well, thank you so much, Mr. Dave Child ...

I see that there are a lot of comments, which means new updates for your script. So what I want to ask is: could you please put all the updates into new code and put it in one zip file with examples, so that everyone can use your great cache system in their own scripts?

Thank you once more, and I hope you answer positively.
William Weijia Yang
United States #82: July 30, 2008
I think end_caching.php needs to be wrapped in an "if", or else it will always generate a cache file even if the page is in the ignore_list. Should be something like:

<?php

if($ignore_page===false) {
// Now the script has run, generate a new cache file
$fp = @fopen($cachefile, 'w');

// save the contents of output buffer to the file
@fwrite($fp, ob_get_contents());
@fclose($fp);
}

ob_end_flush();

?>


I ran into 2 questions:

1- My website code is in one single file. It works with cases: I send the header at the top of the file, before any case, and the footer after the last break of the last case. How can I make a cache file for only the header or footer HTML, and not for the whole PHP page?

2- When I put a directory in the ignore list, the code will not make cache files for any of the files under that directory. For example, in the "semo" dir I would like to cache all pages except "semo/index.php". But when I put "semo/index.php" in the ignore list, it prevents all the files in that dir from being cached. Any solution?


thank you all .
If I could draw your attention to this article "Better Php Caching".
http://www.oateck.com/blogs/programming_tips/archive/2008/02/19/php-caching-for-high-traffic-sites.aspx
(Comments are turned off so I thought it would benefit the discussion here)

With not much more code, the author attempts to greatly improve concurrency. (But fails)

The new logic includes:
-------------------
if (file_exists($cfile ) && (time() - $cachetime< filemtime($cfile )))
{
//if cache is not expired return cache files
include($cfile );
exit;
} else{
//open file and attempt to lock
$fp = fopen($cfile , 'w');
if( flock($fp, LOCK_EX)) {
ob_start();
} else {
//if cant lock then return the previously generated cache
include($cachefile);
exit;
}
-------------------

The worst problem here is fopen($cfile , 'w'); Opening with w instantly truncates the file, meaning that anything that falls through the flock will then blindly read an empty file. This is important because the exercise in this case was to make "Better Php Caching".

You could argue that there isn't much point to this if you don't have high traffic of several hits per second, but it's going to happen sooner or later.

if(@readfile($cachefile)) exit(); is safer because if readfile returns zero bytes due to another script over-writing it, the page will still be generated instead of exiting.
Ack, I need to correct myself because actually without the LOCK_NB flag, flock will block until it can get an exclusive lock. Not the behaviour expected judging by the comment.
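For comparison, here is a sketch of what the quoted logic seems to be aiming for: open without truncating and ask for the lock without blocking. It assumes PHP 5.2.6 or later for the 'c' mode; `open_cache_for_rebuild` and `finish_cache_rebuild` are hypothetical helpers.

```php
<?php
// Hypothetical helper: returns an exclusively locked handle for rebuilding the cache,
// or false if the file cannot be opened or another process already holds the lock.
function open_cache_for_rebuild($cachefile)
{
    // 'c' opens for writing without truncating, so readers still see the old copy
    $fp = @fopen($cachefile, 'c');
    if ($fp === false) {
        return false;
    }
    // LOCK_NB makes flock() return immediately instead of waiting for the lock
    if (!flock($fp, LOCK_EX | LOCK_NB)) {
        fclose($fp);
        return false;
    }
    return $fp;
}

// Hypothetical helper: replace the contents once the page has been generated, then release the lock.
function finish_cache_rebuild($fp, $contents)
{
    ftruncate($fp, 0);
    rewind($fp);
    fwrite($fp, $contents);
    fflush($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
}
```

A reader that does not take a shared lock can still catch the file between the truncate and the end of the write, so this narrows the window rather than closing it.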
Hi, great script.

Does it cache Google Adsense banners?

Glen
Casper Helenius
Greve, Denmark #87: November 5, 2008
Glen,

Not likely. Google Adsense is actually a piece of JavaScript that fetches the content. Thus, the content could - theoretically - be unique every time - even across caches.

No worries, you can easily use this script in conjunction with Google Adsense. At least, I cannot see what should block it from working.

Please share your experiences here if you conclude something different :)

//Casper
