PLEASE NOTE: This code is now considered out of date. An updated version has been released under an open source license as a Google Code project: php-text-statistics. There is more about this change in the post Readability Code Open Sourced.
[A tool for [url=http://www.addedbytes.com/resources/readability-score/]checking the readability scores of text[/url] is available - this article covers the functions behind that tool.]
The Gunning-Fog index is a measure of text readability. It represents the approximate reading age of the text - the age someone will need to be to understand what they are reading.
The following is the algorithm to determine the Gunning-Fog index:
(average_words_sentence + percentage_of_words_with_three_or_more_syllables) * 0.4
The above produces a number, which is a rough measure of the age someone must be to understand the content. The lower the number, the more understandable the content will be to your visitors. Web sites should aim to have content that falls roughly in the 11-15 range for this test.
Any number returned over the value of 22 can be taken to be just 22, and is roughly equivalent to post-graduate level.
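As a rough worked example of the formula and the cap at 22, here is a minimal sketch. The function name fog_from_counts() and its three count parameters are assumptions made for illustration; they are not part of the functions in this article.

```php
<?php
// Hypothetical helper: apply the Gunning-Fog formula to counts you already have.
function fog_from_counts($words, $sentences, $complex_words) {
    // Guard against empty input so neither division can fail
    $words = max(1, $words);
    $sentences = max(1, $sentences);
    $average_words_sentence = $words / $sentences;
    // Percentage of words with three or more syllables
    $percentage_complex = ($complex_words / $words) * 100;
    $score = ($average_words_sentence + $percentage_complex) * 0.4;
    // Scores above 22 are treated as 22
    return min(22, $score);
}

// 100 words in 5 sentences, 10 of them with three or more syllables:
// (20 + 10) * 0.4, which is about 12
echo round(fog_from_counts(100, 5, 10), 1), "\n";
// A very dense passage (60 words, 1 sentence, 30 long words) is capped at 22
echo fog_from_counts(60, 1, 30), "\n";
```

A score of about 12 falls inside the 11-15 range suggested above for web content.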
Below is a selection of functions you can use to determine the Gunning-Fog index of text. To calculate it, all you need to do is call the function as follows, where $text is the text whose readability you wish to measure.
$gunning_fog_score = gunning_fog_score($text);
function gunning_fog_score($text) {
    return ((average_words_sentence($text) + percentage_number_words_three_syllables($text)) * 0.4);
}
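function average_words_sentence($text) {
    // Count sentence terminators; text with none is treated as one sentence to avoid dividing by zero
    $sentences = max(1, strlen(preg_replace('/[^\.!?]/', '', $text)));
    // Count whitespace-separated words rather than spaces, so the last word is included
    $words = count(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));
    return ($words / $sentences);
}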
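function percentage_number_words_three_syllables($text) {
    // $syllables holds the number of words with three or more syllables
    $syllables = 0;
    // Split on any whitespace and skip the empty strings left by repeated spaces
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    for ($i = 0; $i < count($words); $i++) {
        if (count_syllables($words[$i]) > 2) {
            $syllables++;
        }
    }
    // max() guards against dividing by zero when the text contains no words
    $score = number_format((($syllables / max(1, count($words))) * 100));
    return ($score);
}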
function count_syllables($word) {
    // Each pattern here that matches subtracts one syllable from the vowel-group count
    $subsyl = Array(
        'cial'
        ,'tia'
        ,'cius'
        ,'cious'
        ,'giu'
        ,'ion'
        ,'iou'
        ,'sia$'
        ,'.ely$'
    );
    // Each pattern here that matches adds one syllable to the vowel-group count
    $addsyl = Array(
        'ia'
        ,'riet'
        ,'dien'
        ,'iu'
        ,'io'
        ,'ii'
        ,'[aeiouym]bl$'
        ,'[aeiou]{3}'
        ,'^mc'
        ,'ism$'
        ,'([^aeiouy])\1l$'
        ,'[^l]lien'
        ,'^coa[dglx].'
        ,'[^gq]ua[^auieo]'
        ,'dnt$'
    );
    // Based on Greg Fast's Perl module Lingua::EN::Syllable
    $word = preg_replace('/[^a-z]/is', '', strtolower($word));
    // Split on runs of consonants, leaving the vowel groups
    $word_parts = preg_split('/[^aeiouy]+/', $word);
    // Initialise the list so count() below is safe for words with no vowels
    $valid_word_parts = array();
    foreach ($word_parts as $key => $value) {
        if ($value <> '') {
            $valid_word_parts[] = $value;
        }
    }
    $syllables = 0;
    // Thanks to Joe Kovar for correcting a bug in the following lines
    foreach ($subsyl as $syl) {
        $syllables -= preg_match('~'.$syl.'~', $word);
    }
    foreach ($addsyl as $syl) {
        $syllables += preg_match('~'.$syl.'~', $word);
    }
    if (strlen($word) == 1) {
        $syllables++;
    }
    $syllables += count($valid_word_parts);
    // Every word counts as at least one syllable
    $syllables = ($syllables == 0) ? 1 : $syllables;
    return $syllables;
}
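One limitation of average_words_sentence() is that it counts every '.', '!' and '?' as the end of a sentence, so an ellipsis counts as three sentences. The following is a minimal sketch of one possible alternative that treats a run of terminators as a single sentence end. The function names ending in _sketch are hypothetical and are not part of the tool.

```php
<?php
// Hypothetical alternative: count runs of terminators, so "..." or "?!" ends one sentence.
function count_sentences_sketch($text) {
    $count = preg_match_all('/[.!?]+/', $text);
    // Text with no terminator still counts as one sentence
    return max(1, (int) $count);
}

// Hypothetical helper: count whitespace-separated words, ignoring repeated spaces.
function count_words_sketch($text) {
    return count(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));
}

function average_words_sentence_sketch($text) {
    return count_words_sketch($text) / count_sentences_sketch($text);
}

// Six words and three sentence ends ("...", "?!" and "."), so this prints 2
echo average_words_sentence_sketch('Wait... what?! The dog ran away.'), "\n";
```

This is still only an approximation: abbreviations such as "e.g." would be counted as sentence ends.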

6 Comments
What type of words help build up a decent Gunning score?
#1, paul.shann@bigpond.com, Australia, 19 April 2006.
The phrase "Gunning-Fog Index." gives:
Flesch-Kincaid Reading Ease: -6
Ideally, web page text should be around the 60 to 80 mark on this scale. The higher the score, the more readable the text.
Flesch-Kincaid Grade Level: 14
Ideally, web page text should be around the 6 to 7 mark on this scale. The lower the score, the more readable the text.
Gunning-Fog Index: 25
Ideally, web page text should be between 11 and 15 on this scale. The lower the score, the more readable the text. (Anything over 22 should be considered the equivalent of post-graduate level text).
So G-F indices are post-graduate-level. Interesting.
#2, Anonymous, Republic Of Korea, 26 August 2006.
For fun (and functionality), I'm coding a small real-time implementation of Readability tests and noticed an inconsistency in your formula; why are you adding 5 to the calculated index?
#3, Martijn W. van der Lee, Netherlands, 4 September 2006.
Hi Martijn,
I wrote this so long ago, I honestly don't remember why I added 5. I can't think of a reason now, either, so I've removed it.
#4, Dave Child, United Kingdom, 5 September 2006.
Thanks for this and the FK algo. They were dividing by zero if passed text bracketed by phpBB code [color=red]some text[/color]. I added the max(1, ...) to prevent zero sentences and zero words in two slightly modified functions below.
function average_words_sentence( $text )
{
    $sentences = max( 1, strlen( preg_replace( '/[^\.!?]/', '', $text ) ) );
    $words = strlen( preg_replace( '/[^ ]/', '', $text ) );
    return ( $words / $sentences );
}
function percentage_number_words_three_syllables( $text )
{
    $syllables = 0;
    $words = explode( ' ', $text );
    for ( $i = 0; $i < count( $words ); $i++ )
    {
        if ( count_syllables( $words[$i] ) > 2 )
        {
            $syllables ++;
        }
    }
    $score = number_format( ( ( $syllables / max( 1, count( $words ) ) ) * 100 ) );
    return ( $score );
}
#5, Michael Brenden, Unknown, 6 February 2008.
The article mentions that any number returned over the value of 22 can be taken to be just 22, and is roughly equivalent to graduate level. My question is: what criteria are used to judge that a value of 22 is equivalent to, or suitable for, graduate students? I need to know. Can anyone answer? I am researching this point.
#6, Natjiree J., Thailand, 7 May 2008.