
Flesch-Kincaid Reading Level

Functions to count the number of syllables in a word or sentence, and work out the readability of text.

PLEASE NOTE: This code is now considered out of date. An updated version has been released under an open source license as a Google Code project: php-text-statistics. There is more about this change in the post Readability Code Open Sourced.

A tool for checking the readability scores of text is available - this article covers the functions behind that tool.

Calculations based upon word structure can tell you a fair bit about the text on your site, most notably the readability of your copy. A lot of sites have text on them that is simply too advanced for their users, which is as useful as having no text at all.

It is therefore usually a good idea to check the copy on your website as thoroughly as possible. Spelling and grammar should be checked as a matter of course. You should also check how difficult your text is to read. If a user cannot easily understand what they are reading, they will leave the site and find one they can comprehend.

The following are two calculations that can give you an indicator of how easy your text is to read.

Flesch-Kincaid Reading Ease

The Flesch-Kincaid reading ease score is worked out using the following calculation, which gives a number. The higher that number is, the easier the text is to read.

206.835 - (1.015 * average_words_sentence) - (84.6 * average_syllables_word)
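As a rough worked example, the sketch below applies this formula to counts that have already been worked out, rather than to raw text. The helper name flesch_reading_ease_from_counts and the sample counts are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper (not from the original article): applies the
// reading ease formula to word, sentence and syllable counts.
function flesch_reading_ease_from_counts($words, $sentences, $syllables) {
    $average_words_sentence = $words / $sentences;
    $average_syllables_word = $syllables / $words;
    return 206.835 - (1.015 * $average_words_sentence) - (84.6 * $average_syllables_word);
}

// Made-up sample: 120 words, 8 sentences, 168 syllables
// 206.835 - (1.015 * 15) - (84.6 * 1.4) = 206.835 - 15.225 - 118.44 = 73.17
echo round(flesch_reading_ease_from_counts(120, 8, 168), 1), "\n"; // prints 73.2
```

Longer sentences and longer words both pull the score down, with syllables per word carrying by far the larger weight.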

The function you will need to use to work this score out (in addition to the three at the bottom of this page) is:

function calculate_flesch($text) {
    return (206.835 - (1.015 * average_words_sentence($text)) - (84.6 * average_syllables_word($text)));
}

And you can call the function like so:

$flesch_score = calculate_flesch($text);

Flesch-Kincaid Grade Level

The Flesch-Kincaid grade level is a similar calculation, but it gives a number that corresponds to the school grade a reader needs to have reached to understand the text. For example, a Grade Level score of 8 means that an eighth grader should be able to understand the text.

(0.39 * average_words_sentence) + (11.8 * average_syllables_word) - 15.59
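To make the arithmetic concrete, here is a small sketch that applies the grade level formula to counts worked out in advance. The helper name flesch_grade_from_counts and the sample counts are illustrative assumptions, not part of the original code.

```php
<?php
// Hypothetical helper (not from the original article): applies the
// grade level formula to word, sentence and syllable counts.
function flesch_grade_from_counts($words, $sentences, $syllables) {
    $average_words_sentence = $words / $sentences;
    $average_syllables_word = $syllables / $words;
    return (0.39 * $average_words_sentence) + (11.8 * $average_syllables_word) - 15.59;
}

// Made-up sample: 120 words, 8 sentences, 168 syllables
// (0.39 * 15) + (11.8 * 1.4) - 15.59 = 5.85 + 16.52 - 15.59 = 6.78
echo round(flesch_grade_from_counts(120, 8, 168), 1), "\n"; // prints 6.8
```

Unlike reading ease, this score rises as sentences and words get longer.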

The function you will need to use to work this score out (in addition to the three at the bottom of this page) is:

function calculate_flesch_grade($text) {
    return ((0.39 * average_words_sentence($text)) + (11.8 * average_syllables_word($text)) - 15.59);
}

And you can call the function like so:

$flesch_grade = calculate_flesch_grade($text);

Both of the functions above make use of the functions below, so these will need to be included in your scripts in order for either function to be used.

Neither score is perfectly accurate. It is not always possible to work out the number of syllables in a word programmatically, nor to correctly count the words per sentence, or indeed the number of sentences, in a piece of text. However, the functions will return a close approximation of each value - certainly good enough for our purposes.
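One source of error is easy to demonstrate. The sketch below uses the same idea as average_words_sentence(), counting every full stop, exclamation mark and question mark as the end of a sentence. The helper name count_sentence_terminators and the sample text are made up for illustration.

```php
<?php
// Hypothetical helper: strip everything except '.', '!' and '?',
// then count the characters that are left.
function count_sentence_terminators($text) {
    return strlen(preg_replace('/[^.!?]/', '', $text));
}

// A human reader sees two sentences here, but the abbreviation and
// the decimal point are counted as sentence endings too.
$text = 'Dr. Smith paid $3.50 for coffee. Was it worth it?';
echo count_sentence_terminators($text), "\n"; // prints 4
```

Overcounting sentences makes the average sentence look shorter than it is, which nudges the reading ease up and the grade level down.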

Ideally, you should aim for a reading ease of around 60 to 70 (equivalent to a Grade Level of around 6 to 8). The nearer to 100 your text scores, the easier it is to read (and conversely, the lower the grade score, the easier the text is to read). Comics, for example, are usually in the 90s. The Harvard Law Review scores in the low 30s. Legal documents are usually lucky to make it into double figures.
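If you want to turn a score into a rough label, something like the following sketch may help. The helper name describe_reading_ease is hypothetical, and the band boundaries are the commonly quoted Flesch ranges rather than values taken from this article, so treat them as an assumption.

```php
<?php
// Hypothetical helper: maps a reading ease score to a rough label.
// The boundaries are the commonly quoted Flesch bands (an assumption,
// not something defined by the functions in this article).
function describe_reading_ease($score) {
    if ($score >= 90) { return 'very easy'; }
    if ($score >= 80) { return 'easy'; }
    if ($score >= 70) { return 'fairly easy'; }
    if ($score >= 60) { return 'standard'; }
    if ($score >= 50) { return 'fairly difficult'; }
    if ($score >= 30) { return 'difficult'; }
    return 'very difficult';
}

echo describe_reading_ease(65), "\n"; // prints "standard"
echo describe_reading_ease(8), "\n";  // prints "very difficult"
```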

The functions you will need in order to calculate the Flesch-Kincaid reading ease or Grade level of text are:

function average_words_sentence($text) {
    // Count sentence terminators; text with none is treated as one sentence
    $sentences = max(1, strlen(preg_replace('/[^\.!?]/', '', $text)));
    // Count whitespace-separated words
    $words = count(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));
    return ($words / $sentences);
}

function average_syllables_word($text) {
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) == 0) {
        return 0;
    }
    $syllables = 0;
    foreach ($words as $word) {
        $syllables += count_syllables($word);
    }
    return ($syllables / count($words));
}

function count_syllables($word) {
    // Based on Greg Fast's Perl module Lingua::EN::Syllables
    // Patterns that each remove one syllable from the estimate
    $subsyl = array(
        'cial', 'tia', 'cius', 'cious', 'giu', 'ion', 'iou', 'sia$', '.ely$'
    );
    // Patterns that each add one syllable to the estimate
    $addsyl = array(
        'ia', 'riet', 'dien', 'iu', 'io', 'ii', '[aeiouym]bl$', '[aeiou]{3}',
        '^mc', 'ism$', '([^aeiouy])\1l$', '[^l]lien', '^coa[dglx].',
        '[^gq]ua[^auieo]', 'dnt$'
    );

    // Keep letters only, in lower case
    $word = preg_replace('/[^a-z]/', '', strtolower($word));
    // Split on runs of consonants, leaving the groups of vowels
    $valid_word_parts = preg_split('/[^aeiouy]+/', $word, -1, PREG_SPLIT_NO_EMPTY);

    $syllables = 0;
    // Thanks to Joe Kovar for correcting a bug in the following lines
    foreach ($subsyl as $syl) {
        $syllables -= preg_match('~' . $syl . '~', $word);
    }
    foreach ($addsyl as $syl) {
        $syllables += preg_match('~' . $syl . '~', $word);
    }
    if (strlen($word) == 1) {
        $syllables++;
    }
    $syllables += count($valid_word_parts);
    $syllables = ($syllables == 0) ? 1 : $syllables;
    return $syllables;
}
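The heart of count_syllables() is the vowel-group count, which the add and subtract patterns then adjust. The sketch below isolates just that first step so you can see where the estimate starts from. The helper name count_vowel_groups is hypothetical and is not part of the functions above.

```php
<?php
// Hypothetical helper: counts runs of consecutive vowels (treating
// 'y' as a vowel). This is only the starting estimate, before any
// add/subtract patterns are applied.
function count_vowel_groups($word) {
    $word = preg_replace('/[^a-z]/', '', strtolower($word));
    $groups = preg_split('/[^aeiouy]+/', $word, -1, PREG_SPLIT_NO_EMPTY);
    return count($groups);
}

echo count_vowel_groups('readability'), "\n"; // prints 5: ea, a, i, i, y
echo count_vowel_groups('notice'), "\n";      // prints 3: o, i, e (the silent e is counted)
```

The second result shows why a plain vowel-group count is only an approximation: a silent final "e" still counts as a group.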

Examples

The following are two examples of text and the readability of that text.

The first is an excerpt from The Wind in the Willows (http://www.online-literature.com/grahame/windwillows/). It is what most people would call easy to read:

"There's Toad Hall," said the Rat; "and that creek on the left, where the notice-board says, 'Private. No landing allowed,' leads to his boat-house, where we'll leave the boat. The stables are over there to the right. That's the banqueting-hall you're looking at now - very old, that is. Toad is rather rich, you know, and this is really one of the nicest houses in these parts, though we never admit as much to Toad."

For reading ease, this scored 69. It also had a Grade Level of 7. This passage from The Wind in the Willows scores almost exactly where web page text should ideally score.

On the other hand, the following is an excerpt from a legal document, and would give many a headache (both this text and the passage above were generously provided by Bill Slawski, http://members.dca.net/slawski):

The foregoing warranties by each party are in lieu of all other warranties, express or implied, with respect to this agreement, including but not limited to implied warranties of merchantability and fitness for a particular purpose. Neither party shall have any liability whatsoever for any cover or setoff nor for any indirect, consequential, exemplary, incidental or punitive damages, including lost profits, even if such party has been advised of the possibility of such damages.

This scores an incredible -1 on the reading ease scale. The Grade Level required to read it? 22. This is about as unreadable as text on a web page can get.

These are, perhaps, extreme examples, but they should give an idea of the differences between good and bad text on a web page.

