Wednesday, July 15, 2015

Using a Text Analyzer to Highlight Problems in Word Usage

It is still early to think about editing my trilogy/way-too-long-novel, but I cannot help thinking about how hard it will be. Specifically, it will be hard to make it shorter: I am not yet two-thirds finished, and it is already near 200k words.

One website that I just came across today looks like it will prove to be helpful: text analyzer.

This program is simple: it calculates the frequency of words and phrases. (Note: Scrivener has a feature like this built in, but it always crashes when I try to use it on such a long work.) 
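For anyone curious how this kind of counting works, here is a minimal Python sketch. It is not the website's actual code; the function names `tokenize` and `phrase_counts` and the sample sentence are made up for illustration, and the word-splitting rule (lowercase letters and apostrophes only) is an assumption that a real tool may handle differently.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def phrase_counts(words, n):
    """Count every run of n consecutive words (n=1 gives single words)."""
    phrases = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(phrases)

# Hypothetical sample text, not taken from the novel.
sample = "I don't know. I don't know why, but I don't know."
words = tokenize(sample)

print(phrase_counts(words, 1).most_common(3))  # most frequent single words
print(phrase_counts(words, 3).most_common(3))  # most frequent three-word phrases
```

Calling `phrase_counts(words, 8)` on a full manuscript would give the eight-word phrases. Note that this sketch lets phrases run across sentence boundaries, which may or may not match what the website does.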

When I ran the entire text of my novel through the text analyzer, there were some embarrassing results. Some of them I already knew about, but there were some problems that surprised me. This tool calculates the frequency of phrases up to eight words long. There were four eight-word-long phrases that appeared four times each. Three of the times, the repetition was intentional. The other time...let's just say I had to keep reminding myself that there is a reason for editing! Four more phrases were used three times each, and twenty-two phrases of eight words were used twice!

And these were badly written phrases, too. Things like "although most people hate him quite a lot." Both of those instances will have to be removed. If I take out both instances of all twenty-two offending eight-word phrases, that's an easy 352 words. Only a dent in my word count, but there are shorter phrases that can also be taken out.

Then I come to the three-word phrases. These are even more incriminating. The phrase "I don't know" is used 179 times. The top thirty of these phrases occur 2822 times altogether--that is almost 8500 words!
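The word-count estimates in this post are just occurrences multiplied by phrase length. A small Python sketch of that arithmetic follows; the helper name `words_in_repeats` is hypothetical. The result is best read as an upper bound, since overlapping phrases share words and would be counted more than once.

```python
def words_in_repeats(occurrences, phrase_length):
    """Words covered by a phrase of the given length used this many times."""
    return occurrences * phrase_length

# Twenty-two eight-word phrases, two instances each.
eight_word_total = words_in_repeats(22 * 2, 8)

# The top thirty three-word phrases, 2822 occurrences altogether.
three_word_total = words_in_repeats(2822, 3)

print(eight_word_total)  # 352
print(three_word_total)  # 8466
```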

The list of single words is also enlightening. The top ten words? I, the, to, that, it, and, is, a, of, he. The first word was what I predicted, since the novel is in the first person. Still, "I" makes up four percent of the text--a number that should be trimmed down. Most of the other words are excusable, though I will remove them when possible as I edit. There were two words, though, that really need to be removed in most of their occurrences. "That" is used too often, and it is usually unnecessary. I also have a bad habit of using the word "and" to begin sentences. The removal of most uses of these two words will make my novel five percent smaller!
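Both checks described here, a word's share of the whole text and how often a sentence opens with "And", can be sketched in a few lines of Python. The function names and the sample text are invented for illustration, and the sentence splitter is a rough assumption (it breaks on whitespace after ".", "!" or "?"), so dialogue and abbreviations would trip it up.

```python
import re
from collections import Counter

def word_share(text, word):
    """Percentage of all words in the text that are the given word."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return 100.0 * Counter(words)[word.lower()] / len(words)

def sentences_starting_with(text, word):
    """Count sentences whose first word is exactly the given word."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(rf"{re.escape(word)}\b")
    return sum(1 for s in sentences if pattern.match(s))

# Hypothetical sample text, not taken from the novel.
sample = "I ran. And I hid. And that was that."
print(round(word_share(sample, "that"), 1))
print(sentences_starting_with(sample, "And"))
```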

This text analyzer is going to be a great help for me in the editing process.





1 comment:

  1. Hmm, this sounds like a helpful tool, though I'm worried about what my results would be. I know one word I use way too much is "seem." Things don't just happen in my stories, they "seem" like they are happening, or the characters thought it "seemed" like this must be the case, etc. I almost find it amusing to read through my writing and find this, but it's also annoying because it means that much more editing.
