gender - Difference between female and male usage



What explains a measurably higher frequency of vowels in one writer's text compared to another's? In the statistics I examined, which compared male and female use of consonants and vowels, the probability that the next sound is a vowel was much higher in a text by a Swedish female author than in one by a Russian male author. The probability of the next sound being a vowel, and likewise of it being a consonant, could vary by style, by book, by author, by language, and/or by gender (male/female).


Based on statistics of material written by either women or men, I hypothesize that there are more vowels when the writer is female and more consonants when the writer is male. Is there any evidence for or against this notion? Has anybody done a study like that? Does it have any purpose besides being a "fact"? One purpose I can think of is detecting forgery: if a man pretends in a text to be a woman, or a woman writing to you pretends to be a man, the patterns could give you an indication.


Edit: I changed this into a real hypothesis about how sounds change, since a real study might compare texts phonetically, which could indicate, for instance, whether the next message is from a man or a woman.


Edit: The statistics show a statistical difference between two books, expressed as the Markov matrix of the probability that the next sound is a vowel or a consonant, given that the current sound is a vowel or a consonant.
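To make the Markov matrix concrete, here is a minimal Python sketch of how such a 2x2 transition matrix could be estimated from a text. It is an illustration under stated assumptions, not the calculation behind the statistics cited above: it uses written letters as a stand-in for sounds, treats only a, e, i, o and u as vowels, and skips spaces and punctuation, so the last letter of one word counts as directly preceding the first letter of the next. The function names are hypothetical.

```python
from collections import Counter

# Assumption: letters stand in for sounds, and y counts as a consonant.
VOWELS = set("aeiou")

def sound_classes(text):
    """Map each letter a-z to 'V' (vowel) or 'C' (consonant); skip everything else."""
    return ["V" if ch in VOWELS else "C"
            for ch in text.lower() if "a" <= ch <= "z"]

def transition_matrix(text):
    """Estimate P(next class | current class) as a nested dict, e.g. m['V']['C']."""
    classes = sound_classes(text)
    # Count every adjacent pair (current, next) in the class sequence.
    pairs = Counter(zip(classes, classes[1:]))
    matrix = {}
    for current in "VC":
        total = sum(pairs[(current, nxt)] for nxt in "VC")
        matrix[current] = {
            nxt: (pairs[(current, nxt)] / total if total else 0.0)
            for nxt in "VC"
        }
    return matrix
```

Each row of the result sums to 1 (or is all zeros if that class never occurs before another letter), so two books can be compared row by row.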



Answer



Just out of curiosity I have done some quick statistics.


I downloaded the following books from Project Gutenberg:


Men writers



  • Alice's Adventures in Wonderland by Lewis Carroll

  • Adventures of Huckleberry Finn by Mark Twain

  • Moby Dick, or, the whale by Herman Melville

  • The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle

  • The Picture of Dorian Gray by Oscar Wilde

  • Paradise Lost by John Milton

  • The Works of Edgar Allan Poe — Volume 1 by Edgar Allan Poe

  • War and Peace by graf Leo Tolstoy

  • Dracula by Bram Stoker

  • Treasure Island by Robert Louis Stevenson


Women writers



  • Secret Adversary by Agatha Christie

  • Jane Eyre by Charlotte Brontë

  • Frankenstein by Mary Wollstonecraft Shelley

  • Pride and Prejudice by Jane Austen

  • Sarah Orne Jewett

  • Ramona by Helen Hunt Jackson

  • Home Influence by Grace Aguilar

  • Middlemarch by George Eliot

  • A Season at Harrogate by Mrs. Hofland

  • Wuthering Heights by Emily Brontë


After removing the common Project Gutenberg header, I read the files into R, split them into characters, and counted the vowels and consonants.
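The counting step can be sketched as follows. This is a Python illustration rather than the R code actually used, which is not shown here; the function name is hypothetical, and it assumes that only the letters a to z are counted, with a, e, i, o and u as the vowels.

```python
# Assumption: only ASCII letters are counted; y is treated as a consonant.
VOWELS = set("aeiou")

def consonant_vowel_ratio(text):
    """Return (number of consonants) / (number of vowels) for one text."""
    vowels = 0
    consonants = 0
    for ch in text.lower():
        if "a" <= ch <= "z":
            if ch in VOWELS:
                vowels += 1
            else:
                consonants += 1
    if vowels == 0:
        raise ValueError("text contains no vowels")
    return consonants / vowels
```

Calling this once per book gives one ratio per book, which is the quantity compared between the two groups.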


I had a total of 8725700 characters for men and 11468186 for women.


Here's a graph of the consonants/vowels¹ ratios calculated per book (showing mean ± standard deviation):


[Figure: Consonants/vowels ratio]


There is no statistically significant difference between the two groups (p = 0.89, t-test).
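For readers who want to reproduce this kind of comparison, here is a small Python sketch of the test statistic. The answer does not say which variant of the t-test was used, so Welch's unequal-variance version is an assumption here, and the function name is hypothetical. The sketch only computes the t statistic and the approximate degrees of freedom; turning them into a p-value needs the Student's t distribution, for example from a statistics library.

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's two-sample t-test on two lists of per-book ratios."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / group size).
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

A t value close to zero relative to the degrees of freedom corresponds to a large p-value, that is, no detectable difference between the groups.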


EDIT


I played some more with the data and got this bar graph of the usage of individual letters.


[Figure: Usage per letter]


Again, you can see no major differences between men and women writers.
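The per-letter usage can be sketched in the same way. Again this is a Python illustration under assumptions, not the original R code: the function name is hypothetical, only the 26 ASCII letters are counted, and upper and lower case are merged.

```python
from collections import Counter
import string

def letter_frequencies(text):
    """Return the relative frequency of each letter a-z in the text."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    # Letters that never occur get a frequency of 0.0.
    return {
        letter: (counts[letter] / total if total else 0.0)
        for letter in string.ascii_lowercase
    }
```

Averaging these frequencies over the books in each group gives the values one would plot side by side for men and women writers.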


EDIT2: I repeated the analysis with 10 books per group. I would say that there is definitely no difference.




1 I considered a, e, i, o and u as vowels; the result does not change substantially when y is included.

