Language consists of sentence constructs, word choices, and expressions of style, and an idiolect comprises an individual's uses of these facets.
Based on work done in the US, Nancy Niedzielski and Dennis Preston describe a language ideology seemingly common among American English speakers.
According to Niedzielski and Preston, many of their subjects believe that there is one "correct" pattern of grammar and vocabulary that underlies Standard English, and that individual usage comes from this external system.[6]
In 1995, Max Appedole relied in part on an analysis of Rafael Sebastián Guillén Vicente's writing style to identify him as Subcomandante Marcos, a leader of the Zapatista movement.[10]
Idiolect analysis of an individual differs depending on whether the corpus consists entirely of written texts or of audio files: written work is more carefully planned and more precisely worded than spontaneous speech, which is full of informal language and conversational fillers such as "umm..." and "you know".
Corpora with large amounts of input data allow for the generation of word-frequency and synonym lists, normally through the use of the top ten bigrams (pairs of adjacent words) drawn from them.
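As a rough illustration of extracting the most frequent bigrams from a corpus, the following Python sketch counts adjacent word pairs and returns the ten most common. The function names `tokenize` and `top_bigrams`, the simple regular-expression tokenizer, and the sample text are assumptions made for this example; they are not taken from any particular study or forensic-linguistics tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a token is a run of ASCII letters and apostrophes;
    real corpus work would use a more careful tokenizer.
    """
    return re.findall(r"[a-z']+", text.lower())


def top_bigrams(text, n=10):
    """Return the n most frequent bigrams (adjacent word pairs) with their counts."""
    tokens = tokenize(text)
    # Pair each token with the one that follows it.
    bigrams = zip(tokens, tokens[1:])
    return Counter(bigrams).most_common(n)


# Hypothetical sample standing in for a transcript of spontaneous speech.
sample = "you know I think you know it is what it is you know"

for (first, second), count in top_bigrams(sample):
    print(f"{first} {second}: {count}")
```

In this toy sample the filler "you know" is the most frequent bigram, which is the kind of recurring pattern that might then be examined as a possible personal discourse marker.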
Data in a corpus pertaining to idiolect are sorted into three categories: irrelevant, personal discourse marker(s), and informal vocabulary.