This software performs statistical analysis of strings. More precisely, the application counts the letters, digits, accents, punctuation marks, words and sentences in a text. It also calculates the number of letters per word and of words per sentence (reporting measures of location and dispersion for these quantitative variables), and lists the longest and shortest sentences and the longest and shortest words. There are two special modules: the first lets the student use the frequency distribution of letters to decode messages encrypted with the Caesar cipher, and the second lets the student investigate power laws in the frequency distribution of words in a text (Zipf's law).
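To make the two special modules concrete, here is a minimal Python sketch of the ideas behind them. It is not the software's own code: the function names (`letter_frequencies`, `shift_text`, `guess_caesar_shift`, `rank_frequency`) are hypothetical, the text is assumed to use unaccented ASCII letters, and the most frequent plaintext letter is assumed to be "e", which is typical of English but may fail for short or unusual texts.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Count each ASCII letter in the text, ignoring case."""
    return Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)


def shift_text(text, shift):
    """Shift every ASCII letter by `shift` positions, preserving case.

    Non-letters (digits, spaces, punctuation) are left unchanged.
    A negative shift undoes a positive one.
    """
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def guess_caesar_shift(ciphertext, most_common_plain="e"):
    """Guess the encryption shift from the letter frequency distribution.

    Assumption: the most frequent ciphertext letter stands for
    `most_common_plain`. Returns 0 if the text contains no letters.
    """
    freqs = letter_frequencies(ciphertext)
    if not freqs:
        return 0
    top_letter = freqs.most_common(1)[0][0]
    return (ord(top_letter) - ord(most_common_plain)) % 26


def rank_frequency(text):
    """Return (rank, word, count) triples, most frequent word first.

    Words are split on whitespace and stripped of surrounding
    ASCII punctuation; this is a simplification.
    """
    words = (w.strip(string.punctuation) for w in text.lower().split())
    counts = Counter(w for w in words if w)
    return [
        (rank, word, n)
        for rank, (word, n) in enumerate(counts.most_common(), start=1)
    ]
```

A student could encrypt a message with `shift_text(message, 3)`, recover the shift with `guess_caesar_shift`, and decode by applying the negative shift. For Zipf's law, which states that a word's frequency is roughly inversely proportional to its rank, the output of `rank_frequency` on a long text can be plotted on log-log axes, where the points should fall approximately on a straight line.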
The main objective of our software is to provide an interactive environment in which students and teachers can experiment with, explore and enjoy the use of statistics in a real-world application (namely, text mining) and, through this exercise in a linguistic context, to promote the learning of statistical concepts. In addition, this proposal has a practical advantage: data for analysis are easy to find on the Internet (free books, poems, speeches, song lyrics, etc.).