Calculate Totals and Subtotals
On the previous page, we used the command nrow() to illustrate the way that most R commands work: the name of the command followed by parentheses that contain the vector or values that you want to analyze. The nrow() command is, however, not one that you will use very often, because it deals with the structure of your data rather than the data itself. A more useful command is sum(), which finds the total of the values in a vector. Using our sample data set, we can use sum() to calculate the total number of words in the corpus of Greek tragedy and also the number of words written by each of our three authors.
The sum() function takes as its input a vector of numbers, or several numbers separated by commas, and returns the total. Working in interactive mode, the command sum(1, 5, 23, 100, 48, 9) returns the output 186. We can extract a similar vector from our table of sample data with the command trag.length[, "Word.Count"] and get the output:
 [1]  4104  4939  5115  5189  5297  5426  5447  6240  6603  7077  7177  7279  7363  7398  7597  7672  7902  7914  8032  8157  8187  8396  8702  8830
[25]  9240  9280  9430  9879  9927 10030 10385
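As a minimal sketch of these two steps, the example below uses a toy data frame in place of trag.length. The name toy.length, the author labels, and the counts are all invented for illustration; only the column names Author and Word.Count follow the sample data.

```r
# Toy stand-in for trag.length: the author labels and counts are invented.
toy.length <- data.frame(
  Author = c("AuthorA", "AuthorA", "AuthorB"),
  Word.Count = c(100, 250, 400),
  stringsAsFactors = FALSE
)

# sum() accepts several numbers separated by commas or a single vector.
total.from.args <- sum(1, 5, 23, 100, 48, 9)
total.from.vector <- sum(c(1, 5, 23, 100, 48, 9))
print(total.from.args)
print(total.from.vector)

# Selecting one column by name returns a plain numeric vector.
toy.counts <- toy.length[, "Word.Count"]
print(toy.counts)
print(sum(toy.counts))
```

Leaving the row position empty before the comma keeps every row, so the result holds one number per row of the table.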
If you are going to use a vector multiple times, it can be easier to extract it from the table and assign it to its own variable. We can assign this vector to a variable called word.counts using the command word.counts <- trag.length[, "Word.Count"] and then get the total, 234,214, with the command sum(word.counts). If you do not need to reuse the vector, you can skip this intermediate step and simply issue the command sum(trag.length[, "Word.Count"]).
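The two routes can be compared side by side. This is a sketch on an invented toy data frame (toy.length and its values are hypothetical stand-ins for trag.length), showing that the stored vector and the direct call give the same total, and that the stored vector can be reused by other commands.

```r
# Toy stand-in for trag.length: the author labels and counts are invented.
toy.length <- data.frame(
  Author = c("AuthorA", "AuthorA", "AuthorB"),
  Word.Count = c(100, 250, 400),
  stringsAsFactors = FALSE
)

# Route 1: store the column in its own variable, then total it.
word.counts <- toy.length[, "Word.Count"]
total.stored <- sum(word.counts)

# Route 2: total the column directly, with no intermediate variable.
total.direct <- sum(toy.length[, "Word.Count"])

# Both routes give the same number.
print(identical(total.stored, total.direct))

# The stored vector can be passed to other commands without retyping
# the extraction.
print(length(word.counts))
print(max(word.counts))
```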
Similarly, it is possible to calculate subtotals by adding a row specification to the trag.length variable. We can modify the vector assignment command above to soph.word.counts <- trag.length[trag.length$Author == "Sophocles", "Word.Count"] and get a vector containing the length of each of Sophocles' plays. We can then issue the command sum(soph.word.counts) to get the result 59,651. (As above, you can also directly issue the command sum(trag.length[trag.length$Author == "Sophocles", "Word.Count"]) and get the same result.)
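The row specification works because the comparison produces one TRUE or FALSE per row, and only the TRUE rows are kept. The sketch below shows this on an invented toy data frame (toy.length, the author labels, and the counts are hypothetical). The sapply() loop at the end is one possible way to get every author's subtotal at once; it is an addition for illustration, not a step from this page.

```r
# Toy stand-in for trag.length: the author labels and counts are invented.
toy.length <- data.frame(
  Author = c("AuthorA", "AuthorA", "AuthorB", "AuthorC"),
  Word.Count = c(100, 250, 400, 50),
  stringsAsFactors = FALSE
)

# The comparison yields a logical vector with one entry per row.
is.author.a <- toy.length$Author == "AuthorA"
print(is.author.a)

# Used in the row position, it keeps only the rows marked TRUE.
author.a.counts <- toy.length[is.author.a, "Word.Count"]
author.a.total <- sum(author.a.counts)
print(author.a.total)

# Repeat the same filter for each distinct author to get all subtotals.
subtotals <- sapply(unique(toy.length$Author), function(a) {
  sum(toy.length[toy.length$Author == a, "Word.Count"])
})
print(subtotals)
```

A useful check is that the subtotals add up to the grand total returned by sum() on the whole column.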
If, instead of calculating each frequency separately, you want to tabulate the frequency of all the values in a column of a table, you can do this using the table() command. For example, the command table(odyssey.monsters[,"Segment"]) will give you the number of words in each segment in books 9-12 of the Odyssey as defined in our dataset:
Aeolus | CattleOfTheSun | Cicones | Circe | CirceGivesDirections |
760 | 2007 | 627 | 3683 | 1554 |
Cyclops | Hurricane | Laestrygonians | LotusEaters | ScyllaAndCharybdis |
4818 | 193 | 1308 | 225 | 618 |
Sirens | Underworld |
481 | 6111 |
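To see how table() arrives at counts like these, here is a small sketch on an invented data frame. The name toy.monsters and its rows are hypothetical, and it assumes, as in odyssey.monsters, that each row stands for one word and the Segment column names the episode the word belongs to.

```r
# Toy stand-in for odyssey.monsters: one row per word, rows invented.
toy.monsters <- data.frame(
  Segment = c("Cyclops", "Cyclops", "Sirens", "Cyclops", "Sirens"),
  stringsAsFactors = FALSE
)

# table() counts how many rows carry each distinct value.
segment.counts <- table(toy.monsters[, "Segment"])
print(segment.counts)

# The distinct values become names, sorted alphabetically.
print(names(segment.counts))

# A single count can be pulled out by name.
print(segment.counts[["Cyclops"]])
```

The counts always add up to the number of rows in the table, since every row falls into exactly one segment.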
Similarly, you can count how often each word appears in books nine through twelve of the Odyssey by using the table() command as follows: table(odyssey.monsters[,"Lemma"]). If you would like the same count restricted to the underworld sequence, you can get it with the command table(odyssey.monsters[odyssey.monsters$Segment=="Underworld","Lemma"]).
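The same row filter used for subtotals can be combined with table(). The sketch below uses an invented data frame (toy.monsters is hypothetical, and lemmaA, lemmaB, and lemmaC are placeholders rather than real Greek words) to compare the count over all rows with the count over one segment.

```r
# Toy stand-in for odyssey.monsters: one row per word, values are placeholders.
toy.monsters <- data.frame(
  Segment = c("Cyclops", "Cyclops", "Underworld", "Underworld", "Underworld"),
  Lemma = c("lemmaA", "lemmaB", "lemmaA", "lemmaA", "lemmaC"),
  stringsAsFactors = FALSE
)

# Frequency of each lemma across every row.
lemma.counts <- table(toy.monsters[, "Lemma"])
print(lemma.counts)

# Frequency of each lemma within the Underworld rows only.
in.underworld <- toy.monsters$Segment == "Underworld"
underworld.counts <- table(toy.monsters[in.underworld, "Lemma"])
print(underworld.counts)
```

Because the Lemma column is stored here as plain text rather than as a factor, lemmas that never occur in the filtered rows are simply absent from the second table instead of being listed with a count of zero.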