The word superiority effect has become a litmus test for theories of perception, and it would be interesting to see how ACT-R's theory applies to that phenomenon. The effect refers to the fact that it is easier to recognize a letter when it occurs in a word context than it is to recognize a letter alone. Thus, given a brief (and usually masked) presentation of WORD, subjects are better at judging whether the last letter was D or K (both of which make a word) than they are when given a brief visual presentation of a D alone and asked whether they saw a D or K. To test whether ACT-R's pattern recognition component could model this effect, we used the corpus of 4-letter words compiled by McClelland and Rumelhart (1981) and the same features that they used to define letters. When presented with a single letter, ACT-R would try to recognize a letter pattern. When presented with a 4-letter string, it would try to recognize a word pattern. It chose the word or letter that was most probable according to the Pattern-Recognition Equation (5.1). This equation requires a prior probability P(k) for each word or letter, which we set to a quantity proportional to the square root of the item's frequency (based on the work of Anderson and Schooler, 1991, investigating the relationship between frequency in the environment and memory). Equation 5.1 also requires the conditional probabilities P(f|k) of the features given the pattern. We set P(f|k) = .94 if f is a feature that should be present for the pattern and P(f|k) = .04 if f is a feature that should be absent. These values are rather arbitrary and only for demonstration purposes, but they represent the assumption that, in sloppy encoding, a feature that was there is more likely to go unregistered (probability .06) than a feature that was not there is to be erroneously encoded (probability .04). While this seemed like a plausible assumption, our results do not depend on it.
Moreover, ACT-R's prediction of a basic word superiority effect does not depend at all on the values assigned to P(f|k) or P(k), although the actual level of the recognition rates does.
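To make the computation concrete, the following is a minimal sketch in Python of how a most-probable pattern might be chosen from a prior P(k) and feature probabilities P(f|k). It is not ACT-R's implementation. It assumes that Equation 5.1 can be read as a Bayesian combination in which the prior is multiplied by the feature probabilities and then normalized over candidates, and it assumes that each feature is scored independently as either registered or not registered. The feature names, the two-letter candidate set, and the frequencies are hypothetical and are not the McClelland and Rumelhart (1981) feature set or corpus; only the values .94 and .04 and the square-root prior come from the text.

```python
import math

P_PRESENT = 0.94  # P(f|k) when pattern k should contain feature f (value from the text)
P_ABSENT = 0.04   # P(f|k) when pattern k should not contain feature f (value from the text)


def priors_from_frequency(freqs):
    """Return priors P(k) proportional to the square root of each item's frequency."""
    roots = {k: math.sqrt(v) for k, v in freqs.items()}
    total = sum(roots.values())
    return {k: r / total for k, r in roots.items()}


def likelihood(observed, pattern_features, all_features):
    """P(observed | pattern), assuming features are encoded independently (an assumption)."""
    p = 1.0
    for f in all_features:
        p_on = P_PRESENT if f in pattern_features else P_ABSENT
        # A registered feature contributes p_on; an unregistered one contributes 1 - p_on.
        p *= p_on if f in observed else 1.0 - p_on
    return p


def recognize(observed, patterns, priors):
    """Return the most probable pattern and the normalized posterior over all patterns."""
    all_features = set().union(*patterns.values())
    scores = {
        k: priors[k] * likelihood(observed, feats, all_features)
        for k, feats in patterns.items()
    }
    total = sum(scores.values())
    posterior = {k: s / total for k, s in scores.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Hypothetical toy data: two letters described by made-up feature labels.
patterns = {"D": {"a", "b", "c"}, "K": {"a", "d", "e"}}
priors = priors_from_frequency({"D": 400, "K": 100})  # hypothetical frequencies

best, posterior = recognize({"a", "d", "e"}, patterns, priors)
print(best, posterior)
```

In this toy case the prior favors D, but the registered features match K, so the likelihood term dominates and K is chosen. The same functions could be applied to word patterns by treating a word's feature set as the position-tagged features of its four letters, which is again an assumption of this sketch rather than something stated in the text.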