Our finding that the similarity relations within and among early-learned nouns and adjectives may themselves create the noun advantage over adjectives contrasts with the suggestion that objects, as opposed to their attributes, are conceptually special (see, e.g., Gentner & Rattermann, 1991; Markman, 1989). However, one might argue
that a three-layer network in which the hidden layer compresses the sensory input into one holistic representation is one instantiation of how a whole-object conceptual assumption might be implemented. From this argument, one might conclude that this network was ``designed'' to learn easily about categories in which all instances are globally similar to each other (and thus compact and small). Is this not, in a sense, a built-in bias for noun-like categories?
By one interpretation of this question, the answer is a clear ``yes.'' The proposal that noun categories are more ``natural'' than adjective categories and the proposal that young children ``assume'' that words name things and not their properties are currently unspecified in terms of the processes through which the naturalness of nouns or children's assumptions might be realized. This network offers one implementation of these ideas; it shows just how nouns might be more ``natural'' and why very young children seem to interpret novel words as having nominal meanings. Thus the results of these simulations may properly be viewed as supporting and extending proposals about young children's early biases and assumptions about word meanings.
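To make the architecture at issue concrete, a minimal sketch of such a network is given below. The layer sizes, activation function, and learning rule are illustrative assumptions, not the parameters of the reported simulations; the point of the sketch is only that every sensory dimension feeds one shared hidden layer, so the hidden representation is a single holistic encoding of the whole input.
\begin{verbatim}
# A minimal sketch (not the reported implementation) of a three-layer
# network in which all sensory dimensions feed one shared hidden layer,
# so no dimension is represented separately at the hidden level.
# Layer sizes and the training rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_INPUT, N_HIDDEN, N_OUTPUT = 12, 6, 8   # assumed sizes, not the paper's
W1 = rng.normal(scale=0.1, size=(N_HIDDEN, N_INPUT))
W2 = rng.normal(scale=0.1, size=(N_OUTPUT, N_HIDDEN))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    # Every hidden unit pools input from every sensory dimension,
    # producing one "holistic" hidden representation of the object.
    h = sigmoid(W1 @ x)
    return h, sigmoid(W2 @ h)

def train_step(x, target, lr=0.5):
    # Plain backpropagation of squared error (an assumed learning rule).
    global W1, W2
    h, y = forward(x)
    dy = (y - target) * y * (1 - y)
    dh = (W2.T @ dy) * h * (1 - h)
    W2 -= lr * np.outer(dy, h)
    W1 -= lr * np.outer(dh, x)
\end{verbatim}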
But there is a second interpretation of the question of whether a noun advantage was built into the network that demands a clear ``no.'' It is true that representations at our hidden layer holistically combine the input from the separate sensory dimensions. Connectionist networks do not have to do this. For example, Kruschke's ALCOVE network (Kruschke, 1992) uses dimensional attention weights such that the network retains information about distinct attributes at the level of its hidden layer. Given these differences, one might expect that Kruschke's network would learn adjective categories more easily than the present one. This may be so. However, the conclusion that our network is structured to make the learning of adjectives hard is not warranted. It is not warranted because our network learns single-dimension adjective categories easily, indeed trivially fast when there is only one relevant dimension and no overlapping categories. That is, when we presented our network with the same kind of task that ALCOVE has been presented with --- classifying all inputs into two mutually exclusive categories, each constrained by variation on the same dimension (what might correspond to learning the categories BLACK versus WHITE) --- the network rapidly (in less than 500 trials) converged to a set of attention weights that emphasized the single relevant input dimension. In brief, it is not hard for this network to learn adjective-like categories.
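The toy below illustrates why such single-dimension tasks are easy. It is not the reported simulation: it uses ordinary connection weights rather than separate attention parameters, and the dimension count, learning rate, and trial budget are assumptions. It shows only that, within a few hundred trials, gradient descent comes to weight the one relevant dimension far more heavily than the irrelevant ones.
\begin{verbatim}
# An illustrative toy of a single-dimension category task: inputs vary
# on three dimensions, but only dimension 0 (say, brightness) determines
# which of two mutually exclusive categories an input belongs to.
import numpy as np

rng = np.random.default_rng(1)
w = np.zeros(3)   # one weight per input dimension
b = 0.0
lr = 0.5

for trial in range(500):                    # a few hundred trials
    x = rng.uniform(size=3)                 # three perceptual dimensions
    target = 1.0 if x[0] > 0.5 else 0.0     # only dimension 0 is relevant
    y = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # logistic category unit
    err = y - target
    w -= lr * err * x                       # delta-rule update
    b -= lr * err

# The weight on dimension 0 typically ends up much larger in magnitude
# than the weights on the two irrelevant dimensions.
print(np.round(w, 2))
\end{verbatim}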
However, it is hard for this network to learn adjective-like categories when it must, like young children, simultaneously learn noun-like categories that require attention to many dimensions and multiple overlapping adjective categories that each require attention to different dimensions. We conjecture that a similar difficulty might hold even for models like ALCOVE when the task is the simultaneous learning of multiple overlapping noun-like and adjective-like categories.
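The following sketch, with all particulars (dimension names, prototypes, thresholds) invented for illustration, shows the kind of training environment this describes: each exemplar receives one noun label that depends on its overall similarity across every dimension, and several adjective labels that each depend on a single, different dimension, so the adjective categories cut across the noun categories.
\begin{verbatim}
# A sketch of the task structure at issue (all specifics assumed):
# one input, several overlapping labels.
import numpy as np

rng = np.random.default_rng(2)

# Prototypes over three dimensions (color, texture, shape); assumed values.
NOUN_PROTOTYPES = {"ball": np.array([0.8, 0.2, 0.9]),
                   "cup":  np.array([0.3, 0.7, 0.4])}

def make_exemplar():
    x = rng.uniform(size=3)                 # a random object
    # The noun depends on overall similarity across all three dimensions...
    noun = min(NOUN_PROTOTYPES,
               key=lambda n: np.sum((x - NOUN_PROTOTYPES[n]) ** 2))
    # ...while each adjective depends on one dimension only, so adjective
    # categories overlap with (cut across) the noun categories.
    adjectives = ["white" if x[0] > 0.5 else "black",
                  "rough" if x[1] > 0.5 else "smooth",
                  "round" if x[2] > 0.5 else "flat"]
    return x, noun, adjectives
\end{verbatim}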
In sum, the ease with which the present network learns adjective categories on one dimension when that is all that it has to learn indicates that the noun advantage is not solely the product of the compression of dimensional information at the hidden layer. Rather, the noun advantage appears to be a product of similarity-based learning and the task of learning overlapping categories. Given this kind of learning device and this set of tasks to be learned, noun-like meanings are primary.