Ugly duckling theorem

The Ugly Duckling theorem is an argument asserting that classification is impossible without some sort of bias. It is named for Hans Christian Andersen's story "The Ugly Duckling." It gets its name because it shows that, all things being equal, an ugly duckling is just as similar to a swan as two swans are to each other, although it is only a theorem in a very informal sense. It was proposed by Satosi Watanabe in 1969.

Watanabe came to realize there is an unquantifiable number of shared properties between all objects, making any classification biased. Murphy and Medin (1985) give an example of two putative classified things, plums and lawnmowers:

"Suppose that one is to list the attributes that plums and lawnmowers have in common in order to judge their similarity. It is easy to see that the list could be infinite: Both weigh less than 10,000 kg (and less than 10,001 kg), both did not exist 10,000,000 years ago (and 10,000,001 years ago), both cannot hear well, both can be dropped, both take up space, and so on. Likewise, the list of differences could be infinite… any two entities can be arbitrarily similar or dissimilar by changing the criterion of what counts as a relevant attribute."

Unless some properties are considered more salient, or ‘weighted’ more important than others, everything will appear equally similar, hence Watanabe (1986) wrote: “any objects, in so far as they are distinguishable, are equally similar". However, since there is an unlimited number of properties to choose from, it remains an arbitrary choice what properties to select/deselect. This makes classification biased. Watanabe named this the "Ugly Duckling theorem" because a swan is as similar to a duckling as to another swan (there are no constraints or fixes on what constitutes similarity).

Suppose there are n things in the universe, and one wants to put them into classes or categories. One has no preconceived ideas or biases about what sorts of categories are "natural" or "normal" and what are not. So one has to consider all the possible classes that could be, all the possible ways of making sets out of the n objects. There are $2^{n}$ such ways, the size of the power set of n objects. One can use that to measure the similarity between two objects: and one would see how many sets they have in common. However one can not. Any two objects have exactly the same number of classes in common if we can form any possible class, namely $2^{n-1}$ (half the total number of classes there are). To see this is so, one may imagine each class is a represented by an n-bit string (or binary encoded integer), with a zero for each element not in the class and a one for each element in the class. As one finds, there are $2^{n}$ such strings.

...
Wikipedia