PROGRAMMING PARADIGMS

Fuzzy Logic and Prejudice

Michael Swaine

This column is about fuzzy logic: its struggle for acceptance, the arguments of its critics and supporters, and its record of success. If you're already pretty familiar with this paradigm and want some sample code and implementation advice, you may as well skip to the references at the end. Because this column is about fuzzy logic as an emerging paradigm. And along the way, it takes a detour to look at another paradigm, one that has struggled against resistance very much as fuzzy logic has, and which now finds itself in an ironic position.

Fuzzy Logic Has a Bad Name

Fuzzy logic is an approach to logic that has applications to microprocessor-control logic and expert-systems design. It was invented in 1964 by Lotfi Zadeh, who also named it.

Although it has been around for almost 30 years, fuzzy logic has been slow to catch on. One reason for this, as Daniel McNeill and Paul Freiberger catalog in their book, Fuzzy Logic, is that it has been subject to prejudice and discrimination. Papers submitted to journals have been rejected because they dealt with fuzzy logic; engineers have been told not to employ fuzzy techniques; grant requests have been rejected because of the word "fuzzy" in their titles.

True, these journal editors, managers, and grant approvers may have been onto something. But McNeill and Freiberger make a convincing case that fuzzy logic has been the victim of unreasoning bias. At least some of this bias against fuzzy logic seems to involve its name. Perhaps it's not surprising that NASA engineers are reluctant to put themselves in a position where they have to tell their bosses that they're using fuzzy techniques. And their bosses certainly don't want to have to justify to the taxpayers that the logic behind space-shuttle navigation is fuzzy.

Zadeh deliberately gave fuzzy logic a provocative name. That decision may have been unwise.

Fuzzy logic is controversial enough without the oxymoronic name. It rejects the eternal verities of Aristotelian logic: That a thing is either A or not-A; that nothing is both true and false; and that nothing is neither true nor false. Fuzzy logic tolerates the true-and-not-true statement, and the neither-true-nor-false. It takes the view that, while absolute truths may exist, we never seem to see them in our data. We live in a world of imprecise, uncertain, subjective information, and to pretend otherwise is to ignore an important fact, to throw away data. The fact that the data are imprecise is itself data, data that excessive, obsessive precision throws away.

Fuzzy logic can answer questions classical logic can't. Consider the following examples of logical deduction:

Most people who blink rapidly are lying.

Bob Dole often blinks rapidly.

Therefore, Bob Dole is often lying.

My cousin Corbett looks a lot like Rush Limbaugh.

Rush Limbaugh is fat.

Therefore, my cousin Corbett is more or less a porker.

Classical logic fails to follow these deductions. It insists that fuzzy language like "most" and "a lot like" be quantified precisely. Natural logic follows these deductions informally, not worrying about precision. Fuzzy logic follows them formally and naturally.

But fuzzy logic is not fundamentally about logic.

It's about set inclusion. To how great a degree is a tomato a vegetable? Classical, Aristotelian, binary-valued logic says it either is or it isn't. Other logics permit an indeterminate third value, a Neither between True and False. But even such logics are crisp-edged: They insist on unambiguous set membership. They just have more sets than Aristotle would have approved of. Fuzzy logic is different.

Everything we know about biology and the evolution of species says that the boundaries are in fact vague. Moreover, our minds seem bent on treating them that way: A tomato is sort of a vegetable, and sort of a fruit. Psychological researchers have found that our mental categories are generally fuzzy. We think of exemplars of a category as being more or less good examples, more or less members of the set. Rather than thinking in terms of the set theory taught in school, we seem to think in terms of prototypes, and rough, subjective degrees of matching to these prototypes. When forced to categorize a penguin, we throw it unambiguously into the set called birds, but what we really think is that it's not much of a bird.

Zadeh's fuzzy sets exactly capture this imprecision. Fuzzy sets are a way of talking about imprecision rather than masking it by rounding it off.

Say you're interested in finding the young, effective salespeople in your organization. The terms "young" and "effective" are fuzzy. Even if we agree that "effective" means high total sales, these terms are imprecise. And the point is, this imprecision represents information. If we were interested only in total sales, we could perhaps set some precise cutoff, but suppose we're interested in the two variables of age and sales volume in combination. We'll look at a very young salesperson with a merely good sales total, or a somewhat older salesperson with a phenomenal sales record. We'll trade one variable off against the other. No precise cutoff for the sets of "young" or "effective" salespeople will let us handle this trade-off. Fuzzy sets do.

In the mathematics of fuzzy sets, an item has a value for its membership in a set, and this value is not necessarily 0 or 1, but may be any value between 0 and 1. An item's membership value in the intersection of two sets is the lesser of its two membership values; its value in their union is the greater.
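A minimal sketch of these operations, applied to the salesperson example above. The membership-function shapes here are my own illustration, not anything the theory prescribes:

```python
def young(age):
    # Fully "young" at 25 or below, not at all at 45 or above, linear between
    if age <= 25:
        return 1.0
    if age >= 45:
        return 0.0
    return (45 - age) / 20

def effective(sales):
    # Membership in "effective" rises linearly from $100K to $500K in sales
    if sales <= 100_000:
        return 0.0
    if sales >= 500_000:
        return 1.0
    return (sales - 100_000) / 400_000

def fuzzy_and(a, b):
    # Intersection: an item's membership is the lesser of its two values
    return min(a, b)

def fuzzy_or(a, b):
    # Union: an item's membership is the greater of its two values
    return max(a, b)

# A very young salesperson with a merely good total, versus a somewhat
# older one with a phenomenal record
print(fuzzy_and(young(27), effective(350_000)))  # 0.625
print(fuzzy_and(young(40), effective(490_000)))  # 0.25
```

Here "young and effective" ranks the first salesperson higher, a trade-off that no crisp cutoff on either variable alone could express.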

From a rigorous theory of sets one can derive a rigorous logic, and Lotfi Zadeh has done so with fuzzy sets, deriving the fuzzy equivalents to the propositional calculus and first-order predicate logic.

Why Are the Victims of Discrimination the First to Discriminate?

But fuzzy doesn't have a lock on uncertainty.

Fuzzy logic is not the first attempt to reason rigorously about imprecise, uncertain, subjective information. That's what statistics is all about.

One type of statistics has been employed effectively in the very areas where fuzzy logic is being used: Bayesian statistics. Like fuzzy logic, the Bayesian approach has been used in expert systems. Like fuzzy logic, it has been discriminated against.

Bayesian statisticians refer to classical statisticians as "relative frequentists," alluding to the fundamental difference in their interpretations of probability. The relative-frequency interpretation has held sway throughout most of the history of statistics; only recently have Bayesians gained some ground.

To a relative frequentist, a probability is the limit of a sequence. The true probability of heads in flipping a coin is the limit of the relative frequency of heads in n flips of the coin as n goes to infinity.
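That definition can be watched in action with a short simulation. This is just a sketch; the point is the convergence, not the particular numbers (the seed is arbitrary, chosen only to make the run repeatable):

```python
import random

random.seed(19930301)  # fixed seed so the run repeats exactly

flips = [random.random() < 0.5 for _ in range(100_000)]
for n in (10, 1_000, 100_000):
    heads = sum(flips[:n])
    print(n, heads / n)  # relative frequencies; they settle near 0.5 as n grows
```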

If you start with this definition of probability, it determines what interpretation you can put on the outcome of an experiment. In particular, it doesn't allow you to draw conclusions about the probability that a theory or hypothesis is true; it either is true or it isn't. You can, however, draw conclusions about the probability that you would have observed the results you did if a given hypothesis is true. In other words, you can't talk about the probability of the hypothesis given the data, but you can talk about the probability of the data given the hypothesis. Since you can observe the data and you can only speculate about the hypothesis (that's why they're called hypotheses), this seems exactly backward.

Let's see how this works in practice. Let's assume that you're a relative frequentist and that you want to know if the coin I'm flipping is a legitimate coin. I've flipped it 20 times, and every time it's come up heads. Your suspicions are aroused. What can you conclude about the hypothesis that this is a fair coin?

The answer is, nothing. You can compute the probability of getting 20 heads out of 20 flips under this hypothesis, and it's very low. But this says nothing about the hypothesis itself. Classical statistics doesn't allow you to do much of anything with a single hypothesis. What it does allow you to do is to compare two hypotheses.

In our example, the hypothesis that I'm flipping a fair coin makes the observed outcome extremely unlikely, while the hypothesis that I've got a two-headed coin makes the observed outcome certain. On this basis, you are justified in rejecting the fair-coin hypothesis categorically, in favor of the two-headed-coin hypothesis. But you are not making a judgement about the probability of either hypothesis being true: It either is or is not true, and in the classical approach probabilities do not apply. You've merely used the probability of an observed event occurring under the two hypotheses to make an educated guess.
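The educated guess rests on a likelihood ratio, which for the coin example sketches out like this:

```python
# Probability of the observed data (20 heads in 20 flips) under each hypothesis
p_data_given_fair = 0.5 ** 20        # roughly one in a million
p_data_given_two_headed = 1.0 ** 20  # certain

# The ratio says how much better one hypothesis explains the data; it is a
# statement about the data under each hypothesis, not about either being true.
ratio = p_data_given_two_headed / p_data_given_fair
print(ratio)  # 1048576.0 -- the two-headed hypothesis explains the data 2**20 times better
```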

If the logic of this seems contorted, it may be because I have not presented the classical approach fairly. Or it may seem contorted because it is contorted. That's what Bayesians think.

Bayesian statistics takes its name from Thomas Bayes, whose simple formula was published posthumously more than two centuries ago. The formula itself is uncontroversial; it's just a statement about how to compute conditional probabilities. It looks like this: P(H|D)=P(D|H)*P(H)/P(D). That is, the conditional probability of H given D is equal to the conditional probability of D given H, times the ratio of the unconditional probabilities of H and D. It's a cute trick for reversing the direction of conditionality. If we know how likely D is given H, we can use it to find out how likely H is given D. Assuming, that is, that we know how likely D and H are unconditionally. To make it clearer where these values come from when they use Bayes's rule, Bayesians usually write it this way: P(H|D)=P(D|H)*P(H)/(P(D|H)*P(H)+P(D|H')*P(H')), where H' means the complement of H, and P(D) has just been expanded into a weighted sum of probabilities.
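A quick numeric check that the expanded form agrees with the simple one. The probability values below are arbitrary, chosen only for illustration:

```python
p_h = 0.3              # P(H): unconditional probability of the hypothesis
p_d_given_h = 0.8      # P(D|H)
p_d_given_not_h = 0.1  # P(D|H')

# P(D) expanded into the weighted sum over H and its complement H'
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

# Bayes's rule reverses the direction of conditionality
p_h_given_d = p_d_given_h * p_h / p_d
print(round(p_h_given_d, 4))  # 0.7742
```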

The controversy comes in when we attach meanings to the symbols. H means hypothesis and D means data. Again, no one has ever questioned the validity of the equation. But notice what it does: It lets you compute the probability of the hypothesis given the data. In the classical interpretation of probability, this is meaningless. Hypotheses do not have relative frequencies. Observations can be repeated, so probabilities apply to them. Hypotheses can't; they are either true or false. So what is this equation saying?

As I understand it, in the classical view, the equation is mathematically valid and semantically vacuous. It follows from the mathematical definition of probability, all right, but it doesn't mean anything.

In the Bayesian view, it means exactly what it says. It is the key to Bayesian statistics. That and one other thing: a different interpretation of probability.

To a Bayesian, probability is a degree of belief.

Bayes's rule lets you revise your belief on the basis of new data. Your initial degree of belief, or prior probability, is P(H); your revised opinion, or posterior probability, is P(H|D); and the probability of observing this outcome if the hypothesis is true, which is derivable from the mathematical specification of the hypothesis and is the one probability here that a classical statistician would accept as meaningful, is P(D|H). Bayes's rule lets a Bayesian update a posterior probability from a prior probability based on the data observed.
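Applied to the two-headed-coin example, the updating loop looks like this. The prior of 0.5 is my own starting opinion, which is exactly the kind of subjective input Bayesians embrace:

```python
p_fair = 0.5  # prior degree of belief that the coin is fair

for _ in range(20):  # twenty observed heads
    p_heads_given_fair = 0.5
    p_heads_given_two_headed = 1.0
    # P(D): probability of heads under the current state of belief
    p_heads = p_heads_given_fair * p_fair + p_heads_given_two_headed * (1 - p_fair)
    # Bayes's rule: the posterior becomes the prior for the next flip
    p_fair = p_heads_given_fair * p_fair / p_heads

print(p_fair)  # about one in a million -- belief in the fair coin has all but vanished
```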

There is a whole body of statistics that flows from this rule, but it doesn't necessarily lead to different conclusions than the ones reached by classical statisticians. Bayesian results map precisely onto classical results, given reasonable assumptions. Bayesian statistics caught on in business schools, but was sneered at by mathematicians and scientists because of P(H): In order to get the Bayesian engine cranking, you've got to prime it with a prior probability, a prior opinion. Every Bayesian conclusion starts with the unsupported (and therefore unscientific, subjective) opinion of the investigator. The fact that this mirrors reality did not initially impress the scientific and mathematical communities. Eventually, though, the success of the Bayesian approach led to its acceptance. (Those B-school statisticians were more concerned with the bottom line than with academic respectability.)

Not only is the Bayesian approach more intuitively satisfying than the classical, it also lends itself naturally to the design of robotic controllers and expert systems, both of which involve the updating of an initial guess on the basis of new data.

Like fuzzians, Bayesians were discriminated against. Like fuzzians, they championed an approach that was, at once, more natural and at odds with the fundamental assumptions of the official view. Having won some respectability, Bayesians were the first to denounce fuzzy logic as bogus. Naturally.

But fuzzy logic is a genuinely new paradigm. It's not about probability. It's about possibility. So say the fuzzians. To this the Bayesians have a rejoinder: Anything fuzzy logic can do can also be done with probability models.

Bayesians have claimed this repeatedly. This is the argument behind some of the rejections of fuzzy articles by journals. Fuzzy logic is nothing new, the argument goes, just a terminological affectation.

The argument is, of course, mathematical. It ignores the fact that fuzzy expert systems generally have about one-tenth the number of rules that probabilistic expert systems have, and are easier to implement. A similar argument says object-oriented programming can't do anything that spaghetti-code Basic can't do. Possibly so, mathematically, but in practice there's a world of difference.

But one fuzzy researcher, Bart Kosko, has tried to resolve the mathematical argument. Kosko has come up with a not entirely uncontroversial mathematical characterization of the domain of fuzzy logic, from which it is possible to derive Bayes's rule. This would seem to imply that anything the Bayesians can do, the fuzzians can do, but that there may be more to fuzzy than mere probabilities.

People Don't Count (But They Do Classify)

There is a taxonomy of uncertainty.

George Klir, another fuzzian, claims that there are, in fact, four kinds of uncertainty: nonspecificity, fuzziness, dissonance, and confusion. Both fuzzy logic and probability theory--Bayesian or otherwise--deal with all four kinds of uncertainty, but probability does so unconvincingly at times. Proponents contend that fuzzy is more representative of reality, even if it is mathematically equivalent to probability.

There's another sense in which the probabilistic approach rings false, and here I have personal experience. In graduate school, in cognitive psychology, studying how people handle probabilities, recomputing them, working in a Bayesian paradigm, I chanced upon the work of two Israeli psychologists. They concluded that people in research settings like mine are really judging how closely events match an exemplar, how much they fit into a category, not updating probabilities at all. What Kahneman and Tversky were describing, although they didn't use the term and I had not heard it then, was "fuzzy-set membership." In any case, it wasn't Bayesian-probability revision, and a few months later I transferred to the computer-science department.

Fuzzy Logic Needs a Killer App

Fuzzy logic has recently been gaining acceptance in the United States and in Europe. But the process is slow, resistance remains, and meanwhile the Japanese are making excellent use of fuzzy logic in processors and expert systems. Fuzzy is actually a fad in Japan, but that shouldn't overshadow the real successes: The Sendai subway, for example, which American engineering has not been able to match for smoothness of ride, employs fuzzy logic for its control.

What American fuzzy logic needs to break through, I think, is the quintessential American application. This needs to be an appropriate application for fuzzy logic: one in which the input is subjective, imprecise, and uncertain. And it should be a decision-making task. It should also be something that'll catch on at universities, where it will be burned into the brains of the next generation of executives and programmers. And it needs to be something warm and nonthreatening, to overcome the resistance to the name and the novelty of the technology; preferably a familiar application that has already been used to break down resistance to computers.

I can think of only one such application. A computer dating service. Your dream date through warm and fuzzy logic. How can it fail?

Further Reading

Kosko, Bart. Neural Networks and Fuzzy Systems. Englewood Cliffs, NJ: Prentice Hall, 1991.

McNeill, Daniel and Paul Freiberger. Fuzzy Logic. New York, NY: Simon & Schuster, 1993.

Zadeh, L.A. "The Calculus of Fuzzy If/Then Rules." AI Expert (March 1992).


Copyright © 1993, Dr. Dobb's Journal