Hal is a hardware engineer who sometimes programs. He is the former editor of DTACK Grounded, and can be contacted through the DDJ offices.
A "small" catastrophe is one that happens to someone else. Consider, for instance, the San Jose, CA, resident who discovered that, while she had been at work, somebody had dug a ditch in the street, run the ditch into her front yard, and dug up her rose bushes. The next morning she discovered, when Pacific Gas and Electric's work crew showed up, that no mistake had been made. PG&E had fully intended to dig up her rose bushes.
A quick review: Rice, normally a crop for monsoon countries where the annual rainfall is a quarter-mile or so, is being grown in the central California desert. Rice requires so much of California's ample water supply that there's precious little left over for people. So San Jose is recycling sewage water for watering industrial-park lawns. While laying the pipe to carry this water, San Jose contracted with PG&E to place fiber-optic cable in those sewage-water ditches.
Well, the cable is in place. The fiber-optic cable must be coupled to regular coax for home use, and that involves electronic equipment boxes measuring 6x5x2 feet (!). These boxes will run hot, so they have to be in the open air, not underground. This means they will be a prominent feature of any front yard they reside in, and if rose bushes have to be sacrificed or little Timmy's play area is eliminated, tough.
Meanwhile, several new cellular services are being established in the San Jose area. Each cellular service requires its own set of transceivers every few blocks. You guessed it: The new cellular service providers also have a contract with the city of San Jose, and it turns out your house is just perfect to bolt one of these antennas onto. Never mind that hearing aids and pacemakers don't work close to such devices.
Just when most traditional utilities such as telephone lines and electrical power have finally moved underground, the new information age is making even upscale residential neighborhoods look like high-tech junkyards. Isn't progress wonderful?
According to the industry newsletter Microprocessor Report (MPR), the fastest shipping desktop computer (as measured by SPECint95) on January 1 was the 200-MHz Pentium Pro. This lead lasted for about a month before DEC started shipping a 333-MHz DEC 21164-based workstation.
Several companies are continually vying to produce the fastest desktop CPU. At any given moment, every company but one fails to do so. Each failure is a not-so-small catastrophe--enormous sums of money are at stake.
A few years back, Sun designed the SuperSparc in-house, then turned the design over to Texas Instruments' process engineers. When Supe' saw the light of day, it proved far slower than Sun had preannounced. Unseemly fighting broke out as Sun and TI engineers pointed the finger of blame at each other. Everyone in the industry, including Sun's workstation customers, became aware that Supe' was even slower than cheap Pentium-based machines. But Sun learned its lesson.
Sun has started shipping systems based on the SuperSparc's successor, the UltraSparc. Sun has announced that Ultra is wonderful; no public arguments or finger pointing. Ultra systems are slower than relatively cheap Pentium Pro boxes, but that isn't mentioned by Sun's PR pitchmen. The UltraSparc: success, glory! Why, Sun even included a bunch of CISCy multimedia instructions, each of which replaces a sequence of 20 to 30 RISC instructions. So much for RISC's theoretical superiority to CISC. Don't laugh. Although the facts on the ground remain constant, Sun's stock has gone way up since Sun changed its PR stance.
You may remember that MIPS, the microprocessor design firm, was going bankrupt. Silicon Graphics (SGI) had to buy it to assure a continuing supply of leading-edge CPUs. The new R10000 has seen first silicon and works well, being competitive with the 200-MHz Pentium Pro and the 333-MHz DEC 21164. But the die is huge and yields are, as yet, very low.
Remember that SGI and DEC can get by with shipping dozens of their highest-end CPUs, while Intel needs shipments in the hundreds of thousands to be even noticeable.
SGI's catastrophes, two of them, loom on the horizon. First, SGI specializes in three-dimensional (3-D) graphics and has traditionally maintained 50 percent gross margins, as Apple once did. 3-D graphics boards are becoming commonly available for x86-based PCs, and (stop me if you've heard this before) SGI has discovered it cannot sustain 50 percent gross margins. In fact, it recently announced major price cuts, including cutting some system prices in half.
The supercomputer industry, which was never an especially profitable marketplace, peaked in 1992 before entering a terminal nosedive. SGI has decided that the solution to its problems is the purchase of supercomputer manufacturer Cray Research, the one in Minnesota. I am not making this up.
IBM and Apple's PowerPC systems were supposed to sweep the desktop market. Well, they were supposed to before reality set in. First Apple crippled its initial 601-based systems (except for the most expensive 8100 line) by leaving out the L2 cache. This caused a horrendous performance shortfall. The line continues to be crippled by the absence of a native-code operating system.
The performance of the next-generation 604 was overpromised by 33 percent by both IBM and Apple. When the 603 came out, the internal (L1) instruction cache proved seriously undersized, so the 603e had to be rushed to market. IBM and Apple were going to release hardware-compatible systems, but it didn't happen. IBM was going to ship a PPC-native OS/2, but that didn't happen either.
All of the PPC's performance problems were going to be solved by the third-generation 620. Well, the 620 staggered weakly into public view with astonishingly poor performance, so much so that the chip has been withdrawn. IBM fired the two heads of its Somerset PPC design center and hired a new boss from a company that knows how to design fast CPUs. Yep, Cyrix.
(Andrew Allison, publisher of the computer-systems industry newsletter Inside the New Computer Industry, thinks the 620 project would have been canceled outright except for some inconvenient delivery contracts.)
The poor performance of the Pentium Pro on 16-bit code is now universally recognized. When the CPU was introduced, Intel's PR carefully pointed out that this performance shortfall was intended. But I've read that Intel's competitors are "ecstatic" over the P6's poor 16-bit performance; surely that was not intended!
As this situation first unfolded, a colleague told me the Pentium Pro would be sold for Windows NT and the Pentiums for regular Windows. I replied that such an artificial divide would be untenable. Right now my colleague is way the heck ahead on points. Sigh.
As I write this, Intel is deciding which of three product lines will get the first production allotment of its new 0.25-micron fab. The one that gets the production slot will be the one with leading performance and will likely recover the industry lead from the DEC 21164. The three contending products are the regular Pentium, a new version of the Pentium with added multimedia instructions (the P55), and the Pentium Pro.
You would think the Pentium Pro would be awarded that production slot, but if Pentium Pros are only going to be sold to NT users then not many will be sold. Intel is a mass-production company which cannot make a profit selling a very few Tiffany CPUs. (You don't suppose Intel is quietly redesigning the Pro's 16-bit mode, do you? Naah. Too sensible. Face would be lost.)
A few years back, the latest fad in the semiconductor industry was the fabless chip-design firm. Now Cyrix has what's generally regarded as a really good x86-compatible chip design, but has no leading-edge production capacity. Neither IBM nor Thomson CSF, both of which have contracted to build chips for Cyrix and for sale under their own names, has significant production capacity available.
The spiffiest of CPU designs is worthless unless the part can be built, and in the mass PC marketplace, that means built by the millions. I tell everybody that Intel should buy Cyrix; everybody tells me I'm nuts. I guess Intel has as severe a case of "not invented here" as most design companies.
A bellyflop won't win the Olympic diving competition this summer in Atlanta, and AMD's K5 processor, the one that was going to drive Intel out of the Pentium business, isn't going to win any computer system sales at all. After trumpeting its fabulous (impending) performance for what seemed like years, AMD finally produced working K5s and discovered, to its horror, that they were pathetically slow (the SuperSparc scenario, only worse).
AMD's recovery plan is to rename the K5 as the SSA/5 to compete in the shrinking upscale-486 arena, not with Pentiums. AMD has purchased NexGen and renamed two NexGen developments as the new future K5 and K6. Like all future CPUs, these are world beaters. As with all future CPUs, the reality that eventuates may or may not prove propitious.
In the context of this column, DEC is more sinned against than sinning. DEC's chip designers decided years back, when the Alpha was being designed, that a fast clock was what it would take to provide industry performance leadership as measured by SPECmarks. They were right. Ever since the Alpha was introduced, DEC has led the SPECmark derby, as measured by SPECint89 and then SPECint92, usually by a large margin.
Most of the members of SPEC are CPU producers. The other members got tired of eating DEC's exhaust, so the new SPECint95 has been devised. Using this new metric, DEC was knocked out of the SPECmark lead with no change whatever in anyone's CPU designs. (True, DEC has regained the lead, but by the narrowest of margins.)
If you think this result is unrelated to the nature of the new SPECint95, I have some Montana beachfront property for sale. Other writers have explained the what and how of this new benchmark; I'm going to tell you why.
Higher performance can be attained either by pushing up the clock frequency of a CPU with few execution units (the DEC approach), or by pairing a more moderate clock with superscalar parallelism across many execution units (the approach used by everybody else).
The higher the clock frequency, the more cycles a cache miss costs while the CPU stalls waiting for data to arrive from DRAM. DEC's 21164 is the first CPU to include on chip both the now-conventional L1 cache and a larger L2 cache that is slower than the L1 but still faster than the external cache (L3 in this case) built from fast static RAM. In case there's a miss in both of the 21164's internal caches, DEC uses 8 MB (!) of L3 cache in its fastest systems, again to minimize trips to DRAM.
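The cost of those misses is captured by the textbook average-memory-access-time (AMAT) formula. Here's a minimal sketch in Python; every hit rate and latency below is an invented round number for illustration, not a measured figure for any DEC (or other) machine:

```python
# Average memory access time through a multilevel cache hierarchy.
# All hit rates and latencies are illustrative, not measured figures.

def amat(levels, dram_ns):
    """levels: list of (hit_rate, latency_ns), nearest cache first.
    Misses at each level fall through to the next; final misses go to DRAM."""
    total, p_miss = 0.0, 1.0
    for hit_rate, latency_ns in levels:
        total += p_miss * latency_ns    # every access reaching this level pays its latency
        p_miss *= (1.0 - hit_rate)      # fraction that falls through to the next level
    return total + p_miss * dram_ns

# Hypothetical machine: small fast L1/L2 on chip, big static-RAM L3 off chip.
cache_friendly = amat([(0.90, 2.0), (0.95, 6.0), (0.99, 30.0)], dram_ns=120.0)

# Same hierarchy, but a workload big enough to defeat every cache level:
cache_busting = amat([(0.50, 2.0), (0.50, 6.0), (0.50, 30.0)], dram_ns=120.0)

print(f"cache-friendly workload: {cache_friendly:.1f} ns/access")
print(f"cache-busting workload:  {cache_busting:.1f} ns/access")
```

Notice that the second figure is dominated by the DRAM term: once a workload falls through every cache level, raw execution speed matters far less than memory latency, so a fast-clock design spends proportionally more of its cycles stalled.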
The new SPECint95 test suite has been specifically designed to overload all caches, including DEC's 8 MB, and force numerous data fetches from DRAM. This slows down everyone's systems, not just DEC's. But since DEC uses the fast-clock approach, it loses the most cycles per miss, and the slowdown brings DEC's performance back to the pack. To overload an 8-MB cache (with enough margin to prevent DEC from simply switching to, say, a 32-MB cache) requires a really big data set. That's why it takes two days to run the new SPECint95 test suite.
If your application doesn't overflow conventional static-RAM caches, then a DEC machine will perform better against the competition than SPECint95 would indicate. If your application overflows conventional, smaller caches but doesn't overflow DEC's 8 MB, then you definitely want to run your application on a DEC system, all other factors being equal. If your application overflows all caches, then SPECint95 is a very good performance indicator.
In their attempts to provide us end users with excellent performance, and to one-up their competitors, CPU designers are constantly pushing into terra incognita. It's not possible to predict what strange beast is going to jump out of the bushes to gnaw on the chip-designers' ankles, or worse.
All leading-edge CPU designs are world beaters at the start of the project, else management would not commit the enormous piles of money needed to develop such a device. When (if) the CPU sees first working silicon and its performance can be measured, we then learn whether the PR puffery was accurate. Nobody ever sets out to build a slow CPU.
If AMD/NexGen's upcoming K6 proves as fast as the PR folk claim, I'll finally be able to run Win95 as fast as my 8-MHz Z80 used to run CP/M.