INTEL PENTIUM 4+ vs. AMD ATHLON XP
a Confidential Analysis, entitled: If you believe the benchmarks, consider the future, first....™

    By David Cohen
    market analysis, CSS  -  January 2002
    (this article will be updated from time to time)


     You've all heard: a war is on.  Armonk and Basking Ridge are backing little AMD.  Intel, in all its glory, is fighting on.
Yet something more subtle is going on beneath the sheets.  Look further before you rush to make a choice.
Both companies make exceptional products.  But there is a unique form of intellectual combat going on. A race.
Who will win?  Read on.

Here are the basic choices:

AMD Athlon vs. Pentium 4

AMD Athlon MP vs. Xeon

AMD Athlon XP vs. Pentium 4 Northwood

     On one hand, a CPU by AMD with 6-8 internal RISC micro CPUs, an advanced tri-level (actually a quad level) cache, with a big 128K front end, two Floating Point Units, and a lot of engineering that appears to have originated between IBM Armonk and Motorola/Bell Labs, designed to undo Intel's advantage with AMD carrying the ball.  (Our comments about IBM and Bell's involvement stem from personal knowledge of how the computer industry works, knowledge that may not be in the public domain. Basically, engineering from both concerns is available and for sale, when it suits them to sell it.)

     On the other hand, a CPU by Intel that is fairly tight: it has only an 8K L1 cache, and while it has 12 execution units, it is not clear that it is using them all to the maximum advantage today (and we underline the word: today).  Its greatest strengths: the ability to achieve high clock speeds in execution, bus and memory, and to stream data transmission.

     At the highest end of performance, the two are only 18% apart, with AMD taking the banner.  But is there another competition going on under the skin?  AMD's advanced architecture is very complex, and in order to increase performance, it had to increase that complexity, even though its internal execution units are reduced, based on RISC.  Meanwhile, Intel is using a more conservative approach, going for raw linear clock rates.  Is Intel beating AMD at the reduction game, by reducing the girth of its internal CPU architecture?  Intel is using a new technology that was predicted by CSS's chief scientist in 1984: originally called "CRISP", it is known as RCOIA (pronounced "Arkoia"), which stands for Reduced Complexity of Internal Architecture.  By leveraging "a little of this and a little of that" and pushing for extremely high data rates, will Intel unseat AMD in the benchmarks in the future?  Read on.  Maybe the benchmarks and ultimate performance are not so important, now that speed is so high today...

But is Intel "rope a doping" the AMD conglomerate?  Certain parts of the Intel chip certainly seem to be slower than normal: the lack of a barrel shifter, and of other features, appears to dodge age-old optimization techniques in exchange for a leaner and meaner machine.  Why is Intel taking the back road?  Read on.

a) Price vs. Performance. (Don't stop here.  Just winning the P v. P competition is not the only thing... Read on).

     This is somewhat complex, because the fact that AMD wins benchmarks does not mean it is inherently a better chip.  Our conclusion: AMD's CPUs are overpowering in their array of technology, winning benchmarks by throwing enormous capabilities at the problem.  Intel, by contrast, is giving up only 18% in performance to AMD's winning CPU, in exchange for a much more power-efficient, less overpowering array of technology, relying mainly on faster bus and memory bandwidth, and a more exotic, less stressed-out architecture to achieve the same thing.

     AMD has been underpricing its CPUs in order to gain market share.  Because it uses a so-called multi-issue RISC core similar to IBM's Power PC CPU (in fact, many of its features appear to derive from IBM's Power PC, which is now built by Motorola, who has an engineering relationship with AMD), AMD has also been under-rating its clocking performance, downgrading the 1.4GHz part and giving it only a 1900 rating.  Most RISC processors are based on at least a 1.7:1 clock ratio.  That is because most RISC CPUs can perform nearly 1 Instruction Per Clock Cycle, while a traditional 'CISC' like Intel's Pentium typically requires 4 or more clock cycles per instruction.  Intel tries to increase this yield with multiple execution units.  So does AMD.  On that basis, after much consideration, we believe that AMD's 1.4GHz CPU should be rated against a Pentium 4 2.5 GHz (2500) when the two are compared on a relative "Instruction Yield" basis.  Complicating matters, AMD has two floating point units to Intel's one (this 'thingie' performs high-end mathematical functions) and more basic issue execution units (8-12, depending upon how you look at it) than Intel has in the P4 (5-7).
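
     The clock-equivalence reasoning above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function name and the idea of multiplying a clock speed by a single "yield ratio" are ours, introduced to make the arithmetic visible; they are not a published AMD or Intel rating method.

```python
def equivalent_clock_ghz(clock_ghz: float, yield_ratio: float) -> float:
    """Scale a clock speed by an assumed instruction-yield ratio (hypothetical model)."""
    return clock_ghz * yield_ratio


ATHLON_CLOCK_GHZ = 1.4
RISC_RATIO_FLOOR = 1.7  # the "at least 1.7:1" clock ratio quoted in the text

# Pentium 4 clock the Athlon would match at the minimum 1.7:1 ratio
floor_equivalent = equivalent_clock_ghz(ATHLON_CLOCK_GHZ, RISC_RATIO_FLOOR)

# Ratio implied by pairing the 1.4 GHz Athlon with a 2.5 GHz Pentium 4
implied_ratio = 2.5 / ATHLON_CLOCK_GHZ

print(f"At 1.7:1, a 1.4 GHz Athlon ~ {floor_equivalent:.2f} GHz Pentium 4")
print(f"A 2.5 GHz pairing implies a ratio of {implied_ratio:.2f}:1")
```

     At the 1.7:1 floor the Athlon lands a little under 2.4 GHz; the 2.5 GHz pairing corresponds to a ratio of roughly 1.79:1, which is consistent with "at least" 1.7:1.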

     As a result, naturally, the raw power AMD is throwing at the problem yields a faster CPU for less when compared with a 1.9GHz P4.  But is it necessarily a better chip?  We don't believe so.  And we believe AMD has selected the 1.9GHz rating for its 1.4GHz Athlon XP for some very good reasons, reasons that obscure why it has to throw such enormously combative capabilities at the problem: it is hitting a speed vs. yield advantage wall.  RISC CPU chips do have a problem with that, as evidenced by the lag RISC CPUs have seen in achieving the highest clock rates.

     Of the two, the Pentium 4 produces more horsepower "per instruction yield"; to compare them properly, you'd have to compare a 1.1GHz Athlon Thunderbird (266MHz bus) to an Intel Pentium 4 2.0 GHz CPU.  However, since AMD is willing to sacrifice all that extra engineering in order to accomplish the impossible, beating Intel at its own game, AMD WINS THE PRICE VS. PERFORMANCE competition hands down.  Its most expensive CPU to date is less expensive than a mid-range Intel Pentium 4.

     We believe that AMD has used marketing strategy to rate its CPUs in a way that ensures that "Tom's Hardware" and other benchmarkers find what AMD itself found: the 1900 AMD Athlon XP is faster than the Intel Pentium 4 2000 at application execution.  But is it the equal of an Intel Pentium 4 2000?  No.  We feel that in terms of technology, it is throwing the equivalent of an Intel Pentium 4 2600, in execution units and RISC ALUs, at the problem.  Based on today's Willamette core performance, such a Pentium 4 would be about 28% faster than today's 2000, which would make it about 10% faster than the EQUAL AMD Athlon XP.  That is about what we think is the case: Pentium 4s are 10% faster than the equivalent AMD CPU, but AMD does not rate its CPUs correctly in a clock comparison sense.  And we feel that AMD, whose CPU design is excellent, is also grossly undercharging for its work.  It could increase its prices 30% and re-rate its 1900 AMD Athlon XP as a 2200 AMD Athlon XP, and it would still be 8% faster.
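
     The "about 10% faster" figure can be checked with a short calculation.  The sketch below is ours and rests on one loud assumption: that the 18% top-end gap quoted earlier in this article also describes how much faster the 1900 Athlon XP is than the Pentium 4 2000.  The variable names are illustrative only.

```python
# Relative application speed, normalized so the Pentium 4 2000 = 1.00
P4_2000 = 1.00

# ASSUMPTION: the Athlon XP 1900 leads the P4 2000 by the article's 18% figure
ATHLON_XP_1900 = P4_2000 * 1.18

# From the text: a hypothetical Pentium 4 2600 would be about 28% faster
P4_2600 = P4_2000 * 1.28

# How far ahead the "equal technology" Pentium 4 would be of the Athlon
p4_advantage = P4_2600 / ATHLON_XP_1900 - 1.0

print(f"P4 2600 vs Athlon XP 1900: {p4_advantage:.1%} faster")
```

     Under that assumption the lead works out to roughly 8.5%, in the neighborhood of the "about 10%" stated above.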

     But it won't do that.  See below at "the Future" (Intel is about to put the pedal to the floor...)

     But by pushing the envelope, throwing exotic multi-stage components at the job of building a faster CPU, AMD is beginning to run out of steam with its Athlon.  A 2.0 GHz version will load its components with more work than a 3.8 GHz Pentium 4.  But Intel claims it can reach 10 GHz with Pentium 4.  AMD probably never will exceed 2.5GHz.
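
     For readers keeping score, the Athlon-to-Pentium 4 pairings made in this section each imply a clock ratio, and they can be tabulated.  The sketch below uses only the clock figures quoted in this article; the final step, which applies the largest ratio to a 2.5 GHz Athlon, is our own extrapolation for illustration and not a claim made by either company.

```python
# (label, Athlon clock in GHz, Pentium 4 clock in GHz it is paired with in the text)
comparisons = [
    ("Athlon Thunderbird 1.1", 1.1, 2.0),
    ("Athlon XP 1.4", 1.4, 2.5),
    ("Athlon 2.0 (projected)", 2.0, 3.8),
]

# Pentium 4 GHz per Athlon GHz implied by each pairing
implied_ratios = {label: p4 / athlon for label, athlon, p4 in comparisons}

for label, ratio in implied_ratios.items():
    print(f"{label}: {ratio:.2f}:1")

# ASSUMPTION (ours): apply the largest ratio to the 2.5 GHz Athlon ceiling
ATHLON_CEILING_GHZ = 2.5
ceiling_equivalent_ghz = ATHLON_CEILING_GHZ * max(implied_ratios.values())
print(f"2.5 GHz Athlon ~ {ceiling_equivalent_ghz:.2f} GHz Pentium 4")
```

     The ratios run from about 1.79:1 to 1.90:1.  Even at the high end, a 2.5 GHz Athlon would stand in for a Pentium 4 of about 4.75 GHz, well short of the 10 GHz Intel claims it can reach.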

     So, AMD has its work cut out for it.  Furthermore, AMD is vastly underpricing its chips just to grab market share from Intel.  This can't go on forever; the economy won't allow it.

    So, while Intel is producing a better CPU in terms of per-widget instruction yield, it is overpricing its chips relative to AMD, ensuring it will be more profitable, while AMD continues to under-rate the clock speed of its CPUs.

b) The Future.  (Here's where the answer lies...)

    Intel is about to release its next-generation Pentium 4 CPUs.  They have two marked improvements.  The first is that in addition to dual streaming their CPU, they double pump each stream.  Properly used, that could increase the throughput of the new Pentium 4s by as much as 33% or so.  Secondly, the core used in this new Pentium 4 is based on .13 micron gate sizes; that reduction in size by 1/7th increases the internal execution rate of each instruction, potentially allowing Intel to divert resources in its XEON version of the Pentium 4 to such a degree that it can actually 'pretend' its CPU is TWO CPUs (emulating a dual processor) with about 77%-95% efficiency.  This technique, called Hyperthreading, could leave AMD pounding dust if it does not respond, for dual virtual execution at those rates of efficiency means that a Hyperthreaded Intel CPU would be A WHOLE LOT FASTER than any equivalent AMD, even using today's rating system.  The Pentium 4 can also do this, but to a lesser degree.  By adding a new 512K backend L2 cache to both versions of the chip, and adding a few more bits and pieces to its trace cache, Intel has enabled its next generation 2.0 GHz Pentium 4 to increase throughput by as much as 55%.

    That makes the 2.0 GHz Pentium 4 Northwood (that is the name for the new CPU) the POTENTIAL EQUAL of a 3.0 GHz Pentium 4 based on the older Willamette architecture, which has AMD very worried.
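
    The Northwood-to-Willamette equivalence is simple arithmetic, sketched below.  The helper name and the model (clock speed multiplied by one throughput gain) are our own simplification for illustration; real gains depend on the workload and, as noted in this article, on software optimization.

```python
def willamette_equivalent_ghz(northwood_ghz: float, throughput_gain: float) -> float:
    """Willamette clock needed to match a Northwood with the given throughput gain.

    Hypothetical model: performance = clock * (1 + gain).
    """
    return northwood_ghz * (1.0 + throughput_gain)


NORTHWOOD_GHZ = 2.0
BEST_CASE_GAIN = 0.55  # the "as much as 55%" figure quoted above

best_case = willamette_equivalent_ghz(NORTHWOOD_GHZ, BEST_CASE_GAIN)

# Gain needed for a 2.0 GHz Northwood to match exactly a 3.0 GHz Willamette
gain_for_3ghz = 3.0 / NORTHWOOD_GHZ - 1.0

print(f"Best case: {best_case:.1f} GHz Willamette equivalent")
print(f"Gain needed to equal 3.0 GHz: {gain_for_3ghz:.0%}")
```

    At the full 55% the 2.0 GHz part comes out slightly ahead of a 3.0 GHz Willamette; a 50% gain would make them exactly equal.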

    AMD fans, have no fear: of course, none of the extra performance Intel is gleaning out of Northwood is AUTOMATICALLY provided to you.  Microsoft will need to implement optimization in its Windows XP operating systems in order for the next generation Northwood Pentium 4 to see many performance gains at all!

    But the use of a very efficient, if incomparable, architectural design for its new Pentium 4 has achieved one major attraction: Intel can clock these new CPUs up over 3.5 GHz if it elects to.

    And so it looks to us like Intel may be "rope a doping" AMD.  Who knows.  AMD and its partners in research have faced challenging trials before.  Intel appears to like the competition, and it is rising to it by taking a more conservative design attitude, while introducing some truly novel ideas, like Hyperthreading and Reduced Complexity of Internal Architecture (RCOIA, pronounced "Arkoia"), in its Pentium 4.  That is something AMD has clearly never heard of, preferring to think solely in terms of RISC... which bespeaks the Sun Sparc, IBM Power PC origins of its kind of design.

    Will this become a battle of RCOIA vs. RISC in the long run?  It looks that way today.

    And meanwhile, Intel has designed a next generation workstation/server chip based on the prior works of HP and Digital Equipment, called the Itanium, a/k/a IA-64.  Its performance is extraordinary, blowing the socks off of the Pentium 4 and the AMD Athlons, even with only an 800MHz maximum clock.

    Hmmm... perhaps Intel does know what it's doing?  The bullet it appears to be dodging is the ultimate wall on its clock rates, the ultimate parallelism of its internal architecture.  While AMD now gets only slight increases out of the latter, it is hitting a wall on the former.  Intel is not.  We reiterate: perhaps Intel does know what it's doing?

----