Archive for the ‘Intel’ Category

Spiceworks adds remote power management

December 6th, 2009 Justin James No comments

Spiceworks and Intel have put together a plugin for the Spiceworks network management product that allows for remote power management. This is a useful way for companies to save valuable energy dollars in these tough economic times.

J.Ja

Categories: Intel, Spiceworks Tags:

Intel Clarkdale dual-core Westmere system at 27.6W system power

September 29th, 2009 George Ou 3 comments

This mini-ITX system, based on Intel’s next-generation Clarkdale dual-core “Westmere” 32nm CPU with a 45nm GPU and memory controller on the same CPU package, has a system idle power consumption of 27.6 watts. That seems pretty outrageously efficient for a system that performs about as well as a quad-core Q9600 2.66 GHz processor. This is probably the first x86 CPU with a built-in graphics controller, and it uses a 32nm Westmere core, which is a die-shrunk Nehalem core with some modest architectural enhancements.

Now if we had switchable graphics support on the desktop, then we’d have a gaming system that could idle at 27.6 watts compared to a normal gaming system that idles at 110 watts because the GPU can’t be turned off. It would probably be the most energy efficient gaming system in the world.

Categories: Energy efficiency, Intel Tags:

DFI’s two systems on one motherboard

September 19th, 2009 George Ou 10 comments

This is a VERY cool new motherboard from DFI. It features a P45 chipset motherboard along with an NVIDIA Ion motherboard and Intel Atom CPU on a single board with an integrated gigabit Ethernet switch. It comes with a USB and audio KVM switch as well. The system allows you to shut down or suspend your high performance P45 system and leave the NVIDIA ION chipset and Intel Atom CPU running in low power mode. The video quoted 30 watts which isn’t all that low power unfortunately.

Only downside to this that I can see is the $399 list price. Hopefully more motherboard makers will build a product like this and get the price to come down. What I want is a P55 chipset motherboard and the next generation PineTrail-D Atom system as the second system.

Two AMD Shanghai processors beaten by a single Intel i7-965

November 18th, 2008 George Ou 68 comments

AMD had much to celebrate last week as they managed to beat Intel’s Nehalem-EP to the mainstream two-socket server market with Shanghai. AMD took a lead in SPECjbb, SPECweb, SPECfp, and Virtualization Nested Paging for the server market and it looked like they might have had 2 months of breathing room before Nehalem-EP arrives in the two-socket server market. But there was an unexpected party crasher today when a single Intel Nehalem i7-965 3.2 GHz single-socket processor for the desktop market decided to take on two brand new AMD Shanghai processors in the server benchmarks and win.

In the server space, the most commonly cited benchmark is general purpose integer performance which is measured by SPECint.  The table below shows the latest results from SPEC.


CPU                   GHz  Socket  Cores  Threads  SPECint_rate_base2006  SPECfp_rate_base2006
Intel Nehalem i7-965  3.2  1       4      8        117                    82.9
AMD Shanghai 2384     2.7  2       4      8        113                    105

AMD and Intel have been neck and neck in the server market in the last 4 years but never has a single top-end CPU from one competitor beaten two top-end processors from the other competitor. Even on SPECfp high performance computing, where AMD has dominated for the last 4 years, the single i7-965 Nehalem comes relatively close to the performance of two Shanghai processors, which almost guarantees dominance for Nehalem-EP two-socket servers when they arrive in a few months.

The Intel Nehalem i7-965 is showing a hint of what’s to come when Nehalem-EP arrives in two-socket configuration. In the past, Intel had difficulty scaling sockets because they remained on a single North Bridge memory controller and the Front Side Bus (FSB), but those days are gone with Nehalem’s integrated memory controller and new QuickPath interconnect architecture. As a result, the performance of Nehalem-EP is expected to scale extremely well because the number of memory controllers (built in to the Nehalem die) and the number of DDR3 memory channels will double. This means we may see a two-socket Nehalem-EP server crack 200 points on SPECint_rate_base2006, which is a massive leap for the mainstream server market.

Update 11/23/2008 – TechRadar has a first look at Nehalem-EP dual-socket and the performance is off the charts.

Categories: AMD, Benchmarks, Hardware, Intel, Virtualization Tags:

The facts about AMD’s ACP power rating

November 14th, 2008 George Ou 20 comments

One issue that I’m frequently asked about is AMD’s alternative energy efficiency rating called Average CPU Power (ACP).  AMD created this new standard on its own last year and despite controversy, they’ve managed to get the media to accept this new definition of energy efficiency.  AMD has told the press that AMD’s ACP rating is equivalent to Intel’s TDP rating and I know this because this is what AMD spokesperson John Taylor told me when I was Editor at Large at ZDNet.  Here’s an excerpt of what he emailed me on September 10th 2007:

“We believe that for the datacenter customer, AMD ACP is a more useful metric when configuring/budgeting. As Intel TDP is the only available metric for Intel, it is the most comparable to AMD ACP. The trick is: how does intel formulate its TDP? My sense is it is defined very closely to AMD ACP, which reinforces a “yes” answer to your question. But I’d prefer Intel confirm that.”

Well I’ve not only spoken to Intel, which emphatically denies this, I know for a fact that this comparison is not correct and I’ve also gotten David Kanter (one of the leading microprocessor analysts in the world) to corroborate this. I asked David Kanter to review AMD’s claim that Intel’s TDP rating was “most comparable” to AMD ACP and he explained:

“It’s pretty hard to justify the comparison between the two from a technical perspective.  AMD’s ACP is defined in a very different way from Intel’s TDP, according to my understanding.”

We can easily verify that AMD’s ACP rating is not comparable to Intel’s TDP rating by looking at the actual performance of the newest AMD Shanghai based servers versus Intel’s current servers. When we look at the official published SPECpower measurements for an AMD “Shanghai” 2384 2.7 GHz system with an ACP rating of 75 watts and an Intel L5430 with a TDP rating of 50 watts, with comparable components in the rest of the server, we would expect a power difference of roughly 50 watts (25W per processor) if AMD’s claim that ACP is most comparable to Intel’s TDP rating were true. But according to the official SPECpower benchmarks, which are optimized for low power consumption, the Intel L5430 server peaks at 161 watts while the AMD 2384 based server peaks at 264 watts.

That’s more than a 100 watt delta and when we account for the fact that the Intel server has an additional North Bridge memory controller to deal with, the actual difference between the CPUs is even greater than 100 watts.  We can negate the fact that the AMD server has two more memory DIMMs which consume an additional 8.4 watts of power because AMD uses hard drives that use 6 watts less power than the Intel system.  This strongly suggests that an AMD TDP rating of 95 watts is far more likely to explain the 100 watt more power consumption than the Intel system with 50 watt TDP processors, so calling the “Shanghai” 2384 processor a 75 watt part simply doesn’t reflect the actual efficiency of the chip.
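The arithmetic behind that comparison can be sketched in a few lines of Python. The wattages are the ones quoted above; the function name and the simplification that everything except the CPUs draws the same power are my own assumptions, so the North Bridge, DIMM, and hard drive differences discussed in the text are ignored here.

```python
# Rough sanity check of the ACP-vs-TDP argument, using the figures quoted above.
# Assumption (not from the SPECpower reports): both servers are two-socket
# systems and everything except the CPUs draws roughly the same power.

SOCKETS = 2


def expected_delta(amd_rating_w, intel_rating_w, sockets=SOCKETS):
    """Power gap we would expect if the two ratings were directly comparable."""
    return sockets * (amd_rating_w - intel_rating_w)


intel_peak_w = 161  # Intel L5430 server, SPECpower peak watts
amd_peak_w = 264    # AMD 2384 server, SPECpower peak watts

measured = amd_peak_w - intel_peak_w
print("measured delta:", measured, "W")
print("expected if 75 W ACP were like 50 W TDP:", expected_delta(75, 50), "W")
print("expected if the AMD part were rated 95 W:", expected_delta(95, 50), "W")
```

The measured gap lands much closer to the 95 watt case than to the 75 watt case, which is the point the paragraph above is making.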

Based on what the experts say and what the evidence suggests, AMD claiming that their ACP rating is comparable to Intel’s TDP rating is simply incorrect and the ACP rating is effectively overstating the energy efficiency of AMD processors. This is the equivalent of a car company that came up with its own Miles Per Gallon (MPG) fuel efficiency rating to inflate its actual MPG rating.

Now to be clear, TDP was never meant to be an energy efficiency rating but it is in effect exactly that when it’s used in the context of advertising and press releases. TDP is actually an engineering metric for server manufacturers to figure out what kind of cooling they have to design in to their system to accommodate a CPU. A far more appropriate measure that people should be looking at for servers is the SPECpower rating. But if the media insists on quoting CPU wattage ratings, they should use a consistent measurement and the only one that is comparable and accepted by the entire industry is TDP.

Categories: AMD, Benchmarks, Energy efficiency, Intel Tags:

AMD Shanghai 45nm launch – Server benchmarks roundup

November 14th, 2008 George Ou 4 comments

AMD launched its 45nm Shanghai processors today for the server market ahead of Intel’s Nehalem processor launch.  AMD lists a series of benchmarks here but they omit many of the better results from Intel.  To get the full official results, here are the benchmarks based on the best scores available from AMD and Intel as of November 13th 2008.

CPU          GHz    Socket  Cores  SPECint  SPECfp  SPECjbb  SPECweb  SPECpower  SAP
Intel X5482  3.2    2       4      156      93.4    -        -        -          -
AMD 2384     2.7    2       4      136      118     352700   -        860        -
Intel L5430  2.66   2       4      -        -       -        -        1135       -
Intel X5470  3.33   2       4      -        -       316728   -        -          -
Intel X7460  2.66   4       6      294      156     -        -        -          -
AMD 8384     2.7    4       4      249      210     -        -        -          -
AMD 2356     2.3    2       4      -        -       -        30007    -          -
Intel X5460  3.166  2       4      -        -       -        29591    -          -
Intel X7460  2.66   8       6      -        -       -        -        -          9200
AMD 8384     2.7    8       4      -        -       -        -        -          7010

It appears that AMD has made some important gains and it has taken the lead in SPECjbb, SPECweb, expanded its lead in SPECfp, and Virtualization (due to Nested Paging).  AMD still trails in SPECint, SPECpower, and SAP but this is an important victory for AMD which has been plagued with delays in 2007 and 2008 with its previous Barcelona processor.  Shanghai is a major milestone for AMD because it required a shift to a whole new 45nm immersion lithography process and it shows that AMD can launch a product on time and put Barcelona behind them.

However, Intel is expected to launch its Nehalem-EP processors for the mainstream two-socket server market within a few months and Nehalem is expected to be a huge jump in performance on the server platform. While the new triple-channel DDR3 unbuffered memory subsystem doesn’t do too much for the Intel i7 Nehalem desktop processors, it is expected to unleash a huge increase in performance for Intel Nehalem servers.

Intel’s Penryn class of processors launched last year will be the last generation of Intel processors to use the Front Side Bus (FSB) and a single North Bridge memory controller located on the motherboard. The FSB and single North Bridge memory controller weren’t a problem for most of Intel’s product line but they significantly hampered Penryn’s performance in the two-socket market at higher clock speeds. But this didn’t matter much since AMD had trouble launching its Barcelona products early enough and at high enough clock speeds to threaten Penryn on the high end, and only now are the newest AMD Shanghai processors competing head to head with Intel Penryn. Some people wondered why Intel stuck with the FSB architecture for so long, but the timing seems to have been appropriate given the outcome in the last two years.

Intel’s Nehalem will be too fast to run on FSB architecture which is why Intel designed it with QuickPath.  Nehalem will have no such memory bandwidth limitations as it transitions to QuickPath architecture with a memory controller on each microprocessor.  Nehalem also catches up with AMD on nested paging support for improved virtualization, but the late timing doesn’t seem to have hurt Intel given the fact that virtualization hypervisors are only now beginning to support nested paging.  So the race is on to see how quickly AMD can ramp up Shanghai processors and how long it takes Intel to launch Nehalem.

Categories: AMD, Benchmarks, Intel, Processors, Virtualization Tags:

SAP benchmarks for AMD Shanghai released

November 12th, 2008 George Ou 6 comments

I just came across some certified SAP server benchmarks for the soon-to-be-launched AMD 45nm Shanghai processor in comparison to some Intel Tigerton 65nm and Intel Dunnington 45nm processors.

System                CPU           Cores  Sockets  Clock     Score
HP ProLiant DL785 G5  Opteron 8384  4      8        2.7 GHz   7010
IBM System x3950 M2   Xeon X7350    4      8        2.93 GHz  6615
IBM System x3950 M2   Xeon X7460    6      8        2.66 GHz  9200

While I’m not 100% certain that the Opteron 8384 is a Shanghai processor, the fact that it has 128 KB L1 cache and 512 KB L2 cache per core, and 6 MB L3 cache per processor is a pretty good indicator that it is Shanghai.

The Intel Dunnington server probably holds the edge because of its single die monolithic 6-core CPU whereas the Shanghai only has 4-cores.

Also, more Shanghai results here.

Categories: AMD, Benchmarks, Intel Tags:

AMD submits suboptimal SPECpower benchmarks for Intel

November 9th, 2008 George Ou 25 comments

Performance benchmarks are the equivalent of the Indianapolis 500 for the computer industry, if not more important.  Rarely are the top performance numbers attainable in the real world, but winning is a critical symbolic victory in the war of perception.  The winner of the benchmarks will be perceived to have a technological lead across their entire line of products while the loser is perceived to be inferior across their entire line of products.

Benchmarks are like races and the winner is always determined by the best performance. The only fair way to conduct benchmarks is to have each player put forth their best entry to achieve the highest scores possible within a common set of rules and parameters. By this measure, AMD’s recent submission of SPECpower energy efficiency benchmarks on behalf of Intel servers, which portrays Intel in a sub-optimal light while ignoring superior scores for Intel, is highly inappropriate.

AMD says that their system gets a score of 731 while the Intel system that AMD submitted gets a score of 561.  AMD claims this is a legitimate comparison because the systems share many commonalities and they defend their behavior by saying:

“if we were trying to show worst case scenario we would have turned off Power Management on the Intel server”

But the fact that AMD could have turned in even worse performance numbers for the Intel system is totally irrelevant.  If anything, submitting plausible performance numbers on the Intel system is far more insidious because it is a more effective deceit.

The obvious problem with AMD’s explanation is that the Intel system submitted by AMD shows a poorly optimized hardware configuration for Intel in terms of energy efficiency.  The system uses Fully Buffered memory DIMMs which are notoriously inefficient for the Intel system when the most efficient Intel systems use unbuffered DIMMs.  Intel launched the San Clemente 5100 series chipset exactly one year ago which uses the same unbuffered memory as AMD.  While it’s true that Intel initially went with Fully Buffered memory two years ago, it was obviously a mistake given the power consumption and Intel quickly fixed that mistake last year.  With Nehalem-EP two-socket servers coming at the end of this year, Intel will only use ECC unregistered or registered DIMMs and they will completely shun FBDIMMs.  The bottom line is that anyone looking to build an efficient Intel-based server today should use unbuffered memory and AMD conveniently forgot to do that.

The other possible problem in AMD’s explanation is that they used the “same JVM command line options” for the Server Side Java (SSJ) benchmark.  While this sounds like a fair comparison, it’s common for different CPUs from different vendors to require different command line options to achieve optimum results.  This may explain why the AMD-submitted Intel server was 5% slower than comparable Intel systems submitted by other vendors.  Combined with the suboptimal Intel hardware and possibly suboptimal software configuration, we can see the likely reason why the AMD-submitted Intel system did so poorly.

To get an accurate picture of what’s really going on, we need to look at the best possible SPECpower scores for AMD and Intel to determine who the actual winner is.  The table below compares the official SPECpower_ssj2008 energy efficiency benchmarks of the top dual-processor servers from Intel and AMD.

System (SPECpower_ssj2008)  Vendor           DIMMs  Peak SSJ  Peak watts  Score
Intel L5430 2.66 GHz        PowerLeader      2      285,970   161         1,135
Intel L5420 2.5 GHz         SuperMicro       2      279,209   174         990
Intel L5420 2.5 GHz         HP               2      282,281   189         930
Intel L5430 2.66 GHz        Fujitsu Siemens  4      293,162   220         827
AMD 2384 2.7 GHz            AMD              4      338,577   264         860
AMD 2384 2.7 GHz            AMD              4      335,116   264         827

Because the SPECpower rules don’t really specify the minimum number of memory DIMMs that a server should have, AMD should have submitted servers with 2 DIMMs like everyone else to get the best scores. Instead, AMD submitted servers with 4 memory DIMMs, which gives them a handicap of 3.58 watts at idle to 8.42 watts at peak. However, AMD neutralized that handicap by using the newest Western Digital GreenPower 3.5″ hard drive, which saves 4 watts at idle to 6 watts when active compared to the hard drives used by the other systems.

But for the sake of comparison, I estimated the scores of the 3 Intel systems that used only 2 DIMMs and added 3.58 watts in idle and gradually increased up to 8.42 watts at peak Server Side Java (SSJ) loads to simulate power consumption for a 4-DIMM server.  But when these servers get upgraded to 4 DIMMs, they have higher memory performance which translates to higher overall performance.  Based on the peak SSJ scores in the table above, I estimated a 2.5% boost in peak SSJ performance which slightly counteracts the negative effects of the higher power consumption in terms of performance per unit energy.  Based on this estimate (which should NOT be taken as official SPECpower_ssj2008 scores), I calculated that the top three systems in the table above would have achieved a score of 1006, 889, and 836 which is still higher than the AMD servers.  But if those top three Intel servers had used the same energy efficient hard drives used in the AMD servers, the score reverts back to something similar to the unadjusted numbers.
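For readers who want to try a similar adjustment, here is a minimal sketch of the method in Python. It assumes the overall SPECpower_ssj2008 score is the sum of ssj_ops across the measured load levels divided by the sum of average power across those levels plus active idle. It also assumes the extra DIMM power ramps linearly with load between the 3.58 W and 8.42 W figures above (the text only says "gradually") and that the 2.5% boost applies at every load level. The per-level numbers in the example are made up for illustration and are not taken from any published result.

```python
def overall_score(levels):
    """Overall ssj_ops per watt.

    levels: list of (load_fraction, ssj_ops, avg_watts) tuples, where
    active idle is represented as load_fraction 0.0 with 0 ssj_ops.
    """
    total_ops = sum(ops for _, ops, _ in levels)
    total_watts = sum(watts for _, _, watts in levels)
    return total_ops / total_watts


def simulate_four_dimms(levels, idle_extra_w=3.58, peak_extra_w=8.42,
                        perf_boost=0.025):
    """Add the extra DIMM power and the estimated performance boost.

    Assumption: extra power is interpolated linearly between idle and peak.
    """
    adjusted = []
    for load, ops, watts in levels:
        extra_w = idle_extra_w + (peak_extra_w - idle_extra_w) * load
        adjusted.append((load, ops * (1 + perf_boost), watts + extra_w))
    return adjusted


# Hypothetical three-level run (a real run has ten load levels plus idle).
example = [(1.0, 280000, 160.0), (0.5, 140000, 130.0), (0.0, 0, 100.0)]

before = overall_score(example)
after = overall_score(simulate_four_dimms(example))
print(f"2-DIMM score: {before:.0f}, simulated 4-DIMM score: {after:.0f}")
```

With these made-up inputs the simulated score drops, because the added watts outweigh the 2.5% gain in throughput. That is the same direction as the estimates above.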

UPDATE 11/13/2008 – The AMD results were actually not Barcelona, but AMD Shanghai results.  I had thought they were Barcelona but I didn’t realize that I was looking at yet-to-be-launched Shanghai performance numbers.

In conclusion, AMD has made huge strides to nearly close the SPECpower gap, but they’re behaving inappropriately by comparing their products to suboptimal benchmarks that they themselves submitted.  That’s unfortunate because this new controversy has overshadowed the huge progress made by AMD.  Had AMD launched these mid 2 GHz Barcelona processors on time a year earlier, they would have been extremely competitive all through 2008 but it was not to be and they suffered for it.

AMD made some huge clock-for-clock core-for-core performance gains with their quad-core Barcelona chips in Server Side Java (SSJ) performance compared to their older dual-core Opteron chips. Even when I factor out the clock speed difference, a Barcelona quad-core 2.7 GHz system is still 3.14 times faster than an Opteron dual-core 2.4 GHz system in Server Side Java performance. These gains have allowed AMD to become very competitive against Intel’s aging Penryn chips although AMD cannot claim the title. It’s also interesting to note that AMD Barcelona made huge improvements in web server performance and it is beating Intel Penryn servers on dual-socket SPECweb_2005. However, Intel Penryn class CPUs still win by a large margin in the single-processor server market.

The reason for this disparity between single- and dual-socket servers is that Intel’s memory architecture is constraining their dual-socket performance, but that limitation will soon disappear with Intel’s Nehalem Microarchitecture server CPUs which should launch by the end of this year.  Intel’s Nehalem CPUs have completely closed the memory performance gap with QuickPath architecture and they’ve even managed to leapfrog AMD’s memory architecture with 50% more memory channels and higher performance DDR3 memory.  Coupled with the improvements in the Nehalem execution engine, there is little doubt that Intel will be regaining a comfortable lead in the Server and Desktop markets.

AMD will narrow that gap with their Shanghai processors if they can launch on time but few analysts predict Shanghai will come close to beating Nehalem.  Whether or not AMD can launch Shanghai on time or get close enough to Intel Nehalem to be reasonably competitive remains to be seen.

Tomshardware botches Intel Atom energy efficiency tests badly

July 14th, 2008 George Ou 46 comments

Update 8/20/2008 – I need to fix my mistake because I quoted some initial informal numbers from Jack that didn’t include the 3.5″ HDD or dual-threaded tests. Now that I have his full data set, I need to make some corrections. I apologize for my mistake and I will fix it below, but my insistence that Tomshardware is off by a LOT has not changed, though I was wrong about saying the Intel 945GC/Atom board could run on 802.3af. Tomshardware, on the other hand, not only did NOT correct the mistake despite acknowledging my emails to them, they went ahead and published more results using an 850W power supply, which is horrendously stupid.

For those who are going to say that Thorn, who criticized me in the comment section, was right all along despite the fact that he did not run ANY tests, he was still wrong by a long shot because he insisted that the Tomshardware numbers aren’t that far off. It turns out that Tomshardware was off by 25.6W from the Sparkle 220W power supply and my hasty post based on informal data that mistakenly excluded the hard drives was off by 11W. Let this be a lesson to me for posting informal data and once again, I apologize to my readers for my mistake.

After praising Tomshardware for doing a good job fixing their flash storage efficiency article, I must point out that Tomshardware made another horrendous error by claiming that an Intel D945GCLF system with an Intel Atom 1.6 GHz processor consumes 59W idle power. Mr. Dandumont, who authored the article, claims that he used an “80 Plus” power supply, implying that he was getting at least 80% efficiency on the PSU (Power Supply Unit), but that is a huge mistake.

UPDATE 7/21/2008 – Tomshardware’s Senior Editor Matthieu Lamelot has responded that they used the Tagan EasyCon U15 530 W power supply.  My 600W guess was fairly close and it means that Tomshardware loaded their PSU to roughly 3.6% load and that translates to roughly 30% efficiency which makes their benchmarks very inaccurate and misleading.  Tomshardware should at least test with a good 80 Plus 220W PSU but ideally they should test with a sub-100W PicoPSU.

The 80 Plus rating only claims greater than 80% efficiency at loads between 20% and 100% of rated output power, and my guess is that Mr. Dandumont used a 600+ watt PSU, which means he was likely running it at less than 3% load. A 3% load on a PSU translates to a horrendous efficiency of less than 30%.

Tomshardware claims that the same system with a 7200 RPM 3.5″ hard drive consumes 59W idle and 62W peak using an energy efficient power supply but this isn’t even in the ballpark in terms of accuracy.  Tomshardware needs to correct this error and start using some more appropriate and smaller power supplies for testing computers that draw less than 50 watts of output power.

Update 8/20/2008 – My PhD friend Jack who is a very knowledgeable and meticulous tester tested the Intel D945GCLF with a Sparkle SPI220LE 220W PSU and got a measurement of 34W idle and 36.4W peak input power consumption.  Since the “80 Plus” SPI220LE gets around 73% efficiency at 19.2W output load according to Silent PC Review, we can reasonably estimate at 75% efficiency that the entire system actually consumes ~27W output power from the PSU.

Without the hard drive, the system peaked at 27.2W, which means the output power from the PSU would be ~20W, which is too much for 802.3af PoE. The Intel 945GC chipset is unfortunately a bit power hungry, so when the next version of Atom arrives with an on-die controller and graphics built in to the CPU, we can hopefully see the output power requirement dive down below 10W, but until then, 802.3af is out of the question.

The SPI220LE is above 80% efficiency when the output load level is above 20% and here we’re only at around 8.1% load level so the efficiency drops.  I’ve done some testing with the Sparkle 55W Open Frame PSU and found that it runs at around 80% efficiency for the Intel 945GC/Atom board despite the fact that it’s not 80 Plus certified.  This is because the 55W PSU is running at optimum load levels.
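The conversions used throughout this post are simple enough to sketch in Python. The 75% efficiency figure is my estimate from above, the 19.2W data point is the Silent PC Review load level mentioned earlier, and the function names are just for illustration. The 12.95W constant is the power an 802.3af powered device can count on after cable losses.

```python
# PSU bookkeeping: wall (AC input) power vs. what the system actually draws
# from the PSU (DC output), and how heavily the PSU is loaded.

POE_8023AF_DEVICE_LIMIT_W = 12.95  # power available to an 802.3af powered device


def dc_output_watts(ac_input_w, efficiency):
    """Estimate DC output power from measured wall power and PSU efficiency."""
    return ac_input_w * efficiency


def load_fraction(dc_output_w, psu_rating_w):
    """How much of the PSU's rated output is actually being used."""
    return dc_output_w / psu_rating_w


peak_out = dc_output_watts(36.4, 0.75)    # peak, with the 3.5" hard drive
no_hdd_out = dc_output_watts(27.2, 0.75)  # peak, without the hard drive

print(f"peak output with HDD: {peak_out:.1f} W")
print(f"peak output without HDD: {no_hdd_out:.1f} W")
print(f"fits within 802.3af: {no_hdd_out <= POE_8023AF_DEVICE_LIMIT_W}")
print(f"19.2 W on a 530 W PSU: {load_fraction(19.2, 530):.1%} load")
print(f"19.2 W on a 220 W PSU: {load_fraction(19.2, 220):.1%} load")
```

The same two functions make it easy to see why a 530W supply is a poor choice for this board: the load fraction sits far below the 20% floor where the 80 Plus efficiency claim starts to apply.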

Categories: Benchmarks, Energy efficiency, Intel, Via Tags:

Beware of Intel NIC driver updates

June 10th, 2008 Justin James 7 comments

I had a pretty harsh experience tonight. I have one of Intel’s server grade NICs in my ISA server at work. It has a lot of IP addresses bound to the external adapter. We updated the NIC’s driver due to some odd behavior we were seeing (some of the ports were not being detected sometimes after a cold boot). Well, the installer decided to pick one of the IP addresses assigned to each port to make the primary IP address, and to drop all of the other IP addresses bound to the adapter. So instead of the 5 minutes of downtime we expected, I got to spend an hour re-typing the critical IP addresses, and another hour tomorrow typing in the non-critical IP addresses. Let this be a warning: if you plan on updating the drivers for your Intel NIC, plan on possibly needing to re-configure it afterwards!

J.Ja

Categories: Intel, Networking Tags: