Latest Posts
Behind the Scenes of AnandTech's Server Tests [Video]
by Anand Lal Shimpi on 12/18/2012

We've been quietly experimenting with more video content on the site over the past year. I've done a few reviews over at our YouTube channel, and we also host all of our smartphone/tablet camera samples over there. Going into 2013 we'll be ramping up the amount of video content on the site, joining Pipeline and the Podcast among the new features we've introduced over the past couple of years. In doing so, we're also going to start hosting videos locally.

When we were looking for the first content to trial our locally served video, I asked Johan De Gelas, the head of our IT/Enterprise testing at AnandTech, if he could put something together. Johan came back with a behind the scenes look at the Sizing Servers Lab in Belgium, the back-end for all of our server reviews and testing.

Johan's video is embedded below, and if this goes well he's promised to bring us a look at ARM-based servers on video in the not too distant future.

{video 1}

AnandTech/Intel S3700 Roundtable Discussion & Webcast Videos Live
by Anand Lal Shimpi on 11/16/2012

Intel invited me to attend SC12 and participate in a webcast for the launch of its new DC S3700 SSD. I joined Roger Peene from Intel's SSD Solutions and we talked about the S3700 as well as answered your questions live. If you missed the webcast, you can find the pre-recorded video here. There's a great question about the future of NAND we discussed on the webcast that I'd highly recommend paying attention to.

Prior to the webcast, I had the chance to sit down with Arbin Kumar (responsible for Intel SSD reliability and validation), Allison Goodman (lead engineer on the S3700) and Roger Peene (Marketing Director for Intel's Datacenter SSD Solutions) once again to discuss the S3700 in greater detail. The discussion in the video below is from the first day I really learned about the S3700's architecture. The full discussion took several hours but the video below distills a lot of it down to 7 minutes. If you want to hear about the S3700 from the folks who actually had a hand in building the drive, I strongly suggest watching the video. Update: The video is back up.

Finally, at SC12 Intel rented a replica of the original series bridge from the starship Enterprise which we used as a backdrop for the webcast. Prior to the webcast airing, we had some fun on the bridge which you can check out in the gallery below.

At the end of the day it was a pretty fun experience, and I learned quite a bit about Intel's NAND Solutions Group through the whole process. The SSD business is pretty unusual in that it's built around a replacement for a highly commoditized product (mechanical storage). It's surprising that players who typically operate in high margin products are even in this industry, but without them the market would be much worse off. I still remember what things were like with SSDs prior to the X25-M, and even for the 12 - 18 months after its launch. The S3700 shows that there's still room for innovation even within the constraints of 6Gbps SATA, which should tide us over until SATA Express shows up.

The Xeon Phi at work at TACC
by Johan De Gelas on 11/14/2012

The Xeon Phi family of co-processors was announced in June, but Intel has only now disclosed additional details about this first shipping implementation of Larrabee. In this short article we'll go over the different Xeon Phi SKUs, what kind of software runs on them, and how Xeon Phi is deployed in a supercomputer.

We had the chance to briefly visit Stampede, the first supercomputer based on the Xeon Phi, in Austin, TX. Stampede is the most powerful of the supercomputers at the Texas Advanced Computing Center (TACC).

The Intel SSD DC S3700 (200GB) Review
by Anand Lal Shimpi on 11/9/2012

When Intel arrived on the scene with its first SSD, it touted superiority in controller, firmware, and NAND as the reason it was able to so significantly outperform the competition. Slow but steady improvements to the design followed over the next year, and until 2010 Intel held our recommendation for the best SSD on the market. The long awaited X25-M G3 ended up being based on the same 3Gbps SATA controller as the previous two drives, just sold under a new brand. Intel changed its tune, claiming that the controller (or who made it) wasn't as important as the firmware and NAND.

Then came the 510, Intel's first 6Gbps SATA drive...based on a Marvell controller. Its follow-on, the Intel SSD 520, used a SandForce controller. With the release of the Intel SSD 330, Intel had almost completely moved to third party SSD controllers. Intel still claimed proprietary firmware; however, in the case of the SandForce based drives Intel never seemed to have access to the firmware source code. Rather, its custom firmware was the result of Intel's validation work and SandForce integrating changes into a custom branch made specifically for Intel. Intel increasingly looked like a NAND and validation house, giving it only a small edge over the competition. Meanwhile, Samsung aggressively went after Intel's consumer SSD business with the SSD 830/840 Pro, while others pursued Intel's enterprise customers.

This is hardly a fall from grace, but Intel hasn't been able to lead the market it helped establish in 2008 - 2009. The SSD division at Intel is a growing one; unlike the CPU architecture group, the NAND Solutions Group just hasn't been around for very long. Growing pains are still evident, and Intel management isn't too keen on investing heavily there. Despite SSDs' extremely positive impact on the industry, storage has always been a heavily commoditized market. Just as Intel is in no rush to sell $20 smartphone SoCs, it's similarly in no hurry to dominate the consumer storage market.

I'd heard rumors of an Intel designed 6Gbps SATA controller for a while. Work on the project began years ago, but the scope of Intel's true next-generation SATA SSD controller changed many times over the years. What started as a client focused controller eventually morphed into an enterprise specific design, with its scope and feature set reinvented many times over. That's typical of any new company or group; often the only way to learn focus is to pay the penalty for not being focused, which usually happens when you're really late with a product. Intel's NSG had yet to come into its own; it hadn't yet found its ideal development/validation/release cadence. Considering how long it took the CPU folks to get to tick-tock, it's still very early to expect the same from NSG.

Today all of that becomes moot as Intel releases its first brand new SSD controller in five years. This controller has been built from the ground up rather than as an evolution of a previous generation; it corrects many of the flaws of the original design and removes many of its constraints. Finally, the new controller marks the start of a completely new performance focus. For the past couple of years we've seen controllers quickly saturate 6Gbps SATA and slowly raise the bar for random IO performance. With its first 6Gbps SATA controller, Intel significantly improves performance along both traditional vectors, but it also adds an entirely new one: performance consistency. All SSDs see their performance degrade over time, but with its new controller Intel wanted to deliver steady state performance that's far more consistent than the competition's.
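To make "performance consistency" concrete as a metric, here is one generic way to quantify it: the ratio of worst-case to average IOPS over a steady-state window. This is purely an illustrative sketch with made-up numbers, not Intel's own definition or our test methodology:

```python
def consistency_ratio(iops_samples):
    """Worst one-second IOPS divided by average IOPS; 1.0 means perfectly steady."""
    return min(iops_samples) / (sum(iops_samples) / len(iops_samples))

# Hypothetical per-second random write IOPS traces
spiky  = [40_000, 40_000, 2_000, 40_000]   # faster on average, but with GC stalls
steady = [30_000, 29_000, 31_000, 30_000]  # lower peak, far more consistent

print(round(consistency_ratio(spiky), 2))   # low ratio despite the higher peak
print(round(consistency_ratio(steady), 2))
```

A drive like the hypothetical "steady" trace loses the peak-throughput headline but is far more predictable, which is exactly the trade-off a datacenter drive wants to make.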

I originally thought we wouldn't see much innovation in the non-PCIe SSD space until SATA Express. It turns out I was wrong. It's time to review Intel's SSD DC S3700, codenamed Taylorsville. Read on!

The Intel SSD DC S3700: Intel's 3rd Generation Controller Analyzed
by Anand Lal Shimpi on 11/5/2012

Today Intel is announcing its first SSD based on its own custom 6Gbps SATA controller. This new controller completely abandons the architecture of the old X25-M/320/710 SSDs and adopts an all new design with one major goal: delivering consistent IO latency. 

All SSDs tend to fluctuate in performance as they alternate between writing to clean blocks and triggering defrag/garbage collection routines. Under sequential workloads the penalty isn't all that significant; under heavy random IO, however, it can be a real problem. The occasional high latency blip can be annoying on a client machine (OS X in particular doesn't respond well to high random IO latency), but it's typically nothing more than a rare hiccup. Users who run their drives closer to full capacity will find these hiccups more frequent. In a many-drive RAID array, however, blips of high latency from individual drives can compound to reduce the overall performance of the array. In very large RAID arrays (think dozens of drives) this can be an even bigger problem.
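The RAID point is easy to quantify. If each drive spends a small fraction of its time mid-blip, and blips are independent, the chance that at least one drive in the array is slow at any given instant grows quickly with array size. The 1% blip probability below is a made-up round number, purely for illustration:

```python
def p_some_drive_slow(p_blip, n_drives):
    # P(at least one of n independent drives is mid-GC at a given instant)
    return 1 - (1 - p_blip) ** n_drives

print(round(p_some_drive_slow(0.01, 1), 3))   # single drive
print(round(p_some_drive_slow(0.01, 8), 3))   # 8-drive array: ~8%
print(round(p_some_drive_slow(0.01, 24), 3))  # 24-drive array: ~21%
```

Since a striped array often can't complete an IO faster than its slowest member, a per-drive hiccup that's invisible 99% of the time can drag down a dozens-of-drives array more than a fifth of the time.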

In the past, we've recommended simply increasing the amount of spare area on your drive to combat these issues - a band-aid that lets the SSD controller do its job better. With its latest controller, Intel tried to address the root cause of the problem.
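Why extra spare area helps can be sketched with a deliberately crude model: the more spare blocks the controller has in reserve, the fewer host writes force it to stop and garbage-collect. The intervals and latencies below are made-up round numbers, not measurements of any real drive:

```python
def write_latencies_us(n_writes, spare_fraction):
    # Toy model: one GC "blip" every `interval` host writes; more spare area
    # lets the controller defer GC longer. Calibrated so 7% spare -> GC every
    # 20 writes (an arbitrary illustrative baseline).
    interval = max(1, round(spare_fraction / 0.07 * 20))
    FAST_US, GC_US = 100, 10_000  # fast write vs. a GC stall, in microseconds
    return [GC_US if i % interval == 0 else FAST_US for i in range(1, n_writes + 1)]

stock = write_latencies_us(1_000, 0.07)  # typical consumer-level spare area
extra = write_latencies_us(1_000, 0.28)  # heavily over-provisioned
print(stock.count(10_000), extra.count(10_000))  # far fewer blips with more spare
```

More spare area doesn't eliminate the stalls in this model, it just spaces them out - which is exactly why it's a band-aid rather than a fix for the root cause.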

The launch vehicle for Intel's first 6Gbps SATA controller is unsurprisingly a high-end enterprise drive. Since the 2008 introduction of the X25-M, Intel has shifted towards prioritizing the enterprise market. All divisions of Intel have to be profitable and with high margins. The NAND Solutions Group (NSG) is no exception to the rule. With consumer SSDs in a race to the bottom in terms of pricing, Intel's NSG was forced to focus on an area that wouldn't cause mother Intel to pull the plug on its little experiment. The enterprise SSD market is willing to pay a premium for quality, and thus it became Intel's primary focus.

The first drive to use the new controller also carries a new naming system: the Intel SSD DC S3700. The DC stands for data center, which bluntly states the target market for this drive. Read on for our analysis of Intel's first 6Gbps SATA controller.

Inside the Titan Supercomputer: 299K AMD x86 Cores and 18.6K NVIDIA GPUs
by Anand Lal Shimpi on 10/31/2012

Earlier this month I drove out to Oak Ridge, Tennessee to pay a visit to the Oak Ridge National Laboratory (ORNL). I'd never been to a national lab before, but my ORNL visit was for a very specific purpose: to witness the final installation of the Titan supercomputer.

ORNL is a US Department of Energy laboratory that's managed by UT-Battelle. Oak Ridge has a core competency in computational science, making it not only unique among DOE labs but also a perfect fit for a big supercomputer.

Titan is the latest supercomputer to be deployed at Oak Ridge, although it's technically a significant upgrade rather than a brand new installation. Jaguar, the supercomputer being upgraded, featured 18,688 compute nodes - each with a 12-core AMD Opteron CPU. Titan takes the Jaguar base, maintaining the same number of compute nodes, but moves to 16-core Opteron CPUs paired with an NVIDIA Kepler K20 GPU per node. The result is 18,688 CPUs and 18,688 GPUs, all networked together to make a supercomputer that should be capable of landing at or near the top of the TOP500 list.
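The headline figures follow directly from the node counts above:

```python
nodes = 18_688           # compute nodes carried over from Jaguar
cores_per_node = 16      # one 16-core Opteron per node
cpu_cores = nodes * cores_per_node
gpus = nodes             # one Kepler K20 per node

print(cpu_cores, gpus)   # -> 299008 18688, the "299K cores / 18.6K GPUs"
```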

Over the course of a day in Oak Ridge I got a look at everything from how Titan was built to the types of applications that are run on the supercomputer. Having seen a lot of impressive technology demonstrations over the years, I have to say that my experience at Oak Ridge with Titan is probably one of the best. Normally I cover compute as it applies to making things look cooler or faster on consumer devices. I may even dabble in talking about how better computers enable more efficient datacenters (though that's more Johan's beat). But it's very rare that I get to look at the application of computing to better understanding life, the world and universe around us. It's meaningful, impactful compute.

Read on for our inside look at the Titan supercomputer.

Making Sense of the Intel Haswell Transactional Synchronization eXtensions
by Johan De Gelas, Cara Hamm on 9/20/2012

Intel has released additional information regarding the Transactional Synchronization eXtensions (TSX) in its upcoming Haswell processor; TSX is basically an instruction set architecture (ISA) extension that makes hardware accelerated transactional memory possible. What does that mean in the real world? The more cores you get in a system, the more threads you need to keep them busy. Unfortunately, it's not that easy to simply add more threads, as a lot of software scales pretty badly as core count goes up. Even server and HPC software has trouble dealing with the current octa- and dodeca-core chips.

Intel's TSX holds the promise of making it easier for developers to produce code that scales well with higher core counts. Even better, code should be easier to debug and should see a nice performance boost with minimal effort. In this article we explain how TSX works and how it may enable much better scaling, even in legacy software.
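TSX itself is a hardware feature with no direct Python equivalent, but the optimistic model it implements - execute speculatively without the lock, detect conflicts, retry or fall back to the real lock - can be sketched in software. The class below is our own illustrative analogy, not Intel's mechanism: the version check stands in for the hardware's conflict detection, and the final locked retry stands in for TSX's non-transactional fallback path.

```python
import threading

class ElidedCounter:
    """Software analogy of lock elision: do the work speculatively,
    then commit only if no other thread has committed in the meantime."""
    def __init__(self):
        self.value = 0
        self.version = 0              # bumped on every successful commit
        self._lock = threading.Lock()

    def add(self, delta, max_retries=3):
        for _ in range(max_retries):
            seen = self.version             # "transaction begin": snapshot state
            new_value = self.value + delta  # speculative work, no lock held
            with self._lock:                # commit point
                if self.version == seen:    # no conflict detected
                    self.value = new_value
                    self.version += 1
                    return
            # another thread committed first: "abort" and retry
        with self._lock:                    # fallback path: take the lock for real
            self.value += delta
            self.version += 1

counter = ElidedCounter()
threads = [threading.Thread(target=lambda: [counter.add(1) for _ in range(1000)])
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter.value)  # -> 4000: no increments lost despite the optimism
```

The appeal of doing this in hardware is that the uncontended case pays almost nothing: when threads rarely touch the same data, they all run as if the lock weren't there.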

LRDIMMs, RDIMMs, and Supermicro's Latest Twin
by Johan De Gelas on 8/3/2012

Most of the servers in the datacenter, especially those running virtualization, database, and some HPC applications, are memory limited more than anything else. There are several server memory options: UDIMMs, RDIMMs, LRDIMMs, and even HCDIMMs. RDIMMs are the most commonly used. In 2011 the LRDIMM was the highest capacity option, but only for those with huge budgets.

In our lab we have Supermicro's Twin 2U server (6027TR-D71FRF) from our Xeon E5 review and 16 Samsung LRDIMMs and RDIMMs. We felt that dense servers and high capacity memory made for an interesting combination that's worthy of investigation.

What is the situation now in 2012? Are LRDIMMs still only an option for the happy few? Can a Twin node with high capacity memory make sense for virtualization loads? How much performance do you sacrifice when using LRDIMMs instead of RDIMMs? Does it pay off to combine LRDIMMs and Supermicro's Twins? Can you get away with less memory? After all, modern hypervisors such as ESXi 5 have lots of tricks up their sleeves to save memory; even if you have less physical memory than you've allocated, chances are your applications will still run fine. We measured bandwidth and latency, throughput and response times, scanned the market, and performed a small TCO study to answer the questions above.

The Bulldozer Aftermath: Delving Even Deeper
by Johan De Gelas on 5/30/2012

It has been months since AMD's Bulldozer architecture surprised the hardware enthusiast community with performance that was all over the place. Opinions vary wildly, from “server benchmarks are here, and they're a catastrophe” to “Best Server Processor of 2011”. The least you can say is that the new architecture's idiosyncrasies have stirred up a lot of dust.

Although there have been quite a few attempts to understand what Bulldozer is all about, we cannot help but feel that many questions are still unanswered. Since this architecture is the foundation of AMD's server, workstation, and notebook future (Trinity is based on an improved Bulldozer core called "Piledriver"), it is interesting enough to dig a little deeper. Did AMD take a wrong turn with this architecture? And if not, can the first implementation "Bulldozer" be fixed relatively easily?

We decided to delve deeper into the SAP and SPEC CPU2006 results, as well as profiling our own benchmarks. Using the profiling data and correlating it with what we know about AMD's Bulldozer and Intel's Sandy Bridge, we attempt to solve the puzzle.

The Xeon E5-2600: Dual Sandy Bridge for Servers
by Johan De Gelas on 3/6/2012

Eight improved cores, 16 threads, and 40 integrated lanes of PCIe 3.0: the new socket 2011 Xeon E5-2660 manages to package it all in a very modest power envelope of 95W TDP (at 2.2 GHz). If you read the Intel Xeon E5's paper specs, it looks more and more likely that Intel has pulled off another "Nehalem": much better performance and richer features while consuming less power. Yes, as much as we like a good fight, the question is not whether Intel will outperform the competition and the previous Intel generation, but by how much...

Intel sent us both the Xeon E5-2690 - its newest performance champ - and the more performance/watt oriented E5-2660. We managed to turn the latter into a chip that performs like the Xeon E5-2630, a chip in the price range of the best Opteron 6200s. We compare Intel's latest Xeons with the Xeon X5650, the Opteron 6276, and the Opteron 6174. So whether you are searching for the performance champ, the best balance between performance and energy consumption, or the best deal for your money, you should find an answer in this article. We've improved our regular server performance testing with some HPC (LS-Dyna) and renewed OLAP tests. Read on...

The Opteron 6276: a closer look
by Johan De Gelas on 2/9/2012

When we first looked at the Opteron 6276, our time was limited and we were only able to run our virtualization, compression, encryption, and rendering benchmarks. Most servers capable of running 20 or more cores/threads target the virtualization market, so that's a logical area to benchmark. The other benchmarks either test a small part of the server workload (compression and encryption) or represent a niche (e.g. rendering), but we included them for a simple reason: they gave us additional insight into the performance profile of the Interlagos Opteron, they were easy to run, and last but not least, readers who use such applications still benefit.

Back in 2008, however, we discussed the elements of a thorough server review. Our list of important areas to test included ERP, OLTP, OLAP, Web, and Collaborative/E-mail applications. Looking at our initial Interlagos review, several of these are missing in action, but much has changed since 2008. The exploding core counts have made other bottlenecks (memory, I/O) much harder to overcome, the web application that we used back in 2009 stopped scaling beyond 12 cores due to lock contention problems, the Exchange benchmark turned out to be an absolute nightmare to scale beyond 8 threads, and the only manageable OLTP test—Swingbench Calling Circle—needed an increasing number of SSDs to scale.

The ballooning core counts have steadily made it harder - and sometimes next to impossible - to benchmark applications on native Linux or Windows. Thus, we reacted the same way most companies have: we virtualized our benchmark applications. It's only with a hypervisor that these multi-core monsters make sense in most enterprises, but there are always exceptions. Since quite a few of our readers still like seeing "native" Linux and Windows benchmarks, not to mention that quite a few ERP, OLTP, and OLAP servers still run without any form of virtualization, we took the time to complete our previous review and give the Opteron Interlagos another chance.
