Hmm, with this kind of power, just render ALL POSSIBLE frames a full second ahead, and flip the display to the framebuffer that corresponds to the gameplay :)
Good to see GPUs gaining traction outside of videogames, paving the way for their use as general-purpose devices that can benefit a wide variety of workloads :) Hopefully the profits from these will mean even better GPUs for us gamers down the line.
Looked on Bench. I can't find 18,688x Tesla K20's anywhere. I also looked for 18,688x AMD Opterons. This ain't like AnandTech. Normally Bench is updated when the article is released.
This is pretty awesome. I'm jealous you got to go. The comment about the thickness requirement of the cables for 480V compared to 208V in the first power delivery video is staggering. I'm surprised there's such a difference.
Some of the videos seem to be stopping early when I play them, and I have to skip ahead a bit to continue watching.
Voltage is 2.3 times higher, so current is 2.3 times lower for the same power. A wire 2.3x thinner (5.3x less cross-sectional area) will give the same power loss. Insulation thickness would be slightly greater, because it's based on voltage, not current.
> The comment about the thickness requirement of the cables for 480V compared to 208V in the first power delivery video is staggering. I'm surprised there's such a difference.
V = Voltage, I = Current, R = Resistance, P = Power
P = V × I
So if you double the Voltage you halve the Current for the same amount of Power. (480 V is actually about 2.3x 208 V, so the current drops a bit more than half, but halving keeps the math simple.)
Power Loss (in the cables) is calculated as I² × R. Since I is about 1/2 at 480 Volts, the Power Loss is about 1/4 (1/2 squared) as much.
So they determined a fixed acceptable power loss in the cables and reduced the size of the cables (which increased the resistance) so that the thinner cables (at 480 volts) had the same loss as the thicker cables (at 208 volts).
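If it helps to see the numbers worked out, here's a quick Python sketch of the 480 V vs. 208 V math (the delivered power is an arbitrary illustrative value, not a figure from the video):

```python
# Back-of-the-envelope check of the 480 V vs. 208 V cable math above.
P = 100_000.0              # watts delivered (example value only)
V_low, V_high = 208.0, 480.0

I_low = P / V_low          # current drawn at 208 V
I_high = P / V_high        # current drawn at 480 V
ratio = I_low / I_high
print(f"Current is {ratio:.2f}x lower at 480 V")                # ~2.31x

# Cable loss is I^2 * R. Holding the loss constant, resistance may
# rise by the square of the current ratio...
r_scale = ratio ** 2
print(f"Resistance can rise {r_scale:.2f}x for the same loss")  # ~5.33x

# ...and resistance scales inversely with cross-sectional area, so the
# 480 V conductor needs ~5.3x less copper (~2.3x smaller diameter).
print(f"Cross-sectional area needed: {1 / r_scale:.3f}x")       # ~0.188x
```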
This is an awesome article, Anand! I would love to see more super-computing coverage like this, and maybe some in-depth discussion of how super-computing works and how it differs from traditional computing architectures. Thanks for the great article!
I also registered just to say that this is a great article! One of the best I have seen on AnandTech, keep up the awesome work. Perhaps you can look into the Adapteva Parallella project next!
Yes, there is significant research going on. In our lab we had a pretty big group working on using FPGAs for HPC. The RC (reconfigurable computing) based supercomputer is called Novo-G. It was the world's biggest publicly known RC supercomputer.
It is very small in physical size compared to some of the top conventional supercomputers, but for some specific compute requirements it comes close to beating top supercomputers. There was a major upgrade planned (around the time I was graduating), so it might be even better now. What exact types of computations? I don't remember very well (I didn't work on RC; I was mostly a s/w guy in the conventional HPC part of the lab), but you might be able to get some info by checking out a few posters or paper abstracts.
According to the paper, it takes 6 to 8 years for the #1 computer on the list to move to #500, and then another 8 to 10 years for that performance to be available in your average notebook computer. Not sure on notebook to smartphone, but it can't be very long.
Not saying it can't be 2688 CUDA cores, but you are using the high end of the range when the article clearly lists a range of 1.2-1.3 TFLOPS. I don't think you can just assume that it's 2688 without confirmation, given the range of values provided.
We have other reasons to back our numbers, though I can't get into them. Suffice it to say, if we didn't have 100% confidence we would not have used it.
The Jaguar is thus renamed Titan, and the sheer numbers are quite impressive:
46,645,248 CUDA cores (yes, that's 46 million)
299,008 x86 cores
91.25 TB ECC GDDR5 memory
584 TB Registered ECC DDR3 memory
Each x86 core has 2GB of memory
1 node in the new Cray XK7 system consists of one 16-core AMD Opteron CPU and one Nvidia Tesla K20 compute card.
The Titan supercomputer has 18,688 nodes.
46,645,248 CUDA Cores / 18,688 Nodes = 2,496 CUDA cores per 1 Tesla K20 card.
"The upgrade includes the Tesla K20 GPU accelerators, a replacement of the compute modules to convert the system’s 200 cabinets to a Cray XK7 supercomputer, and 710 terabytes of memory."
18,688 nodes, each with 32GB of RAM + 6GB of VRAM = 710,144 GB
(Press agencies are bad about using powers of 10, hence "710" TB.)
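For anyone who wants to replay that arithmetic, a minimal Python sketch using the figures quoted in this thread (VRAM per node taken as 6GB to match the press release's 710 TB total):

```python
# Replaying the Titan node math from the comments above.
nodes = 18_688
cuda_cores_total = 46_645_248
x86_cores_per_node = 16
ram_per_node_gb = 32           # registered ECC DDR3 per node
vram_per_node_gb = 6           # GDDR5 per K20 card, per the press release total

print(cuda_cores_total // nodes)                      # 2496 CUDA cores per card
print(nodes * x86_cores_per_node)                     # 299008 x86 cores
print(nodes * (ram_per_node_gb + vram_per_node_gb))   # 710144 GB, i.e. "710 TB"
```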
Great article. A fantastic way of showing us tiny PC users what the really big stuff looks like. A data center is one thing, but my word, this stuff is, is... well, that is Ultimate Computing Pr0n. For people who will never ever have a chance to visit one of the supercomputer centers it is quite something. Enjoyed that very much!
@Guspaz
If we get that kind of performance in phones then it is really scary prospect. :D
We currently have 1-billion-transistor chips. We'd get from there to 128 trillion, or Titan-magnitude computers, after 17 iterations of Moore's Law, or about 25 years. If you go 25 years back, it's definitely enough of a gap that today's technology looks like flying cars to folks of olden times. So even if 128-trillion-transistor devices isn't exactly what happens, we'll have *something* plenty exciting on the other end.
*Something*, but that may or may not be huge computers. It may not be an easy exponential curve all the way. We'll almost certainly put some efficiency gains towards saving cost and energy rather than increasing power, as we already are now. And maybe something crazy like quantum computers, rather than big conventional computers, will be the coolest new thing.
I don't imagine those powerful computers, whatever they are, will all be doing simulations of physics and weather. One of the things that made some of today's everyday tech hard to imagine was that the inputs involved (social graphs, all the contents of the Web, phones' networks and sensors) just weren't available. Before 1980, it would have been hard to imagine trivially having a metric of your connectedness to an acquaintance (like Facebook's 'mutual friends') or seeing ads matched to your interests.
I'm gonna say that 25 years out the data, power, and algorithms will be available to everyone to make things that look like Strong AI to anyone today. Oh, and the video games will be friggin awesome. If we don't all blow each other up in the next couple-and-a-half decades, of course. Any other takers? Whoever predicts it best gets a beer (or soda) in 25 years, if practical.
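Replaying the Moore's Law arithmetic from a couple of comments up, as a quick Python sketch (assuming the usual ~18 months per doubling; purely illustrative):

```python
import math

# From ~1 billion transistors today to a Titan-scale ~128 trillion:
doublings = math.log2(128e12 / 1e9)   # ~17 doublings
years = doublings * 1.5               # ~18 months per Moore's Law step

print(f"{doublings:.0f} doublings, roughly {years:.0f} years")
```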
I was wondering which model Opterons they threw in there. The Interlagos chips were barely faster and used more power than the Magny-Cours CPUs they were destined to replace, though I'm sure these are so heavily taxed that the Bulldozer architecture would shine through in the end.
Okay, I've checked - these are 6274s, which are Interlagos and clocked at 2.2GHz base with an ACP of 80W and a TDP of 115W apiece. This must be the CPU purchase mentioned prior to Bulldozer's launch.
It was an awesome trip, seriously one of the best. Talking to Dr. Messer was one of the highlights for sure, that guy is insanely smart and very passionate about his work.
Old hardware is traded in when you order the next round of upgrades :)
(Yes you'd have single threaded cpu bottleneck, but I can dream)