I recently upgraded from a GTX 480 to a GTX 680, hoping that the tripled core count would translate into significant performance gains in my CUDA code. To my horror, I've discovered that my memory-intensive CUDA kernels run 30%-50% slower on the GTX 680.

I realize that this is not strictly a programming question, but it directly affects the performance of CUDA kernels on different devices. Can anyone provide some insight into the specifications of CUDA devices and how they can be used to predict the performance of CUDA C kernels?

1  
For maximum performance you really need to tune your code for different GPU configurations. – Paul R 56 mins ago
1  
From what Wikipedia tells me, the memory BW of the 680 is not much higher than that of the 480. So if you're memory-bound, you're not going to see much speedup. I can't explain why you see a slowdown, though. – Oli Charlesworth 56 mins ago
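
To put a number on the bandwidth comparison in the comment above, here is a minimal sketch in plain C that estimates theoretical peak memory bandwidth from two spec-sheet values. The function name `peak_bandwidth_gbs` is made up for illustration, and the figures in the note below (3696 MT/s on a 384-bit bus for the GTX 480, 6008 MT/s on a 256-bit bus for the GTX 680) are the commonly published specifications, so treat them as assumptions and check them against your own cards.

```c
/* Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).
 *
 * data_rate_mts : effective memory data rate in mega-transfers per second
 *                 (spec-sheet value, assumed rather than measured)
 * bus_width_bits: width of the memory interface in bits
 *
 * This is an upper bound only; achieved bandwidth depends on access
 * patterns, caching, and occupancy. */
double peak_bandwidth_gbs(double data_rate_mts, int bus_width_bits)
{
    double bytes_per_transfer = bus_width_bits / 8.0;
    return data_rate_mts * 1e6 * bytes_per_transfer / 1e9;
}
```

With the assumed figures, `peak_bandwidth_gbs(3696.0, 384)` comes to about 177.4 GB/s and `peak_bandwidth_gbs(6008.0, 256)` to about 192.3 GB/s, roughly 8% more bandwidth despite about three times as many cores. That is consistent with a memory-bound kernel seeing little speedup, but it does not by itself explain a slowdown.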