I recently upgraded from a GTX480 to a GTX680 in the hope that the tripled number of cores would manifest as significant performance gains in my CUDA code. To my horror, I've discovered that my memory intensive CUDA kernels run 30%-50% slower on the GTX680.
I realize that this is not strictly a programming question but it does directly impact on the performance of CUDA kernels on different devices. Can anyone provide some insight into the specifications of CUDA devices and how they can be used to deduce their performance on CUDA C kernels?