Tagged Questions

This tag refers to NVIDIA’s parallel computing architecture (CUDA) that enables dramatic increases in computing performance by harnessing the power of the GPU (graphics processing unit). The CUDA architecture enables application development using several languages and associated APIs, including: ...


0 votes, 0 answers, 6 views

Porting an OpenMP program to CUDA C: correct grid_size/block_size and reduction

I want to convert an OpenMP program to CUDA C. I have tried to find my way through the web and the SDK, but the material is beyond my level. My C program loops over n=2^30 indices and adds the weight of each index. ...
3 votes, 2 answers, 21 views

How to quantify the processing tradeoffs of CUDA devices for C kernels?

I recently upgraded from a GTX480 to a GTX680 in the hope that the tripled number of cores would manifest as significant performance gains in my CUDA code. To my horror, I've discovered that my memory ...
0 votes, 0 answers, 10 views

Building gpuocelot fails due to Boost linkage errors on OS X Snow Leopard

I used the latest trunk version of gpuocelot on a Mac running Snow Leopard 10.6.8 with gcc 4.5.3 and boost @1.49.0_0+universal (active) (Boost installed via MacPorts). I run scons and I get ...
0 votes, 1 answer, 43 views

How to gather rows from a matrix by indices list using CUDA Thrust

This is seemingly a simple problem but I just can’t figure out an elegant way to do this with CUDA Thrust. I have a two dimensional matrix NxM and a vector of desired row indices of size L that is a ...
0 votes, 2 answers, 33 views

Link to cutil in GPU Computing SDK

I've been trying to link to the functions in the cutil.h of the GPU Computing SDK released by NVIDIA. At the moment, I am simply trying to compile this simple piece of code: #include <iostream> ...
0 votes, 0 answers, 29 views

In CUDA, how to translate screen-space coordinates to world-space coordinates in a kernel function

Here, I'm trying to add ray-casting to a real 3D scene. As we know, in ray-casting, in order to cast the ray, we need to get the direction of the ray. The first point in the ray is the start point of ...
0 votes, 0 answers, 37 views

Ray-casting using CUDA [closed]

I now want to learn ray-casting volume rendering using CUDA, but I do not know how. I have some basic knowledge of OpenGL and 3D graphics. What books should I learn from? And more importantly, where ...
2 votes, 1 answer, 62 views

What happens when all threads of a warp read the same global memory address?

I want to know what happens when all threads of a warp read the same 32-bit address of global memory. How many memory requests are there? Is there any serialization? The GPU is a Fermi card, the ...
0 votes, 2 answers, 68 views

Parallel reduction in CUDA

The following code sums every 32 elements in an array into the very first element of each 32-element group: int i = threadIdx.x; int warpid = i&31; if(warpid < 16){ s_buf[i] += ...
0 votes, 2 answers, 49 views

CUDA code produces incorrect results in release mode

My CUDA code produces correct results in Debug mode. However, in Release mode, the same code produces garbage results. Could the synchronization between threads behave differently between debug and ...
-1 votes, 1 answer, 41 views

Haar wavelet transform in CUDA

I have tried to implement the Haar wavelet transform in CUDA for a 1D array. Algorithm: I have 8 indices in the input array. With this condition if(x_index>=o_width/2 || y_index>=o_height/2) I ...
0 votes, 2 answers, 38 views

CUDA thread execution order

In CUDA, when we talk about parallel threads executing the same code, is there any order to their execution? For example, if I have 4 threads for a 1D array of 4 elements, all four threads perform ...
0 votes, 1 answer, 24 views

CUDA cublas<t>gbmv understanding

I recently wanted to use a simple CUDA matrix-vector multiplication. I found a suitable function in the cuBLAS library: cublas<t>gbmv. Here is the official documentation. But it is actually very ...
0 votes, 3 answers, 100 views

CUDA speedup for simple calculations

I have the following code in cuda_computation.cu #include <iostream> #include <stdio.h> #include <cuda.h> #include <assert.h> void checkCUDAError(const char *msg); ...
0 votes, 1 answer, 37 views

What can cause this CUDA stack trace and what is wrong with this call to cudaMemcpy?

My program, which draws a small animation, uses glut and cuda, and is written in C++, hangs after a while, and I see the following trace in the debugger when I interrupt it a few seconds after it ...
