c++ - Make for (rowIdx = 1; ...) work using CUDA threads

- June 15, 2013

I have this loop in C++:

for ( rowidx = 1; rowidx < (nbrows - 1); rowidx++ )

How should I handle this loop in CUDA?

I ask because in CUDA I normally do:

if (rowidx < arraysize) ...

If I set rowidx = 1 before the check if (rowidx < arraysize), it doesn't work.

----update ----------------------------

A simple example for illustration:

    __global__ void test_func(int *a_in, int *b_in, int *c_out)
    {
        size_t rowidx = blockIdx.x * blockDim.x + threadIdx.x;
        rowidx = 1;  // overwrites the per-thread index computed above
        if (rowidx < array_size)
            c_out[rowidx] = a_in[rowidx] * b_in[rowidx];
    }

    // fill matrices
    for (int i = 0; i < array_size; i++) {
        a_in[i] = i;
        b_in[i] = i + 1;
        c_out[i] = 0;
    }

If I use rowidx = 1, only the first result is computed correctly; the rest are zeros.

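Assigning rowidx = 1 inside the kernel makes every thread work on the same element, which is why only c_out[1] is filled. For a simple replacement of the loop, given the functionality in your example, the kernel can look like this: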

    __global__ void test_func(int *a_in, int *b_in, int *c_out)
    {
        size_t rowidx = blockIdx.x * blockDim.x + threadIdx.x;
        if (rowidx > 0 &&          // ensure rowidx is at least 1
            rowidx < array_size)   // ensure rowidx is not out of bounds
        {
            c_out[rowidx] = a_in[rowidx] * b_in[rowidx];
        }
    }

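All threads compute different array elements, from index 1 to array_size - 1. Be aware that the "real" first element, c_out[0], won't be computed in this case. Also note that the original loop stops at nbrows - 2; to reproduce that bound exactly, the upper check would be rowidx < array_size - 1.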
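As a variation on the kernel above, the index can be shifted by one instead of being filtered, so that no thread is wasted on row 0. The following is only a minimal sketch under some assumptions: the names multiply_interior and run_multiply_interior are hypothetical, the array length is passed as a kernel argument rather than taken from a global array_size, both input vectors are assumed to have the same length, and CUDA error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread handles one interior row. Adding 1 shifts thread 0 to row 1,
// mirroring "rowidx = 1" as the start of the original for loop.
__global__ void multiply_interior(const int *a_in, const int *b_in,
                                  int *c_out, size_t nbrows)
{
    size_t rowidx = (size_t)blockIdx.x * blockDim.x + threadIdx.x + 1;
    // Same bound as "rowidx < nbrows - 1", written to avoid unsigned underflow.
    if (rowidx + 1 < nbrows)
        c_out[rowidx] = a_in[rowidx] * b_in[rowidx];
}

// Hypothetical host helper: copies the inputs to the device, launches the
// kernel over the interior rows, and copies the result back.
// Assumes a.size() == b.size(); error checking is omitted.
std::vector<int> run_multiply_interior(const std::vector<int> &a,
                                       const std::vector<int> &b)
{
    const size_t n = a.size();
    std::vector<int> c(n, 0);
    if (n < 3)
        return c;  // no interior rows to compute

    const size_t bytes = n * sizeof(int);
    int *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_c, 0, bytes);  // rows 0 and n-1 stay zero

    // One thread per interior row (n - 2 rows), rounded up to whole blocks.
    const unsigned int threads = 256;
    const unsigned int blocks =
        (unsigned int)((n - 2 + threads - 1) / threads);
    multiply_interior<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // This blocking copy on the default stream waits for the kernel to finish.
    cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return c;
}
```

Calling run_multiply_interior with the a_in and b_in values from the question fills every element except the first and the last, which matches the bounds of the original for loop.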
