c++ - Make for (rowIdx = 1; ...) work using CUDA threads


I have this loop in C++:

for ( rowIdx = 1; rowIdx < (nbrows - 1); rowIdx++ )

How should I handle it in order to use CUDA?

I ask because in CUDA you normally do:

if (rowIdx < arraysize) ...

If I set rowIdx = 1 before the check if (rowIdx < arraysize), it doesn't work.

----update ----------------------------

A simple example for illustration:

__global__ void test_func(int *a_in, int *b_in, int *c_out)
{
    size_t rowIdx = blockIdx.x * blockDim.x + threadIdx.x;
    rowIdx = 1;  // overwrites the per-thread index, so every thread uses index 1
    if (rowIdx < array_size)
        c_out[rowIdx] = a_in[rowIdx] * b_in[rowIdx];
}

// fill matrices
for (int i = 0; i < array_size; i++) {
    a_in[i] = i;
    b_in[i] = i + 1;
    c_out[i] = 0;
}

If I use rowIdx = 1, then only the first result is computed correctly; the rest are zeros.

For a simple replacement of the loop, given the functionality provided in your example, the kernel can look this way:

__global__ void test_func(int *a_in, int *b_in, int *c_out)
{
    // one global index per thread; do not overwrite it afterwards
    size_t rowIdx = blockIdx.x * blockDim.x + threadIdx.x;

    if (rowIdx > 0 &&          // ensure rowIdx is at least 1
        rowIdx < array_size)   // ensure rowIdx is not out of bounds
    {
        c_out[rowIdx] = a_in[rowIdx] * b_in[rowIdx];
    }
}

All threads compute different array elements, from index 1 to array_size - 1. Be aware that the "real" first element, c_out[0], won't be computed in this case.
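An alternative to skipping thread 0 is to shift the index by one, so that thread 0 handles row 1 and no thread is left idle at the front. The following is only a minimal sketch in plain C++ that emulates a 1D launch on the CPU to make the mapping visible. The function name emulate_interior_multiply and the parameter threads_per_block are hypothetical, and both input vectors are assumed to have the same size. Unlike the kernel above, the guard here also excludes the last row, to match the upper bound rowIdx < (nbrows - 1) of the original loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not a CUDA API): emulates a 1D grid on the CPU.
// Each (block, thread) pair plays the role of one GPU thread.
// Assumes a.size() == b.size() and threads_per_block > 0.
std::vector<int> emulate_interior_multiply(const std::vector<int>& a,
                                           const std::vector<int>& b,
                                           std::size_t threads_per_block)
{
    const std::size_t nb_rows = a.size();
    std::vector<int> c(nb_rows, 0);
    if (nb_rows < 3) {
        return c;  // no interior rows to process
    }

    // The loop "for (rowIdx = 1; rowIdx < nb_rows - 1; rowIdx++)" covers nb_rows - 2 rows.
    const std::size_t work = nb_rows - 2;
    // Round up so that every interior row gets a thread.
    const std::size_t blocks = (work + threads_per_block - 1) / threads_per_block;

    for (std::size_t block = 0; block < blocks; ++block) {
        for (std::size_t thread = 0; thread < threads_per_block; ++thread) {
            // Same formula as blockIdx.x * blockDim.x + threadIdx.x, shifted by 1.
            const std::size_t row_idx = block * threads_per_block + thread + 1;
            if (row_idx < nb_rows - 1) {  // same upper bound as the original loop
                c[row_idx] = a[row_idx] * b[row_idx];
            }
        }
    }
    return c;
}
```

In a real kernel, the two loops disappear and the body of the inner loop becomes the kernel body, with the launch sized for nb_rows - 2 threads rather than nb_rows.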

