r - How to find and merge specific data with its name? -


I'm trying to find the names of specific genes in my data. This is what I have done so far:

gnames = unique(data_rd[,1])
gnames = gnames[2:length(gnames)]

gnames contains the genes whose names I have to find.

gdata = lapply(list_of_data, function(x) x[3:nrow(x), c(1, 9)])

gdata is a set of gene names from different files; some of them might be repeated in a few files.

This is how I created list_of_data:

tbl = list.files(pattern = "\\.csv$")
list_of_data = lapply(tbl, read.csv)

Let me explain with an example:

gnames:
gene1 gene2 gene3 gene4 gene5 gene6 gene7

gdata:
gene1 nameofgene1
gene5 nameofgene5
gene7 nameofgene7

gene2 nameofgene2
gene6 nameofgene6

gene3 nameofgene3
gene4 nameofgene4

I want R to find the names of the genes in gnames by looking in list_of_data.

> head(gnames)
[1] "zz_fgczcont0025" "zz_fgczcont0099" "zz_fgczcont0126" "zz_fgczcont0146"
[5] "at1g19570"       "zz_fgczcont0158"

> head(gdata) ## edited, big.
[[1]]
                 x
3  zz_fgczcont0025
4  zz_fgczcont0099
5  zz_fgczcont0126
6  zz_fgczcont0146
7      at1g19570.1
8  zz_fgczcont0158
9      at5g38480.1
10 zz_fgczcont0050
   x.8
3  gi|1346343|sp|p04264| k2c1_human keratin, type ii cytoskeletal 1 (cyto
4  sp|k1c9_human|
5  gi|71528|pir|| krhu0 keratin 10, type i, cytoskeletal (clone lambda-kh
6  sp|k22e_human|
7  | symbols: dhar1, atdhar1, dhar5 | dehydroascorbate reductase | chr1:6773462-6774413 reverse length=213
8  gi|88041|pir||a31994 keratin 10, type i, epidermal - human gi|623409 (
9  | symbols: grf3, rci1 | general regulatory factor 3 | chr5:15410277-15411285 forward length=255
10 gi|71536|pir|| krhu2 keratin, 67k type ii cytoskeletal - human (fragme

Try:

lapply(gdata, function(x) x[x[, 1] %in% gnames, 2])

I tested it with the following data:

set.seed(123)
gnames <- sample(letters, 10)
num.rows <- sample(10:12)
gdata <- lapply(num.rows, function(i) data.frame(x    = sample(letters, i),
                                                 name = sample(letters, i)))
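As a rough sketch of how the one-liner behaves on this test data, the block below rebuilds the data and applies the same `%in%` filter. The variable `found` is a name introduced here for illustration only; everything else comes from the code above.

```r
# Rebuild the test data so this block runs on its own.
set.seed(123)
gnames <- sample(letters, 10)
num.rows <- sample(10:12)
gdata <- lapply(num.rows, function(i) data.frame(x    = sample(letters, i),
                                                 name = sample(letters, i)))

# For each data frame, keep the rows whose first column appears in gnames
# and return the corresponding entries of the second column.
# 'found' is a hypothetical name used only in this sketch.
found <- lapply(gdata, function(x) x[x[, 1] %in% gnames, 2])

# One result per input data frame; each holds the names that were matched.
str(found)
```

The result is a list with one element per file, so genes that occur in several files show up once per file in which they were matched.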

Note that your code would benefit from gdata being stored in a better manner. From your sample data, it seems the column names are x and x.8: you should use more meaningful names and use them in your code, rather than having to make assumptions about which column index contains what data (you forced me to use 1 instead of "gene" and 2 instead of "genename"). I would also imagine that by making gnames a data.frame with a gene column, you could rely on merge to do the work you are looking for.
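A minimal sketch of that merge idea, under some assumptions: the column names `gene` and `genename`, the toy values, and the helper variables `all_genes` and `result` are all made up for illustration and are not taken from your files.

```r
# Hypothetical toy data: "gene" and "genename" are assumed column names,
# standing in for the uninformative x and x.8.
gnames <- data.frame(gene = c("gene1", "gene2", "gene3", "gene4"),
                     stringsAsFactors = FALSE)

gdata <- list(
  data.frame(gene     = c("gene1", "gene5"),
             genename = c("nameofgene1", "nameofgene5"),
             stringsAsFactors = FALSE),
  data.frame(gene     = c("gene2", "gene3", "gene1"),
             genename = c("nameofgene2", "nameofgene3", "nameofgene1"),
             stringsAsFactors = FALSE)
)

# Stack the per-file tables and drop rows repeated across files.
all_genes <- unique(do.call(rbind, gdata))

# Left join: keep every gene in gnames; genename is NA where nothing matched.
result <- merge(gnames, all_genes, by = "gene", all.x = TRUE)
print(result)
```

With `all.x = TRUE`, genes from gnames that appear in none of the files are kept with an `NA` name, which makes missing matches easy to spot.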

