r - How to find and merge specific data with its name?
I'm trying to find the names of specific genes in my data. This is what I have done so far:
gnames = unique(data_rd[,1])
gnames = gnames[2:length(gnames)]
gnames contains the genes whose names I have to find.
gdata = lapply(list_of_data, function(x) x[3:nrow(x), c(1, 9)])
gdata is a set of gene names from different files; some of them might be repeated in a few files.
This is how I created list_of_data:
tbl = list.files(pattern="*.csv")
list_of_data = lapply(tbl, read.csv)
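As a side note, the pattern argument of list.files is a regular expression rather than a shell glob, and naming the list elements by file makes it easier to trace a gene back to the file it came from. A minimal sketch, assuming made-up demo files written to a temporary directory (the file names and the column a are hypothetical):

```r
# Write two small CSV files into a temporary directory (made-up demo data)
dir <- file.path(tempdir(), "csv_demo")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(a = 1:2), file.path(dir, "file1.csv"), row.names = FALSE)
write.csv(data.frame(a = 3:5), file.path(dir, "file2.csv"), row.names = FALSE)

# "\\.csv$" is a regular expression matching file names that end in ".csv"
tbl <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
list_of_data <- lapply(tbl, read.csv)

# Name each list element after its file so results can be traced back
names(list_of_data) <- basename(tbl)
```

With the names set, list_of_data[["file2.csv"]] returns the table read from that file.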
So let me explain with an example:
gnames:
gene1
gene2
gene3
gene4
gene5
gene6
gene7

gdata:
gene1 nameofgene1
gene5 nameofgene5
gene7 nameofgene7
gene2 nameofgene2
gene6 nameofgene6
gene3 nameofgene3
gene4 nameofgene4
I want R to find the names of the genes in gnames by looking in list_of_data.
> head(gnames)
[1] "zz_fgczcont0025" "zz_fgczcont0099" "zz_fgczcont0126" "zz_fgczcont0146"
[5] "at1g19570"       "zz_fgczcont0158"
> head(gdata) ## edited, big.
[[1]]
                 x
3  zz_fgczcont0025
4  zz_fgczcont0099
5  zz_fgczcont0126
6  zz_fgczcont0146
7      at1g19570.1
8  zz_fgczcont0158
9      at5g38480.1
10 zz_fgczcont0050
   x.8
3  gi|1346343|sp|p04264| k2c1_human keratin, type ii cytoskeletal 1 (cyto
4  sp|k1c9_human|
5  gi|71528|pir|| krhu0 keratin 10, type i, cytoskeletal (clone lambda-kh
6  sp|k22e_human|
7  | symbols: dhar1, atdhar1, dhar5 | dehydroascorbate reductase | chr1:6773462-6774413 reverse length=213
8  gi|88041|pir||a31994 keratin 10, type i, epidermal - human gi|623409 (
9  | symbols: grf3, rci1 | general regulatory factor 3 | chr5:15410277-15411285 forward length=255
10 gi|71536|pir|| krhu2 keratin, 67k type ii cytoskeletal - human (fragme
Try:
lapply(gdata, function(x) x[x[, 1] %in% gnames, 2])
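One thing to watch for in the sample output: gnames contains "at1g19570" while gdata contains "at1g19570.1", so an exact %in% match would miss that gene. A minimal sketch, assuming the trailing ".1" is a version suffix that can safely be ignored (that is an assumption, the question does not say so); the helper strip_version and the shortened names in column 2 are hypothetical:

```r
# Toy data mirroring the question's sample (names in column 2 are shortened)
gnames <- c("zz_fgczcont0025", "at1g19570")
gdata <- list(data.frame(
  x   = c("zz_fgczcont0025", "at1g19570.1", "at5g38480.1"),
  x.8 = c("keratin", "dehydroascorbate reductase", "general regulatory factor 3"),
  stringsAsFactors = FALSE
))

# Assumption: a trailing ".<digits>" is a version suffix and can be dropped
strip_version <- function(id) sub("\\.[0-9]+$", "", id)

# Same filter as above, but comparing the stripped identifiers
matched <- lapply(gdata, function(x) x[strip_version(x[, 1]) %in% gnames, 2])
```

If the suffix is meaningful in your data, keep the exact match instead.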
I tested it with the following data:
set.seed(123)
gnames <- sample(letters, 10)
num.rows <- sample(10:12)
gdata <- lapply(num.rows, function(i) data.frame(x = sample(letters, i),
                                                 name = sample(letters, i)))
Note that your code would benefit from gdata being stored in a better manner. From your sample data, it seems your colnames are x, x.8: you should use more meaningful names and use them in your code, rather than having to make assumptions about which column index contains what data (you forced me to use 1 instead of "gene", and 2 instead of "genename"). I imagine that by making gnames a data.frame with a gene column, you can rely on merge to do the work you are looking for.
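The merge idea could be sketched roughly like this. The column names gene and genename are the hypothetical names suggested above, and the toy data follows the question's gene1 / nameofgene1 example rather than the real files:

```r
# gnames as a data.frame with a "gene" column (hypothetical column name)
gnames <- data.frame(gene = c("gene1", "gene2", "gene3"),
                     stringsAsFactors = FALSE)

# One data.frame per file; gene1 appears in both files
gdata <- list(
  data.frame(gene = c("gene1", "gene5"),
             genename = c("nameofgene1", "nameofgene5"),
             stringsAsFactors = FALSE),
  data.frame(gene = c("gene2", "gene3", "gene1"),
             genename = c("nameofgene2", "nameofgene3", "nameofgene1"),
             stringsAsFactors = FALSE)
)

# Stack all files into one table and drop rows repeated across files
all_genes <- unique(do.call(rbind, gdata))

# Keep every gene in gnames; genes without a match get NA (all.x = TRUE)
result <- merge(gnames, all_genes, by = "gene", all.x = TRUE)
```

Using all.x = TRUE keeps genes that appear in none of the files, so missing names show up as NA instead of silently disappearing.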