Python: append to mysql command depending on list length prior to execution? -
i want append conditional clauses mysql command prior execution vary based on number of elements want pull db.
for example, have massive list of genes , each gene of interest have coordinates of of exons.
gene_id exon start end
genea exon1 325 359
genea exon2 554 601
geneb exon1 870 900
geneb exon2 990 1010
geneb exon3 1200 1350
as can see genea has 2 exons , geneb has 3 exons. want execute command such following return count of elements db within exons coordinates.
select count(*) db_x position between exon1_start , exon1_end , position between exon2_start , exon2_end;
because of different number of exons in each gene (some genes can contain tens of exons), need append additional "and position between exon_end , exon_start" conditional statement each , every exon before execution of overall command.
i'm struggling come logical solution this. @ moment each gene, i'm passing list of concatenated start_end positions function queries mysql server. eg genea ['325..359','554..601']
the command i'm using (after setting cursor up) follows;
cur.execute('select count(*) db_x position between '+str(exon1_start)+' , '+str(exon1_end)+' , position between +'str(exon2_start)+' , '+str(exon2_end))
this fine if 1 or 2 exons, how can deal potentially long lists of exons? how can dynamically reformat command before executing?
would appreciate i'm stumped!!!
you dynamically build dictionary stores exons data , build sql queries based on such dictionary:
gene_exons_dict = dict() def add_exon_to_gene(gene_name,gene_exon): if gene_name not in gene_exons_dict: gene_exons_dict[gene_name] = [] tmp_exons_dict = {'start': gene_exon[0], 'end': gene_exon[1]} gene_exons_dict[gene_name].append(tmp_exons_dict) def get_sql_query_for_gene(gene_name): if gene_name not in gene_exons_dict: raise exception("exons not defined gene: %s" % gene_name) sql_query = 'select count(*) db_x.tb_y ' exons_count = len(gene_exons_dict[gene_name]) exon_data in gene_exons_dict[gene_name]: sql_query += '(position >'+str(exon_data['start'])+' , position <'+str(exon_data['end'])+')' if exons_count > 1: sql_query += ' , ' # change or if query should return sum of sets instead of intersection # removing last , multi-exons case if exons_count > 1: sql_query=sql_query[:-5] return sql_query if __name__ == '__main__': add_exon_to_gene('gene1', [1, 2]) add_exon_to_gene('gene1', [3, 8]) add_exon_to_gene('gene1', [10, 15]) add_exon_to_gene('gene2', [20, 25]) print get_sql_query_for_gene('gene1') print get_sql_query_for_gene('gene2')
which gives output:
c:\tmp>python dynamicdicttest.py select count(*) db_x.tb_y (position >1 , position <2) , (position >3 , position <8) , (position >10 , position <15) select count(*) db_x.tb_y (position >20 , position <25)
Comments
Post a Comment