algorithm - How to find all subsets of a multiset that are in a given set? -
say have set d
of multisets:
d = { {d, g, o}, {a, e, t, t}, {a, m, t}, }
given multiset m
, like
m = {a, a, m, t}
i algorithm f
give me elements of d
subsets (or more precisely, "submultisets") of m
:
f = {{a, m, t}}
if 1 such query, scanning on elements of d
(in o(#d)
time) optimal. if want answer many such queries same d
, different m
, might able make faster preprocessing d
smarter data structure.
we toss of d
hashtable , iterate on possible subsets of m
, looking each in hashtable, that's o(2^#m)
. fine small m
, not fine medium large m
.
is possible in polynomial time in #m
? or maybe reduce known np-complete problem one, prove it's impossible fast?
edit: realized in worst case, need output of d
, #d
still going appear in time complexity. let's assume size of output bounded constant.
here quick implementation of ternarysearchtree (tst) can in problem. number of years ago, inspired article in drdobbs. can read more @ http://www.drdobbs.com/database/ternary-search-trees/184410528. provides background tst , performance analysis.
in problem description example, d dictionary containing "dgo","aett" , "amt" keys. values identical keys.
m search string says "give me values in dictionary keys containing subset or of these alphabets". order of characters not important. character '.' used wildcard in search.
for given m, algorithm , data structure not require @ elements of d. in respect fast. have done tests on number of nodes visited , of time, number of nodes visited small fraction of total nodes in dictionary keys not found.
this algorithm optionally allows enter minimum , maximum length limits keys returned dictionary.
sorry lengthy code complete able test.
import java.util.arraylist; import java.io.*; public class tsttree<t> { private tstnode<t> root; private int size = 0; private int totalnodes = 0; public int getsize() { return size; } public int gettotalnodes() { return totalnodes; } public tsttree() { } public tstnode<t> getroot() { return root; } public void insert(string key, t value) { if(key==null || key.length()==0) return; char[] keyarray = key.tochararray(); if(root==null) root = new tstnode<t>(keyarray[0]); tstnode<t> currentnode = root; tstnode<t> parentnode = null; int d = 0; int = 0; while(currentnode!=null) { parentnode = currentnode; d = keyarray[i] - currentnode.getsplitchar(); if(d==0) { if((++i) == keyarray.length) // found key { if(currentnode.getvalue()!=null) system.out.println(currentnode.getvalue() + " replaced " + value); else size++; currentnode.setvalue(value); // replace value @ node return; } else currentnode = currentnode.geteqchild(); } else if(d<0) currentnode = currentnode.getlochild(); else currentnode = currentnode.gethichild(); } currentnode = new tstnode<t>(keyarray[i++]); totalnodes++; if(d==0) parentnode.seteqchild(currentnode); else if(d<0) parentnode.setlochild(currentnode); else parentnode.sethichild(currentnode); for(;i<keyarray.length;i++) { tstnode<t> tempnode = new tstnode<t>(keyarray[i]); totalnodes++; currentnode.seteqchild(tempnode); currentnode = tempnode; } currentnode.setvalue(value); // insert value @ node size++; } public arraylist<t> find(string charstosearch) { return find(charstosearch,1,charstosearch.length()); } // return values keys between minlen , maxlen containing "charstosearch". public arraylist<t> find(string charstosearch, int minlen, int maxlen) { arraylist<t> list = new arraylist<t>(); char[] chararray = charstosearch.tochararray(); int[] charfreq = new int[256]; for(int i=0;i<chararray.length;i++) charfreq[chararray[i]]++; maxlen = chararray.length>maxlen ? maxlen : chararray.length; pmsearch(root,charfreq,minlen,maxlen,1, list); return list; } public void pmsearch(tstnode<t> node, int[] charfreq, int minlen, int maxlen, int depth, arraylist<t> list) { if(node==null) return; char c = node.getsplitchar(); if(issmaller(charfreq,c)) pmsearch(node.getlochild(),charfreq,minlen,maxlen,depth,list); if(charfreq[c]>0) { if(depth>=minlen && node.getvalue()!=null) list.add(node.getvalue()); if(depth<maxlen) { charfreq[c]--; pmsearch(node.geteqchild(),charfreq,minlen,maxlen,depth+1,list); charfreq[c]++; } } else if(charfreq['.']>0) { // wildcard if(depth>=minlen && node.getvalue()!=null) list.add(node.getvalue()); if(depth<maxlen) { charfreq['.']--; pmsearch(node.geteqchild(),charfreq,minlen,maxlen,depth+1,list); charfreq['.']++; } } if(isgreater(charfreq,c)) pmsearch(node.gethichild(),charfreq,minlen,maxlen,depth,list); } private boolean isgreater(int[] charfreq, char c) { if(charfreq['.']>0) return true; boolean retval = false; for(int i=c+1;i<charfreq.length;i++) { if(charfreq[i]>0) { retval = true; break; } } return retval; } private boolean issmaller(int[] charfreq, char c) { if(charfreq['.']>0) return true; boolean retval = false; for(int i=c-1;i>-1;i--) { if(charfreq[i]>0) { retval = true; break; } } return retval; } }
below small test program. test program inserts 4 key/value pairs in example in exact order. if have d lot of elements, best sort first , build dictionary in tournament fashion (ie. insert middle element, recursively populate left half , right half). ensure tree balanced.
import org.junit.*; import org.junit.runner.*; import java.io.*; import java.util.*; import org.junit.runner.notification.failure; public class mytest { static tsttree<string> dictionary = new tsttree<string>(); @beforeclass public static void initialize() { dictionary.insert("dgo","dgo"); dictionary.insert("aett","aett"); dictionary.insert("amt","amt"); } @test public void testmethod() { system.out.println("testmethod"); arraylist<string> result = dictionary.find("aamt"); system.out.println("results: "); for(iterator it=result.iterator();it.hasnext();) { system.out.println(it.next()); } } @test // test wildcard finds "dgo" value public void testmethod1() { system.out.println("testmethod1"); arraylist<string> result = dictionary.find("aamtdo."); system.out.println("results: "); for(iterator it=result.iterator();it.hasnext();) { system.out.println(it.next()); } } public static void main(string[] args) { result result = junitcore.runclasses(mytest.class); (failure failure : result.getfailures()) { system.out.println(failure.tostring()); } } }
Comments
Post a Comment