Issue parsing HTML with jsoup for Java
I am trying to parse HTML using jsoup.
I used "Try jsoup" to check whether the HTML is parsed correctly.
Screenshot of the results: please open the link.
My code:
    URL url = new URL("http://tw.search.bid.yahoo.com/search/ac;_ylt=atqkyto06sgghho20hzmpex3_rf8?ei=utf-8&p=%e8%a1%a3%e6%9c%8d");
    Document doc;
    try {
        doc = Jsoup.parse(url, 3000);
        Elements descriptions = doc.select("div#srp_sl_result" + " div.att-item");
        for (Element element : descriptions) {
            System.out.println(element.ownText());
            System.out.println("--------------");
        }
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
But the results come back empty. I get the following output:
    --------------
    --------------
    --------------
I am expecting output like:
    女裝手套衣服*艾爾莎*暗釦長款披風式毛衣罩衫外套s~l【taa1166】 出價 799 元 直購 799 元 運費80元 | 30 次 | 剩 16小時 60分 賣家:艾爾莎時尚精品 (評價 25229) 在新北市
    ☆意樂舖☆【塑鋼衣架】abs強化多功能神奇魔術衣架(收納衣服.領帶.皮帶.肩帶) 出價 35 元 直購 35 元 運費 55元 | 8 次 | 1天 6小時 賣家:意樂舖(創意樂園小舖) (評價 14613) 在新北市
    happylife【yk1324】韓國超人氣乾濕兩用衣架 防滑魔術衣架 止滑衣架 衣服衣櫃衣櫥收納 出價 25 元 直購 25 元 運費70元 | 16 次 | 2天 3小時 賣家:happylife快樂生活網 (評價 14360) 在新北市
Here is a sample of the HTML from the search page:
    <div class="att-item item yui3-g " data-url="https://login.yahoo.com/config/login?.intl=tw&.pd=c%3d3chd7yq72e502eh4r99sguvi5q--&.done=https%3a%2f%2ftw.search.bid.yahoo.com%2fsearch%2fauction%2fproduct%3fei%3dutf-8%26p%3d%25e8%25a1%25a3%25e6%259c%258d&rr=2465463942">
      <div class="yui3-u">
        <div class="srp-pdimage">
          <a href="https://tw.page.bid.yahoo.com/tw/auction/e79010279;_ylt=apstmfiftkqpq2krnhqct3xyfbn8;_ylv=3">
            <img height="120" alt=" (dajin達錦衣服設計中心)棒壘球帽字凸繡200元,棒球帽,帽子,棒壘球服,棒球衣 " src="https://s.yimg.com/hg/ac/30/ea/e79010279-ac-4511xf9x0430x0600-s.jpg" />
          </a>
        </div>
      </div>
    </div>
What should I change in my code? How can I achieve this?
Please help me!
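You should use the text() method, not ownText(). As the documentation states, it:
Gets the combined text of this element and all its children.
ownText(), by contrast, returns only the text that belongs directly to the element itself. In the sample HTML, div.att-item keeps its content in child elements, so ownText() prints empty lines.
Here is an updated example: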
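    // Imports needed by the snippet
    import java.io.IOException;
    import java.net.MalformedURLException;
    import java.net.URL;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public static void main(String[] args) throws MalformedURLException {
        URL url = new URL(
                "http://tw.search.bid.yahoo.com/search/"
                + "ac;_ylt=atqkyto06sgghho20hzmpex3_rf8?ei=utf-8&p=%e8%a1%a3%e6%9c%8d");
        Document doc;
        try {
            // Fetch and parse the page, with a 3000 ms timeout
            doc = Jsoup.parse(url, 3000);
            Elements descriptions = doc.select("div#srp_sl_result div.att-item");
            for (Element element : descriptions) {
                // text() includes the text of all child elements
                System.out.println(element.text());
                System.out.println("--------------");
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }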
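To see why ownText() comes back empty while text() does not, here is a rough, dependency-free sketch of the same distinction. It uses the JDK's built-in XML DOM API as a stand-in, not jsoup itself, so it only mimics the behaviour. The class name, the helper methods ownText, text and parseRoot, and the sample markup are all hypothetical and made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class OwnTextVsText {

    // Rough analogue of jsoup's ownText(): only text nodes that are
    // direct children of the element are collected.
    static String ownText(Element element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString().trim();
    }

    // Rough analogue of jsoup's text(): text of the element and all of
    // its descendants, with runs of whitespace collapsed.
    static String text(Element element) {
        return element.getTextContent().trim().replaceAll("\\s+", " ");
    }

    // Hypothetical helper: parses a well-formed XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        // Made-up markup shaped like the att-item div: all visible text
        // lives in child elements, none directly in the outer div.
        String markup = "<div class=\"att-item\">"
                + "<div class=\"title\"><a href=\"#\">Sample item</a></div> "
                + "<div class=\"price\">799</div>"
                + "</div>";
        Element item = parseRoot(markup);
        System.out.println("ownText: [" + ownText(item) + "]");
        System.out.println("text:    [" + text(item) + "]");
    }
}
```

Running main prints an empty pair of brackets for ownText and "Sample item 799" for text, which matches the empty lines in the question: the outer div owns no text of its own.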