Extracting content from HTML tags using Java


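I extracted data from an HTML page and parsed the tags that contain other tags. I have tried different approaches, such as extracting substrings, to pull out the title and href attributes, but it's not working. Can anyone help me? Here is my code and a small snippet of the output.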

My code:

    Document doc = Jsoup.connect("myurl").get();
    Elements link = doc.select("a[href]");
    String stringLink = null;
    for (int i = 0; i < link.size(); i++) {
        stringLink = link.toString();
        System.out.println(stringLink);
    }

Output:

    <a class="link" title="waf ad" href="https://www.facebook.com/waf.ad.54" data-jsid="anchor" target="_blank"><img class="_s0 _rw img" src="https://fbcdn-profile-a.akamaihd.net/hprofile-ak-ash1/t5/186729_100007938933785_508764241_q.jpg" alt="waf ad" data-jsid="img" /></a>
    <a class="link" title="ana ga" href="https://www.facebook.com/ata.ga.31392410" data-jsid="anchor" target="_blank"><img class="_s0 _rw img" src="https://fbcdn-profile-a.akamaihd.net/hprofile-ak-ash1/t5/186901_100002334679352_162381693_q.jpg" alt="ana ga" data-jsid="img" /></a>

You can use the attr() method of the Element class to extract the values of attributes.

For example:

    String href = link.attr("href");
    String title = link.attr("title");
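Note that the loop in the question calls toString() on the whole Elements collection, so every iteration prints all matched anchors instead of one. A minimal sketch of iterating over each Element and reading its attributes might look like the following. It assumes jsoup is on the classpath. The class name LinkAttributes, the helper method extract, and the inline HTML string (standing in for the real page) are hypothetical and only for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of jsoup.
public class LinkAttributes {

    // Returns one {title, href} pair for every <a> element that has an href attribute.
    static List<String[]> extract(String html) {
        // Parse from a string here; with a live page you would use Jsoup.connect(url).get().
        Document doc = Jsoup.parse(html);
        Elements links = doc.select("a[href]");

        List<String[]> result = new ArrayList<>();
        for (Element link : links) {
            // attr() returns an empty string when the attribute is missing.
            result.add(new String[] { link.attr("title"), link.attr("href") });
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the markup shown in the question.
        String html =
            "<a class=\"link\" title=\"waf ad\" href=\"https://www.facebook.com/waf.ad.54\" target=\"_blank\"></a>"
          + "<a class=\"link\" title=\"ana ga\" href=\"https://www.facebook.com/ata.ga.31392410\" target=\"_blank\"></a>";

        for (String[] pair : extract(html)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Working with one Element at a time avoids any substring handling on the raw HTML, since jsoup has already parsed the attributes.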

See this page for more: Extract attributes, text, and HTML from elements (jsoup cookbook)

