python - Get Title and Description of external URL using Django


I would like to know how I can extract the title and meta description of an external site using its URL. I've found solutions, but not for Django/Python.

Currently my code adds the link to the database. I would like it to go to the link after it has been added and update the entry with the corresponding title and meta description.

It would be nice to also be able to retrieve OG tags such as meta property="og:url".

Thank you.

To access the title or description of an external site you have to do 2 things.

1) You need to fetch the HTML of the external site.
2) You need to parse the HTML and get the title element and the meta elements.

The first part is easy:

    import urllib.request  # Python 3; on Python 2 the same calls live in urllib2

    opener = urllib.request.build_opener()
    external_sites_html = opener.open(external_sites_url).read()

The second part is more difficult, because you need to use an external library to parse the HTML. A good choice is a library called BeautifulSoup because it has a nice API. (It is easy for programmers to use.)

    from bs4 import BeautifulSoup

    soup = BeautifulSoup(external_sites_html, "html.parser")
    # Any tag of the external site can now be reached through the soup variable.
    title = soup.title.string
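The question also asks for the meta description and OG tags. Here is a rough sketch of that part which uses only the standard library's html.parser instead of BeautifulSoup, so it runs without installing anything. The names MetadataParser and extract_metadata are made up for this example, and it assumes the HTML has already been decoded to a str.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects the <title>, the meta description and Open Graph tags."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.description = None
        self.og = {}  # e.g. {"og:url": "..."}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            content = attrs.get("content")
            if content is None:
                return
            name = (attrs.get("name") or "").lower()
            prop = (attrs.get("property") or "").lower()
            if name == "description" and self.description is None:
                self.description = content
            elif prop.startswith("og:"):
                # Keep the first value seen for each og:* property.
                self.og.setdefault(prop, content)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            if self.title is None:
                self.title = "".join(self._title_parts).strip()

    def handle_data(self, data):
        # Title text can arrive in several chunks, so collect them all.
        if self._in_title:
            self._title_parts.append(data)


def extract_metadata(html):
    """Return the title, description and og:* tags found in an HTML string."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "description": parser.description,
        "og": parser.og,
    }
```

If you stay with BeautifulSoup, the equivalent lookups would be along the lines of soup.find("meta", attrs={"name": "description"}) and soup.find("meta", attrs={"property": "og:url"}), reading the "content" attribute of whatever tag comes back and checking for None first, since many pages omit these tags.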

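However, it is important to remember that the external site may be slow to respond when you fetch it, so it would be wise to create the record for the external site in the database first and then return the reply to the user. In another process, you should then go and fetch the URL and add the information to the database. If it's important that the information is returned in the reply, you cannot do this in the background and will have to make the user wait.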
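A minimal sketch of the "save first, fetch later" idea follows. It makes several assumptions: a plain dict stands in for the database table, a thread pool stands in for the separate process (in a real Django project you would more likely hand the job to a task queue), and parse_title is a deliberately crude placeholder for a real HTML parser. All of the function names are hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Stand-in for the database table; in Django this would be a model.
links = {}
executor = ThreadPoolExecutor(max_workers=4)

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def fetch_html(url, timeout=10):
    """Download a page; the timeout stops a slow site from blocking forever."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def parse_title(html):
    """Crude placeholder: a real project should use a proper HTML parser."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def fill_in_metadata(link_id, fetch=fetch_html):
    """Runs in a worker thread: fetch the page and update the stored record."""
    record = links[link_id]
    try:
        record["title"] = parse_title(fetch(record["url"]))
        record["status"] = "done"
    except Exception as exc:  # network errors, timeouts, bad URLs
        record["status"] = "failed"
        record["error"] = str(exc)


def add_link(url, fetch=fetch_html):
    """Save the link right away and schedule the slow fetch for later."""
    link_id = len(links) + 1
    links[link_id] = {"url": url, "title": None, "status": "pending"}
    future = executor.submit(fill_in_metadata, link_id, fetch)
    return link_id, future
```

The view that handles the form would call add_link and reply immediately; the record stays "pending" until the worker finishes. If the reply has to contain the title, you would call future.result() before answering, which is exactly the case where the user has to wait.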

