python - Get Title and Description of external URL using Django
I'd like to know how I can extract the title and meta description of an external site using its URL. The solutions I've found are not for Django/Python.
Currently my code adds the link to the database. I'd like it to go to the link after it has been added and update the entry with the corresponding title and meta description.
It would also be nice to be able to retrieve og tags such as meta property="og:url".
Thank you.
To access the title or description of an external site, you have to do 2 things:
1) You need to fetch the HTML of the external site.
2) You need to parse the HTML and get the title element and the meta elements.
The first part is easy:
    import urllib2  # Python 2; on Python 3 the equivalent module is urllib.request
    opener = urllib2.build_opener()
    external_sites_html = opener.open(external_sites_url).read()
The second part is more difficult: you need to use an external library to parse the HTML. I use a library called BeautifulSoup because it has a nice API. (It is easy for programmers to use.)
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(external_sites_html, "html.parser")
    # You can now get any of the external site's tags from the soup variable.
    title = soup.title.string
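The question also asks about the meta description and og tags, which the snippet above does not cover. Here is a minimal sketch of how they might be read with BeautifulSoup. The helper name extract_metadata and the sample HTML are made up for illustration, and the sketch assumes Python 3 with the bs4 package installed.

```python
from bs4 import BeautifulSoup


def extract_metadata(html):
    """Return the title, meta description, and og:* tags found in html.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")

    # soup.title is None when the page has no <title> element.
    title = None
    if soup.title and soup.title.string:
        title = soup.title.string.strip()

    # The description lives in <meta name="description" content="...">.
    description_tag = soup.find("meta", attrs={"name": "description"})
    description = description_tag.get("content") if description_tag else None

    # Open Graph tags look like <meta property="og:url" content="...">.
    og = {}
    for tag in soup.find_all("meta"):
        prop = tag.get("property") or ""
        if prop.startswith("og:") and tag.get("content") is not None:
            og[prop] = tag["content"]

    return {"title": title, "description": description, "og": og}


# Sample page, invented for this example.
sample_html = """
<html><head>
<title>Example page</title>
<meta name="description" content="A short summary.">
<meta property="og:url" content="http://example.com/page">
<meta property="og:title" content="Example OG title">
</head><body></body></html>
"""

metadata = extract_metadata(sample_html)
```

Any of the values can be missing on a real page, so the code that saves them to the database should be prepared for None and for an empty og dictionary.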
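However, it is important to remember that the external site may be slow to respond, or may not respond at all, when you fetch it. So it is wise to create the external site's record in the database and return the reply to the user right away. Then, in another process, you should go and fetch the URL and add the information to the database. If it's important that the information is returned in the reply, you cannot do this in the background and you have to make the user wait.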
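The save-first, fetch-later idea could be sketched roughly like this. Everything here is an assumption made for illustration: the links dictionary stands in for the database table, add_link and worker are hypothetical names, and fake_fetch_title replaces the real fetch-and-parse step so the sketch runs without network access. A background thread is used only to keep the example short; a real Django project would more likely hand the work to a separate worker process.

```python
import queue
import threading

# Stand-in for the database table: url -> record. Hypothetical; a Django
# project would use a model here instead.
links = {}
pending = queue.Queue()


def add_link(url):
    """Save the bare record and return at once, like the reply to the user."""
    links[url] = {"title": None, "fetched": False}
    pending.put(url)
    return links[url]


def worker(fetch_title):
    """Fill in titles in the background; fetch_title(url) does the slow part."""
    while True:
        url = pending.get()
        if url is None:  # sentinel value: stop the worker
            pending.task_done()
            break
        try:
            links[url]["title"] = fetch_title(url)
        except Exception:
            # The external site may be slow or unreachable; keep the title empty.
            pass
        finally:
            links[url]["fetched"] = True
            pending.task_done()


def fake_fetch_title(url):
    # Placeholder for the fetch-and-parse step, so the sketch runs offline.
    return "Title of " + url


thread = threading.Thread(target=worker, args=(fake_fetch_title,), daemon=True)
thread.start()

record = add_link("http://example.com/")  # returns before the title is known
pending.join()      # demo only: wait until the worker has processed the link
pending.put(None)   # ask the worker to stop
thread.join()
```

After the worker has run, links["http://example.com/"] holds the title, while the call to add_link itself never waited on the external site.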