Example:
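# coding: utf-8
from urllib.request import urlopen

from lxml import etree

# Fetch the page and decode it as UTF-8
url = urlopen('http://www.baidu.com').read().decode('utf-8')
htm = etree.HTML(url)            # root element of the parsed page
htree = etree.ElementTree(htm)   # tree wrapper, needed for getpath()
print(htree)
print(htm.iter())
# Print each element's text content and XPath path in turn
for t in htm.iter():
    print(t.text)
    print(htree.getpath(t))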
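The same traversal can be tried without network access. The following is a minimal sketch in Python 3, assuming a small inline document in place of the downloaded page; `SAMPLE_HTML` and `element_paths` are hypothetical names introduced only for illustration. It restricts the iteration to real elements by passing `etree.Element` to `iter()`, so comments and processing instructions are skipped.

```python
from lxml import etree

# Hypothetical sample markup, standing in for a downloaded page.
SAMPLE_HTML = "<html><body><div><p>first</p><p>second</p></div></body></html>"


def element_paths(markup):
    """Return a list of (xpath, text) pairs, one per element, in document order."""
    root = etree.HTML(markup)          # parse the string, get the root element
    tree = etree.ElementTree(root)     # getpath() lives on the tree, not the element
    # Passing etree.Element limits the iteration to elements only.
    return [(tree.getpath(el), el.text) for el in root.iter(etree.Element)]


pairs = element_paths(SAMPLE_HTML)
for path, text in pairs:
    print(path, text)
```

Repeated sibling tags get a positional index in the path returned by `getpath()`, which makes each path usable as an XPath expression that selects exactly that element.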
Time: 2024-12-15 23:58:12