Python XML解析和处理

movies.xml

<collection shelf = "New Arrivals">
<movie title = "Enemy Behind">
   <type>War, Thriller</type>
   <format>DVD</format>
   <year>2013</year>
   <rating>PG</rating>
   <stars>10</stars>
   <description>Talk about a US-Japan war</description>
</movie>
<movie title = "Transformers">
   <type>Anime, Science Fiction</type>
   <format>DVD</format>
   <year>1989</year>
   <rating>R</rating>
   <stars>8</stars>
   <description>A schientific fiction</description>
</movie>
   <movie title = "Trigun">
   <type>Anime, Action</type>
   <format>DVD</format>
   <episodes>4</episodes>
   <rating>PG</rating>
   <stars>10</stars>
   <description>Vash the Stampede!</description>
</movie>
<movie title = "Ishtar">
   <type>Comedy</type>
   <format>VHS</format>
   <rating>PG</rating>
   <stars>2</stars>
   <description>Viewable boredom</description>
</movie>
</collection>

使用SAX API解析XML

#!/usr/bin/python3

import xml.sax

class MovieHandler( xml.sax.ContentHandler ):
   def __init__(self):
      self.CurrentData = ""
      self.type = ""
      self.format = ""
      self.year = ""
      self.rating = ""
      self.stars = ""
      self.description = ""

   # Call when an element starts
   def startElement(self, tag, attributes):
      self.CurrentData = tag
      if tag == "movie":
         print ("*****Movie*****")
         title = attributes["title"]
         print ("Title:", title)

   # Call when an elements ends
   def endElement(self, tag):
      if self.CurrentData == "type":
         print ("Type:", self.type)
      elif self.CurrentData == "format":
         print ("Format:", self.format)
      elif self.CurrentData == "year":
         print ("Year:", self.year)
      elif self.CurrentData == "rating":
         print ("Rating:", self.rating)
      elif self.CurrentData == "stars":
         print ("Stars:", self.stars)
      elif self.CurrentData == "description":
         print ("Description:", self.description)
      self.CurrentData = ""

   # Call when a character is read
   def characters(self, content):
      if self.CurrentData == "type":
         self.type = content
      elif self.CurrentData == "format":
         self.format = content
      elif self.CurrentData == "year":
         self.year = content
      elif self.CurrentData == "rating":
         self.rating = content
      elif self.CurrentData == "stars":
         self.stars = content
      elif self.CurrentData == "description":
         self.description = content

if ( __name__ == "__main__"):

   # create an XMLReader
   parser = xml.sax.make_parser()
   # turn off namepsaces
   parser.setFeature(xml.sax.handler.feature_namespaces, 0)

   # override the default ContextHandler
   Handler = MovieHandler()
   parser.setContentHandler( Handler )

   parser.parse("movies.xml")

输出

*****Movie*****
Title: Enemy Behind
Type: War, Thriller
Format: DVD
Year: 2003
Rating: PG
Stars: 10
Description: Talk about a US-Japan war
*****Movie*****
Title: Transformers
Type: Anime, Science Fiction
Format: DVD
Year: 1989
Rating: R
Stars: 8
Description: A schientific fiction
*****Movie*****
Title: Trigun
Type: Anime, Action
Format: DVD
Rating: PG
Stars: 10
Description: Vash the Stampede!
*****Movie*****
Title: Ishtar
Type: Comedy
Format: VHS
Rating: PG
Stars: 2
Description: Viewable boredom

使用DOM API解析XML

#!/usr/bin/python3

from xml.dom.minidom import parse
import xml.dom.minidom

# Open XML document using minidom parser
DOMTree = xml.dom.minidom.parse("movies.xml")
collection = DOMTree.documentElement
if collection.hasAttribute("shelf"):
   print ("Root element : %s" % collection.getAttribute("shelf"))

# Get all the movies in the collection
movies = collection.getElementsByTagName("movie")

# Print detail of each movie.
for movie in movies:
   print ("*****Movie*****")
   if movie.hasAttribute("title"):
      print ("Title: %s" % movie.getAttribute("title"))

   type = movie.getElementsByTagName(‘type‘)[0]
   print ("Type: %s" % type.childNodes[0].data)
   format = movie.getElementsByTagName(‘format‘)[0]
   print ("Format: %s" % format.childNodes[0].data)
   rating = movie.getElementsByTagName(‘rating‘)[0]
   print ("Rating: %s" % rating.childNodes[0].data)
   description = movie.getElementsByTagName(‘description‘)[0]
   print ("Description: %s" % description.childNodes[0].data)

输出

Root element : New Arrivals
*****Movie*****
Title: Enemy Behind
Type: War, Thriller
Format: DVD
Rating: PG
Description: Talk about a US-Japan war
*****Movie*****
Title: Transformers
Type: Anime, Science Fiction
Format: DVD
Rating: R
Description: A schientific fiction
*****Movie*****
Title: Trigun
Type: Anime, Action
Format: DVD
Rating: PG
Description: Vash the Stampede!
*****Movie*****
Title: Ishtar
Type: Comedy
Format: VHS
Rating: PG
Description: Viewable boredom

原文地址:https://www.cnblogs.com/sea-stream/p/10178995.html

时间: 2024-08-27 07:11:37

Python XML解析和处理的相关文章

Python XML解析

Python XML解析 什么是XML? XML 指可扩展标记语言(eXtensible Markup Language). 你能够通过本站学习XML教程 XML 被设计用来传输和存储数据. XML是一套定义语义标记的规则,这些标记将文档分成很多部件并对这些部件加以标识. 它也是元标记语言.即定义了用于定义其它与特定领域有关的.语义的.结构化的标记语言的句法语言. python对XML的解析 常见的XML编程接口有DOM和SAX,这两种接口处理XML文件的方式不同.当然使用场合也不同. pyth

Python XML 解析

什么是 XML? XML 指可扩展标记语言(eXtensible Markup Language). XML 被设计用来传输和存储数据. XML 是一套定义语义标记的规则,这些标记将文档分成许多部件并对这些部件加以标识. 它也是元标记语言,即定义了用于定义其他与特定领域有关的.语义的.结构化的标记语言的句法语言. Python 对 XML 的解析 常见的 XML 编程接口有 DOM 和 SAX,这两种接口处理 XML 文件的方式不同,当然使用场合也不同. Python 有三种方法解析 XML,S

python xml解析例子

# -*- coding: utf-8 -*- """ Created on Thu Apr 16 23:18:27 2015 @author: shifeng """ ''' 功能:解析CDR_sample.xml文件,输出格式为DNorm接收的格式,并将训练集的"label"写入到文档中 xml文件:见CSDN资源共享 参考博客:http://www.cnblogs.com/fnng/p/3581433.html '''

python xml解析和生成

解析使用xml.etree.ElementTree 模块,生成使用xml.dom.minidom模块,  ElementTree比dom快,dom生成简单且会自动格式化. <?xml version='1.0' encoding='utf-8'?> <baspools> <bas> <basprovider>0</basprovider> <portal_version>1</portal_version> <tim

Python xml 解析百度糯米信息

先利用爬虫利用百度糯米提供的api来采集北京当天的团购信息,保存为numi.html import xml.etree.ElementTree as ET import os class Nuomi():        def __init__(self):                self.numi=[]    def Parse(self,filepath): tree=ET.parse(filepath)        root =tree.getroot()        for

python xml解析之 xml.etree.ElementTree

import xml.etree.ElementTree as tree import io xml = '''<?xml version="1.0"?> <data> <country name="Liechtenstein"> <rank updated="yes">2</rank> <year>2008</year> <gdppc>141100&

[Python3]XML解析处理 - Element Tree

概述 本文就是python xml解析进行讲解,在python中解析xml有很多种方法,本文通过实例来讲解如何使用ElementTree来解析xml.对于其他的xml解析方法,请自行去查找资料. 请注意,本文不是ElementTree手册,不会将所有的特性进行演示,笔者从实际用到的一些关键特性进行实例演示,对于其他特性,大家可以参见官方文档学习和了解: https://docs.python.org/3/library/xml.etree.elementtree.html 什么是ElementT

使用python如何解析XML?

本文和大家分享的主要是使用python解析XML相关内容,一起来看看吧,希望对大家学习python有所帮助. 什么是XML? XML 指可扩展标记语言(e X tensible M arkup L anguage). XML 被设计用来传输和存储数据. XML是一套定义语义标记的规则,这些标记将文档分成许多部件并对这些部件加以标识. 它也是元标记语言,即定义了用于定义其他与特定领域有关的.语义的.结构化的标记语言的句法语言. python对XML的解析 常见的XML编程接口有DOM和SAX,这两

python DOM解析XML

#conding:utf-8 # -*- coding:utf-8 -*- __author__ = 'hdfs' """ XML 解析 :DOM解析珍整个文档作为一个可遍历的对象 提交给应用程序,dom解析会将文档全部load进内存,这样对于大型的xml可能性能不多好. """ import pprint import xml.dom.minidom from xml.dom.minidom import Node doc=xml.dom.mi