New Travels in Java Land, Part 34: XPath Operations with Dom4j

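Dom4j is one of the handiest tools for working with XML in Java; JDom is another. I remember sulking for a while back when I had a rough grasp of the JDom API and the project insisted on Dom4j. Looking back, that was quite unnecessary: spending a little time to step into a new world is well worth it, all the more so because JDom, as far as I could tell at the time, had stalled for lack of updates.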

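Using XPath with Dom4j takes two jars. The first is dom4j-1.6.1.jar, which provides the basic XML API, such as access to nodes and attributes.

The second is jaxen-1.1-beta-9.jar, which provides the XPath support. Without it on the classpath, the XPath calls shown below will not work.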

With that out of the way, here are the examples.

1. Accessing a group of specific nodes

XML sample:

<applications>
    <application name='chat'>
        <mtLanguage source='ar_ar' target='en_us' />
        <mtLanguage source='zh_cn' target='en_us' />
        <mtLanguage source='zh_tw' target='en_us' />
        <mtLanguage source='en_us' target='ar_ar' />
        <mtLanguage source='en_us' target='zh_cn' />
        <mtLanguage source='en_us' target='zh_tw' />
        <mtLanguage source='en_us' target='fr_fr' />
        <mtLanguage source='en_us' target='de_de' />
        <mtLanguage source='en_us' target='it_it' />
        <mtLanguage source='en_us' target='ja_jp' />
        <mtLanguage source='en_us' target='ko_kr' />
        <mtLanguage source='en_us' target='pt_br' />
        <mtLanguage source='en_us' target='ru_ru' />
        <mtLanguage source='en_us' target='es_es' />
        <mtLanguage source='fr_fr' target='en_us' />
        <mtLanguage source='de_de' target='en_us' />
        <mtLanguage source='it_it' target='en_us' />
        <mtLanguage source='ja_jp' target='en_us' />
        <mtLanguage source='ko_kr' target='en_us' />
        <mtLanguage source='pt_br' target='en_us' />
        <mtLanguage source='ru_ru' target='en_us' />
        <mtLanguage source='es_es' target='en_us' />
    </application>
    <application name='doc'>
        <mtLanguage source='ar_ar' target='en_us' />
        <mtLanguage source='zh_cn' target='en_us' />
        <mtLanguage source='zh_tw' target='en_us' />
        <mtLanguage source='en_us' target='ar_ar' />
        <mtLanguage source='en_us' target='zh_cn' />
        <mtLanguage source='en_us' target='zh_tw' />
        <mtLanguage source='en_us' target='fr_fr' />
        <mtLanguage source='en_us' target='de_de' />
        <mtLanguage source='en_us' target='hi_in' />
        <mtLanguage source='en_us' target='it_it' />
        <mtLanguage source='en_us' target='ja_jp' />
        <mtLanguage source='en_us' target='ko_kr' />
        <mtLanguage source='en_us' target='pt_br' />
        <mtLanguage source='en_us' target='ru_ru' />
        <mtLanguage source='en_us' target='es_es' />
        <mtLanguage source='en_us' target='ur_pk' />
        <mtLanguage source='fr_fr' target='en_us' />
        <mtLanguage source='de_de' target='en_us' />
        <mtLanguage source='hi_in' target='en_us' />
        <mtLanguage source='it_it' target='en_us' />
        <mtLanguage source='ja_jp' target='en_us' />
        <mtLanguage source='ko_kr' target='en_us' />
        <mtLanguage source='pt_br' target='en_us' />
        <mtLanguage source='ru_ru' target='en_us' />
        <mtLanguage source='es_es' target='en_us' />
        <mtLanguage source='ur_pk' target='en_us' />
    </application>
</applications>

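Now, suppose I want all the mtLanguage child nodes under the application node whose name attribute is chat. The XPath can be written like this:

//applications/application[@name='chat']/mtLanguage

The corresponding Java code is: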

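import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// responseXML holds the XML sample shown above
Document doc = DocumentHelper.parseText(responseXML);
// Note the plain single quotes inside the XPath; typographic quotes will not parse
List<?> elms = doc.selectNodes("//applications/application[@name='chat']/mtLanguage");
System.out.println("There are " + elms.size() + " language pairs available in text translation");

for (Object obj : elms) {
    Element elm = (Element) obj;
    System.out.println("From " + elm.attributeValue("source") + " to " + elm.attributeValue("target"));
}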

Running the code above prints:

There are 22 language pairs available in text translation
From ar_ar to en_us
From zh_cn to en_us
From zh_tw to en_us
From en_us to ar_ar
From en_us to zh_cn
From en_us to zh_tw
From en_us to fr_fr
From en_us to de_de
From en_us to it_it
From en_us to ja_jp
From en_us to ko_kr
From en_us to pt_br
From en_us to ru_ru
From en_us to es_es
From fr_fr to en_us
From de_de to en_us
From it_it to en_us
From ja_jp to en_us
From ko_kr to en_us
From pt_br to en_us
From ru_ru to en_us
From es_es to en_us
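
If the Dom4j and Jaxen jars are not at hand, the XPath expression itself can still be tried out with the XPath engine that ships with the JDK. The sketch below is a swap-in for checking the expression, not Dom4j's API: it uses javax.xml.xpath, a shortened version of the sample (two pairs under chat, one under doc), and a class name (LanguagePairDemo) and helper (pairs) that I made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LanguagePairDemo {
    // Shortened sample (hypothetical): two pairs under 'chat', one under 'doc'.
    static final String XML =
        "<applications>"
      + "<application name='chat'>"
      + "<mtLanguage source='zh_cn' target='en_us'/>"
      + "<mtLanguage source='en_us' target='zh_cn'/>"
      + "</application>"
      + "<application name='doc'>"
      + "<mtLanguage source='hi_in' target='en_us'/>"
      + "</application>"
      + "</applications>";

    // Evaluates the XPath with the JDK engine and returns each match as "source->target".
    static List<String> pairs(String xml, String xpath) {
        try {
            org.w3c.dom.Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                result.add(e.getAttribute("source") + "->" + e.getAttribute("target"));
            }
            return result;
        } catch (Exception ex) {
            // Wrap checked parser/XPath exceptions to keep the helper easy to call.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Only the mtLanguage nodes under the 'chat' application are selected.
        System.out.println(pairs(XML, "//applications/application[@name='chat']/mtLanguage"));
    }
}
```

Changing 'chat' to 'doc' in the predicate selects the other application's children instead, which is a quick way to confirm the predicate is doing the filtering.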

2. Accessing a specific node

XML sample:

<rep sts="OK" a="trep" tl="zh-CN">
    <docs>
        <d dt="ndoc" did="d20160223213120480009045125076363146" lang="en-US"
            ctime="2016-02-23T21:31:20" mtime="2016-02-23T21:31:20" orig="1"
            mime="text/x-mt-xml" wc="2">
            <p pid="1" wc="2">
                <s sid="1">
                    <t tid="1" tt="orig" wc="2">Good evening</t>
                </s>
            </p>
        </d>
        <d dt="ndoc" did="d20160223213120480009045125076363146" lang="zh-CN"
            ctime="2016-02-23T21:31:20" mtime="2016-02-23T21:31:20" orig="0"
            mime="text/x-mt-xml" sc="100.00" wc="1">
            <p pid="1" wc="1">
                <s sid="1">
                    <t tid="1" tt="mt" src="mt" sc="100.00" wc="1">晚上好</t>
                </s>
            </p>
        </d>
    </docs>
</rep>

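To get the text "晚上好" (the translation of "Good evening") from the sample above, the XPath can be written like this:

//rep/docs/d[last()]/p/s/t

The corresponding Java code is: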

Document doc = DocumentHelper.parseText(responseXML);
// d[last()] picks the last d element under docs, which holds the translation
Element elm = (Element) doc.selectSingleNode("//rep/docs/d[last()]/p/s/t");
String targetTxt = elm.getText();
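
To see what the last() predicate is doing, it helps to compare it with a positional predicate such as d[1]. The sketch below again uses the JDK's javax.xml.xpath rather than Dom4j, so it is a way to test the expression, not the Dom4j call itself. The XML is a trimmed version of the sample, and the class name (LastNodeDemo) and helper (text) are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class LastNodeDemo {
    // Trimmed sample: the original text comes first, the translation last.
    // The escaped string in the second d element is the three characters of the translated text.
    static final String XML =
        "<rep sts='OK'><docs>"
      + "<d lang='en-US' orig='1'><p><s><t>Good evening</t></s></p></d>"
      + "<d lang='zh-CN' orig='0'><p><s><t>\u665a\u4e0a\u597d</t></s></p></d>"
      + "</docs></rep>";

    // Returns the string value of the first node matched by the XPath.
    static String text(String xml, String xpath) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // d[1] is the first d element, d[last()] the final one.
        System.out.println("first: " + text(XML, "//rep/docs/d[1]/p/s/t"));
        System.out.println("last:  " + text(XML, "//rep/docs/d[last()]/p/s/t"));
    }
}
```

One thing worth keeping in mind: d[last()] depends on the translation always being the final d element. If the response format could change, a predicate on an attribute (for instance lang or orig in this sample) may be a sturdier choice.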

3. Reading an attribute

XML sample:

<rep sts="OK" a="trep" tl="zh-CN">
    <docs>1</docs>
</rep>

To read the sts attribute of the root node rep, the XPath can be written like this:

//rep/@sts

The corresponding Java code is:

System.out.println("XML="+responseXML);
Document doc= DocumentHelper.parseText(responseXML);
Attribute attr = (Attribute) doc.selectSingleNode("//rep/@sts");      

return attr.getText();
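
A question the snippet above leaves open is what happens when the attribute is absent. As far as I know, Dom4j's selectSingleNode returns null when nothing matches, so attr.getText() would throw a NullPointerException and a null check is advisable. The sketch below illustrates the "attribute present vs. missing" cases with the JDK's javax.xml.xpath instead of Dom4j (a swap-in for trying the expression); the class name (AttributeDemo), the helper (value), and the nope attribute are made up for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class AttributeDemo {
    // Same shape as the sample above.
    static final String XML = "<rep sts='OK' a='trep' tl='zh-CN'><docs>1</docs></rep>";

    // Returns the string value of the first match; XPath yields "" for an empty result.
    static String value(String xml, String xpath) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(xpath, new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // An existing attribute gives its value.
        System.out.println("sts=" + value(XML, "//rep/@sts"));
        // A missing attribute (hypothetical name) gives an empty string here, not null.
        System.out.println("missing=[" + value(XML, "//rep/@nope") + "]");
    }
}
```

Note the difference in behavior: the JDK string evaluation returns an empty string for a missing attribute, whereas the Dom4j snippet works with a node object that may be null.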
Time: 2024-08-09 10:44:21
