Grails Cookbook - A collection of tutorials and examples

Groovy XmlParser Examples for Parsing XML

Parsing XML is a common task in many projects. The previous post showed how to parse XML documents using XmlSlurper. This post gives examples of how to parse documents using XmlParser.

Simple XmlParser Example

If the XML to be parsed is stored in a String, it can be parsed as follows:

package test
/**
 * A simple application that parses a String that contains XML information using XmlParser.
 */
class Test {
    static stringXML = '<person>'+
        '<firstName>John</firstName><lastName>Doe</lastName><age>25</age>'+
        '</person>'
    static main(args) {
        def person = new XmlParser().parseText(stringXML)
        println "first name: ${person.firstName.text()}"
        println "last name: ${person.lastName.text()}"
        println "age: ${person.age.text()}"
    }
}

The code is short and straightforward. There is not much plumbing involved: we just instantiate XmlParser and use it right away. To read a value, we call text() on the corresponding property, as shown. Below is the expected output when executed:
first name: John
last name: Doe
age: 25
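
To see what the parser actually returns, the sketch below walks over the children of the root node. It assumes Groovy 3 or later, where XmlParser is available in the groovy.xml package (on older versions the import can be dropped, since groovy.util is imported by default). The variable names are only illustrative.

```groovy
import groovy.xml.XmlParser

def xml = '<person><firstName>John</firstName><lastName>Doe</lastName><age>25</age></person>'
def person = new XmlParser().parseText(xml)

// parseText returns the root element as a Node
assert person.name() == 'person'

// person.firstName is a list of the matching child nodes (one here)
assert person.firstName.size() == 1

// children() returns the child nodes in document order
def fields = person.children().collect { child ->
    "${child.name()}=${child.text()}".toString()
}
println fields   // prints [firstName=John, lastName=Doe, age=25]
```

Because each property lookup returns a list of nodes rather than a plain String, text() is what turns the match into its textual content.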

Parse XML Attribute

To get the value of a tag's attribute, we can use the "@" notation. Because person.age is a list of matching nodes, person.age.@accurate is a List with one attribute value per node. If we expect a single instance of the attribute, we can just access the value at index 0. Below is an example to illustrate this:
package test
/**
 * A simple example that parses an XML document and accesses an attribute using XmlParser.
 */
class Test {
    static stringXML = '<person>'+
        '<firstName>John</firstName><lastName>Doe</lastName><age accurate="true">25</age>'+
        '</person>'
    static main(args) {
        def person = new XmlParser().parseText(stringXML)
        println "age: ${person.age.text()}"
        println "age: ${person.age.@accurate[0]}"
    }
}
The syntax looks a bit unusual but is not hard to follow, and it is handy to know the right notation, e.g. person.age.@accurate[0]. The output is below:
age: 25
age: true
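
As a complementary sketch, attributes can also be read from a single node instead of from the list of matches. The example assumes Groovy 3 or later for the import. The "unit" attribute is deliberately absent from the XML and is only there to show what a missing attribute looks like.

```groovy
import groovy.xml.XmlParser

def xml = '<person><firstName>John</firstName><age accurate="true">25</age></person>'
def person = new XmlParser().parseText(xml)

// Pick the first <age> node, then read attributes from that single node
def age = person.age[0]
def viaAt = age.@accurate                  // "@" shorthand on one node
def viaMethod = age.attribute('accurate')  // explicit method call
def all = age.attributes()                 // map of every attribute on the node

println viaAt       // prints true
println viaMethod   // prints true
println all         // prints [accurate:true]

// A missing attribute yields null rather than an exception
def missing = age.@unit
println missing     // prints null
```

Attribute values come back as Strings, so "true" here still needs a conversion such as toBoolean() before it is used in a condition.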

Parse XML File

It is more common to parse XML documents stored in files. For example, suppose the file "c:/temp/person.xml" has the following contents:
<person>
    <firstName>Jane</firstName>
    <lastName>Smith</lastName>
    <age>31</age>
</person>
The code to parse the file in Groovy using XmlParser is shown below:
package test
/**
 * A Simple example that parses an XML stored in a file using XmlParser.
 */
class Test {
    static main(args) {
        def person = new XmlParser().parse(new File("c:/temp/person.xml"))
        println "first name: ${person.firstName.text()}"
        println "last name: ${person.lastName.text()}"
        println "age: ${person.age.text()}"
    }
}
As shown, the code is simple and self-explanatory. The output is below:
first name: Jane
last name: Smith
age: 31
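
The hard-coded path above only works where c:/temp exists. A minimal, portable sketch of the same idea is shown below. It writes the XML to a temporary file first, which is an assumption made purely so the example can run anywhere, and it assumes Groovy 3 or later for the import.

```groovy
import groovy.xml.XmlParser

// Create a throwaway file so the example does not depend on c:/temp existing
def file = File.createTempFile('person', '.xml')
file.text = '''<person>
    <firstName>Jane</firstName>
    <lastName>Smith</lastName>
    <age>31</age>
</person>'''

def fullName
try {
    // parse(File) reads and parses the document in one step
    def person = new XmlParser().parse(file)
    fullName = person.firstName.text() + ' ' + person.lastName.text()
    println fullName   // prints Jane Smith
} finally {
    file.delete()      // clean up the temporary file
}
```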

Parse XML Given URL

We can also use XmlParser to parse documents hosted on a web server. The example below parses a WSDL document located at: http://www.webservicex.com/globalweather.asmx?WSDL

The contents look something like the listing below. Note that the actual document contains more information; it is shortened here to focus on the message tags. Missing parts are indicated with ...

<?xml version="1.0" encoding="utf-8"?>
<wsdl:definitions xmlns:tm="http://microsoft.com/wsdl/mime/textMatching/" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:mime="http://schemas.xmlsoap.org/wsdl/mime/" xmlns:tns="http://www.webserviceX.NET" xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:s="http://www.w3.org/2001/XMLSchema" xmlns:soap12="http://schemas.xmlsoap.org/wsdl/soap12/" xmlns:http="http://schemas.xmlsoap.org/wsdl/http/" targetNamespace="http://www.webserviceX.NET" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
  <wsdl:types>
    ...
  </wsdl:types>
  <wsdl:message name="GetWeatherSoapIn">
    <wsdl:part name="parameters" element="tns:GetWeather" />
  </wsdl:message>
  <wsdl:message name="GetWeatherSoapOut">
    <wsdl:part name="parameters" element="tns:GetWeatherResponse" />
  </wsdl:message>
  <wsdl:message name="GetCitiesByCountrySoapIn">
    <wsdl:part name="parameters" element="tns:GetCitiesByCountry" />
  </wsdl:message>
  <wsdl:message name="GetCitiesByCountrySoapOut">
    <wsdl:part name="parameters" element="tns:GetCitiesByCountryResponse" />
  </wsdl:message>
  ...
</wsdl:definitions>
The code to retrieve and parse this XML using XmlParser is as simple as shown below:
package test
/**
 * A Simple Example that parses an XML from a Web Document.
 */
class Test {
    static main(args) {
        def wsdl = new XmlParser().parse("http://www.webservicex.com/globalweather.asmx?WSDL")
        wsdl.'wsdl:message'.each { message ->
            println " message.name = "+message.@'name'
            println "    part.name = " + message.'wsdl:part'.@'name'[0]
            println "    part.element = " + message.'wsdl:part'.@'element'[0]
        }
    }
}
As shown above, accessing the components works the same way as in the previous examples. The partial output is shown below:
 message.name = GetWeatherSoapIn
    part.name = parameters
    part.element = tns:GetWeather
 message.name = GetWeatherSoapOut
    part.name = parameters
    part.element = tns:GetWeatherResponse
 message.name = GetCitiesByCountrySoapIn
    part.name = parameters
    part.element = tns:GetCitiesByCountry
 message.name = GetCitiesByCountrySoapOut
    part.name = parameters
    part.element = tns:GetCitiesByCountryResponse
...
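
Since a remote document may not always be reachable, the namespace handling can also be tried offline. The sketch below parses a shortened copy of the WSDL excerpt from a String and reads the message names in two ways: with the quoted prefixed names used above, and with a groovy.xml.Namespace object. It assumes Groovy 3 or later for the imports. The variable names are illustrative.

```groovy
import groovy.xml.Namespace
import groovy.xml.XmlParser

def xml = '''<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
  <wsdl:message name="GetWeatherSoapIn">
    <wsdl:part name="parameters" element="tns:GetWeather" />
  </wsdl:message>
  <wsdl:message name="GetWeatherSoapOut">
    <wsdl:part name="parameters" element="tns:GetWeatherResponse" />
  </wsdl:message>
</wsdl:definitions>'''

def definitions = new XmlParser().parseText(xml)

// Option 1: quoted names that include the prefix used in the document
def byPrefix = definitions.'wsdl:message'.collect { it.@name }

// Option 2: a Namespace object built from the namespace URI, so the lookup
// does not depend on which prefix the document happens to use
def wsdl = new Namespace('http://schemas.xmlsoap.org/wsdl/', 'wsdl')
def byNamespace = definitions[wsdl.message].collect { it.@name }

println byPrefix      // prints [GetWeatherSoapIn, GetWeatherSoapOut]
println byNamespace   // prints [GetWeatherSoapIn, GetWeatherSoapOut]
```

The second form is usually the more robust choice when the document comes from a third party, because prefixes are free to change while the namespace URI stays the same.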

Searching Information from XML parsed by XmlParser

We can also work with the object returned by XmlParser using the usual Groovy collection methods. For example, we can search for information matching a given criterion. Below is an example that searches for all people with age greater than 20.
package test
/**
 * A Simple Example that searches information from XML parsed by XmlParser.
 */
class Test {
    static stringXML = 
        '<personDatabase>'+
        '  <person><firstName>John</firstName><lastName>Doe</lastName><age>25</age></person>'+
        '  <person><firstName>Jane</firstName><lastName>Smith</lastName><age>31</age></person>'+
        '  <person><firstName>Robert</firstName><lastName>Doe</lastName><age>11</age></person>'+
        '  <person><firstName>Michael</firstName><lastName>Smith</lastName><age>55</age></person>'+
        '  <person><firstName>Scott</firstName><lastName>Williams</lastName><age>35</age></person>'+
        '  <person><firstName>Alice</firstName><lastName>Anthony</lastName><age>14</age></person>'+
        '</personDatabase>'
    static main(args) {
        def people = new XmlParser().parseText(stringXML)
        people.person.findAll { p ->
            p.age[0].text().toInteger() > 20
        }.each { p ->
            println "${p.lastName[0].text()}, ${p.firstName[0].text()} is ${p.age[0].text()} years old."
        }
    }
}
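The syntax is a bit tricky because it uses an index and the text() method, e.g. p.age[0].text().toInteger(). Here p.age is the list of matching nodes, [0] picks the first one, text() returns its content as a String, and toInteger() converts that to a number. Below is the expected output: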
Doe, John is 25 years old.
Smith, Jane is 31 years old.
Smith, Michael is 55 years old.
Williams, Scott is 35 years old.
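
When each person has exactly one age element, text() can also be called on the list of matches directly, which allows the conversion to be wrapped in a small helper. The sketch below uses a shortened data set and a hypothetical ageOf closure (not part of XmlParser), and assumes Groovy 3 or later for the import.

```groovy
import groovy.xml.XmlParser

def xml = '''<personDatabase>
  <person><firstName>John</firstName><lastName>Doe</lastName><age>25</age></person>
  <person><firstName>Jane</firstName><lastName>Smith</lastName><age>31</age></person>
  <person><firstName>Robert</firstName><lastName>Doe</lastName><age>11</age></person>
</personDatabase>'''
def people = new XmlParser().parseText(xml)

// Hypothetical helper: read a person's age as an Integer
def ageOf = { p -> p.age.text().toInteger() }

// Filter with the helper, then project to first names
def adultNames = people.person.findAll { ageOf(it) > 20 }.collect { it.firstName.text() }

// max with a closure returns the node with the largest computed value
def oldest = people.person.max { ageOf(it) }

println adultNames                // prints [John, Jane]
println oldest.firstName.text()   // prints Jane
```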
Below is a slightly modified version that searches for all people whose surname starts with the letter "D".
package test
/**
 * A Simple Example that searches information from XML parsed by XmlParser.
 */
class Test {
    static stringXML = 
        '<personDatabase>'+
        '  <person><firstName>John</firstName><lastName>Doe</lastName><age>25</age></person>'+
        '  <person><firstName>Jane</firstName><lastName>Smith</lastName><age>31</age></person>'+
        '  <person><firstName>Robert</firstName><lastName>Doe</lastName><age>11</age></person>'+
        '  <person><firstName>Michael</firstName><lastName>Smith</lastName><age>55</age></person>'+
        '  <person><firstName>Scott</firstName><lastName>Williams</lastName><age>35</age></person>'+
        '  <person><firstName>Alice</firstName><lastName>Anthony</lastName><age>14</age></person>'+
        '</personDatabase>'
    static main(args) {
        def people = new XmlParser().parseText(stringXML)
        people.person.findAll { p ->
            p.lastName[0].text().startsWith('D')
        }.each { p ->
            println "${p.lastName[0].text()}, ${p.firstName[0].text()}"
        }
    }
}
And the expected output is:
Doe, John
Doe, Robert
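
The same list of person nodes can be summarised as well as filtered. As a rough sketch, the snippet below counts people per surname with countBy and makes the prefix search case-insensitive. The surnameStartsWith closure is a hypothetical helper, and the import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlParser

def xml = '''<personDatabase>
  <person><firstName>John</firstName><lastName>Doe</lastName><age>25</age></person>
  <person><firstName>Jane</firstName><lastName>Smith</lastName><age>31</age></person>
  <person><firstName>Robert</firstName><lastName>Doe</lastName><age>11</age></person>
  <person><firstName>Alice</firstName><lastName>Anthony</lastName><age>14</age></person>
</personDatabase>'''
def people = new XmlParser().parseText(xml)

// Hypothetical helper: case-insensitive surname prefix test
def surnameStartsWith = { p, String prefix ->
    p.lastName.text().toLowerCase().startsWith(prefix.toLowerCase())
}

def matches = people.person.findAll { surnameStartsWith(it, 'd') }
                           .collect { it.firstName.text() }

// countBy groups the nodes by the closure result and counts each group
def perSurname = people.person.countBy { it.lastName.text() }

println matches      // prints [John, Robert]
println perSurname   // prints [Doe:2, Smith:1, Anthony:1]
```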