EMS logo

Products Navigation

choose your database tool

Our Partnership Status

Microsoft Certified Partner
Oracle Certified Partner
Embarcadero Technology Partner

SQL Articles

All SQL Articles

Alexander Barkov and Peter Gulutzan
MySQL 5.1's New XML Functions

MySQL version 5.1.5 has functions for searching and changing XML documents. This article has examples.

Let's make a database and put two XML documents in it.

CREATE TABLE x (doc VARCHAR(150));

INSERT INTO x VALUES
('
<book>
<title>A guide to the SQL standard</title>
<author>
<initial>CJ</initial>
<surname>Date</surname>

</author>
</book>
');

INSERT INTO x VALUES
('
<book>
<title>SQL:1999</title>
<author>
<initial>J</initial>

<surname>Melton</surname>
</author>
</book>
');

The doc columns have an internal hierarchical structure, with books containing titles and authors, and authors in turn containing initials and surnames. It's a popular way to format and store, and the "markup" -- words like "<book>" and </book>" -- makes it easy to see the hierarchy if you're careful about indentation.

ExtractValue()

Syntax
EXTRACTVALUE (XML_document, XPath_string);
1st Parameter
XML_document string formatted as in the example
2nd Parameter
XPath_string (XPath is a "sub-language")
Action
returns string containing a value from the document
Example #E1
mysql> SELECT EXTRACTVALUE(doc,'/book/author/initial') FROM x;
+------------------------------------------+
| EXTRACTVALUE(doc,'/book/author/initial') |
+------------------------------------------+
| CJ                                       |
| J                                        |
+------------------------------------------+
2 rows in set (0.01 sec)

What happened here? Books contain authors which contain initials. With EXTRACTVALUE() we navigated down through the hierarchy to get the values at the final node points: 'CJ' and 'J'. A basic extraction is just a matter of specifying the hierarchy in the XPath_string argument.

Example #E2
mysql> SELECT EXTRACTVALUE(doc,'/*/*/initial') FROM x;
+----------------------------------+
| EXTRACTVALUE(doc,'/*/*/initial') |
+----------------------------------+
| CJ                               |
| J                                |
+----------------------------------+
2 rows in set (0.01 sec)

You don't have to list the whole hierarchy. When part of a path is a wildcard, that means "any name will do".

Example #E3
mysql> SELECT extractValue(doc,'/book/child::*') FROM x;
+---------------------------------------------+
| extractValue(doc,'/book/child::*')          |
+---------------------------------------------+
| A guide to the SQL standard                 |
| SQL:1999                                    |
+---------------------------------------------+
2 rows in set (0.00 sec)

With /book/child:: we find what's immediately below book, namely the title data. We could use a variety of operators here:
child ... what's immediately below
descendant ... what's below at all levels
parent ... what's immediately above
ancestor ... what's above at all levels
following-sibling ... what's next at same level
preceding-sibling ... what's before at same level
self ... not before, not after, same level

Example #E4
mysql> select
extractValue(doc,'/book/author/surname[self:text()="Date"]') from x;
+--------------------------------------------------------------+
| extractValue(doc,'/book/author/surname[self:text()="Date"]') |
+--------------------------------------------------------------+
| Date                                                         |
|                                                              |
+--------------------------------------------------------------+
2 rows in set (0.00 sec)

And here's one way to add a predicate (a conditional expression). By saying "in the text of self, that is, in the text of surname because the predicate immediately comes after surname, look for value = Date", we include book/author/surname=Date and we exclude book/author/surname=Melton. The Melton row is blank. Naturally = isn't the only operator we could use here; we could have self:text()>="Date", self:text()="Date" OR self:text()="Melton", and so on.

What you've seen is: an XPath expression can contain nodes separated by slashes (vaguely like a Unix path expression), and you can pick values from one or more nodes. Wildcards, navigation aids, and predicates are supported. Although the examples all used extractValue() in the SELECT list, it can be used in any statement wherever an expression is allowed. A good tip is to combine XML columns with fulltext indexing.

UpdateXML()

Now here's a new function for updating the structure.

Syntax
UPDATEXML (XML_document, XPath_string, new_value);
1st Parameter
XML_document string formatted as in the example
2nd Parameter
XPath_string (XPath is a "sub-language")
3rd Parameter
new_value to replace whatever is found
Action
changes string containing a value from the document
Example #U1
mysql> select UpdateXML(doc,'/book/author/initial','!!') from x;
+----------------------------------------------------------+
| UpdateXML(doc,'/book/author/initial','!!')               |
+----------------------------------------------------------+
|
<book>
<title>A guide to the SQL standard</title>
<author>
!!

<surname>Date</surname>
</author>
</book> |
|
<book>
<title>SQL:1999</title>
<author>

!!
<surname>Melton</surname>
</author>
</book> |
+----------------------------------------------------------+
2 rows in set (0.00 sec)

UpdateXML's first two arguments are the same as for ExtractValue because the first thing we want to do is navigate to the node. The third argument is a replacement string. So we change book/author/initial to !!. The return value is the complete new document. To replace the document permanently, you could say UPDATE x SET doc = UpdateXML(doc,/book/author/initial','!!');

But this is probably a mistake! We didn't just change the text to !!. We changed <initial>CJ></initial> to !! So we changed the document structure. Normally, we only want to change the contents. For that, we should say: select UpdateXML(doc,'/book/author/initial','<initial>!!</initial>') from x;

Example #U2
mysql> select
extractvalue(
UpdateXML(doc,'/book/author/initial','<initial>!!</initial>'),'/book/author/
initial') from x;
+-----------------------------------------------------------------------------------------------------------------+
| extractvalue(UpdateXML(doc,'/book/author/initial','<initial>!!</initial>'), '/book/author/initial') |
+-----------------------------------------------------------------------------------------------------------------+
| !!
|
| !!
|
+-----------------------------------------------------------------------------------------------------------------+
2 rows in set (0.01 sec)

This final example, a combination of ExtractValue() and UpdateXML(), shows what would happen if we change the initial node to !! and then select the initial node. Naturally, we get !!.

Go Get

There's lots more to the XPath story. To get more depth, try these pages about XML and XPath:
http://www.w3.org/TR/xpath
http://www.oreilly.com/catalog/xmlnut/chapter/ch09.html
http://en.wikipedia.org/wiki/XPath
http://www.rpbourret.com/xml/XPathIn5.htm

You'll find that most (not all but most) examples on those pages are applicable to MySQL now. That's why it's nice to follow standards.

Or just download MySQL 5.1 and see what you can see. We're glad to get early comments about our alpha versions. The download page is here: http://dev.mysql.com/downloads/mysql/5.1.html

Source: MySQL Developers Zone

twitterfacebook