ExtractValue() (Extracts a value from an XML string.) and UpdateXML() (Return replaced XML fragment.) functions provide basic XPath 1.0 capabilities. XPath expressions used these functions to support user variables and local stored program variables.
ExtractValue() function
The ExtractValue() function is used to extract a value from an XML string using XPath notation.
Syntax:
ExtractValue()
User variables: Variables using the syntax $@variable_name (user variables) are not checked. You will not get any warnings or errors from the server if a variable has the wrong type or has previously not been assigned a value.
mysql> SET @xml = '<a><b>X</b><b>Y</b><b>Z</b></a>'; Query OK, 0 rows affected (0.00 sec) mysql> SET @i =1, @j =2, @k =0; Query OK, 0 rows affected (0.00 sec) mysql> SELECT @i, ExtractValue(@xml,'//b[$@i]'); +------+-------------------------------+ | @i | ExtractValue(@xml,'//b[$@i]') | +------+-------------------------------+ | 1 | X | +------+-------------------------------+ 1 row in set (0.02 sec) mysql> SELECT @j, ExtractValue(@xml,'//b[$@j]'); +------+-------------------------------+ | @j | ExtractValue(@xml,'//b[$@j]') | +------+-------------------------------+ | 2 | Y | +------+-------------------------------+ 1 row in set (0.00 sec) mysql> SELECT @k, ExtractValue(@xml,'//b[$@k]'); +------+-------------------------------+ | @k | ExtractValue(@xml,'//b[$@k]') | +------+-------------------------------+ | 0 | | +------+-------------------------------+ 1 row in set (0.00 sec)
Variables in stored programs : Variables using the syntax $variable_name can be declared and used with these functions when they are called inside stored programs. Such variables are local to the stored program in which they are defined, and are strongly checked for type and value.
mysql> DELIMITER | mysql> USE TEST Database changed mysql> CREATE PROCEDURE myproc () -> BEGIN -> DECLARE i INT DEFAULT 1; -> DECLARE xml VARCHAR(25) DEFAULT '<a>M</a><a>N</a><a>O</a>'; -> -> WHILE i < 5 DO -> SELECT xml, i, ExtractValue(xml, '//a[$i]'); -> SET i = i+1; -> END WHILE; -> END | Query OK, 0 rows affected (0.03 sec) mysql> DELIMITER ; mysql> CALL myproc(); +--------------------------+------+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+------+------------------------------+ | <a>M</a><a>N</a><a>O</a> | 1 | M | +--------------------------+------+------------------------------+ 1 row in set (0.00 sec) +--------------------------+------+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+------+------------------------------+ | <a>M</a><a>N</a><a>O</a> | 2 | N | +--------------------------+------+------------------------------+ 1 row in set (0.01 sec) +--------------------------+------+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+------+------------------------------+ | <a>M</a><a>N</a><a>O</a> | 3 | O | +--------------------------+------+------------------------------+ 1 row in set (0.01 sec) +--------------------------+------+------------------------------+ | xml | i | ExtractValue(xml, '//a[$i]') | +--------------------------+------+------------------------------+ | <a>M</a><a>N</a><a>O</a> | 4 | | +--------------------------+------+------------------------------+ 1 row in set (0.02 sec) Query OK, 0 rows affected (0.02 sec)
Parameters variables are used in XPath expressions inside stored routines that are passed in as parameters are also subject to strong checking.
Expressions containing user variables or variables local to stored programs must otherwise conform to the rules for XPath expressions containing variables as given in the XPath 1.0 specification.
Note: Currently, a user variable used to store an XPath expression is treated as an empty string. Because of this, it is not possible to store an XPath expression as a user variable.
ExtractValue(xml_frag, xpath_expr)
ExtractValue() takes two string arguments, a fragment of XML markup xml_frag and an XPath expression xpath_expr (also known as a locator); it returns the text (CDATA) of the first text node which is a child of the elements or elements matched by the XPath expression. In MySQL 5.6.6 and earlier, the XPath expression could contain at most 127 characters, the limitation was lifted in MySQL 5.6.7.
Using this function is the equivalent of performing a match using the xpath_expr after appending /text(). In other words, ExtractValue('Gopal', '/a/b') and ExtractValue('Gopal', '/a/b/text()') produce the same result.
If multiple matches are found, the content of the first child text node of each matching element is returned (in the order matched) as a single, space-delimited string.
If no matching text node is found for the expression (including the implicit /text())—for whatever reason, as long as xpath_expr is valid, and xml_frag consists of elements which are properly nested and closed—an empty string is returned. No distinction is made between a match on an empty element and no match at all.
If you need to determine whether no matching element was found in xml_frag or such an element was found but contained no child text nodes, you should test the result of an expression that uses the XPath count() function. For example, both of these statements return an empty string, as shown here:
Example
mysql> SELECT ExtractValue('<a></b></a>', '/a/b'); +-------------------------------------+ | ExtractValue('<a></b></a>', '/a/b') | +-------------------------------------+ | NULL | +-------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> SELECT ExtractValue('<a></c></a>', '/a/b'); +-------------------------------------+ | ExtractValue('<a></c></a>', '/a/b') | +-------------------------------------+ | NULL | +-------------------------------------+ 1 row in set, 1 warning (0.00 sec) </pre> <p>However, you can determine whether there was actually a matching element using the following :</p> <pre class="well"> mysql> SELECT ExtractValue('<a><b/></a>', '/a/b'); +-------------------------------------+ | ExtractValue('<a><b/></a>', '/a/b') | +-------------------------------------+ | | +-------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue('<a><b/></a>', 'count(/a/b)'); +--------------------------------------------+ | ExtractValue('<a><b/></a>', 'count(/a/b)') | +--------------------------------------------+ | 1 | +--------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue('<a><c/></a>', 'count(/a/b)'); +--------------------------------------------+ | ExtractValue('<a><c/></a>', 'count(/a/b)') | +--------------------------------------------+ | 0 | +--------------------------------------------+ 1 row in set (0.00 sec)
ExtractValue() returns only CDATA, and does not return any tags that might be contained within a matching tag, nor any of their content (see the result returned as val1 in the following example).
mysql> SELECT -> ExtractValue('<a>CAT<b>DOG</b></a>', '/a') AS val1, -> ExtractValue('<a>CAT<b>DOG</b></a>', '/a/b') AS val2, -> ExtractValue('<a>CAT<b>DOG</b></a>', '//b') AS val3, -> ExtractValue('<a>CAT<b>DOG</b></a>', '/b') AS val4, -> ExtractValue('<a>CAT<b>DOG</b><b>EAT</b></a>', '//b') AS val5; +------+------+------+------+---------+ | val1 | val2 | val3 | val4 | val5 | +------+------+------+------+---------+ | CAT | DOG | DOG | | DOG EAT | +------+------+------+------+---------+ 1 row in set (0.00 sec)
This function uses the current SQL collation for making comparisons with contains(), performing the same collation aggregation as other string functions (such as CONCAT()), in taking into account the collation coercibility of their arguments.
NULL is returned if xml_frag contains elements which are not properly nested or closed, and a warning is generated. See the following example :
mysql> SELECT ExtractValue('<book>RAMAYANA</X><books/>', '//book'); +------------------------------------------------------+ | ExtractValue('<book>RAMAYANA</X><books/>', '//book') | +------------------------------------------------------+ | NULL | +------------------------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> SHOW WARNINGS; +---------+------+-------------------------------------------------------------- ----------------------------+ | Level | Code | Message | +---------+------+-------------------------------------------------------------- ----------------------------+ | Warning | 1525 | Incorrect XML value: 'parse error at line 1 pos 18: '</X>' un expected ('</ook>' wanted)' | +---------+------+-------------------------------------------------------------- ----------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue('<book>RAMAYANA</book><books/>', '//book'); +---------------------------------------------------------+ | ExtractValue('<book>RAMAYANA</book><books/>', '//book') | +---------------------------------------------------------+ | RAMAYANA | +---------------------------------------------------------+ 1 row in set (0.00 sec)
UpdateXML() function
The UpdateXML() function is used to replace a single portion of a given fragment of XML markup xml_target with a new XML fragment new_xml, and then returns the changed XML. The portion of xml_target that is replaced matches an XPath expression xpath_expr supplied by the user. In MySQL 5.6.6 and earlier, the XPath expression could contain at most 127 characters. This limitation is lifted in MySQL 5.6.7. If no expression matching xpath_expr is found, or if multiple matches are found, the function returns the original xml_target XML fragment. All three arguments should be strings.
mysql> SELECT -> UpdateXML('<a><b>CAT</b><d></d></a>', '/a', '<e>FOX</e>') AS val1, -> UpdateXML('<a><b>CAT</b><d></d></a>', '/b', '<e>FOX</e>') AS val2, -> UpdateXML('<a><b>CAT</b><d></d></a>', '//b', '<e>FOX</e>') AS val3, -> UpdateXML('<a><b>CAT</b><d></d></a>', '/a/d', '<e>FOX</e>') AS val4 , -> UpdateXML('<a><d></d><b>CAT</b><d></d></a>', '/a/d', '<e>FOX</e>') AS val5 -> \G *************************** 1. row *************************** val1: <e>FOX</e> val2: <a><b>CAT</b><d></d></a> val3: <a><e>FOX</e><d></d></a> val4: <a><b>CAT</b><e>FOX</e></a> val5: <a><d></d><b>CAT</b><d></d></a> 1 row in set (0.00 sec)
Descriptions and examples of some basic XPath expressions follow:
/tag
Matches <tag/> if and only if <tag/> is the root element.
Example: /a has a match in <a><b/></a> because it matches the outermost (root) tag. It does not match the inner a element in <b><a/></b> because in this instance it is the child of another element.
/tag1/tag2
Matches <tag2/> if and only if it is a child of <tag1/>, and <tag1/> is the root element.
Example: /a/b matches the b element in the XML fragment <a><b/></a> because it is a child of the root element a. It does not have a match in <b><a/></b> because in this case, b is the root element (and hence the child of no other element). Nor does the XPath expression have a match in <a><c><b/></c></a>; here, b is a descendant of a, but not actually a child of a.
This construct is extendable to three or more elements. For example, the XPath expression /a/b/c matches the c element in the fragment <a><b><c/></b></a>.
//tag
Matches any instance of <tag>.
Example: //a matches the a element in any of the following: <a><b><c/></b></a>; <c><a><b/></a></b>; <c><b><a/></b></c>.
// can be combined with /. For example, //a/b matches the b element in either of the fragments <a><b/></a> or <a><b><c/></b></a>
Note //tag is the equivalent of /descendant-or-self::*/tag. A common error is to confuse this with /descendant-or-self::tag, although the latter expression can actually lead to very different results, as can be seen here:
mysql> SET @xml = '<books><book id="bk101"><author>Celeste Ng</author><title>Eve rything I Never Told You</title> <genre>Novel</genre><price>$16.17</price>< /book><book id="bk102"><author>Phil Klay</author> <title>ERedeployment</tit le><genre>war writing</genre><price>$16.17</price></book></books>'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT @xml; +------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------+ | @xml | +------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------+ | <books><book id="bk101"><author>Celeste Ng</author><title>Everything I Never T old You</title> <genre>Novel</genre><price>$16.17</price></book><book id="b k102"><author>Phil Klay</author> <title>ERedeployment</title><genre>war wri ting</genre><price>$16.17</price></book></books> | +------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue(@xml, '//book[1]'); +---------------------------------+ | ExtractValue(@xml, '//book[1]') | +---------------------------------+ | | +---------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue(@xml, '//book[0]'); +---------------------------------+ | ExtractValue(@xml, '//book[0]') | +---------------------------------+ | | +---------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue(@xml, '//author[1]'); +-----------------------------------+ | ExtractValue(@xml, '//author[1]') | +-----------------------------------+ | Celeste Ng Phil Klay | +-----------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue(@xml, '//author[2]'); +-----------------------------------+ | ExtractValue(@xml, '//author[2]') | +-----------------------------------+ | | +-----------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue(@xml, '//descendant-or-self::*/author[1]'); +---------------------------------------------------------+ | ExtractValue(@xml, '//descendant-or-self::*/author[1]') | +---------------------------------------------------------+ | Celeste Ng Phil Klay | +---------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue(@xml, '//descendant-or-self::*/author[2]'); +---------------------------------------------------------+ | ExtractValue(@xml, '//descendant-or-self::*/author[2]') | +---------------------------------------------------------+ | | +---------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue(@xml, '//descendant-or-self::author[1]'); +-------------------------------------------------------+ | ExtractValue(@xml, '//descendant-or-self::author[1]') | +-------------------------------------------------------+ | Celeste Ng Phil Klay | +-------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue(@xml, '//descendant-or-self::author[2]'); +-------------------------------------------------------+ | ExtractValue(@xml, '//descendant-or-self::author[2]') | +-------------------------------------------------------+ | Phil Klay | +-------------------------------------------------------+ 1 row in set (0.00 sec)
To filter multiple attribute values, simply use multiple attribute-comparison clauses in succession. For example, the expression //b[@c="x"][@d="y"] matches the element <b c="x" d="y"/> occurring anywhere in a given XML fragment.
To find elements for which the same attribute matches any of several values, you can use multiple locators joined by the | operator. For example, to match all b elements whose c attributes have either of the values 23 or 17, use the expression //b[@c="23"]|//b[@c="17"]. You can also use the logical or operator for this purpose: //b[@c="23" or @c="17"].
Note The difference between or and | is that or joins conditions, while | joins result sets.
XPath Limitations. The XPath syntax supported by these functions is currently subject to the following limitations :
mysql> SELECT ExtractValue( -> '<a><b c="1">M</b><b c="2">A</b></a>', -> 'a/b' -> ) AS result; +--------+ | result | +--------+ | M A | +--------+ 1 row in set (0.00 sec)
In this case, the locator a/b resolves to /a/b.
Relative locators are also supported within predicates. In the following example, d[../@c="1"] is resolved as /a/b[@c="1"]/d:
mysql> SELECT ExtractValue( -> '<a> '> <b c="1"><d>Z</d></b> '> <b c="2"><d>Z</d></b> '> </a>', -> 'a/b/d[../@c="1"]') -> AS result; +--------+ | result | +--------+ | Z | +--------+ 1 row in set (0.00 sec)
However, name tests (such as axis::name and axis::*) are supported, as shown in these examples:
mysql> SELECT ExtractValue('xy','/red/chi ld::blue'); +------------------------------------------------------------------------------+ | ExtractValue('xy','/red/child::blue') | +------------------------------------------------------------------------------+ | x | +------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT ExtractValue('xy','/red/chi ld::*'); +---------------------------------------------------------------------------+ | ExtractValue('xy','/red/child::*') | +---------------------------------------------------------------------------+ | x y | +---------------------------------------------------------------------------+ 1 row in set (0.00 sec)
XPath expressions passed as arguments to ExtractValue() and UpdateXML() may contain the colon character (“:”) in element selectors, which enables their use with markup employing XML namespaces notation. For example:
mysql> SET @xml = '<red>100<black:green>200<white>300</white><yellow:pink>400</y ellow:pink></black:green></red>'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT ExtractValue(@xml, '//yellow:pink'); +-------------------------------------+ | ExtractValue(@xml, '//yellow:pink') | +-------------------------------------+ | 400 | +-------------------------------------+ 1 row in set (0.00 sec) mysql> SELECT UpdateXML(@xml, '//black:green', '<brown:redish>500</brown:redish> '); +----------------------------------------------------------------------+ | UpdateXML(@xml, '//black:green', '<brown:redish>500</brown:redish>') | +----------------------------------------------------------------------+ | <red>100<brown:redish>500</brown:redish></red> | +----------------------------------------------------------------------+ 1 row in set (0.00 sec)
This is similar in some respects to what is permitted by Apache Xalan and some other parsers, and is much simpler than requiring namespace declarations or the use of the namespace-uri() and local-name() functions.
Error handling. For both ExtractValue() and UpdateXML(), the XPath locator used must be valid and the XML to be searched must consist of elements which are properly nested and closed. If the locator is invalid, an error is generated:
mysql> SELECT ExtractValue('author', '/&book'); ERROR 1105 (HY000): XPATH syntax error: '&book'
If xml_frag does not consist of elements which are properly nested and closed, NULL is returned and a warning is generated, as shown in this example:
mysql>SELECT ExtractValue('<book>author</book><ERedeployment', '//book'); +-------------------------------------------------------------+ | ExtractValue('<book>author</book><ERedeployment', '//book') | +-------------------------------------------------------------+ | NULL | +-------------------------------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql>SHOW WARNING\ ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WARNI NG' at line 1 mysql>SELECT ExtractValue('<book>author</book><ERedeployment/>', '//book'); +---------------------------------------------------------------+ | ExtractValue('<book>author</book><ERedeployment/>', '//book') | +---------------------------------------------------------------+ | author | +---------------------------------------------------------------+ 1 row in set (0.00 sec)
The replacement XML used as the third argument to UpdateXML() is not checked to determine whether it consists solely of elements which are properly nested and closed.
XPath Injection. code injection occurs when malicious code is introduced into the system to gain unauthorized access to privileges and data. It is based on exploiting assumptions made by developers about the type and content of data input from users. XPath is no exception in this regard.
A common scenario in which this can happen is the case of application which handles authorization by matching the combination of a login name and password with those found in an XML file, using an XPath expression like this one:
//user[login/text()='neapolitan' and password/text()='1c3cr34m']/attribute::id
This is the XPath equivalent of an SQL statement like this one:
SELECT id FROM users WHERE login='neapolitan' AND password='1c3cr34m';
No checks are performed on the input. This means that a malevolent user can “short-circuit” the test by entering ' or 1=1 for both the login name and password, resulting in $xpath being evaluated as shown here:
//user[login/text()='' or 1=1 and password/text()='' or 1=1]/attribute::id
Since the expression inside the square brackets always evaluates as true, it is effectively the same as this one, which matches the id attribute of every user element in the XML document:
//user/attribute::id
One way in which this particular attack can be circumvented is simply by quoting the variable names to be interpolated in the definition of $xpath, forcing the values passed from a Web form to be converted to strings:
$xpath = "//user[login/text()='$login' and password/text()='$password']/attribute::id";
This is the same strategy that is often recommended for preventing SQL injection attacks. In general, the practices you should follow for preventing XPath injection attacks are the same as for preventing SQL injection:
Just as SQL injection attacks can be used to obtain information about database schemas, so can XPath injection be used to traverse XML files to uncover their structure, as discussed in Amit Klein's paper Blind XPath Injection (PDF file, 46KB).
It is also important to check the output being sent back to the client. Consider what can happen when we use the MySQL ExtractValue() function:
mysql> SELECT ExtractValue( -> LOAD_FILE ('test.xml'), -> '//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute:: id' -> ) AS id; +------+ | id | +------+ | NULL | +------+ 1 row in set (0.03 sec)
Previous: MySQL Information Functions
Next: MySQL Miscellaneous Functions
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4