php xpath dom
In a recent article I discussed PHP’s implementation of the DOM and introduced various functions to pull data from and manipulate an XML structure. I also briefly mentioned XPath, but didn’t have much space to discuss it. In this article, we’ll look closer at XPath, how it functions, and how it is implemented in PHP. You’ll find that XPath can greatly reduce the amount of code you have to write to query and filter XML data, and will often yield better performance as well.
I’ll use the same DTD and XML from the previous article to demonstrate the PHP DOM XPath functionality. To quickly refresh your memory, here’s what the DTD and XML look like:
    <!ELEMENT library (book*)>
    <!ELEMENT book (title, author, genre, chapter*)>
    <!ATTLIST book isbn ID #REQUIRED>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT genre (#PCDATA)>
    <!ELEMENT chapter (chaptitle,text)>
    <!ATTLIST chapter position NMTOKEN #REQUIRED>
    <!ELEMENT chaptitle (#PCDATA)>
    <!ELEMENT text (#PCDATA)>

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE library SYSTEM "library.dtd">
    <library>
      <book isbn="isbn1234">
        <title>A Book</title>
        <author>An Author</author>
        <genre>Horror</genre>
        <chapter position="first">
          <chaptitle>chapter one</chaptitle>
          <text><![CDATA[Lorem Ipsum...]]></text>
        </chapter>
      </book>
      <book isbn="isbn1235">
        <title>Another Book</title>
        <author>Another Author</author>
        <genre>Science Fiction</genre>
        <chapter position="first">
          <chaptitle>chapter one</chaptitle>
          <text><![CDATA[<i>Sit Dolor Amet...</i>]]></text>
        </chapter>
      </book>
    </library>

XPath is a syntax for querying an XML document. In its simplest form, you define a path to the element you want. Using the XML document above, the following XPath query will return a collection of all the book elements present:
    //library/book

That’s it. The leading double slash tells XPath to look for library elements anywhere in the document (here, library happens to be the root element), and the single slash indicates that book is a child of library. It’s pretty straightforward, no?
But what if you want to specify a particular book? Let’s say you want to return any books written by “An Author”. The XPath for that would be:
    //library/book/author[text() = "An Author"]/..

You can use text() here in square brackets to perform a comparison against the value of a node, and the trailing “/..” indicates we want the parent element (i.e. move back up the tree one node).
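To try the two expressions above outside of any class, a minimal sketch along the following lines should do. It assumes a cut-down copy of the sample library embedded as a string (chapters and the DTD are left out for brevity), and the variable names are only illustrative.

```php
<?php
// Cut-down copy of the sample library (an assumption made for this sketch).
$xml = <<<'XML'
<library>
  <book isbn="isbn1234">
    <title>A Book</title>
    <author>An Author</author>
    <genre>Horror</genre>
  </book>
  <book isbn="isbn1235">
    <title>Another Book</title>
    <author>Another Author</author>
    <genre>Science Fiction</genre>
  </book>
</library>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Every book element that is a child of a library element.
$allBooks = $xpath->query("//library/book");

// Only the books whose author text is exactly "An Author";
// the trailing /.. steps from the author element back up to its book.
$byAuthor = $xpath->query('//library/book/author[text() = "An Author"]/..');

echo $allBooks->length, " books in total\n";
foreach ($byAuthor as $book) {
    // Each match is a DOMElement, so getAttribute() is available.
    echo $book->getAttribute("isbn"), "\n";
}
```

Both calls return a DOMNodeList, so the results can be counted with length or walked with foreach.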
XPath queries can be executed using one of two functions: query() and evaluate(). Both perform the query, but the difference lies in the type of result they return. query() will always return a DOMNodeList whereas evaluate() will return a typed result if possible. For example, if your XPath query is to return the number of books written by a certain author rather than the actual books themselves, then query() will return an empty DOMNodeList. evaluate() will simply return the number so you can use it immediately instead of having to pull the data from a node.
Let’s do a quick demonstration that returns the number of books written by an author. The first method we’ll look at will work, but doesn’t make use of XPath. This is to show you how it can be done without XPath and why XPath is so powerful.
    <?php
    // Method of the class that holds the parsed document in $this->domDocument.
    public function getNumberOfBooksByAuthor($author) {
        $total = 0;
        $elements = $this->domDocument->getElementsByTagName("author");
        foreach ($elements as $element) {
            // Count every author element whose text matches.
            if ($element->nodeValue == $author) {
                $total++;
            }
        }
        return $total;
    }

The next method achieves the same result, but uses XPath to select just those books that are written by a specific author:
    <?php
    public function getNumberOfBooksByAuthor($author) {
        // Select the parent book of every author element with matching text.
        $query = "//library/book/author[text() = '$author']/..";
        $xpath = new DOMXPath($this->domDocument);
        $result = $xpath->query($query);
        return $result->length;
    }

Notice how this time we have removed the need for PHP to test against the value of the author. But we can go one step further still and use the XPath function count() to count the occurrences of this path.
    <?php
    public function getNumberOfBooksByAuthor($author) {
        // Let XPath do the counting as well as the filtering.
        $query = "count(//library/book/author[text() = '$author']/..)";
        $xpath = new DOMXPath($this->domDocument);
        return $xpath->evaluate($query);
    }

We’re able to retrieve the information we needed with only one line of XPath, and there is no need to perform laborious filtering with PHP. Indeed, this is a much simpler and more succinct way to write this functionality!
Notice that evaluate() was used in the last example. This is because the function count() returns a typed result. Using query() will return a DOMNodeList but you will find that it is an empty list.
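The difference between the two methods is easy to see by running the same count() expression through both. The snippet below is a small sketch that assumes a trimmed-down version of the sample library; it guards the query() result because the exact return value for a non-node expression is treated here as something to check rather than rely on.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    '<library>'
    . '<book isbn="isbn1234"><title>A Book</title><author>An Author</author></book>'
    . '<book isbn="isbn1235"><title>Another Book</title><author>Another Author</author></book>'
    . '</library>'
);
$xpath = new DOMXPath($doc);

$expression = 'count(//library/book/author[text() = "An Author"]/..)';

// evaluate() hands back the value that count() computes.
$count = $xpath->evaluate($expression);

// query() is meant for node sets, so a count() expression gives it no nodes to return.
$nodes = $xpath->query($expression);

var_dump($count);
var_dump($nodes instanceof DOMNodeList ? $nodes->length : $nodes);
```

In short, reach for evaluate() whenever the expression produces a number, string, or boolean, and keep query() for expressions that select nodes.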
Not only does this make your code cleaner, but it also comes with speed benefits. I found that version 1 was 30% faster on average than version 2 but version 3 was about 10% faster than version 2 (about 15% faster than version 1). While these measurements will vary depending on your server and query, using XPath in its purest form will generally yield a considerable speed benefit as well as making your code easier to read and maintain.
There are quite a few functions that can be used with XPath and there are many excellent resources which detail what functions are available. If you find that you are iterating over DOMNodeLists or comparing nodeValues, you will probably find an XPath function that can eliminate a lot of the PHP coding.
You’ve already seen how count() works. Let’s use the id() function to return the titles of the books with the given ISBNs. The XPath expression you will need to use is:
    id("isbn1234 isbn1235")/title

Notice here that the values you are searching for are enclosed within quotes and delimited with a space; there is no need for a comma to delimit the terms.
    <?php
    public function findBooksByISBNs(array $isbns) {
        // id() expects the IDs as one space-delimited string.
        $ids = join(" ", $isbns);
        $query = "id('$ids')/title";
        $xpath = new DOMXPath($this->domDocument);
        $result = $xpath->query($query);
        $books = array();
        foreach ($result as $node) {
            $book = array("title" => $node->nodeValue);
            $books[] = $book;
        }
        return $books;
    }

Executing complex functions in XPath is relatively simple; the trick is to become familiar with the functions that are available.
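For a standalone experiment with id(), something like the sketch below may help. It rests on two assumptions that are not part of the method above: the DTD is inlined into the document instead of living in library.dtd, and validateOnParse is switched on so that the parser registers isbn as an ID attribute. The chapters are omitted, which the DTD permits.

```php
<?php
// Sample library with the DTD inlined (an assumption made for this sketch).
$xml = <<<'XML'
<!DOCTYPE library [
  <!ELEMENT library (book*)>
  <!ELEMENT book (title, author, genre, chapter*)>
  <!ATTLIST book isbn ID #REQUIRED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT genre (#PCDATA)>
  <!ELEMENT chapter (chaptitle, text)>
  <!ATTLIST chapter position NMTOKEN #REQUIRED>
  <!ELEMENT chaptitle (#PCDATA)>
  <!ELEMENT text (#PCDATA)>
]>
<library>
  <book isbn="isbn1234"><title>A Book</title><author>An Author</author><genre>Horror</genre></book>
  <book isbn="isbn1235"><title>Another Book</title><author>Another Author</author><genre>Science Fiction</genre></book>
</library>
XML;

$doc = new DOMDocument();
// Validate against the DTD while parsing so that ID attributes are known to the document.
$doc->validateOnParse = true;
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Collect the title of every book whose ID appears in the space-delimited list.
$titles = [];
foreach ($xpath->query("id('isbn1234 isbn1235')/title") as $node) {
    $titles[] = $node->nodeValue;
}
print_r($titles);
```

If id() comes back empty in your own code, the first thing to check is probably whether the document was parsed with its DTD available.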
Sometimes you may find that you need some greater functionality that the standard XPath functions cannot deliver. Luckily, PHP DOM also allows you to incorporate PHP’s own functions into an XPath query.
Let’s consider returning the number of words in the title of a book. In its simplest form, we could write the method as follows:
    <?php
    public function getNumberOfWords($isbn) {
        $query = "//library/book[@isbn = '$isbn']";
        $xpath = new DOMXPath($this->domDocument);
        $result = $xpath->query($query);
        // Take the first matching book and read its title text.
        $title = $result->item(0)->getElementsByTagName("title")
            ->item(0)->nodeValue;
        return str_word_count($title);
    }

But we can also incorporate the function str_word_count() directly into the XPath query. There are a few steps that need to be completed to do this. First of all, we have to register a namespace with the XPath object. PHP functions in XPath queries are called through “php:functionString”, with the name of the function you want to use passed as the first argument inside the parentheses. Also, the namespace to be defined is http://php.net/xpath. The namespace must be set to exactly this; any other value will result in errors. We then need to call registerPHPFunctions(), which tells PHP that whenever it comes across a function namespaced with “php:”, it is PHP that should handle it.
The actual syntax for calling the function is:
    php:functionString("nameoffunction", arg, arg...)

Putting this all together results in the following reimplementation of getNumberOfWords():
    <?php
    public function getNumberOfWords($isbn) {
        $xpath = new DOMXPath($this->domDocument);
        // register the php namespace
        $xpath->registerNamespace("php", "http://php.net/xpath");
        // ensure PHP functions can be called within XPath
        $xpath->registerPHPFunctions();
        $query = "php:functionString('str_word_count',(//library/book[@isbn = '$isbn']/title))";
        return $xpath->evaluate($query);
    }

Notice that you don’t need to call the XPath function text() to provide the text of the node. php:functionString converts the matched node to its string value automatically. However, the following is just as valid:
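The same steps can be tried as a standalone script. The following is a sketch under a few assumptions: the XML is a trimmed stand-in for the sample library with a made-up three-word title, and the result is cast to an integer explicitly instead of relying on whatever type evaluate() hands back.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    '<library>'
    . '<book isbn="isbn1234"><title>A Book</title></book>'
    . '<book isbn="isbn1235"><title>Another Great Book</title></book>'
    . '</library>'
);

$xpath = new DOMXPath($doc);
// The prefix must be bound to exactly this namespace URI.
$xpath->registerNamespace("php", "http://php.net/xpath");
// Allow PHP functions to be called from XPath expressions.
$xpath->registerPHPFunctions();

$isbn = "isbn1235";
$query = "php:functionString('str_word_count', //library/book[@isbn = '$isbn']/title)";

// Cast explicitly so the caller always gets an integer.
$words = (int) $xpath->evaluate($query);
echo $words, "\n";
```

registerPHPFunctions() also accepts a function name or an array of names, which is worth considering if you would rather expose only the functions a query actually needs.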
    php:functionString('str_word_count',(//library/book[@isbn = '$isbn']/title[text()]))

Registering PHP functions is not restricted to the functions that come with PHP. You can define your own functions and call those from within the XPath query. The example below calls its function with “php:function” rather than “php:functionString”, so the function receives the matched nodes themselves instead of their string values. Also, it is only possible to call standalone functions or static methods. Calling instance methods is not supported.
Let’s use a regular function that is outside the scope of the class to demonstrate the basic functionality. The function we will use will return only books by “George Orwell”. It must return true for every node you wish to include in the query.
    <?php
    function compare($node) {
        // $node is an array of the matched nodes; test the first one.
        return $node[0]->nodeValue == "George Orwell";
    }

The argument passed to the function is an array of DOMElements. It is up to the function to inspect the array and determine whether the node being tested should be returned in the DOMNodeList. In this example, the node being tested is a book element and we are using its author child to make the determination.
Now we can create the method getGeorgeOrwellBooks():
    <?php
    public function getGeorgeOrwellBooks() {
        $xpath = new DOMXPath($this->domDocument);
        $xpath->registerNamespace("php", "http://php.net/xpath");
        $xpath->registerPHPFunctions();
        // Keep only the books for which compare() returns true.
        $query = "//library/book[php:function('compare', author)]";
        $result = $xpath->query($query);
        $books = array();
        foreach ($result as $node) {
            $books[] = $node->getElementsByTagName("title")
                ->item(0)->nodeValue;
        }
        return $books;
    }

If compare() were a static method, then you would need to amend the XPath query so that it reads:
    //library/book[php:function('Library::compare', author)]

In truth, all of this functionality can be easily coded up with just XPath, but the example shows how you can extend XPath queries to become more complex.
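To see the plain-function and static-method forms side by side, a self-contained sketch might look like this. The names sketch_is_by_orwell and SketchLibrary are hypothetical, chosen only for this illustration, and the two-book document is made-up sample data.

```php
<?php
// Hypothetical helper for this sketch: receives the matched author nodes as an array.
function sketch_is_by_orwell(array $nodes): bool
{
    return isset($nodes[0]) && $nodes[0]->nodeValue === "George Orwell";
}

// Hypothetical class showing the static-method form of the same check.
class SketchLibrary
{
    public static function isByOrwell(array $nodes): bool
    {
        return sketch_is_by_orwell($nodes);
    }
}

$doc = new DOMDocument();
$doc->loadXML(
    '<library>'
    . '<book><title>Animal Farm</title><author>George Orwell</author></book>'
    . '<book><title>Another Book</title><author>Another Author</author></book>'
    . '</library>'
);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace("php", "http://php.net/xpath");
$xpath->registerPHPFunctions();

$queries = [
    "//library/book[php:function('sketch_is_by_orwell', author)]/title",
    "//library/book[php:function('SketchLibrary::isByOrwell', author)]/title",
];

// Run both forms and collect the matching titles for each.
$results = [];
foreach ($queries as $query) {
    $titles = [];
    foreach ($xpath->query($query) as $title) {
        $titles[] = $title->nodeValue;
    }
    $results[] = $titles;
}
print_r($results);
```

Both queries should select the same book; the only difference is how the callback is named inside the expression.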
Calling an object method is not possible within XPath. If you find you need to access some object properties or methods to complete the XPath query, the best solution would be to do what you can with XPath and then work on the resulting DOMNodeList with any object methods or properties as necessary.
XPath is a great way of cutting down the amount of code you have to write and speeding up the execution of that code when working with XML data. Although not part of the official DOM specification, the additional functionality that the PHP DOM provides allows you to extend the normal XPath functions with custom functionality. This is a very powerful feature, and as your familiarity with XPath functions increases you may find that you come to rely on it less and less.
Image via Fotolia
Translated from: https://www.sitepoint.com/php-dom-using-xpath/