JSON Parsing with JSON

tech · 2022-10-19


Phunkie is a library of functional structures for PHP. In this two-part tutorial, Phunkie’s creator Marcello Duarte, head of training at Inviqa, explains how to create parser combinators using the functional library. This post first appeared on the Inviqa blog and was republished here with their permission. For an exploration of PHP’s “non-exotic” features, we have a great paid course.


In the first part of this series we explored parsers and combinators to help you start getting value from functional programming with PHP. We covered the basics using examples, and now we’ll move on to more sequencing and other strategies.


Let’s pick up where we left off!


Sequencing Combinators

Ok, let’s now try a more useful parser. A parser that, given a predicate, keeps the character if it satisfies the predicate and fails otherwise. We will call this parser sat, from satisfies.


    describe("sat", function() {
        it("parses one character when it satisfies the predicate", function() {
            expect(sat("is_numeric")->run("4L"))->toEqual(ImmList(Pair('4', 'L')));
        });
        it("returns Nil when the character does not satisfy the predicate", function() {
            expect(sat("is_numeric")->run("L4"))->toEqual(Nil());
        });
    });

The implementation, using the primitive parsers item, result and zero, looks like this:


    function sat(callable $predicate): Parser
    {
        return item()->flatMap(function($x) use ($predicate) {
            return $predicate($x) ? result($x) : zero();
        });
    }

You can see how the building blocks are now put to work. We call the item parser, flatMap over its result, and apply the predicate. If the predicate succeeds, we return the result parser, which basically lifts $x into the parser context. Otherwise we return zero, which plays the same role but produces Nil instead of any result, so the parse fails.
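To make the mechanics concrete without installing Phunkie, here is a minimal, dependency-free sketch of the same idea. It is not the library's implementation: it assumes a parser is just a closure from a string to a plain PHP array of [value, remaining input] pairs (standing in for ImmList and Pair), and the variables $item, $result, $zero, $flatMap and $sat are illustrative names only.

```php
<?php
// Assumption: a parser is a closure string -> list of [value, rest] pairs.

// item: consume one character, or fail (empty list) on empty input.
$item = function (string $s): array {
    return $s === '' ? [] : [[$s[0], substr($s, 1)]];
};

// result: succeed with $v without consuming any input.
$result = function ($v): callable {
    return function (string $s) use ($v): array {
        return [[$v, $s]];
    };
};

// zero: always fail.
$zero = function (): callable {
    return function (string $s): array {
        return [];
    };
};

// flatMap: run $p, then feed each value to $f to obtain the next parser.
$flatMap = function (callable $p, callable $f): callable {
    return function (string $s) use ($p, $f): array {
        $out = [];
        foreach ($p($s) as [$v, $rest]) {
            foreach ($f($v)($rest) as $pair) {
                $out[] = $pair;
            }
        }
        return $out;
    };
};

// sat: keep the character only if the predicate accepts it.
$sat = function (callable $pred) use ($item, $result, $zero, $flatMap): callable {
    return $flatMap($item, function ($x) use ($pred, $result, $zero) {
        return $pred($x) ? $result($x) : $zero();
    });
};

var_dump($sat('is_numeric')("4L")); // one pair: '4' and the remaining "L"
var_dump($sat('is_numeric')("L4")); // empty array: the parse failed
```

The empty array plays the role of Nil here, which is why failure needs no special case in flatMap: there is simply nothing to iterate over.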


We can quickly capitalize on sat, by creating a few other sequencing combinators.


    context("Sequencing combinators", function() {
        describe("char", function() {
            it("parses a character", function() {
                expect(char('h')->run("hello"))->toEqual(ImmList(Pair('h', "ello")));
            });
        });
        describe("digit", function() {
            it("parses a digit", function() {
                expect(digit()->run("42"))->toEqual(ImmList(Pair('4', '2')));
            });
        });
        describe("lower", function() {
            it("parses a lowercase character", function() {
                expect(lower()->run("hello"))->toEqual(ImmList(Pair('h', "ello")));
            });
        });
        describe("upper", function() {
            it("parses an upper case character", function() {
                expect(upper()->run("Hello"))->toEqual(ImmList(Pair('H', "ello")));
            });
        });
    });

You will laugh at how simple the implementation is!


    function char($c): Parser
    {
        return sat(function($input) use ($c) { return $input === $c; });
    }

    function digit(): Parser
    {
        return sat('is_numeric');
    }

    function lower(): Parser
    {
        return sat(function($c) { return ctype_lower($c); });
    }

    function upper(): Parser
    {
        return sat(function($c) { return ctype_upper($c); });
    }

Choice Combinators

In the real world of grammars, if we had to return an empty list as soon as the first parser failed, we would not survive. Parsers have to deal with grammatical choice constructs. A very common scenario is either matching one pattern or another pattern. We do that by adding the plus combinator to our arsenal of parsers.


    use function concat;

    function plus(Parser $p, Parser $q): Parser
    {
        // Run both parsers on the same input and keep the results of both.
        return new Parser(function(string $s) use ($p, $q) {
            return concat($p->run($s), $q->run($s));
        });
    }

I like moving some combinator implementations into the Parser class itself. For example, the plus combinator becomes the or method in the class, creating some syntactic sugar for the parsers.


    function plus(Parser $p, Parser $q): Parser
    {
        return $p->or($q);
    }
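As a rough illustration of what moving plus into the class might look like, here is a self-contained sketch. SketchParser is a hypothetical stand-in, not the article's Parser class: it returns plain arrays of [value, rest] pairs instead of ImmList and Pair, and array_merge stands in for concat.

```php
<?php
// Hypothetical stand-in for the Parser class; plain arrays replace ImmList/Pair.
final class SketchParser
{
    private $run;

    public function __construct(callable $run)
    {
        $this->run = $run;
    }

    public function run(string $s): array
    {
        return ($this->run)($s);
    }

    // Choice: run both parsers on the same input and keep every result.
    public function or(SketchParser $q): SketchParser
    {
        return new SketchParser(function (string $s) use ($q): array {
            return array_merge($this->run($s), $q->run($s));
        });
    }
}

// Illustrative helper: match exactly one given character.
$char = function (string $c): SketchParser {
    return new SketchParser(function (string $s) use ($c): array {
        return ($s !== '' && $s[0] === $c) ? [[$c, substr($s, 1)]] : [];
    });
};

$aOrB = $char('a')->or($char('b'));

var_dump($aOrB->run("bcd")); // matched by the second alternative
var_dump($aOrB->run("xyz")); // neither alternative matches: empty array
```

Because the two result lists are simply concatenated, a failing alternative contributes nothing, and two succeeding alternatives would both appear in the output.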

An example of choice parsers could be one that accepts either lowercase or uppercase characters. We could name that parser letter. We can also create another parser that accepts digits or letters, which we can name alphanum.


    context("Choice combinators", function() {
        describe("letter", function() {
            it("can combine parsers to parse letters", function() {
                expect(letter()->run("hello"))->toEqual(ImmList(Pair('h', "ello")));
                expect(letter()->run("Hello"))->toEqual(ImmList(Pair('H', "ello")));
                expect(letter()->run("5ello"))->toEqual(Nil());
            });
        });
        describe("alphanum", function() {
            it("can combine parsers to parse alphanum", function() {
                expect(alphanum()->run("hello"))->toEqual(ImmList(Pair('h', "ello")));
                expect(alphanum()->run("Hello"))->toEqual(ImmList(Pair('H', "ello")));
                expect(alphanum()->run("5ello"))->toEqual(ImmList(Pair('5', "ello")));
                expect(alphanum()->run("#ello"))->toEqual(Nil());
            });
        });
    });

And the elegance of the implementation speaks for itself:


    function letter(): Parser
    {
        return plus(lower(), upper());
    }

    function alphanum(): Parser
    {
        return plus(letter(), digit());
    }

Recursive Combinators

One nice trick we can use is to pass the result parser to plus to create non-deterministic parsers. To illustrate this let’s build a word parser that recognizes entire words from a string. The result may surprise you:


    context("Recursive combinators", function() {
        describe("word", function() {
            it("recognises entire words out of a string", function() {
                expect(word()->run("Yes!"))
                    ->toEqual(ImmList(
                        Pair("Yes", '!'),
                        Pair("Ye", "s!"),
                        Pair('Y', "es!"),
                        Pair("", "Yes!")
                    ));
            });
        });
        //...
    });

Before diving into the reason why we have so many pairs of results let’s look at the implementation:


    function word(): Parser
    {
        $nonEmptyWord = letter()->flatMap(function($x) {
            return word()->map(function($xs) use ($x) {
                return $x . $xs;
            });
        });
        return plus($nonEmptyWord, result(''));
    }

As we can see, using letter means we consume a letter whenever one is available. Having consumed a letter does not mean the parse is finished, though: there may be more letters to come. This kind of choice combinator is non-deterministic. At each step we already have some kind of result, because we have matched a letter, and we also check whether there is more, until we find something that is not a letter and stop. That is why every prefix of the word, including the empty one, shows up in the result list.


Also, note how we use recursion in this implementation to continue parsing, concatenating the results. By the way, that’s the reason I didn’t use the for notation here. PHP is not a lazy language, so implementing recursion with that notation can result in a stack overflow.
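The way the four pairs arise can be traced with a small dependency-free sketch. It assumes parsers are plain closures returning arrays of [value, rest] pairs rather than Phunkie's ImmList of Pair, and it uses ctype_alpha as a shortcut for the letter parser; $letter and $word are illustrative names.

```php
<?php
// Assumption: a parser is a closure string -> list of [value, rest] pairs.
$letter = function (string $s): array {
    return ($s !== '' && ctype_alpha($s[0])) ? [[$s[0], substr($s, 1)]] : [];
};

$word = function (string $s) use (&$word, $letter): array {
    $out = [];
    // Branch 1: consume one letter, recurse on the rest, prepend the letter.
    foreach ($letter($s) as [$x, $rest]) {
        foreach ($word($rest) as [$xs, $rest2]) {
            $out[] = [$x . $xs, $rest2];
        }
    }
    // Branch 2: the result('') alternative, which always succeeds
    // without consuming input.
    $out[] = ['', $s];
    return $out;
};

print_r($word("Yes!"));
```

Each level of recursion contributes its own "stop here" alternative, so the longest match comes first and the empty match comes last.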


A very useful parser is one that can recognize an entire string (or token) inside another string. We will call it string and implement it recursively.


    describe("string", function() {
        it("parses a string", function() {
            expect(string("hello")->run("hello world"))
                ->toEqual(ImmList(Pair("hello", " world")));
            // A token that only partially matches fails as a whole.
            expect(string("helicopter")->run("hello world"))
                ->toEqual(Nil());
        });
    });

Here’s the implementation:


    function string($s): Parser
    {
        return strlen($s)
            ? for_(
                __($c)->_(char($s[0])),
                __($cs)->_(string(substr($s, 1)))
            )->call(concat, $c, $cs)
            : result('');
    }

Simple Repetitions

You can probably imagine that the repetition pattern used in word and string is one that parsers often encounter. We can probably generalize that too. The beauty of creating parsers this way is how they can be combined very easily to create new parsers.


We will now define the simple repetition parser many. We will make many non-deterministic, which means it will never fail: when nothing matches, it returns an empty string instead.


    context("Simple repetitions", function() {
        describe("many", function() {
            it("generalises repetition", function() {
                expect(many(char('t'))->run("ttthat's all folks"))->toEqual(ImmList(
                    Pair("ttt", "hat's all folks"),
                    Pair("tt", "that's all folks"),
                    Pair('t', "tthat's all folks"),
                    Pair("", "ttthat's all folks")
                ));
            });
            it("never produces errors", function() {
                expect(many(char('x'))->run("ttthat's all folks"))->toEqual(ImmList(
                    Pair("", "ttthat's all folks")
                ));
            });
        });
        //...
    });

This implementation will look very similar to the word implementation, only we ask for a parser instead of using letter, so we can represent repetition of any kind.


    function many(Parser $p): Parser
    {
        return plus($p->flatMap(function($x) use ($p) {
            return many($p)->map(function($xs) use ($x) {
                return $x . $xs;
            });
        }), result(''));
    }

We can now define word simply as many(letter()). Similarly, we could try to implement a parser for numbers with many(digit()), but since many also succeeds with an empty match, we need a version of many that matches at least one character. We will call it many1.


    describe("many1", function() {
        it("does not have the empty result of many", function() {
            expect(many1(char('t'))->run("ttthat's all folks"))->toEqual(ImmList(
                Pair("ttt", "hat's all folks"),
                Pair("tt", "that's all folks"),
                Pair('t', "tthat's all folks")
            ));
        });
        it("may produce an error", function() {
            expect(many1(char('x'))->run("ttthat's all folks"))->toEqual(Nil());
        });
    });

Implementation with the for notation:


    function many1(Parser $p): Parser
    {
        return for_(
            __($x)->_($p),
            __($xs)->_(many($p))
        )->call(concat, $x, $xs);
    }
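The difference between many and many1 is easy to see side by side in a dependency-free sketch. As an assumption for illustration, parsers are plain closures returning arrays of [value, rest] pairs instead of Phunkie's ImmList and Pair, and $char, $many and $many1 are hypothetical names. The only difference between the two is the final "zero repetitions" alternative.

```php
<?php
// Assumption: a parser is a closure string -> list of [value, rest] pairs.
$char = function (string $c): callable {
    return function (string $s) use ($c): array {
        return ($s !== '' && $s[0] === $c) ? [[$c, substr($s, 1)]] : [];
    };
};

// many: zero or more repetitions of $p; never fails.
$many = function (callable $p) use (&$many): callable {
    return function (string $s) use ($p, $many): array {
        $out = [];
        foreach ($p($s) as [$x, $rest]) {
            foreach ($many($p)($rest) as [$xs, $rest2]) {
                $out[] = [$x . $xs, $rest2];
            }
        }
        $out[] = ['', $s]; // the result('') branch: zero repetitions
        return $out;
    };
};

// many1: one $p followed by many($p); fails if $p does not match at all.
$many1 = function (callable $p) use ($many): callable {
    return function (string $s) use ($p, $many): array {
        $out = [];
        foreach ($p($s) as [$x, $rest]) {
            foreach ($many($p)($rest) as [$xs, $rest2]) {
                $out[] = [$x . $xs, $rest2];
            }
        }
        return $out;
    };
};

print_r($many($char('t'))("ttx"));  // includes the empty match
print_r($many1($char('t'))("ttx")); // same list without the empty match
```

Note that this sketch, like the original, would loop forever if given a parser that succeeds without consuming input, so it is only meant for parsers such as char that always consume a character.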

We can then use this to implement our natural numbers parser, nat. This time we will go a step further and evaluate the result by casting it into an integer. We can get the result by taking the head of the list, which is a Pair, and then using the _1 accessor to get the parsed value. Here is the example:


    describe("nat", function() {
        it("can be defined with repetition", function() {
            expect(nat()->run("34578748fff"))->toEqual(ImmList(
                Pair(34578748, "fff"),
                Pair(3457874, "8fff"),
                Pair(345787, "48fff"),
                Pair(34578, "748fff"),
                Pair(3457, "8748fff"),
                Pair(345, "78748fff"),
                Pair(34, "578748fff"),
                Pair(3, "4578748fff")
            ));
            expect(nat()->run("34578748fff")->head->_1)->toEqual(34578748);
        });
    });

And here’s the implementation with the casting:


    function nat(): Parser
    {
        return many1(digit())->map(function($xs) {
            return (int) $xs;
        });
    }

If we want negative numbers as well we can use nat to build an int parser.


    describe("int", function() {
        it("can be defined from char('-') and nat", function() {
            expect(int()->run("-251")->head->_1)->toEqual(-251);
            expect(int()->run("251")->head->_1)->toEqual(251);
        });
    });

A simple implementation looks like this:


    function int()
    {
        return plus(
            for_(
                __($_)->_(char('-')),
                __($n)->_(nat())
            )->call(negate, $n),
            nat()
        );
    }

Whatever is bound to $_ is matched but then ignored. The negate function from the Phunkie library will then convert that number into a negative integer. The second alternative, nat(), makes sure we pick up the positive numbers too.
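For readers who want to poke at this without Phunkie, the following sketch mimics nat and int with plain closures. It assumes results are arrays of [value, rest] pairs, so index [0][0] plays the role of head->_1, and it inlines the minus-sign check instead of using the for notation and negate. The names $digit, $digits1, $nat and $int are illustrative only.

```php
<?php
// Assumption: a parser is a closure string -> list of [value, rest] pairs.
$digit = function (string $s): array {
    return ($s !== '' && ctype_digit($s[0])) ? [[$s[0], substr($s, 1)]] : [];
};

// Equivalent of many1(digit()): every non-empty run of leading digits,
// longest first.
$digits1 = function (string $s) use (&$digits1, $digit): array {
    $out = [];
    foreach ($digit($s) as [$d, $rest]) {
        foreach ($digits1($rest) as [$ds, $rest2]) {
            $out[] = [$d . $ds, $rest2];
        }
        $out[] = [$d, $rest];
    }
    return $out;
};

// nat: map the digit strings to integers.
$nat = function (string $s) use ($digits1): array {
    return array_map(function ($pair) {
        return [(int) $pair[0], $pair[1]];
    }, $digits1($s));
};

// int: either '-' followed by nat (negated), or plain nat.
$int = function (string $s) use ($nat): array {
    $negative = [];
    if ($s !== '' && $s[0] === '-') { // the matched '-' itself is discarded
        foreach ($nat(substr($s, 1)) as [$n, $rest]) {
            $negative[] = [-$n, $rest];
        }
    }
    return array_merge($negative, $nat($s));
};

var_dump($int("-251")[0][0]); // first value of the first pair
var_dump($int("251")[0][0]);
```

Taking the first pair works because the longest match is always produced first, which mirrors why the article's tests read the head of the list.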


Repetition with Separators

At this point we can create very interesting parsers already. We could, for example, parse a list of integers written in the PHP array notation: [1,-42,500]. With what we have already we can write it like this:


    for_(
        __($open)->_(char('[')),
        __($n)->_(int()),
        __($ns)->_(many(
            for_(
                __($comma)->_(char(',')),
                __($m)->_(int())
            )->call(concat, $comma, $m)
        )),
        __($close)->_(char(']'))
    )->call(concat, $open, $n, $ns, $close);

In fact, we can simplify this by generalizing the usage of the separator — in this case, the character comma (,). Next we will implement sepBy1, a parser that is applied many times, separated by the application of another parser. Here is an example:


    describe("sepby1", function() {
        it("applies a parser several times separated by another parser", function() {
            expect(sepby1(letter(), digit())->run("a1b2c3d4"))
                ->toEqual(ImmList(
                    Pair("abcd", '4'),
                    Pair("abc", "3d4"),
                    Pair("ab", "2c3d4"),
                    Pair('a', "1b2c3d4")
                ));
        });
    });

We can quickly implement it like this:


    function sepBy1(Parser $p, Parser $sep)
    {
        return for_(
            __($x)->_($p),
            __($xs)->_(many(for_(
                __($_)->_($sep),
                __($y)->_($p)
            )->yields($y)))
        )->call(concat, $x, $xs);
    }
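The same shape can be sketched without the for notation, which may help when reading the version above. This is an illustration under assumptions, not the library code: parsers are plain closures returning arrays of [value, rest] pairs, and $sat, $sepBy1 and the inner $tail helper are hypothetical names.

```php
<?php
// Assumption: a parser is a closure string -> list of [value, rest] pairs.
$sat = function (callable $pred): callable {
    return function (string $s) use ($pred): array {
        return ($s !== '' && $pred($s[0])) ? [[$s[0], substr($s, 1)]] : [];
    };
};

// sepBy1(p, sep): one p, then zero or more (sep followed by p).
// Separator values are matched but dropped from the result.
$sepBy1 = function (callable $p, callable $sep): callable {
    // tail: zero or more "sep p" groups, longest match first.
    $tail = function (string $s) use (&$tail, $p, $sep): array {
        $out = [];
        foreach ($sep($s) as [$_, $afterSep]) {
            foreach ($p($afterSep) as [$y, $afterP]) {
                foreach ($tail($afterP) as [$ys, $rest]) {
                    $out[] = [$y . $ys, $rest];
                }
            }
        }
        $out[] = ['', $s]; // zero further groups
        return $out;
    };

    return function (string $s) use ($p, $tail): array {
        $out = [];
        foreach ($p($s) as [$x, $afterP]) {
            foreach ($tail($afterP) as [$xs, $rest]) {
                $out[] = [$x . $xs, $rest];
            }
        }
        return $out;
    };
};

$lettersByDigits = $sepBy1($sat('ctype_alpha'), $sat('ctype_digit'));
print_r($lettersByDigits("a1b2c3d4"));
```

A trailing separator with nothing after it, such as the final 4 here, is left unconsumed, because a "sep p" group only counts when both parts match.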

After moving the implementation of sepBy1 into the Parser class, we can now re-implement our parser for the list of integers using sepBy1 as an infix operator, which makes it more readable.


    function ints()
    {
        return for_(
            __($_)->_(char('[')),
            __($ns)->_(int()->sepBy1(char(','))),
            __($_)->_(char(']'))
        )->yields($ns);
    }

Let’s apply one more generalization here by extracting the surrounding pattern.


    function surrounded(Parser $open, Parser $p, Parser $close)
    {
        return for_(
            __($_)->_($open),
            __($ns)->_($p),
            __($_)->_($close)
        )->yields($ns);
    }

In such a way we can then re-write ints:


    function ints()
    {
        return surrounded(
            char('['),
            int()->sepBy1(char(',')),
            char(']')
        );
    }

A JSON Parser

Using what we have now at our disposal, let’s build a JSON parser. I don’t think we’ll need a step-by-step description for this, and I will link the GitHub repo from the post. Here I’ll also talk you through some of the less intuitive aspects.


The parser should basically construct a JsonObject object based on the specification. This is an almost full implementation. For simplicity and brevity, I will leave out whitespace, fancy numbers and string issues.


So here is the example:


    context("Json parser", function() {
        describe("json_object", function() {
            it("parses json objects with no space or fancy digits", function() {
                expect((string) json_object()->run(
                    '{a:1,b:null,c:true,d:{x:"a",b:[1,2,3]}}'
                )->head->_1)->toEqual((string) new JsonObject(
                    ImmMap([
                        "a" => new JsonNumber(1),
                        "b" => new JsonNull(),
                        "c" => new JsonBoolean(true),
                        "d" => new JsonObject(
                            ImmMap([
                                "x" => new JsonString("a"),
                                "b" => new JsonArray([
                                    new JsonNumber(1),
                                    new JsonNumber(2),
                                    new JsonNumber(3)
                                ])
                            ])
                        )
                    ])
                ));
            });
        });
    });

The json_value parser is a choice parser that sits on top. Its implementation is very straightforward:


    function json_value(): Parser
    {
        return json_string()
            ->or(json_boolean())
            ->or(json_null())
            ->or(json_number())
            ->or(json_array())
            ->or(json_object());
    }

An aspect worth mentioning is the conversion from the parsed value into the JSON objects. This is usually achieved with a map at the end of the parser combinator. See, for example, the json_null parser:


    function json_null(): Parser
    {
        return string("null")->map(function($_) {
            return new JsonNull();
        });
    }
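The "match a token, then map it to a value" pattern can be tried in isolation with a small sketch. It is written under assumptions: parsers are plain closures returning arrays of [value, rest] pairs, and native PHP null, true and false stand in for the JsonNull and JsonBoolean classes. The names $string, $map, $plus, $jsonNull and $jsonBool are illustrative.

```php
<?php
// Assumption: a parser is a closure string -> list of [value, rest] pairs.

// string: match an exact token at the start of the input.
$string = function (string $token): callable {
    return function (string $s) use ($token): array {
        return strncmp($s, $token, strlen($token)) === 0
            ? [[$token, substr($s, strlen($token))]]
            : [];
    };
};

// map: transform each parsed value, leaving the remaining input untouched.
$map = function (callable $p, callable $f): callable {
    return function (string $s) use ($p, $f): array {
        return array_map(function ($pair) use ($f) {
            return [$f($pair[0]), $pair[1]];
        }, $p($s));
    };
};

// plus: try both alternatives on the same input.
$plus = function (callable $p, callable $q): callable {
    return function (string $s) use ($p, $q): array {
        return array_merge($p($s), $q($s));
    };
};

// Native values stand in for the article's JsonNull / JsonBoolean objects.
$jsonNull = $map($string('null'), function ($_) { return null; });
$jsonBool = $plus(
    $map($string('true'), function ($_) { return true; }),
    $map($string('false'), function ($_) { return false; })
);

var_dump($jsonNull("null,1"));
var_dump($jsonBool("false]"));
```

The matched text itself is thrown away by the mapping function; only the fact that the token matched matters, which is why the callback's parameter is named $_.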

I created a sepBy1array combinator that, instead of returning the concatenated values at the end, returns an array of the values parsed. This is very handy for the json_array parser.


    function json_array(): Parser
    {
        return char('[')->flatMap(function($_) {
            return sepBy1array(json_value(), char(','))->flatMap(function($elements) {
                return char(']')->map(function($_) use ($elements) {
                    return new JsonArray(
                        array_filter($elements, function($a) { return $a != ''; })
                    );
                });
            });
        });
    }

And here’s a similar arrangement using Phunkie’s immutable maps for the objects.


    function json_object(): Parser
    {
        return char('{')->flatMap(function($_) {
            return sepBy1Map(word()->flatMap(function($key) {
                return char(':')->flatMap(function($colon) use ($key) {
                    return json_value()->map(function($value) use ($key) {
                        return ImmMap($key, $value);
                    });
                });
            }), char(','))->flatMap(function($pairs) {
                return char('}')->map(function() use ($pairs) {
                    return new JsonObject($pairs);
                });
            });
        });
    }

That’s it for now! I hope this has served to show you how powerful functional programming is and how composable parser combinators are. I hope you’ve enjoyed this tutorial and are inspired to try some Phunkie for yourself!


Questions? Comments? Find me on Twitter at @_md or drop a comment below!


[1] – Phunkie repository: https://github.com/phunkie/phunkie
[2] – Marcello Duarte’s parser combinators repository: https://github.com/MarcelloDuarte/ParserCombinators


Translated from: https://www.sitepoint.com/functional-programming-phunkie-building-php-json-parser/
