Finish example walkthrough

author: Matthew Hall <hallmatthew314@gmail.com> 2023-04-04 22:26:23 +1200
committer: Matthew Hall <hallmatthew314@gmail.com> 2023-04-04 22:26:23 +1200
commit: c98c2b5dba76a64f51fc4dfba279cd616e3338ad (patch)
tree: 6fb86f638982dd76dd8114a404466ebcc8ded33d /README.md
parent: 1888a8b5ea6316724bef2ca7a3d7b5d61707bd96 (diff)
1 files changed, 107 insertions, 8 deletions
diff --git a/README.md b/README.md
index b532ea2..bf15ac4 100644
--- a/README.md
+++ b/README.md
@@ -104,9 +104,111 @@ sign = Parser.token('-')
   .recover(1_i32)
 ```
 
-Final code:
+Now that we can parse a number and its sign, all we need to do is parse them together and combine them.
+There are multiple ways this can be done, but for now we can just use the `parser_chain` macro:
+```
+int32 = parser_chain Char, Int32, "int32",
+  {s, sign},
+  {n, abs_num},
+  pure: n * s
+```
+
+The `parser_chain` macro is given the types needed to generate the parser, as well as a name.
+Next, it receives tuples with some identifier `x` and some other parser `p`.
+The final parser will run each parser `p` in order and bind its result to the matching identifier `x`, to be accessed later.
+These values can be used by other parsers in the chain, or even to define them.
+
+Finally, we have the named argument `pure`, which indicates we just want to compute some value to return at the end.
+In our case, we multiply the results of `sign` and `abs_num` to get the final integer value.
+
+The next step is to parse two numbers as a key-value pair.
+The format of such a pair is a number, followed by optional whitespace, the `=>` symbol, more optional whitespace, then another number.
+
+We already know how to parse the numbers, so let's try to parse the whitespace:
+
+```
+# `#many` is similar to `#some`, but allows matching zero times
+ws = Parser(Char, Char).satisfy(&.whitespace?).many
+```
+
+The `=>` symbol is also easy enough to parse, since we know exactly what to look for:
+
+```
+# `Parser.token_sequence` accepts an array of tokens and only succeeds if
+# the input starts with those same tokens.
+arrow = Parser.token_sequence("=>".chars)
+```
+
+Since the arrow and the optional whitespace are all expected together, let's redefine that last parser to include the whitespace:
+
+```
+arrow = (ws >> Parser.token_sequence("=>".chars) >> ws).named("arrow")
+```
+
+Here we see another option for combining parsers: the `>>` operator.
+Parser objects have special overloads for `+`, `>>`, and `<<`.
+These can be used to quickly sequence multiple parsers together in situations where we don't really care what the actual results of each parser are.
+In this case, `>>` runs the left parser, then the right parser, and keeps only the right parser's result.
+
+Now that we have our pair separator defined, we can parse a pair of numbers:
+```
+pair = parser_chain Char, {Int32, Int32}, "pair",
+  {x, int32},
+  {_, arrow},
+  {y, int32},
+  pure: {x, y}
+```
 
-TODO: add to practical tests
+Here, we parse a number to be stored in `x`, an arrow separator which is then discarded (bound to `_`), and then one more number to be stored in `y`.
+At the end we then put the two numbers in a tuple and return them.
+
+Not much further now!
+Now we need to be able to parse zero or more pairs separated by commas.
+
+The comma parser should look familiar:
+```
+# We're only looking for one token this time, so `Parser.token` is enough here
+delim = (ws >> Parser.token(',') >> ws).named("delim")
+```
+
+But we'll want to employ a new method for all the elements together:
+```
+elements = pair.sep_by(delim).named("elements")
+```
+
+`#sep_by` is a bit like `#some`, because it will parse one or more instances of something.
+The key difference is that it will also parse an instance of something else (the separator) between each pair of instances.
+
+The current implementation would work, but it doesn't support trailing commas.
+That's an easy enough fix with the `<<` operator:
+```
+# `<<` will parse two things, and keep only the value of the first parser.
+elements = (pair.sep_by(delim) << delim.optional).named("elements")
+```
+
+Now we just have to define parsers for the beginning and ending parts of the hash...
+```
+hash_start = (Parser.token('{') >> ws).named("start")
+hash_end   = (ws >> Parser.token('}')).named("end")
+```
+
+...and we have everything we need!
+
+Putting it all together:
+```
+hash = parser_chain Char, Hash(Int32, Int32), "hash",
+  {_,  hash_start},
+  {es, elements.recover([] of {Int32, Int32})},
+  {_,  hash_end},
+  pure: es.to_h
+```
+
+You may have noticed in the earlier definition of `elements` that it would not be able to parse zero pairs of numbers.
+We have addressed that here with a call to `#recover`, supplying an empty array.
+
+Similar to the `int32` parser, we convert the array of tuples into a hash in the `pure` argument.
+
+Final code:
 
 ```
 d = Parser(Char, Char).satisfy(&.number?).named("digit")
@@ -127,23 +229,20 @@ arrow = (ws >> Parser.token_sequence("=>".chars) >> ws).named("arrow")
 
 pair = parser_chain Char, {Int32, Int32}, "pair",
   {x, int32},
-  {_, sep},
+  {_, arrow},
   {y, int32},
   pure: {x, y}
 
 delim = (ws >> Parser.token(',') >> ws).named("delim")
 
-elements = parser_chain Char, Array({Int32, Int32}), "elements",
-  {pairs, pair.sep_by(delim)},
-  {_,     delim.optional}, # trailing comma
-  pure: pairs
+elements = (pair.sep_by(delim) << delim.optional).named("elements")
 
 hash_start = (Parser.token('{') >> ws).named("start")
 hash_end   = (ws >> Parser.token('}')).named("end")
 
 hash = parser_chain Char, Hash(Int32, Int32), "hash",
   {_,  hash_start},
-  {es, elements.recover([] of {Int32, Int32})}
+  {es, elements.recover([] of {Int32, Int32})},
   {_,  hash_end},
   pure: es.to_h
 ```