Diffstat (limited to 'README.md')
-rw-r--r-- README.md 115
1 files changed, 107 insertions, 8 deletions
diff --git a/README.md b/README.md
index b532ea2..bf15ac4 100644
--- a/README.md
+++ b/README.md
@@ -104,9 +104,111 @@ sign = Parser.token('-')
.recover(1_i32)
```
-Final code:
+Now that we can parse a number and its sign, all we need to do is parse them together and combine them.
+There are multiple ways this can be done, but for now we can just use the `parser_chain` macro:
+```
+int32 = parser_chain Char, Int32, "int32",
+ {s, sign},
+ {n, abs_num},
+ pure: n * s
+```
+
+The `parser_chain` macro is given the types needed to generate the parser, as well as a name.
+Next, it receives tuples with some identifier `x` and some other parser `p`.
+The final parser will run each of the parsers `p` in order, binding each result to its identifier `x` so it can be accessed later.
+These values can be used by other parsers in the chain, or even to define them.
+
+Finally, we have the named argument `pure`, which indicates we just want to compute some value to return at the end.
+In our case, we multiply the results of `sign` and `abs_num` to get the final integer value.
+
+The next step is to parse two numbers as a key-value pair.
+The format of such a pair is a number, followed by optional whitespace, the `=>` symbol, more optional whitespace, then another number.
+
+We already know how to parse the numbers, so let's try to parse the whitespace:
+
+```
+# `#many` is similar to `#some`, but allows matching zero times
+ws = Parser(Char, Char).satisfy(&.whitespace?).many
+```
+
+The `=>` symbol is also easy enough to parse, since we know exactly what to look for:
+
+```
+# `Parser.token_sequence` accepts an array of tokens and only succeeds if
+# the input starts with those same tokens.
+arrow = Parser.token_sequence("=>".chars)
+```
+
+Since the arrow and the optional whitespace are all expected together, let's re-define that last parser to include the whitespace:
+
+```
+arrow = (ws >> Parser.token_sequence("=>".chars) >> ws).named("arrow")
+```
+
+Here we see another option for combining parsers: the `>>` operator.
+Parser objects have special overloads for `+`, `>>`, and `<<`.
+These can be used to quickly sequence multiple parsers together in situations where we don't really care what the actual results of each parser are.
+In this case, `>>` will run the left parser, then the right parser, keeping only the right parser's result.
+
+Now that we have our pair separator defined, we can parse a pair of numbers:
+```
+pair = parser_chain Char, {Int32, Int32}, "pair",
+ {x, int32},
+ {_, arrow},
+ {y, int32},
+ pure: {x, y}
+```
-TODO: add to practical tests
+Here, we parse a number to be stored in `x`, an arrow separator which is then discarded (bound to `_`), and then one more number to be stored in `y`.
+At the end we then put the two numbers in a tuple and return them.
+
+Not much further now!
+Now we need to be able to parse zero or more pairs separated by commas.
+
+The comma parser should look familiar:
+```
+# We're only looking for one token this time, so `Parser.token` is enough here
+delim = (ws >> Parser.token(',') >> ws).named("delim")
+```
+
+But we'll want to employ a new method for all the elements together:
+```
+elements = pair.sep_by(delim).named("elements")
+```
+
+`#sep_by` is a bit like `#some`, because it will parse one or more instances of something.
+The key difference is that it will also parse an instance of something else (the separator) between each pair of those instances.
+
+The current implementation would work, but it doesn't support trailing commas.
+That's an easy enough fix with the `<<` operator:
+```
+# `<<` will parse two things, and keep only the value of the first parser.
+elements = (pair.sep_by(delim) << delim.optional).named("elements")
+```
+
+Now we just have to define parsers for the beginning and ending parts of the hash...
+```
+hash_start = (Parser.token('{') >> ws).named("start")
+hash_end = (ws >> Parser.token('}')).named("end")
+```
+
+...and we have everything we need!
+
+Putting it all together:
+```
+hash = parser_chain Char, Hash(Int32, Int32), "hash",
+ {_, hash_start},
+ {es, elements.recover([] of {Int32, Int32})},
+ {_, hash_end},
+ pure: es.to_h
+```
+
+You may have noticed in the earlier definition of `elements` that it would not be able to parse zero pairs of numbers.
+We have addressed that here with a call to `#recover`, supplying an empty array.
+
+Similar to the `int32` parser, we convert the array of tuples into a hash in the `pure` argument.
+
+Final code:
```
d = Parser(Char, Char).satisfy(&.number?).named("digit")
@@ -127,23 +229,20 @@ arrow = (ws >> Parser.token_sequence("=>".chars) >> ws).named("arrow")
pair = parser_chain Char, {Int32, Int32}, "pair",
{x, int32},
- {_, sep},
+ {_, arrow},
{y, int32},
pure: {x, y}
delim = (ws >> Parser.token(',') >> ws).named("delim")
-elements = parser_chain Char, Array({Int32, Int32}), "elements",
- {pairs, pair.sep_by(delim)},
- {_, delim.optional}, # trailing comma
- pure: pairs
+elements = (pair.sep_by(delim) << delim.optional).named("elements")
hash_start = (Parser.token('{') >> ws).named("start")
hash_end = (ws >> Parser.token('}')).named("end")
hash = parser_chain Char, Hash(Int32, Int32), "hash",
{_, hash_start},
- {es, elements.recover([] of {Int32, Int32})}
+ {es, elements.recover([] of {Int32, Int32})},
{_, hash_end},
pure: es.to_h
```
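
For readers who want to check their understanding of the combinators used in this change, here is a rough, language-neutral model of their semantics in Python. It is only a sketch: the function names (`keep_right`, `keep_left`, `parse_hash`, and so on) are hypothetical stand-ins for `>>`, `<<`, `#many`, `#optional`, `#recover`, and `#sep_by`, and nothing here is the library's implementation or API. In particular, it assumes that `#sep_by` backtracks when a separator is not followed by another element, which is what lets the trailing-comma version of `elements` work.

```python
# Illustrative model only. A "parser" here is a function (text, index) that
# returns (value, next_index) on success or None on failure.
# All names are hypothetical; this is NOT the library's code.

def token(ch):
    def run(s, i):
        return (ch, i + 1) if i < len(s) and s[i] == ch else None
    return run

def satisfy(pred):
    def run(s, i):
        return (s[i], i + 1) if i < len(s) and pred(s[i]) else None
    return run

def many(p):  # zero or more repetitions; never fails
    def run(s, i):
        out = []
        while (r := p(s, i)) is not None:
            v, i = r
            out.append(v)
        return out, i
    return run

def keep_right(p, q):  # models `p >> q`: run both, keep q's value
    def run(s, i):
        r = p(s, i)
        return None if r is None else q(s, r[1])
    return run

def keep_left(p, q):  # models `p << q`: run both, keep p's value
    def run(s, i):
        r = p(s, i)
        r2 = None if r is None else q(s, r[1])
        return None if r2 is None else (r[0], r2[1])
    return run

def recover(p, default):  # on failure, succeed with `default`, consuming nothing
    def run(s, i):
        r = p(s, i)
        return r if r is not None else (default, i)
    return run

def optional(p):  # like recover, with None as the fallback value
    return recover(p, None)

def sep_by(p, sep):  # one or more `p`, separated by `sep`
    def run(s, i):
        r = p(s, i)
        if r is None:
            return None
        out, i = [r[0]], r[1]
        while (rs := sep(s, i)) is not None:
            rp = p(s, rs[1])
            if rp is None:
                break  # assumed backtracking: the separator stays unconsumed
            out.append(rp[0])
            i = rp[1]
        return out, i
    return run

def int32(s, i):  # optional '-' followed by one or more ASCII digits
    sign = 1
    if i < len(s) and s[i] == "-":
        sign, i = -1, i + 1
    j = i
    while j < len(s) and s[j] in "0123456789":
        j += 1
    return (sign * int(s[i:j]), j) if j > i else None

ws = many(satisfy(str.isspace))
arrow = keep_right(keep_right(ws, keep_right(token("="), token(">"))), ws)
delim = keep_right(keep_right(ws, token(",")), ws)

def pair(s, i):  # number, arrow, number; the arrow's value is discarded
    rx = int32(s, i)
    ra = None if rx is None else arrow(s, rx[1])
    ry = None if ra is None else int32(s, ra[1])
    return None if ry is None else ((rx[0], ry[0]), ry[1])

elements = keep_left(sep_by(pair, delim), optional(delim))
hash_start = keep_right(token("{"), ws)
hash_end = keep_right(ws, token("}"))

def parse_hash(s):
    start = hash_start(s, 0)
    if start is None:
        return None
    es, i = recover(elements, [])(s, start[1])
    return dict(es) if hash_end(s, i) is not None else None
```

In this model, `parse_hash("{1 => 2, -3=>4,}")` returns `{1: 2, -3: 4}`, `parse_hash("{}")` returns `{}` through the `recover` fallback, and `parse_hash("{1 2}")` returns `None` because the arrow is missing.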