# parcom
A simple parser combinator library with a dumb name.
## WARNING
This library is a work in progress.
Any version of this library <1.0.0 should not be used in production environments.
The library is still growing and breaking changes may occur at any time.
## Description
Parcom is a Crystal library that provides parser combinator functionality.
## Prerequisites
* Git
## Installation
Add the following dependency to your project's `shard.yml` file:
```
dependencies:
  parcom:
    git: "https://git.matthewhall.xyz/parcom"
    version: "0.3.0"
```
Then, run
```
shards install
```
## General usage
To use Parcom, create a parser object and call its `#parse` method on the input you want to parse.
As this library uses parser combinators, complex parsers should be built by combining simple parsers together.
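To make the idea of "combining simple parsers" concrete, here is a rough stand-alone sketch in plain Crystal. It does not use Parcom at all: `mini_satisfy` and `mini_some` are hypothetical names invented for this illustration, and the "parser is a proc returning a value plus the remaining input, or `nil`" representation is an assumption made for brevity, not a description of how Parcom works internally.

```crystal
# Stand-alone illustration of the combinator idea. Every name here is
# hypothetical; none of this is Parcom's API or implementation.
# A "parser" is a proc from input to {value, remaining input}, or nil on failure.

# Builds a parser for one character that passes the given test.
def mini_satisfy(&test : Char -> Bool)
  ->(input : String) {
    if !input.empty? && test.call(input[0])
      {input[0], input[1..]}
    else
      nil
    end
  }
end

# Combinator: builds a new parser that runs `parser` repeatedly and
# requires at least one success.
def mini_some(parser)
  ->(input : String) {
    values = [] of Char
    rest = input
    while result = parser.call(rest)
      values << result[0]
      rest = result[1]
    end
    if values.empty?
      nil
    else
      {values, rest}
    end
  }
end

digit = mini_satisfy(&.number?)
digits = mini_some(digit)

p digits.call("42abc") # prints {['4', '2'], "abc"}
p digits.call("abc")   # prints nil
```

The point is only that `digits` is assembled from `digit` without touching its internals. Parcom's own `Parser` objects play the same role, with a richer interface.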
## Example walkthrough
Before we get started, it is recommended to `include` the Parcom module in whatever namespace you are working in:
```
require "parcom"
include Parcom
module YourModule
  def self.main
    puts "Hello world!"
  end
end
YourModule.main
```
Suppose we want to parse a `Hash(Int32, Int32)` literal from a string.
First, we should define how to parse a digit:
```
# This defines a parser that will parse a single Char, check if
# it is a digit, and fail if it is not a digit.
d = Parser(Char, Char).satisfy(&.number?)
```
Numbers often have one or more digits [citation needed], so let's make another parser based on `d` that parses multiple digits:
```
# `Parser#some` is a method that creates a new parser that parses
# one or more instances of what the original parser would parse.
abs_num = d.some
```
We're not quite done with this yet, as we want a parser of `Int32`, but this parser will parse an `Array(Char)`.
We need to change the value inside the parser with the `Parser#map` method:
```
# The `Parser#map` method accepts a block or proc that takes the expected
# parser result and transforms it into something else.
# In this case, we're converting our array of digits into an Int32.
abs_num = d.some.map { |ds| ds.join.to_i32 }
```
Now we have a parser that can parse positive integers (in base 10). But what about negative numbers?
First, we make a parser that parses a '-' sign if it can, but doesn't fail if it can't find one:
```
# `Parser#optional` creates a new parser that tries to parse with the original
# parser, but will return `nil` without consuming any input instead of failing.
sign = Parser.token('-').optional
```
Then we can map the result to `-1` if a sign was parsed, or `1` if it was not, so that we can multiply by it later:
```
sign = Parser.token('-').optional.map do |minus_or_nil|
  # `nil` means no '-' was found, so the number is positive.
  minus_or_nil.nil? ? 1_i32 : -1_i32
end
```
Another way to do this is to use `Parser#recover`, which allows a default value to be specified:
```
# `#map_const` is like `#map`, but it takes a single value to replace
# the parser value with unconditionally.
sign = Parser.token('-')
  .map_const(-1_i32)
  .recover(1_i32)
```
Final code:
TODO: add to practical tests
```
d = Parser(Char, Char).satisfy(&.number?).named("digit")
abs_num = d.some.map { |ds| ds.join.to_i32 }.named("abs_num")
sign = Parser.token('-').map_const(-1_i32).recover(1_i32).named("sign")

# A signed integer: the magnitude multiplied by the sign.
int32 = parser_chain Char, Int32, "int32",
  {s, sign},
  {n, abs_num},
  pure: n * s

ws = Parser(Char, Char).satisfy(&.whitespace?)
  .many
  .named("whitespace")
arrow = (ws >> Parser.token_sequence("=>".chars) >> ws).named("arrow")

# A single `key => value` pair.
pair = parser_chain Char, {Int32, Int32}, "pair",
  {x, int32},
  {_, arrow},
  {y, int32},
  pure: {x, y}

delim = (ws >> Parser.token(',') >> ws).named("delim")
elements = parser_chain Char, Array({Int32, Int32}), "elements",
  {pairs, pair.sep_by(delim)},
  {_, delim.optional}, # trailing comma
  pure: pairs

hash_start = (Parser.token('{') >> ws).named("start")
hash_end = (ws >> Parser.token('}')).named("end")
hash = parser_chain Char, Hash(Int32, Int32), "hash",
  {_, hash_start},
  {es, elements.recover([] of {Int32, Int32})},
  {_, hash_end},
  pure: es.to_h
```