Earley Parser

Earley Parser

May 30
This is the first post in what will hopefully be a series about how to build an Earley parser. I've been struggling to hammer the grammar for a language I'm working on into a form suitable for use with a PEG (parsing expression grammar) parser. Over the past two weekends I was even trying to add support for handling left-recursive grammar rules into Citrus (a PEG parser), but as it turns out, the currently known methods for enabling left-recursion are not powerful enough to recognize all forms of left-recursive grammars. I officially gave up trying to use a PEG parser for my grammar, it simply isn't flexible enough. I've decided to go whole hog, and start trying to implement a parsing method that I know is flexible enough to recognize the entire class of context free grammars; the Earley parsing algorithm will do just that.

If you start looking for resources online that explain how the Earley's algorithm works, you quickly find that there isn't much practically useful information available for free. I found that to be the case when I was investigating whether to use Earley's algorithm or a similarly powerful parsing technique called SGLR parsing. The papers explaining these parsing techniques are all written by professors and researchers whose sole purpose in life is to get published. As a result, they submit their papers to publishers who then agree to include them into their publication but require the author to assign all distribution rights to the publisher. The publisher is then the only one who can publish the paper, and they do so, for $30 per paper! Don't even get me started… In any case, all the good papers about SGLR are behind pay-walls.

Papers about Earley's algorithm are a little more accessible, but the really nice thing about Earley's algorithm is that it is covered in detail in a book called Parsing Techniques: a practical guide. I will be building my parser based on the book's description of Earley's algorithm. Another really nice thing about Earley's algorithm is that it's supposed to be easier to implement than an SGLR parser. We'll see.

If you're interested, section 7.2 of the first edition of Parsing Techniques: a practical guide is where Earley parsers are covered.
Trading Simulation Language

Trading Simulation Language

Apr 30

In designing my simple trading simulation language, I’ve been reading up on Haskell a lot lately. For the same reasons that Rich Hickey documents on the Clojure state and identity page, I like the idea of making a simple functional language. I’ve heard that programs written in functional languages are easier to reason about, and I think that that property makes functional code more intuitive to the reader.

My idea is to translate programs written in my simple language into Clojure (compile to Clojure) similarly to how some high level languages translate to C or C++ (e.g. SequenceL). Clojure is expressive enough to represent all of the constructs in my simple language.

One idea I’m stealing from Haskell is to allow functions of two arguments be called using infix notation. For example, given the following definition of the function elementOf? :

elementOf?(e, sequence) = any?(sequence, (item){ e == item } )

elementOf? can be invoked using either of the following notations:

1. elementOf?(1, [1,2,3])
2. 1 elementOf? [1,2,3]

Both invocations will return true.

One cool thing that Haskell and Erlang both support is pattern matching or destructuring. I’m not planning on implementing destructuring anytime soon. Clojure supports destructuring in let bindings, and it has come in handy every one in a while, but it’s not a must have in my language.

Hopefully, people without a CS background will be able to use my language. That’s the idea.

Parsing and Compilers

Parsing and Compilers

Dec 12
Compilers are a topic in which the phrase "here be dragons" is implied. The classic definitive text on compilers is known as the Dragon Book. I wonder if that was intended.
Language Project

Language Project

Dec 11
So, I've decided I'm going to build a simple domain specific language (DSL) for running trading simulations and for stock screening. My former advisor Dr. Rushton told me that he favors general purpose programming languages over DSLs, but I think I can build a DSL that is easy to use while still being very expressive. I began trying to use one of the many PEG parser generators/combinators, but I found out quickly that PEG grammars, like LL grammars, must avoid left-recursive grammar productions. I want to be able to write whatever grammar rule is intuitive for me to write, and avoiding left recursion is not intuitive for me. I recently found out about generalized LR parsers (GLRs) and wanted to implement one, then I read about scannerless GLRs (SGLRs), but there isn't as much information about those as there is about GLRs. I also wanted a simple parser generator/combinator, and I wanted both a Ruby and Javascript implementation. So, I started thinking about how to build a simple bottom up parser, and I have an idea for how to implement one. I believe I can use regular expressions to build all possible parse trees of a given input sentence (i.e. a parse forrest). In the case that more than one parse tree exists for a given sentence, you must have a way to identify which of the parse trees is the "correct" parse tree. In other words, you need a way to prune all parse trees except one from the parse tree forrest. I think I can prune the parse forrest down to one "correct" parse tree as long as the ambiguous grammar rules are assigned a precedence "rank" and as long as rules that have a right-hand-side containing more than one non-terminal indicate which should be parsed first (i.e. some kind of associativity precedence). I'll keep you posted on the details. I'll put the parser up on github.com as soon as I have something to show.
Bad code isn’t Technical Debt, it’s an unhedged Call Option | Steve Freeman

Bad code isn’t Technical Debt, it’s an unhedged Call Option | Steve Freeman

Nov 21
1) There is an apocryphal story about a trader buying chocolate santa futures and forgetting to sell them on. Eventually a truckload turned up at the Wall Street headquarters.

The article is actually about bad code being characterized as a naked call (a call that you sell without already owning the underlying security) as opposed to “technical debt”, but I mostly liked the story at the end.

Colony (4th weekend)

Colony (4th weekend)

Oct 30

Finally, a working library, sample script and all! I’m thinking I’ll pull out two sub-libraries so I can use them in other projects. I had to build a small redis model library and also a module template library.

The redis model library is a simplistic (read: not ActiveRecord) ORM library that is useful for building Redis-backed model classes.

The module template library allows you to create parameterized modules. This makes Ruby meta-programming much more structured and easy to understand.

You can check out colony at http://github.com/davidkellis/colony.

How AT&T gives the middle finger to its departing “Valued Customers”

How AT&T gives the middle finger to its departing “Valued Customers”

Oct 29

Here’s the exchange I just had with AT&T:

me: Hi, I’d like to cancel my AT&T mobile account.

ATT: Sure, I’ll be happy to help you with that. Now, since your contract is not up, an early termination fee will apply.

me: Yeah, I’d like to avoid the termination fee. My contract expires on November 1st, can I schedule the account cancellation date for November 2nd, so I don’t have to pay the $175 early termination fee?

 

ATT: Sure, we can cancel then. You can cancel your account anytime. Do you want to cancel on the last day of your billing cycle?

me: No, my billing cycle just ended on Oct. 23rd, and the new billing cycle just began a few days ago. I’d like to cancel my account on November 2nd, since my contract ends on November 1st. Can I can cancel on the 2nd and just be charged a pro-rated amount for the week or so of service that I’ll use?

ATT: I’m sorry, we don’t pro-rate the final bill.

me: So you’re going to charge me for a full month of service for only one week of use, even though my account would be cancelled on the 2nd?

ATT: Um, yes.

The conversation ended with me explaining that I was not expecting to be charged for a full month of service for one week of use and that I’d call back once I had decided on a cancellation date.

The real kicker is that one of the last things the lady I was talking with told me was “… and, Mr. Ellis, you are a valued customer, and you will have 45 days (I don’t remember the exact number) to re-open your account if the other carrier doesn’t work out.”

Gee, thanks.

The essence of this conversation is:

me: I’d like to cancel my account.

ATT: Sure, we’d be happy to do that, you !*%&^!%!, I mean, valued customer. By the way, we’ll be charging you for service that you won’t be receiving. Thank you and have a nice day.

Colony (2nd weekend)

Colony (2nd weekend)

Oct 15

I finally have a functioning prototype. It’s incomplete, but I’m checking in what I’ve got.

You can enqueue simple tasks and have them run on a distributed work queue. Once a task is queued, the client can either poll for the computed result, or it can block until the result is available.

Now I’m going to add in the functionality that allows completed tasks to call a callback function once the task is complete.

If you’re interested, check out http://github.com/davidkellis/colony.