What’s a random variable?

What’s a random variable?

Feb 21

I just found the most intuitive definition of a random variable I’ve seen so far: here and here. In short, a random variable is a function that maps each element in a sample space to a real number. A set composed of elements from the sample space, in which every element in the set maps to the same real number value, is an event. In other words, a random variable defines events in (on?, from?) a sample space.

Earley Parser

Earley Parser

May 30
This is the first post in what will hopefully be a series about how to build an Earley parser. I've been struggling to hammer the grammar for a language I'm working on into a form suitable for use with a PEG (parsing expression grammar) parser. Over the past two weekends I was even trying to add support for handling left-recursive grammar rules into Citrus (a PEG parser), but as it turns out, the currently known methods for enabling left-recursion are not powerful enough to recognize all forms of left-recursive grammars. I officially gave up trying to use a PEG parser for my grammar, it simply isn't flexible enough. I've decided to go whole hog, and start trying to implement a parsing method that I know is flexible enough to recognize the entire class of context free grammars; the Earley parsing algorithm will do just that.

If you start looking for resources online that explain how the Earley's algorithm works, you quickly find that there isn't much practically useful information available for free. I found that to be the case when I was investigating whether to use Earley's algorithm or a similarly powerful parsing technique called SGLR parsing. The papers explaining these parsing techniques are all written by professors and researchers whose sole purpose in life is to get published. As a result, they submit their papers to publishers who then agree to include them into their publication but require the author to assign all distribution rights to the publisher. The publisher is then the only one who can publish the paper, and they do so, for $30 per paper! Don't even get me started… In any case, all the good papers about SGLR are behind pay-walls.

Papers about Earley's algorithm are a little more accessible, but the really nice thing about Earley's algorithm is that it is covered in detail in a book called Parsing Techniques: a practical guide. I will be building my parser based on the book's description of Earley's algorithm. Another really nice thing about Earley's algorithm is that it's supposed to be easier to implement than an SGLR parser. We'll see.

If you're interested, section 7.2 of the first edition of Parsing Techniques: a practical guide is where Earley parsers are covered.
Trading Simulation Language

Trading Simulation Language

Apr 30

In designing my simple trading simulation language, I’ve been reading up on Haskell a lot lately. For the same reasons that Rich Hickey documents on the Clojure state and identity page, I like the idea of making a simple functional language. I’ve heard that programs written in functional languages are easier to reason about, and I think that that property makes functional code more intuitive to the reader.

My idea is to translate programs written in my simple language into Clojure (compile to Clojure) similarly to how some high level languages translate to C or C++ (e.g. SequenceL). Clojure is expressive enough to represent all of the constructs in my simple language.

One idea I’m stealing from Haskell is to allow functions of two arguments be called using infix notation. For example, given the following definition of the function elementOf? :

elementOf?(e, sequence) = any?(sequence, (item){ e == item } )

elementOf? can be invoked using either of the following notations:

1. elementOf?(1, [1,2,3])
2. 1 elementOf? [1,2,3]

Both invocations will return true.

One cool thing that Haskell and Erlang both support is pattern matching or destructuring. I’m not planning on implementing destructuring anytime soon. Clojure supports destructuring in let bindings, and it has come in handy every one in a while, but it’s not a must have in my language.

Hopefully, people without a CS background will be able to use my language. That’s the idea.

Parsing and Compilers

Parsing and Compilers

Dec 12
Compilers are a topic in which the phrase "here be dragons" is implied. The classic definitive text on compilers is known as the Dragon Book. I wonder if that was intended.
Language Project

Language Project

Dec 11
So, I've decided I'm going to build a simple domain specific language (DSL) for running trading simulations and for stock screening. My former advisor Dr. Rushton told me that he favors general purpose programming languages over DSLs, but I think I can build a DSL that is easy to use while still being very expressive. I began trying to use one of the many PEG parser generators/combinators, but I found out quickly that PEG grammars, like LL grammars, must avoid left-recursive grammar productions. I want to be able to write whatever grammar rule is intuitive for me to write, and avoiding left recursion is not intuitive for me. I recently found out about generalized LR parsers (GLRs) and wanted to implement one, then I read about scannerless GLRs (SGLRs), but there isn't as much information about those as there is about GLRs. I also wanted a simple parser generator/combinator, and I wanted both a Ruby and Javascript implementation. So, I started thinking about how to build a simple bottom up parser, and I have an idea for how to implement one. I believe I can use regular expressions to build all possible parse trees of a given input sentence (i.e. a parse forrest). In the case that more than one parse tree exists for a given sentence, you must have a way to identify which of the parse trees is the "correct" parse tree. In other words, you need a way to prune all parse trees except one from the parse tree forrest. I think I can prune the parse forrest down to one "correct" parse tree as long as the ambiguous grammar rules are assigned a precedence "rank" and as long as rules that have a right-hand-side containing more than one non-terminal indicate which should be parsed first (i.e. some kind of associativity precedence). I'll keep you posted on the details. I'll put the parser up on github.com as soon as I have something to show.
Bad code isn’t Technical Debt, it’s an unhedged Call Option | Steve Freeman

Bad code isn’t Technical Debt, it’s an unhedged Call Option | Steve Freeman

Nov 21
1) There is an apocryphal story about a trader buying chocolate santa futures and forgetting to sell them on. Eventually a truckload turned up at the Wall Street headquarters.

The article is actually about bad code being characterized as a naked call (a call that you sell without already owning the underlying security) as opposed to “technical debt”, but I mostly liked the story at the end.

Colony (4th weekend)

Colony (4th weekend)

Oct 30

Finally, a working library, sample script and all! I’m thinking I’ll pull out two sub-libraries so I can use them in other projects. I had to build a small redis model library and also a module template library.

The redis model library is a simplistic (read: not ActiveRecord) ORM library that is useful for building Redis-backed model classes.

The module template library allows you to create parameterized modules. This makes Ruby meta-programming much more structured and easy to understand.

You can check out colony at http://github.com/davidkellis/colony.