Posts

Showing posts from July, 2009

OCaml vs F#: Burrows Wheeler Transform

The Burrows-Wheeler Transform (BWT) is a block-sorting data compression algorithm that acts as a preconditioner to dramatically improve the compression ratios of subsequent compression phases in many cases. This algorithm is the core of the bzip2 data compression utility that is ubiquitous on Unix and, in particular, is often much more effective than zip and gzip. The core of the BWT algorithm is easily written in F# : let cmp (str: _ array) i j = let rec cmp i j = if i=str.Length then 1 else if j=str.Length then -1 else let c = compare str.[i] str.[j] in if c 0 then c else cmp (i+1) (j+1) cmp i j let bwt (str: byte array) = let n = str.Length let a = Array.init n (fun i -> i) Array.sortInPlaceWith (cmp str) a Array.init n (fun i -> str.[(a.[i] + n - 1) % n]) However, this implementation is very slow and, in particular, is many times slower than the extremely heavily optimized C implementation found in the bzip2 program. The main ca...

OCaml vs F#: QR decomposition

Recent articles in the OCaml and F#.NET Journals derived high-level implementations of QR decomposition via Householder reductions. This numerical method has many applications, most notably in the computation of best fit parameters of linear sums. Imperfections in OCaml The OCaml programming language allows this algorithm to be expressed very elegantly in only 15 lines of code: # let qr a = let m, n = Matrix.dim a in let rec qr_aux k q r qa = if k = n then q, Ma...

Problems with fslex and fsyacc

Back in 2004, we wrote a mini Mathematica implementation in OCaml in only four days. For fun, we decided to port this OCaml program to F#. Incredibly, porting just the lexer and parser from OCaml to F# has taken longer than it took us to write the entire original OCaml code! The reason is simply that OCaml has an incredibly powerful and mature suite of compiler tools for generating lexers and parsers whereas F# does not. Given that our original OCaml lexer and parser for Mathematica source code were written using ocamllex and ocamlyacc , it seemed obvious to translate the code into fslex and fsyacc . However, there are some serious problems with the F# versions of these tools: Incomplete : the F# tools were only ever intended for the F# compiler itself and not for wider use. Consequently, features are missing, e.g. in the regular expression implementation in  fslex . Unintegrated : there are Visual Studio support for these tools and, consequently, development is cumbersome compar...