Posts

Showing posts from December, 2010

Extensibility in functional programming languages

Most software developers are now familiar with inheritance and virtual methods as common techniques for extensibility from the object-oriented paradigm. When faced with functional programming for the first time, these developers often ask how to write extensible code in this alien paradigm. The functional paradigm actually provides only a single form of extensibility: higher-order functions. These allow you to factor out "inner" functions. For example, code that often appears with the same first and last code blocks:

  let f x =
    first x;
    stuff1 x;
    last x

  let g x =
    first x;
    stuff2 x;
    last x

can be factored into a general higher-order function that is reused from the specific cases:

  let hof stuff x =
    first x;
    stuff x;
    last x

  let f = hof stuff1
  let g = hof stuff2

Applying this aggressively leads to design patterns such as parser combinators and is a very powerful and lightweight technique for making code extensible. However, it does not make data types extensible. Conseque...
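To make this concrete, here is a minimal, runnable OCaml sketch of the same factoring, assuming hypothetical first and last actions that bracket each operation (the names and the printing behaviour are illustrative, not from the original post):

  (* Hypothetical bracketing actions; any effectful pre/post steps would do. *)
  let first x = Printf.printf "begin %d\n" x
  let last x = Printf.printf "end %d\n" x

  (* The general higher-order function factors out the shared structure. *)
  let hof stuff x =
    first x;
    stuff x;
    last x

  (* The specific cases are recovered by partial application. *)
  let f = hof (fun x -> Printf.printf "doubled: %d\n" (2 * x))
  let g = hof (fun x -> Printf.printf "squared: %d\n" (x * x))

  let () = f 3; g 3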

Distinctive traits of functional programming languages

The landscape of functional programming languages is remarkably diverse, with most of the major families having quite distinctive traits and dialects that bring their own quirks. Here are some of the major categorizations:

Evaluation strategy: non-strict (Miranda, Haskell) vs strict evaluation.
Type system: static (Standard ML, OCaml, F#, Haskell, Scala, C# 3) vs dynamic (Scheme, Lisp, Clojure, Erlang) typing, and untyped (Mathematica).
Kind of static typing: structural (OCaml) vs nominal (F#, Haskell, Scala, C# 3).
Type inference: Damas-Milner (Standard ML, OCaml, F#, Haskell) vs "local" inference (Scala, C# 3).
Destructuring: pattern matching (Standard ML, OCaml, F#, Haskell, Erlang, Mathematica) vs manual deconstruction (Scheme, Lisp, C#), illustrated in the sketch after this list.
Extensibility of algebraic types: always closed (Standard ML, Haskell) vs optionally closed (OCaml).
Pattern matching: linear (Standard ML, OCaml, Haskell) vs unbounded (F#, Mathematica).
Run-time code generation: me...
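As a hedged illustration of the destructuring trait, here is a small OCaml sketch (ours, not from the original post) contrasting pattern matching with the manual deconstruction that languages without it require:

  (* Pattern matching destructures and branches in a single step. *)
  let rec sum = function
    | [] -> 0
    | x :: rest -> x + sum rest

  (* Manual deconstruction: test for emptiness, then project the head
     and tail explicitly, as a Scheme or Lisp programmer would. *)
  let rec sum_manual xs =
    if xs = [] then 0
    else List.hd xs + sum_manual (List.tl xs)

  let () = assert (sum [1; 2; 3] = sum_manual [1; 2; 3])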

Why GC when you have reference counted smart pointers?

Reference counted smart pointers are a simple form of garbage collection usable from the C++ programming language. A recent question on Stack Exchange asks why anyone would want anything more when reference counted smart pointers are already available. Other forms of garbage collection (most notably tracing GCs) have several advantages over reference counting:

Accuracy: Reference counting alone leaks cycles, so reference counted smart pointers will leak memory in general unless other techniques are added to catch cycles. Once those techniques are added, reference counting's benefit of simplicity has vanished. (A cyclic example that defeats naive reference counting is sketched after this excerpt.)

Throughput: Smart pointers are one of the least efficient forms of garbage collection, particularly in the context of multi-threaded applications where reference counts are bumped atomically. There are advanced reference counting techniques designed to alleviate this, but tracing GCs are still the algorithm of choice in production environments.

Latency: Typical smart pointer ...
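As a minimal sketch of the accuracy point, here is an OCaml example (our own, purely for illustration) of a cycle that a tracing GC reclaims but naive reference counting never would:

  (* Two nodes that refer to each other form a cycle. Under naive
     reference counting, each node's count never drops to zero, so the
     pair would leak; a tracing collector reclaims both once they
     become unreachable. *)
  type node = { mutable next : node option }

  let () =
    let a = { next = None } in
    let b = { next = Some a } in
    a.next <- Some b  (* a and b now reference only each other *)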

Towards a mark-region GC for HLVM

Our previous article highlighted the advantages of the recent mark-region GC design and hinted at HLVM adopting it. We just completed some preliminary tests using a prototype written in C++ to measure the performance of different allocation strategies. Our results are given below, with times normalized by the time an equivalent OCaml program takes (so 1.0 means as fast as OCaml). The four columns in each section give the times relative to OCaml for solving the 8-, 9-, 10- and 11-queens problems (a sketch of this benchmark follows the excerpt):

The "Boehm" section refers to the conservative Boehm GC, which is 40-70% slower than OCaml on this benchmark.
The "malloc" section refers to allocating with glibc's malloc without ever freeing, which is 2.2-3.1× slower than OCaml.
The "free" section refers to allocating with malloc and freeing manually, which is 1.9-2.3× slower than OCaml.
The "bump" section refers to a naive bump allocator that never recycles memory and is 1.4-1.7× slower...
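For reference, a minimal list-based n-queens solver of the kind used in such benchmarks might look as follows in OCaml; this formulation is an assumption on our part, not the prototype's actual benchmark code:

  (* Counts the solutions to the n-queens problem. The consing in
     [x :: qs] makes this an allocation-heavy workload, which is what
     makes n-queens a useful GC benchmark. *)
  let rec safe x d = function
    | [] -> true
    | q :: qs -> x <> q && abs (x - q) <> d && safe x (d + 1) qs

  let queens n =
    let rec place qs k =
      if k = 0 then 1
      else begin
        let total = ref 0 in
        for x = 1 to n do
          if safe x 1 qs then total := !total + place (x :: qs) (k - 1)
        done;
        !total
      end
    in
    place [] n

  let () = Printf.printf "%d\n" (queens 8)  (* prints 92 *)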

When generational GC goes bad

For many years, generational collection was the de facto standard GC architecture. Based upon the observation that the distribution of value lifetimes is heavily skewed towards short lifetimes (most values die young), generational garbage collectors allocate into a nursery generation and copy survivors out into an old generation. Many practical language implementations use generational garbage collection, including OCaml, GHC and .NET. Generational collection works well when the generational hypothesis holds but struggles when values survive the nursery only to become unreachable soon afterwards. This corresponds to common allocation patterns such as cycling values through mutable queues or caches and filling hash tables. Imagine repeatedly enqueuing and dequeuing values on a queue: the lifetimes of the values are proportional to the length of the queue, so this provides a simple way to quantify the performance overhead of generational garbage collection (a sketch of this workload follows the excerpt). If boxed values are enqueued...
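A minimal OCaml sketch of that queue workload, with illustrative sizes (the queue length and iteration count are our choices, not figures from the post):

  (* Each boxed value survives exactly [len] enqueue/dequeue steps, so
     its lifetime is proportional to the queue length. Choosing [len]
     large enough that values outlive the nursery forces every value to
     be promoted to the old generation only to die shortly afterwards:
     the worst case for a generational collector. *)
  let churn len steps =
    let q = Queue.create () in
    for i = 1 to len do Queue.add (ref i) q done;
    for i = 1 to steps do
      ignore (Queue.pop q);
      Queue.add (ref i) q
    done

  let () = churn 1_000_000 10_000_000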

Getting paid to remove features

Although the Industrial Haskell Group has yet to garner its first industrial member since its inception almost two years ago, it has managed the impressive feat of getting paid to remove a feature from Haskell: specifically, to make it easier to build Haskell programs that do not rely upon the GNU Multiple Precision (GMP) library for arbitrary-precision arithmetic (bignums). We made this interesting observation when considering adding GMP-based bignums as a primitive type for HLVM. Apparently, having bignums in the language is not very useful beyond irrelevant microbenchmarks such as computing the digits of π. The slides here also criticize the Caml Consortium (which has garnered 11 members) for charging too little, and state that the IHG aimed to garner five members each paying £12k per annum. Why has this target not yet been reached? Our guess is insufficient sales and marketing directed at the decision makers in industry who could benefit from using Haskell. As an aside, we believe...

Texas Multicore Technologies on Haskell

US-based startup Texas Multicore Technologies have published some of the results they obtained using Haskell for parallel programming. Their results mirror our own. Thanks to Manuel Chakravarty of the University of New South Wales for drawing this to our attention.