Posts

Showing posts from September, 2010

Lessons learned from HLVM

Although our HLVM project is intended to bring modern VM techniques to modern programming languages without requiring any new research, we have ended up producing several enlightening new results. This article takes a look at some of the weird and wonderful techniques we used in HLVM . GC as a metaprogram (good!) Whereas most VMs use a garbage collector written in C or C++, the GC in HLVM is written in its own intermediate representation. This unusual design has a variety of benefits: The GC acts as a test for the compiler. The GC benefits from improvements we make to the compiler. The GC is simplified by being able to use features like tail call elimination and higher-order functions. Overall, implementing the GC in HLVM's own IR proved to be hugely beneficial: the GC was very easy to write and is easy to maintain. Fat references (bad!) Most VMs store header data in each heap-allocated block that describes the size and type of a value in order to allow the garbage collector to tr...

The effectiveness of Boehm's GC

Many people still seem to be trying to use Boehm's garbage collector . This is surprising because that GC is conservative , meaning it is incapable of accurately distinguishing between integers and pointers and, consequently, it is prone to memory leaks due to false positives where integers are assumed to be pointers into an allocated heap block, preventing the block from being reclaimed. Consequently, Boehm's GC is a notorious source of memory leaks. Moreover, 32-bit machines with 4Gb of RAM and programs that use a significant proportion of that RAM are still very common and the proportion of the address space in use is, therefore, high so the probability of false positives is very high. Imagine a 32-bit program using 40Mb of heap contains a random integer. The probability of that random integer coincidentally pointing into an allocated heap block is approximately 1% because 1% of the 4Gb address space is covered by heap blocks. Now imagine a 32-bit program containing n rando...

Are multicore-capable garbage collectors hard to write?

In this era of multicore computing, garbage collectors need to allow user threads (aka mutators ) to run in parallel on separate cores simultaneously in order to facilitate efficient shared memory parallel programming. There are two relevant phrases from garbage collection terminology here: Parallel GC means the garbage collector itself has been parallelised in order to speed up garbage collections. For example, a stop-the-world collector might traverse the heap in parallel when marking in an attempt to reduce pause times. Concurrent GC means the garbage collector runs at the same time as the user threads (aka mutators ). For example, Dijkstra's algorithm and derivatives like Doligez-Leroy use fine-grained synchronization to keep a concurrently-running collector apprised of the constantly-changing heap topology. However, we are talking about neither parallel GC nor concurrent GC but, rather, the simpler challenge of just allowing mutators to run in parallel. In the absence of any...

Burton Smith vs Erik Meijer

We were recently led to a fascinating Channel 9 interview where the insideous Erik Meijer walks in on parallelism-expert Burton Smith , culminating in a fight the likes of which have not been seen since Chris Curry vs Sir Clive Sinclair in the Baron of Beef here in Cambridge. A pragmatic Burton points out that lazy evaluation renders performance unpredictable and that, in turn, makes it impossible to tune the granularity of parallel programs to ensure that more effort is spent doing actual work than is wasted administering parallel tasks. The puritanical Erik points out that strictness is essentially another form of side effect because it affects issues such as non-termination. The conclusion is irreconcilable differences between what parallel programming requires and what purely functional programming can provide. Erik and Burton were joined by an elephant in the room though: caches. They are the main difference between supercomputers and multicores and are a game changer, yet peopl...