Posts

Memory management myths: promptness

People often assert that scope-based reference counting such as shared_ptr in C++ collects garbage “promptly”, and some people define this as “collects at the earliest possible point”. For example, at the time of writing the Wikipedia page about garbage collection says:

“Compared to tracing garbage collection, reference counting guarantees that objects are destroyed as soon as they become unreachable…” – Wikipedia

Similar claims can even be seen in published research, such as the paper “Down for the Count? Getting Reference Counting Back in the Ring”:

“Of the two fundamental algorithms on which the garbage collection literature is built, reference counting has lived in the shadow of tracing. It has a niche among language developers for whom either performance or completeness is not essential, and is unused by mature high performance systems, despite a number of intrinsic advantages such as promptness of recovery and dependence on local rather than global state.” – Blackburn...

Memory management myths: determinism

Although the vast majority of programmers have now migrated to garbage-collected languages and will probably never go back, a few are still clinging to manual memory management. In most cases the continued use of manual memory management is for good reason, but some of these people are perpetuating myths in an attempt to justify avoiding garbage collection. Determinism can be a genuinely good reason to stick with manual memory management and is practically important in memory-constrained embedded devices. However, C++ programs are not as deterministic as people sometimes claim and, in particular, thread-safe reference counting using shared_ptr is non-deterministic. Specifically, threads holding references to shared reference-counted objects race to perform the final decrement, and the thread that wins the race is responsible for destruction.

Herb Sutter's favorite C++ 10-liner has a memory management bug

In a recently-posted video, Herb Sutter (a prominent C++ expert) describes his favorite C++ 10-liner as “a thread-safe reference-counted object cache”:

shared_ptr<widget> get_widget(int id) {
  static map<int, weak_ptr<widget>> cache;
  static mutex m;
  lock_guard<mutex> hold(m);
  auto sp = cache[id].lock();
  if (!sp) cache[id] = sp = load_widget(id);
  return sp;
}

This example is very interesting. Firstly, it manages to pull in reference counting, weak references and a mutex, all of which are rare in modern programming. Secondly, it contains a memory leak that is difficult to fix in C++ because APIs are burdened with memory management details, and this API is incapable of expressing deterministic cleanup because there is no facility for a widget's destructor to remove its entry from the map. Finally, the correct name for this data structure is a concurrent weak dictionary, specifically one with weak values. You'll find corre...

How do reference counting and tracing garbage collection compare?

Reference counting works by counting the number of references to each object. When the reference count drops to zero the object is definitely unreachable and can be recycled. The first advantage is simplicity. The second advantage is that decrements can occur as locals fall out of scope, so collection can be deterministic and predictable. The third advantage is that the asymptotic complexity of reference counting does not depend upon the total size of the heap.

The first problem is that cycles keep reference counts above zero even when the objects are unreachable, so reference counting alone cannot handle the general case. The second problem with reference counting is that incrementing and decrementing a counter in an object every time a reference to it is copied or dropped is very computationally expensive, because it incurs a cache miss even if nothing else in the object is read or written. The third problem is that multithreaded reference counting is non-deterministic because increments and ...

Why is garbage collection an important feature?

Garbage collection eliminates a class of bugs caused by erroneous memory management (forgetting to free, freeing too soon, freeing more than once). Garbage collection removes the need for APIs to describe contracts about memory management. Garbage collection also facilitates programming styles such as first-class lexical closures in functional programming (see the  Funarg problem ).

What's new in garbage collection?

Since the 1950s there have been three main families of collectors: semi-space, mark-sweep and mark-compact. Almost all production GCs have been generational mark-sweep, even though these exhibit pathological performance when the nursery is full of survivors: the survivors are marked, (physically) copied and all references to them updated, which is ~3x slower than necessary. This situation arises in practice when filling a hash table with heap-allocated keys and/or values. The Beltway (2002) and Immix (2008) garbage collectors introduced a new family, the mark-region GCs. With a mark-region GC the entire heap is a collection of regions, and so is the nursery generation, so the nursery can be logically aged by replacing it with another region when it is full of survivors. Sun's HotSpot JVM introduced the first mainstream mark-region GC with its G1 collector. The advent of multicore in 2005 has meant more emphasis on parallel and concurrent garbage collectors. The Staccato (2008) garbage collector ...

A look at the IronJS source code

The author of IronJS has made some claims (e.g. F# can feel “like a slightly crappier version of C#”) that I find surprising, so I'm reading his code. I find it to be quite odd for a variety of reasons:

- Surprisingly OOP (dozens of interfaces and classes and hundreds of augmentations, static members and overloads) given how well suited ML is to writing compilers.
- Lots of low-level bit-twiddling.
- Uses List.empty instead of [].
- Uses large tuples.
- Uses arrays of closures, e.g. Stmt : (Lexer.Token -> State -> Ast.Tree) array.
- Doesn't use a symbol table.
- Choice of data structures appears to incur lots of unnecessary boxing.
- Odd choice of data structures for memory representation, e.g. null, left and stmt fields for a token are scattered.
- Doesn't seem to use concurrency or parallelism.
- Parser contains only 42 match expressions in 1,400 lines of code.
- Massive amounts of code duplication.
- Hand-rolled parser is unusual. Ha...