
Showing posts from May, 2012

Learning garbage collection theory

I want to learn the theory behind garbage collection. How do I go about it?

I am also a dabbler interested in garbage collection (to the point that I wrote my own garbage-collected VM called HLVM). I learned by reading as many research papers on garbage collection as I could get my hands on and by playing with the ideas myself, both raw in my virtual machine and by writing memory-safe high-level simulations.

The obvious answer is - a compiler textbook... The question is, is it necessary to learn lexical analysis, parsing and the other material that usually precedes garbage collection in such a text?

Lexical analysis, parsing and the rest are not relevant to garbage collection. You might get an outdated, cursory overview of garbage collection from a compiler book, but you need to read research papers to get an up-to-date view, e.g. with regard to multicore.

In short, what are the prerequisites to learning about garbage collection theory?

You need to know about basic graph theory, poi...
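The graph-theory prerequisite boils down to reachability: the mark phase of a tracing GC is just a graph traversal from the roots. Here is a minimal sketch in OCaml; the `block` type, the explicit `heap` list and the `collect` function are invented for illustration, not how a real collector represents the heap:

```ocaml
(* A toy heap: each block is a node holding edges to other blocks. *)
type block = { id : int; mutable children : block list; mutable marked : bool }

(* Mark phase: depth-first traversal from the roots. Every block
   reached is live; anything left unmarked is garbage. The mark bit
   also guarantees termination in the presence of cycles. *)
let rec mark b =
  if not b.marked then begin
    b.marked <- true;
    List.iter mark b.children
  end

(* "Collect": mark from the roots, then partition the heap into
   live (marked) and dead (unmarked) blocks. *)
let collect ~roots ~heap =
  List.iter mark roots;
  List.partition (fun b -> b.marked) heap

let () =
  let a = { id = 1; children = []; marked = false } in
  let b = { id = 2; children = [a]; marked = false } in
  let c = { id = 3; children = []; marked = false } in  (* unreachable *)
  let live, dead = collect ~roots:[b] ~heap:[a; b; c] in
  Printf.printf "live=%d dead=%d\n" (List.length live) (List.length dead)
```

The real subtlety in the literature is not this traversal but doing it while the mutators keep changing the graph underneath you, which is where the snapshot vs on-the-fly distinction comes in.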

Why do garbage collectors pause?

GCs work by tracing reachable heap blocks starting from a set of global roots (global variables, thread stacks and CPU registers). GCs lie on a sliding scale from snapshot to on-the-fly. Snapshot GCs work from a snapshot of the global roots and heap topology. On-the-fly GCs incrementally update their interpretation of the heap as the mutators run. Wholly snapshot GCs attain high throughput because the collector runs almost entirely independently of the mutators, but have high latency because taking a snapshot incurs a pause. Wholly on-the-fly GCs attain low latency because everything is done incrementally, but low throughput because of the fine-grained communication between the mutators and the GC. In practice, all GCs lie somewhere between these two extremes. VCGC is primarily a snapshot GC but it uses a write barrier to keep the collector apprised of changes to the heap topology. Staccato was the world's first parallel and concurrent and real-time GC but it sti...
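A write barrier of the kind VCGC uses is just a hook on pointer stores that tells the collector which references the mutator installed after the snapshot was taken, so those blocks can be re-traced. A minimal sketch in OCaml; the `block` type, the `remembered` log and the `set_next` wrapper are hypothetical, for illustration only:

```ocaml
(* A toy heap block with one mutable pointer field. *)
type block = { id : int; mutable next : block option }

(* Pointers stored by the mutator since the snapshot was taken. A real
   GC records these far more compactly (e.g. card marking). *)
let remembered : block list ref = ref []

(* Write barrier: every pointer store goes through this function
   instead of assigning the field directly, so the collector cannot
   miss a live block the mutator attached mid-collection. *)
let set_next src dst =
  (match dst with
   | Some d -> remembered := d :: !remembered  (* apprise the collector *)
   | None -> ());
  src.next <- dst

let () =
  let a = { id = 1; next = None } in
  let b = { id = 2; next = None } in
  set_next a (Some b);  (* the mutation is logged for the collector *)
  Printf.printf "remembered %d block(s)\n" (List.length !remembered)
```

The cost of this per-store bookkeeping is exactly the throughput-for-latency trade described above: the more of the heap's evolution the collector tracks incrementally, the shorter the pauses but the more the mutators pay.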

Functors vs generics

From this Stack Overflow question:

If I understand it correctly, a functor gives you a new set of functions from a type you provide.

More generally, functors map modules to modules. Your Set example maps a module adhering to the ORDERED_TYPE signature to a module implementing a set. The ORDERED_TYPE signature requires a type and a comparison function. Therefore, your C# is not equivalent because it parameterizes the set only over the type and not over the comparison function. So your C# can only implement one set type for each element type, whereas the functor can implement many set modules for each element module, e.g. in ascending and descending order.

Even if functors are just generics-for-namespaces, what's the significant advantage of that approach?

One major advantage of a higher-order module system is the ability to gradually refine interfaces. In OOP, everything is public or private (or sometimes protected or internal etc.). With modules, you can gradua...
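The point about parameterizing over the comparison function, not just the element type, can be shown directly in OCaml. This is a sketch with a pared-down signature and a sorted-list representation, not the real standard-library `Set` module:

```ocaml
module type ORDERED_TYPE = sig
  type t
  val compare : t -> t -> int
end

(* A functor: maps any ORDERED_TYPE module to a set module.
   Sorted-list representation just to keep the sketch short. *)
module MakeSet (Ord : ORDERED_TYPE) = struct
  type elt = Ord.t
  type t = elt list
  let empty : t = []
  let rec add x = function
    | [] -> [x]
    | y :: ys as s ->
        let c = Ord.compare x y in
        if c = 0 then s                  (* already present *)
        else if c < 0 then x :: s
        else y :: add x ys
  let elements (s : t) = s
end

(* Two different set modules over the SAME element type, differing
   only in the comparison they were built with. *)
module IntAsc  = MakeSet (struct type t = int let compare = compare end)
module IntDesc = MakeSet (struct type t = int let compare a b = compare b a end)

let () =
  let asc  = IntAsc.(add 1 (add 3 (add 2 empty))) in
  let desc = IntDesc.(add 1 (add 3 (add 2 empty))) in
  assert (IntAsc.elements asc = [1; 2; 3]);
  assert (IntDesc.elements desc = [3; 2; 1])
```

`IntAsc` and `IntDesc` are distinct modules with distinct set types, so the type checker also prevents you from accidentally mixing a descending set into code expecting an ascending one.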

Is metaprogramming possible in C#?

Metaprogramming is possible in .NET (see compiler compilers, regular expressions, code DOM, reflection etc.) but C# is not capable of template metaprogramming because it does not have that language feature.

Why don't purely functional languages use reference counting?

You can create cyclic data structures using purely functional programming simply by defining mutually-recursive values at the same time. For example, two mutually-recursive lists in OCaml:

let rec xs = 0 :: ys and ys = 1 :: xs

However, it is possible to define languages that make it impossible to create cyclic structures by design. The result is known as a unidirectional heap and its primary advantage is that garbage collection can be as simple as reference counting. Some real languages do prohibit cycles and use reference counting: Erlang and Mathematica are examples. For example, in Mathematica, when you reference a value you make a deep copy of it, so mutating the original does not mutate the copy:

In[1]:= xs = {1, 2, 3}
Out[1]= {1, 2, 3}

In[2]:= ys = xs
Out[2]= {1, 2, 3}

In[3]:= xs[[1]] = 5
Out[3]= 5

In[4]:= xs
Out[4]= {5, 2, 3}

In[5]:= ys
Out[5]= {1, 2, 3}
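Why a unidirectional heap makes collection so simple: with cycles prohibited, a block dies exactly when its count reaches zero, and freeing it recursively drops its references to its children — no tracing required. A sketch in OCaml; the `block` type and the `freed` log are invented for illustration:

```ocaml
(* A toy reference-counted block in a cycle-free heap. *)
type block = { id : int; mutable rc : int; children : block list }

(* Log of freed block ids, standing in for returning memory to the
   allocator. *)
let freed : int list ref = ref []

(* Dropping a reference decrements the count; hitting zero frees the
   block and recursively drops its references to its children. With
   no cycles, this reclaims every dead block immediately. *)
let rec decref b =
  b.rc <- b.rc - 1;
  if b.rc = 0 then begin
    freed := b.id :: !freed;
    List.iter decref b.children
  end

let () =
  let leaf = { id = 1; rc = 1; children = [] } in
  let root = { id = 2; rc = 1; children = [leaf] } in
  decref root;  (* frees 2, which drops the last reference to 1 *)
  assert (List.sort compare !freed = [1; 2])
```

With a cycle, `decref` would never reach zero on the blocks in the loop — which is precisely why general-purpose reference counting needs a cycle detector, and why languages like Erlang and Mathematica sidestep the problem by forbidding cycles.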

What's wrong with C#?

null everywhere. const nowhere. APIs are inconsistent, e.g. mutating an array returns void but appending to a StringBuilder returns the same mutable StringBuilder. Collection interfaces are incompatible with immutable data structures, e.g. Add in System.Collections.Generic.IList<_> cannot return a result. No structural typing, so you write System.Windows.Media.Effects.SamplingMode.Bilinear instead of just Bilinear. The mutable IEnumerator interface is implemented by classes when it should be an immutable struct. Equality and comparison are a mess: you've got System.IComparable and Equals but then you've also got:

System.IComparable<_>
System.IEquatable
System.Collections.IComparer
System.Collections.IStructuralComparable
System.Collections.IStructuralEquatable
System.Collections.Generic.IComparer
System.Collections.Generic.IEqualityCo...

Algebraic data types vs objects

I am a long-time OO programmer and a functional programming newbie. From my little exposure, algebraic data types only look like a special case of inheritance to me, where you only have a one-level hierarchy and the super class cannot be extended outside the module.

You are describing closed sum types, the most common form of algebraic data types, as seen in F# and Haskell. Basically, everyone agrees that they are a useful feature to have in the type system, primarily because pattern matching makes it easy to dissect them by shape as well as by content, and also because they permit exhaustiveness and redundancy checking. However, there are other forms of algebraic datatypes. An important limitation of the conventional form is that they are closed, meaning that a previously-defined closed sum type cannot be extended with new type constructors (part of a more general problem known as "the expression problem"). OCaml's polymorphic variants allow both open and closed sum types and...
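The dissection-by-shape and exhaustiveness checking described above look like this for a closed sum type in OCaml (the `shape` type is a stock illustrative example, not from the question):

```ocaml
(* A closed sum type: the set of constructors is fixed here, so the
   compiler can check every match for exhaustiveness and redundancy. *)
type shape =
  | Circle of float               (* radius *)
  | Rectangle of float * float    (* width, height *)

(* Pattern matching dissects by shape and binds the contents in one
   step. Deleting either case triggers a non-exhaustiveness warning
   at compile time. *)
let area = function
  | Circle r -> Float.pi *. r *. r
  | Rectangle (w, h) -> w *. h

let () =
  assert (area (Rectangle (2.0, 3.0)) = 6.0);
  assert (Float.abs (area (Circle 1.0) -. Float.pi) < 1e-9)
```

The polymorphic variants mentioned above lift the closedness restriction by letting constructors like `` `Circle`` be shared and extended across types, at the cost of weaker exhaustiveness guarantees in the open case.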

Quotations in F#, OCaml and Lisp

A quotation mechanism lets you embed code in your code and have the compiler transform that code from the source you provide into a data structure that represents it. For example, the following gives you a data structure representing the F# expression 1+2:

> <@ 1+2 @>;;
val it : Quotations.Expr<int> =
  Call (None, Int32 op_Addition[Int32,Int32,Int32](Int32, Int32),
        [Value (1), Value (2)])
    {CustomAttributes = [NewTuple (Value ("DebugRange"),
          NewTuple (Value ("stdin"), Value (3), Value (3), Value (3), Value (6)))];
     Raw = ...;
     Type = System.Int32;}

You can then hack on this data structure in order to apply transformations to your code, such as translating it from F# to JavaScript in order to run it client-side on almost any browser. The F# quotation mechanism is extremely limited in functionality ...