Posts

Benchmarking in the web age

The TechEmpower website contains some fascinating benchmarks of web servers. The results of this benchmark of multiple requests to servers provide some insight into the performance characteristics of .NET on a modern problem. Specifically, the C# on ASP.NET Core solutions range from 2.5-80× slower than the fastest solution, which is written in Rust. In fact, C# is beaten by the following programming languages, in order: Rust, Java, Kotlin, Go, C, Perl, Clojure, PHP and C++. Furthermore, .NET Core is Microsoft's new, improved and faster version of .NET aimed specifically at these kinds of tasks. So why is it beaten by all those languages? I suspect that a large part of this is the change in workload from the kind of number crunching .NET was designed for to a modern string-heavy workload, and I suspect .NET's GC isn't as optimised for this as the JVM's is. As we have found, .NET has really poor support for JSON compared to other languages and frameworks, with support fragmented across many non-stand...
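To make the workload concrete, here is a minimal sketch of the kind of JSON endpoint these benchmarks measure, written in OCaml using the cohttp-lwt-unix library (an illustration of the workload, not one of the benchmark entries): each request does little more than routing plus serializing a small JSON string, which is exactly where string handling and GC behaviour dominate.

```ocaml
(* A minimal JSON endpoint of the kind these benchmarks measure.
   Requires: opam install cohttp-lwt-unix *)
let callback _conn _req _body =
  Cohttp_lwt_unix.Server.respond_string
    ~headers:(Cohttp.Header.init_with "content-type" "application/json")
    ~status:`OK
    ~body:{|{"message":"Hello, World!"}|}
    ()

let () =
  (* Serve on port 8080; the per-request work is building headers
     and serializing a small JSON response. *)
  Lwt_main.run
    (Cohttp_lwt_unix.Server.create
       ~mode:(`TCP (`Port 8080))
       (Cohttp_lwt_unix.Server.make ~callback ()))
```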

On "Quantifying the Performance of Garbage Collection vs. Explicit Memory Management"

The computer science research paper "Quantifying the Performance of Garbage Collection vs. Explicit Memory Management" by Matthew Hertz and Emery Berger contains an interesting study about memory management. However, the conclusions given in the paper were badly worded and are now being used to justify an anti-GC ideology.

Introduction

That paper describes an experiment that analyzed the performance of a benchmark suite using:

- Tracing garbage collection.
- Oracular memory management (precomputing the earliest point at which free could have been inserted).

The experiment was performed:

- On one VM, the Jikes Research Virtual Machine (RVM).
- Using one programming language, Java, and consequently one programming paradigm, object-oriented programming.
- Using each of the five different garbage collection algorithms provided by that VM.

The five GC algorithms are:

GenMS - Appel-style generational collector (1988)
GenCopy - two generations with copying mature space
CopyMS - nursery...
Some objective quantitative measurements I once posted on Usenet:

Jon Harrop wrote:
> Andreas Rossberg wrote:
>> That is a wild claim, and I doubt that you have any serious statistics
>> to back it up.
>
> Here are some statistics on the proportion of lines of code devoted to
> type annotations from 175kLOC of production OCaml and 5kLOC of production
> Haskell:
>
> OCaml:
>   Hevea    9.0%
>   ADVI     8.6%
>   FFTW3    5.2%
>   Unison   3.5%
>   MLDonkey 2.5%
>   LEdit    1.4%
>   MTASC    0.0%
>   HLVM     0.0%
>
> Haskell:
>   XMonad 19%
>   Darcs  12%

For further comparison, here are some statistics for compilers written in OCaml and Standard ML:

OCaml: 6.3% of 217kLOC
MosML: 13% of 69kLOC
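To make concrete what counts as a line devoted to a type annotation, here is a small illustrative OCaml sketch of my own (not drawn from the projects measured above): the same function written with and without annotations. Type inference makes the annotated version optional, which is why the OCaml percentages above are so low.

```ocaml
(* Annotated version: the ": 'a list -> int" is the kind of text
   counted as a type annotation in the statistics above. *)
let rec length : 'a list -> int = function
  | [] -> 0
  | _ :: t -> 1 + length t

(* Unannotated version: OCaml infers 'a list -> int by itself, so
   idiomatic code often carries no annotations at all. *)
let rec length' = function
  | [] -> 0
  | _ :: t -> 1 + length' t

let () =
  assert (length [1; 2; 3] = 3);
  assert (length' ["a"; "b"] = 2)
```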

What languages did you learn, in what order?

I once compiled the following list of programming languages and which year I started learning them:

1981: Sinclair BASIC (on a ZX81)
1983: BBC BASIC (on a BBC Micro)
1985: Pascal (installed via ROM)
1987: Logo
1987: 6502 assembly (on a BBC Micro)
1989: ARM assembly (on an Acorn Archimedes)
1992: Casio fx-7700 BASIC
1992: C (using the excellent Norcroft compiler for Acorn computers)
1994: C++ (the awful Beebug Easy C++)
1995: UFI (on a VAX running OpenVMS)
1996: StrongARM assembly
1996: Standard ML
1997: Mathematica
1998: QuakeC
2004: OCaml
2006: Java
2007: Common Lisp
2007: Scheme
2007: Scala
2007: Haskell
2007: F#
2009: Clojure
2010: HLVM
2017: Elm
2017: JavaScript

I was recently concerned to hear from Socio-PLT: Quantitative and Social Theories for Programming Language Adoption that the number of languages a programmer knows stagnates after the age of just 20. The decade is nearing its end but I have only learned three languages. Now I'm wondering which languages I should lear...

Background reading on the reference counting vs tracing garbage collection debate

Eight years ago I answered a question on Stack Overflow about the suitability of OCaml and Haskell for soft real-time work like visualization: "for real-time applications you will also want low pause times from the garbage collector. OCaml has a nice incremental collector that results in few pauses above 30ms but, IIRC, GHC has a stop-the-world collector that incurs arbitrarily-long pauses". My personal experience has always been that RAII in C++ incurs long pauses when freeing non-trivial data (i.e. nested, structured data: collections of collections of collections, trees, graphs and so on); non-deferred reference counting has the same problem for the same reason; and tracing garbage collectors like OCaml's work beautifully, but there are many notoriously bad tools like Java that have given tracing garbage collection a bad name. Now that I am revisiting this issue I am surprised to find many individuals and organisations repeating exactly the same experimental tests that I did and coming...
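As a concrete illustration of the kind of pause-time measurement I mean (a minimal sketch of my own, not the exact tests referred to above), the following OCaml program keeps a large nested structure live while timing each allocation step; the worst observed iteration time approximates the worst GC-induced pause.

```ocaml
(* Keep a large nested structure live and time each mutation step.
   The maximum iteration time approximates the worst GC pause.
   Compile with: ocamlfind ocamlopt -package unix -linkpkg pauses.ml *)
let () =
  let big = Array.init 1000 (fun _ -> Array.init 1000 string_of_int) in
  let worst = ref 0.0 in
  for i = 0 to 9_999 do
    let t0 = Unix.gettimeofday () in
    (* Replace one inner array, allocating while most data stays live. *)
    big.(i mod 1000) <- Array.init 1000 string_of_int;
    let dt = Unix.gettimeofday () -. t0 in
    if dt > !worst then worst := dt
  done;
  Printf.printf "worst pause observed: %.2f ms\n" (!worst *. 1000.0)
```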

Does reference counting really use less memory than tracing garbage collection? Mathematica vs Swift vs OCaml vs F# on .NET and Mono

Our previous post caused some controversy by questioning the validity of some commonly-held beliefs. Specifically, the beliefs that reference counting (RC) always requires less memory than tracing garbage collection (GC) and that tracing GCs require 3-4× more memory in order to perform adequately. We presented a benchmark written in Swift and OCaml and noted that the RC'd Swift implementation ran over 5× slower and required over 3× more memory than the tracing GC'd OCaml implementation. This observation disproves those beliefs in their strong form and brings into question whether there is any validity even in their weak forms. After all, we have never seen any empirical evidence to support these beliefs in any form. We received a lot of criticism for that post. The vast majority of the criticism was not constructive but two valid points did arise. Firstly, although our result was anomalous it would be more compelling to see the benchmark repeated across a w...
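For anyone wishing to repeat this kind of measurement, here is a sketch of a representative allocation-heavy workload in OCaml (an illustrative stand-in, not necessarily the exact benchmark from the post): build and discard many short-lived binary trees, then report the run time and the peak heap size from the standard Gc module.

```ocaml
(* Allocation-heavy workload: build and discard many short-lived
   binary trees, then report time and peak heap size. *)
type tree = Leaf | Node of tree * int * tree

let rec make d = if d = 0 then Leaf else Node (make (d - 1), d, make (d - 1))

let rec sum = function
  | Leaf -> 0
  | Node (l, v, r) -> sum l + v + sum r

let () =
  let t0 = Sys.time () in
  let total = ref 0 in
  for _ = 1 to 200 do
    (* Each tree of depth 16 has 2^16 - 1 nodes and dies immediately. *)
    total := !total + sum (make 16)
  done;
  let s = Gc.stat () in
  Printf.printf "checksum=%d time=%.2fs top_heap_words=%d\n"
    !total (Sys.time () -. t0) s.Gc.top_heap_words
```

The equivalent program in an RC'd language can then be compared on both wall-clock time and peak resident memory.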

Does reference counting really use less memory than tracing garbage collection? Swift vs OCaml

The way software developers cling to folklore and avoid proper experimental testing of hypotheses really bothers me. I think perhaps the worst branch of computer science still riddled with myths and legends is memory management. Despite over half a century of research on garbage collection showing that reference counting is inferior to tracing garbage collection algorithms (even though almost all GC research restricts its consideration to Java, when much better algorithms have been known for years), there are still many people who claim otherwise. C++ developers obviously still believe that reference-counted smart pointers are superior to tracing garbage collection, but now other people are doing it too. People who should know better. This situation recently reared its ugly head again when Apple released a promising new programming language called Swift that shuns tracing garbage collection in favor of reference counting. Now, there are logical reasons for Apple to have chosen reference count...