The D Blog

Snowflake Strings

Feb 22, 2017 • WalterBright • #Compilers & Tools, #Core Team, #Guest Posts

Walter Bright is the BDFL of the D Programming Language and founder of Digital Mars. He has decades of experience implementing compilers and interpreters for multiple languages, including Zortech C++, the first native C++ compiler. He also created Empire, the Wargame of the Century.

Consider the following D code in file.d:

int foo(int i) {
    assert(i < 3);
    return i;
}

This is equivalent to the C code:

#include <assert.h>

int foo(int i) {
    assert(i < 3);
    return i;
}

The assert() in the D code is “lowered” (i.e. rewritten by the compiler) to the following:

(i < 3 || _d_assertp("file.d", 2))

We’re interested in how the compiler writes that string literal, "file.d", to the generated object file. The most obvious implementation is to write the characters into the data section and push the address of that to call _d_assertp().

Indeed, that does work, and it’s tempting to stop there. But since we’re professional compiler nerds obsessed with performance, details, etc., there’s a lot more oil in that olive (to borrow one of Andrei’s favorite sayings). Let’s put it in a press and start turning the screw, because assert()s are used a lot.

First off, string literals are immutable (they were originally mutable in C and C++, but are no longer, and D tries to learn from such mistakes). This suggests the string can be put in read-only memory. What advantages does that deliver?

Read-only memory is, well, read only. Attempts to write to it are met with a seg fault exception courtesy of the CPU’s memory management logic. Seg faults are a bit graceless, like a cop pulling you over for driving on the wrong side of the road, but at least there wasn’t a head-on collision or corruption of the program’s memory.
Read-only pages are never swapped out by the virtual memory system; they never get marked as “dirty” because they are never written to. They may get discarded and reloaded, but that’s much less costly.
Read-only memory is safe from corruption by malware (unless the malware infects the MMU, sigh).
Read-only memory in a shared library is shared - copies do not have to be made for each user of the shared library.
Read-only memory does not need to be scanned for pointers to the heap by the garbage collection system (if the application does GC).

Essentially, shoving as much as possible into read-only memory is good for performance, size and safety. There’s the first drop of oil.
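The immutability itself is visible at the language level. The following is only a minimal sketch of that point: it checks types at compile time and says nothing about where the bytes end up, since that is decided by the compiler and the object file format.

```d
// String literals have type `string`, an alias for immutable(char)[].
static assert(is(typeof("file.d") == string));
static assert(is(string == immutable(char)[]));

// Writing through a string is rejected at compile time, so well-typed
// code never needs the literal's bytes to live in writable memory.
static assert(!__traits(compiles, { string s = "file.d"; s[0] = 'F'; }));

unittest
{
    string name = "file.d";
    assert(name[0] == 'f'); // reading is always fine
    assert(name.length == 6);
}
```

It can be checked with dmd -main -unittest -run on the file.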

The next issue crops up as soon as there’s more than one assert:

int foo(int i, int j) {
    assert(i < 3);
    assert(j & 1);
    return i + j;
}

The compiler emits two copies of "file.d" into the object file. Since they’re identical, and read-only, it makes sense to only emit one copy:

string TMP = "file.d";
int foo(int i, int j) {
    (i < 3 || _d_assertp(TMP, 2));
    (j & 1 || _d_assertp(TMP, 3));
    return i + j;
}

This is called string pooling and is fairly simple to implement. The compiler maintains a hash table of string literals and their associated symbol names (TMP in this case).
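As a rough illustration of such a table, here is a toy pool built on a D associative array. The StringPool type, its intern method, and the TMP0, TMP1, ... naming are all made up for this sketch and are not taken from dmd’s source.

```d
import std.conv : to;

/// A toy string pool: maps each distinct literal to one symbol name.
/// Hypothetical type for illustration only, not dmd's implementation.
struct StringPool
{
    private string[string] symbolOf; // literal contents -> symbol name

    /// Returns the symbol for `literal`, creating one on first sight.
    string intern(string literal)
    {
        if (auto existing = literal in symbolOf)
            return *existing;
        auto name = "TMP" ~ symbolOf.length.to!string;
        symbolOf[literal] = name;
        return name;
    }

    /// Number of distinct literals that would have to be emitted.
    size_t emitted() const { return symbolOf.length; }
}

unittest
{
    StringPool pool;
    auto a = pool.intern("file.d");  // first assert
    auto b = pool.intern("file.d");  // second assert reuses the symbol
    auto c = pool.intern("other.d"); // a different literal gets its own
    assert(a == b);
    assert(a != c);
    assert(pool.emitted == 2);
}
```

The pool only lives for one compilation, which is exactly why it cannot help across object files.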

So far, this is working reasonably well. But when asserts migrate into header files, macros, and templates, the same string can appear in lots of object files, since the compiler doesn’t know what is happening in other object files (the separate compilation model). Other string literals can exhibit this behavior, too, when generic coding practices are used. There needs to be some way to present these in the object file so the linker can pool identical strings.

The dmd D compiler currently supports four different object file formats on different platforms:

Elf, for Linux and FreeBSD
Mach-O, for OSX
MS-COFF, for Win64
OMF, for Win32

Each does it in a different way, with different tradeoffs. The methods tend to be woefully underdocumented, and figuring this stuff out is why I get paid the big bucks.

Elf

Elf turns out to have a magic section just for this purpose. It’s named .rodata.strM.N where N is replaced by the number of bytes a character has, and M is the alignment. For good old char strings, that would be .rodata.str1.1. The compiler just dumps the strings into that section, and the Elf linker looks through it, removing the redundant strings and adjusting the relocations accordingly. It’ll handle the usual string types - char, wchar, and dchar - with aplomb.

There are just a couple of flaws. The end of a string is determined by a nul character. This means that strings cannot have embedded nuls, or the linker will regard them as multiple strings and shuffle them about in unexpected ways. One cannot have relocations in those sections, either. This means it’s only good for C string literals, not other kinds of data.

This poses a problem for D, where the strings are length-delineated strings, not nul-terminated ones. Does this mean D is doomed to being unable to take advantage of the C-centric file formats and linker design? Not at all. The D compiler simply appends a nul when emitting string literals. If the string does have an embedded nul (allowed in D), it is not put in these special sections (and the benefit is lost, but such strings are thankfully rare).
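The decision described above can be sketched in a few lines. The helper names canUseMergeSection and emittedBytes are hypothetical, chosen for this example; they are not functions in dmd.

```d
import std.algorithm.searching : canFind;

/// True if `literal` may go into a nul-terminated mergeable string section.
/// Hypothetical helper for illustration, not part of dmd.
bool canUseMergeSection(string literal)
{
    // An embedded nul would make the linker see several shorter strings.
    return !literal.canFind('\0');
}

/// The bytes written to the object file: the contents plus a trailing nul.
string emittedBytes(string literal)
{
    return literal ~ '\0';
}

unittest
{
    assert(canUseMergeSection("file.d"));
    assert(!canUseMergeSection("a\0b")); // embedded nul: use a plain section
    assert(emittedBytes("hi") == "hi\0");
    assert(emittedBytes("hi").length == 3);
}
```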

Mach-O

Mach-O uses a variant of the Elf approach, a special section named __cstring. It’s more limited in that it only works with single byte chars. No wchar_ts for you! If there ever was confirmation that UTF-16 and UTF-32 are dead end string types, this should be it.

MS-COFF

Microsoft invented MS-COFF by extending the old Unix COFF format. It has many magic sections, but none specifically for strings. Instead, it uses what are called COMDAT sections, one for each string. COMDATs are sections tagged with a unique name, and when the linker is faced with multiple COMDATs with the same name, one is picked and all references to the other COMDATs are rewritten to refer to the Anointed One. COMDATs first appeared in object formats with the advent of C++ templates, since template code generation tends to generate the same code over and over in separate files.
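To make the “pick one, redirect the rest” rule concrete, here is a toy model of it in D. The Comdat struct and foldComdats function are invented for this sketch, and a real linker does far more (it also rewrites the references); the sketch only shows the selection by name.

```d
/// One COMDAT section as seen by a toy linker. Illustrative names only.
struct Comdat
{
    string name;       // the unique tag the linker compares
    string objectFile; // which object file this copy came from
}

/// Keeps the first COMDAT seen for each name and drops the rest.
Comdat[] foldComdats(Comdat[] sections)
{
    Comdat[] kept;
    bool[string] seen;
    foreach (s; sections)
    {
        if (s.name in seen)
            continue; // duplicate: references would go to the kept copy
        seen[s.name] = true;
        kept ~= s;
    }
    return kept;
}

unittest
{
    auto kept = foldComdats([
        Comdat("str_hello", "a.obj"),
        Comdat("str_hello", "b.obj"),
        Comdat("str_world", "b.obj"),
    ]);
    assert(kept.length == 2);
    assert(kept[0].objectFile == "a.obj");
}
```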

(Isn’t it interesting how object file formats are driven by the needs of C and C++?)

The COMDAT for "hello" would look something like this:

??_C@_05CJBACGMB@hello?$AA@:
db 'h', 'e', 'l', 'l', 'o', 0

The tty noise there is the mangled name of the COMDAT which is generated from the string literal’s contents. The algorithm must match across compilation units, as that is how the linker decides which ones are the same (experimenting with it will show that the substring CJBACGMB is some sort of hash). Microsoft’s algorithm for the mangling and hash is undocumented as far as I can determine, but it doesn’t matter anyway, it only has to have a 1:1 mapping between name and string literal. That screams “MD5 hash” to me, so that’s what dmd does. The name is an MD5 hash of the string literal contents, which also has the nice property that no matter how large the string gets, the identifier doesn’t grow.
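The fixed-length property is easy to see with Phobos’s std.digest.md. This is a sketch under assumptions: the comdatName function and the "_str_" prefix are made up here, and dmd’s real mangling is different.

```d
import std.digest.md; // md5Of, and toHexString via its public import

/// Builds a fixed-length symbol name from a literal's contents.
/// The "_str_" prefix is invented for this sketch; dmd's mangling differs.
string comdatName(string literal)
{
    ubyte[16] hash = md5Of(literal);
    auto hex = toHexString(hash); // char[32] held on the stack
    return "_str_" ~ hex[].idup;  // copy it before it goes out of scope
}

unittest
{
    // Same contents give the same name, in any compilation unit.
    assert(comdatName("hello") == comdatName("hello"));
    assert(comdatName("hello") != comdatName("world"));

    // The name does not grow with the literal.
    auto longer = "a much longer string literal than hello";
    assert(comdatName("hello").length == comdatName(longer).length);
}
```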

COMDATs can have anything stuffed in them, so this is a feature that is usable for a lot more than just strings.

The downside of the COMDAT scheme is the space taken up by all those names, so a program shipped with its debug symbols can end up much larger.

OMF

The caboose is OMF, an ancient format going back to the early 80’s. It was extended with a kludgy COMDAT system to support C++ just before most everyone abandoned it. DMD still emits it for Win32 programs. We’re stuck with it because it’s the only format the default linker (OPTLINK) understands, and so we find a way to press it into service.

Since it has COMDATs, that’s the mechanism used. The wrinkle is that COMDATs are code sections or data sections only; there are no other options. We want it to be read-only, so the string COMDATs are emitted as code sections (!). Hey, it works.

Conclusion

I don’t think we’ve pressed all the oil out of that olive yet. It may be like memcpy, where every new crop of programmers thinks of a way to speed it up.

I hope you’ve enjoyed our little tour of literals, and may all your string literals be unique snowflakes.

Thanks to Mike Parker for his help with this article.

A New Import Idiom

Feb 13, 2017 • DanielNielsen • #Guest Posts, #The Language

Daniel Nielsen is an Embedded Software Engineer. He is currently using D in his spare time for an unpublished Roguelike and warns that he “may produce bursts of D Evangelism”.

I remember one day in my youth, before the dawn of the Internet, telling my teachers about “my” new algorithm, only to learn it had been discovered by the ancient Greeks in ~300 BC. This is the story of my life and probably of many who are reading this. It is easy to “invent” something; being the first, not so much!

Anyway, this is what all the fuss is about this time:

template from(string moduleName)
{
  mixin("import from = " ~ moduleName ~ ";");
}

The TL;DR version: A new idiom to achieve even lazier imports.

Before the C programmers start running for the hills, please forget you ever got burned by C++ templates. The above snippet doesn’t look that complicated, now does it? If you enjoy inventing new abstractions, take my advice and give D a try. Powerful, yet an ideal beginner’s language. No need to be a template archwizard.

Before we proceed further, I’d like to call out Andrei Alexandrescu for identifying that there is a problem which needs solving. Please see his in-depth motivation in DIP 1005. Many thanks also to Dominikus Dittes Scherkl, who helped trigger the magic spark by making his own counterproposal and questioning if there really is a need to change the language specification in order to obtain Dependency-Carrying Declarations (DIP 1005).

D, like many modern languages, has a fully fledged module system where symbols are directly imported (unlike the infamous C #include). This has ultimately resulted in the widespread use of local imports, limiting the scope as much as possible, in preference to the somewhat slower and less maintainable module-level imports:

// A module-level import
import std.datetime;
  
void fun(SysTime time)
{
  import std.stdio; // A local import
  ...
}

Similar lazy import idioms are possible in other languages, for instance Python.

The observant among you might notice that because SysTime is used as the type of a function parameter, std.datetime must be imported at module level. Which brings us to the point of this blog post (and DIP 1005). How can we get around that?

void fun(from!"std.datetime".SysTime time)
{
  import std.stdio;
  ...
}

There you have it, the Scherkl-Nielsen self-important lookup.

In order to fully understand what’s going on, you may need to learn some D-isms. Let’s break it down.

When instantiating a template (via the ! operator), if the TemplateArgument is one token long, the parentheses can be omitted from the template parameters. So from!"std.datetime" is the same as from!("std.datetime"). It may seem trivial, but you’d be surprised how much readability is improved by avoiding ubiquitous punctuation noise.
Eponymous templates. The declaration of a template looks like this:

template y() {
    int x;
}

With that, you have to type y!().x in order to reach the int. Oh, ze horror! Is that a smiley? Give me x already! That’s exactly what eponymous templates accomplish:

template x() {
    int x;
}

Now that the template and its only member have the same name, x!().x can be shortened to simply x!().
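The same mechanism with a template parameter, as a small sketch (the names square and cube are just examples):

```d
// Eponymous template: the member shares the template's name.
template square(int n)
{
    enum square = n * n;
}

// Instantiating the template yields the member directly.
static assert(square!4 == 16);

// Without an eponymous member, the inner name must be spelled out.
template cube(int n)
{
    enum value = n * n * n;
}

static assert(cube!(3).value == 27);
```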

Renamed imports allow accessing an imported module via a user-specified namespace. Here, std.stdio is imported normally:

void printSomething(string s) {
    import std.stdio;
    writeln(s);           // The normal way
    std.stdio.writeln(s); // An alternative using the fully qualified 
                          // symbol name, for disambiguation
}

Now it’s imported and renamed as io:

void printSomething(string s) {
    import io = std.stdio;
    io.writeln(s);         // Must be accessed like this.
    writeln(s);            // Error
    std.stdio.writeln(s);  // Error
}

Combining what we have so far:

template dt() {
    import dt = std.datetime; 
}
void fun(dt!().SysTime time) {}

It works perfectly fine. The only thing which remains is to make it generic.

String concatenation is achieved with the ~ operator.

string hey = "Hello," ~ " World!";
assert(hey == "Hello, World!");

String mixins (https://dlang.org/mixin.html) put the power of a compiler writer at your fingertips. Let’s generate code at compile time, then compile it. This is typically used for domain-specific languages (see Pegged, https://github.com/PhilippeSigaud/Pegged, for one prominent use of a DSL in D), but in our simple case we only need to generate one single statement based on the name of the module we want to import.

Putting it all together, we get the final form, allowing us to import any symbol from any module inline:

template from(string moduleName)
{
  mixin("import from = " ~ moduleName ~ ";");
}
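Here is a short usage sketch showing the template in both a constraint and an expression. The function largest is a made-up example; only the from template comes from the text above.

```d
template from(string moduleName)
{
    mixin("import from = " ~ moduleName ~ ";");
}

// Both the constraint and the body name their module inline, so this
// file needs no module-level imports at all.
T largest(T)(T a, T b)
    if (from!"std.traits".isIntegral!T)
{
    return from!"std.algorithm.comparison".max(a, b);
}

unittest
{
    assert(largest(3, 7) == 7);
    assert(largest(-1L, -5L) == -1L);

    // Floating point arguments are rejected by the constraint.
    static assert(!__traits(compiles, largest(1.5, 2.5)));
}
```

Nothing from std.traits or std.algorithm.comparison is imported until largest is actually instantiated.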

In the end, is it all really worth the effort? Using one comparison made by Jack Stouffer:

import std.datetime;
import std.traits;

void func(T)(SysTime a, T value) if (isIntegral!T)
{
    import std.stdio : writeln;
    writeln(a, value);
}

Versus:

void func(T)(from!"std.datetime".SysTime a, T value)
    if (from!"std.traits".isIntegral!T)
{
    import std.stdio : writeln;
    writeln(a, value);
}

In this particular case, the total compilation time dropped to ~30% of the original, while the binary size dropped to ~41% of the original.

What about the linker, I hear you cry? Sure, it can remove unused code. But it’s not always as easy as it sounds, in particular due to module constructors (think __attribute__((constructor))). In either case, it’s always more efficient to avoid generating unused code in the first place rather than removing it afterwards.

So this combination of D features was waiting there to be used, but somehow no one had stumbled on it before. I agreed with the need Andrei identified for Dependency-Carrying Declarations, yet I wanted even more. I wanted Dependency-Carrying Expressions. My primary motivation comes from being exposed to way too much legacy C89 code.

void foo(void)
{
#ifdef XXX /* needed to silence unused variable warnings */
  int x;
#endif
... lots of code ...
#ifdef XXX
  x = bar();
#endif
}

Variables or modules, in the end they’re all just symbols. For the same reason C99 allowed declaring variables in the middle of functions, one should be allowed to import modules where they are first used. D already allows importing anywhere in a scope, but not in declarations or expressions. It was with this mindset that I saw Dominikus Dittes Scherkl’s snippet:

fun.ST fun()
{
   import someModule.SomeType;
   alias ST = SomeType;
   ...
}

Clever, yet for one thing it doesn’t adhere to the DRY principle. Still, it was that tiny dot in fun.ST which caused the spark. There it was again, the Dependency-Carrying Expression of my dreams.

Criteria:

It must not require repeating fun, since that causes problems when refactoring
It must be lazy
It must be possible today with no compiler updates

Templates are the poster children of lazy constructs; they don’t generate any code until instantiated. So that seemed a good place to start.

Typically when using eponymous templates, you would have the template turn into a function, type, variable or alias. But why make the distinction? Once again, they’re all just symbols in the end. We could have used an alias to the desired module (see Scherkl’s snippet above); using the renamed imports feature is just a short-cut for import and alias. Maybe it was this simplified view of modules that made me see more clearly.

Now then, is this the only solution? No. As a challenge to the reader, try to figure out what this does and, more importantly, its flaw. Can you fix it?

static struct STD
{
  template opDispatch(string moduleName)
  {
    mixin("import opDispatch = std." ~ moduleName ~ ";");
  }
}

Project Highlight: DPaste

Jan 30, 2017 • DBlogAdmin • #Compilers & Tools, #Project Highlights

DPaste is an online compiler and collaboration tool for the D Programming Language. Type in some D code, click run, and see the results. Code can also be saved so that it can be shared with others. Since it was first announced in the forums back in 2012, it has become a frequently used tool in facilitating online discussions in the D community. But Damian Ziemba, the creator and maintainer of DPaste, didn’t set out with that goal in mind.

Actually it was quite spontaneous and random. I was hanging out on the #D IRC channel at freenode. I was quite amazed at how active this channel was. People were dropping by asking questions, lots of code snippets were floating around. One of the members created an IRC bot that was able to compile code snippets, but it was for his own language that he created with D. Someone else followed and created the same kind of bot, but with the ability to compile code in D, though it didn’t last long as it was run on his own PC. So I wrote my own, purely in D, that was compiling D snippets. It took me maybe 4-5 hours to write both an IRC support lib and the logic itself. Then some server hardening where the bot was running and voila, we had nazbot @ #D, which was able to evaluate statements like ^stmt import std.stdio; writeln("hello world"); and would respond with, "hello world".

Nazbot became popular and people started floating new ideas. That ultimately led Damian to take a CMS he had already written in PHP and repurpose it to use as a frontend for what then became DPaste.

The frontend is written in PHP and uses MySQL for storage. It acts as a web interface (using a Bootstrap HTML template and jQuery) and API provider for 3rd Parties. The backend is responsible for actual compilation and execution. It’s possible to use multiple backends. The frontend is a kind of load-balancer when it comes to choosing a backend. The frontend and the backend may live on different machines.

DPaste is primarily used through the web interface, but it’s also used by dlang.org.

Once dpaste.dzfl.pl was well received, the idea popped up that maybe we could provide runnable examples on the main site. So it was implemented. The next idea, proposed by Andrei Alexandrescu, was to enable runnable examples on all of the Phobos documentation. I got swallowed by real life and couldn’t contribute at the time, but eventually Sebastian Wilzbach took it up and finished the implementation. So today we have interactive examples in the Phobos documentation.

When Damian first started work on DPaste in 2011, the D ecosystem looked a bit different than it does today.

There weren’t as many 3rd party libraries as we have now; there was no DUB, there was no vibe.d, etc. I wish I’d had vibe.d back then. I would have implemented the frontend in D instead of PHP.

What I enjoy the most about D is just how “nice” to the eye the language is (compared to C and C++, which I work with on a daily basis) and how easy it is to express what’s in your mind. I’ve never had to stop and think, “how the hell can I implement this”, which is quite common with C++ in my case. In the current state, what is also amazing is how D is becoming a “batteries-included” package. Whatever you need, you just dub fetch it.

He’s implemented DPaste such that it requires very little in terms of maintenance costs. It automatically updates itself to the latest compiler release and also knows how to restart itself if the backend hangs for some reason. He says the only real issue he’s had to deal with over the past five years is spam, which has forced him to reimplement the captcha mechanism several times.

As for the future? He has a few things in mind.

I plan to rewrite the backend from scratch, open source it and use a docker image so anybody can easily pick up development or host his own backend (which is almost done). Functionally, I want to maintain different compiler versions like DMD 2.061.0, DMD 2.062.1, DMD 2.063.0, LDC 0.xx, GDC x.xx.xx, etc., and connect more architectures as backends (currently x86, arm and aarch64 are planned).

I also want to rewrite the frontend in D using vibe.d, websockets, and angular.js. In general, I would like to make the created applications more interactive. So, for example, you could use the output from your code snippet in realtime as it is produced. I would like also to split a middle end off from the frontend. The middle end would provide communication with backends and offer both a REST API and websockets. Then the frontend would be responsible purely for user interaction and nothing else.

He would also like to see DPaste become more official, perhaps through making it a part of dlang.org. And for a point further down the road, Damian has an even grander plan.

I hope to make a full blown online IDE for dlang.org with workspaces, compilers to choose, and so on.

That would be cool to see!

Testing In The D Standard Library

Jan 20, 2017 • JackStouffer • #Guest Posts, #Phobos, #The Language

Jack Stouffer is a member of the Phobos team and a contributor to dlang.org. You can check out more of his writing on his blog.

In the D standard library, colloquially named Phobos, we take a multi-pronged approach to testing and code review. Currently, there are five different services any addition has to go through:

The whole compiler chain of tests: DMD’s and DRuntime’s test suite, and Phobos’s unit tests
A documentation builder
Coverage analysis
A style checker
And a community project builder/test runner

Using these, we’re able to automatically catch the vast majority of common problems that we see popping up in PRs. And we make regressions much less likely using the full test suite and examining coverage reports.

Hopefully this will provide some insight into how a project like a standard library can use testing in order to increase stability. Also, it can spark some ideas on how to improve your own testing and review process.

Unit Tests

In D, unit tests are an integrated part of the language rather than a library feature:

size_t sum(int[] a)
{
    size_t result;

    foreach (e; a)
    {
        result += e;
    }

    return result;
}

unittest
{
    assert(sum([1, 2, 3]) == 6);
    assert(sum([0, 50, 100]) == 150);
}

void main() {}

Save this as test.d and run dmd -unittest -run test.d. Before your main function is run, all of the unittest blocks will be executed. If any of the asserts fail, execution is terminated and an error is printed to stderr.

The effect of putting unit tests in the language has been enormous. One of the main ones we’ve seen is tests no longer have the “out of sight, out of mind” problem. Comprehensive tests in D projects are the rule and not the exception. Phobos dogfoods inline unittest blocks and uses them as its complete test suite. There are no other tests for Phobos than the inline tests, meaning for a reviewer to check their changes, they can just run dmd -main -unittest -run std/algorithm/searching.d (this is just for quick and dirty tests; full tests are done via make).

Every PR onto Phobos runs the inline unit tests, DMD’s tests, and the DRuntime tests on the following platforms:

Windows 32 and 64 bit
MacOS 32 and 64 bit
Linux 32 and 64 bit
FreeBSD 32 and 64 bit

This is done by Brad Roberts’s auto-tester. As a quick aside, work is currently progressing to bring D to iOS and Android.

Idiot Proof

In order to avoid pulling untested PRs, we have three mechanisms in place. First, only PRs which have at least one Github review by someone with pull rights can be merged.

Second, we don’t use the normal button for merging PRs. Instead, once a reviewer is satisfied with the code, we tell the auto-tester to merge the PR if and only if all tests have passed on all platforms.

Third, every single change to any of the official repositories has to go through the PR review process. This includes changes made by the BDFL Walter Bright and the Language Architect Andrei Alexandrescu. We have even turned off pushing directly to the master branch in Github to make sure that nothing gets around this.

Unit Tests and Examples

Unit tests in D solve the perennial problem of out of date docs by using the unit test code itself as the example code in the documentation. This way, documentation examples are part of the test suite rather than just some comment which will go out of date.

With this format, if the unit test goes out of date, then the test suite fails. When the tests are updated, the docs change automatically.

Here’s an example:

/**
 * Sums an array of `int`s.
 * 
 * Params:
 *      a = the array to sum
 * Returns:
 *      The sum of the array.
 */
size_t sum(int[] a)
{
    size_t result;

    foreach (e; a)
    {
        result += e;
    }

    return result;
}

///
unittest
{
    assert(sum([1, 2, 3]) == 6);
    assert(sum([0, 50, 100]) == 150);
}

// only tests with a doc string above them get included in the
// docs
unittest
{
    assert(sum([100, 100, 100]) == 300);
}

void main() {}

Run dmd -D test.d and it generates an un-styled HTML page in which the documented unittest block appears as an example beneath the documentation for sum.

Phobos uses this to great effect. The vast majority of examples in the Phobos documentation are from unittest blocks. For example, here is the documentation for std.algorithm.find and here is the unit test that generates that example.

This is not a catch all approach. Wholesale example programs, which are very useful when introducing a complex module or function, still have to be in comments.

Protecting Against Old Bugs

Despite our best efforts, bugs do find their way into released code. When they do, we require the person who’s patching the code to add in an extra unit test underneath the buggy function in order to protect against future regressions.
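As an illustration of the pattern, here is a sketch with an invented function and an invented bug; neither mean nor the empty-array failure comes from Phobos.

```d
/// Arithmetic mean of `a`, or 0 for an empty array.
/// Hypothetical function used only to illustrate the pattern.
double mean(const int[] a)
{
    if (a.length == 0)
        return 0; // the (hypothetical) fix for empty input
    long total = 0;
    foreach (e; a)
        total += e;
    return cast(double) total / a.length;
}

// Regression test for the hypothetical empty-array bug, placed
// directly beneath the function it protects.
unittest
{
    int[] empty;
    assert(mean(empty) == 0);
    assert(mean([2, 4]) == 3);
}
```

Because the test sits next to the code, anyone who later touches the function sees the old failure case immediately.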

Docs

For Phobos, the documentation pages which were changed are generated on a test server for every PR. Developed by Vladimir Panteleev, the DAutoTest allows reviewers to compare the old page and the new page from one location.

For example, this PR changed the docs for two structs and their member functions. This page on DAutoTest shows a summary of the changed pages with links to view the final result.

Coverage

Perfectly measuring the effectiveness of a test suite is impossible, but we can get a good rough approximation with test coverage. For those unaware, coverage is the ratio of the lines of code that were executed during a test suite to the total number of executable lines.

DMD has built-in coverage analysis to work in tandem with the built-in unit tests. Instead of dmd -unittest -run main.d, do dmd -unittest -cov -run main.d and a file will be generated showing a report of how many times each line of code was executed with a final coverage ratio at the end.

We generate this report for each PR. Also, we use codecov in order to get details on how well the new code is covered, as well as how coverage for the whole project has changed. If coverage for the patch is lower than 80%, then codecov marks the PR as failed.

At the time of writing, of the 77,601 lines of code (not counting docs or whitespace) in Phobos, 68,549 were covered during testing and 9,052 lines were not. This gives Phobos a test coverage of 88.3%, which is increasing all of the time. This is all achieved with the built-in unittest blocks.
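The arithmetic behind that figure, as a small sketch (coveragePercent is a helper name made up for this example):

```d
/// Coverage as a percentage of executable lines.
/// Hypothetical helper; the inputs below are the figures quoted above.
double coveragePercent(size_t covered, size_t uncovered)
{
    return 100.0 * covered / (covered + uncovered);
}

unittest
{
    // Covered and uncovered lines add up to the quoted total.
    static assert(68_549 + 9_052 == 77_601);

    // 68,549 / 77,601 is a little over 88.3%.
    immutable p = coveragePercent(68_549, 9_052);
    assert(p > 88.3 && p < 88.4);
}
```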

Project Tester

Because test coverage doesn’t necessarily “cover” all real world use cases and combinations of features, D uses a Jenkins server to download, build, and run the tests for a select number of popular D projects with the master branches of Phobos, DRuntime, and DMD. If any of the tests fail, the reviewers are notified.

Style And Anti-Pattern Checker

Having a code style set from on high stops a lot of pointless Internet flame wars (tabs vs spaces anyone?) dead in their tracks. D has had such a style guide for a couple of years now, but its enforcement in official code repos was spotty at best, and was mostly limited to brace style.

Now, we use CircleCI in order to run a series of bash scripts and the fantastically helpful dscanner which automatically checks for all sorts of things you shouldn’t be doing in your code. For example, CircleCI will give an error if it finds:

bad brace style
trailing whitespace
using whole module imports inside of functions
redundant parentheses

And so on.

The automation of the style checker and coverage reports was done by Sebastian Wilzbach. dscanner was written by Brian Schott.

Closing Thoughts

We’re still working to improve some things. Currently, Sebastian is writing a script to automatically check the documentation of every function for at least one example. Plus, the D Style Guide can be expanded to end arguments over the formatting of template constraints and other contested topics.

Practically speaking, other than getting the coverage of Phobos up to >= 95%, there’s not too much more to do. Phobos is one of the most thoroughly tested projects I’ve ever worked on, and it shows. Just recently, Phobos hit under 1000 open bugs, and that’s including enhancement requests.

The D Language Foundation Google Summer of Code 2016 Postmortem

Jan 13, 2017 • CraigDillabaugh • #D Foundation, #GSoC, #Guest Posts

Craig Dillabaugh was first drawn to D by its attractive syntax and Walter Bright’s statement that D is “a programming language, not a religion”. He maintains bindings to the geospatial libraries shapelib and gdal, volunteered to manage the GSoC 2015 & 2016 efforts for D, and has taken it on again for 2017. He lives near Ottawa, Canada, and works for a network monitoring/security company called Solana Networks.

The 2016 Google Summer of Code (GSoC) proved to be a great success for the D Language Foundation. Not only did we have, for us, a record number of slots allotted (four) and all projects completed successfully, perhaps most important of all we attracted four excellent students who will hopefully be long time contributors to the D Language and its community. This report serves as a review for the community of our GSoC efforts this past summer, and tries to identify some ways we can make 2017 an equal, or better, success.

Background

Back in 2011 and 2012, Digital Mars applied to participate in, and was accepted to, Google Summer of Code. In each of those years we were awarded three slots and had successful projects. Additionally, a number of long time D contributors, including David Nadlinger, Alex Bothe, and Dmitry Olshansky, were involved as students. Sadly, in the succeeding two years we were not awarded any slots. After 2014’s unsuccessful bid, Andrei asked on the forums if anyone wanted to take the lead for the 2015 GSoC, as he had too many things on his plate. This is when I decided to volunteer for the job.

I prepared for the 2015 GSoC and worked on getting some solid items for our Ideas page. I even prepared what I thought was a beautifully typeset document in LaTeX for our final submission. Needless to say, I was very disappointed when I had to copy/paste each section into the simple web form that Google provided for submissions. Sadly, that year we were rejected once more, though I felt our list of ideas was solid.

We applied again in 2016 for the first time as The D Language Foundation. Again, the community contributed lots of solid suggestions for the Ideas page and we were accepted for the first time in four years. I think that perhaps getting accepted involves a bit of luck, as our ideas were similar to, or repeated from, those that were not accepted in 2015. However, more effort was put into polishing up the page, so perhaps that helped.

The Selection Process

Once we were accepted as a mentoring organization, the process of receiving student proposals began. We received interest from a large number of students from all over the world (about 35). In the end, a total of 23 proposals were officially submitted, ranging from very short–obviously last minute–pieces, to several excellent efforts, including Sebastian Wilzbach’s 20-page document.

Our selection process was, I felt, very rigorous. We had seven of our potential admins/mentors screen the initial proposals. This involved reading all 23 proposals, which was a significant amount of work. From this initial screening we identified eight students/proposals that we thought could become successful projects. We then had all mentors individually rank each of the shortlisted proposals, another significant time commitment on their part.

Finally, interviews were arranged with all eight students. In most cases, two mentors interviewed each student, and the interviews were fairly intense, job-style interviews that involved coding exercises. A number of our mentors were involved in this process, but I think Amaury Sechet interviewed all of the students. It is no small feat to arrange and then conduct interviews with students in so many different time zones, so a huge thanks to all the mentors, but Amaury in particular. Those involved in the screening/interview process included Andrei Alexandrescu, Ilya Yaroshenko, Adam Ruppe, Adam Wilson, Dragos Carp, Russel Winder, Robert Schadek, Amaury, and myself.

Awarding of Slots

The next step for our organization was to decide how many slots we would request from Google. I really had no idea what to expect, but I was hoping we might get two slots awarded to us, as there were many good organizations vying for a limited number of slots. We felt that most of the short-listed projects could have been successful, but decided to not be too greedy and requested just four slots. As it turned out, perhaps we should have asked for more; we were awarded all four. We then selected our top four ranked students from the interview process. They were, in no particular order:

Sebastian Wilzbach: Science for D - a non-uniform RNG (mentor: Ilya Yaroshenko)
Lodovico Giaretta: Phobos: std.xml (mentor: Robert Schadek)
Wojciech Szeszol: Improvements to DStep (mentor: Russel Winder)
Jeremy DeHaan: Precise Garbage Collector (mentor: Adam Wilson)

Summer of Code

Once the projects were awarded, I must say that most of my work was done. From there on the mentors and students got down to work. I tried to keep tabs on progress and asked for regular updates from both the mentors and the students. These were, in most cases, promptly provided.

While there were some challenges, and a few projects had to be modified slightly in some instances, everyone progressed steadily throughout the summer, so there were no emergencies to deal with. All of our students passed their mid-term evaluations and by the end of the summer all four projects were completed (although Jeremy has some on-going work on his precise GC). As a result, everyone got paid and, I presume, everyone was happy.

In addition to our original mentors, thanks are due to Jacob Carlborg (DStep) and Joseph Rushton Wakeling (RNG) for providing additional expertise.

Mentor Summit

Google offered money for students to attend academic conferences and present results based on their GSoC work. Google also offered to pay travel costs for two mentors to travel to the mentor summit in California. Regrettably, none of our students had the time to take advantage of the conference money, but Robert Schadek was able to attend the Mentor Summit from Oct 28th to 30th in Sunnyvale, California. There he was able to mingle with, and learn from, mentors from the other organizations that participated.

Looking Forward

It is hard to believe, but the process starts all over again in a few short months. The success of this past year will create expectations for 2017, and I hope that we can replicate that success. A number of lessons were learned from this past year that we can carry forward into the next round. So in this section, I will try to distill some of what we learned to help guide our efforts in the coming year.

The Ideas Page and Advertising

Most of the work of identifying projects was carried out through the D Forums, with the odd email to past mentors. This was generally successful, but a number of proposals from previous years ended up being recycled. While it may be inevitable, it seemed that many of the proposal ideas were added at the last minute. Since a number of our best ideas from the 2016 page are now completed projects, we will need to replenish the Ideas page for 2017.

Recommendations

We should post a PDF version of one of the successful proposals on our Ideas page to give students an example of what we expect. Although it was excellent, we likely shouldn’t use Sebastian Wilzbach’s treatise, as that may scare some people off.
Try to get a decent set of solid proposals with committed mentors earlier in the process. In 2016 a number of the mentors were signed up at the last minute. The earlier the proposals are posted the more time we have to polish them and make them look more attractive.

Interview and Selection Process

The selection process went well, but was a lot of work. Having input from a number of different mentors/individuals was invaluable.

Recommendations

Streamline the selection process, but reuse much of what was done last year. Having a rigorous selection process was a key contributor to 2016’s success.
Start the interview portion of the selection process earlier so that we have more time to set up and carry out the interviews.

Project Progress and Mentoring

Much of the success of an individual project involves having a good relationship and work plan between the student and mentor. From this perspective, the organization isn’t heavily involved. Since all of our students worked well with their mentors, even less organizational administration was required. This is a byproduct of good screening and a solid set of ideas, and being fortunate enough to get good students.

However, there are areas where we could have run things a bit better. Students and mentors were asked to regularly provide updates on their progress, and they generally did this well, but there was no formal reporting process. Also, it would be worthwhile to have a centralized collection of project timelines/milestones where administrators and others involved in the projects (we had a few individuals working in advisory roles) can keep an eye on project progress.

Recommendations

We should keep a centralized version of project timelines somewhere (e.g., a Google Docs spreadsheet) where we can check on project milestones. This should be shared with all individuals involved in a project (student/mentors/advisors/admins).
Have a more formalized process for students and mentors reporting on their progress. This would involve weekly student updates and biweekly mentor updates.

Summary

The 2016 GSoC was a great success, and with any luck will be a good foundation for our successful participation in the year to come. We were fortunate that everything seemed to fall nicely into place, from our being awarded all four projects, to having all of our students complete their projects. Perhaps Sebastian, Lodovico, Wojciech or Jeremy will be involved again as students (or even mentors), and in any case continue to contribute to the D Language.

The D Blog in 2016: Seven Months of Page Views

Jan 6, 2017 • DBlogAdmin • #The D Blog

The D Blog was born at DConf 2016 and the first post was published on June 3rd. There were 27 more posts between then and the end of the year, most of which were shared on the usual social media sites. In case some of you in DLand are curious about such things, a year-end stats post is a fun way to kick off the new year.

First, we welcomed 39,471 visitors who viewed a total of 53,013 pages. The top five referrers in terms of page views:

16,604 – Reddit
3,698 – The D Forums
3,123 – Hacker News
2,847 – Twitter
1,759 – Facebook

The top five countries in terms of page views:

17,244 – United States
4,427 – Germany
3,349 – United Kingdom
2,251 – Canada
1,598 – France

Several posts included links to D projects at GitHub. Counting both projects and profiles, the top five most-clicked were:

The single most-clicked page was the DLangUI screenshot page.

Finally, the top six posts in terms of page views:

5,865 – Find Was Too Damn Slow, So We Fixed It
5,602 – Ruminations on D: An Interview with Walter Bright
4,267 – Project Highlight: DLangUI
2,704 – Programming in D: A Happy Accident
2,579 – Project Highlight: Timur Gafarov
2,257 – Project Highlight: Voxelman

The list of posts was intended to be a top-five, but it was interesting that Voxelman was posted only on December 30th and managed to become the sixth most-viewed post on the site.

2016 was the time for the blog to find its sea legs. The coming year will see more Project Highlights and more guest posters (including Andrei and Walter). We’re also looking to expand the scope somewhat, so keep your eyes open for new types of content.

If you would like to write for the D blog, please go and contact the fellow who owns this GitHub profile, where he’s showing his email address for the world to see. He would be happy to discuss posts about your D projects, idioms you like to use, tutorials you’d like to share, or anything related to the D Programming Language.

Thanks for tuning in, and Happy 2017!

Happy New Year from the D Language Foundation

Jan 2, 2017 • AliCehreli • #D Foundation, #Guest Posts

Ali Çehreli uses D professionally at Weka.io, is the author of Programming in D, and is frequently found in the D Learn forum with ready answers to questions on using the language. He also is an officer of the D Language Foundation.

Happy 2017!

2016 was filled with many great things happening for the D community:

The D Language Foundation became a tax-exempt non-profit organization
DConf 2016 was a great success
Our academic collaboration program started to make a big impact
We released 3 DMD versions and 6 point releases; LDC reached a point on-par with DMD’s frontend version, and GDC is following close behind
The official D blog produced 28 articles in just 7 months since its inception
Many local D language meetups were organized
Countless hours were poured into countless open source D projects

All of that was achieved by you through your direct contributions or the donations that you’ve made.

We look forward to another great year filled with many cool things happening in the D world. We can’t wait to see your work on D in 2017, some of which we hope to hear about at DConf 2017. :D

Project Highlight: Voxelman

Dec 30, 2016 • DBlogAdmin • #Game Development, #Project Highlights

If you spend any time over at r/VoxelGameDev, you may have seen posts about Voxelman, the plugin-driven game engine MrSmith33 is developing with D. His real name is Andrey Penechko, and he started work on Voxelman after he was inspired by Minecraft to think about all the cool things he could do with a voxel engine, particularly the low-level optimization tricks he could use in implementing one. Then he jumped in and started figuring things out.

I started the project somewhere in 2011 or 2012. It began with creating an SDL window and getting some triangles on the screen. Then I did cubes, then a single chunk. It was a simple, single-threaded thing. I did it all with a fixed camera and only had rudimentary camera controls.

For that initial version of the project, he was using C++, but he found himself stuck from a lack of knowledge about the language. So he started searching to see what else was out there. That led him to D.

I don’t really remember how I found D. I was in need of some statically typed compiled language other than C++. I was frustrated about all the source file organisation, the need of forward declarations, header separation and the include system. In D, it was as simple as writing code. I bought a cheap 10 inch tablet just to read Andrei’s book, because my 3.2” PPC was too small to read the whole thing. I enjoyed reading every single bit of it.

His ultimate goal with the project is to provide a platform for which people can create and share plugins and game worlds.

Ideally a complete project build should have the engine source and tools (launcher, source editor, compiler). Players should be able to initiate a connection to any server in the server list, then the launcher will download any missing plugins, compile a new executable and start the engine with the list of plugins. Currently, a build of Voxelman is less than 3MB in size. I think that this is a good property to have.

The major sticking point he sees with this approach is the dependency DMD has on the Microsoft tools for 64-bit (and 32-bit COFF) support on Windows (specifically the Windows SDK and the Microsoft linker). Even though the MS linker is considered the system linker, it’s not uncommon to see Cygwin and/or one of the various distributions of MinGW installed instead of the MS tools. In a perfect world, he could tell people to download the D compiler and they would have everything they need. But it’s not a deal-breaker, so he’s not letting it stop him.

Voxelman uses a client-server architecture, where the server can be launched in a dedicated process or as part of the client’s. This is managed by a launcher which, in addition to launching the game, can be used to compile projects, manage the world, and find servers to connect with.

World and mesh generation is multi-threaded and, as in most such engines, the model is chunk-based. The chunk management implementation is informed by the concept of entity component systems, with a chunk’s world position serving as its entity ID and layers functioning as components.

Each dimension is broken into chunks. A chunk is a 32³ array of blocks. Each chunk can have a set of data layers (currently blocks and block entities). Each layer is essentially an immutable snapshot. It can be of different storage types (uniform, where all blocks are the same, or a compressed or full array, where the layer stores an array of data). Those layers then can be freely transmitted between threads, with reference counting done in the main thread. When a layer is no longer needed it’s deleted.

Immutable chunk data makes for fast auto saves of chunk snapshots in a separate IO thread.

When a chunk is received on the client side, it can be sent to a worker thread and the geometry will be generated. Snapshots are sent to the IO thread when save points occur, and they can still be used in the main thread, sent to the client, or processed by other worker threads. One can easily use an old snapshot while several new ones are in use. Whenever a layer is being modified, data is copied into a write buffer, changes are made, and at a commit point at the end of the frame, all write buffers are committed to chunk storage.
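The snapshot-and-commit scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Voxelman's actual code: the names Layer, Chunk, BlockId, setBlock and commit are hypothetical, only the uniform and full-array storage types are modeled, and threading and reference counting are left out.

```d
enum chunkSize = 32;
enum blocksPerChunk = chunkSize ^^ 3; // 32 * 32 * 32 = 32_768 blocks

alias BlockId = ushort; // hypothetical block type

// An immutable snapshot of one data layer (hypothetical layout).
struct Layer
{
    bool uniform;                // true: every block equals `fill`
    BlockId fill;
    immutable(BlockId)[] blocks; // used only when !uniform

    BlockId opIndex(size_t i) const
    {
        return uniform ? fill : blocks[i];
    }
}

struct Chunk
{
    Layer snapshot;        // last committed layer, safe to share
    BlockId[] writeBuffer; // non-null only while changes are pending

    void setBlock(size_t i, BlockId id)
    {
        if (writeBuffer is null)
        {
            // First write since the last commit: copy the snapshot
            // into a mutable buffer and modify the copy instead.
            writeBuffer = new BlockId[blocksPerChunk];
            foreach (j, ref b; writeBuffer)
                b = snapshot[j];
        }
        writeBuffer[i] = id;
    }

    // Called at the commit point (e.g. end of frame).
    void commit()
    {
        if (writeBuffer is null)
            return;
        snapshot = Layer(false, 0, writeBuffer.idup);
        writeBuffer = null;
    }
}

void main()
{
    auto chunk = Chunk(Layer(true, 1)); // uniform layer, all blocks are 1
    Layer old = chunk.snapshot;         // e.g. handed to an IO thread

    chunk.setBlock(5, 42);
    assert(chunk.snapshot[5] == 1);     // not visible before the commit

    chunk.commit();
    assert(chunk.snapshot[5] == 42);
    assert(old[5] == 1);                // the old snapshot is unaffected
}
```

Because a committed layer is never mutated, an older snapshot stays valid for whoever still holds it, which is the property the quote relies on for saving and meshing in other threads.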

Andrey calls his plugin system “semi-hackish”.

All plugins inherit from an IPlugin interface. Then, each plugin registers itself in a global table of plugins from a shared static constructor. The global table has lists for server and client plugins. The engine adds those plugins to the plugin manager based on a provided plugin pack. The plugin manager implements the initialization sequence. When starting initialization, you have lots of dependencies, so you need to run things in a specific order.
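For readers unfamiliar with the mechanism, here is a minimal sketch of self-registration from a shared static constructor. Only IPlugin comes from the description above; its id method, PluginTable, pluginTable and ChatPlugin are hypothetical names, and the plugin manager and plugin packs are not modeled.

```d
interface IPlugin
{
    string id(); // hypothetical method, for illustration only
}

// Hypothetical global table with separate client and server lists.
struct PluginTable
{
    IPlugin[] clientPlugins;
    IPlugin[] serverPlugins;
}

// __gshared makes this one process-wide variable rather than
// a thread-local one (D globals are thread-local by default).
__gshared PluginTable pluginTable;

class ChatPlugin : IPlugin
{
    string id() { return "example.chat"; }
}

// A shared static constructor runs once per process, before main().
// In a real project each plugin module would have its own.
shared static this()
{
    pluginTable.clientPlugins ~= new ChatPlugin;
    pluginTable.serverPlugins ~= new ChatPlugin;
}

void main()
{
    // By the time main() starts, the plugins are already registered.
    assert(pluginTable.clientPlugins.length == 1);
    assert(pluginTable.serverPlugins[0].id == "example.chat");
}
```

An engine could then pick entries from these lists by id and initialize them in dependency order, which is the part the quote assigns to the plugin manager.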

He has found a lot of things to like about D. As major pros, he cites the module system (“no forward declarations”), foreach loops (“99% of loops in my code are these guys”), associative arrays, delegates, and templates (“They’re beautiful; you simply add another set of parentheses and you’re done”). He also loves D’s dynamic arrays (slices).

They are a perfect design, with the pointer and the length bundled together. You can append to them, concatenate them, and change their length.
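A short example of the operations mentioned in the quote, using nothing beyond the core language:

```d
void main()
{
    int[] a = [1, 2, 3];

    // A slice bundles a pointer and a length.
    assert(a.length == 3 && a.ptr !is null);

    // Slicing creates a new view of the same memory, not a copy.
    int[] view = a[1 .. 3];
    view[0] = 20;
    assert(a == [1, 20, 3]);

    // Appending and concatenating.
    a ~= 4;
    int[] b = a ~ [5, 6];
    assert(b == [1, 20, 3, 4, 5, 6]);

    // Changing the length; new elements are default-initialized.
    b.length = 8;
    assert(b[6] == 0 && b[7] == 0);
    b.length = 2;
    assert(b == [1, 20]);
}
```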

As minor pros, he lists D’s Compile-Time Function Execution and its code generation and compile-time introspection features. Unlike some D users, he also counts the garbage collector in that group. He has implemented a mix of GC-ed and non-GCed memory in Voxelman.

High-level stuff is fully in GC memory. I call something high-level if it has only one instance, so I use interfaces/classes for the high-level parts. Low-level things are mostly stack allocated, using structs (which are POD in D), and the most performance sensitive and memory consuming parts use manual memory management (via Mallocator). This includes chunk storage and chunk meshes.

He also has a list of rough corners. He doesn’t like that support for DLLs is not yet fully functional and reliable. He has found problems when trying to use shared (for example, the Mutex class cannot be used with it). He also finds all the use cases of the is expression confusing, saying the syntax “feels like regular expressions for templates; very powerful and concise, but hard to understand.”

His difficulties with shared actually took him down an interesting path that ultimately had a positive impact on performance.

I started my multi-threading by using the send and receive functions from std.concurrency. I found that I needed to send messages of variable length. For example, when loading or saving chunks, you need to send all the layers to another thread. This involved allocating arrays for all the layers and also required the use of shared.

This situation led me to the implementation of a lock-free message queue, where each message is just a stream of bytes. You write variables on one end and read them from the other. This is obviously a single producer, single consumer queue.

A disadvantage was the use of a fixed-size circular array. You need to make sure that the queue doesn’t fill up. This was a point where I found a good book that explains how atomics work: C++ Concurrency in Action: Practical Multithreading. This is one of the places in D’s documentation where you feel a lack of pointers on where to find relevant information on a specific topic.

So the new solution doesn’t require any allocations and is actually faster than the built-in one. Later I added a notification system via Semaphore, so that worker threads wait when out of work.
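To make the idea concrete, here is a rough sketch of a single-producer, single-consumer byte queue over a fixed-size circular array, built on core.atomic. It is not Voxelman's implementation: ByteQueue, tryPush and tryPop are hypothetical names, the power-of-two capacity is an assumption of this sketch, and the waiting side simply yields instead of using the Semaphore-based notification mentioned above.

```d
import core.atomic : atomicLoad, atomicStore, MemoryOrder;
import core.thread : Thread;

// Fixed-capacity byte queue for exactly one producer thread
// and one consumer thread (hypothetical sketch).
struct ByteQueue(size_t capacity)
{
    // Power-of-two capacity keeps the index math correct even if
    // the counters ever wrap around.
    static assert(capacity > 0 && (capacity & (capacity - 1)) == 0);

    private ubyte[capacity] buffer;
    private shared size_t head; // bytes read so far; written by consumer only
    private shared size_t tail; // bytes written so far; written by producer only

    // Producer side. Returns false if the message does not fit yet.
    bool tryPush(const(ubyte)[] msg)
    {
        immutable t = atomicLoad!(MemoryOrder.raw)(tail);
        immutable h = atomicLoad!(MemoryOrder.acq)(head);
        if (msg.length > capacity - (t - h))
            return false;
        foreach (i, b; msg)
            buffer[(t + i) % capacity] = b;
        // Publish the bytes only after they have been written.
        atomicStore!(MemoryOrder.rel)(tail, t + msg.length);
        return true;
    }

    // Consumer side. Returns false until dest.length bytes are available.
    bool tryPop(ubyte[] dest)
    {
        immutable h = atomicLoad!(MemoryOrder.raw)(head);
        immutable t = atomicLoad!(MemoryOrder.acq)(tail);
        if (t - h < dest.length)
            return false;
        foreach (i, ref b; dest)
            b = buffer[(h + i) % capacity];
        // Hand the space back to the producer only after reading.
        atomicStore!(MemoryOrder.rel)(head, h + dest.length);
        return true;
    }
}

__gshared ByteQueue!64 queue;

void main()
{
    enum total = 1000;

    auto producer = new Thread({
        foreach (i; 0 .. total)
        {
            uint n = i;
            // Write the variable into the queue as raw bytes.
            auto bytes = (cast(const(ubyte)*) &n)[0 .. n.sizeof];
            while (!queue.tryPush(bytes))
                Thread.yield();
        }
    });
    producer.start();

    ulong sum = 0;
    foreach (_; 0 .. total)
    {
        uint n;
        auto bytes = (cast(ubyte*) &n)[0 .. n.sizeof];
        while (!queue.tryPop(bytes))
            Thread.yield();
        sum += n;
    }
    producer.join();
    assert(sum == 499_500); // 0 + 1 + ... + 999
}
```

No memory is allocated per message; each side only ever writes its own counter, which is what lets the queue work without locks for one producer and one consumer.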

If you’re looking for an open source D game to contribute to, Voxelman is waiting for you. You can read more about some of its internals on reddit, check out some images on imgur, and watch some videos on YouTube.

The D Language Foundation's Scholarship Program

Dec 5, 2016 • DBlogAdmin • #Core Team, #D Foundation, #Interviews, #News

The D Language Foundation recently announced a new scholarship program aimed at EE and CS majors attending University “Politehnica” Bucharest (UPB). I contacted Andrei Alexandrescu for a few details on how the initiative came together, hoping for just enough tidbits of backstory to craft a blog post around. He obliged in a big way, turning my one question and “a few details” into an informative conversation.

Mike: I assume quite a lot of work went into this. Could you share a few details about how it came about?

Andrei: Gladly! The story starts back in 2012, when I gave a talk at the How to Web conference in Bucharest, my native city. It was a great event and I got to meet many great people. Except for one whose name kept coming up all over the Romanian IT space, Andrei Pitis.

I heard he was an instructor in the CS department at UPB (the best IT school in Romania, also noted internationally). He’s been directly involved in a number of IT-related foundations and professional organizations, and he created and led the immensely successful Vector Smart Watch startup. So, having heard he’d be around, I went to the conference speakers’ dinner hoping to bump into him.

Not knowing what he looked like, I was just craning my neck in search of someone who seemed popular. Meanwhile, I was passing time by making chit chat with a nice fellow who introduced himself to me. Now, you know how these group parties go. There’s always loud music and conversation, so I didn’t even hear his name and assumed he hadn’t heard mine.

As the evening progressed, I figured Andrei Pitis wasn’t going to show, so I had more time to chat with that fine gentleman. And I noticed two things. First, he was incredibly insightful. Second, he seemed equally excited about meeting me as I was about meeting Andrei Pitis. After a long while, the coin dropped: they were one and the same.

Thus started a great friendship. Andrei gave me great tips about how to start and conduct The D Language Foundation. Recently, he introduced me to two UPB CS systems professors, Razvan Deaconescu and Razvan Rughinis (together, the three had created the Tech Lounge nonprofit organization dedicated to helping graduating CS students start their careers).

Razvan Rughinis came up with the scholarship idea while we were chatting over beers in the quaint old town of Bucharest. In great part the idea was motivated by the strong interest UPB systems graduate students had in participating in a high-impact open source project such as the D language as part of their MSc thesis. In systems research (unlike e.g. CS theory), actual system building is a key part of the research project; therefore, a visible OSS project makes for a much stronger dissertation than the usual throwaway experimental code.

Clearly a strong opportunity had presented itself, and the DLang UPB scholarship is its realization.

Mike: How does the selection process work?

Andrei: The two professors introduce a few candidates, which I pass through the rigors of the typical Facebook interview. We also ask for the usual suspects - proof of enrollment, transcripts, motivation letter, and references.

Of all components, the most important are (in order) the interview, the quality of the BSc projects, and the recommendation letters from their professors. The four current scholarship recipients passed the interview with flying colors and have very strong BSc projects and references. Some of them returned from summer internships at prestigious companies such as Bloomberg, others won CS awards. I have no doubt any company in the Bay Area or elsewhere would be happy to work with them. Once they finish their MSc, of course :o).

And I should mention here that the two professors aren’t only involved in the selection process. They will make themselves available to help manage the students on an ongoing basis. We’re very fortunate to have them.

Mike: Can you provide any info on the current recipients and their projects?

Andrei: The current recipients are Alexandru Razvan Caciulescu, Lucia Cojocaru, Eduard Staniloiu, and Razvan Nitu. I have posted an introduction to each on the D forums and, now that you mention it, I told them to create a wiki page with a blurb for each. They are hosted in a nice shared office kindly donated by Tech-Lounge.ro and… we’re in the process of getting a coffee machine up there :o).

They are all obviously interested in taking large systems projects that benefit their research interests and have an impact on the D language. To get them started, I took a page from Facebook’s practice and defined a “bootcamp” program. Bootcamp is a month-long process (six weeks at Facebook) during which the so-called n00bs get familiar with the technologies used in the organization: the language proper; the core runtime and standard library; the build process; the way code changes are created, reviewed, accepted, and committed; and, last but not least, the community ethos and the kind of problems we are facing that are fit for ingenious solutions.

To kickstart the bootcamp program, I defined a “bootcamp” label in our Bugzilla and applied it to a bunch of existing bugs, with an eye for the kind of bug that simultaneously has low surface (you don’t need to know a lot of internal details to get into it) and offers a good learning experience. Right now each student is busy fixing a couple of such bugs.

Long-term we are looking at high-impact libraries and tools. I do have a few ideas, but I have no doubt the students will come up with their own. Just give them time.

Mike: Speaking of time… is there any room here for an update on the D Foundation’s finances?

Andrei: Of course. To be honest, right now we’re in better shape than ever before (and than I would have hoped). Thanks to Sociomantic, who footed a large part of DConf 2016’s bills, we have quite a bit of change left from conference registration fees. I have also personally carried a number of high-profile appearances at public tech events and private corporate training events, with proceeds flowing to the Foundation.

So we have accumulated a little war chest - not much, but definitely not negligible. With our current funds and operational costs, we are covered for over two years. Of course, the situation is fluid and I am working on expanding both income and (useful) expenditures.

We’re running a very tight operation, and I want to keep it that way. By the Foundation bylaws, its officers (Walter Bright, Ali Çehreli, and myself) cannot get income from the Foundation, which preempts a variety of conflicts of interest. We are a public charity, which reduces and simplifies our taxation. We use modern, low-overhead money transfer methods such as transferwise.com and constantly scan for better ones. Anyone who considers donating should know that about every five dollars donated goes straight to pay for one hour of an exceptional graduate student’s time.

Mike: Are there more applications in the queue? Do you plan to extend scholarships to other universities?

Andrei: UPB seems to be off to a great start, but it’s also a happy case for many reasons: it’s my undergrad alma mater, we know professors there, and we don’t need to pay tuition. If we wanted to extend a scholarship to another university we’d need to avail ourselves of similar strategic advantages. Needless to say, if anyone who reads this has ideas on the matter, please contact me.

Anyhow, for the time being, we got one more strong DLang UPB scholarship application literally today.

Mike: To close out, is there anything you’d like to say to people who’d like to help out?

Andrei: I’m very excited about this scholarship program and possible extensions to it. The reason for my excitement is that this is but a part of a larger strategy. Allow me to explain.

Up until now, we had no idea what to do with money even if we had it. A while ago, I met this potential donor who said, “OK, say I gave the Foundation half a million dollars over two years, no strings attached. What would you do with it?” To my own surprise, I had only vague answers. I asked Walter the same question, and he had even less of a clue than me.

So then I figured it’s essential for the Foundation to have a strong response to that. I’m a big believer in the adage “luck helps the prepared”, of which the converse is “luck is wasted on the unprepared”. By that paradigm, not knowing what we’d do with money was a definite way to ensure we’d never be big. Now that we have the scholarship program, there exists a powerful reason for people to donate to the Foundation: donations help us find and support good students to work on high-impact D-related projects that push the state of CS systems research forward.

Another thing that would be great to have “donations” of is contributor time. Receiving more students starts pushing against our management capacity. Currently, and somewhat to my surprise, I am effectively a manager, seeing that all of these things I just gave you an earful of (bringing money in to the Foundation, managing bootcamp, finances, operations) take enough time to be a full-time job that leaves little time for coding. At some point, I won’t be able to help everyone with their research, so I’ll need to delegate some of that work to other folks. I’m talking any capacity here - from code reviews to managing to co-authoring papers to co-advising.

There are more things I have in mind, but it’s early to share those. In brief, we need to organize ourselves for further growth. What’s clear to me is we’re no longer a seat-of-the-pants operation in a (virtual) basement. The D Language is exiting its adolescence.

Project Highlight: The New CTFE Engine

Nov 18, 2016 • DBlogAdmin • #Compilers & Tools, #Project Highlights

CTFE (Compile-Time Function Execution) is today a core feature of the D Programming Language. D creator Walter Bright first implemented it in DMD as an extension of the constant folding logic that was already there. Don Clugston (of FastDelegate fame) made a pass at improving it and, according to Walter, “took it much further”. Since that time, usage of CTFE has shown up in one D project after another, including in D’s standard library. For example, Dmitry Olshansky employed it in his overhaul of std.regex to great effect.
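As a minimal illustration of what CTFE means in practice, the sketch below forces an ordinary function to run inside the compiler by using its result where a compile-time constant is required. The function fib is just an example and is not taken from any of the projects mentioned here.

```d
// An ordinary function; nothing marks it as compile-time only.
int fib(int n)
{
    int a = 0, b = 1;
    foreach (i; 0 .. n)
    {
        immutable next = a + b;
        a = b;
        b = next;
    }
    return a;
}

// An enum initializer must be known at compile time, so the
// compiler evaluates fib(10) itself via CTFE.
enum atCompileTime = fib(10);
static assert(atCompileTime == 55);

void main()
{
    // The same function is still callable at run time.
    assert(fib(10) == atCompileTime);
}
```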

On the last day of DConf 2016, Stefan Koch gave a lightning talk on his thoughts about CTFE in D. At the end of the talk, in response to a question from Andrei Alexandrescu on how D’s implementation could be improved, he said the following:

CTFE is really a hack. You can see that it’s a hack. It’s implemented as a hack. It is the most useful hack that I’ve ever seen, and it is definitely a hacker’s tool to do stuff that are like magic. But to be fast, it would need to be heavily redesigned, reimplemented, possibly executed in multiple threads, because it is used for stuff that we could never have envisioned when it was invented.

Not long after that, Stefan opened a discussion on the forums and took up the torch to improve the CTFE engine. As to why he got started on this journey in the first place, Stefan says, “I started work on the CTFE engine because I said so at DConf.” But, of course, there’s more to it than that.

I have pretty heavy-weight CTFE needs (I worked on a compile-time trans-compiler). Also my CTFE SQLite reader is failing if you want to read a database bigger than 2MB at CTFE.

His investigations into the performance of the CTFE interpreter shed light on its problems.

The current interpreter interprets every AST-Node it sees directly. This leaves very little space to collect information about the code that is being interpreted. It doesn’t know when something will be used as a reference, so it needs to copy every variable on every mutation. It has to do a deep-copy for this. That means it copies the whole chain of mutations every time.

To clarify, he offers the following example.

Imagine foreach(i; 0 .. 10) { a = i; }. On the first iteration we save a = 0 and set a to 1. On the second iteration we save a = 1 and a = 0 and we set a to 2, then a = 1 and a = 0 and so on. As you can see, the memory requirements just shoot up. It’s basically a factorial function with a very small coefficient. That is why for very small workloads this extreme overhead is not noticeable.

That flaw looked unfixable. Indeed the whole architecture in dinterpret.d is very convoluted and hard to understand. I did a few experiments on improving memory-management of the interpreter but it proved fruitless.

Once he realized there was going to be no quick fix, Stefan sat down and drew up a plan to avoid digging himself into the same hole the current interpreter was in. The result of his planning led him down a road he hadn’t expected to travel.

Direct Interpretation was out of the question since it would give the new engine too little time to analyze data-flow and decide whether a copy was really needed or not. I had to implement an Intermediate Representation. It had to be portable to different evaluation back-ends. I ended up with a solution, inspired by OpenGL, of defining my interface in the form of function calls an evaluation back end had to implement. That meant I would not be able to simply modify the current interpreter. This made the start very steep, but it is a decision I do not regret.

His implementation consists of a front end and a back end.

The front end walks the AST and issues calls to the back end. And the back end transforms those calls into actual bytecode. This bytecode is interpreted by the back end as soon as the front end requires it.
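The split described here can be pictured with a toy example. Everything in the sketch below is an assumption made for illustration: the tiny expression AST, the Backend interface, and the stack-based bytecode are hypothetical and say nothing about the IR or back ends Stefan actually wrote. It only shows the shape of the idea: the front end never touches bytecode, it just calls functions a back end implements.

```d
// Hypothetical miniature AST: integer literals and binary + or *.
abstract class Expr {}

final class Lit : Expr
{
    int value;
    this(int v) { value = v; }
}

final class Bin : Expr
{
    char op;
    Expr lhs, rhs;
    this(char op, Expr lhs, Expr rhs)
    {
        this.op = op;
        this.lhs = lhs;
        this.rhs = rhs;
    }
}

// The function-call interface every back end has to implement.
interface Backend
{
    void pushConst(int value);
    void add();
    void mul();
    int run();
}

enum Op : int { push, add, mul }

// One possible back end: records calls as bytecode, interprets on demand.
final class StackBackend : Backend
{
    private int[] code; // opcode, followed by an operand for push

    void pushConst(int value) { code ~= Op.push; code ~= value; }
    void add() { code ~= Op.add; }
    void mul() { code ~= Op.mul; }

    int run()
    {
        int[] stack;
        size_t pc = 0;
        while (pc < code.length)
        {
            immutable op = code[pc++];
            if (op == Op.push)
            {
                stack ~= code[pc++];
                continue;
            }
            // Binary operators pop two operands and push the result.
            immutable r = stack[$ - 1];
            immutable l = stack[$ - 2];
            stack.length -= 2;
            stack ~= (op == Op.add) ? l + r : l * r;
        }
        return stack[$ - 1];
    }
}

// The front end: walks the AST and issues calls to the back end.
void lower(Expr e, Backend be)
{
    if (auto lit = cast(Lit) e)
    {
        be.pushConst(lit.value);
    }
    else if (auto bin = cast(Bin) e)
    {
        lower(bin.lhs, be);
        lower(bin.rhs, be);
        if (bin.op == '+') be.add(); else be.mul();
    }
}

void main()
{
    // (2 + 3) * 4
    Expr tree = new Bin('*', new Bin('+', new Lit(2), new Lit(3)), new Lit(4));
    auto be = new StackBackend;
    lower(tree, be);
    assert(be.run() == 20);
}
```

Swapping in a different evaluation strategy would mean writing another class that implements Backend, with no change to lower.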

In terms of functionality, he likens the current implementation to an immediate mode graphics API, and his revamp to retained mode. In this case, though, it’s the immediate mode that’s the memory hog.

You can read about his progress in the CTFE Status thread, where he has been posting frequent updates. His updates include problems he encounters, features he implements, and performance statistics. Eventually, every compiler that uses the DMD front end will benefit from his improvements.