The D Blog

The official blog for the D Programming Language.

DustMite: The General-Purpose Data Reduction Tool

Apr 13, 2020 • VladimirPanteleev • #Compilers & Tools, #Core Team, #Guest Posts

If you’ve been around for a while, or are a particularly adventurous developer who enjoys mixing language features in interesting ways, you may have run into one compiler bug or two:

Implementation bugs are inevitably a part of using cutting-edge programming languages. Should you run into one, the steps to proceed are generally as follows:

  1. Reduce the failing program to a minimal, self-contained example.

  2. Add a description of what happens and what you expect to happen.

  3. Post it on the bug tracker.

Nine years ago, an observation was made that when filing and fixing compiler bugs, a disproportionate amount of time was spent on the first step. When your program stops compiling “out of the blue”, or when the bug stops reproducing after the code is taken out of its context, manually paring down a large codebase by repeatedly cutting out code and checking if the bug still occurs becomes a tedious and repetitive task.

Fortunately, tedious and repetitive tasks are what computers are good for; they just have to be tricked into doing them, usually by writing a program. Enter DustMite.

The first version.

The basic operation is simple. The tool takes as inputs:

  • a data set to reduce (such as, a directory containing D source code which exhibits a particular compiler bug)

  • an oracle (or, more mundanely, a test script), which itself:

  • takes as input a variation of the data set, and

  • produces a yes-or-no answer on whether the input still satisfies the sought property (such as reproducing the particular compiler bug).

DustMite’s output is some local minimum variation of the data set, which it reaches by consecutively trying to remove parts of the data set and saving the results which the oracle approves. In the compiler bug example, this means removing bits of code which are not required to reproduce the problem at hand.

DustMite wouldn’t be very efficient if it attempted to remove things line-by-line or character-by-character. In order to maximize the chance of finding good reductions, the input is parsed into a tree according to the syntax of the input files.

Each tree node consists of a “head” (string), children (list of node pointers), and “tail” (string). Technically, it is redundant to have both “head” and “tail”, but they make representing some constructs and performing reductions much simpler, such as paren/bracket pairs.

Nodes are arranged into a binary tree as an optimization.

Additionally, nodes may have a list of dependencies. The dependency relationship essentially means “if this node is removed, these nodes should be removed too”. These constraints are not representable using just the tree structure described above, and are used to allow reducing things such as lists where trailing item delimiters are not allowed, or removing a function parameter and corresponding arguments from the entire code base at once.

In the case of D source code, declarations, statements, and subexpressions get their own tree nodes, so that they can be removed in one go if unneeded. The parser DustMite uses for D source code is intentionally very simple because it needs to handle potentially invalid D code, and you don’t want your bug reduction tool to also crash on top of the compiler.

How DustMite sees a simple D program.

An algorithm decides the order in which nodes are queued for potential deletion; DustMite implements several (it calls them “strategies”). Fundamentally, a strategy’s interface is (state_i_, result_i_) ⇒ (state_i_+1, reduction_i_+1), i.e., which reduction is chosen next depends on the previous reduction and its result. The default “inbreadth” strategy visits nodes in ascending depth order (row by row) and starts over from the top as long as it finds new reductions.

DustMite today supports quite a few more options:

The current version.

Probably, the most interesting of these is the -j switch–one reason being that DustMite’s task is inherently not parallelizable. Which reduction is chosen next, and the tree version to which that reduction is applied, depends on the previous reduction’s result.

DustMite works around this by putting unused CPU cores to work on lookahead: using a highly sophisticated predictor, it guesses what the result of the current reduction will be, and based on that assumption, calculates the next reduction. If the guess was right, great! We get to use that result. Otherwise, the work is wasted. Implementing this meant that strategies now needed to have copyable state, and so had to be refactored from an imperative style to a state machine.

Unfortunately, although the highly expected feature was implemented four years ago, the initial implementation was rather underwhelming. DustMite still did too much work in the main thread and wasted too much CPU time on rescanning the data set tree on every reduction. The problem was so bad that, at high core counts, lookahead mode was even slower than single-threaded mode.

I have recently set out to resolve these inadequacies. The following obstacles stood in the way:

Problem 1: Hashing was too slow. Because the oracle’s services (i.e., running the test script) are usually expensive, DustMite keeps track of a cache of previously attempted reductions and their outcome. This helps because not all tree transformations result in a change of output, and some strategies will retry reductions in successive iterations. A hash of the tree is used as the cache key; however, calculating it requires walking the entire tree every time, which is slow for large inputs.

Would it be possible to make the hash calculation incremental? One approach would be Merkle trees (each node’s hash is the hash of its children’s hashes), however that is suboptimal in the case of e.g., empty leaf nodes. CS erudite Ivan Kazmenko blesses us with an answer: polynomial hashes! By representing strings as polynomials, it is possible to use modulo arithmetic to calculate an incremental fixed-size hash and cache subtree hashes per node.

Each node holds its cumulative hash and length.

The number theory staggered me at first, so I recruited the assistance of feep from #d. After we went through a few draft implementations, I could begin working on the final version. The first improvement was replacing the naive exponentiation algorithm with exponentiation by squaring (D CTFE allowed precomputing a table at compile-time and a faster calculation than the classical method). Next, there was the matter of the modulo.

Initially, we used integer overflow for modulo arithmetic (i.e. q=264), however Ivan cautioned against using powers of two as the modulo, as this makes the algorithm susceptible to Thue-Morse strings. Not long ago I was experimenting with using long multiplication/division CPU instructions (where multiplying one machine word by another yields the result in two machine words with a high and low part, and vice-versa for division). D allows generating assembler code specific to the types that the function template is instantiated with, though in DustMite we only use the unsigned 64-bit variant (on x86 we fall back to using integer overflow).

With the hashing algorithm implemented, all that remained was to mark dirty nodes (they or their children had their content edited) and incrementally recalculate their hashes as needed. Dependencies posed a small obstacle: at the time, they were implemented as simply an array of pointers to the dependency node within the tree. As such, we didn’t know how to get to their parents (to mark them dirty as well), however this was easily overcome by adding a “parent” pointer to each node.

Well, or so I thought, until I got to work on the next problem.

Problem 2: Copying the tree. At the time, the current version of the tree representing the data set was part of the global state. Because of this, applying a reduction was implemented twice:

  • Temporarily, when saving the data set with a proposed reduction for testing:

void save(Reduction reduction, string savedir)

The tree here is not mutated, instead the reduction was applied “dynamically” to the output while walking the tree.

  • Permanently, when accepting a successful reduction, and irreversibly mutating the in-memory tree to apply it:

void applyReduction(ref Reduction r)

This was clumsy, but faster and less complicated than making a copy of the entire tree just to change one part of it to test a reduction. However, doing so was a requirement for proper lookahead, otherwise we would be unable to test reductions based on results where past tests predicted a positive outcome, or do nearly anything in a separate thread.

One issue was the tree “internal pointers”–making a copy would require updating all pointers within the tree to point to the new copies in the new tree. This was easy for children/parent pointers (since we can reliably visit every such pointer exactly once), but not quite for dependencies: because they were also implemented as simple pointers to nodes, we would have to keep track of a map of which node was copied where in order to update the dependency pointers.

One way to solve this would be to change the representation of node references from pointers to indices into a node array; this way, copying the tree would be as simple as a .dup. However, large inputs meant many nodes, and I wanted to see if it was possible to avoid iterating over every node in the tree (i.e. O(n)) for every reduction.

Was it possible? It would mean that we would copy only the modified nodes and their parents, leaving the rest of the tree in-place, and only reusing it as the copies’ children. This goal conflicted with the existence of “parent” pointers, because a parent would have to point towards either the old or new root, so to resolve this ambiguity every node would have to be copied. As a result, the way we handled dependencies needed to be rethought.

Editing trees with “copy on write” involves copying just the edited nodes (🔴), and their parents.

With internal pointers out, the next best thing to array indices for referencing a node was a series of instructions for how to reach the node from the tree root: an address. The representation of these addresses that I chose was a bit string represented as a linked list, where each list node holds the child index at that depth, starting from the deep end. Such a representation can be arranged in a tree where the root-side ends are shared, mimicking the structure of the tree containing the nodes for the reduced data, and thus allowing us to reuse memory and minimize allocations.

Nodes cannot hold their own address (as that would make them unmovable),
which is why they need to be stored outside of the main tree.

For addresses to work, the object they point at needs to remain the same, which means that we can no longer simply remove children from tree nodes–an address going through the second child would become invalid if the first child was removed. Rewriting all affected addresses for every tree edit is, of course, impractical, which leads us to the introduction of tombstones–dead nodes that only serve to preserve the index of the children that follow it. Because one of the possible reduction types involves moving subtrees around the tree, we now also have “redirects” (which are just tombstones with a “see here” address attached).

With the above changes in place, we can finally move forward with fixing and optimizing lookahead, as well as implementing incremental rehashing in a way that’s compatible with the above! The mutable global “current” tree variable is gone, save now simply takes a tree root as an argument, and applyReduction is now:

/// Apply a reduction to this tree, and return the resulting tree.
/// The original tree remains unchanged.
/// Copies only modified parts of the tree, and whatever references them.
Entity applyReduction(Entity origRoot, ref Reduction r)

With the biggest hurdle behind us, and a few more rounds of applying Walter Bright’s secret weapon, the performance metrics started to look more like what they should:

Going deeper would likely involve using OS-specific I/O APIs or rewriting D’s GC.

A mere 3.5x speed-up from a 32-fold increase in computational power may seem underwhelming. Here are some reasons for this:

  • With a 50/50 predictor, predictions form a complete binary tree, so doubling the number of parallel jobs gives you +1x more speed. That’s roughly log₂(jobs)-1, or 4 for 32 jobs - not far off!

  • The results will depend on the reduction being performed, so YMMV. For a certain artificial test case, one optimization (not pictured above) yielded a 500x speed-up!

  • DustMite does not try to keep all CPU cores busy all the time. If a prediction turns out false, all lookahead jobs based on it become wasted work, so DustMite only starts new lookahead tasks when a reduction’s outcome is resolved. Perhaps ideally DustMite would keep starting new jobs but kill them as soon as it discovers they’re based on a misprediction. As there is no cross-platform process group management in Phobos, the D standard library, this is something I left for future versions.

  • Some work is still done in the main thread, because moving it to a worker thread actually makes things slower due to the global GC lock.

There still remains one last place where DustMite iterates over every tree node per reduction: saving the tree to disk (so that it could be read by the test script). This seems unavoidable at first, but could actually be avoided by caching each node’s full text contents within the node itself.

I opted to leave this one out. With the other related improvements, such as using lockingBinaryWriter and aggregating writes of contiguous strings as one I/O operation, the increase in memory usage was much more dramatic than the decrease in execution time, even when optimized to just one allocation per reduction (polynomial hashing gives us every node’s total length for free). But, for a brief instant, DustMite processed reductions in sub-_O(n)_ time.

One more addition is worth mentioning: Andrej Mitrovic suggested a switch which would replace removed text with whitespace, which would allow searching for exact line numbers in the test script. At the time, its addition posed significant challenges, as there needed to be some way to keep removed nodes in the tree but exclude them from future removal attempts. With the new tree representation, this became much easier, and also allowed creating the following animation:

In conclusion, I’d like to bring up that DustMite is good at more than just reducing compiler test cases. The wiki lists some ideas:

  • Finding the source of ambiguous or misleading compiler error messages (e.g., errors with the file/line information pointing only inside the standard library).

  • Alternative (much slower, but also much more thorough) method of verifying unit test code coverage. Just because a line of code is executed, that doesn’t mean it’s necessary; DustMite can be made to remove all code that does not affect the execution of your unit tests.

  • Similarly, if you have complete test coverage, it can be used for reducing the source tree to a minimal tree which includes support for only enabled unittests. This can be used to create a version of a program or library with a test-defined subset of features.

  • The --obfuscate mode can obfuscate your code’s identifiers. It can be used for preparing submission of proprietary code to bug trackers.

  • The --fuzz mode (a new feature) can help find bugs in compilers and tools by creating random programs (using fragments of other programs as input).

But DustMite is not limited to D programs (or any kind of programs) as input. With the --split option, we can tell DustMite how to parse and reduce other kinds of files. DustMite successfully handled the following scenarios:

  • reducing C++ programs (the D parser supports some C++-only syntax too);

  • reducing Python programs (using the indent split mode);

  • reducing a large commit to a minimal diff (using the diff split mode);

  • reducing a commit list, when git bisect is insufficient due to the problem being introduced across more than any single commit;

  • reducing a large data set to a minimal one, resulting in the same code coverage, with the purpose of creating a test suite;

  • and many more which I do not remember.

Today, some version of DustMite is readily available in major distributions (usually as part of some D-related package), so I’m happy having a favorite tool one apt-get / pacman -S away when I’m not at my PC.

Discovering a problem which can be elegantly reduced away by DustMite is always exciting for me, and I’m hoping you will find it useful too.

D 2.091.0 Released

Mar 17, 2020 • DBlogAdmin • #Compilers & Tools, #DConf, #DMD Releases, #News

Digital Mars D logoThe latest release of DMD, the D reference compiler, ships with 18 major changes and 66 bugfixes from 55 contributors. This release contains, among other goodies, improvements to the Windows experience and enhancements to C and C++ interoperability. As fate would have it, the initial release announcement came in the aftermath of some unfortunate news regarding DConf 2020.

DMD on Windows

Over the years, some D users have remarked that the development of D is Linux-centric, that Windows is the black sheep or red-headed stepchild of D platforms. For anyone familiar with D’s early history, that seems an odd thing to say, given that DMD started out as a Windows-only compiler that could only output 32-bit objects in the OMF format. But it’s also understandable, as anyone not familiar with that history could only see that DMD on Windows lagged behind the Linux releases.

64-bit

One place where the official DMD releases on Windows have continued to differ from the releases on other platforms is the lack of 64-bit binaries in the release packages. Again, there’s a historical reason for this. The default output of the compiler is determined by how it is compiled, e.g., 32-bit versions output 32-bit binaries by default. When Walter first added support to DMD for 64-bit output on Windows, it required giving the back end the ability to generate object files in Microsoft’s version of the COFF format and also requiring users to install the Microsoft Build Tools and Platform SDK for access to the MS linker and system link libraries. This is quite a different experience from other platforms, where you can generally expect a common set of build tools to have been installed via the system package manager on any system set up for C and C++ development.

For a Windows developer who chooses GCC for their C and C++ development (or who does no C or C++ development at all), it’s a big ask to require them to download and install several GBs they might not already have installed and probably will never use for anything else. So D releases on Windows continued to ship with 32-bit binaries and the OPTLINK linker in order to provide a minimum out-of-the-box experience. That was a perfectly fine solution, unless you happened to be someone who really wanted 64-bit output (posts from disgruntled Windows users who didn’t want to install the MS tools can be found sprinkled throughout the forum archives).

Eventually, the LLVM linker (LLD) was added to the DMD Windows release packages, along with system link libraries generated from the MinGW definitions. This allowed users to compile 64-bit output out of the box and, once the kinks were worked out, eliminated the dependency on the MS linker. Yet, the official release packages still did not include a 64-bit version of DMD and still did not support 64-bit output by default.

With DMD 2.091.0, the black sheep has come back into the fold. The official DMD releases on Windows now ship with 64-bit binaries, so those of you masochists out there who cling to Makefiles and custom build scripts can expect the default output be what you expect it to be (for the record, DUB, the build tool and package manager that ships with DMD, has been instructing the compiler to compile 64-bit output by default on 64-bit systems for the past few releases).

Windows gets even more love

There are lots of goodies for Windows in this release. Another biggie is that DMD is now 30-40% faster on Windows. It’s no secret that LDC, the LLVM-based D compiler, generates faster binaries than DMD (for some D users, the general rule of thumb is to develop with DMD for its fast compile times and release with LDC for its faster binaries, though others argue that LDC is plenty fast for development and DMD is fine for production). There have been requests for some time to stop compiling DMD with DMD and start doing it with LDC instead. This release is the first to put that into practice.

There are a number of smaller enhancements to the Windows experience: the install.sh script available on the DMD downloads page that some people prefer now supports POSIX environments on Windows; the system link libraries that ship with the compiler have been upgraded from MinGW  5.0.2 to 7.0.0; LLD has been upgraded to 9.0.0; and there’s plenty more in the changelog.

C++ Header Generation

With just about every major release of DMD, D’s interoperability with C and C++ sees some kind of improvement. This release brings a huge one.

Over the years, some have speculated that it would be excellent if the D compiler could generate headers for C and C++ for D libraries intended to be usable in C or C++ programs. Now that wishful thinking has become a(n experimental) reality. Given a set of extern(C) or extern(C++) functions, DMD can generate header files that contain the appropriate C or C++ declarations. Three compiler switches get the job done:

  • -HC will cause the header to be generated and printed to standard output

  • -HCf=fileName will cause the header to be generated and printed to the specified file

  • -HCd=directoryname will (once it’s implemented) cause the header to be printed to a file in the specified directory

See the changelog for example output.

Other News

While the Corona virus was initially ramping up out of sight from most of the world, plans for DConf 2020 were ramping up online from different locations around the world. Planning began in November, the venue was secured in late December, and the website launched with the announcement in early January.

As news of the virus outbreak spread, the conference organizers grew concerned. Would we be okay in June? In late February, that concern manifested as a discussion of possible contingency plans. Two weeks later, it resulted in the decision to cancel DConf 2020. Thankfully, the D community has been supportive of the decision.

As part of the discussion of contingency plans, the possibility was raised of hosting an online conference. The idea of course came up in the discussion of the cancellation in the forums, and a few people reached out shortly after the initial announcement offering to provide help in setting something up. Walter created a forum thread to discuss the topic for anyone interested.

No one involved with organizing DConf has any experience with hosting an online conference. We’re currently exploring options and looking at what the organizers of other Conferences in the Time of COVID-19 are doing. We want to do it, and we want to do it well. Experience with organizing DConf in the real world has taught us not to jump on any old technology without first having a fallback (ahem, DConf 2018 livestream) and making sure the tech does what we expect it to (ahem, DConf 2019 livestream). So don’t expect a quick announcement. We want to find the right tech that fits our requirements and explore how it works before we move forward with setting dates. But do expect that DConf 2020 Online is looking more and more likely to become a thing.

Tracing D Applications

Mar 13, 2020 • AlexandrDruzhinin • #Code, #Guest Posts, #Tutorials

At one time or another during application development you need to make a decision: does your application work like it should and, if not, what is wrong with it? There are different techniques to help you decide, some of which are logging, tracing, and profiling. How are they different? One way to look at it is like this:

  • when you know exactly the events you are interested in to make the decision, you use logging

  • when you don’t know exactly the events you need to make a decision and you are forced to collect as many events as you can, you use tracing

  • when you need to collect some events and analyze them to derive new information, you use profiling

In this article, we focus on tracing.

When developing an application, you can use tracing to monitor its characteristics at run time to, for example, estimate its performance or memory consumption. There are several options to do so, and some of them are:

  • means provided by the programming language (for example, using D’s writef, a.k.a. printf debugging)

  • debuggers (using scripts or remote tracing)

  • OS-specific tracing frameworks (linux {k u}probes and usdt probes, linux kernel event, performance events in windows etc)

In this article, the following contrived D example is used to help illustrate all three cases. We’ll be focusing on Linux. All example code in this article can be found in the GitHub repository at https://github.com/drug007/tracing_post.

foreach(counter; 0..total_cycles)
{
    // randomly generate one of three kinds of event
    Event event = cast(Event) uniform(0, 3);

    // "perform" the job and benchmark its CPU execution time
    switch (event)
    {
        case Event.One:

            doSomeWork;

        break;
        case Event.Two:

            doSomeWork;

        break;
        case Event.Three:

            doSomeWork;

        break;
        default:
            assert(0);
    }
}

doSomeWork simulates a CPU-intensive job by using DRuntime’s Thread.sleep method. This is a very common pattern where an application runs in cycles and, on every iteration, performs a job depending on the application state. Here we can see that the application has three code paths (CaseOne, CaseTwo, and CaseThree). We need to trace the application at run time and collect information about its timings.

The writef-Based Approach

Using writef/ln from Phobos, D’s standard library, to trace the application is naive, but can be very useful nevertheless. The code from tracing_writef.d:

    case Event.One:
            auto sw = StopWatch(AutoStart.no);
            sw.start();

            doSomeWork;

            sw.stop();
            writefln("%d:\tEvent %s took: %s", counter, event, sw.peek);
        break;

With this trivial approach, StopWatch from the standard library is used to measure the execution time of the code block of interest. Compile and run the application with the command dub tracing_writef.d and look at its output:

Running ./example-writef
0:      Event One took:   584 ms, 53 μs, and 5 hnsecs
1:      Event One took:   922 ms, 72 μs, and 6 hnsecs
2:      Event Two took:   1 sec, 191 ms, 73 μs, and 8 hnsecs
3:      Event Two took:   974 ms, 73 μs, and 7 hnsecs
...

There is a price for this—we need to compile tracing code into our binary, we need to manually implement the collection of tracing output, disable it when we need to, and so on—and this means the size of the binary increases. To summarize:

Pros

  • all the might of Phobos is available to employ (except when in BetterC mode)

  • tracing output can be displayed in a human readable format (look at the nice output of Duration above; thanks to Jonathan M. Davis for the std.datetime package)

  • source code is portable

  • easy to use

  • no third-party tools required

Cons

  • the application must be rebuilt and restarted in order to make any changes, which is inappropriate for some applications (such as servers)

  • no low-level access to the application state

  • noise in the code due to the addition of tracing code

  • can be unusable due to a lot of debug output

  • overhead can be large

  • can be hard to use in production

This approach is very suitable in the early stages of development and less useful in the final product. Although, if the tracing logic is fixed and well defined, this approach can be used in production-ready applications/libraries. For example, this approach was suggest by Stefan Koch for tracing the DMD frontend to profile performance and memory consumption.

Debugger-Based Approach

The debugger, in this case GDB, is a more advanced means to trace applications. There is no need to modify the application to change the tracing methodology, making it very useful in production. Instead of compiling tracing logic into the application, breakpoints are set. When the debugger stops execution on a breakpoint, the developer can use the large arsenal of GDB functionality to inspect the internal state of the inferior (which, in GDB terms, usually refers to the process being debugged). It is not possible in this case to use Phobos directly, but helpers are available and, moreover, you have access to registers and the stack—options which are unavailable in the case of writef debugging.

Let’s take a look the code from tracing_gdb.d for the first event:

case Case.One:

    doSomeWork;

break;

As you can see, now there is no tracing code and it is much cleaner. The tracing logic is placed in a separate file called trace.gdb. It consists of a series of command blocks configured to execute on specific breakpoints, like this:

set pagination off
set print address off

break app.d:53
commands
set $EventOne = currClock()
continue
end

break app.d:54
commands
set $EventOne = currClock() - $EventOne
printf "%d:\tEvent One   took: %s\n", counter, printClock($EventOne)
continue
end

...

run
quit

In the first line, pagination is switched off. This enables scrolling so that there is no need to press “Enter” or “Q” to continue script execution when the current console fills up. The second line disables showing the address of the current breakpoint in order to make the output less verbose. Then breakpoints are set on lines 53 and 54, each followed by a list of commands (between the commands and end labels) that will be executed when GDB stops on these breakpoints. The breakpoint on line 53 is configured to fetch the current timestamp (using a helper) before doSomeWork is called, and the one on line 54 to get the current timestamp after doSomeWork has been executed. In fact, line 54 is an empty line in the source code, but GDB is smart enough to set the breakpoint on the next available line. $EventOne is a convenience variable where we store the timestamps to calculate code execution time. currClock() and printClock(long) are helpers to let us prettify the formatting by means of Phobos. The last two commands in the script initiate the debugging and quit the debugger when it’s finished.

To build and run this tracing session, use the following commands:

dub build tracing_gdb.d --single
gdb --command=trace.gdb ./tracing-gdb | grep Event

trace.gdb is the name of the GDB script and tracing-gdb is the name of the binary. We use grep to make the GDB output look like writefln output for easier comparison.

Pros

  • the code is clean and contains no tracing code

  • there is no need to recompile the application to change the tracing methodology—in many cases, it’s enough to simply change the GDB script

  • there is no need to restart the application

  • it can be used in production (sort of)

  • there is no overhead if/when not tracing and little when tracing

  • watchpoints and catchpoints can be used instead of breakpoints

Cons

  • using breakpoints in some cases may be inconvenient, annoying, or impossible.

  • GDB’s pretty-printing provides “less pretty” output because of the lack of full Phobos support compared to the writef approach

  • sometimes GDB is not available in production

The point about setting breakpoints in GDB being inconvenient is based on the fact that you can use only an address, a line number, or a function name (see the gdb manual). Using an address is too low level and inconvenient. Line numbers are ephemeral and can easily change when the source file is edited, so the scripts will be broken (this is annoying). Using function names is convenient and stable, but is useless if you need to place a tracing probe inside a function.

A good example of using GDB for tracing is Vladimir Panteleev’s dmdprof.

The USDT-Based Approach

So far we have two ways to trace our application that are complimentary, but is there a way to unify all the advantages of these two approaches and avoid their drawbacks? Of course, the answer is yes. In fact there are several ways to achieve this, but hereafter only one will be discussed: USDT (Userland Statically Defined Tracing).

Unfortunately, due to historical reasons, the Linux tracing ecosystem is fragmented and rather confusing. There is no plain and simple introduction. Get ready to invest much more time if you want to understand this domain. The first well-known, full-fledged tracing framework was DTrace, developed by Sun Microsystems (now it is open source and licensed under the GPL). Yes, strace and ltrace have been around for a long time, but they are limited, e.g., they do not let you trace what happens inside a function call. Today, DTrace is available on Solaris, FreeBSD, macOS, and Oracle Linux. DTrace is not available in other Linux distributions because it was initially licensed under the CDDL. In 2018, it was relicensed under the GPL, but by then Linux had its own tracing ecosystem. As with everything, Open Source has disadvantages. In this case, it resulted in fragmentation. There are now several tools/frameworks/etc. that are able to solve the same problems in different ways but somehow and sometimes can interoperate with each other.

We will be using bpftrace, a high level tracing language for Linux eBPF. In D, USDT probes are provided by the usdt library. Let’s start from the code in tracing_usdt.d:

case Case.One:
	mixin(USDT_PROBE!("dlang", "CaseOne", kind));

	doSomeWork;

	mixin(USDT_PROBE!("dlang", "CaseOne_return", kind));
break;

Here we mixed in two probes at the start and the end of the code of interest. It looks similar to the first example using writef, but a huge difference is that there is no logic here. We only defined two probes that are NOP instructions. That means that these probes have almost zero overhead and we can use them in production. The second great advantage is that we can change the logic while the application is running. That is just impossible when using the writef approach. Since we are using bpftrace, we need to write a script, called bpftrace.bt, to define actions that should be performed on the probes:

usdt:./tracing-usdt:dlang:CaseOne
{
	@last["CaseOne"] = nsecs;
}

usdt:./tracing-usdt:dlang:CaseOne_return
{
	if (@last["CaseOne"] != 0)
	{
		$tmp = nsecs;
		$period = ($tmp - @last["CaseOne"]) / 1000000;
		printf("%d:\tEvent CaseOne   took: %d ms\n", @counter++, $period);
		@last["CaseOne"] = $tmp;
		@timing = hist($period);
	}
}
...

The first statement is the action block. It triggers on the USDT probe that is compiled in the ./tracing-usdt executable (it includes the path to the executable) with the dlang provider name and the CaseOne probe name. When this probe is hit, then the global (indicated by the @ sign) associative array last updates the current timestamp for its element CaseOne. This stores the time of the moment the code starts running. The second action block defines actions performed when the CaseOne_return probe is hit. It first checks if corresponding element in the @last associative array is already initialized. This is needed because the application may already be running when the script is executed, in which case the CaseOne_return probe can be fired before CaseOne. Then we calculate how much time code execution took, output it, and store it in a histogram called timing.

The BEGIN and END blocks at the top of bpftrace.bt define actions that should be performed at the beginning and the end of script execution. This is nothing more than initialization and finalization. Build and run the example with:

dub build tracing_usdt.d   --single --compiler=ldmd2 # or gdc
./tracing-usdt &                                     # run the example in background
sudo bpftrace bpftrace.bt                            # start tracing session

Output:

Attaching 8 probes...
0:	Event CaseThree took: 552 ms
1:	Event CaseThree took: 779 ms
2:	Event CaseTwo   took: 958 ms
3:	Event CaseOne   took: 1174 ms
4:	Event CaseOne   took: 1059 ms
5:	Event CaseThree took: 481 ms
6:	Event CaseTwo   took: 1044 ms
7:	Event CaseThree took: 611 ms
8:	Event CaseOne   took: 545 ms
9:	Event CaseTwo   took: 1038 ms
10:	Event CaseOne   took: 913 ms
11:	Event CaseThree took: 989 ms
12:	Event CaseOne   took: 1149 ms
13:	Event CaseThree took: 541 ms
14:	Event CaseTwo   took: 1072 ms
15:	Event CaseOne   took: 633 ms
16:	Event CaseTwo   took: 832 ms
17:	Event CaseTwo   took: 1120 ms
^C



@timing:
[256, 512)             1 |@@@@@                                               |
[512, 1K)             10 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1K, 2K)               7 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                |

In the session output above there are only 18 lines instead of 20; it’s because tracing-usdt was started before the bpftrace script so the first two events were lost. Also, it’s necessary to kill the example by typing Ctrl-C after tracing-usdt completes. After the bpftrace script stops execution, it ouputs the contents of the timing map as a histogram. The histogram says that one-time code execution takes between 256 and 512 ms, ten times between 512 and 1024 ms, and seven times more between 1024 and 2048 ms. These builtin statistics make using bpftrace easy.

Pros

  • provides low-level access to the code (registers, memory, etc.)

  • minimal noise in the code

  • no need to recompile or restart when changing the tracing logic

  • almost zero overhead

  • can be effectively used in production

Cons

  • learning USDT can be hard, particularly considering the state of the Linux tracing ecosystem

  • requires external tools (frontends)

  • OS specific

Note: GDB has had support for USDT probes since version 7.5. To use it, modify the trace.gdb script to set breakpoints using USDT probes instead of line numbers. That eases development because it eliminates the need to synchronize line numbers during source code modification.

Futher reading:

Conclusion

Feature writef gdb usdt </tr>
pretty printing by means of Phobos and other libs by means of pretty-printing limited builtins
low-level no yes yes
clean code no yes sort of
recompilation yes no no
restart yes no no
usage complexity easy easy+ advanced
third-party tools no only debugger tracing system front end
cross platform yes sorta of OS specific
overhead can be large none can be ignored even in production
production ready sometimes possible sometimes impossible yes

Feature descriptions:

  • pretty printing is important if the tracing output should be read by humans (and can be ignored in the case of inter-machine data exchange)

  • low-level means access to low-level details of the traced binary, e.g., registers or memory

  • clean code characterizes whether additional tracing code which is unrelated to the applications’s business logic would be required.

  • recompilation determines if it is necessary to recompile when changing the tracing methodology

  • restart determines if it is necessary to restart the application when changing the tracing methodology

  • usage complexity indicates the level of development experience that may be required to utilize this technology

  • third-party tools describes tools not provided by standard D language distributions are required to use this technology

  • cross platform indicates if this technology can be used on different platforms without changes

  • overhead - the cost of using this technology

  • production ready - indicates if this technology may be used in a production system without consequences

News Update: Swag, Platforms, Documentation Help and More

Feb 17, 2020 • DBlogAdmin • #Community, #D Foundation, #Donations, #News

Here are a few updates on things that have been going on both in front of and behind the scenes of the D Programming Language community.

New D Swag

We’ve got some new items in the DLang Swag Emporium: t-shirts, coffee mugs, and stickers sporting the Royal D logo. (If all Royal D items aren’t showing up for you in the Royal D category, check the D Rocket category. Everything should be in the correct location in a day or two).

You may notice that there are fewer options on the product page than for the other items, i.e. only one mug and sticker, and no dark tee option. They are available, though! When you select one of the existing products, you can change the style of the selection to one of several options. Beware! This may also change the price.

Remember, a small percentage of every item you order from the DLang Swag Emporium goes into the D Language Foundation’s General Fund. Plus, if you click through the link above or on the blog’s sidebar, we’ll get an additional referral fee on top of the item royalty. It’s an easy way to both get some D swag and contribute a few bucks to the Foundation.

Expanded Platform Progress

You maybe aware that some work has been ongoing in getting D onto more platforms. Adam Ruppe was working on contract to get LDC’s Android support to the finish line. He wrapped things up a few weeks back and has been paid out of the Foundation’s HR Fund.

Sebastiaan Koppe has been working on contract to get DRuntime ported to WebAssembly. Progress is ongoing and we currently expect it to be mostly wrapped up by the end of March. Like Adam, he’ll be out of the HR Fund when the contract is complete.

Work is also underway to bring LDC to iOS and iPadOS. We had been hoping to get someone to work on contract for this, but there are few people we know who are familiar enough with the platform to get it done and we were unable to find anyone then with the time to work on it. So we put up a bounty for it and kept our fingers crossed.

Recently, you may have seen forum posts from Jacob Carlborg indicating he’s been working on it in his spare time. Some preliminary support was merged in the LDC 1.20.0 release. Although he isn’t working under contract, he is working toward the bounty. That means anyone who wants to support him can contribute by increasing the bounty. Two contributors have already done so. The base amount of $3000 will be taken from the HR Fund when the work is complete.

And speaking of bounties, there are several others waiting for someone to claim them!

The HR Fund

With one payout from the fund and two coming up, we need to replenish it so we can always have cash earmarked for more contract work and bounties. You can make one-time or recurring donations of any amount directly and receive the same rewards available on our Open Collective page, or you can use a different link to make a $60 donation and get a DConf 2019 t-shirt in return. We’ve still got a few shirts available, so help us get rid of them and boost the HR Fund at the same time!

Documentation Event

Behind-the-scenes discussions about ideas to improve the D ecosystem in one way or another are frequently cycling through the inboxes of the people who can make them happen. Most never see the light of day, but there is one that has great potential. If it all comes together, I’ll be able to announce it in the coming weeks. We need your help to make that happen.

We need some specifics regarding areas where the documentation for D and items in the the D ecosystem is lacking. For example, people often complain about inconsistencies in the D spec, and missing info or examples in the DUB and vibe.d docs.

I’ve started a thread in the D forums where you can post your gripes about incomplete/missing/lackluster documentation. Remember, we need you to be specific. Just saying “the DUB docs are incomplete” doesn’t help. What specifically is missing? Or what specifically is wrong? The more information you can provide the better. And the more examples we can collect the better. The goal is to be able to define specific documentation tasks that anyone with the requisite knowledge can complete.

If we can get enough examples with enough detail, then I should be able to announce a new event sponsored by one of our generous benefactors. And I really want to be able to announce it!

DConf 2020

We really want to see a flood of talk submissions this year. If you’ve never been to DConf, or never presented at any conference, don’t let that stop you! Send us your submission and you may end up with a free trip to the conference.

Also, if you pay for an early-bird registration now (a 15% discount over the regular registration rate) and your talk is selected later, we’ll reimburse your registration fee. So if you’re planning to attend the conference even if your talk isn’t selected, it’s a good idea to register now and avoid the risk of missing the early-bird deadline.

We’re also offering once again the Open Source and Academic Discount; if you are a major open source contributor, a student, or an academic, we’ll give you a 50% discount on the regular registration rate. If you think you qualify, please don’t hesitate to take advantage of it by contacting social@dlang.org (or you can contact me directly at aldacron@gmail.com) for details on how to take advantage.

Finally, we never want to leave anyone out of DConf because they can’t afford to pay. This has been a policy of Walter’s from the beginning. If you are in or around London June 17 - 20 and would like to attend DConf but are unable to afford the registration and/or don’t qualify for the special discount, please email one of the addresses above and we’ll work something out.

DConf 2020: Submission Deadline, Early-Bird Registration, and Invited Keynote

Feb 6, 2020 • DBlogAdmin • #Community, #D Foundation, #DConf, #SAoC

In early January, I announced that Symmetry Investments is bringing DConf back to London for our 2020 edition. At the same time, I said we’d start taking submissions from anyone who wanted to send them in. In the interim, we’ve fixed our deadlines and prepared to start accepting reservations. There was only one thing remaining before I was ready for the formal call for submissions and opening of early-bird registrations: confirming our invited keynote speaker. Now that he has confirmed, it’s all official!

Invited Keynote

We’re excited to welcome Roberto Ierusalimschy to DConf 2020! You may know him from his work as the leading architect of the Lua programming language. He’s the author of Programming in Lua and an Associate Professor of Computer Science at PUC-Rio (the Pontifical Catholic University of Rio de Janeiro).

We don’t know yet what his talk will about, but it can be about any topic he wants. We’ll have more information on that for you when we publish the schedule of all selected talks after April 19.

Call for Submissions

We are accepting submissions for DConf 2020 until April 12. Authors will be notified of their final status by April 19.

We’re eager to see some new faces on the stage this year. If you’ve never presented at a DConf before, please don’t hesitate to send us one or more submissions. One person has already sent in seven!

Unless you’re Roberto Ierusalimschy, we prefer topics that are directly or indirectly related to D. We aren’t intransigent, though, so we’re willing to consider other topics. If someone sends us a proposal that isn’t about D but piques our collective interest, we’ll certainly give it serious consideration.

Having a talk selected is a great way to get to DConf if you’re on a budget. You’ll pay no registration fee, plus we’ll reimburse your transportation and lodging costs (within reason—five-star hotels and business- or first-class plane tickets aren’t on the menu). That’s a pretty good deal.

You can find instructions for writing and submitting your submissions on the DConf 2020 homepage.

Early-Bird Registration

Early-bird registration is available at $340, which is 15% off the regular $400 rate. Because we’re being sponsored by Symmetry in London once more, we once again must include a 20% VAT. So the total early-bird rate is $408 (similarly, the regular rate with VAT will be $480). We’re required by UK law to show you the basic rate and VAT in GBP based on the current HMRC exchange rate. That changes every month, so you can see the latest GPB rates in the registration section of the DConf 2020 homepage.

There, you’ll find options for Flipcause and PayPal. From our perspective, we prefer you use our Flipcause form. That gives you the option to cover the credit card processing fee for us so that 100% of your payment can be put toward DConf expenses. If you choose to uncheck that option, that’s fine, too! It will still save us from paying other fees. Every penny we can put toward the expenses helps.

If you do choose to go through PayPal, you have an option for USD and one for GBP. Some registrants told me last year that they get a GBP option even when clicking the USD button. And of course, some register with GBP-based credit cards. However, the GBP button on the DConf 2020 homepage is a fixed amount based on the current HMRC exchange rate. It changes, but only once a month. It may turn out to be cheaper for you than the rate you get from PayPal or your credit card provider. Of course, it could turn out to be more expensive, so if you’re looking to save a few pounds, you may want to investigate the different exchange rates if they apply to your situation.

And Now For Something Completely Different

DConf isn’t the only event Symmetry Investments is sponsoring these days. We recently wrapped up the 2019 edition of the Symmetry Autumn of Code.

This year, we started with five participants working on five interesting projects. Each participant was to complete a total of four milestones over four months with guidance from a mentor. At the successful completion of the first three milestones, each participant would receive $1000. At the end of the fourth and final milestone, one participant would be selected to receive one more $1000 payment and an all-expense paid trip to DConf.

As the event played out, we lost one of the participants at the end of Milestone 2. Two more were unable to fully commit to the Milestone 4 deadline (though they promised to continue working on their projects after SAOC). That left two participants for the SAOC review committee to select from. It was a very difficult decision, as both participants did excellent work and received glowing evaluations from their mentors.

Now I can announce that the SAOC 2019 finalist was Roberto Rosmaninho!

Roberto, with his mentor Nicholas Wilson, worked on adding support for Multi-Level Intermediate Representation (MLIR) to LDC, the LLVM-based D compiler. He is currently working on putting together pull requests for LDC and intends to work on optimizations going forward. He has also confirmed that he will take advantage of his reward so that we will have at least two Robertos at DConf this year.

As we did last year with Francesco Gallà, the SAOC 2018 finalist, we’ve asked Roberto to submit a talk this year. He promised to do so. We can’t promise his talk will be selected (though the odds are high out of the gate), but he still gets a free trip if it isn’t! Besides, we’re looking forward to meeting him.

On behalf of the D Language Foundation and Symmetry Investments, I want to thank everyone who participated in SAOC 2019. Keep an eye on this blog for news about future events.

Now go prep your DConf 2020 submissions!

wc in D: 712 Characters Without a Single Branch

Jan 28, 2020 • RobertSchadek • #Code, #Guest Posts, #The Language, #Tutorials

After reading “Beating C With 80 Lines Of Haskell: Wc”, which I found on Hacker News, I thought D could do better. So I wrote a wc in D.

The Program

It consists of one file and has 34 lines and 712 characters.

import std.stdio : writefln, File;
import std.algorithm : map, fold, splitter;
import std.range : walkLength;
import std.typecons : Yes;
import std.uni : byCodePoint;

struct Line {
	size_t chars;
	size_t words;
}

struct Output {
	size_t lines;
	size_t words;
	size_t chars;
}

Output combine(Output a, Line b) pure nothrow {
	return Output(a.lines + 1, a.words + b.words, a.chars + b.chars);
}

Line toLine(char[] l) pure {
	return Line(l.byCodePoint.walkLength, l.splitter.walkLength);
}

void main(string[] args) {
	auto f = File(args[1]);
	Output o = f
		.byLine(Yes.keepTerminator)
		.map!(l => toLine(l))
		.fold!(combine)(Output(0, 0, 0));

	writefln!"%u %u %u %s"(o.lines, o.words, o.chars, args[1]);
}

Sure, it is using Phobos, D’s standard library, but then why wouldn’t it? Phobos is awesome and ships with every D compiler. The program itself does not contain a single if statement. The Haskell wc implementation has several if statements. The D program, apart from the main function, contains three tiny functions. I could have easily put all the functionally in one range chain, but then it probably would have exceeded 80 characters per line. That’s a major code-smell.

The Performance

Is the D wc faster than the coreutils wc? No, but it took me 15 minutes to write mine (I had to search for walkLength, because I forgot its name).

file lines bytes coreutils haskell D
app.d 46 906 3.5 ms +- 1.9 ms 39.6 ms +- 7.8 ms 8.9 ms +- 2.1 ms
big.txt 862 64k 4.7 ms +- 2.0 ms 39.6 ms +- 7.8 ms 9.8 ms +- 2.1 ms
vbig.txt 1.7M 96M 658.6ms +- 24.5ms 226.4 ms +- 29.5 ms 1.102 s +- 0.022 s
vbig2.txt 12.1M 671M 4.4 s +- 0.058 s 1.1 s +- 0.039 s 7.4 s +- 0.085 s

Memory:

file coreutils haskell D
app.d 2052K 7228K 7708K
big.txt 2112K 7512K 7616K
vbig.txt 2288K 42620K 7712K
vbig2.txt 2360K 50860K 7736K

Is the Haskell wc faster? For big files, absolutely, but then it is using threads. For small files, GNU’s coreutils still beats the competition. At this stage my version is very likely IO bound, and it’s fast enough anyway.

I’ll not claim that one language is faster than another. If you spend a chunk of time on optimizing a micro-benchmark, you are likely going to beat the competition. That’s not real life. But I will claim that functional programming in D gives functional programming in Haskell a run for its money.

A Bit About Ranges

Digital Mars D logoA range is an abstraction that you can consume through iteration without consuming the underlying collection (if there is one). Technically, a range can be a struct or a class that adheres to one of a handful of Range interfaces. The most basic form, the InputRange, requires the function

void popFront();

and two members or properties:

T front;
bool empty;

T is the generic type of the elements the range iterates.

In D, ranges are special in a way that other objects are not. When a range is given to a foreach statement, the compiler does a little rewrite.

foreach (e; range) { ... }

is rewritten to

for (auto __r = range; !__r.empty; __r.popFront()) {
    auto e = __r.front;
    ...
}

auto e = infers the type and is equivalent to T e =.

Given this knowledge, building a range that can be used by foreach is easy.

struct Iota {
	int front;
	int end;

	@property bool empty() const {
		return this.front == this.end;
	}

	void popFront() {
		++this.front;
	}
}

unittest {
	import std.stdio;
	foreach(it; Iota(0, 10)) {
		writeln(it);
	}
}

Iota is a very simple range. It functions as a generator, having no underlying collection. It iterates integers from front to end in steps of one. This snippet introduces a little bit of D syntax.

@property bool empty() const {

The @property attribute allows us to use the function empty the same way as a member variable (calling the function without the parenthesis). The trailing const means that we don’t modify any data of the instance we call empty on. The built-in unit test prints the numbers 0 to 10.

Another small feature is the lack of an explicit constructor. The struct Iota has two member variables of type int. In the foreach statement in the test, we create an Iota instance as if it had a constructor that takes two ints. This is a struct literal. When the D compiler sees this, and the struct has no matching constructor, the ints will be assigned to the struct’s member variables from top to bottom in the order of declaration.

The relation between the three members is really simple. If empty is false, front is guaranteed to return a different element, the next one in the iteration, after a call to popFront. After calling popFront the value of empty might have changed. If it is true, this means there are no more elements to iterate and any further calls to front are not valid. According to the InputRange documentation:

  • front can be legally evaluated if and only if evaluating empty has, or would have, equaled false.

  • front can be evaluated multiple times without calling popFront or otherwise mutating the range object or the underlying data, and it yields the same result for every evaluation.

Now, using foreach statements, or loops in general, is not really functional in my book. Lets say we want to filter all uneven numbers of the Iota range. We could put an if inside the foreach block, but that would only make it worse. It would be nicer if we had a range that takes a range and a predicate that can decide if an element is okay to pass along or not.

struct Filter {
	Iota input;
	bool function(int) predicate;

	this(Iota input, bool function(int) predicate) {
		this.input = input;
		this.predicate = predicate;
		this.testAndIterate();
	}

	void testAndIterate() {
		while(!this.input.empty
				&& !this.predicate(this.input.front))
		{
			this.input.popFront();
		}
	}

	void popFront() {
		this.input.popFront();
		this.testAndIterate();
	}

	@property int front() {
		return this.input.front;
	}

	@property bool empty() const {
		return this.input.empty;
	}
}

bool isEven(int a) {
	return a % 2 == 0;
}

unittest {
	foreach(it; Filter(Iota(0,10), &isEven)) {
		writeln(it);
	}
}

Filter is again really simple: it takes one Iota and a function pointer. On construction of Filter, we call testAndIterate, which pops elements from Iota until it is either empty or the predicate returns false. The idea is that the passed predicate decides what to filter out and what to keep. The properties front and empty just forward to Iota. The only thing that actually does any work is popFront. It pops the current element and calls testAndIterate. That’s it. That’s an implementation of filter.

Sure, there is a new while loop in testAndIterate, but rewriting that with recursion is just silly, in my opinion. What makes D great is that you can use the right tool for the job. Functional programming is fine and dandy a lot of the time, but sometimes it’s not. If a bit of inline assembly would be necessary or nicer, use that.

The call to Filter still does not look very nice. Assuming, you are used to reading from left to right, Filter comes before Iota, even though it is executed after Iota. D has no pipe operator, but it does have Uniform Function Call Syntax (UFCS). If an expression can be implicitly converted to the first parameter of a function, the function can be called like it is a member function of the type of the expression. That’s a lot of words, I know. An example helps:

string foo(string a) {
	return a ~ "World";
}

unittest {
	string a = foo("Hello ");
	string b = "Hello ".foo();
	assert(a == b);
}

The above example shows two calls to the function foo. As the assert indicates, both calls are equivalent. What does that mean for our Iota Filter example? UFCS allows us to rewrite the unit test to:

unittest {
	foreach(it; Iota(1,10).Filter(&isEven)) {
		writeln(it);
	}
}

Implementing a map/transform range should now be possible for every reader. Sure, Filter can be made more abstract through the use of templates, but that’s just work, nothing conceptually new.

Of course, there are different kinds of ranges, like a bidirectional range. Guess what that allows you to do. A small tip: a bidirectional range has two new primitives called back and popBack. There are other range types as well, but after you understand the input range demonstrated twice above, you pretty much know them all.

P.S. Just to be clear, do not implement your own filter, map, or fold; the D standard library Phobos has everything you every need. Have a look at std.algorithm and std.range.

About the Author

Robert Schadek received a master’s degree in Computer Science at the University of Oldenburg. His master thesis was titled “DMCD A Distributed Multithreading Caching D Compiler” where he work on building a D compiler from scratch. He was a computer science PhD student from 2012–2018 at the University of Oldenburg. His PhD research focuses on quorum systems in combination with graphs. Since 2018 he is happily using D in his day job working for Symmetry Investments.

What is Symmetry Investments?

Symmetry Investments is a global investment company with offices in Hong Kong, Singapore and London. We have been in business since 2014 after successfully spinning off from a major New York-based hedge fund.

At Symmetry, we seek to engage in intelligent risk-taking to create value for our clients, partners and employees. We derive our edge from our capacity to generate Win-Wins – in the broadest sense. Win-Win is our fundamental ethical and strategic principle. By generating Win-Wins, we can create unique solutions that reconcile perspectives that are usually seen as incompatible or opposites, and encompass the best that each side has to offer. We integrate fixed-income arbitrage with global macro strategies in a novel way. We invent and develop technology that focuses on the potential of human-machine integration. We build systems where machines do what they do best, supporting people to do what people do best. We are creating a collaborative meritocracy: a culture where individual contribution serves both personal and collective goals – and is rewarded accordingly. We value both ownership thinking AND cooperative team spirit, self-realisation AND community.

People at Symmetry Investments have been active participants in the D community since 2014. We have sponsored the development of excel-d, dpp, autowrap, libmir, and various other projects. We started Symmetry Autumn of Code in 2018 and hosted DConf 2019 in London.

D For Data Science: Calling R from D

Jan 27, 2020 • LanceBachmeier • #Code, #D and C, #Guest Posts

Digital Mars D logoD is a good language for data science. The advantages include a pleasant syntax, interoperability with C (in many cases as simple as adding an #include directive to import a C header file via the dpp tool), C-like speed, a large standard library, static typing, built-in unit tests and documentation generation, and a garbage collector that’s there when you want it but can be avoided when you don’t.

Library selection for data science is a different story. Although there are some libraries available, such as those provided by the mir project, the available functionality is extremely limited compared with languages like R and Python. The good news is that it’s possible to call functions in either language from D.

This article shows how to embed an R interpreter inside a D program, pass data between the two languages, execute arbitrary R code from within a D program, and call the R interface to C, C++, and Fortran libraries from D. Although I only provide examples for Linux, the same steps apply for Windows if you’re using WSL, and with minor modifications to the DUB package file, everything should work on macOS. Although it is possible to do so, I don’t talk about calling D functions from R, and I don’t include any discussion of interoperability with Python. (This is normally done using pyd.)

Dependencies

The following three dependencies should be installed:

  • R

  • R package RInsideC

  • R package embedr

It’s assumed that anyone reading this post already has R installed or can install it if they don’t. The RInsideC package is a slightly modified version of the excellent RInside project of Dirk Eddelbuettel and Romain Francois. RInside provides a C++ interface to R. The modifications provide a C interface so that R can be called from any language capable of calling C functions. Install the package using devtools:

library(devtools)
install_bitbucket("bachmeil/rinsidec") [The embedr package](https://embedr.netlify.com/) provides the necessary functions to work with R from within D. That package is also installed with devtools:

install_bitbucket("bachmeil/embedr") ## A First Program

The easiest way to do the compilation is to use D’s package manager, called DUB. From within your project directory, open R and create a project skeleton:

library(embedr)
dubNew()

This will create a /src subdirectory to hold your project’s source code if it doesn’t already exist, add a file called r.d to /src and create a dub.sdl file in the project directory. Create a file in the /src directory called hello.d, containing the following program:

import embedr.r;

void main() {
  evalRQ(`print("Hello, World!")`);
}

From the terminal, in the project directory (the one holding dub.sdl, not the /src subdirectory), enter

dub run

This will print out “Hello, World!”. The important thing to realize is that even though you just used DUB to compile and run a D program, it was R that printed “Hello, World!” to the screen.

Executing R Code From D

There are two ways to execute R code from a D program. evalR executes a string in R and returns the output to D, while evalRQ does the same thing but suppresses the output. evalRQ also accepts an array of strings that are executed sequentially.

Create a new project directory and run dubNew inside it, as you did for the first example. In the src/ subdirectory, add a file named reval.d:

import embedr.r;
import std.stdio;

void main() {
  // Example 1
  evalRQ(`print(3+2)`); // evaluates to 5 in R, R prints the output [1] 5 to the screen

  // Example 2
  writeln(evalR(`3+2`).scalar); // evaluates to 5 in R, output is 5

  // Example 3
  evalRQ(`3+2`); // evaluates to 5 in R, but there is no output

  // Example 4
  evalRQ([`x <- 3`, `y <- 2`, `z <- x+y`, `print(z)`]); // evaluates this code in R
}

Example 1 tells R to print the sum of 3 and 2. Because we use evalRQ, no output is returned to D, but R is able to print to the screen. Example 2 evaluates 3+2 in R and returns the output to D in the form of an Robj. evalR(3+2).scalar executes 3+2 inside R, captures the output in an Robj, and converts the Robj into a double holding the value 5. This value is passed to the writeln function and printed to the screen. Example 3 doesn’t output anything, because evalRQ does not return any output, and R isn’t being told to print anything to the screen. Example 4 executes the four strings in the array sequentially, returning nothing to D, but the last tells R to print the value of z to the screen.

There’s not much more to say about executing R code from D. You can execute any valid R code from D, and if there’s an error, it will be caught and printed to the screen. Graphical output is automatically captured in a PDF file. To work interactively with R, or if it’s sufficient to save the results to a text file and read them into D, this is all you need to know. The more interesting cases involve passing data between D and R, and for the times when there is no alternative, using the R interface to call directly into C, C++, or Fortran libraries.

Passing Data Between D and R

A little background is needed to understand how to pass data between D and R. Everything in R is represented as a C struct named SEXPREC, and a pointer to a SEXPREC struct is called a SEXP in the R source code. Those names reflect R’s origin as a Scheme dialect, where code takes the form of s-expressions. In order to avoid misunderstanding, embedr uses the name Robj instead of SEXP.

It’s necessary to let R allocate the memory for any data passed to R. For instance, you cannot tell D to allocate a double[] array and then pass a pointer to that array to R. You would instead do something like this:

auto v = RVector(100);
foreach(ii; 0..100) {
  v[ii] = 1.5*ii;
}
v.toR("vv");
evalRQ(`print(vv)`);

The first line tells R to allocate a vector with room for 100 elements. v is a D struct holding a pointer to the memory allocated by R plus additional information that allows you to read and change the elements of the vector. Behind the scenes, the RVector struct protects the vector from R’s garbage collector. R is a garbage collected language, and if the only reference to the data is in your D program, there’s nothing to prevent the R garbage collector from freeing that memory. The RVector struct uses the reference counting mechanism described in Adam Ruppe’s D Cookbook to protect objects from R’s garbage collector and unprotect them when they’re no longer in use.

After filling in all 100 elements of v, the toR function creates a new variable in R called vv, and associates it with the vector held inside v. The final line tells R to print out the variable vv.

In practice, no data is ever passed between D and R. The only thing that’s passed around is a single pointer to the memory allocated by R. That means it’s practical to call R functions from D even for very large datasets.

Calling the R API

The R API) provides a convenient (by C standards) interface to some of R’s functions and constants, including the numerical optimization routines underlying optim, distribution functions, and random number generators. This example shows how to solve an unconstrained nonlinear optimization problem using the Nelder-Mead algorithm, which is the default when calling optim in R.

The objective function is

f = x^2 + y^2

We want to choose x and y to minimize f. The obvious solution is x=0 and y=0.

Create a new project directory and initialize DUB from within R, with the one additional step to add the wrapper for R’s optimization libraries:

library(embedr)
dubNew()
dubOptim()

dubOptim() adds the file optim.d to the src/ directory. Create a file called nelder.d inside the src directory with the following program:

import embedr.r, embedr.optim;
import std.stdio;

extern(C) {
  double f(int n, double * par, void * ex) {
    return par[0]*par[0] + par[1]*par[1];
  }
}

void main() {
  auto nm = NelderMead(&f);
  OptimSolution sol = nm.solve([3.5, -5.5]);
  sol.print;
}

First we define the objective function, f, using the C calling convention so it can be passed to various C functions. We then create a new struct called NelderMead, passing a pointer to f to its constructor. Finally, we call the solve method, using [3.5, -5.5] as the array of starting values, and print out the solution. You’ll want to confirm that the failure code in the output is false (implying the convergence criterion was met). The most common reason that Nelder-Mead will fail to converge is because it took too many iterations. To change the maximum number of iterations to 10,000, you’d add nm.maxit = 10_000; to your program before the call to nm.solve.

There’s no overhead associated with calling an interpreted language in this example. We’re calling a C shared library directly, and at no point does the R interpreter get involved. As in the previous example, since there’s no copying of data, this approach is efficient even for large datasets. Finally, if you’re not comfortable with garbage collection, the inner loops of the optimization are done entirely in C. We nonetheless do take advantage of the convenience and safety of D’s garbage collector when allocating the nm and sol structs, as the performance advantages of manual memory management (to the extent that there are any) are irrelevant.

Calling R Interfaces from D

The purpose of many R packages is to provide a convenient interface to a C, C++, or Fortran library. The term “R interface” normally means one of two things. For modern C or C++ code, it’s a function taking Robj structs as arguments and returning one Robj struct as the output. For Fortran code and older C or C++ code, it’s a void function taking pointers as arguments. In either case, you can call the R interface directly from D code, meaning any library with an R interface also has a D interface.

An example of an R interface to Fortran code is found in the popular glmnet package). Lasso estimation using the elnet function is done by passing 28 pointers to the function elnet in libglmnet.so with this interface:

.Fortran("elnet", ka, parm=alpha, nobs, nvars, as.double(x), y,
                  weights, jd, vp, cl, ne, nx, nlam, flmin, ulam, thresh,
                  isd, intr, maxit, lmu=integer(1), a0=double(nlam),
                  ca=double(nx*nlam), ia=integer(nx), nin=integer(nlam),
                  rsq=double(nlam), alm=double(nlam), nlp=integer(1),
                  jerr=integer(1), PACKAGE="glmnet")

You might want to work with the R interface directly if you’re calling elnet inside a loop in your D program. Most of the time it’s better to pass the data to R and then call the R function that calls elnet. Calling Fortran functions can be error-prone, leading to hard to debug segmentation faults.

Conclusion

D was designed from the beginning to be compatible with the C ABI. The intention was to facilitate the integration of new D code into existing C code bases. The practical result has been that, due to C’s lingua franca status, D can be used in combination with myriad languages. Data scientists looking for alternatives to C and C++ when working with R may find benefit in giving D a close look.

Lance Bachmeier is an associate professor of economics at Kansas State University and co-editor of the journal _Energy Economics. He does research on macroeconomics and energy economics. He has been using the D programming language in his research since 2013._

DIP Reviews: Discussion vs. Feedback

Jan 26, 2020 • DBlogAdmin • #Community, #DIPs, #News

Digital Mars D logoFor a while now, I’ve been including a link to the DIP Reviewer Guidelines in the initial forum post for every DIP review. My hope was that it would encourage reviewers to keep the thread on topic and also to provide more focused feedback. As it turns out, a link to reviewer guidelines is not quite enough. Recent review threads have turned into massive, 20+ page discussions touching on a number of tangential topics.

The primary purpose of the DIP review process, as I’ve tried to make clear in blog posts, forum discussions, and the reviewer guidelines, is to improve the DIP. It is not a referendum on the DIP. In every review round, the goal is to strengthen the content where it is lacking, bring clarity and precision to the language, make sure all the bases are covered, etc.

At the same time, we don’t want to discourage discussion on the merits of the proposal. Opinions about the necessity or the validity of a DIP can raise points that the language maintainers can take into consideration when they are deciding whether to approve or reject it, or even cause the DIP author to withdraw the proposal. It’s happened before. That’s why such discussion is encouraged in the Community Review rounds (though it’s generally discouraged in Final Review, which should be focused wholly on improving the proposal).

The problem

One issue with allowing such free-form discussion in the review threads is that there is a tremendous amount of noise drowning out the signal. Finding specific DIP-related feedback requires trawling through every post, digging through multiple paragraphs of mixed discussion and feedback. Sometimes, one or more people will level a criticism that spawns a long discussion and results in a changing of minds. This makes it time consuming for me as the DIP manager when I have to summarize the review. It also increases the likelihood that I’ll overlook something.

My summary isn’t just for the ‘Reviews’ section at the bottom of the DIP. It’s also my way of ensuring that the DIP author is aware of and has considered all the unique points of feedback. More than once I have found something the DIP author missed or had forgotten about. But if I overlook something and the DIP author also overlooks it, then we may have missed an opportunity to improve the DIP.

I have threatened to delete posts that go off topic in these threads,  but I can count on one hand the number of posts I’ve actually deleted. In reality, these discussions branch off in so many directions that it’s not easy to say definitively that a post that isn’t focused on the DIP itself is actually off topic. So I tend to let the posts stand rather than risk derailing the thread or removing information that is actually relevant.

The Solution

Starting with the upcoming Final Review of DIP 1027, I’m going to take a new approach to soliciting feedback. Rather than one review thread, I’ll be launching two for each DIP.

The Discussion Thread will be much the same as the current review thread. Opinions and discussion will be welcome and encouraged. I’ll still delete posts that are completely off topic, but other than that I’ll let the discussion flow where it may.

The Feedback Thread will be exclusively for feedback on the document and its contents. There will be no discussion allowed. Every post must contain specific points of feedback (preferably actionable items) intended to improve the proposal. Each post should be a direct reply to my initial post. There are only two exceptions: when a post author who has decided to retract feedback they made in a previous post, said poster can reply to the post in which they made the original feedback in order to make the retraction; and the DIP author may reply directly to any feedback post in order to indicate agreement or disagreement.

Posts in the feedback thread should contain answers to the questions posed in the DIP Reviewer Guidelines. It would be great if reviewers could take the time to do what Joseph Rushton Wakeling did in the Community Review for DIP 1028, where he explicitly listed and answered each question, but we won’t be requiring it. Feedback as bullet points is also very welcome.

Opinions on the validity of the proposed feature will be allowed in the feedback thread as long as they are backed with supporting arguments. In other words, “I’m against this! This is a terrible feature.” is not valid for the feedback thread. That sort of post goes in the discussion thread. However, “I’m against this. This is a terrible feature because " is acceptable.

The rules of the feedback thread will be enforced without prejudice. Any post that is not a reply to my initial post, retraction of previous feedback, or a response by the DIP author will be deleted. Any post that does not provide the sort of feedback described above will be deleted. If I do delete a post, I won’t leave a new post explaining why. I’m going to update the DIP Reviewer Guidelines and each opening post in a feedback thread will include a link to that document as well as a paragraph or two summarizing the rules.

I’ll require DIP authors to follow both threads and to participate in the discussion thread. When it comes time to summarize the review, the feedback thread will be my primary source. I will, of course, follow the discussion thread as well and take notes on anything relevant. But if you want to ensure any specific criticisms you may have about a DIP are accounted for, be sure to post them in the feedback thread.

Hopefully, this new approach won’t be too disruptive. We’ll see how it goes.

Recent D Compiler Releases

Jan 8, 2020 • DBlogAdmin • #DMD Releases, #LDC Releases, #News

Digital Mars D logoThe LDC team closed out the old year with release 1.19.0 of the LLVM-based D compiler, and the core D team opened the new year with version 2.090.0 of the reference D compiler, DMD. And if you haven’t yet heard, there was some big news about the GCC-based D compiler, GDC, a while back. Time to catch up!

LDC 1.19.0

This release updates the LDC compiler to D front end version 2.089.1, which was the current version when the compiler was released on the day after Christmas. The prebuilt packages are based on LLVM 9.01.

Among the big items in this release is some love for Android. The prebuilt DRuntime/Phobos library is now available for all supported Android targets. This release can be used in conjunction with Adam Ruppe’s D Android project, a collection of helper programs and interfaces, currently in beta, to facilitate D development on Android with LDC.

Windows users will find that the bundled MinGW-based link libraries for Windows development have been upgraded. They are now derived from .def files from the MinGW-w64 7.0.0 package. These libraries allow you to use the Windows system libraries without needing to install the Windows SDK.

DMD 2.090.0

The latest version of DMD was announced on January 7th. It ships with 10 major changes and 71 closed issues courtesy of 48 contributors.

With this release, it’s now possible to do more with lazy parameters. D has long supported lazy parameters:

An argument to a lazy parameter is not evaluated before the function is called. The argument is only evaluated if/when the parameter is evaluated within the function. Hence, a lazy argument can be executed 0 or more times.

Under the hood, they are implemented as delegates. Now, it’s possible to get at the underlying delegate by taking the address of the parameter, an operation which was previously illegal.

import std.stdio;

void chillax(lazy int x)
{
    auto dg = &x;
    assert(dg() == 10);
    writeln(x);
}

void main()
{
    chillax(2 * 5);
}

This release also renders obsolete a D idiom used by those who find themselves with a need to distinguish between finalization (non-deterministic object destruction usually initiated by the garbage collector) and normal destruction (deterministic object destruction) from inside a class or struct destructor.

With the current GC implementation, it’s illegal to perform some GC operations during finalization. However, D does not provide for separate finalizers and destructors. There is only ~this, which is referred to as a destructor even though it fills both roles. This sometimes presents difficulties when implementing destructors for types that are intended to be used with both GC and non-GC allocation. Any cleanup activity that touches the GC could throw an InvalidMemoryOperationError. Hence the need for the aforementioned workaround.

Now it’s possible to call the static GC member function, core.memory.GC.inFinalizer, to get your bearings in a destructor. It returns true if the current thread is performing object finalization, in which case you don’t want to be taking any actions that touch on GC operations. (I’ve been waiting for something like this before writing the next article in my GC series.)

GDC

Thanks to the hard work of Iain Buclaw, Johannes Pfau, and all of the volunteers who have maintained and contributed to it over the years, GDC was accepted into GCC 9 in late 2018 and made available as part of the GCC 9.1 package released in May of last year. GCC 9.2 was released last August. This version of GDC implements version 2.076 of the D front end. You can build it yourself or install it from the same place you install the GCC 9.x series.

DConf 2020: Double Decker Edition

Jan 4, 2020 • DBlogAdmin • #D Foundation, #DConf, #News

To kick off the year of double 20’s (or double X’s if you prefer), the D Language Foundation is excited to announce that DConf 2020 will return to 99 City Road for a second round in London! We had such a great time last year that we were over the moon when we heard that our DConf 2019 hosts and sponsors at Symmetry Investments were willing to do it all again in 2020. The venue’s Sinisa Poznanovic will be back live streaming the talks on the D Language Foundation’s YouTube channel, and all of the talks will once again be recorded in HD via the capable hands of the Stage Engage crew.

Since DConf 2013, our annual D gathering has taken place each year in May. This time, we’re breaking tradition by running the conference in the middle of June. Our usual three days of talks will take place June 17th - 19th, followed by our annual DConf Hackathon on the 20th. There’s a reason we picked these dates, but it’s tied to an announcement I hope to make some time in the next few weeks. My fingers are crossed that things work out the way we intend and that I can make that announcement sooner rather than later.

Early-bird registration will open in the near future. From now, we’re accepting submissions. If you’ve got an idea for a talk or a panel, don’t be shy. It doesn’t matter if you’re a D veteran or a D noob, if you’ve spoken at multiple DConfs or spoken at none (or never spoken in public at all). We’re eager to see submissions from any and all, but we’re particularly interested in seeing some new faces at the lectern this year. If your talk gets selected, you can plan your trip to London and, as a speaker, you’ll be eligible for reimbursement for the cost of your transportation and lodgings. If your talk doesn’t make the cut, you lose nothing. So head to the DConf 2020 web site for the details and send us your submission!

Additionally, we’re currently working out the details of a potential event peripheral to the conference itself. If all goes well and the plans come to fruition, I’ll announce it here as soon as I’m able (otherwise, I’ll have just teased you for no reason whatsoever). We’ve also got an eye out for opportunities like the walking tours we organized before the conference last year. I don’t know which ideas or opportunities will materialize this year, but I do want you to know that we’re looking.

So start making your plans, send your thanks to Laeeth and Symmetry Investments (if you’re so inclined) for taking on a second DConf and for everything they’ve done and continue to do for the D community, and send us your submissions!