Microsoft’s Roslyn: Reinventing the compiler as we know it
Whatever you may think of its business practices, Microsoft has always been top-notch when it comes to developer tools. Visual Studio is widely hailed as the best IDE out there, and .Net is an intelligently designed platform that borrows the best of what Java has to offer and takes it a few steps further.
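That reputation made it all the more unsettling when the Metro push at the Build conference left some .Net developers wondering whether their platform was being sidelined. Nothing could be further from the truth. Looking past the Metro hype, the Build conference also revealed promising road maps for C#, Visual Studio, and the .Net platform as a whole.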
Perhaps the most exciting demo of the conference for .Net developers, however, was Project Roslyn, a new technology that Microsoft made available yesterday as a Community Technology Preview (CTP). Roslyn aims to bring powerful new features to C#, Visual Basic, and Visual Studio, but it’s really much more than that. If it succeeds, it will reinvent how we view compilers and compiled languages altogether.
Deconstructing the compiler
Roslyn has been described as “compiler-as-a-service technology,” a term that’s caused a lot of confusion. I’ve even seen headlines heralding the project as “Microsoft’s cloud compiler service” or “bringing .Net to the cloud.” None of that is correct. Technically, it would be possible to offer code compilation as a cloud-based service, but it’s hard to see the advantage, except in special circumstances.
Roslyn doesn’t offer services in the sense of software-as-a-service (SaaS), platform-as-a-service (PaaS), or similar cloud offerings. The closer analogy is Windows services. Roslyn is a complete reengineering of Microsoft’s .Net compiler toolchain, such that each phase of the code compilation process is exposed as a service that other applications can consume.
As Microsoft’s Anders Hejlsberg explained in a Build conference session, “Traditionally, a compiler is just sort of a black box. On one side you feed it source files, magic happens, and out the other end comes object files, or assemblies, or whatever the output format is.”
Internally, however, there’s a lot more going on. Typically, the compiler first parses your source code and breaks it down into a syntax tree. Next it builds a list of all the symbols in your program. Then it binds those symbols to the appropriate objects, and so on.
An ordinary compiler discards all of this intermediate information once the final code is output. But with Roslyn-enabled compilers, the data from each step is accessible via its own .Net APIs. For example, a call to one API will return the entire syntax tree of a given piece of code as an object. A call to another API might return the number of methods in the code.
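Roslyn’s own APIs are .Net libraries and aren’t reproduced here. As a rough analogy only, Python’s standard library exposes similar intermediate compiler data through its ast and symtable modules. The sketch below, using made-up sample source, asks for the syntax tree as an object, counts the functions in it, and lists the symbols the module defines.

```python
import ast
import symtable

# Hypothetical sample source, invented for this illustration.
SOURCE = """
def area(width, height):
    return width * height

def perimeter(width, height):
    return 2 * (width + height)
"""

# Phase 1: parse the source into a syntax tree object.
tree = ast.parse(SOURCE)

# Query the tree: collect the function definitions it contains.
function_names = [node.name for node in ast.walk(tree)
                  if isinstance(node, ast.FunctionDef)]
method_count = len(function_names)

# Phase 2: ask the compiler's symbol table which names the module defines.
module_symbols = symtable.symtable(SOURCE, "<example>", "exec")
defined_names = sorted(symbol.get_name()
                       for symbol in module_symbols.get_symbols())

print(method_count, function_names, defined_names)
```

Nothing here is compiled to runnable code; the program only inspects what the front end of the compiler already knows.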
So what is Roslyn good for?
The most obvious advantage of this kind of “deconstructed” compiler is that it allows the entire compile-execute process to be invoked from within .Net applications. Hejlsberg demonstrated a C# program that passed a few code snippets to the C# compiler as strings; the compiler returned the resulting IL assembly code as an object, which was then passed to the Common Language Runtime (CLR) for execution. Voilà! With Roslyn, C# gains a dynamic language’s ability to generate and invoke code at runtime.
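For readers who haven’t worked in a dynamic language, here is what that compile-then-execute round trip looks like in Python, which has long offered it through the built-in compile() and exec() functions. This is an analogy, not Roslyn’s API; the snippet and the greet function are invented for the example.

```python
# Hypothetical code snippet held in a string, as in the demo described above.
SNIPPET = """
def greet(name):
    return "Hello, " + name
"""

# Compile the source string to a code object (loosely, Python's analogue of IL).
code_object = compile(SNIPPET, "<snippet>", "exec")

# Hand the code object to the runtime and collect whatever it defines.
namespace = {}
exec(code_object, namespace)

# The freshly compiled function can now be called like any other.
greeting = namespace["greet"]("Roslyn")
print(greeting)
```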
Put that same code into a loop that accepts input from the user, and you’ve created a fully interactive read-eval-print loop (REPL) console for C#, allowing you to manipulate and experiment with .Net APIs and objects in real time. With the Roslyn technology, C# may still be a compiled language, but it effectively gains all the flexibility and expressiveness that dynamic languages such as Python and Ruby have to offer.
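The loop itself is small. As a sketch of the idea in Python rather than C#, the standard library’s code module wraps the compile-and-run step in a console object that remembers state between inputs. The run_session helper is a hypothetical name; it feeds canned lines to the console in place of a user and captures what the console prints.

```python
import code
import contextlib
import io


def run_session(lines):
    """Feed lines to an interactive console and return what it printed."""
    console = code.InteractiveConsole()
    output = io.StringIO()
    with contextlib.redirect_stdout(output):
        for line in lines:
            # push() compiles and runs the line, keeping state between calls.
            console.push(line)
    return output.getvalue()


# Two assignments, then an expression that the console evaluates and echoes.
transcript = run_session(["x = 6", "y = 7", "x * y"])
print(transcript)
```

Swapping the canned list for console.interact() turns the same object into a live prompt.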
The separate phases of the compilation process have their uses, too. For example, according to a blog post by Microsoft’s Eric Lippert, various groups, even within Microsoft, have written their own C# language parsers. Maybe the Visual Studio team needed to write a syntax-coloring component, or maybe another group wanted to translate C# code into something else. In the past, each team would write its own parser, of varying quality. With Roslyn, they can simply access the compiler’s own syntax parser via an API and get back a syntax tree that’s exactly the same as what the compiler would use. (Roslyn even exposes a syntax-coloring API.)
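To see why reusing the language’s own front end beats a hand-rolled parser, consider a syntax colorer. The sketch below is a Python analogy, not Roslyn’s syntax-coloring API: it leans on Python’s own tokenize module to split source into tokens and then labels each one. The classify function and its category names are assumptions made for the example.

```python
import io
import keyword
import tokenize


def classify(source):
    """Return (text, category) pairs using the language's own tokenizer."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            category = "keyword" if keyword.iskeyword(tok.string) else "identifier"
        elif tok.type == tokenize.NUMBER:
            category = "number"
        elif tok.type == tokenize.STRING:
            category = "string"
        elif tok.type == tokenize.COMMENT:
            category = "comment"
        elif tok.type == tokenize.OP:
            category = "operator"
        else:
            continue  # skip newlines, indentation, and end markers
        result.append((tok.string, category))
    return result


tokens = classify('if total > 10:  # limit\n    print("big")\n')
for text, category in tokens:
    print(category, text)
```

An editor would map each category to a color. The point is that the token boundaries come from the same code the language itself uses, so the colorer can’t disagree with the compiler about where a string or comment ends.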
The syntax and binding data exposed by the Roslyn APIs also makes code refactoring easier. It even allows developers to write their own code refactoring algorithms in addition to the ones that ship with Visual Studio.
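A rename is the simplest case. The following sketch is again a Python analogy rather than Roslyn code: it rewrites the syntax tree instead of the text, so a string literal that happens to contain the old name is left alone. It uses syntax only and ignores scoping, which is where binding data would come in. RenameVariable and the sample function are hypothetical.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename one identifier by editing the tree rather than the text."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # Uses of the variable inside expressions and assignments.
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        # The parameter declaration in a function signature.
        if node.arg == self.old:
            node.arg = self.new
        return node


# Hypothetical sample: note the string literal 'n', which must not change.
SOURCE = "def double(n):\n    text = 'n'\n    return n * 2\n"

tree = RenameVariable("n", "value").visit(ast.parse(SOURCE))
refactored = ast.unparse(tree)  # requires Python 3.9 or later
print(refactored)
```

A plain search-and-replace on the text would have corrupted the string literal; the tree-based version can tell an identifier from a string.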
Hejlsberg’s most striking demo, however, showed how Roslyn’s syntax tree APIs make it remarkably easy to translate source code from one CLR language to another. To illustrate, Hejlsberg copied some Visual Basic source code to the clipboard, opened a new file, and chose Paste as C#. The result was the same algorithm, only now written in C#. Translations back and forth don’t yield identical code (for loops might translate into, say, while loops), but in all cases the code was perfectly valid, ready to compile, execute, or refactor.
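The mechanics behind that kind of translation are easier to picture at toy scale. The sketch below is an assumption-laden miniature in Python, not what Roslyn does internally: it parses a Python arithmetic expression into a tree and then walks the tree to emit the same expression in Lisp-style prefix notation. The to_prefix function and the sample expression are invented for the example, and only four operators are handled.

```python
import ast

# Map Python operator node types to the symbols used in the target notation.
OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_prefix(node):
    """Emit a Lisp-style prefix expression from a Python expression tree."""
    if isinstance(node, ast.Expression):
        return to_prefix(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        return "({} {} {})".format(OPERATORS[type(node.op)],
                                   to_prefix(node.left),
                                   to_prefix(node.right))
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return repr(node.value)
    raise ValueError("unsupported syntax: " + type(node).__name__)


translated = to_prefix(ast.parse("price * (1 + rate) - 5", mode="eval"))
print(translated)
```

Because the translator works from the tree, operator precedence and parentheses are already resolved by the time it runs; it never has to parse text itself.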
Can I have it now, please?
The catch: Hejlsberg wouldn’t commit to a ship date for the Roslyn technologies, or even confirm that they’d make it into a shipping Visual Studio release. For that matter, he wouldn’t comment on any future Visual Studio releases or whether there would be another version at all. Even the Roslyn CTP release ran a little late. At the Build conference, held Sept. 13 to 16, Hejlsberg said it would arrive “in four weeks.” Instead, it arrived yesterday, a week late.
Don’t think Roslyn is too far-fetched to happen, though. It’s actually very similar to the Mono project’s Mono.CSharp library, which exposes the Mono C# compiler as a service and enables a REPL console much like the one Hejlsberg demoed at Build. Mono.CSharp has been shipping with Mono since version 2.2.
The main drawback of Roslyn is that it’s a complete retooling of the .Net compilers, rather than of the platform itself. That means it’s limited to C# and Visual Basic, at least for its initial release. If developers using other .Net languages want to take advantage of Roslyn-like capabilities, those languages’ compilers will need to be completely rewritten.
But maybe they should be. If Microsoft succeeds with everything it has planned, Roslyn represents not merely a new iteration of the Visual Studio toolchain but a whole new way for developers to interact with their tools. It breaks down the barriers between compiled and dynamic languages and enables powerful new interactive capabilities in the coding process itself. It truly is one of the most ambitious and exciting innovations in compiler technology in a long time.