This is a collection of ideas for what one might do with technology that makes it easy to add language constructs to an extensible language (closely related to C++) that targets readable C++ source code. The C++ compatibility and readability constraints certainly restrict the language, but many interesting things can still be done, and the barrier to adoption is low, especially when the compiler is used merely as a one-off code generation wizard.
The surface syntax of the source language (the "C++ code I want written for me" specification language) varies between these examples. The source-to-source translation has been done manually, but the resulting C++ code has been verified to compile.
and/or expressions of the same type as the subexpressions
In C++, the && and || operators always produce a boolean result, but sometimes it would be nice to have more Scheme-like and/or operators, such that and would yield either the value of the last subexpression or 0, and or would yield either the value of the first true expression or 0.
This feature is implementable through source-to-source translation in terms of C++ if expressions and statements and temporary variables (in order to avoid repeating side effects).
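As a rough illustration of the translation described above, the following sketch shows what generated C++ might look like for int-valued operands. The input notation in the comments and the names make, or_value, and and_value are made up for this example; a temporary holds the first operand so that its side effects happen only once.

```cpp
#include <cassert>

int evaluations = 0;  // counts operand evaluations, to make side effects visible

// Stand-in for a subexpression with a side effect (hypothetical helper).
int make(int v) { ++evaluations; return v; }

// Hypothetical input:  return make(a) or make(b);
int or_value(int a, int b) {
    int tmp = make(a);      // first operand, evaluated exactly once
    if (!tmp) {
        tmp = make(b);      // evaluated only if the first operand was false
    }
    return tmp;             // first true value, or 0 if neither was true
}

// Hypothetical input:  return make(a) and make(b);
int and_value(int a, int b) {
    int tmp = make(a);
    if (tmp) {
        tmp = make(b);      // value of the last subexpression
    }
    return tmp;             // a false int is already 0
}
```

For other operand types the translator would presumably need a rule for what "0" means, which this sketch does not attempt.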
[anon_class] anonymous classes
- spec files: anon_class.ext.cpp
- generated files: anon_class.hpp anon_class.cpp
- test program: anon_class/
In C++ it is common to use classes with nothing but pure virtual methods as callback interfaces that clients implement in order to receive event notifications, for instance. The Symbian platform includes a large number of such interfaces, and these classes are known as M classes due to their naming convention.
In Java, the interface construct is used in the same way, but Java also supports anonymous classes, which makes it more convenient to implement callbacks where required. This example explores the idea of adding anonymous class support to C++ through source-to-source translation.
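To give an idea of what such a translation might produce, here is a minimal sketch. The interface MTimerObserver, the function fire_timer, the generated class name, and the Java-like input notation in the comment are all assumptions made for this illustration, not the contents of the actual anon_class files.

```cpp
#include <cassert>

// A callback interface in the "M class" style (made-up example interface).
class MTimerObserver {
public:
    virtual void TimerExpired(int id) = 0;
    virtual ~MTimerObserver() {}
};

// Code that delivers the event notification.
void fire_timer(MTimerObserver& observer, int id) { observer.TimerExpired(id); }

// Hypothetical input, in Java-like notation:
//   fire_timer(new MTimerObserver { void TimerExpired(int id) { last = id; } }, 42);
//
// One possible translation: a generated, uniquely named class that keeps a
// reference to the variable used in the body.
namespace {
class Anon_MTimerObserver_1 : public MTimerObserver {
public:
    explicit Anon_MTimerObserver_1(int& last) : last_(last) {}
    virtual void TimerExpired(int id) { last_ = id; }
private:
    int& last_;
};
}

int run_timer_example() {
    int last = 0;
    Anon_MTimerObserver_1 observer(last);  // instantiated where the anonymous class appeared
    fire_timer(observer, 42);
    return last;
}
```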
[func_obj] lambda expressions and closures
This example essentially follows the surface syntax and semantics of lambda expressions described in the N2550 specification, which is apparently to be adopted for C++0x. Given that the specification defines the “semantics of lambda expressions via translation to function objects”, it should be quite possible to systematically transform such expressions source-to-source into valid C++, allowing the construct to be used with older compilers that do not support C++0x.
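A minimal sketch of the function object translation follows. The lambda in the comment is written in N2550-style notation, and the names Lambda_1 and scaled_sum are invented for this illustration; one variable is captured by reference and one by value.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical input:
//   std::for_each(v.begin(), v.end(),
//                 [&sum, factor](int x) { sum += x * factor; });

// Generated function object: one member per captured variable.
class Lambda_1 {
public:
    Lambda_1(int& sum, int factor) : sum_(sum), factor_(factor) {}
    void operator()(int x) const { sum_ += x * factor_; }
private:
    int& sum_;     // captured by reference
    int factor_;   // captured by value
};

int scaled_sum(const std::vector<int>& v, int factor) {
    int sum = 0;
    std::for_each(v.begin(), v.end(), Lambda_1(sum, factor));
    return sum;
}
```

Since the function object holds a reference to sum, it must not outlive the enclosing scope, which is the same caveat that applies to by-reference capture in the lambda itself.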
_LIT declaration within an expression
This example concerns the _LIT construct, which is used in Symbian C++ to declare string (“descriptor”) literals. _LIT is a macro that produces a C++ declaration, and hence it cannot appear in an expression context. A source-to-source translator could lift such declarations to the nearest preceding declaration context.
(Symbian does have an alternative _L macro that allows a literal to appear in an expression context, but _LIT is preferred as _L involves a performance penalty.)
More generally, allowing declarations within expressions would be powerful particularly when coupled with a macro facility capable of local transformations.
[member_init] member variable initialization with assignment syntax
- spec files: member_init.ext.cpp
- generated files: member_init.hpp member_init.cpp
- test program: member_init/
As your compiler may tell you, “ISO C++ forbids initialization of member” variables if they are instance variables (i.e., not static), and you must then initialize them in the constructor. This probably does not seem attractive to those with a Java background, for instance.
In any case, C++ initializes instance variables in the order in which they are declared, so there is probably no confusion if the ctor member initializers are added automatically by a source-to-source translator, letting the actual variable declaration include an assignment specifying the initial value of the variable.
Similarly, for consistency, one might also allow the initialization of
const variables in the same manner as
const variables. This leads to a situation where all member variables can be declared the same way, with initial value and all.
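The following sketch illustrates the idea with a made-up Counter class. The input notation in the comment is an assumption; the generated code simply moves each initial value into the constructor's member initializer list, in declaration order.

```cpp
#include <cassert>
#include <string>

// Hypothetical input:
//   class Counter {
//   public:
//       Counter() {}
//       ...
//   private:
//       int count = 10;
//       const int step = 2;
//       std::string label = "ticks";
//   };

class Counter {
public:
    // Generated initializers, listed in the order the members are declared.
    Counter() : count(10), step(2), label("ticks") {}
    void advance() { count += step; }
    int value() const { return count; }
    const std::string& name() const { return label; }
private:
    int count;
    const int step;
    std::string label;
};
```

With several constructors, the translator would presumably have to add the same initializers to each of them, unless a constructor already initializes the member explicitly.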
[nested_anon_func] nested and anonymous functions
- spec files: nested_anon_func.lsc
- generated files: nested_anon_func.hpp nested_anon_func.cpp
- test program: nested_anon_func/
Nested and anonymous functions are not supported in C++, yet they may be handy in cases where a function is only referenced in a particular context, in which case it may be desirable to only have the function defined in that context.
Anonymous functions as such are implementable through source-to-source translation in a relatively straightforward way, as basically all that is required is to name the functions uniquely and lift them to the top level, where C++ does allow them.
This example considers the simple case where closures are not supported. To support closures one would have to consider the lifetime of visible variables from enclosing scope, possibly having to provide multiple alternate solutions depending on how memory is to be managed.
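The lifting step can be sketched as follows for the closure-free case. The anonymous function notation in the comment and the generated name anon_func_1 are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical input:
//   void sort_descending(std::vector<int>& v) {
//       std::sort(v.begin(), v.end(),
//                 bool (int a, int b) { return a > b; });
//   }

namespace {
// Generated: the anonymous function, uniquely named and lifted to the top level.
bool anon_func_1(int a, int b) { return a > b; }
}

void sort_descending(std::vector<int>& v) {
    // The original expression is replaced by a reference to the lifted function.
    std::sort(v.begin(), v.end(), anon_func_1);
}
```

Placing the lifted function in an unnamed namespace keeps it invisible outside the translation unit, which preserves some of the locality the nested definition was meant to express.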
[pimpl] automatic hiding of class implementation
Perhaps for future-proofing ABI compatibility, or just to hide implementation details, one often sees the application of the Pimpl idiom or some variation thereof. The idea is to separate at least the private instance data (or perhaps the entire implementation of a class interface) into a separate class whose definition is not given in public header files. Just a pointer to an instance of that class is kept in the “public” class, meaning that even if the implementation class changes, the size of the public class stays the same.
This approach has its benefits, but entails more typing when done manually, and hence this is a potential application for source-to-source translation based automation.
With automation of the boilerplate coding it probably makes sense to hide not only the instance data but also the private methods behind an opaque pointer, as is done in this example. This way one can see the whole implementation by looking at the implementation class (here Numbers::Impl) alone, as the public class is nothing but a wrapper.
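A condensed sketch of the resulting structure is shown below. Only the names Numbers and Numbers::Impl come from the example; the methods add and sum and the helper total are invented here, and the header and implementation parts are shown in one block for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// --- public header part: no instance data or private methods are exposed ---
class Numbers {
public:
    Numbers();
    ~Numbers();
    void add(int n);
    int sum() const;
private:
    Numbers(const Numbers&);             // copying disabled to keep the sketch short
    Numbers& operator=(const Numbers&);
    class Impl;                          // defined in the implementation file only
    Impl* impl_;
};

// --- implementation file part (the only place that needs <vector>) ---
class Numbers::Impl {
public:
    void add(int n) { values_.push_back(n); }
    int sum() const { return total(); }
private:
    int total() const {                  // what would have been a private method of Numbers
        int t = 0;
        for (std::size_t i = 0; i < values_.size(); ++i) t += values_[i];
        return t;
    }
    std::vector<int> values_;
};

// The public class merely forwards to the implementation.
Numbers::Numbers() : impl_(new Impl) {}
Numbers::~Numbers() { delete impl_; }
void Numbers::add(int n) { impl_->add(n); }
int Numbers::sum() const { return impl_->sum(); }
```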
[recur] explicit tail calls
In C++, not all compilers consistently perform tail call optimization where possible. If one cannot be sure of such optimization taking place, then in cases where many repeated tail calls are possible one may wish to avoid recursion altogether, which is a shame, as using some other looping construct may be less readable and more effort to write.
It is possible to implement a looping construct that has syntax similar to recursive function calls, and which translates to something that does not consume stack with every iteration.
This example gets its name from recur, which “is the only non-stack-consuming looping construct in Clojure”. The syntax used is that of Scheme’s named let.
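As a sketch of how such a construct might translate, consider a loop written in Scheme-like notation and a plain C++ loop that rebinds the loop variables on each "call". The input shown in the comment and the function name gcd_loop are assumptions for illustration.

```cpp
#include <cassert>

// Hypothetical input, in Scheme-like notation:
//   (let loop ((a x) (b y))
//     (if (= b 0)
//         a
//         (loop b (remainder a b))))

unsigned gcd_loop(unsigned x, unsigned y) {
    unsigned a = x, b = y;           // initial bindings of the loop variables
    for (;;) {
        if (b == 0) return a;        // base case: leave the loop with a value
        // Tail call "loop b (remainder a b)": compute all the new values
        // first, then rebind, so every argument sees the old bindings.
        unsigned next_a = b;
        unsigned next_b = a % b;
        a = next_a;
        b = next_b;
    }
}
```

The stack depth stays constant no matter how many iterations run, regardless of whether the C++ compiler performs tail call optimization.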
[scope_exit] execute a statement at scope exit
- spec files: scope_exit.ext.cpp
- generated files: scope_exit.hpp scope_exit.cpp
- test program: scope_exit/
The idea here is to make it more convenient to use RAII in order to ensure that a particular operation is executed at scope exit. Leave it to the source-to-source translator to define a class and a destructor and to instantiate it in order to get the code you specify executed. The syntax used is exit statement, which is similar to the scope(exit) construct of the D language.
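The class-plus-destructor translation might look roughly like this. The placeholder notation in the comment stands in for the actual surface syntax, and the names work and ScopeExit_1 are invented for this sketch. A local class cannot refer to the enclosing function's automatic variables directly, so the variable used by the statement is passed in by reference.

```cpp
#include <cassert>
#include <string>

// Hypothetical input:
//   void work(std::string& log, bool bail_out) {
//       log += "enter;";
//       <at scope exit> log += "cleanup;";
//       if (bail_out) return;
//       log += "body;";
//   }

void work(std::string& log, bool bail_out) {
    log += "enter;";

    // Generated: a local class whose destructor runs the given statement,
    // plus one instance that lives until the end of the enclosing scope.
    struct ScopeExit_1 {
        explicit ScopeExit_1(std::string& log) : log_(log) {}
        ~ScopeExit_1() { log_ += "cleanup;"; }
        std::string& log_;
    } scope_exit_1(log);

    if (bail_out) return;    // the destructor still runs on this path
    log += "body;";
}
```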
[two_phase] Symbian two-phase construction idiom
Symbian has its own form of exceptions called leaves. These are not allowed in constructors, and if a ctor may leave, then the object is not considered fully constructed after the ctor has been invoked. Rather, one must also invoke a method named ConstructL, whose naming is by convention. Often, for convenience, a class includes NewL and/or NewLC static methods that invoke both the constructor and the ConstructL method.
Suppose one were to simply write a constructor, and annotate it with attributes specifying whether the ctor is potentially leaving, and whether one wants NewL, NewLC, or both. This source-to-source transformation example explores that scenario.
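The shape of the generated code can be approximated in portable C++ as follows. This is only an analogy, under loud assumptions: standard C++ exceptions stand in for leaves, a try/catch block stands in for Symbian's cleanup stack handling, the Buffer class is made up, and NewLC is omitted because it has no direct equivalent without a cleanup stack.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical input: a single constructor taking a size, annotated as
// potentially leaving, with a NewL method requested.

class Buffer {
public:
    // Generated factory: runs both construction phases.
    static Buffer* NewL(int size) {
        Buffer* self = new Buffer();   // phase 1: trivial, cannot fail midway
        try {
            self->ConstructL(size);    // phase 2: may "leave" (here: throw)
        } catch (...) {
            delete self;               // stand-in for cleanup stack unwinding
            throw;
        }
        return self;
    }
    ~Buffer() { delete[] data_; }
    int size() const { return size_; }
private:
    Buffer() : data_(0), size_(0) {}   // generated non-leaving constructor
    void ConstructL(int size) {        // body of the constructor the user wrote
        if (size <= 0) throw std::invalid_argument("size must be positive");
        data_ = new char[size];
        size_ = size;
    }
    Buffer(const Buffer&);             // copying disabled to keep the sketch short
    Buffer& operator=(const Buffer&);
    char* data_;
    int size_;
};
```

The point of the split is that the destructor only ever sees an object whose members are in a known state, even when the second phase fails partway through.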