Channel: Xtext – TypeFox

Parsing Expressions With Xtext


Parsing simple XML-like, structural languages with Xtext is a no-brainer. However, parsing nested expressions is often considered a bit more complicated, due to their recursive nature and because with Xtext you have to avoid left-recursive parser rules. As the underlying parser (generated by ANTLR) uses a top-down approach, it would recurse endlessly on a left-recursive grammar.

Let’s have a look at parsing a simple arithmetic expression:

2 + 20 * 2

If you know EBNF a bit and didn’t think about avoiding left recursion, operator precedence, or associativity, you’d probably write a grammar like this:

Expression :
  Expression '+' Expression |
  Expression '-' Expression |
  INT;

This grammar is left-recursive: the parser processes the grammar top-down and left to right, so it would call the Expression rule endlessly without consuming any characters, i.e. without altering the underlying state of the parser. While this kind of grammar can be written for bottom-up parsers, you’d still have to deal with operator precedence in addition, that is, define that multiplication has higher precedence than addition, for example.

In Xtext you define the precedence implicitly when left-factoring such a grammar. Left-factoring means you get rid of left recursion by applying a certain technique, which I will show in the following.

So here is a left-factored grammar (not yet working with Xtext) for the expression language above :

Addition :
  Multiplication ('+' Multiplication)*;

Multiplication:
  NumberLiteral ('*' NumberLiteral)*;

NumberLiteral:
  INT;

As you can see, the main difference is that we have three rules instead of one, and if you look a bit closer you see that there’s a certain delegation pattern involved. The rule Addition doesn’t call itself but calls Multiplication instead. The operator precedence is defined by the order of delegation: the later a rule is called, the higher its precedence. This holds for the first two rules, which are of a left-recursive nature (but have been left-factored now). The last rule is not left-recursive, which is why you can write it down without applying this pattern.
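To make the delegation pattern concrete, here is a hand-written recursive-descent sketch in plain Java (an illustration only; Xtext generates an ANTLR parser instead, and this sketch evaluates rather than building an AST). Each rule becomes a method and the '*' group becomes a loop, which shows why the later rule binds tighter:

```java
// Recursive-descent sketch of the left-factored grammar above.
// NOT generated code — just an illustration of the delegation pattern.
public class ExprParser {
    private final String[] tokens;
    private int pos = 0;

    public ExprParser(String input) {
        // crude whitespace tokenizer, good enough for "2 + 20 * 2"
        this.tokens = input.trim().split("\\s+");
    }

    // Addition : Multiplication ('+' Multiplication)* ;
    public int addition() {
        int left = multiplication(); // delegate to the higher-precedence rule first
        while (pos < tokens.length && tokens[pos].equals("+")) {
            pos++; // consume '+'
            left = left + multiplication();
        }
        return left;
    }

    // Multiplication : NumberLiteral ('*' NumberLiteral)* ;
    private int multiplication() {
        int left = numberLiteral();
        while (pos < tokens.length && tokens[pos].equals("*")) {
            pos++; // consume '*'
            left = left * numberLiteral();
        }
        return left;
    }

    // NumberLiteral : INT ;
    private int numberLiteral() {
        return Integer.parseInt(tokens[pos++]);
    }

    public static void main(String[] args) {
        System.out.println(new ExprParser("2 + 20 * 2").addition()); // prints 42
    }
}
```

Because addition() always lets multiplication() consume its operands first, 2 + 20 * 2 comes out as 2 + (20 * 2), with no left recursion anywhere.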

We should allow users to explicitly adjust precedence by adding parentheses, e.g. write something like (2 + 20) * 2. So let’s add support for that (note that the grammar is still not working with Xtext):

Addition :
  Multiplication ('+' Multiplication)*;

Multiplication:
  Primary ('*' Primary)*;

Primary :
  NumberLiteral |
  '(' Addition ')';

NumberLiteral:
  INT;

Once again: if you have some construct that recurses on the left-hand side, you need to put it into the delegation chain according to its operator precedence. The pattern is always the same: the construct that recurses delegates to the rule with the next higher precedence.

Construction of an AST

Now that we know how to avoid left-recursion, let’s have a look at what the parser produces. In Xtext each rule returns some value:

  • Parser rules return AST nodes (i.e. EObject instances),
  • enum rules return enum literals, and
  • datatype rules as well as terminal rules return simple values like strings (EDatatype in EMF jargon).

Xtext can automatically infer whether a rule is a parser rule, i.e. constructs and returns an AST node, or a datatype rule. The grammars above consisted only of datatype rules, so all they would produce is a string.
In order to construct an AST we need to add assignments and actions. But before we do that, we need to talk about return types.

The return type of a rule can be specified explicitly using the ‘returns‘ keyword but can be inferred if the type’s name is the same as the rule’s name. That is

NumberLiteral : ... ;

is a short form of

NumberLiteral returns NumberLiteral : ... ;

However, in the case of the expressions grammar above, the rules all need to return the same type since they are recursive. So in order to make the grammar functional we need to add a common return type explicitly (but the grammar is still missing some bits):

Addition returns Expression:
  Multiplication ('+' Multiplication)*;

Multiplication returns Expression:
  Primary ('*' Primary)*;

Primary returns Expression:
  NumberLiteral |
  '(' Addition ')';

NumberLiteral:
  INT;

The AST type inference mechanism of Xtext will infer two types: Expression and NumberLiteral. Now we need to add assignments and actions in order to store all the important information in the AST and to create reasonable subtypes for the two operations.

In the following you see the final fully working Xtext grammar:

Addition returns Expression:
  Multiplication ({Addition.left=current} '+' right=Multiplication)*;

Multiplication returns Expression:
  Primary ({Multiplication.left=current} '*' right=Primary)*;

Primary returns Expression:
  NumberLiteral |
  '(' Addition ')';

NumberLiteral:
  value=INT;

Let’s go through the grammar as the parser would do it for the expression

( 1 + 20 ) * 2

(I’m sure it’s pretty hard to follow what’s going on just by reading this text. Therefore I have prepared a small interactive slide show. You can find it at the end of this section.)

The parser always starts with the first rule (Addition). Therein the first element is an unassigned rule call to Multiplication which in turn calls Primary. Primary now has two alternatives. The first one calls NumberLiteral which consists only of one assignment to a feature called ‘value’ of what the INT rule returns.

But as the first token in the expression is an opening parenthesis ‘(‘, the parser will take the second alternative in Primary, consume the ‘(‘ and call the rule Addition. Now the value ‘1’ is the look-ahead token, and again Addition calls Multiplication and Multiplication calls Primary. This time the parser takes the first alternative, because ‘1’ can be consumed by the INT rule (which, by the way, is a terminal rule reused from the library grammar).

As soon as the parser hits an assignment it checks whether an AST node for the current rule has already been created. If not it will create one based on the return type, which is NumberLiteral. The Xtext generator will have created an EClass ‘NumberLiteral’ before, which can now be instantiated. That type will also have a property called value of type Integer, which will get the value ‘1’ set. This is what the Java equivalent would look like:

// value=INT
if (current == null) {
  current = new NumberLiteral();
}
current.setValue(ruleINT());
...

Now that the rule has been completed, the created EObject is returned to the calling rule Primary, which in turn returns the object unchanged to its own caller. Within Multiplication the call to Primary has been successfully parsed and returns an instance of NumberLiteral. The second part of the rule is a so-called group (everything within the parentheses). The asterisk behind the closing parenthesis states that this part can be consumed zero or more times. The first token to consume in this part would be the multiplication operator ‘*’. But in the current situation the next token is the plus operator ‘+’, so the group is not consumed at all and the rule returns the result of the unassigned rule call (the NumberLiteral).

In rule Addition there’s a similar group, but this time it expects the correct operator, so the parser enters the group. The first element in the group is a so-called action. As Xtext grammars are highly declarative and bi-directional, it is not a good idea to allow arbitrary expressions within actions, as is usually the case with other parser generators. Instead, only two kinds of actions are supported. This one creates a new instance of type Addition and assigns the object that was to be returned to its feature left. In Java this would look something like:

// Multiplication rule call
current = ruleMultiplication();
// {Addition.left=current}
Addition temp = new Addition();
temp.setLeft(current);
current = temp;
...

As a result the rule would now return an instance of Addition which has a NumberLiteral set to its property left. Next up the parser consumes the ‘+’ operator. We do not store the operator in the AST because we have an explicit Addition type, which implicitly contains this information. The assignment (right=Multiplication) calls Multiplication another time and assigns the returned object (a NumberLiteral of value=20) to the property named right.

If we now had an additional plus operation ‘+’ (e.g. 1 + 2 + 3) the group would match another time and create another instance of Addition. But we don’t and therefore the rule is completed and returns the created instance of Addition to its caller which was the second alternative in Primary. Now the closing parenthesis is matched and consumed and the stack is reduced once more.

We are now in rule Multiplication and have the multiplication operator ‘*’ on the look ahead. The parser goes into the group and applies the action. Finally it calls the Primary rule, gets another instance of NumberLiteral (value=2), assigns it as the ‘right’ operand of the Multiplication, and returns the Multiplication to Addition which in turn returns the very same object as there’s nothing left to parse.

The resulting AST looks like this:

Parsed Expression Tree
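In plain Java, ignoring the EMF machinery (the real AST nodes are generated EObject classes), the tree for ( 1 + 20 ) * 2 can be sketched like this; the classes are hypothetical stand-ins mirroring the inferred types, with an eval() method added purely for illustration:

```java
// Plain-Java sketch of the AST shape — the Addition ends up as the
// LEFT child of the Multiplication, exactly as the walkthrough built it.
public class AstDemo {
    interface Expression { int eval(); }

    static class NumberLiteral implements Expression {
        final int value;
        NumberLiteral(int value) { this.value = value; }
        public int eval() { return value; }
    }

    static class Addition implements Expression {
        final Expression left, right;
        Addition(Expression left, Expression right) { this.left = left; this.right = right; }
        public int eval() { return left.eval() + right.eval(); }
    }

    static class Multiplication implements Expression {
        final Expression left, right;
        Multiplication(Expression left, Expression right) { this.left = left; this.right = right; }
        public int eval() { return left.eval() * right.eval(); }
    }

    public static void main(String[] args) {
        // ( 1 + 20 ) * 2
        Expression ast = new Multiplication(
            new Addition(new NumberLiteral(1), new NumberLiteral(20)),
            new NumberLiteral(2));
        System.out.println(ast.eval()); // prints 42
    }
}
```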

The slide show below illustrates how the parsing process goes.

Associativity

There is still one topic I should mention: associativity. There is left and right associativity as well as non-associativity. In the example we have seen left associativity. Associativity tells the parser how to construct the AST when there are two infix operations with the same precedence. The following example is taken from the corresponding Wikipedia entry:

Consider the expression a ~ b ~ c. If the operator ~ has left associativity, this expression would be interpreted as (a ~ b) ~ c and evaluated left-to-right. If the operator has right associativity, the expression would be interpreted as a ~ (b ~ c) and evaluated right-to-left. If the operator is non-associative, the expression might be a syntax error, or it might have some special meaning.

We already know the most important form which is left associativity:

Addition returns Expression:
 Multiplication ({Addition.left=current} '+' right=Multiplication)*;

Right associativity is done using the following pattern (note the ‘?’ cardinality and the call to the rule itself at the end):

Addition returns Expression:
 Multiplication ({Addition.left=current} '+' right=Addition)?;

And if you don’t want to allow multiple usages of the same expression in a row (hence non-associativity) you write:

Addition returns Expression:
 Multiplication ({Addition.left=current} '+' right=Multiplication)?;

Note that sometimes it’s better to allow associativity at the parser level but forbid it later through validation, because you can provide a better error message. Furthermore, the whole parsing process won’t be interrupted, so your tooling will generally be more forgiving.
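For the ‘+’ of our example the two tree shapes yield the same value, since addition is commutative; with a non-commutative operator such as subtraction, associativity changes the result, which a quick Java check makes obvious:

```java
// For the expression 100 - 10 - 3, associativity decides the tree shape:
//   left-associative:  (100 - 10) - 3
//   right-associative: 100 - (10 - 3)
// Subtraction is used here because, unlike '+', it is not commutative.
public class AssocDemo {

    static int leftAssociative() {
        return (100 - 10) - 3; // the shape built by the '*' loop pattern
    }

    static int rightAssociative() {
        return 100 - (10 - 3); // the shape built by the recursive '?' pattern
    }

    public static void main(String[] args) {
        System.out.println(leftAssociative());  // prints 87
        System.out.println(rightAssociative()); // prints 93
    }
}
```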


How To Eat The World With DSLs


Everybody repeats the popular quote by investor Marc Andreessen: “Software is Eating the World”. This phrase describes how software is going everywhere these days and how it disrupts traditional businesses. Industries that seemed well-established are being taken over by software startups, often within just a couple of months. Think of how Netflix has impacted the TV business or how Uber changed the world of taxis. Classic travel agencies are having a hard time competing with platforms like Airbnb. Even the automotive industry was disrupted, mainly because of software, by a startup you might have heard of called Tesla. Speaking of the automotive world, did you know that an average modern car runs on around 100 million lines of code?

These are only some of the most popular cases. Software truly is everywhere, and in the future there will be only very few products in which software is not playing a big and important role.

Who Writes All That Code?

“Software everywhere” also means that software systems are getting bigger and more complex, simply because they do more. At the same time everything must be connected, which adds another dimension of complexity. Who writes, and more importantly, who maintains all that code?

The straightforward answer is, of course: software engineers. If you know how to program these days you will very likely find a good job because every company needs to write software. However, software is built for specific purposes. Programmers not only need to know how to write code, but also need to understand the domain for which the software is developed. On top of it all, the domain is not usually trivial. Ideally, you would have employees who are good at programming AND have a deep understanding of the business domain. Good luck finding such personnel!

In practice, we need to build teams composed of people with different strengths, but we need to be careful, as the communication overhead increases with the number of individuals in a team. This overhead grows roughly quadratically: a team of n people has n(n-1)/2 possible pairwise communication paths. Always remember, “nine women can’t make a baby in one month”.

In other words, we should be doing everything we can to minimize the number of people and maximize their productivity. Hiring only super motivated and talented domain experts that are also extremely good software engineers is definitely a good recipe. But if you can’t find enough of these talented individuals, you can do other things to improve the productivity of your team and minimize the communication overhead.

Tools To The Rescue

Professional people should use professional tools. For software engineering these tools are debuggers, compilers, code editors, profilers, and many more. Such tools are often combined in an Integrated Development Environment (IDE). These are generic tools for programmers with which they implement the software using a generic language such as Java.

That is all very nice for the coders, but how can we put the domain experts in the loop? Most of them cannot write code. Do we really want them to write prose text about requirements and domain concepts and let the programmers translate that into code? Shouldn’t we try to allow them to participate more actively in the software development process? Maybe we can put the relevant code in a form they can at least read and understand, so they can reason about the actual software rather than about some outdated requirements document.

DSLs Can Bridge The Gap

A domain-specific language (DSL) is a programming language that is tailored for a particular problem domain and a particular group of people. DSLs are formal, so whatever you write using DSLs will have a specific meaning and can be understood by a machine. The notation of a DSL on the other hand is tailored towards the domain people, so it isn’t weighed down by all of the generic complexity of a programming language. Instead it offers powerful concepts to solve and describe domain problems.

Imagine a payroll software and all the laws and guidelines that require implementation. There is a ton of mathematical rules involved to compute a certain payroll. Think about how these rules differ between the various industries and how the rules change on an annual basis. Still the software needs to be able to recalculate a payroll from any point in the past, applying the rules valid at that point in time. I once did a workshop with a company that had such a product. In their case, the payroll experts wrote down all the needed formulas in Excel sheets and put text prose next to them to explain to the software engineers how to include them in the software system. The result was a sea of hundreds of fat Excel sheets full of bugs because the information was only semi-formal and not testable. The software developers would then take these Excel sheets and translate them to code (C# in that particular case). Bugs were only found and fixed in the resulting software and both the system and the Excel sheets were hardly maintainable anymore.

They consulted me because they had heard about DSLs and wanted to know whether they could help improve the communication in the team and make the ever-changing rules of payrolls more maintainable. In a two-day workshop we designed a DSL that allowed the domain experts to write the Excel formulas (we reused the Excel syntax, as it was familiar) in a text file, test them and even integrate them directly into the product. So they were actually writing code, and they finally had a single-sourced solution for their highly evolving domain logic. Storing the formulas as text further helped with versioning and working closer with the software team. The DSL supported a handful of powerful concepts that were needed in that domain. For instance, we added the concept of declarative validity ranges, which allowed annotating a certain formula with a start and an end date. In the past that information was littered across the code in the form of lengthy if-then-else cascades, which are very hard to grasp when you come back after some time and want to add another rule.
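The validity-range idea can be sketched in a few lines of Java. All names here are invented for illustration (the actual DSL and payroll rules are not public); the point is that each rule declares its own validity window instead of being buried in date-checking if-then-else cascades:

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of "declarative validity ranges": every rule carries
// its own validity window, and a lookup picks the rule in force at a date.
public class ValidityRangeDemo {

    record PayrollRule(String name, LocalDate validFrom, LocalDate validTo, double rate) {
        boolean isValidAt(LocalDate date) {
            return !date.isBefore(validFrom) && !date.isAfter(validTo);
        }
    }

    // Replaces the if-then-else cascade: find the rule valid at the given date.
    static Optional<PayrollRule> ruleAt(List<PayrollRule> rules, LocalDate date) {
        return rules.stream().filter(r -> r.isValidAt(date)).findFirst();
    }

    public static void main(String[] args) {
        List<PayrollRule> rules = List.of(
            new PayrollRule("tax-2014", LocalDate.of(2014, 1, 1), LocalDate.of(2014, 12, 31), 0.19),
            new PayrollRule("tax-2015", LocalDate.of(2015, 1, 1), LocalDate.of(2015, 12, 31), 0.20));
        // Recalculating a payroll from mid-2014 picks the rule valid back then:
        System.out.println(ruleAt(rules, LocalDate.of(2014, 6, 15)).get().name()); // prints tax-2014
    }
}
```

This is exactly what makes recalculating a payroll from any point in the past tractable: the date does the dispatching, not nested conditionals.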

Besides the single sourcing, which eliminated a lot of redundancy, they could finally look at the source of the system together, discuss it and reason about it – no more misunderstandings. We used Eclipse Xtext to design the DSL, so they got a full-featured editor with content assist, error checking and so on.

With Xtext, such a DSL can be implemented in a very short amount of time. In addition, it is easy to enhance and maintain such a DSL over time. The framework supports the whole stack of what a DSL implementation needs, and it even offers advanced editing support for various different platforms. You can ship your DSL as a trimmed down RCP app without all the complexity of a typical Eclipse IDE. Alternatively you can build an update site for developers to install and update the DSL editors. With Xtext’s new web editor integration you can even edit your DSL files in any kind of web application. Imagine an admin tool for the payroll system where you can alter or add forms to the running system.

Summary

What should you take away from this article? If applied wisely, DSLs can dramatically improve the productivity of your team and the maintainability of software systems. The toughest part is definitely identifying the sweet spots for such powerful abstractions, such that the investment pays off. I like to think of DSLs as an extension of what you can do with general purpose programming: Framework developers usually create building blocks for application developers for the same reasons that a language engineer designs a DSL for non-coding but logically thinking team members.

Tutorial: Embedded Java With Xtext


Jbase is a customization of Xbase to handle pure Java expressions and to adhere to the stricter Java type system.

Jbase’s main implementation aspects are:

  • redefines many of the Xbase grammar rules so that they can handle Java expressions (including array access expressions with [])
  • customizes the Xbase compiler to handle additional Java expressions
  • customizes the Xbase type system to adhere to the stricter Java type system

As shown in this blog post, programmers already familiar with Xbase will only have to perform a few steps to use Jbase in their DSL.

In this blog post I’d like to show how to get started using Jbase. Of course, you need to be already familiar with Xbase concepts, in particular with the JvmModelInferrer. In the end, you will get all the benefits of Xbase, but with the syntax of Java expressions.

Why would I want that? Well, first of all, I’d like to stress that I really love Xbase, not to mention its main incarnation: Xtend. I’m using Xtend whenever I can, even for non Xtext projects, since its Xbase-based syntax is really a better Java without noise. I started to develop Jbase because in some projects I really need to embed real Java expressions, even if I lose the clean Xbase syntax. In particular, one year ago I implemented Java–, a simpler version of Java aiming to teach programming (e.g., just functions, no classes, no inheritance), that’s when I had to customize Xbase to get Java expressions. Jbase is basically the factoring out of the reusable parts of Java–.

NOTE: Jbase is currently based on Xtext/Xbase 2.8.4, so it will not work with the newer version of Xtext 2.9.0. The porting to Xtext 2.9.0 will start soon.

The source code of Jbase can be found at https://github.com/LorenzoBettini/jbase

and the update site at http://sourceforge.net/projects/xtext-jbase/files/updates/releases

First Tutorial

Create an Xtext project; you can leave the defaults for names, e.g., org.xtext.example.mydsl.

Open the MANIFEST.MF of the main project and

  • add a dependency to the bundle jbase and re-export that dependency;
  • add a dependency to the bundle jbase.mwe2 and make it optional;

Save the file.

Change the Xtext grammar, MyDsl.xtext, making the grammar inherit from jbase.Jbase:

grammar org.xtext.example.mydsl.MyDsl with jbase.Jbase

generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"

Model:
    // import section (automatic imports)
    importSection=XImportSection?
    greetings+=Greeting*;
    
Greeting:
    // each greeting will have a Java block
    'Hello' name=ID body=XBlockExpression;

Open the MWE2 workflow, GenerateMyDsl.mwe2, and

  • in the StandaloneSetup part, add the references to the JbasePackage and to Jbase.genmodel; the modified part should look like this:

bean = StandaloneSetup {
    scanClassPath = true
    platformUri = "${runtimeProject}/.."
    registerGeneratedEPackage = "org.eclipse.xtext.xbase.XbasePackage"
    registerGenModelFile = "platform:/resource/org.eclipse.xtext.xbase/model/Xbase.genmodel"
    // In order to inherit from Jbase, add these two lines:
    registerGeneratedEPackage = "jbase.jbase.JbasePackage"
    registerGenModelFile = "platform:/resource/jbase/model/custom/Jbase.genmodel"
}

  • search for the XbaseGeneratorFragment reference, and replace it with JbaseGeneratorFragment:

// generates the required bindings only if the grammar inherits from Xbase
// fragment = xbase.XbaseGeneratorFragment auto-inject {}
fragment = jbase.mwe2.generator.JbaseGeneratorFragment auto-inject {}

Run the GenerateMyDsl.mwe2 just to make sure that the Xtext artifacts are correctly generated and that Jbase dependencies are added to the ui project.

Implement the Jvm model inferrer as follows:

package org.xtext.example.mydsl.jvmmodel

import com.google.inject.Inject
import org.eclipse.xtext.xbase.jvmmodel.AbstractModelInferrer
import org.eclipse.xtext.xbase.jvmmodel.IJvmDeclaredTypeAcceptor
import org.eclipse.xtext.xbase.jvmmodel.JvmTypesBuilder
import org.xtext.example.mydsl.myDsl.Model

class MyDslJvmModelInferrer extends AbstractModelInferrer {

    @Inject extension JvmTypesBuilder

    def dispatch void infer(Model element, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
        for (greeting : element.greetings) {
            acceptor.accept(element.toClass("my.company." + greeting.name)) [
                members += greeting.toMethod("hello" + greeting.name, inferredType) [
                    body = greeting.body
                ]
            ]
        }
    }
}

Thus, for each Hello element we create a Java class named after the element (in the package “my.company”), with a public method named “hello” + the name of the Hello element, and we associate the block expression with the body of the method. Note that the return type of the method will be inferred automatically from the contents of the block expression.
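As an illustration, a hypothetical greeting Hello World { return "hi"; } would conceptually yield a class like the following sketch (the real code is emitted by the Xbase compiler into the package my.company; the package declaration is omitted here so the snippet is self-contained, and details of the generated code may differ):

```java
// Sketch of the class the inferrer conceptually produces for the
// hypothetical input: Hello World { return "hi"; }
public class World {
    // Method name is "hello" + the element name; the return type
    // (String here) is inferred from the block expression.
    public String helloWorld() {
        return "hi";
    }
}
```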

If you want to manually test your DSL, run another Eclipse instance (Xtext should have generated a launch configuration in your projects if you started with an empty workspace); in the new instance, create a Plug-in project and add a dependency to the bundle org.eclipse.xtext.xbase.lib. In the src folder create a new file with the extension .mydsl, e.g., example.mydsl; when the dialog pops up, accept to add the Xtext nature to the project.

Try and add some contents to the file; remember that in the codeblock you must use Java syntax, NOT Xbase syntax:

Hello ArrayExample {
    List<String> list = new ListExample().helloListExample();
    String[] a = new String[list.size()];
    for (int i = 0; i < a.length; i++) {
        a[i] = list.get(i);
    }
    return a;
}

Note that variable declarations are like in Java, and that you have array access expressions.

The Domainmodel with Jbase

We now see how to convert an existing Xbase DSL to use Jbase. We will use the well-known Xtext/Xbase Domainmodel example.

First of all, import this example into the workspace: File => New => Example…, then navigate to the “Xtext Examples” category and choose “Xtext Domain-Model Example”. This will materialize the standard Xtext Domainmodel example’s three projects into your workspace.

Next, open org.eclipse.xtext.example.domainmodel’s MANIFEST.MF file and

  • add a dependency to the bundle jbase and re-export that dependency;
  • add a dependency to the bundle jbase.mwe2 and make it optional;

Save the file.

Modify the GenerateDomainmodelLanguage.mwe2 as shown in the First Tutorial:

  • in the StandaloneSetup part, add the references to the JbasePackage and to Jbase.genmodel; the modified part should look like this:

bean = StandaloneSetup {
    scanClassPath  = true
    registerGeneratedEPackage = "org.eclipse.xtext.xbase.XbasePackage"
    registerGenModelFile = "platform:/resource/org.eclipse.xtext.xbase/model/Xbase.genmodel"
    registerGenModelFile = "platform:/resource/org.eclipse.xtext.common.types/model/JavaVMTypes.genmodel"
    // In order to inherit from Jbase, add these two lines:
    registerGeneratedEPackage = "jbase.jbase.JbasePackage"
    registerGenModelFile = "platform:/resource/jbase/model/custom/Jbase.genmodel"
}

  • search for the XbaseGeneratorFragment reference, and replace it with JbaseGeneratorFragment:

// generates the required bindings only if the grammar inherits from Xbase
// fragment = xbase.XbaseGeneratorFragment auto-inject {}
fragment = jbase.mwe2.generator.JbaseGeneratorFragment auto-inject {}

Make the Domainmodel.xtext grammar inherit from jbase.Jbase:

grammar org.eclipse.xtext.example.domainmodel.Domainmodel with jbase.Jbase

Now run the MWE2 workflow to re-generate the Xtext artifacts.

The DomainmodelJavaValidator should be modified to extend JbaseValidator instead of XbaseValidator, so that it uses Jbase’s custom Xbase validator.

There is no need to change the model inferrer, since the runtime model of Jbase is also a valid runtime model of Xbase.

If you want to manually test these modifications to the Domainmodel, run another Eclipse instance (Xtext should have generated a launch configuration in your projects if you started with an empty workspace); in the new instance, create a Plug-in project and add a dependency to the bundle org.eclipse.xtext.xbase.lib. In the src folder create a new file with the extension .dmodel, e.g., example.dmodel; when the dialog pops up, accept to add the Xtext nature to the project.

Here’s an example of what you can write with the Domainmodel example that uses Jbase instead of Xbase. Note the Java variable syntax, array access syntax and formal parameters, which are, as in Java, non-final by default (so they can be assigned):

import java.util.List;

package my.model {

    entity Person {
        name : String
        firstName : String
        friends : List<Person>
        address : Address

        op getFullName() : String {
            return firstName + " " + name;
        }

        op getFriendsArray() : Person[] {
            Person[] a = new Person[friends.size()];
            int i = 0;
            for (Person friend : friends) {
                a[i++] = friend;
            }
            return a;
        }

        op m(int i) {
            // parameters are NON-final by default, as in Java
            i = 10;
        }
    }

    entity Address {
        street : String
        zip : String
        city : String
    }
}

You may also want to modify the DomainmodelFormatter.xtend so that it extends JbaseFormatter instead of XbaseFormatter; this way automatic formatting will also work for Java expressions.

If you want to run the Domainmodel JUnit test suite, you first need to update the inputs used in those tests so that they respect Java syntax: imports and statements must be terminated with a semicolon, method invocations must have parentheses, etc. Moreover, the original test XbaseIntegrationTest will not work, since it relies on Xbase syntax. Finally, some tests should be removed completely, since they use features of Xbase that are not supported in Jbase, such as static extension methods. (The modified Domainmodel example can be found in the Git repository of Jbase.)

 

Xtext’s New Generator


Xtext 2.9 adds support for two additional editor platforms: Web-editors and IDEA. It also adds generic build system integration for Maven and for Gradle. As a result the number of generator options has grown a lot. So we took the opportunity to re-implement Xtext’s code generator – the one that creates the language infrastructure from the grammar.

The new generator no longer relies on Xpand, uses dependency injection to wire up the components and is a lot easier to configure for the user. It is still run by means of an MWE2 workflow. Don’t worry: The old generator is still there so you don’t have to migrate immediately. But new languages will automatically make use of the new generator.

The New Xtext Project Wizard

To understand the new code generator let us first have a look at the new New Xtext Project Wizard.

Xtext 2.9 new project Wizard

The first three facets stand for the three supported editor front-ends. The new Generic IDE Support is needed by all of them. It encapsulates platform-neutral editor functionality such as syntax coloring and parser based code completion. Note that you can skip the generation of an editor entirely by not checking any of the first four facets.

In addition to Testing Support, you can choose a preferred build system. If either Maven or Gradle is checked, the wizard will generate all the projects within a parent project to comply with the standard hierarchical file layout of a Maven/Gradle application. Selecting Maven/Gradle as Source Layout will create the common Maven file structure with source folders such as src/main/java, src/test/java, etc.

Note that not all combinations make sense; the wizard will warn you or even correct your configuration if it is irregular.

The Project Layout

The resulting project layout for the above configuration will look like this:

Xtext 2.9 project layout

  • mydsl: The base runtime infrastructure of the language, like the parser, validation etc. Always generated.
  • mydsl.ide: The common editor functionality for all front-ends. Generated when Generic IDE Support is checked.
  • mydsl.idea: The Intellij IDEA plug-in.
  • mydsl.parent: The parent project containing all others. Generated if a Preferred Build System is chosen.
  • mydsl.target: Contains the target platform definition, i.e. the definition of the Eclipse plug-ins and features the code is compiled against. Needed when using Maven or Gradle as build system and creating an Eclipse editor.
  • mydsl.tests: The tests that only depend on the base infrastructure, i.e. not on any editor, should go here as plain JUnit tests. Generated when Testing Support is enabled.
  • mydsl.ui: The Eclipse editor plug-in, generated by the Eclipse plug-in facet.
  • mydsl.ui.tests: JUnit plug-in tests for the Eclipse editor plug-in go in here. Generated if Eclipse plug-in and Testing Support are checked.

The Workflow

The MWE2 workflow generated from the above wizard settings is shown below.

The new generator workflow uses a single new component of type XtextGenerator. If you are missing the former DirectoryCleaner and StandaloneSetup: they are now fields of the generator with good defaults.

The generator is configured with a DefaultGeneratorModule. It is a real Guice module, so if you need additional services in custom code generator fragments, you can use Xtext’s module API to bind them in a subclass and use that one instead. All other components can just @Inject any service. The DefaultGeneratorModule also binds an IXtextProjectConfig that defines what kind of Eclipse projects should be created and how they are named, and a CodeConfig holding common code generation parameters (see the source code for substituted variables in the file header).

The XtextGenerator also defines any number of IXtextGeneratorLanguages, each with a language name and a set of file extensions. The StandardLanguage defines a common set of generator fragments as fields, which can be customized individually as in the example. If you want to skip/add fragments, the easiest way is to copy StandardLanguage and adapt it.

module org.xtext.example.mydsl.GenerateMyDsl

import ....

Workflow {
  component = XtextGenerator {
    configuration = /* DefaultGeneratorModule */{
      project = StandardProjectConfig {
        baseName = "org.xtext.example.mydsl"
        rootPath = ".."
        runtimeTest = {
          enabled = true
        }
        eclipsePlugin = {
          enabled = true
        }
        eclipsePluginTest = {
          enabled = true
        }
        ideaPlugin = {
          enabled = true
        }
        web = {
          enabled = true
        }
        createEclipseMetaData = true
      }
      code = /* CodeConfig */ {
        encoding = "UTF-8"
        fileHeader = "/*\n * generated by Xtext \${version}\n */"
      }
    }
    language = StandardLanguage {
      name = "org.xtext.example.mydsl.MyDsl"
      fileExtensions = "mydsl"

      serializer = {
        generateStub = false
      }
      validator = {
        // composedCheck = "org.eclipse.xtext.validation.NamesAreUniqueValidator"
      }
    }
  }
}

Xtext’s New Generator: Migration

Xtext 2.9 ships with a new generator architecture, which is described in the previous post. Even though there is no immediate urge to migrate an existing language to the new generator infrastructure, here is how to do it.

Make sure to keep a copy of your old code in order to safely roll back in case something goes wrong. If you customized a lot or if you want additional functionality such as IDEA, Web or Maven support, it may be easier to create new plug-ins using the New Xtext Project wizard and then copy existing files over.

Prepare the ide plug-in

Core UI functionality that can be shared by the Eclipse, IDEA and Web editors is extracted into a new plug-in <myLang>.ide. As it is usually created by the New Xtext Project wizard, you have to create it manually when migrating.

  1. Create the plug-in using File > New > Project… > Plug-in Project. Make sure it is physically located in the same directory as the other plug-in projects of your language (EGit, for example, may link directories from different locations into the workspace)
  2. Add a new source folder src-gen and don’t forget to add it in the source.. section of the build.properties.
  3. In the MANIFEST.MF, add a plug-in dependency to
    • <myLang>

    • org.eclipse.xtext.ide
    • org.eclipse.xtext.xbase.ide (if your language uses Xbase)
  4. You may want to add the new plug-in to the automatically refresh list in the workflow’s launch configuration: Run as… > Run configurations… > MWE launch > Generate language infrastructure (<myLang>)  > Refresh > Specific Resources > Specify Resources…
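The build.properties mentioned in step 2 might then look roughly like this (a minimal sketch; your bin.includes entries may differ depending on the project):

```properties
source.. = src/,\
           src-gen/
bin.includes = META-INF/,\
               .
```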

Optional: Prepare a test.ui plug-in

This is only necessary when you have JUnit plug-in tests. These should be moved into a separate plug-in <myLang>.ui.tests now. Create the plug-in analogously to the ide plug-in above, and add a plug-in dependency to <myLang>.ui.

Convert modules and standalone setup to Xtend

The new generator will by default generate <myLang>RuntimeModule, <myLang>StandaloneSetup and <myLang>UiModule in Xtend. To avoid duplicate classes, convert the existing ones to Xtend by using Convert to Xtend from the context menu of the respective Java files.

Change the workflow

If you have not changed anything in the initially generated workflow, you can replace it with something like this:

// adapt name
module <myPlugin>.Generate<MyLang>

import org.eclipse.xtext.xtext.generator.*
import org.eclipse.xtext.xtext.generator.model.project.*

var rootPath = ".."

Workflow {
  component = XtextGenerator {
    configuration = {
      project = StandardProjectConfig {
        // replace with your base plug-in's name 
        baseName = "<myPlugin>" 
        rootPath = rootPath
        runtimeTest = {
          enabled = true
        }
        genericIde = {
          enabled = true
        }
        eclipsePlugin = {
          enabled = true
        }
        // only needed if you have plug-in tests
        eclipsePluginTest = {
          enabled = true
        }
        createEclipseMetaData = true
      }
      code = {
        encoding = "UTF-8"
        fileHeader = "/*\n * generated by Xtext \${version}\n */"
        // preferXtendStubs = false
      }
    }
    language = StandardLanguage {
      // replace with your language’s qualified name
      name = "<myLang>"
      // needed when you import an Ecore model in the grammar
      referencedResource = "platform:/resource/path/to/genmodel"
      // needed when you import an Xcore model in the grammar
      referencedResource = "platform:/resource/path/to/Xcore/model"
      // replace file extensions here
      fileExtensions = "myLang"

      serializer = {
        generateStub = false
      }
    }
  }
}

  1. Note that we no longer use auto-inject, so if you want to generate only Java stubs, you have to do that in the code section.
  2. Use code completion for a list of available fragments and their properties if you want to customize them.
  3. If your grammar uses imported Ecore models, load the genmodels as referencedResource in the StandardLanguage section.
  4. If your grammar uses imported Xcore models, load them as referencedResource in the StandardLanguage section.

Run the workflow and fix remaining compile errors

The MWE2 workflow should run without errors. After that some plug-ins may have compile errors.

<myLang>.ui:
  • In the MANIFEST.MF, add <myLang>.ide as a dependency and remove the Antlr contentassist packages from the package exports.
  • To better serve the use case of multiple languages per plug-in, the activator is now named after the plug-in. This may yield a different case if your language's name uses camel case, e.g. the former MyCamelCaseLanguageActivator will now become MycamelcaseActivator. In that case the manifest shows a warning that you have to fix manually. Some repositories have problems when a file name changes only its case; in Git you have to delete the file, stage the deletion, restore the file from history and stage the new one to make it work.
<myLang>.tests:
  • The runtime injection provider has moved to <myLang>.tests. Fix compile errors with Organize Imports and remove the stale package import from the manifest.
  • You can remove the dependency to the <myLang>.ui plug-in.
<myLang>.ui.tests:
  • The UI injection provider has moved to the <myLang>.ui.tests package.

Adapt plugin.xml files

Compare the plugin.xml and plugin.xml_gen in the runtime and the UI project using the compare editor. We have harmonized the whitespaces in generated files, so the button Ignore Whitespace may help you track down the real differences you have to merge manually.

Use new APIs

  1. (Semantic) highlighting has moved to the <myLang>.ide plug-in. If you see deprecation warnings in classes you customized, just redirect the deprecated imports from org.eclipse.xtext.ui.editor.syntaxcoloring.X to org.eclipse.xtext.ide.editor.syntaxcoloring.X. For the binding in the module, you might have to override bindIdeISemanticHighlightingCalculator instead.
  2. The same holds for bracket matching and parser-based content assist.

That's it; I hope your migration was successful. For further assistance, ask us in the Xtext forum.

Taming The Lexer

Language parsing is traditionally split into two phases: Lexing and Parsing. In this post I want to talk about how a lexer generally works and what you can do if it doesn't behave as expected.

What A Lexer Does

The duty of a lexer is to turn a sequence of single characters into a sequence of so-called tokens. A token is a chunk of characters associated with a certain token type. Most programming languages define individual lexer rules for things like names (identifiers), string literals, numbers, whitespace and comments. The latter two are usually not passed to the parser, or at least sent in a special way, so that we don't have to deal with whitespace and comments explicitly afterwards.

Within the parser rules we can then use the more coarse grained token types to define the grammar of a language. It’s possible to parse without a lexer (or scanner), which is then called scanner-less parsing. The main reason for having a lexer doing the tokenizing first is performance.

How A Lexer Works

There are a lot of similarities in how a lexer and a parser work; in Xtext and Antlr the lexer and parser rules share most of the concepts and have only a few differences. The most important difference is that a lexer usually works without any context.

Most lexers work by following these two principles:

  1. Be greedy
    That is, use the lexer rule that consumes the most characters.
  2. First rule wins
    If two or more rules match the same number of characters, the first one wins.

Example

In Xtext, the lexer consists of all the lexer rules plus the keywords used in the parser rules. Because we want to give keywords a higher precedence, they are ‘copied’ before the defined lexer rules. Let’s consider the following two rules:

HelloWorld : 'Hello' ID ;

terminal ID: ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')* ;

The input

Hello Xtext

will be lexed into two tokens, 'Hello' and ID (if we ignore the whitespace). Although 'Hello' would be a perfect match for ID, it also matches the 'Hello' keyword, which is always preferred (it is declared before any lexer rule). So principle 2 applies here.

For the input

HelloXtext

we end up with one token ID. In this case principle 1 applies, as the sequence ‘HelloXtext’ is longer than what the keyword rule could have matched.
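The two principles can be illustrated with a small, self-contained Java sketch (a deliberately naive simplification, not Xtext's or Antlr's actual implementation): at each position, try every rule, keep the longest match, and break ties in favor of the rule declared first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MaximalMunchLexer {
    // Rule order matters: the keyword is declared before ID, as in Xtext.
    private static final String[][] RULES = {
        {"KEYWORD", "Hello"},
        {"ID", "[a-zA-Z_][a-zA-Z_0-9]*"},
        {"WS", "\\s+"}
    };

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestType = null;
            int bestLength = 0;
            for (String[] rule : RULES) {
                Matcher m = Pattern.compile(rule[1])
                        .matcher(input).region(pos, input.length());
                // Principle 1: be greedy (longest match wins).
                // Principle 2: strict '>' means an equally long later match
                // cannot displace an earlier rule, so the first rule wins ties.
                if (m.lookingAt() && m.end() - pos > bestLength) {
                    bestType = rule[0];
                    bestLength = m.end() - pos;
                }
            }
            if (bestType == null)
                throw new IllegalStateException("no rule matches at offset " + pos);
            if (!bestType.equals("WS")) // hide whitespace from the parser
                tokens.add(bestType + ":" + input.substring(pos, pos + bestLength));
            pos += bestLength;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello Xtext")); // [KEYWORD:Hello, ID:Xtext]
        System.out.println(tokenize("HelloXtext"));  // [ID:HelloXtext]
    }
}
```

Running it reproduces both examples: "Hello Xtext" yields the keyword plus an ID (principle 2), while "HelloXtext" yields a single ID because the ID rule consumes more characters than the keyword could (principle 1).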

Problems With The Antlr 3 Lexer

Xtext generates an Antlr 3 based lexer, which also follows the two principles above. Unfortunately, it only guesses which rule will match the longest sequence, using a state machine that performs lookahead. Based on that outcome, it decides on one of the lexer rules. The problem is that the state machine doesn't do the full lexing and is sometimes wrong.

Example:

TheNumber : 'the' 'number' 'is' number=NUMBER '.';

terminal NUMBER : '0'..'9'+ ('.' '0'..'9'+)?;

With this grammar it should be possible to write

the number is 24.

and

the number is 23.00.

but not

the number is 23..

The lexer generated by Antlr 3 will however not work as expected: it will always try to consume any dot following a number as part of the NUMBER rule. As a result, the first and the last text will have errors.
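The effect can be simulated in plain Java. The sketch below (a hypothetical simplification, not the code Antlr actually generates) contrasts a scanner that commits to the fraction part as soon as it sees a dot, as the Antlr 3 decision state machine effectively does, with a correct maximal-munch scanner that peeks at the character after the dot first:

```java
public class NumberLexing {

    // Commits like Antlr 3's lookahead: after digits, a '.' makes the scanner
    // enter the fraction part without checking that a digit actually follows.
    // Returns the end offset of the NUMBER token, or -1 on a lexing error.
    static int scanNumberNaive(String s, int pos) {
        int i = pos;
        while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
        if (i < s.length() && s.charAt(i) == '.') {
            i++; // committed to the fraction part
            int fractionStart = i;
            while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
            if (i == fractionStart) return -1; // no digit after '.': error
        }
        return i;
    }

    // Correct maximal munch: only consume the '.' if a digit follows it,
    // otherwise leave it for the '.' keyword token.
    static int scanNumber(String s, int pos) {
        int i = pos;
        while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
        if (i + 1 < s.length() && s.charAt(i) == '.'
                && Character.isDigit(s.charAt(i + 1))) {
            i++; // consume '.'
            while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // "24." should lex as NUMBER "24" followed by the '.' keyword:
        System.out.println(scanNumberNaive("24.", 0)); // -1: spurious error
        System.out.println(scanNumber("24.", 0));      // 2: NUMBER is "24"
        // "23.00" works either way:
        System.out.println(scanNumberNaive("23.00", 0)); // 5
        System.out.println(scanNumber("23.00", 0));      // 5
    }
}
```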

Ways To Solve This Issue

Most of the time, you can get around this kind of problem by using parser rules instead. E.g. you could write the following:

TheNumber : 'the' 'number' 'is' number=Decimal '.';
Decimal : NUMBER ->('.' NUMBER)?;
terminal NUMBER : '0'..'9'+;

But sometimes it is more involved and you really need (or want) to solve this on the lexer level. In that case you can use a different implementation. Both Antlr4 and JFlex will work fine with the above example.

I just recently replaced the lexer in a customer’s project which is actually open-source. If you want to see how this can be done, you can find the code here.

Of course I will be happy to do that for you, as well. Happy Lexing! :)

How And Why Use Xtext Without The IDE

Xtext is a language development framework that is best known for the rich tool support it gives you for your programming languages. But even if you don’t need editing capabilities, Xtext has much more to offer than a simple parser generator like Antlr. In this post I will first describe the aspects and features that Xtext offers on the runtime side, before I briefly explain how to use them in any vanilla Java process.

Why Use Xtext Even Without IDE Support

Antlr is an excellent parser generator, but a language needs more than just a parser. With Xtext you get callbacks, support and even full implementations for the following non-tooling aspects of a language:

  • lexer & parser (through Antlr)
  • typed abstract syntax tree (AST)
  • parse tree
  • unparser (AST -> text)
  • lazy linking
  • scoping framework
  • static analysis / validation
  • code generation
  • interpreter
  • EMF compatibility
  • Incremental compiler support
  • Maven and Gradle plugins

For the simple cases you would only throw a grammar and a code generator template at Xtext and you have a fully featured language compiler (or transpiler). However, the architecture of Xtext allows you to customize every aspect of your language in a very clean non-invasive manner. On top of that, there are standard plugins for Maven and Gradle that let you include your Xtext compiler within your builds.

How Can I Use The Runtime Part?

No matter what kind of Xtext project you have, the structure already separates tooling-related things from the runtime part. The main project, which contains the *.xtext grammar file, includes everything you need to parse, validate and process files of your language. Since version 2.9 you can even create new projects without any IDE support. To do so you simply need to deselect all the tooling-related options in the wizard, pick whether you want to build your project with Gradle or Maven, and you are done. The following screenshot shows that wizard page.

Xtext Wizard: Runtime Only

If you want to use the runtime part of your language in e.g. a mavenized Java project, you simply need to add a dependency to your language’s pom. Next up you can for instance use the EMF API to load files of your language like this:

// do this only once per application
Injector injector = new MyDslStandaloneSetup().createInjectorAndDoEMFRegistration();

// obtain a resourceset from the injector
XtextResourceSet resourceSet = injector.getInstance(XtextResourceSet.class);

// load a resource by URI, in this case from the file system
Resource resource = resourceSet.getResource(URI.createFileURI("./mymodel.mydsl"), true);

If you want to load a bunch of files that have references to each other, you should add them all to the resource set at this point by calling

resourceSet.getResource(URI.createFileURI("./anothermodel.mydsl"), true);

It is a good idea to check the validity of the model before processing it, so it is best to call validate next:

// Validation
IResourceValidator validator = ((XtextResource)resource).getResourceServiceProvider().getResourceValidator();
List<Issue> issues = validator.validate(resource, CheckMode.ALL, CancelIndicator.NullImpl);
for (Issue issue : issues) {
  System.out.println(issue.getMessage());
}

The Issue objects contain all the information needed, here we only print out the error message. If you now also want to run the code generator you can do so by asking the injector for the GeneratorDelegate (since 2.9, use IGenerator for earlier versions) like in the following example:

// Code Generator
GeneratorDelegate generator = injector.getInstance(GeneratorDelegate.class);
InMemoryFileSystemAccess fsa = new InMemoryFileSystemAccess();
generator.doGenerate(resource, fsa);
for (Entry<String, CharSequence> file : fsa.getTextFiles().entrySet()) {
  System.out.println("Generated file path : "+file.getKey());
  System.out.println("Generated file contents : "+file.getValue());
}

In this example we pass an in-memory file system to the generator; of course there is also one that delegates to java.io.File, i.e. directly writes to disk.

Using The New Standalone Builder Programmatically

Another option is to use the incremental builder, which takes care of all the lifecycles and indexing. For a set of files that it maintains and builds, it will automatically detect the effective changes for subsequent smaller changes. The new Gradle daemon as well as the new IntelliJ IDEA plugin use this component. If you want to learn how to use it, these unit tests should give you an idea.

When Not To Use Xtext?

There are only a few scenarios where I wouldn't use Xtext. Mostly this would be because I need to process huge amounts of data, where I simply cannot afford translating to an in-memory AST before processing; basically the cases where you would prefer a SAX parser over a DOM when processing XML.

Platform Independent Tool Support

Maybe you are interested in tool support for your language, but you simply don’t want to have it for Eclipse, IntelliJ IDEA or Orion/Ace/CodeMirror which Xtext covers out-of-the-box. Since 2.9 there is an additional platform independent IDE project that can be used in any Java program and gives you the basic infrastructure for things like content assist and syntax coloring. This code can be used from wherever you want. People have already built support for JavaFX and Atom with it.

Summary

Xtext has a lot to offer even if you don’t need an editor for your language. The grammar language is a convenient way to describe syntax and map it to a typed AST that can be further processed with Java or Xtend. The linking, validation and code generation hooks are all in place and battle-proven. Furthermore Xtext provides a serializer (aka. unparser), which allows you to modify the AST programmatically and write the changes back to the concrete textual syntax.

Building a VS Code Extension with Xtext and the Language Server Protocol

In the upcoming Version 2.11, Xtext will support the Language Server Protocol defined by Visual Studio Code. This is a very important step, as the protocol is generic and is going to be supported by other editors such as Eclipse or Che as well. In this post I want to give the early adopters among us a head start and explain how to use this exciting new feature.

Try the Example Language

Installing a language extension in VS Code is easy: open the “Extensions” view on the left sidebar and search for “typefox”. Install and activate the “mydsl” language, create a new file with that extension (e.g. test.mydsl), and explore the editor support for this simple DSL. Here’s an example snippet:

type A {
    int x
}
type B extends A {
    A ref
    string name
}

The source of this example is available on GitHub. It has two main components: an Xtext example language consisting of a base project (io.typefox.vscode) and an IDE support project (io.typefox.vscode.ide), and a VS Code extension project (vscode-extension). You can compile and run the example with the following steps:

  1. Run ./gradlew installServer
  2. Open the vscode-extension project with VS Code.
  3. Run npm install in the integrated terminal (View → Integrated Terminal).
  4. Press F5 to start a second instance of Code.
  5. Test the language as described above.

Create Your Own Language Extension

In case you haven’t done that yet, start by creating an Xtext project and choosing Gradle as build system. Make sure your Xtext version in the gradle build files is bumped to 2.11-SNAPSHOT. In order to create a VS Code extension for your language you need to build an executable application from it. I recommend the Gradle application plugin for this, which gives you a bundle with all required libraries and a startup script. Just add the following lines to the build.gradle of the ide project of your language:

apply plugin: 'application'
mainClassName = 'org.eclipse.xtext.ide.server.ServerLauncher'
applicationName = 'xtext-server'

The command gradle installDist generates the executable in the subfolder build/install.

As a next step, create a VS Code extension following the documentation. The official example uses a Node.js module to implement the server. You can change that to start your language server application by using the following code in extension.ts to create the server options:

let executable = process.platform == 'win32' ? 'xtext-server.bat' : 'xtext-server';
let serverLauncher = context.asAbsolutePath(path.join(
        'xtext-server', 'bin', executable));
let serverOptions: ServerOptions = {
    run : { command: serverLauncher }, debug: { command: serverLauncher }
}

If you have set up your VS Code project properly, you should now be able to start a second Code instance that includes your extension by pressing F5. Open a folder in that new instance and create a file according to the file extension of your language. Now language support should be active, and the debug console of the host instance of Code should show a message like “Loading development extension at …” – You’re done!

How Xtext Integrates with VSCode

In VS Code, a language server is a process that is started and used by an extension, i.e. a plug-in for VS Code. The server can be implemented in any programming language, and VS Code speaks to it through an input and an output stream (i.e. standard in/out or a socket).

Starting and Initializing

After launching the process, VS Code initializes the language server by sending a message. This message includes a path to the root directory the editor is looking at (unless a file is opened without a root directory). In Xtext we take that directory and do a quick build that includes indexing. In order to tell what kind of project structure we are looking at, the Xtext language server will be capable of using different project description providers. One could, for instance, ask Gradle for the modules and dependencies, another could simply read '.project' and '.classpath' files. At the time of writing we only have a dumb implementation that treats everything as one project without dependencies. However, this will change in the coming weeks.

During the first build, Xtext might already find problems in your source code. In that case the language server will send notifications to the editor reporting those diagnostics.

Many Languages per Server

Usually a language server is responsible for one language. However, in order to allow cross-language linking and transitive dependency analyses, the Xtext language server can host as many languages as you want. For VS Code it will look like one language with many different file extensions. The language server is a common reusable component that you don't need to configure besides the project description provider mentioned above. The participating languages are loaded through a Java ServiceLoader for the type ISetup. The necessary entry under META-INF is generated for you if you use the latest nightly build.
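The discovery mechanism is the plain JDK ServiceLoader pattern. The sketch below uses a hypothetical stand-in interface to show the idea (in Xtext the real service type is ISetup, and each language jar registers its implementation in a file under META-INF/services):

```java
import java.util.ServiceLoader;

public class LanguageDiscovery {

    // Hypothetical stand-in for Xtext's ISetup service interface. Each
    // language jar would list its implementation class in
    // META-INF/services/LanguageDiscovery$ISetup.
    public interface ISetup {
        void registerLanguage();
    }

    // Iterates all ISetup implementations found on the classpath and
    // registers each language with the server.
    public static int discoverAndRegister() {
        int count = 0;
        for (ISetup setup : ServiceLoader.load(ISetup.class)) {
            setup.registerLanguage();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // With no provider registrations on the classpath, nothing is found.
        System.out.println(discoverAndRegister() + " language(s) discovered");
    }
}
```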

Roadmap

The Xtext 2.11 release is planned for October 2016. This version will already allow you to create language support extensions for VS Code, but you can expect more interesting integrations in the future, e.g. the web IDE Che.

The current language server implementation of Xtext builds on the ls-api library, which I described in a previous post. This library is going to be moved to LSP4J, a new project proposed under the Eclipse umbrella.


Xtext Core – Less is More

In this post I want to give a short update of what we've been doing in Xtext and what the future plans are. As you probably know, Xtext has been around for a couple of years, growing into a very mature framework for implementing full-blown programming languages like Xtend as well as simpler, more focused domain specific languages. One of the big features is the additional tool support that the framework provides. We started out with Eclipse support and then added support for IntelliJ IDEA and the web editors Ace, CodeMirror and Orion.

Divide & Conquer

As you can imagine supporting a full language development framework with tool support for different editors and IDEs means we need to maintain a lot of code and dependencies. Also, the different IDEs use different build systems which doesn’t make it easier. So far everything was developed in one gigantic repository. For collaborators and users it was very hard to get into it and identify the different parts.

To ensure a more sustainable future for Xtext, and especially for the core component, which is platform (editor/IDE) independent, we have split up the repository into individual ones and reimplemented the build system. The goal was to end up with coherent, easy to understand sub-projects. Xtext Core now contains only the basic framework for both runtime and tool support, without any tool-specific dependencies. It uses Gradle as the build system, which we tried to use wherever possible (only the Eclipse bundles are built with Tycho). So you simply check out the repository and call

./gradlew clean build

A simplified overview of Xtext’s components

Language Server Protocol

If you follow this blog or other public activities from us, you might have noticed that we are actively pushing the Language Server Protocol initiative. So in addition to the restructuring this is the second ingredient to a smaller more coherent Xtext. The LSP is an editor and language agnostic protocol that can be used between any editor and any language for advanced language support like content assist, find references and so on.

The LSP is the perfect match for Xtext, and the support of it will be the main new feature in the upcoming version (2.11). For the future this means that we are not going to add more sub projects like “xtext-coolneweditor” but focus on the core and let the tools focus on the proper implementation of LSP. I see this as a very important scope adjustment for a sustainable life of the project.

Today, VSCode is the only editor that fully supports the protocol, but other teams are working on support for further platforms, too! Especially in the Eclipse community Che, Orion and the classic Eclipse IDE are adopting it. By supporting the LSP your Xtext language will run in these editors without further ado. It will take some time until the LSP support in Eclipse catches up with the native Eclipse integration of Xtext. Until then (and probably beyond) we will maintain the specific platform support.

So if you are nearly as excited about this as I am, you could try it out today already (there is still work ahead of us). Miro published a tutorial about it a couple of days ago.

Release in October

Xtext 2.11 is going to be released on October 18, right before EclipseCon Europe. It will be fully API compatible to earlier versions, it is just built with Gradle and developed in smaller more cleanly separated modules. And of course you only need the Core to run in VSCode and other tools through the Language Server Protocol!

P.S: I will give two sessions at EclipseCon Europe related to this. One is about the LSP and the other is about Xtext Core and the vision behind it. Looking forward to see many of you in Ludwigsburg!

Xtext 2.11 Beta 1 Is Here!

The Xtext 2.11 release has been rescheduled for January 24th 2017, as we underestimated the amount of work in front of us and overestimated the amount of time we could spend.

Today's beta release is merely a sanity check to ensure that we can still build a complete SDK update site. Don't use it in production; we will continue to modify and enhance the code base in the coming weeks until the release. That said, if you can spend a couple of minutes to install it and check how it behaves in your environment, don't hesitate to submit a GitHub issue if you find a potential problem!

Update site: http://download.eclipse.org/modeling/tmf/xtext/updates/milestones

Also please note that we haven't published any beta-specific Maven/Gradle artifacts. Please use the SNAPSHOT releases.

This post describes the most important changes that we’ve been working on so far.

Xtext Core

The Xtext project is evolving! In order to make Xtext more manageable and maintainable in the long run, the code base at eclipse/xtext has been split up into several parts, e.g. xtext-core, xtext-eclipse, and xtext-web. The ideas behind this new structure have been described recently. Of course, these structural changes won’t be noticeable by Xtext users, i.e. you will find the released bundles at the same locations as before. There will also be a talk on this matter at next week’s EclipseCon.

Language Server Protocol

Microsoft’s Language Server Protocol that was developed in the context of their new editor VS Code has been a perfect match for Xtext. The LSP defines how a generic editor client can use language specific services from a reusable language server. If you’ll be at EclipseCon Europe next week, you might want to check out our session about the language server protocol as well as the experiments of the Orion team.

In a previous post I described the basic steps to combine your DSL with a Language Server Protocol client. The Xtext Language Server implementation shipped with today’s beta version already includes most of the features defined by the protocol. You can expect a complete implementation with the final 2.11 version.

The Grammar Language

There is a new feature in the Xtext Grammar Language: Annotations. They may be placed in front of any kind of rule. While this provides a very generic mechanism to add additional meaning to a rule, our primary use case for now is @Override. It serves the same purpose as in Java, but for rules instead of for methods. Example:

// File EnglishGreeting.xtext
grammar pkg.EnglishGreeting with org.eclipse.xtext.xbase.Xbase

generate english "http://www.xtext.org/english-1"

Greeting:
  'Hello' name=ID '!';

// File NordicGreeting.xtext
grammar pkg.NordicGreeting with pkg.EnglishGreeting

generate nordic "http://www.xtext.org/nordic-1"

Model:
  greetings+=Greeting*;

@Override
Greeting:
  'Moin' name=ID '!';

The language pkg.NordicGreeting inherits from the language pkg.EnglishGreeting. Also, the parser rule Greeting from NordicGreeting overrides the parser rule Greeting from EnglishGreeting. In previous Xtext versions, overriding happened implicitly based on rule name equality. As of now, it should be made explicit by placing an @Override annotation in front of the rule declaration.

The advantage is that if somebody removes or renames the rule in the super language, there will be an error marker in the sub language. For reasons of backwards compatibility, a missing @Override annotation by default only creates a warning.

We are thankful to NumberFour AG, whose sponsorship made this feature possible. You may have heard of one of their Xtext-based languages, N4JS, a statically typed version of JavaScript, which is in the process of becoming an Eclipse Project.

Serializer Performance

We are also happy about the massive performance improvements achieved in the serializer.

The serializer is Xtext’s component for turning models (aka ASTs) back into text based on the grammar. The resulting text is fully compliant with the syntax that you specified in your grammar. So the serializer is the antagonist of the parser, and thus sometimes also referred to as un-parser. Common use cases range from non-textual editors, such as diagram- or form-based editors to semantic modifications of the AST, for example during a QuickFix or refactoring.

In real-world projects we observed speed improvements between a factor of 2.5 and 6. These achievements are the result of more than a week of profiling, optimization, and simplification.

Thanks to ETAS who supported these improvements with their sponsorship.

For Contributors

New Build System

To better address multiple platforms, we re-implemented our build system using Gradle. Maven Tycho is only used for the projects that need Eclipse-P2 for dependency resolution. The Gradle build produces both Maven artifacts as well as OSGi bundles. To be more specific, the repositories xtext-core, xtext-lib, xtext-extras and xtext-idea are built with Gradle, while xtext-eclipse is built with Maven Tycho.

Build Server

TypeFox is happy to provide a new Jenkins build server for the Xtext projects. The server is operated by the committers to allow them to optimally adapt it to their development workflows. The major improvements have been achieved via:

  • The awesome GitHub Branch Source Plug-in. The plug-in automatically creates a new Jenkins Job for each new git branch.
  • Job configuration via Jenkinsfile. This turned configuration into git-managed code and eliminated pages-long job-configuration-forms.
  • Elimination of central artifact repositories for snapshot builds. There is no central repository the server publishes to. Instead, each job archives the repository it produced as its artifacts. Since there is a job for each git-branch, this gives us a repository for each git-branch. Of course, milestones and releases will be published to the usual central repositories.

How to Embed a Monaco Editor in a Browser as a Part of My First Task at TypeFox

Hi there, this is Akos. I am the new one at TypeFox, and in this post I would like to describe my first task after joining: embedding the Monaco Editor in the web browser and supporting a simple expression language from the browser using the Language Server Protocol (LSP).

(Screenshot: the Monaco editor example.)

I came from the Java world; I did Eclipse plug-in development for the last couple of years. Although I also worked with Java EE technologies and created rich web-based applications with various frameworks such as JBoss Seam and Vaadin, I never really had to deal with the JavaScript part, because the technology in use somehow magically took care of it under the hood and I could concentrate on the Java code. Embedding the Monaco Editor was a somewhat more complicated task and required some additional JavaScript and TypeScript knowledge. Besides that, I also used Gradle, Webpack, and npm.

First and foremost, what is the Monaco Editor? The Monaco Editor is a browser-based code editor that powers VS Code. It supports cool features such as syntax and semantic validation, content assist, syntax coloring, parameter hints, hover, and much more out of the box. It is well-documented and relatively easy to connect with a language server and integrate into your projects.

The first thing I needed for this task was the Xtext implementation of the language server. This server is available from the 2.11.0.beta1 milestone version of Xtext and depends on a lightweight library: ls-api. This library is a simple Java binding for the LSP and is going to be replaced by the LSP4J Eclipse project in the future. By default, the Xtext language server supports various features, such as content proposals, hover, mark occurrences, and find references, which can be used for almost any kind of DSL without further customization. Besides that, there are a couple of additional features that require a custom implementation; for instance, the signature helper, which provides parameter hints. Usually, a single language is supported by one server; however, the Xtext server is capable of supporting multiple Xtext languages at the same time. The only requirement is that the actual implementation of each language is available on the classpath of the server as a bundled jar.

I had to prepare some generic glue code that acts as a web socket server endpoint and handles the lifecycle of the Xtext language server instance associated with the web socket session. On the session-open event, it creates a new server instance and caches it, and on the web socket session-close event it shuts down the server and removes it from the cache. Besides that, to be able to support Guice-based dependency injection in RESTful web services, I used the jersey2-guice library, which supports DI within the Jersey 2.x implementation of the JAX-RS/JSR-311 specification.
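
The lifecycle described above boils down to a small session-to-server registry. A minimal sketch (class and method names are my own invention, not the actual glue code):

```javascript
// Sketch of the session lifecycle: one language server instance per web socket
// session. ServerRegistry and its method names are illustrative only.
class ServerRegistry {
  constructor(createServer) {
    this.createServer = createServer; // factory for new language server instances
    this.servers = new Map();         // session id -> server instance (the cache)
  }
  onSessionOpen(sessionId) {
    const server = this.createServer();
    this.servers.set(sessionId, server);
    return server;
  }
  onSessionClose(sessionId) {
    const server = this.servers.get(sessionId);
    if (server) {
      server.shutdown();              // stop the server when the session ends
      this.servers.delete(sessionId);
    }
  }
}
```

The important property is that a server instance never outlives its web socket session.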

Once the server side was ready and our DSL was available on the classpath, I had to implement a language-specific signature helper. I added some tests and switched to the client-side code. On the client side, I used mostly TypeScript with some additional JavaScript code and invoked Webpack to compile TypeScript to JavaScript and to build the dependency graph with all of my static assets, producing one single uglified JavaScript file for the browser. The client code is responsible for creating a web socket and connecting to the Xtext language server. Once the connection is successfully established between client and server, the language gets registered in a new Monaco Editor instance. Right after the editor instantiation, both syntax coloring and auto-bracket insertion were configured in the client code. Currently, the LSP does not cover syntax coloring, so this had to be added in the JavaScript code.

The last remaining part of this task was to build a web-archive file and deploy it. Since not all environments have Node.js installed, a dedicated Node.js task was added to the Gradle configuration to install Node.js together with npm. Npm then installs Webpack, and Webpack gathers all modules and their direct and transitive dependencies by reading the package.json of our module. Finally, Gradle creates a war file which can optionally be deployed on a Tomcat server using the Gretty Gradle plug-in.

This example web-based Monaco Editor, which was presented at EclipseCon Europe last week, is available here. We are planning to make both the generic glue code (used for the server) and the Monaco Editor code accessible in the future. Once the code is available under the EPL 1.0 license, we’ll come back to you with another blog post with all the technical details and pitfalls. If you cannot wait, feel free to drop me a mail.

Try the Monaco Demo

Eclipse LSP4J Is Here!

This week the LSP4J repository finally got created and filled with the initial contributions. LSP4J is a Java binding of Microsoft’s Language Server Protocol (LSP), including a Java implementation of the extended JSON-RPC v2.0 protocol the LSP is based on. The project aims at simplifying the implementation of a LanguageClient (an editor) or a LanguageServer (e.g. a modern compiler) in Java. Here is a short introduction of how to use it.

Implement Your Language Server (or Client)

The first thing you should do is implement your language server. To do so, just implement the interface org.eclipse.lsp4j.LanguageServer. If you are implementing a client (e.g. an editor), you need to implement org.eclipse.lsp4j.LanguageClient instead.

Launch and Connect with the Other End

Now that you have an actual implementation, you can connect it with a remote client. Let’s assume you have an InputStream and an OutputStream over which you want to communicate with a language client.

The utility class LSPLauncher does most of the wiring for you. Here is the code needed.

LanguageServer server = ... ;
Launcher<LanguageClient> launcher = LSPLauncher.createServerLauncher(
                                                        server,
                                                        inputstream, 
                                                        outputstream);

With this we have a Launcher object on which we can obtain the remote proxy (of type LanguageClient in this case). Usually a language server should also implement LanguageClientAware, which defines a single method connect(LanguageClient) over which you can pass the remote proxy to the language server.

if (server instanceof LanguageClientAware) {
   LanguageClient client = launcher.getRemoteProxy();
   ((LanguageClientAware)server).connect(client);
}

Now your language server is not only able to receive messages from the other side, but can send messages back as well.

The final thing you need to do in order to start listening on the given input stream is calling

launcher.startListening();

This will start the listening process in a new thread.

Underlying Concepts

As mentioned in the beginning, LSP4J is based on JSON-RPC. The implementation is completely independent of the LSP, so it can be used for other protocols as well. Also, we made sure that it is easy to extend the LSP with new messages. This is important to bridge the last non-standard 20% and to prototype possible extensions for the LSP. For instance, we are currently experimenting with support for semantic coloring and will submit an enhancement request once we are happy with it.
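
To make the wire format concrete: each message is a JSON-RPC payload preceded by an HTTP-style Content-Length header. A minimal framing sketch (illustration only, not LSP4J's actual implementation):

```javascript
// Sketch of the JSON-RPC wire format used by the LSP:
// a Content-Length header, a blank line, then the JSON body.
function frame(message) {
  const body = JSON.stringify(message);
  return 'Content-Length: ' + Buffer.byteLength(body, 'utf8') + '\r\n\r\n' + body;
}

function parse(raw) {
  const headerEnd = raw.indexOf('\r\n\r\n');
  const length = parseInt(/Content-Length: (\d+)/.exec(raw.slice(0, headerEnd))[1], 10);
  return JSON.parse(raw.slice(headerEnd + 4, headerEnd + 4 + length));
}
```

A custom message simply uses a new method name in the payload; the framing stays exactly the same, which is what makes the protocol easy to extend.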

Please refer to the documentation to learn more about the JSON RPC layer.

Getting Closer to Xtext 2.11: Beta 2

A second milestone towards Xtext 2.11 named Beta 2 has been published today! The feature set is largely at the same state as with the Beta 1 published on October 21st. The main difference is that we spent a lot of effort in the build system for the new repository structure, allowing us to publish both for Eclipse and for Maven in a clean and consistent way. This means that you can use this new milestone also with Gradle or Maven projects, e.g. in applications built on the Xtext web integration.

We would like to encourage all Xtext users to check this milestone version with their applications and to give us feedback. Now there’s still time to improve things before 2.11.0 is released (January 24th).

Using the Cutting Edge

As usual you can find nightly built snapshots on the Xtext Latest update site or on Sonatype Snapshots. However, if you want to apply even more up-to-date versions to your application, all subprojects of Xtext now offer their build artifacts in local repositories on our build server:

These builds are triggered automatically when changes are pushed to the corresponding GitHub repositories. Please note that while the nightly built snapshots have signed JARs, the cutting edge builds are not signed.

Tutorial – Building A Language Server For Your DSL

Hey there, this is Christian.

VS Code‘s Language Server Protocol opens up a new horizon of programming IDE development. And there’s good news: Eclipse Xtext 2.11 will support you in building up a language server for your DSLs.
With this post I want to demo that by means of a pre-release snapshot, starting from a blank desk. I am going to tell you about

  • setting up an Eclipse-based development IDE
  • creating and configuring the required Xtext projects
  • testing the language server supporting your DSL

The Eclipse-based development IDE

So let’s get started with the development IDE. You can use either the Eclipse Installer or, like me, choose a pre-built IDE. I suggest the Oxygen M4 build of the “Eclipse IDE for Java Developers” package. Once it is downloaded, extracted, and running, install the Xtext Complete SDK from
http://download.eclipse.org/modeling/tmf/xtext/updates/nightly/.

If you don’t use an Oxygen package, make sure to have the latest milestone of Buildship installed, you’ll get it from http://download.eclipse.org/buildship/updates/e46/milestones/2.x/.

Creating the Xtext projects

With Xtext installed, create a new Xtext Project via File → New → Project…. On the Advanced Xtext Configuration page you have to switch to the Gradle build system. Besides that, you may want to deactivate the Eclipse plug-in.

(Screenshot: the Advanced Xtext Configuration page of the wizard.)

After finishing the wizard you find your workspace like this:

(Screenshot: the generated projects in the workspace.)

For easier orientation in the workspace I suggest switching to the Project Explorer, which is able to present the projects in a hierarchical fashion.

(Screenshot: the Project Explorer showing the projects hierarchically.)

Now invoke the Xtext code generator as usual, e.g. via the context menu.

(Screenshot: invoking the Xtext code generator via the context menu.)

Adding a dedicated language server test project

For this example I suggest putting our language server test into an additional project named org.xtext.example.mydsl.ide.tests, so let’s create that project. Create a new Java Project and set the project location manually to <yourWorkspaceDir>/org.xtext.example.mydsl.parent/org.xtext.example.mydsl.ide.tests, like this:

(Screenshot: the New Java Project wizard with the custom project location.)

Include the new project into the Gradle build configuration by adding it to the settings.gradle of the parent project, see line no 4 below:

(Screenshot: settings.gradle of the parent project with the new project added in line 4.)

Copy build.gradle from org.xtext.example.mydsl.tests to org.xtext.example.mydsl.ide.tests and add a dependency to org.xtext.example.mydsl.ide, see line no 3 in the following screenshot.

(Screenshot: build.gradle of the test project with the added dependency in line 3.)

Having everything saved, we need to poke Buildship, which connects Gradle and Eclipse, to re-evaluate the build settings. This is done via the context menu of a project (I just chose the parent project) → Gradle → Refresh Gradle Project.

(Screenshot: the Refresh Gradle Project context menu entry.)

The language server test

Create an Xtend class named org.xtext.example.mydsl.ide.tests.LanguageServerTest in the src folder of your new project with the super class org.eclipse.xtext.testing.AbstractLanguageServerTest.
Add the following constructor and test method.

new() {
	super("mydsl")
}

@Test
def void test01_initialization() {
	val capabilities = initialize().capabilities
	Assert.assertTrue(
    	capabilities.definitionProvider && capabilities.documentFormattingProvider)
}

This first test initializes your language server. The language server answers the initialize() call with information on the supported features. The test assumes the language server to be capable of resolving definitions of cross-references and of formatting documents according to DSL-specific formatting rules.

The contributions of AbstractLanguageServerTest instantiate the language server and initialize the test. The test itself simulates a language client and collects responses from the server. Last but not least, AbstractLanguageServerTest contributes lots of convenience methods corresponding to the services of the language server.

Now run your test class as JUnit Test, e.g. via the context menu (right-click on the class name!), …

(Screenshot: running the test class as a JUnit test via the context menu.)

… and checkout the result:

(Screenshot: the JUnit test result.)

Let’s add further tests.
The following one creates a file named hello.mydsl on disk and instructs the language server to load it. The content is Hello Xtext! The test expects the language server to load the document successfully without any issues, which is indicated by an empty list of diagnostics provided by the server.

@Test
def void test02_openFile() {
	initialize()
	
	val file = 'hello.mydsl'.writeFile("")
	file.open('''
		Hello Xtext!
	''')
	
	Assert.assertTrue("There're issues in file 'hello.mydsl'.", diagnostics.get(file).empty)	
}

Last but not least let’s test the code completion ability of your language server.
The following test assumes a document with the content He; the cursor is located at the end of the line, i.e. in column 2. The test expects the language server to offer a single completion proposal labeled Hello, replacing the prefix He starting at line 0, column 0 and ending at line 0, column 2 with Hello.

@Test
def void test03_completion() {
	testCompletion [
	    model = 'He'
	    line = 0
	    column = 2
		expectedCompletionItems = '''
			Hello -> Hello [[0, 0] .. [0, 2]]
		'''
	]
}
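
The [0, 0] .. [0, 2] notation above denotes a zero-based line/column range. How such a completion edit is applied to a document can be sketched as follows (a hand-rolled helper for illustration, not part of the Xtext test API):

```javascript
// Sketch: apply an LSP-style text edit { range, newText } to a document string.
// Positions are zero-based { line, character } pairs.
function applyEdit(text, range, newText) {
  const lines = text.split('\n');
  // translate a { line, character } position into a string offset
  const offset = pos =>
    lines.slice(0, pos.line).reduce((sum, line) => sum + line.length + 1, 0) + pos.character;
  return text.slice(0, offset(range.start)) + newText + text.slice(offset(range.end));
}
```

Replacing the whole prefix rather than appending the missing characters lets a client apply proposals uniformly, regardless of how the prefix was matched.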

Conclusion

Congrats! You built your own language server!
For further reading on customizing your Xtext-based language server with respect to the grammar of your DSL, scoping & linking, formatting, and code completion, refer to http://www.eclipse.org/Xtext/documentation/. Your DSL contains expressions? Have a look at this post.

Finally, you want to use your language within Visual Studio Code? Miro explains here how to achieve that.

Xtext 2.11 Is Released

After more than 7 months we finally got Xtext 2.11 out the door. Thanks for your patience. But good things come to those who wait! So go and get it while it’s hot.

Xtext 2.11 comes with tons of bugfixes and improvements for the framework itself and for Xtend. If you want more technical details, please have a look at the release notes. In this post, I want to highlight three points that are special to this release from my perspective.

With Xtext 2.11 we have split up our monolithic repository into multiple smaller parts. It was a huge effort to separate the Git repos without losing their history or tags, get the individual builds running, set up a staged CI, and make all tests green again. But as of now, it should be much easier to fork, build and consume just the parts of the framework you are really interested in.

The biggest innovation in Xtext 2.11 is that it allows you to generate a language server for your language. This language server provides the smartness of your language to various editor clients such as VS Code, the Eclipse Generic Editor, Eclipse Che, Sublime, Eclipse Orion and hopefully more in the future. In short, we are close to providing support for your language in all major editors and IDEs with a single implementation. We have started the new Eclipse project LSP4J with an implementation of the language server protocol in Java, to be picked up by other LSP implementors.

Last but not least, this release is a joint effort by multiple individuals and companies. I want to thank Christian Dietrich, Karsten Thoms, Holger Schill, Lorenzo Bettini and of course everybody from TypeFox for their great work. And of course thanks to the community for your helpful feedback and appreciation.

Teaching the Language Server Protocol to Microsoft’s Monaco Editor

Through the past years we have been integrating all kinds of different JavaScript code editors, like Ace, Orion or CodeMirror, into custom web-based software tools. Since last June another very good editor has been available: Microsoft’s Monaco editor, the editor widget that is used at the core of VSCode. Besides the very good quality and speed of the editor, the API is very close to the Language Server Protocol (LSP), which is not a surprise given that both are developed by the same team.

At TypeFox, we love both the Monaco Editor and the Language Server Protocol and use them extensively in our projects. One publicly available project is RIDE, a Data-Science IDE we developed for R-Brain. Another publicly available example is the web calc example, which is covered by Akos here. In such cases, we connect Monaco editors with language servers running remotely. So far, however, Monaco did not speak the LSP out of the box, so we had to do a lot of plumbing and shimming to make it work. This has now been generalized and published as individual npm packages under the MIT license.

The Monaco Editor Language Client provides a language client that establishes communication between Monaco editors and language servers over JSON-RPC and the VSCode WebSocket JSON-RPC package enables JSON-RPC to work over web sockets.

The language client package comes with an example showing how a Monaco editor can be connected with the JSON language server. Go ahead, check out this repository, follow the instructions to start and play with the example, and then come back for a detailed explanation.

The Server Side

The server side consists of two components: the Express server and the JSON language server. We use Express, a web application framework for Node.js, to serve static content, such as index.html and JavaScript code, and to open a web socket connection. Instead of implementing our own JSON language server, we deploy the VSCode JSON language service package as a language server by means of the VSCode Language Server package (consult the VSCode documentation to learn more about it).

Deploying the JSON language server

The JSON language server can be deployed as an external process, or within the express server’s process:

  • in the first case, a child node process is spawned, and JSON-RPC messages are forwarded between the web socket and the node process connections. While forwarding the initialization request, the parent process id is set to the express server’s process id.
  • in the second case, the language server works over a web socket connection directly.

Deploying an Xtext language server

What if you want to connect your Xtext language server instead of an example one? You have two options:

The Client Side

Bundling and loading of client code

The entry client page provides a container for the Monaco editor and loads the client-side code bundled by webpack. Unfortunately, Monaco is only distributed as an AMD module and cannot be bundled by webpack. To overcome this, one has to ensure that the Monaco code is loaded before the client code.

Starting Monaco language client

Once the Monaco code is loaded, but before starting the Monaco language client, one should:

  • register all necessary languages

monaco.languages.register({
  id: 'json',
  extensions: ['.json', '.bowerrc', '.jshintrc', '.jscsrc', '.eslintrc', '.babelrc'],
  aliases: ['JSON', 'json'],
  mimetypes: ['application/json'],
});

  • provide language client services

Default services notify the language client about changes in the Monaco editor models, hook up the Monaco language features with the language client (e.g. completion, hover, etc.), and provide means to log messages from language servers in the console.

const services = createMonacoServices();

  • establish a web socket connection

// create the web socket
const url = createUrl('/sampleServer')
const webSocket = createWebSocket(url);
// listen when the web socket is opened
listen({
    webSocket,
    onConnection: connection => {
        // create and start the language client
        const languageClient = createLanguageClient(connection);
        const disposable = languageClient.start();
        connection.onClose(() => disposable.dispose());
    }
});

Having everything wired up, one can start the actual client. In this example, we use a reconnecting web socket package to automatically reopen the web socket connection whenever it gets closed. Because of this, we have to disable the default error handler, which would try to restart the language client five times, and instead we start a new language client each time a new web socket connection is opened.

function createLanguageClient(connection: MessageConnection): BaseLanguageClient {
    return new BaseLanguageClient({
        name: "Sample Language Client",
        clientOptions: {
            // use a language id as a document selector        
            documentSelector: ['json'],
            // disable the default error handler            
            errorHandler: {
                error: () => ErrorAction.Continue,
                closed: () => CloseAction.DoNotRestart
            }
        },
        services,
        // create a language client connection from the JSON RPC connection on demand
        connectionProvider: {
            get: (errorHandler, closeHandler) => {
                return Promise.resolve(createConnection(connection, errorHandler, closeHandler))
            }
        }
    })
}

Xtext LSP vs. Xtext Web

The Eclipse Xtext language development framework can be used to implement domain-specific languages (DSLs) as well as fully blown programming languages. In addition to a modern compiler architecture, it comes with tool support for different platforms, like Eclipse, IntelliJ and the web.

Since supporting all these different editor platforms is a lot of effort, we are strong supporters of the Language Server Protocol (LSP). The LSP defines a set of requests and notifications for editors to interact with language servers. A language server essentially is a smart compiler watching a workspace and exposing services for an editor. Such services cover things like content assist, find references, rename refactoring and so on. So the big question is:

When should I use Xtext LSP instead of a native editor integration?

As of today, if you are looking for an Eclipse Plug-in, my answer clearly is: go with the traditional Xtext Eclipse Plug-in. With Eclipse LSP4E there is Eclipse support for language servers, but it is not even close to what our native Eclipse support does. I also doubt that this will change any time in the future. The native Eclipse integration of Xtext is here to stay.

For IntelliJ IDEA the situation is different. The Xtext integration has not been updated with the last release, nor has JetBrains started to work on LSP support yet. The code for the IDEA integration is quite extensive and deep. So deep that we regularly get broken because we use non-public API. Since the demand for the IDEA integration is not high, maintaining it doesn’t make sense to us. That is why I already asked JetBrains to work on LSP integration last year. So far they don’t seem to be convinced, but you could add your two cents or +1 to this ticket if you think LSP would be a good IDEA.

For the rest of this post, I want to talk about Xtext Web and why you should not use it anymore and prefer the LSP integration instead.

The Xtext Web support was our first attempt to generalize language features over multiple editors. At that time we only abstracted over the web editors Ace, CodeMirror and Eclipse Orion (the editor widget, not the IDE). We did it over a REST interface and focused on single code editors only. The LSP integration works with any editor supporting it, and while Eclipse Orion is still working on supporting it, the Monaco code editor from Microsoft fully supports it already. So here are my four reasons why you should use LSP for web applications:

Monaco Is Awesome

Our team has been working with Monaco since it came out last summer. For instance, we are developing a data science IDE for r-brain.io (you can try it for free :-)), where we use Monaco with language servers (currently Python and R). The R language server has been implemented in Xtext using the brand new LSP support. Please have a look at this article to learn more about its features.

So far working with Monaco has been a decent experience. The code is well written and organized, and the quality is very high. Microsoft uses TypeScript, which we do, too, when working on a JavaScript stack. It is to JavaScript what our Xtend programming language is to Java :).

Feature-wise I can say that it has all the things other editors have, but it also comes with additional nice features like code lenses, peek definition, or the integrated find references. Moreover, it is very extensible, letting you inline any kind of HTML, for instance.

Multiple Editor Support

Monaco directly supports working with multiple editors on a single website and connecting them, e.g. for navigation. This is also a big difference between Xtext LSP and Xtext Web. Xtext LSP is built on top of our incremental builder infrastructure, so it can naturally deal with multiple documents and even with projects and dependencies. This doesn’t mean that you need to serve your files from a file system or deal with complicated project setups. It just supports this once you want to do it.

Xtext Web, on the other hand, can only handle a single document, and the underlying resource set needs to be provided programmatically.

Write Once, Run Everywhere

Having a fully compliant language server for your Xtext DSL allows you to use it in other contexts, too. Single-sourcing your language implementation and being able to run it in all LSP-supporting editors is a huge plus. You decouple the work that you put into your language from the decisions you make regarding the editors or applications you integrate it into.

Future Proof

When it comes to integrating Xtext languages in web applications, all our passion and love goes to the LSP. Our customers use either Eclipse or LSP, and we are happy to help people migrate their existing Xtext Web solutions to LSP and Monaco. Going forward we won’t invest in the Xtext Web support and will likely deprecate it soon. In the future, given the adoption of the LSP, there will be even more tools and editors that can run your Xtext languages.

Final Words

So for me, the main focus in Xtext will be the traditional Eclipse support and the LSP support for everything else. The Eclipse support will benefit from the LSP support as well since we plan to implement new tool features in a way such that it can be used from Eclipse as well as from LSP.

Please get in touch if you have questions or any doubts whether your use case is well covered by this focus.

Theia – One IDE For Desktop & Cloud

Today, I want to point you at a GitHub repository we have been contributing to for the last couple of weeks. Theia is a collaborative and open effort to build a new IDE framework in TypeScript.

“Yet another IDE?”, you might think. Let me explain the motivation behind it and how its scope is unique compared to existing open-source projects.

Single-Sourcing Desktop & Browser (Cloud) Tools

Let’s start with the unique selling point: Theia targets IDEs that should run as native desktop applications (using Electron) as well as in modern browsers (e.g. Chrome).

So you would build one application and run it in both contexts. Theia even supports a third mode, which is a native desktop app connecting to a remote workspace. No matter if you target primarily desktop or cloud, you can leverage the goodness of web technology and will be well prepared for the future. Although implemented using web technologies, neither VSCode nor Atom support execution in a browser with a remote backend.

Extensibility

Theia is an open framework that allows users to compose and tailor their Theia-based applications as they want. Any functionality is implemented as an extension, so it is using the same APIs a third-party extension would use. Theia uses the dependency injection framework Inversify.js to compose and configure the frontend and backend application, which allows for fine-grained control of any used functionality.
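
The underlying idea can be sketched with a hand-rolled container (concept only; Inversify.js's real API uses decorators and typed service identifiers):

```javascript
// Hand-rolled sketch of the DI concept: extensions contribute bindings,
// the container wires them together on demand. Not Inversify.js's actual API.
class Container {
  constructor() {
    this.bindings = new Map();
  }
  bind(id, factory) {
    this.bindings.set(id, factory);
  }
  get(id) {
    return this.bindings.get(id)(this);
  }
}

// A hypothetical "editor extension" contributes its services as bindings:
const container = new Container();
container.bind('Editor', () => ({ name: 'monaco' }));
container.bind('App', c => ({ editor: c.get('Editor') }));
```

Because third-party extensions use the very same bind/get mechanism as the core, they can replace or extend any binding the core application contributes.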

Since in Theia there is no two-class treatment between core code and extensions, any third-party code runs in the main application processes with the same rights and responsibilities the core application has. This is a deliberate decision to support building products based on Theia.

Dock Layout

Theia focusses on IDE-like applications. That includes developer tools, but extends to all kinds of software tools for engineers. We think that just splitting an editor is not enough: for such applications, you want to represent data in different ways (not only textually) and give the user more freedom in using the screen real estate.

Theia uses the layout manager library phosphor.js. It supports side panels similar to what JetBrains’ products do and allows the user to layout editors and views as they want in the main area.

(Screenshot: the dock layout with side panels and a freely arranged main area.)

Language Server Protocol

Another goal of this effort is to reuse existing components when sensible. The language server protocol (LSP) is, therefore, an important, central concept. Theia uses Microsoft’s Monaco code editor, for which I already found some positive words last week. That said, Theia has a thin generic editor API that shields extensions from using Monaco-specific APIs for the most common tasks. Also, other components, like Eclipse Orion’s code editor, could be utilized as the default editor implementation in Theia as well.

To showcase the LSP support, Theia comes with Eclipse’s Java language server, which also nicely shows how to add protocol extensions. For instance, the Java LS has a particular URI scheme to open source files from referenced jars, which Theia supports.

(Screenshot: Java language server support in Theia.)

TypeScript

The JavaScript (JS) language is evolving, but the different targeted platforms lag behind. The solution to this is to write code in tomorrow’s language and then use a transpiler to ‘down-level’ the source code to what the targeted platforms require. The two popular transpilers are Babel and TypeScript. In contrast to Babel, which supports the latest versions of JavaScript (ECMAScript), TypeScript goes beyond that and adds a static type system on top.

Furthermore, the TypeScript compiler exposes language services to provide advanced tool support, which is crucial to read and maintain larger software systems. It allows navigating between references and declarations, gives you smart completion proposals and much more. Finally, we are not the only ones believing TypeScript is an excellent choice (read ‘Why TypeScript Is Growing More Popular’).

Status Quo & Plans

Today we have the basic architecture in place and know how extensions should work. In the Theia repository, there are two examples (one runs in a browser, the other on Electron), which you can try yourself. They allow you to navigate within your workspace and open files in code editors. We also have a command registry with the corresponding menu and keybinding services. Depending on whether you run in Electron or a browser, the menus will be rendered natively (Electron) or using HTML. The language server protocol is working well, and there are two language servers integrated already: Java and Python. We are going to wrap the TypeScript language service in the LSP, so we can start using Theia to implement Theia. Furthermore, a terminal gives you access to the workspace’s shell.

Don’t treat this as anything like a release, as this is only the beginning. But we have laid out a couple of important fundamentals, and now is a good time to make it public and get more people involved. The CDT team from Ericsson has already started contributing to Theia, and more parties will join soon.

Theia might not be ready for production today, but if you are starting a new IDE-like product or looking into migrating the UI technology of an existing one (e.g. Eclipse-based), Theia is worth considering. Let me know what you think or whether you have any questions.

Generate Traced Code with Xtext


Xtext 2.12 was released on May 26th. As described in its release notes, a main novelty is an API for tracing generated code.

Why Tracing?

Whenever you transpile code from one language to another, you need some kind of mapping that instructs the involved tools how to navigate from a piece of source code to the respective target code and back. In a debugging session, for example, developers can get quite frustrated if they have to step through the generated code and then try to understand where the problem is in their original source. Being able to debug directly in the source code saves time and frustration, so it’s definitely the way to go.

For transpiled JVM languages such as Xtend, the JSR 45 specification defines how to map byte code to the source. For languages that target JavaScript, such as TypeScript or CoffeeScript, source maps are generated by the compiler and then processed by the development tools of the web browser. The DSL framework Xtext offers an API to access tracings between source and target code for any language created with Xtext. Xbase languages such as Xtend make use of such tracings automatically, but for other languages the computation of tracing information had to be added manually to the code generator with Xtext 2.11 and earlier versions.

Tracing Xtend Templates

The examples shown in this post are available on GitHub. They are based on a simple Xtext language that describes classes with properties and operations.

The examples are implemented in Xtend, a language that is perfectly suited to writing code generators. Among other things, it features template expressions with smart whitespace handling and embedded conditionals and loops. Here’s an excerpt of the generator implementation of our example, where the target language is C:

The entry point of the new API is TracingSugar, which provides extension methods to generate traced text. The code above uses generateTracedFile to create a file and map its contents to model, the root of our AST. The generateHeader method is shown below. It defines another template, and the resulting text is mapped to the given ClassDeclaration using the @Traced active annotation.

The _name extension method in the code above is another part of the new API. Here it writes the name property of the ClassDeclaration into the output and maps it to the respective source location. This method is generated from the EMF model of the language using the @TracedAccessors annotation. Just pass the EMF model factory class as parameter to the annotation, and it creates a tracing method for each structural feature (i.e. property or reference) of your language.

The Generator Tree

The new tracing API creates output text in two phases: first it creates a tree of generator nodes from the Xtend templates, then it transforms that tree into a character sequence with corresponding tracing information. The base interface of generator nodes is IGeneratorNode. There are predefined nodes for text segments, line breaks, indentation of a subtree, tracing a subtree, and applying templates.
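The two-phase approach can be illustrated with a strongly simplified model of such a generator tree. This is a sketch with plain Java stand-ins, not the real IGeneratorNode types: the tree is built first, then flattened into text while the output offsets of each traced subtree are recorded.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a generator tree: text and trace nodes are
// composed into a tree first (phase 1), then rendered to a string
// while recording which output regions map to which source elements
// (phase 2). The real Xtext API uses IGeneratorNode and friends.
public class GeneratorTreeDemo {

    interface Node { void render(StringBuilder out, List<String> traces); }

    static class Text implements Node {
        final String text;
        Text(String text) { this.text = text; }
        public void render(StringBuilder out, List<String> traces) { out.append(text); }
    }

    static class Traced implements Node {
        final String sourceElement; // name of the source AST element (stand-in)
        final List<Node> children = new ArrayList<>();
        Traced(String sourceElement) { this.sourceElement = sourceElement; }
        Traced add(Node child) { children.add(child); return this; }
        public void render(StringBuilder out, List<String> traces) {
            int start = out.length(); // record output offsets during rendering
            for (Node child : children) child.render(out, traces);
            traces.add(sourceElement + " -> [" + start + "," + out.length() + ")");
        }
    }

    public static void main(String[] args) {
        // phase 1: build the tree
        Traced root = new Traced("ClassDeclaration Person")
            .add(new Text("typedef struct {\n"))
            .add(new Traced("Property name").add(new Text("  char* name;\n")))
            .add(new Text("} Person;\n"));
        // phase 2: flatten to text plus trace regions
        StringBuilder out = new StringBuilder();
        List<String> traces = new ArrayList<>();
        root.render(out, traces);
        System.out.println(out);
        traces.forEach(System.out::println);
    }
}
```

The nesting of Traced nodes is what later allows the tooling to answer both “where does this output come from” and “what output does this source element produce”.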

The generator tree can be constructed via templates, or directly through methods provided by TracingSugar, or with a mixture of both. The direct creation of subtrees is very useful for generating statements and expressions, where lots of small text segments need to be concatenated. The following excerpt of our example code generator transforms calls to class properties from our source DSL into C code:

The parts of the TracingSugar API used in this code snippet are

  • trace to create a subtree traced to a source AST element,
  • append to add text to the subtree, and
  • appendNewLine to add line breaks.

The resulting C code may look like this:

Employing the Traces

Trace information is written into _trace files next to the generator output. For example, if you generate a file persons.c, you’ll get a corresponding .persons.c._trace in the same output directory. Xtext ships a viewer for these files, which is very useful to check the result of your tracing computation. In the screenshot below, we can see that the property reference bag is translated to the C code Bag* __local_0 = &this->bag;
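Conceptually, a trace file is a set of associations between regions in the generated file and regions in the source file. The lookup “which source location does this generated offset come from” can be sketched as follows — a simplified, self-contained model, not the actual binary _trace format or the ITrace API:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Simplified sketch of the data a trace file stores: pairs of
// (generated region, source region). Nested regions are resolved
// by picking the most specific (shortest) match.
public class TraceLookupDemo {
    static class Region {
        final int offset, length, sourceOffset, sourceLength;
        Region(int offset, int length, int sourceOffset, int sourceLength) {
            this.offset = offset; this.length = length;
            this.sourceOffset = sourceOffset; this.sourceLength = sourceLength;
        }
    }

    // find the source region for a position in the generated file
    static Optional<Region> findSource(List<Region> trace, int targetOffset) {
        return trace.stream()
            .filter(r -> targetOffset >= r.offset && targetOffset < r.offset + r.length)
            .min(Comparator.comparingInt(r -> r.length)); // most specific match wins
    }

    public static void main(String[] args) {
        List<Region> trace = List.of(
            new Region(0, 100, 0, 40),   // whole class declaration
            new Region(20, 30, 10, 5));  // e.g. the property reference 'bag'
        // offset 25 lies in both regions; the nested, shorter one wins
        Region r = findSource(trace, 25).get();
        System.out.println("source offset: " + r.sourceOffset);
    }
}
```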

[Screenshot: the Xtext trace file viewer]

The programmatic representation of such a trace file is the ITrace interface. An instance of ITrace points either in the source-to-target or the target-to-source direction, depending on how it was obtained. In order to get such a trace, inject ITraceForURIProvider and call getTraceToTarget (for a source-to-target trace) or getTraceToSource (for a target-to-source trace).

Xtext provides some generic UI for traced generated code: If you right-click some element of your source file and select “Open Generated File”, you’ll be directed to the exact location to which that element has been traced. In the same way, you can right-click somewhere in the generated code and select “Open Source File” to navigate to the respective source location. This behavior is shown in the animation below.

[Animation: navigating between source elements and generated C code via “Open Generated File” / “Open Source File”]

Enhancing Existing Code Generators

In many cases it is not necessary to rewrite a code generator from scratch in order to enhance it with tracing information. The new API is designed in a way that it can be weaved into existing Xtend code with comparatively little effort. The following hints might help you for such a task, summarizing what we have learned in the previous sections of this post.

  • Use generateTracedFile to create a traced text file. There are two overloaded variants of that method: one that accepts a template and traces it to a root AST element, and one that accepts a generator node. If you are already using Xtend templates, just pass them to this method.
  • Add the @Traced annotation to methods that transform a whole AST element into text. In some cases it might be useful to extract parts of a template into local methods so this annotation can be applied.
  • Use the @TracedAccessors annotation to generate extension methods for tracing single properties and references. For example, if you have an expression such as property.name in your template, you could replace that with property._name so that the expression is properly traced.
  • Use the TracingSugar methods to construct a generator subtree out of fine-grained source elements such as expressions. If you have previously used other string concatenation tools like StringBuilder or StringConcatenation, you can replace them with CompositeGeneratorNode (see e.g. generateExpression in our example code).

It’s Time to Trace!

With the new Xtext version 2.12, generating traced code has become a lot simpler. If such traces are relevant in any way for your languages, don’t hesitate to try the API described here! We also welcome any feedback, so please report problems on GitHub and meet us on Gitter to discuss things, or just to tell us how cool tracing is 🙂

Linking Xtext Models With EMF Models


This article shows the necessary steps to enable cross-references between Xtext models and other EMF-based models. It focusses on the linking aspects and leaves aside things like synchronization, transactions, singleton editing domains, dirty-state handling, etc. So for a full integration, say with Sirius, this is only one part of the story.

There are often good reasons to describe different parts of an EMF model in different notations, e.g. textual and graphical. To connect resources with different notations we can use EMF cross-references. Via the XtextResource, Xtext hides the entire process of parsing (text to EMF) and serialization (EMF to text) behind EMF’s resource API. So in theory, cross-references between Xtext and other EMF-based models should work out-of-the-box, shouldn’t they?

Unfortunately, there is one big difference: Xtext uses names to refer to an element, while EMF uses URIs. This blogpost is about how to get cross-references working anyway.

The Example

We’ve put an example on GitHub, where we use a simple Ecore model for trees and an Xtext grammar on the same model.

[Screenshots: the trees Ecore model and the Xtext grammar]

In our setup, we can define tree models either in XMI using the generated EMF tree editor (file extension tree) or textually with Xtext (file extension xtree), and establish cross-references between the models of both notations. The screenshot shows an XMI-based model on the left and an Xtext-based one on the right referring to each other.
Cross-references between Xtext and XMI

Cross-References from Xtext to XMI

To resolve a cross-reference to an element, Xtext takes the name given in the text to look up the referred element in a scope. You can think of a scope as a table of all candidates for a certain cross-reference with their name in a specific context. Scopes are usually chained, such that if the current scope does not yield a result for a given name, it asks its parent scope and so on. The top-most parent scope is called the global scope. It provides all elements from all reachable resources that would be referable at that specific location. It is usually backed by the Xtext index, which stores descriptions of all externally referable elements for each resource. The index is populated by the Xtext builder, which automatically syncs the index data on file changes. This is why you should never deactivate automatic builds for an Xtext project. More information on scoping can be found in the Xtext documentation.
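The chain of scopes described above can be sketched in a few lines. This is a simplified, self-contained model — not the actual Xtext IScope API — where the outermost parent plays the role of the global scope backed by the index:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Minimal sketch of chained scopes: each scope maps names to elements
// and delegates unresolved names to its parent scope.
public class ScopeDemo {
    static class Scope {
        final Scope parent;
        final Map<String, String> elements = new HashMap<>();
        Scope(Scope parent) { this.parent = parent; }
        Scope define(String name, String element) { elements.put(name, element); return this; }
        Optional<String> resolve(String name) {
            if (elements.containsKey(name)) return Optional.of(elements.get(name));
            return parent == null ? Optional.empty() : parent.resolve(name);
        }
    }

    public static void main(String[] args) {
        // the global scope serves elements from all reachable resources
        Scope global = new Scope(null).define("Root.Child", "node from the Xtext index");
        // a local scope shadows and extends it for a specific context
        Scope local = new Scope(global).define("x", "local variable");
        System.out.println(local.resolve("x").get());          // found locally
        System.out.println(local.resolve("Root.Child").get()); // delegated to the global scope
    }
}
```

This delegation is exactly why indexing the XMI-based models is sufficient: once their elements appear in the global scope, Xtext’s regular linking finds them.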

Given the above, in order to refer to XMI from Xtext we have to create index entries for the elements of the XMI-based model. This is achieved by registering a new language for the *.tree resources to the Xtext infrastructure, thus providing services like indexing and name computation. In the example, we created a separate plug-in project for the Xtext language registration. You could of course put that code in an existing plug-in as well, or you might want to put runtime and UI parts into separate plug-ins.

  1. Implement a TreeRuntimeModule inheriting from AbstractGenericResourceRuntimeModule and implement the missing methods. This class is used to configure the runtime dependency injection (DI) container for this language. If you want to override non-UI services you can do this here.
    In our example, we override the IQualifiedNameProvider to yield fully qualified names, i.e. Root.Child instead of just Child in the above example to avoid name collisions.
  2. Implement a TreeUiModule inheriting from EmfUiModule. This is the DI config for all Eclipse-based services.
    In the example, we added an editor opener that opens the EMF tree editor when the user follows a reference to an XMI-defined tree element in the Xtext editor.
  3. Implement a plug-in Activator (inheriting from AbstractUIPlugin) that creates the injector based on the TreeUiModule, the TreeRuntimeModule and the common SharedStateModule on start(). Make sure to register the Activator in the MANIFEST.MF.
  4. Implement a TreeExecutableExtensionFactory that extends AbstractGuiceAwareExecutableExtensionFactory and delivers both the bundle and the injector from the Activator.
  5. In the plugin.xml, register your language to the extension point org.eclipse.xtext.extension_resourceServiceProvider with the uriExtension tree and an instance of EmfResourceUIServiceProvider created via the TreeExecutableExtensionFactory from the previous step.

You can skip steps 2 to 5 if you don’t need Eclipse support. If you want to have the same functionality in a plain Java process, you have to manually create the injector and initialize the EMF registries, as we did in the TreeStandaloneSetup.
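The effect of the custom IQualifiedNameProvider from step 1 can be illustrated with a small, self-contained sketch (a hypothetical Node class, not the real EMF types): the fully qualified name is the element’s own name prefixed by the names of all its containers.

```java
// Sketch of what a fully-qualifying name provider computes:
// the element's name prefixed by all container names,
// e.g. Root.Child instead of just Child.
public class QualifiedNameDemo {
    static class Node {
        final String name;
        final Node parent; // containment parent, null for the root
        Node(String name, Node parent) { this.name = name; this.parent = parent; }
    }

    // walk up the containment chain and join the names with dots
    static String qualifiedName(Node node) {
        return node.parent == null
            ? node.name
            : qualifiedName(node.parent) + "." + node.name;
    }

    public static void main(String[] args) {
        Node root = new Node("Root", null);
        Node child = new Node("Child", root);
        System.out.println(qualifiedName(child)); // Root.Child
    }
}
```

Qualifying names this way is what avoids collisions when two elements in different subtrees share a simple name.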

Cross-References from XMI to Xtext

As opposed to names in Xtext, EMF uses URIs to refer to elements. In XMI, the standard serialization format for EMF models, a cross-reference becomes an href with the URI of the referred element. The URI consists of the URI of the resource the element is defined in, followed by the fragment, a resource-unique string identifier of the element, e.g.

href="MyXtextTree.xtree#//@children.1"

The XtextResource delivers such URIs for all contained elements by default. These URIs are picked up by referring XMI resources, so it seems like this works out-of-the-box. But the problems begin as soon as you start modifying the referred Xtext resource.

The default algorithm for computing the fragments uses an XPath-like expression, navigating the containment features from the root element by name (children) and index (1). This approach delivers unique fragments for all elements in a resource without relying on a name or a unique ID attribute. The disadvantage is that it assumes that the position of an element in the content tree is fixed. When we switch the order of Bar and Baz in the example, their path fragments would be switched as well, screwing up existing URI references to them.
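The fragility of index-based fragments can be demonstrated in a few lines — a simplified model using plain strings instead of EObjects:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Demonstrates why index-based fragments like //@children.1 are fragile:
// reordering siblings silently changes the fragment of each element,
// so hrefs computed before the reordering point at the wrong element.
public class PathFragmentDemo {
    // compute an index-based fragment for a child of the root element
    static String fragment(List<String> children, String name) {
        return "//@children." + children.indexOf(name);
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>(List.of("Bar", "Baz"));
        System.out.println(fragment(children, "Baz")); // //@children.1
        Collections.swap(children, 0, 1);              // reorder Bar and Baz
        System.out.println(fragment(children, "Baz")); // //@children.0
        // an href stored as //@children.1 now resolves to Bar, not Baz
    }
}
```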

If you want the same linking semantics as in Xtext, the fragment should encode the fully qualified name of the element. Xtext allows you to customise that by implementing your own IFragmentProvider. In the example, we have added our own XtreeFragmentProvider and bound it in the XtreeRuntimeModule.

Additionally we might want to include the element’s EClass, because elements of different types in the same model could have the same name. Then again, there can be multiple EClasses in different EPackages with the same name, so a completely generic solution for the fragment would be a fully qualified EClass name followed by a fully qualified element name, that is

href="[resource URI]#[EPackage nsURI][separator][EClass name][separator][EObject FQN]"

e.g.

href="./MyXtextTree.xtree#http://www.typefox.io/xtextxmi/tree+Node+Foo.Bar"

This is a lot of information to be packed into a string. We also must make sure we don’t break encoding rules. So it may be better to go for a less general, domain-specific solution, as we did in the example.
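A minimal sketch of such a composite fragment scheme might look as follows — hypothetical helper methods, assuming '+' as the separator and names that don’t themselves contain it (which is exactly the encoding concern mentioned above):

```java
// Sketch of the generic fragment scheme discussed above:
// nsURI + separator + EClass name + separator + qualified element name.
// Names containing the separator would need escaping, which is one
// reason a simpler domain-specific scheme is often preferable.
public class FragmentSchemeDemo {
    static final char SEP = '+';

    static String encode(String nsURI, String eClassName, String elementFqn) {
        return nsURI + SEP + eClassName + SEP + elementFqn;
    }

    static String[] decode(String fragment) {
        int first = fragment.indexOf(SEP);
        int second = fragment.indexOf(SEP, first + 1);
        return new String[] {
            fragment.substring(0, first),           // EPackage nsURI
            fragment.substring(first + 1, second),  // EClass name
            fragment.substring(second + 1) };       // EObject FQN
    }

    public static void main(String[] args) {
        String f = encode("http://www.typefox.io/xtextxmi/tree", "Node", "Foo.Bar");
        System.out.println(f);
        System.out.println(String.join(" | ", decode(f)));
    }
}
```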

Another problem is the resource part of the URI: In Xtext, a referrer does not care in which resource the cross-referenced element is. Moving an element to a different Xtext resource would break all URI-based links to it, while the name-based links stay intact. A possible solution would be to implement a move refactoring for Xtext elements. That is beyond the scope of this article.

A Word on Rename Refactoring

In URI-based linking, renaming an element will not change a cross reference, as long as the fragments don’t involve the name. With our approach, the links to the Xtext resource are susceptible to such renames.

The good news is that by registering the tree language to Xtext as we did above, cross-references to Xtext elements will be automatically updated when the user triggers a rename refactoring on them.

The bad news is that the links from Xtext to XMI will break when an XMI element is renamed, let’s say in the EMF tree editor. It is up to the implementor of the editor to trigger a rename refactoring for referring Xtext resources on such user actions if that is the intended behavior. Luckily, broken links from Xtext to XMI will just be marked as errors and can be easily fixed by hand.
