Writing an importer: Error handling considerations

What can possibly go wrong when importing a JSON document? Several problems come to mind:

The input file can be unparseable as JSON. For example, the user has mistakenly chosen a .zip file instead of a .json file.
A valid JSON file can have unexpected structure. For example, the user has chosen a .json file with a different, unrelated data set.
The data set is correct but contains incorrect values. For example, a reference mentions an object that does not exist, a field value is out of expected range, or some references form an unexpected cycle.
The file could be fully valid but unexpectedly large.
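
The third kind of problem (incorrect values) can be checked on the parsed data before any model is touched. Below is a minimal sketch in plain Java, not the importer's actual code. It assumes the parsed document has already been reduced to a map from object id to the ids that object references; the ReferenceChecks class and its method names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the importer from this series.
final class ReferenceChecks {

  /** Lists every reference whose target id is not a key of the map, as "source -> target". */
  static List<String> findDanglingReferences(Map<String, List<String>> referencesById) {
    List<String> problems = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : referencesById.entrySet()) {
      for (String target : entry.getValue()) {
        if (!referencesById.containsKey(target)) {
          problems.add(entry.getKey() + " -> " + target);
        }
      }
    }
    return problems;
  }

  /** Returns true if following references from some object leads back to an object on the same path. */
  static boolean hasCycle(Map<String, List<String>> referencesById) {
    Set<String> onPath = new HashSet<>();
    Set<String> finished = new HashSet<>();
    for (String id : referencesById.keySet()) {
      if (visit(id, referencesById, onPath, finished)) {
        return true;
      }
    }
    return false;
  }

  // Depth-first search; ids without an entry in the map are treated as having no references.
  private static boolean visit(String id, Map<String, List<String>> refs,
                               Set<String> onPath, Set<String> finished) {
    if (finished.contains(id)) {
      return false; // already fully explored, no cycle through this id
    }
    if (!onPath.add(id)) {
      return true; // id is already on the current path: cycle found
    }
    for (String target : refs.getOrDefault(id, List.of())) {
      if (visit(target, refs, onPath, finished)) {
        return true;
      }
    }
    onPath.remove(id);
    finished.add(id);
    return false;
  }
}
```

Running checks like these before creating any nodes means a bad file is rejected early, with a message that names the offending reference.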

We often want to avoid leaving the model in an inconsistent state after a failed import. While we can sometimes assume that users are using version control and know how to roll back unintended changes, it is nicer if they don’t have to. Our current importer logic adds the objects straight into the target model as we go through the parsed structure, and as a result it is prone to this issue.

We can fix this by importing objects into a temporary model first, and only moving them to the target model once the full import is complete. While in theory the move could partially fail as well, in practice this is unlikely as it all happens in memory with data that has passed validation.
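
Independently of MPS, the idea can be sketched in plain Java. This is only an illustration under simplifying assumptions: ordinary lists stand in for models, and StagedImport, importInto and importAtomically are hypothetical names, not part of the importer or of any MPS API.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the idea; the lists play the role of models.
final class StagedImport {

  /** Hypothetical import step: copies records and fails on the first blank one. */
  static void importInto(List<String> records, List<String> destination) {
    for (String record : records) {
      if (record.isBlank()) {
        throw new IllegalArgumentException("blank record");
      }
      destination.add(record.trim());
    }
  }

  /** Imports into a scratch list first and touches the target only after the whole import succeeded. */
  static void importAtomically(List<String> records, List<String> target) {
    List<String> staging = new ArrayList<>();
    importInto(records, staging); // may throw; the target is still untouched
    target.addAll(staging);       // reached only if nothing was thrown
  }

  public static void main(String[] args) {
    List<String> target = new ArrayList<>(List.of("existing"));
    try {
      importAtomically(List.of("a", " ", "c"), target);
    } catch (IllegalArgumentException e) {
      System.out.println("import failed: " + e.getMessage());
    }
    System.out.println(target); // prints [existing]: the partial import left no trace
  }
}
```

Had importInto written directly into the target, the failed run would have left "a" behind, which is exactly the inconsistent state we want to avoid.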

To create a temporary model we can use the jetbrains.mps.smodel.tempmodel.TemporaryModels singleton:

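TemporaryModels temporaryModels = TemporaryModels.getInstance();
model tempModel = temporaryModels.createReadOnly(TempModuleOptions.forDefaultModule());
try {
  // Import into the temporary model; if this throws, the target model is untouched.
  importDataIntoModel(sourceFile, tempModel);

  // The import succeeded: move every imported root to the target model.
  foreach root in tempModel.roots(<all>) {
    targetModel.add root(root);
  }
} finally {
  // Dispose of the temporary model whether or not the import succeeded.
  temporaryModels.dispose(tempModel);
}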

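I have moved the original code into an importDataIntoModel method, which we now call to import the data into the temporary model. Once the import succeeds, we add all roots to the target model. Adding a root to another model removes it from its original model. If the import throws, the loop is never reached, so the target model stays untouched, and the finally block disposes of the temporary model in either case.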

When creating a temporary model we have to choose whether it will be read-only or editable. This only determines whether the user can edit the model through an editor; code can modify the model either way. We never show the temporary model to the user and don’t intend it to be user-modifiable, so we choose the read-only option.

In the next post we will make it possible to update already existing data via import, rather than adding new objects on each import. In the meantime, the code is on GitHub.