Writing a code formatter is hard.

Bob Nystrom wrote about his experience creating the Dart formatter in a post titled "The Hardest Program I’ve Ever Written".

What unique challenges exist when trying to format Wolfram Language code?

Very long tokens

Wolfram Language has very long tokens in practice. How should they be split? Insert line continuations? Just let them overflow?
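As a rough illustration of the line-continuation option, here is a minimal Python sketch. The function names are made up for this example and are not part of any real formatter. It assumes the token contains no escape sequences that a break could land inside, and it never indents the continued lines, so the pieces glue back together unchanged.

```python
def split_with_continuations(token: str, width: int) -> list[str]:
    """Split `token` into physical lines of at most `width` characters.

    Every line except the last ends with a backslash (a line continuation).
    Continued lines get no indentation, so the pieces glue back together
    unchanged. Assumption: the token has no escape sequences that a break
    could land inside.
    """
    if width < 2:
        raise ValueError("width must fit one character plus the backslash")
    if len(token) <= width:
        return [token]
    chunk = width - 1  # reserve one column for the trailing backslash
    lines = []
    rest = token
    while len(rest) > width:
        lines.append(rest[:chunk] + "\\")
        rest = rest[chunk:]
    lines.append(rest)
    return lines


def rejoin(lines: list[str]) -> str:
    """Inverse of the split: drop each trailing backslash and concatenate."""
    return "".join(line[:-1] for line in lines[:-1]) + lines[-1]


big_integer = "1234567890" * 12  # a 120-digit integer literal
pieces = split_with_continuations(big_integer, 40)
```

Letting the token overflow instead costs nothing in correctness, only in line width, which is why it stays on the list of options.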

Comments

Wolfram Language has multiline comments. It is semi-OK to reformat them: it is OK to indent a comment and all of its contents, but the relative positioning of comments, i.e., above or below other code, needs to be preserved.

Strings

  • Implicit newlines vs. explicit newlines

  • Cannot reformat (would change semantics)

  • Cannot even change where the string starts: there may be relative formatting between lines in the string that needs to be preserved

  • OK to insert line continuations, but the next line cannot begin with any whitespace.

Strings with newlines cannot have ANY indentation!

With strings, line continuations do not eat the whitespace that follows them.

So the formatter must do surgery and undo any indentation on those lines.
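One way to picture the surgery is a pass that marks every line touching a multi-line string as frozen and indents only the rest. This Python sketch is hypothetical: the function names are invented for illustration. It assumes strings are delimited by double quotes, that a backslash inside a string escapes the next character, and it ignores comments entirely.

```python
def lines_inside_multiline_strings(lines: list[str]) -> list[bool]:
    """Flag each line that begins or ends inside a string literal.

    Simplifying assumptions: strings are delimited by double quotes, a
    backslash inside a string escapes the next character, and comments
    are not recognized at all.
    """
    flags = []
    in_string = False
    for line in lines:
        started_inside = in_string
        i = 0
        while i < len(line):
            ch = line[i]
            if in_string and ch == "\\":
                i += 2  # skip the escaped character
                continue
            if ch == '"':
                in_string = not in_string
            i += 1
        # Frozen if the line continues a string or opens one that runs on.
        flags.append(started_inside or in_string)
    return flags


def indent_safely(lines: list[str], indent: str) -> list[str]:
    """Indent only the lines that do not touch a multi-line string."""
    flags = lines_inside_multiline_strings(lines)
    return [line if frozen else indent + line
            for line, frozen in zip(lines, flags)]


source = [
    "f[x_] :=",
    'Print["first',
    '  second"];',
    "g[x]",
]
result = indent_safely(source, "    ")
```

The line that opens the string is frozen as well as the line that closes it, since moving the start would change its position relative to the lines that follow.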

Implicit Times

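These two inputs are not the same, because outside of strings a line continuation also eats the whitespace that follows it, so the space that stood for multiplication in the first is lost in the second: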

a b

vs.

a\
 b
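A toy Python model of the difference, not the real parser. It assumes, based on the notes above rather than on any specification, that outside strings a line continuation swallows the whitespace after it, while inside strings that whitespace is kept. The function name is invented for this sketch.

```python
import re


def glue_continuation(text: str, inside_string: bool) -> str:
    """Toy model of how a backslash-newline is removed from input.

    Assumption (taken from the notes, not from a specification): outside
    strings the whitespace after the continuation is swallowed too;
    inside strings it is kept.
    """
    if inside_string:
        return re.sub(r"\\\n", "", text)
    return re.sub(r"\\\n[ \t]*", "", text)


code = "a\\\n b"  # the two-line input: a, backslash, newline, space, b
print(glue_continuation(code, inside_string=False))  # ab
print(glue_continuation(code, inside_string=True))   # a b
```

Under this model the two-line form collapses to the single symbol ab, so a formatter cannot break implicit multiplication with a line continuation and an indent.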

How to format Nest[f, x, 100]?

Here is the easiest question to tackle: how should the output of the following be formatted?

Nest[f, x, 100]
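Some quick arithmetic shows why even this case is awkward. The Python sketch below builds the text of the expression and a deliberately naive layout with one nesting level per line. That layout is an illustration only, not what any of the tools discussed here actually do.

```python
def nest(head: str, arg: str, depth: int) -> str:
    """Text of `head` applied `depth` times to `arg`, e.g. f[f[x]] for depth 2."""
    return (head + "[") * depth + arg + "]" * depth


def one_level_per_line(head: str, arg: str, depth: int) -> list[str]:
    """Naive layout: each nesting level on its own line, one space per level."""
    opening = [" " * d + head + "[" for d in range(depth)]
    closing = [" " * d + "]" for d in reversed(range(depth))]
    return opening + [" " * depth + arg] + closing


flat = nest("f", "x", 100)
layout = one_level_per_line("f", "x", 100)
print(len(flat))                          # 301
print(len(layout))                        # 201
print(max(len(line) for line in layout))  # 101
```

On a single line the expression is 301 characters wide, far past a page width of 80. One level per line takes 201 lines and still runs past column 80, so some compromise between the two is unavoidable.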

Here is what the Wolfram Language kernel does:

In[1]:= ToString[Nest[f, x, 100], InputForm, PageWidth -> 80]

Out[1]=
"f[
 f[
  f[
   f[
    f[
     f[
      f[
       f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[
                          f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[
                          f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[
                          f[f[f[f[f[f[f[f[
                          x]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]\\
]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]"

And here is what the Wolfram Front End does:

f[f[f[f[f[
     f[f[
       f[
        f[
         f[
          f[
           f[
            f[
             f[
              f[
               f[
                f[
                 f[
                  f[
                   f[
                    f[
                    f[f[
                    f[f[f[f[
                    f[f[f[f[f[f[f[f[
                    f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[
                    f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[
                    f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[f[
                    x]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]\
]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]

And the Wolfram Workbench does not do anything.
