Issue

I was somewhat surprised to observe that the following code

# comment  say 1;
# comment  say 2;
# comment say 3;
# comment say 4;

prints 1, 2, 3, and 4.

Here are the relevant characters after "# comment":

say "\x[2029]\x[2028]\x[0B]\x[0C]".uninames.raku;
# OUTPUT: «("PARAGRAPH SEPARATOR", "LINE SEPARATOR", "<control-000B>", "<control-000C>").Seq»

Note that many/all of these characters are invisible in most fonts. At least with my editor, none cause the following text to be printed on a new line. And at least one (<control-000C>, aka Form Feed, sometimes printed as ^L) is in fairly wide use in Vim/Emacs as a section separator.
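
Since these characters are invisible and tend to get lost when the code is copied, here is a way to reproduce the behaviour with explicit escapes. This is only a sketch: it assumes a reasonably recent Rakudo, builds each source line as a string, and compiles it with EVAL. The variable names are mine, not part of the original snippet.

```raku
use MONKEY-SEE-NO-EVAL;

# The four characters reported by .uninames above, written as escapes.
my @separators = "\x[2029]", "\x[2028]", "\x[0B]", "\x[0C]";

for @separators.kv -> $i, $sep {
    # Each generated source string is: a comment, one separator, then a statement.
    my $source = "# comment" ~ $sep ~ "say " ~ ($i + 1) ~ ";";
    EVAL $source;
}
```

If the separator really does end the comment, the say after it runs; if it did not, the whole string would be a comment and nothing would be printed for that line.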

This raises a few questions:

1. Is this intentional, or a bug?
2. If intentional, what's the use-case (other than winning obfuscated code contests)?
3. Is it just these 4 characters, or are there others? (I found these because they share the mandatory break Unicode property. Does that property, or some other Unicode property, govern what Raku considers a newline?)
4. Just, really, wow.

(I realize #4 is not technically a question, but I feel it needed to be said).

Solution

Raku's syntax is defined as a Raku grammar. The rule for parsing such a comment is:

token comment:sym<#> {
   '#' {} \N*
}

That is, it eats everything after the # that is not a newline character. As with all built-in character classes in Raku, \n and its negation \N are Unicode-aware. The language design docs state:

\n matches a logical (platform independent) newline, not just \x0a. See TR18 section 1.6 for a list of logical newlines.

Which is a reference to the Unicode standard for regular expressions.
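
To check which characters the regex engine treats as a logical newline, each candidate can be matched against \n directly. The following is a rough sketch rather than an authoritative list: the candidates are the single-character line boundaries I remember from TR18 section 1.6, plus a plain space as a control, and the labels are only for display.

```raku
# Candidate characters, written as escapes so they stay visible.
my @candidates =
    'LINE FEED'           => "\x[0A]",
    'CARRIAGE RETURN'     => "\x[0D]",
    'LINE TABULATION'     => "\x[0B]",
    'FORM FEED'           => "\x[0C]",
    'NEXT LINE'           => "\x[85]",
    'LINE SEPARATOR'      => "\x[2028]",
    'PARAGRAPH SEPARATOR' => "\x[2029]",
    'SPACE (control)'     => "\x[20]";

for @candidates -> $pair {
    # True when the whole string is one logical newline for the regex engine.
    my $is-newline = so $pair.value ~~ /^ \n $/;
    say $pair.key, ': ', $is-newline;
}
```

Any character reported as True here would also stop \N* in the comment token above, and so end a # comment.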

I somewhat doubt there was ever a specific language design discussion along the lines of "let's enable all the kinds of newlines in Unicode, it'll be cool!" Rather, the decisions were that Raku should follow the Unicode regex technical report, and that Raku syntax would be defined in terms of a Raku grammar and thus make use of the Unicode-aware character classes. That a range of different newline characters are supported is a consequence of consistently following those principles.

Answered By - Jonathan Worthington

Answer Checked By - Timothy Miller (PHPFixing Admin)

Wednesday, June 29, 2022

[FIXED] What counts as a newline for Raku source files?
