Individual rules exist for the following scopes:

| Definition | Indented? | Quoted? | Scope | Notes |
|---|---|---|---|---|
| <<"HTML" | No | Double | text.html.embedded.perl | - |
| <<"XML" | No | Double | text.xml.embedded.perl | - |
| <<"CSS" | No | Double | text.css.embedded.perl | - |
| <<"JAVASCRIPT" | No | Double | text.js.embedded.perl | - |
| <<"SQL" | No | Double | source.sql.embedded.perl | - |
| <<"POSTSCRIPT" | No | Double | text.postscript.embedded.perl | - |
| <<"OTHER" | No | Double | string.unquoted.heredoc.doublequote.perl | This rule assigns $self to the incorrect capture. |
| <<'HTML' | No | Single | text.html.embedded.perl | - |
| <<'XML' | No | Single | text.xml.embedded.perl | - |
| <<'CSS' | No | Single | text.css.embedded.perl | - |
| <<'JAVASCRIPT' | No | Single | text.js.embedded.perl | - |
| <<'SQL' | No | Single | source.sql.embedded.perl | - |
| <<'POSTSCRIPT' | No | Single | text.postscript.embedded.perl | - |
| <<'OTHER' | No | Single | string.unquoted.heredoc.quote.perl | This rule assigns $self to the incorrect capture. |
| <<\::: | No | Single | string.unquoted.heredoc.quote.perl | This rule assigns $self to the incorrect capture. |
| <<`OTHER` | No | Backticks | string.unquoted.heredoc.backtick.perl | This rule assigns $self to the incorrect capture. |
| <<HTML | No | None | text.html.embedded.perl | - |
| <<XML | No | None | text.xml.embedded.perl | - |
| <<CSS | No | None | text.css.embedded.perl | This rule is not present. |
| <<JAVASCRIPT | No | None | source.js.embedded.perl | This scope uses "source.js" while its siblings use "text.js". Why? |
| <<SQL | No | None | source.sql.embedded.perl | - |
| <<POSTSCRIPT | No | None | source.postscript.embedded.perl | This scope uses "source.postscript" while its siblings use "text.postscript". Why? |
| <<OTHER | No | None | string.unquoted.heredoc.doublequote.perl | - |

- Fix the 4 rules that assign `$self` to the incorrect capture. It should always match the `(.*)` in the first line after the heredoc part (e.g. in `my @vals = (<<HTML, 1);`, `$self` should be interested in the `, 1);` portion of the text).
- Add the missing unquoted embedded CSS case.
- Confirm whether the scope name differences among siblings (i.e. `text`/`source`) are intended or whether one or the other is in error.
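
To make the capture numbering concrete, here is a minimal sketch that runs two of the `begin` patterns quoted in this issue through Python's `re` module. Python is only a stand-in for the grammar's regex engine (an assumption), but both number groups by opening parenthesis. The sample lines use a quoted delimiter because those are the patterns shown here. The helper name `trailing_group` is hypothetical.

```python
import re

# Begin patterns copied verbatim from this issue.
HTML_BEGIN = r'(((<<) *"HTML"))(.*)\n?'
OTHER_BEGIN = r'(((<<) *"([^"]*)"))(.*)\n?'


def trailing_group(pattern: str) -> int:
    """Return the index of the last capture group, which is the (.*) in both patterns."""
    return re.compile(pattern).groups


html = re.search(HTML_BEGIN, 'my @vals = (<<"HTML", 1);\n')
other = re.search(OTHER_BEGIN, 'my @vals = (<<"OTHER", 1);\n')

# In the HTML pattern the rest of the line is capture 4.
print(trailing_group(HTML_BEGIN), repr(html.group(4)))

# In the OTHER pattern capture 4 is the delimiter name and the rest of the line is capture 5.
print(trailing_group(OTHER_BEGIN), repr(other.group(4)), repr(other.group(5)))
```

If the "other" rules attach `$self` to capture 4 the way the HTML rule does, it would land on the delimiter name rather than on the rest of the line. Whether that is what the four flagged rules actually do is an assumption to check against the grammar.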

There are currently 22 rules used to handle heredocs. It would be nice if they could be refactored into fewer.

- There are 3 main types of parsing approaches: double-quoted, single-quoted, and bare. It would be nice if all double-quoted approaches could be unified, all single-quoted approaches could be unified, and all bare approaches could be unified. This would require being able to specify the `contentName` attribute dynamically based on a `begin` capture.
- In order to make the existing rules also work for indented heredocs, instead of duplicating each current rule and making a small tweak, we'd need conditional expressions to work, e.g. `<<(~)?HTML(?(1)\s*)HTML` would allow spaces between the two `HTML`s only if the `~` was present. Despite this being apparently possible, it doesn't seem to work. Can it be done?
- Why do the following approaches diverge?

```
# Double-quoted other (<<"OTHER")
(((<<) *"([^"]*)"))(.*)\n?
# Single-quoted other and craziness
# <<'OTHER'
(((<<) *'([^']*)'))(.*)\n?
# <<\:::
(((<<) *\\((?![=\d\$\( ])[^;,'"`\s\)]*)))(.*)\n?
# Un-quoted other (<<OTHER)
(((<<) *((?![=\d\$\( ])[^;,'"`\s\)]*)))(.*)\n?
```
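
The conditional-expression idea from the refactoring list can be tried out in isolation. The sketch below feeds the `<<(~)?HTML(?(1)\s*)HTML` pattern to Python's `re`, which supports `(?(1)...)` conditionals. It only illustrates the intended regex semantics. It says nothing about whether the engine behind TextMate grammars accepts the same construct, and the helper name `matches` is hypothetical.

```python
import re

# Pattern copied from this issue: the \s* is only allowed when group 1 (the "~") took part in the match.
CONDITIONAL = re.compile(r'<<(~)?HTML(?(1)\s*)HTML')


def matches(text: str) -> bool:
    """Return True when the whole string matches the conditional pattern."""
    return CONDITIONAL.fullmatch(text) is not None


samples = ['<<HTMLHTML', '<<~HTMLHTML', '<<~HTML   HTML', '<<HTML   HTML']
results = {sample: matches(sample) for sample in samples}

for sample, ok in results.items():
    print(f'{sample!r}: {ok}')
```

Only the last sample is rejected: without the `~`, the conditional contributes nothing, so the spaces between the two `HTML`s cannot be consumed.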
- In order to correctly identify an indented heredoc, we should be checking that the whitespace portion of the end terminator is matched exactly at the beginning of each line within the indented heredoc, and if not, then we should not match it. For example:

```perl
my @sql_and_bind = (<<~SQL, $id);
    SELECT a, b, c
    FROM the_table
    WHERE id = ?
    SQL
```

is a valid indented heredoc, but the following is not:

```perl
my @sql_and_bind = (<<~SQL, $id);
    SELECT a, b, c
    FROM the_table
  WHERE id = ?
    SQL
```

and neither is:

```perl
my @sql_and_bind = (<<~SQL, $id);
	SELECT a, b, c
	FROM the_table
        WHERE id = ?
	SQL
```

Why is this last case not valid? Because the line for the `WHERE` clause uses eight spaces to indent, while the end terminator uses a tab. Yeah, Perl is that picky. Is this level of parsing possible with TextMate grammars?
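
As a reference for what a grammar would have to replicate, here is a minimal sketch of the check described above, written in Python rather than as a grammar rule. It implements only the rule as stated in this issue: every body line must start with exactly the whitespace that precedes the end terminator, with tabs and spaces compared character by character. Any special treatment Perl may give to empty lines is not modelled, and both function names are hypothetical.

```python
def terminator_indent(terminator_line: str) -> str:
    """Return the leading spaces/tabs of the terminator line, without expanding tabs."""
    stripped = terminator_line.lstrip(" \t")
    return terminator_line[: len(terminator_line) - len(stripped)]


def is_consistently_indented(body_lines, terminator_line) -> bool:
    """True when every body line begins with the terminator's exact whitespace prefix."""
    indent = terminator_indent(terminator_line)
    return all(line.startswith(indent) for line in body_lines)


# All lines share the terminator's four-space indent.
valid = is_consistently_indented(
    ["    SELECT a, b, c", "    FROM the_table", "    WHERE id = ?"], "    SQL"
)

# One line is indented less than the terminator.
too_short = is_consistently_indented(
    ["    SELECT a, b, c", "    FROM the_table", "  WHERE id = ?"], "    SQL"
)

# Eight spaces on one line versus a tab on the terminator.
mixed = is_consistently_indented(
    ["\tSELECT a, b, c", "\tFROM the_table", "        WHERE id = ?"], "\tSQL"
)

print(valid, too_short, mixed)
```

The comparison needs the terminator's indentation before the body can be validated, which is the part that is hard to express when the `end` pattern is only matched after the body has already been scanned.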

```xml
<dict>
  <key>begin</key>
  <string>(((<<) *"HTML"))(.*)\n?</string>
  <key>captures</key>
  <dict>
    <key>1</key>
    <dict>
      <key>name</key>
      <string>punctuation.definition.string.perl</string>
    </dict>
    <key>2</key>
    <dict>
      <key>name</key>
      <string>string.unquoted.heredoc.doublequote.perl</string>
    </dict>
    <key>3</key>
    <dict>
      <key>name</key>
      <string>punctuation.definition.heredoc.perl</string>
    </dict>
    <key>4</key>
    <dict>
      <key>patterns</key>
      <array>
        <dict>
          <key>include</key>
          <string>$self</string>
        </dict>
      </array>
    </dict>
  </dict>
  <key>contentName</key>
  <string>text.html.embedded.perl</string>
  <key>end</key>
  <string>(^HTML$)</string>
  <key>patterns</key>
  <array>
    <dict>
      <key>include</key>
      <string>#escaped_char</string>
    </dict>
    <dict>
      <key>include</key>
      <string>#variable</string>
    </dict>
    <dict>
      <key>include</key>
      <string>text.html.basic</string>
    </dict>
  </array>
</dict>
```