Hey folks. As Derek mentioned, we’ve been discussing this quite a bit. This is the kind of decision you can only make once and then you’re stuck supporting it forever (stupid in_group() …). So there’s a lot of pressure to get this right. It needs to be memorable and also future proof for potential use outside of conditionals.
Here’s the plan as it stands right now. It’s not gospel until it is released, so critical feedback is definitely appreciated:
We will be adding the following operators for strings, borrowing syntax from the CSS 3 attribute selectors. Template authors across the board are all familiar with CSS, so borrowing from CSS makes the syntax easier to pick up:
"ExpressionEngine" ^= "Exp" // TRUE, read "ExpressionEngine" begins with "Exp"
"ExpressionEngine" $= "ine" // TRUE, read "ExpressionEngine" ends with "ine"
"ExpressionEngine" *= "ionEn" // TRUE, read "ExpressionEngine" contains "ionEn"
variable *= "mary" // TRUE if the variable contains "mary"
While that’s not a solution to the original problem, it sets us up with some new language to tackle it.
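For anyone who wants to play with the semantics outside of a template, here is a minimal sketch of the three string operators in Python. The function name apply_string_operator is made up for illustration and is not part of ExpressionEngine; it only mirrors the examples above.

```python
# Hypothetical helper (not an ExpressionEngine API): models the proposed
# CSS-style operators on plain strings.
def apply_string_operator(left: str, op: str, right: str) -> bool:
    if op == "^=":   # begins with
        return left.startswith(right)
    if op == "$=":   # ends with
        return left.endswith(right)
    if op == "*=":   # contains
        return right in left
    raise ValueError("unknown operator: " + op)


if __name__ == "__main__":
    print(apply_string_operator("ExpressionEngine", "^=", "Exp"))
    print(apply_string_operator("ExpressionEngine", "$=", "ine"))
    print(apply_string_operator("ExpressionEngine", "*=", "ionEn"))
```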
One issue we had with the current piped list syntax is that it can be a little ambiguous when used on strings with pipes (admittedly rare). It also requires a conversion on the programmer’s side when dealing with prep_conditionals (you have to implode the array). And then when parsing it, you don’t know if you’re actually dealing with a set or someone just happens to have some pipes in their string. This makes writing type-independent operators difficult.
So the new official set/list syntax in conditionals will be this:
("list", 5, TRUE)
Operators will be type independent, so they will all work as you might expect (where applicable):
("a", "b", "cd") ^= "a" // TRUE, list begins with a
("a", "b", "cd") $= "cd" // TRUE, list ends with cd
("a", "b", "cd") $= "d" // FALSE, list ends with "cd" not "d", would be true on a string
("a", "b", "cd") *= "b" // TRUE, list contains b
("a", "b") == ("b", "a") // TRUE, unlike php it's about the items not their order
That last one brings up an interesting question:
("a", "b", "c") ^= ("a", "b") // clearly true
("a", "b", "c") ^= ("b", "a") // true-ish?
When you pass an array to prep_conditionals it will automatically be treated as such a list:
{if listofnames ^= name} // listofnames => array("tom", "dick", "harry")
The IN keyword (capitalization optional) will be provided as syntactic sugar as a synonym for *= with flipped operands:
"b" IN ("a", "b", "cd") // b in list
"b" IN "basketball" // b found in string twice. definitely true
This also means that “in” will now be a reserved variable.
Future proofing is important, of course. We imagine that in the future tag parameters might be able to make use of this same syntax:
{exp:addon passmesomedata=(1, 2, 3)} // addon receives an array
We may never be able to drop module support of the piped string syntax, but it would be nice to start moving towards an unambiguous template language, without throwing the baby out with the bathwater and ending up with glorified php.
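To make the type-independent operators and the IN sugar described above concrete, here is a rough Python model. Lists are represented as tuples, all function names are invented for this sketch, and two things are assumptions on my part: equality ignores duplicates as well as order, and the open "list begins with list" question is simply not modeled.

```python
# Hypothetical model of the proposed operators; lists are Python tuples.
# Not ExpressionEngine code. A list on the right-hand side of ^= or $=
# is the open question from the thread and is deliberately not handled.
def begins_with(left, right):
    if isinstance(left, tuple):
        return len(left) > 0 and left[0] == right
    return left.startswith(right)


def ends_with(left, right):
    if isinstance(left, tuple):
        return len(left) > 0 and left[-1] == right
    return left.endswith(right)


def contains(left, right):
    # Tuple membership for lists, substring search for strings.
    return right in left


def equals(left, right):
    if isinstance(left, tuple) and isinstance(right, tuple):
        # Assumption: compare as sets, so order and duplicates are ignored.
        return set(left) == set(right)
    return left == right


def is_in(needle, haystack):
    # IN is *= with the operands flipped.
    return contains(haystack, needle)
```

The one place where strings and lists visibly diverge is ends_with: the list ("a", "b", "cd") does not end with "d", while the string "abcd" does.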
Thoughts?
Initial thoughts, ok?
I like the CSS-like syntax. It’s a clever idea. But I don’t believe that allowing string-like operators on arrays is a good one. It can be confusing, especially for non-programmers. Mixing the two looks like it could bring us more harm than good. There are some EECMS developers who can’t even understand the logic of switchee.
variable *= "mary" // good
("a", "b", "cd") *= "b" // bad
The whole array thing is an important point. When I started on EECMS, I had to teach myself to think of the pipe-delimited list of numbers as an array. Even in custom add-ons, when I need an array, I write the code to accept “1|2|3” as a parameter and turn it into “1,2,3”, just to keep the standard. The parentheses can also make complex conditionals hard to debug.
{if (var IN (2,3,4) OR var IN (10,11,12)) AND var != 4}
So maybe it would be even better to not use the parentheses at all:
"it" IN "keep it simple" // true
"it" IN "keep|it|simple" // true
"IT" IN "keep|it|simple" // false
1 IN 1|2|3 // true
1 IN 10|11|12 // false
Keep just strings and numbers available; not arrays.
But I’m not sure what to do in these cases:
01 IN 1|2|3 // true
"01" IN 1|2|3 // false
Thanks for the attention.
FWIW I think the syntax/functionality should follow the search parameter syntax already in use for channel:entries.
Designers are already familiar with
{exp:channel:entries search:body="pickles|shoes"}
I think ‘in’ is the most suitable, to become something like:
{if in:name="peter|joe"}...
{if in:group_id="2|3"}...
{if in:group_id="not 2|3"}...
And so forth.
Regarding “a|b|c”: The problem with the single pipe delimited string is that it will inevitably end in someone doing this:
{if name IN "tom|{tag}|harry"}
// or even:
{if name IN "{tag}"}
It’s the same problem as doing “{tag}” in conditionals right now. We have to defer evaluating the condition until the tag disappears, or we hit the end of the parsing stage. It’s slow and it all falls apart when the tag has quotes or pipes. Worst of all, when doing it with user input it can let them execute arbitrary template code.
At the very least "a|b|c" would need to become "a"|"b"|"c".
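A small Python sketch of why the embedded tag is a problem. This is not how the EE parser works internally; naive_in is a made-up function that just does the substitution first and the pipe split second, which is the order that causes trouble.

```python
# Hypothetical illustration, not EE code: substitute the tag into the piped
# string first, then split on pipes.
def naive_in(name: str, piped_template: str, tag_value: str) -> bool:
    expanded = piped_template.replace("{tag}", tag_value)
    return name in expanded.split("|")


if __name__ == "__main__":
    # Intended use: three candidates, the middle one coming from {tag}.
    print(naive_in("dick", "tom|{tag}|harry", "dick"))
    # A tag value that itself contains a pipe adds a fourth candidate
    # that the template author never wrote.
    print(naive_in("admin", "tom|{tag}|harry", "guest|admin"))
```

Both calls return True. The second one is the worrying case: the data changed the number of items in the list.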
@Robson: Good points all around. For what it’s worth, the pipes vs comma vs parenthesis vs no parenthesis is all relatively flexible. Any of these are achievable:
{if name in ("foo", 5, var, TRUE)}
{if name in "foo",5,var,TRUE}
{if name in "foo"|5|var|TRUE}
{if name in ("foo" | 5 | var | TRUE)}
The parentheses group things a little more obviously in my opinion, but technically aren’t necessary to parse the expression.
@Iain: Aside from the above mentioned issue with piped strings, that’s a very strange syntax for a conditional. It looks almost like an assignment when written inside a boolean expression:
{if title == "Introduction" && in:group_id = "2|3" OR size == 29 && grid:field == "bob"}
That said, I don’t disagree with trying to keep things consistent all around. Just not sure if the search parameter syntax is really all that intuitive to begin with.
@Pascal, when I spoke about the parentheses I was worried about you: harder for you to code, easier for us to find bugs. On the same principle, I… I’m sorry to say this. It looks like I’m overstepping Derek; I’m sorry if I am. I think you shouldn’t worry about:
{if name IN "tom|{tag}|harry"}
// or even:
{if name IN "{tag}"}
This is new functionality. There isn’t legacy code to support. Both Joel and Low will have to update their add-ons and instruct their customers to update their templates in any case. I don’t know if you used my suggestion of firing an alert in the developer log on these conditionals, but in this case you can put a warning right in the docs.
“Don’t try this because… etc, etc, etc…
Instead of: {if name IN "tom|{tag}|harry"}
use: {if name IN "tom|harry" OR name == tag}”
This can be a first step toward educating developers on writing conditionals “the recommended way”, and toward improving the Template Parser so add-on developers don’t need to create their own.
@Iain, like Pascal, I don’t like the search-parameter-style conditional either. In my mind parameters != operators:
{exp:foo param:eter="value"}
{if variable operator "value"}
What I like most about using IN is that it makes a sentence, and that is far easier to understand. Again, keeping the current syntax of values.
Thanks for reading, guys.
Pascal, Derek, we’re doing an amazing job! Thanks!
P.S.: I can’t wait to see the reactions to a blog post about the CSS like conditionals. 😊
Derek, Pascal. As far as I can tell, there’s a couple of things EllisLab have to take into account:
That said, I’m OK with the css/jquery-like operators for string comparisons. I can see that working and the use for it. Taking it one step further, you could consider using the ~= operator like the IN operator.
However, the array notation you suggested, I don’t like that much. I would much prefer the pipe-separated syntax, because that would be most consistent with parameter values, which are already familiar to devs.
It’s also not entirely clear to me how that would translate to variable usage in templates. For example, it happens a lot that entry IDs are put in a pipe-separated list to be used in an entry_id parameter, like with an embed. So, say that {embed:entry_ids} is that pipe-separated variable. How would you then use it? This would be nicest:
{if 5 IN embed:entry_ids}
// Or even:
{if embed:entry_ids ~= 5}
The operator would tell the parser that the variable should be considered a pipe-separated list. And you can use the same variable in the parameter value as in the conditional. Consistent.
Now, if you wanted to hard-code the pipe-separated list, these syntaxes could work:
{if 5 IN "1|2|3"} … {/if}
{if var IN "foo|bar|baz"} … {/if}
// Or perhaps
{if "1|2|3" ~= "5"} … {/if}
Again, consistency is the key here. You can set “1|2|3” as a parameter value, so why not use it in a conditional? The worry here is escaping variables, right? So you’d need to avoid this:
{if var IN "foo|{bar}|baz"} … {/if}
Just a thought, but maybe you can use some sort of alternative syntax for variables inside conditionals like that. Still identify variables in conditionals, but don’t use the “output” syntax like {bar}, for example:
{if var IN "foo|$bar|baz"} … {/if}
Here, the $ would tell EE that ‘bar’ is a variable that needs to be escaped before being evaluated. This goes against my “consistency” mantra, but it’s just a thought. Which brings me to…
…why is escaping the vars important in the first place? Is it only to avoid PHP errors? Because those will happen with a malformed conditional now anyway, right?
As a final note for now, as we’re in the dev preview, maybe be careful to not bite off more than you can chew. I can imagine adding all these css-like conditionals in one go can be a lot of work that needs to be tested thoroughly.
Thanks Low!
As a final note for now, as we’re in the dev preview, maybe be careful to not bite off more than you can chew.
While I agree, if we ship with a built in solution for list-like things, then we need to take the time to make sure we get it right and we’re not just hotfixing.
- designer-friendly syntax
- escaping variable values
- future-proofing
Spot on. I’m not here to shove my syntax down everyone’s throat, but there are a lot more cases to cover than just the most common piped IDs. These include cases that are silly, or ones that we haven’t come up with yet. So I’m here above all to offer the perspective of building something that is robust for all use cases.
…why is escaping the vars important in the first place? Is it only to avoid PHP errors? Because those will happen with a malformed conditional now anyway, right?
They do cause PHP errors in 2.8; in 2.9 they will just throw a template error. The first build of 2.9 still used eval as a last step; in the next build that will all be done by hand, so eval errors should be out of the question. This ability for PHP to be put into invalid states is at the root of the security issue that triggered this rewrite in the first place. The template language should not interpret things ambiguously.
The very core of the encoding issue is that on the php side parsing and conditionals don’t happen in one go. If everyone used ee()->tmpl->parse_variables(), then we could look for {tag} in strings and intelligently do the replacement because we can call prep_conditionals before the replacement and have all the information. However, more frequently this happens:
// template, with the "The" and "Co." bits to show a legitimate use of quoted tags before 2.9
{if "The Jim Patterson Co." == 'The {business} Co.'}out{/if}
// php
$value = "Bob's Painting";
$tagdata = str_replace('{business}', $value, $tagdata);
ee()->functions->prep_conditionals($tagdata, array('business' => $value));
That order is backwards and thus doesn’t allow for intelligent optimization. We never get a chance to see {business}; the conditional is already broken by the time we get to it. Once the string is misquoted there is nothing we can do; the code can’t guess at meanings like a human. The only alternative for this case is to dynamically unquote tags we see before allowing modules to parse anything. So essentially we would turn it into this:
{if "The Jim Patterson Co." == 'The '.business.' Co.'}out{/if}
The problem with that is that people tend to use the quoted syntax in the simple case (foo == “{tag}”) when the unquoted version isn’t working. And the main reason for the unquoted version not working is that prep_conditionals isn’t called. So if we unwrap something then we may never see a value for it.
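To see the misquoting from the “Bob's Painting” example in isolation, here is a tiny Python sketch. The quote-counting check is a deliberately crude stand-in I made up for illustration; a real conditional parser does far more than count quotes.

```python
# Hypothetical, simplified check: an odd number of single quotes means some
# string literal in the conditional was left open.
def has_balanced_single_quotes(conditional: str) -> bool:
    return conditional.count("'") % 2 == 0


template = "{if \"The Jim Patterson Co.\" == 'The {business} Co.'}out{/if}"
value = "Bob's Painting"

# The add-on replaces the tag BEFORE the conditional is prepared, so the
# apostrophe in the data ends up inside the single-quoted literal.
broken = template.replace("{business}", value)

if __name__ == "__main__":
    print(has_balanced_single_quotes(template))
    print(has_balanced_single_quotes(broken))
    print(broken)
```

Before the replacement the conditional is well formed; afterwards the apostrophe from the data closes the string early, and nothing downstream can tell what was meant.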
Getting back on track with the arrays: Let’s assume that we can solve all of the above and use “4|{tag}|6”. Then with that as an example, I have a few thinking points:
If a developer passes {tag} as “3|6|8”, does that count as additional values or a single value?
I think that depends on the operator used. If you use ==, use it as a literal string. If you use IN, use it as a list (array).
What if they pass tag as array(3, 6, 8)?
Right now, you wouldn’t be able to output such a var in the template anyway. You can’t use {tag} as a variable pair if you pass it an array like that, because the keys are numeric (and {0}, {1}, etc are invalid). So at the moment, I don’t see that happening.
How do I indicate as a developer that the value I’m passing is actually an array and didn’t just happen to have a pipe in it?
Again, this is up to the operator. The developer should be smart enough to know when to use it, and when it can result in errors. I do see your point, tho. In my code, I also allowed for the & to be used as a separator (as that can be used in parameters as well), and some people had that in field values, which led to unexpected results. Asking them to change that into the word “and” solved it. For piped values, you could also use an escaped value, just like you’re using IS_EMPTY already.
Setting base rules, which needn’t cover and handle all possible scenarios, is key here. Devs will need to learn the new syntax anyway, so you’re perfectly fine with explaining strict rules and caveats, IMO. People will find ways to break anything. Don’t try to avoid that, but try to facilitate that better with a good consistent set of rules and meaningful error messages.
If (2) means passing an array to parse_variables, then does that mean I can use it in a tag pair?
Again, only if the array is associative. Otherwise it wouldn’t make sense, at least to me.
What happens when I apply a string operator to a piped string? How do I check equality of two piped-strings? Should order of the piped values matter?
Like I said before, if you use a string operator, the piped string would be considered a literal string. Eg.:
{if "1|2|3" == "2|3|4"} // false
{if "1|2|3" == "3|2|1"} // false
{if "1|2|3" == "1|2|3"} // true
Now comes a tricky part; treating the first part of the comparison as a list.
{if "1" IN "1|2|3"} // true
{if "1" IN "2|3|4"} // false
The above is obvious, of course. Going by my gut feeling, as in what I would expect as a dev building a template, I’d probably go for something like this:
{if "1|2" IN "1|2|3"} // true
{if "1|2" IN "2|3|4"} // true
{if "1|2|3" IN "3|2|1"} // true
{if "1&2&3" IN "3|2|1"} // true
{if "1&2" IN "2|3|4"} // false
This adheres to the parameter syntax for implying OR and AND. For those last two lines, you might also consider using a slightly different operator, similar to XOR, for example:
{if "1|2|3" ALL IN "3|2|1"} // true
{if "1|2" ALL IN "2|3|4"} // false
…but that is where you can get creative. For example, you could turn it around:
{if "1|2" IN "1|2|3"} // true
{if "1|2" IN "2|3|4"} // false
{if "1|2" ANY IN "2|3|4"} // true
On the other hand, you can also choose that the IN syntax will always consider the left hand part of the comparison as a literal string, even if there are pipe chars in there. Yes, it will limit the functionality, but as you set it as a rule, it will always be consistent.
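The "any" versus "all" readings above can be sketched in a few lines of Python. The function names are made up, and treating every piped string as a list of exact string items is an assumption of this sketch, not settled behaviour.

```python
# Hypothetical model of the ANY IN / ALL IN idea on piped strings.
def any_in(left: str, right: str) -> bool:
    haystack = set(right.split("|"))
    return any(item in haystack for item in left.split("|"))


def all_in(left: str, right: str) -> bool:
    haystack = set(right.split("|"))
    return all(item in haystack for item in left.split("|"))


def in_with_param_syntax(left: str, right: str) -> bool:
    # Mirrors the parameter convention: "&" on the left means every item
    # must be present, "|" means any single item is enough.
    if "&" in left:
        haystack = set(right.split("|"))
        return all(item in haystack for item in left.split("&"))
    return any_in(left, right)
```

With this model, "1|2" against "2|3|4" is true under any_in and false under all_in, which is exactly the difference between the two proposals.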
Devs will need to learn the new syntax anyway, so you’re perfectly fine with explaining strict rules and caveats, IMO.
I agree with this pretty strongly. However from where I’m standing your solution has more caveats, which is why I’m so hesitant.
To be clear, regardless of the solution, a literal “5|6|7” in a template needs to be interpreted as an array. I’m not arguing against that. The conditional parser needs to either explode those or rewrite them in the updater. Having everyone manually rewrite their piped strings in templates is not really an option. With the existing parameter syntax having that consistency is a good thing, too.
And as long as they are just string literals without variables, any ambiguity created this way will be plainly obvious. No one will write this and think it’s three values:
{if html_title in "About | About | Company | Teams|About | Company | Careers"}obviously not{/if}
Where it becomes ambiguous and concerning to me is when the piped string syntax is used as a conduit for passing data. You have an array, you implode it with pipes, pass it to prep_conditionals, we get the string, we check for pipes, we explode it again and treat it as an array. That’s all just for show, it never actually touches the template. Not only does that seem silly, the lack of pipe escaping makes it a lossy conversion. That is what leads to the ambiguity:
{if html_title in stash:allthepagetitles}I don't remember stash syntax...{/if}
It’s a time-bomb. Unlike the above template, this looks right at the time of writing. I can’t check it for correctness just by looking at it, or even by running it. If the client decides to create those pages a year later it will just seem broken.
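The lossy round trip is easy to reproduce. This Python sketch uses plain join and split as stand-ins for PHP's implode and explode; the page titles are invented sample data.

```python
# Stand-ins for PHP's implode()/explode() with "|" as the delimiter.
def implode(values):
    return "|".join(values)


def explode(piped):
    return piped.split("|")


# Invented sample data: the second title legitimately contains a pipe.
titles = ["About", "About | Company"]
round_tripped = explode(implode(titles))

if __name__ == "__main__":
    print(len(titles), len(round_tripped))
    print("About | Company" in round_tripped)
```

Two titles go in and three items come out, and the title with the pipe is no longer found in the list. The template looks fine until the day someone enters such a title.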
Asking them to change that into the word “and” solved it.
You are suggesting rules for the template authors about what data the end-users can or cannot enter. That just doesn’t feel future proof to me. The data shouldn’t change the meaning of the template. I prefer the rules to apply to the developers.
So a newly summarized suggestion:
• Pipe syntax in the template continues to work.
• Pipes in variables are treated as part of the data.
• The pipe outside quotes concatenates lists: "a|b"|"c"|var (a|b is using the backwards compatible quoted format, just can’t be used with an embedded var)
For addon devs:
• If you want it treated as an array, pass it as an array.
• If it’s used in a single tag or with a string operator, then we will pipe implode it for you and backslash escape pipes in the array values.
Certain operator <-> data combinations still need to be fleshed out, but would that assuage most of your concerns?
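Here is a rough sketch of what "pipe implode and backslash escape" could look like, written in Python rather than EE's PHP. The function names are hypothetical, and escaping the backslash itself (so an escaped pipe stays unambiguous) is my own assumption; the proposal above only mentions escaping pipes.

```python
# Hypothetical helpers, not EE code.
def implode_escaped(values):
    # Escape backslashes first (assumption), then pipes, then join.
    return "|".join(
        v.replace("\\", "\\\\").replace("|", "\\|") for v in values
    )


def explode_escaped(piped):
    # Split on pipes that are not preceded by an escaping backslash.
    items, current, i = [], [], 0
    while i < len(piped):
        ch = piped[i]
        if ch == "\\" and i + 1 < len(piped):
            current.append(piped[i + 1])  # keep the escaped character as data
            i += 2
        elif ch == "|":
            items.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    items.append("".join(current))
    return items
```

With escaping in place, a non-empty array such as ["tom", "dick|harry"] survives the trip to a string and back with its two items intact, which the unescaped version cannot guarantee.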
And as long as they are just string literals without variables, any ambiguity created this way will be plainly obvious.
Yes, and developers building templates should know that certain vars could contain pipes, and take that into account. But, curly braces are already encoded to entities in content, so why not pipes, too?
{if var in "Foo | Bar|Space | Bar"}
You are suggesting rules for the template authors about what data the end-users can or cannot enter.
What I was trying to say is that people are willing to work around some technical quirks when encountered. I see your point about the possible ambiguity of pipe separated lists, but I also think this is an uncommon use case. Developers will have to know when it’s safe to use.
The pipe outside quotes concatenates lists: "a|b"|"c"|var (a|b is using the backwards compatible quoted format, just can’t be used with an embedded var)
That just looks super confusing to me.
• If you want it treated as an array, pass it as an array.
• If it’s used in a single tag or with a string operator, then we will pipe implode it for you and backslash escape pipes in the array values.
Sure, I’m OK with this approach. If I understand correctly, that will result in this:
$vars["foo"] = array("lorem", "ipsum", "dolor|sit");
return ee()->TMPL->parse_variables_row($tagdata, $vars);
// Would result in this:
{foo} // lorem|ipsum|dolor\|sit
{if "dolor|sit" in foo} // true?
{if "dolor\|sit" in foo} // also true?
{if foo == "lorem"} // Umm, false?
{if foo == "lorem|ipsum|dolor\|sit"} // true?
{if foo == "lorem"|"ipsum"|"dolor|sit"} // Umm...
I would expect the output given by {foo} to be usable “as is” in a conditional.
Also, in terms of backward compatibility, as soon as I pass an array in EE 2.8-, the site will break. By allowing arrays like that, you’re no longer restricting passable data to strings (and var pairs are just nested strings, really). To me, that almost feels like more of a 3.0 feature, to be honest. Really nice, but 3.0.
On another note, I also notice that people use exp-tags inside their conditionals, eg:
{if {exp:module:method param="foo"} == 'yep'}
Where the exp-tag can also be quoted. What’s your stance on that?
That just looks super confusing to me.
And was written to be. I think most developers would separate a and b. I personally don’t think a pipe in a string should under any circumstances be interpreted differently, but in this case it does appear to be needed for the sake of backwards compatibility. The solution there might be to not allow mixing of the old and new syntax. That would make your examples a little more obvious, too:
{foo} // lorem|ipsum|dolor\|sit
{if "dolor|sit" in foo} // false
{if "dolor\|sit" in foo} // true
{if foo == "lorem"} // falsey-false
{if foo == "lorem|ipsum|dolor\|sit"} // true, because stringified
{if foo == "lorem"|"ipsum"|"dolor|sit"} // true
Also, in terms of backward compatibility, as soon as I pass an array in EE 2.8-, the site will break. By allowing arrays like that, you’re no longer restricting passable data to strings (and var pairs are just nested strings, really). To me, that almost feels like more of a 3.0 feature, to be honest. Really nice, but 3.0.
Yep. And this does admittedly suck. Even the nested quoting change scares me from a compatibility perspective. The updater can’t spot those problems. None of this was planned, for what it’s worth. The plan was to do our best not to break any more apis/backwards compatibility until 3.0. The security issue forced our hand.
curly braces are already encoded to entities in content, so why not pipes, too?
I wouldn’t mind doing that. In the very least for things that are already passed through typography.
On another note, I also notice that people use exp-tags inside their conditionals, eg:
{if {exp:module:method param="foo"} == 'yep'}
Where the exp-tag can also be quoted. What’s your stance on that?
Extremely ambivalent. Right now, both cases will mark the conditional as non-evaluatable. Which means that we skip that conditional until the tag is gone or until the last conditional pass (where safety is on). From a template perspective the quoted syntax is more error prone due to quote nesting. It’s harder to make sure that nested quotes aren’t breaking, and currently impossible to guarantee that a quote in the data won’t break the conditional. Surprisingly though, as our parser stands right now, the string case is easier to optimize. Consider these two:
{if "{exp:module}{tag}{/exp:module}" == "foo" || TRUE}
{if {exp:module}{tag}{/exp:module} == "foo" || TRUE}
I think it’s safe to assume that the first one will always be TRUE. The assumption here is that the result is always quoted and thus always a single value. The second one is harder to make a call on, because it’s harder to parse out the pair, and {tag} could turn into conditional constructs (don’t ever do that!). In theory the execution is sandboxed for both (tagdata for the module is just {tag}), our template parser just isn’t up to snuff enough to allow us to hijack it for that snippet and figure out quoting dynamically. So in the future I would like to be able to treat them both as a single “module/plugin” value. That will mean they will both be equally optimizable and we can save the data from nested strings. Not possible yet, but we currently don’t do any constant folding either, so they’ll have equally poor performance in 2.9.
Thanks again for the diligent feedback, Low.
Another note about “If you want it treated as an array, pass it as an array.” – what about Snippets? Say someone created a Snippet which contains a pipe-list of IDs, and you wanted to use the IN conditional. How would that work? So…
{my_snippet} // 1|2|3
{if "1" IN my_snippet} // What happens here?
{if entry_id IN my_snippet} // Or in an Entries tag
Global variables, specifically the early parsed ones, cannot be passed as arrays. At least not at the moment, right? Shouldn’t that work consistently, too?
Edit: I guess you could use the other notation (ie. "1"|"2"|"3"), but that means you’d have to create 2 Snippets if you wanted to use those IDs in an entry_id="" param, too. That isn’t really an option, IMO.
That just looks super confusing to me.
And was written to be.
You’re deliberately creating confusing code?
I think most developers would separate a and b.
What do you mean by a and b?
Just throwing this out there: use PHP 5.4 array syntax?
I would prefer that over the previously suggested syntax. Certainly more legible:
{if var in ['1', '2', foo, 'a|b|c']}
…but the issue with the piped lists still stands.
If anything, I’d recommend fixing the security issue now, and don’t bother with extra syntax in conditionals. It’s easy enough to build a plugin to mimic it (in fact, I’ve already built it):
{if {exp:low_if:in needle="{var}" haystack="{list}"}}
Or, if you (EllisLab) are going through with it, maybe also make a regular expression match possible? Like this, possibly?
{if var matches '#pattern#'}