Thread

global_vars and "regular expression is too large" error

May 12, 2017 3:34pm

  • #1 / May 12, 2017 3:34pm

    litzinger

    564 posts

    So I have a very large _global_vars array (3205 items). Probably too large, but it is what it is and I can’t change it. I’m getting a “regular expression is too large” error b/c the $regex value in this loop is a very, very long string.

    $regex = $this->getGlobalsRegex();
    while (preg_match_all($regex, $this->template, $result))

    I did some testing with a smaller _global_vars array (~600 items) by changing that loop to just do a strpos() on the template before swapping the variable contents. In my tests it took approximately 0.014 seconds to loop over the _global_vars array. The preg_match_all by comparison took approximately 0.005 seconds. It’s about a 64% difference in speed, but still a very small fraction of time. Since preg_match_all has a limitation depending on the compiled PHP settings, would it make sense to change it to a loop with a strpos check?

    preg_match_all

    pro: faster

    con: puts a limit on the number of _global_vars you can have

    foreach & strpos

    pro: no limit on the _global_vars

    con: slower
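
The foreach & strpos option above could look roughly like the sketch below. It is not ExpressionEngine code: the function name `replaceGlobalsWithStrpos` is hypothetical, and literal `{` and `}` are assumed as the tag delimiters.

```php
<?php
// Hypothetical helper, not ExpressionEngine code: replace {name} tags by
// looping over the variables and checking strpos() before each replacement.
function replaceGlobalsWithStrpos(string $template, array $globals): string
{
    foreach ($globals as $name => $value) {
        $tag = '{' . $name . '}';
        // Skip the replacement work when the tag does not occur at all.
        if (strpos($template, $tag) !== false) {
            $template = str_replace($tag, (string) $value, $template);
        }
    }
    return $template;
}

$globals = ['site_name' => 'Example', 'year' => '2017'];
$output = replaceGlobalsWithStrpos('{site_name} (c) {year} {unknown}', $globals);
echo $output, PHP_EOL;
```

Because no pattern is compiled, the number of variables is bounded only by memory and time, at the cost of one pass over the template per variable.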

  • #2 / May 12, 2017 3:51pm

    litzinger

    564 posts

    This looks to be a change from EE2 to EE3. EE2 just iterates over the _global_vars, so there is no limit; EE3 uses preg_match_all, which puts a limit on the _global_vars.

  • #3 / May 15, 2017 4:00pm

    Kevin Cupp

    674 posts

    Yes, we specifically moved to regex for the performance boost, but that assumed global vars would be used reasonably. We’d need to look into ways to accommodate this without reverting to the poor performance. But before we sink time into this and thus take time away from important work, is there a legitimate use case for this many global variables?

  • #4 / May 15, 2017 4:15pm

    litzinger

    564 posts

    In our case we have a Structure/Pages array of over 3000 pages, which are all created as early parsed global variables by Publisher. So I guess technically this is a Publisher issue, but it used to not be an issue. This is partially an architecture problem on our site b/c it’s putting dang near everything in Structure when it shouldn’t need to. That was before my time and now I’m having to deal with it. I can hack the core to change that loop back to the old version, but that’s less than ideal.

  • #5 / May 16, 2017 4:26pm

    Kevin Cupp

    674 posts

    Ok, I may have something. Replace your Template.php with this and see how it works:

    https://dl.dropboxusercontent.com/u/28047/Template.txt

  • #6 / May 17, 2017 8:52am

    litzinger

    564 posts

    May be a day or two before I can get back to this… other priorities :(

    I see where you’re headed with the change though, thanks for taking a look at it.

  • #7 / May 17, 2017 12:34pm

    Kevin Cupp

    674 posts

    Alright, we might be putting out a release tomorrow but I was hoping to see how it worked on your site first before merging it in. Let me know if you get to it soon.

  • #8 / May 17, 2017 1:32pm

    litzinger

    564 posts

    I just checked it out, seems to work fine. I didn’t add my code back to do the timing/performance testing though.

  • #9 / May 17, 2017 1:36pm

    litzinger

    564 posts

    With 5754 global vars it took 0.014 to 0.017 seconds to do everything inside of the if (count(ee()->config->_global_vars) > 0) conditional.

    With 704 vars it took 0.009 seconds.

  • #10 / May 17, 2017 1:38pm

    Kevin Cupp

    674 posts

    I’d say that’s not bad. We’ll get this in then, thanks for kicking the tires!

  • #11 / May 17, 2017 1:41pm

    litzinger

    564 posts

    Updated times with more details. ^^

  • #12 / May 24, 2017 4:51pm

    litzinger

    564 posts

    The getGlobalsRegex function seems to change how _global_vars behaves in a layout file when items are added to the array from the template_fetch_template hook.

    From Slack:

    I’m using the In:sert extension in EE3. It works great for any template/global_var that isn’t used in my layout file. E.g. {in:sert:embeds/_footer-top-en} works fine everywhere except the layout file. In:sert is using the template_fetch_template hook and it’s correctly adding an item to the _global_vars array, but by the time the layout file gets parsed, the items are not in the _global_vars array… it’s using an old _global_vars array or something.

    Making this change returns it to its expected behavior:

    private function getGlobalsRegex()
    {
        //if ( ! isset($this->globals_regex))
        //{
            $global_names = array_keys(ee()->config->_global_vars);
            $global_names = array_map(
                function($str)
                {
                    return preg_quote($str, '/');
                },
                $global_names
            );

            $global_names = $this->chunkGlobalsArray($global_names);

            $this->globals_regex = array_map(
                function($array)
                {
                    return '/'.LD.'('.implode('|', $array).')'.RD.'/';
                },
                $global_names
            );
        //}

        return $this->globals_regex;
    }
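
The body of chunkGlobalsArray is not shown in this thread. As a rough illustration of the idea only, the sketch below assumes fixed-size chunks via `array_chunk` (the real method may split differently, for example by pattern length) and uses literal `{` and `}` where the code above uses the LD and RD constants. The function names are hypothetical.

```php
<?php
// Hypothetical sketch: split the variable names into fixed-size chunks and
// build one alternation pattern per chunk, so no single pattern grows too large.
function buildChunkedPatterns(array $names, int $chunkSize): array
{
    $quoted = array_map(
        function ($name) {
            return preg_quote((string) $name, '/');
        },
        $names
    );

    return array_map(
        function (array $chunk) {
            return '/\{(' . implode('|', $chunk) . ')\}/';
        },
        array_chunk($quoted, $chunkSize)
    );
}

// Apply each pattern in turn, swapping every matched tag for its value.
function replaceWithPatterns(string $template, array $globals, array $patterns): string
{
    foreach ($patterns as $pattern) {
        $template = preg_replace_callback(
            $pattern,
            function (array $match) use ($globals) {
                return (string) $globals[$match[1]];
            },
            $template
        );
    }
    return $template;
}

$globals = ['a' => '1', 'b' => '2', 'c' => '3'];
$patterns = buildChunkedPatterns(array_keys($globals), 2);
$output = replaceWithPatterns('{a}-{b}-{c}-{d}', $globals, $patterns);
echo $output, PHP_EOL;
```

With three names and a chunk size of 2, two patterns are built, and the template is scanned once per pattern instead of once per variable.
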
  • #13 / May 30, 2017 10:03am

    Kevin Cupp

    674 posts

    Sorry for the delay! Was out last week. I’ll see if I can work up something so that we can still take advantage of the caching when there are no changes to the globals array.

  • #14 / May 30, 2017 12:22pm

    Kevin Cupp

    674 posts

    Ok I’ve put some caching in place based on what’s in the globals array. See how this works, if you want:

    https://dl.dropboxusercontent.com/u/28047/Template.txt
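
The linked file is not reproduced in this thread, so the following is only a sketch of what caching "based on what's in the globals array" might look like. It assumes the cache key is a hash of the variable names; the class `GlobalsRegexCache` and its members are hypothetical, and chunking is left out for brevity.

```php
<?php
// Hypothetical class, not the actual Template.php: caches the built pattern
// and rebuilds it only when the set of variable names changes.
class GlobalsRegexCache
{
    private $cacheKey = null;
    private $pattern = null;
    public $builds = 0; // counts rebuilds, for demonstration only

    public function get(array $globals): string
    {
        $names = array_keys($globals);
        $key = md5(serialize($names));

        if ($key !== $this->cacheKey) {
            $quoted = array_map(
                function ($name) {
                    return preg_quote((string) $name, '/');
                },
                $names
            );
            $this->pattern = '/\{(' . implode('|', $quoted) . ')\}/';
            $this->cacheKey = $key;
            $this->builds++;
        }

        return $this->pattern;
    }
}

$cache = new GlobalsRegexCache();
$globals = ['site_name' => 'Example'];
$first = $cache->get($globals);
$second = $cache->get($globals);          // unchanged: served from cache
$globals['in:sert:footer'] = '<footer>';  // e.g. added later by a hook
$third = $cache->get($globals);           // changed: pattern is rebuilt
```

Unlike a plain isset() check, a key derived from the names picks up variables that a hook adds after the first build.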

  • #15 / May 30, 2017 2:17pm

    litzinger

    564 posts

    To test that I’d actually have to roll back a few changes. I assume the only thing you really changed was adding the cache check in getGlobalsRegex?

    $global_names = array_keys(ee()->config->_global_vars);
    $cache_key = md5(serialize($global_names));

    I figured that would probably work, just wondering how much processing time it adds to serialize and md5 a large global vars array.
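
One way to answer that question is to time the key computation directly. The sketch below uses 5000 made-up names in place of the real _global_vars keys, so the measured time is only indicative and will vary by machine.

```php
<?php
// Hypothetical data: 5000 made-up variable names standing in for _global_vars keys.
$names = [];
for ($i = 0; $i < 5000; $i++) {
    $names[] = 'page_url_' . $i;
}

// Time a single serialize + md5 pass over the names.
$start = microtime(true);
$cacheKey = md5(serialize($names));
$elapsed = microtime(true) - $start;

// Adding one name must produce a different key, which is what invalidates the cache.
$names[] = 'in:sert:embeds/_footer-top-en';
$changedKey = md5(serialize($names));

printf("key: %s, time: %.6f s\n", $cacheKey, $elapsed);
var_dump($cacheKey !== $changedKey);
```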
