This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

DataMapper 1.6.0

September 05, 2008 12:32pm

  • #301 / Dec 03, 2008 7:40am

    Murodese

    53 posts

    Very few records - 1-5 on each table.

  • #302 / Dec 03, 2008 1:25pm

    Paul Apostol

    43 posts

    Hello,
    The only thing I’d ask for, as usual, is the joining table prefix.
    Do you think that replacing in “_get_relationship_table” the line:

    $relationship_table = (empty($this->join_prefix)) ? $this->prefix . $relationship_table : $this->join_prefix . $relationship_table;

    with:

    if (is_array($this->join_prefix)) {
        $relationship_table = (empty($this->join_prefix[strtolower($model)])) ? $this->prefix . $relationship_table : $this->join_prefix[strtolower($model)] . $relationship_table;
    } else {
        $relationship_table = (empty($this->join_prefix)) ? $this->prefix . $relationship_table : $this->join_prefix . $relationship_table;
    }

    will be OK?
    That would allow people who want different prefixes for the joining tables to use an alternative declaration (I’m using it inside the model):

    var $join_prefix = array('model_name' => 'join_prefix');
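
    A minimal standalone sketch of that lookup, outside the library. The function name resolve_relationship_table and its parameters are hypothetical; only the array-or-string handling of join_prefix mirrors the snippet above.

```php
<?php
// Hypothetical helper, not part of DataMapper: picks the prefix for a joining table.
function resolve_relationship_table($prefix, $join_prefix, $model, $relationship_table)
{
    if (is_array($join_prefix)) {
        // Per-model prefixes: look the model up by its lowercase name.
        $key = strtolower($model);
        $chosen = empty($join_prefix[$key]) ? $prefix : $join_prefix[$key];
    } else {
        // One prefix for every joining table, falling back to the table prefix.
        $chosen = empty($join_prefix) ? $prefix : $join_prefix;
    }

    return $chosen . $relationship_table;
}

// Model listed in the array: its own join prefix is used.
echo resolve_relationship_table('app_', array('user' => 'rel_'), 'User', 'groups_users'), "\n";
// Model not listed: falls back to the normal table prefix.
echo resolve_relationship_table('app_', array('user' => 'rel_'), 'Group', 'groups_users'), "\n";
```

    The fallback to the plain table prefix for unlisted models is what keeps the array form compatible with models that never set a join prefix.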

    The rest is awesome. This library cuts development time in half.
    Thank you,
    Paul

  • #303 / Dec 03, 2008 4:34pm

    stensi

    109 posts

    Paul, unfortunately, at the moment you can’t use the datamapper config file if you want different prefix and join_prefix settings on a per-model basis, or at least you can’t put the prefix settings in it, as they apply globally to all DataMapper models (removing the settings from the config will force it to use what’s set in your model).

    In the next version, though, I’ll be adding the ability to have custom config files per model.  For example, there’s the DataMapper config that applies to all of them, but if there’s a config file named the same as your model, such as a “user” config for a “User” model, then the user config settings get loaded over the top of the global datamapper config settings.

    I should have time today to put together that test suite I mentioned.  I’m not just looking at speed though as there should have been some memory improvements too.  I’ll be interested to see the results!

  • #304 / Dec 03, 2008 5:17pm

    Paul Apostol

    43 posts

    Thanks stensi,
    I know how it works. I’ll keep patching that function on each new version.

  • #305 / Dec 03, 2008 5:51pm

    stensi

    109 posts

    Actually, now that I think about it, I should be doing the following in the constructor while reading the config file settings:

    // Load stored config settings
    foreach (DataMapper::$config as $key => $value)
    {
        if (empty($this->{$key}))
        {
            $this->{$key} = $value;
        }
    }

    That should solve the issue: if a setting is already set in your model, it keeps that value, which gives you per-model prefixing.  No extra config files needed then.  However, you’ll need to blank the following variables in the datamapper.php library or they’ll never load from the config:

    var $error_prefix = '';
    var $error_suffix = '';
    var $created_field = '';
    var $updated_field = '';

    Does that suit your needs ok?
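
    To make the precedence concrete, here is a small sketch under stated assumptions: ConfigSketch and UserSketch are hypothetical stand-ins for the library class and a model, and the config values are invented. It shows a model-level value winning over the config, and a non-empty library default blocking the config value.

```php
<?php
// Hypothetical stand-in for the library class: global config plus class defaults.
class ConfigSketch
{
    public static $config = array(
        'prefix'       => 'global_',
        'error_prefix' => '<p>',
    );

    public $prefix = '';
    public $error_prefix = '<div>'; // non-empty default, so the config value is skipped

    public function __construct()
    {
        // Only fill in settings that are still empty on this object.
        foreach (self::$config as $key => $value) {
            if (empty($this->{$key})) {
                $this->{$key} = $value;
            }
        }
    }
}

// Hypothetical model that sets its own prefix.
class UserSketch extends ConfigSketch
{
    public $prefix = 'user_'; // set in the model, so it wins over the config
}

$base = new ConfigSketch();
$user = new UserSketch();

echo $base->prefix, ' ', $base->error_prefix, "\n";
echo $user->prefix, "\n";
```

    One thing to keep in mind with this approach: empty() also treats '0', 0 and FALSE as unset, so a model cannot use those values to override the config.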


    _________________________

    UPDATE:

    Well, this is both interesting and worrying, lol!  I have the performance test results and it looks like the latest version has a memory issue :down:

    Here’s the setup.  I’m using the groups, groups_users and users tables from the DataMapper Database Schema.

    groups table has 100 records.
    users table has 1000 records.
    groups_users table has 1000 records.

    The users were randomly assigned to a group and it ended up that each group had from 4 to 22 users (so they all have relations to users).

    The code I benchmarked get() with, keeping in mind that there are many records for it to process:

    $this->benchmark->mark('code_start');
    
    $g = new Group();
    $g->get();
    
    foreach ($g->all as $group)
    {
        echo $group->name . " has these users:\n";
    
        $group->user->get();
    
        foreach ($group->user->all as $user)
        {
            echo '    ' . $user->username . "\n";
        }
    }
    
    $this->benchmark->mark('code_end');
    
    echo $this->benchmark->elapsed_time('code_start', 'code_end');


    And the results, showing the execution time of the above code and memory consumption.


    DataMapper 1.4.5 : stress test get()

    12.7795     2,386,920 bytes
    12.3789     2,390,032 bytes
    12.3302     2,386,920 bytes
    12.3157     2,386,920 bytes
    12.8587     2,386,920 bytes


    DataMapper 1.5.0 : stress test get()

    10.7321     11,158,720 bytes
    10.8895     11,158,432 bytes
    10.7776     11,158,912 bytes
    10.7482     11,158,720 bytes
    10.8656     11,158,720 bytes


    As you can see, the new version is roughly 2 seconds faster but it seems there’s a memory issue!

    Note: The speed is relative to the processing power of the test server.  It’s not all that powerful so I expect this would go much faster on my dedicated server.
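
    The post doesn’t say how the byte counts were collected. One possible way to get comparable numbers, assuming PHP’s built-in microtime() and memory_get_peak_usage(), is sketched below. The measure() and build_rows() functions are hypothetical and stand in for the benchmarked code.

```php
<?php
// Hypothetical helper: time a callback and report peak memory afterwards.
function measure($callback)
{
    $start = microtime(true);
    call_user_func($callback);

    return array(
        'seconds'    => microtime(true) - $start,
        'peak_bytes' => memory_get_peak_usage(),
    );
}

// Stand-in workload; in the thread this would be the Group/User loop.
function build_rows()
{
    $rows = array();
    for ($i = 0; $i < 1000; $i++) {
        $rows[] = array('id' => $i, 'username' => 'user' . $i);
    }
    return count($rows);
}

$result = measure('build_rows');
printf("%.4f     %s bytes\n", $result['seconds'], number_format($result['peak_bytes']));
```

    Peak usage is reported for the whole script so far, so each version is best measured in a fresh request.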

    I’ll be spending today tracking down the cause and then releasing a fixed version which should end up having the same memory usage as version 1.4.5 but the speed improvements of 1.5.0!


    UPDATE:

    What a guess!

    The first thing I checked turned out to be the cause of the memory issue.  The change I did in replacing the related array with a parent object that is a reference is what did it.  I suspected that would be the cause but am still not sure why it chewed up so much memory.

    I’ll see if I can set up the object reference differently, otherwise I’ll just revert it back to an array.

  • #306 / Dec 03, 2008 9:08pm

    stensi

    109 posts

    DataMapper 1.4.5 : stress test get()

    12.7795     2,386,920 bytes
    12.3789     2,390,032 bytes
    12.3302     2,386,920 bytes
    12.3157     2,386,920 bytes
    12.8587     2,386,920 bytes


    DataMapper 1.5.0 : stress test get()

    10.7321     11,158,720 bytes
    10.8895     11,158,432 bytes
    10.7776     11,158,912 bytes
    10.7482     11,158,720 bytes
    10.8656     11,158,720 bytes


    DataMapper 1.5.1 : stress test get()

    10.5575     2,487,192 bytes
    10.6383     2,483,696 bytes
    10.5385     2,483,696 bytes
    10.5081     2,483,696 bytes
    10.3400     2,483,696 bytes


    With version 1.5.1, it looks like I sped things up just a tiny bit more as well as fixing the memory issue.  Sure, it ends up using roughly 100k more memory than version 1.4.5 but the speed improvement more than makes up for that and when you start using other methods, such as save(), version 1.5.1 ends up doing much better with memory usage on top of the speed.  See the test below.


    I adjusted the test code to the following to test save():

    $this->benchmark->mark('code_start');
    
    $g = new Group();
    $g->get();
    
    foreach ($g->all as $group)
    {
        echo $group->name . " has these users:\n";
    
        $group->user->get();
    
        foreach ($group->user->all as $user)
        {
            echo '    ' . $user->username . "\n";
    
            // Change username to ensure database update
            $user->username = substr(md5(uniqid(rand(), true)), 0, 10);
    
            // Save user
            $user->save();
        }
    }
    
    $this->benchmark->mark('code_end');
    
    echo $this->benchmark->elapsed_time('code_start', 'code_end');


    And the results.


    DataMapper 1.4.5 : stress test save()

    32.6919     4,585,448 bytes
    31.5780     4,588,320 bytes
    31.8926     4,585,448 bytes
    32.7084     4,588,328 bytes
    31.7846     4,585,448 bytes


    DataMapper 1.5.1 : stress test save()

    22.9622     3,290,672 bytes
    24.1194     3,287,776 bytes
    23.0845     3,287,776 bytes
    26.6848     3,287,776 bytes
    24.1903     3,287,776 bytes


    As you can see, with the improvements made to validation and other areas, version 1.5.1 is a hell of a lot faster and uses over 1MB less memory during save() than version 1.4.5!

    Again, note that my test server isn’t very fast (running off a USB) so on a better server you’ll see much quicker times working with that many records.
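
    For anyone who wants to check the comparison, the save() figures above can be summarised with a few lines of PHP. The numbers are copied from the two result lists; the mean() helper and the array layout are just one way to arrange them.

```php
<?php
// Figures copied from the save() stress test results above.
$runs = array(
    '1.4.5' => array(
        'seconds' => array(32.6919, 31.5780, 31.8926, 32.7084, 31.7846),
        'bytes'   => array(4585448, 4588320, 4585448, 4588328, 4585448),
    ),
    '1.5.1' => array(
        'seconds' => array(22.9622, 24.1194, 23.0845, 26.6848, 24.1903),
        'bytes'   => array(3290672, 3287776, 3287776, 3287776, 3287776),
    ),
);

// Arithmetic mean of a list of numbers.
function mean($values)
{
    return array_sum($values) / count($values);
}

foreach ($runs as $version => $run) {
    printf("%s: %.4f s average\n", $version, mean($run['seconds']));
}

// Most conservative memory comparison: best 1.4.5 run against worst 1.5.1 run.
$saved = min($runs['1.4.5']['bytes']) - max($runs['1.5.1']['bytes']);
printf("at least %s bytes saved\n", number_format($saved));
```

    Even with the least favourable pairing of runs, the difference stays above 1,048,576 bytes, which matches the “over 1MB” claim.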

    Now, I just need to update the documentation and then I’ll put version 1.5.1 up 😊

  • #307 / Dec 03, 2008 9:17pm

    OverZealous

    1030 posts

    Wow, stensi, that’s awesome.  I’m glad to see these improvements.  Once again, I feel quite lucky to have stumbled upon DM when starting my latest project.  Major development time savings, and as time goes on, pretty darn fast, to boot!

    How did you decide to fix the memory issue?  Did you try using assign-by-reference?  (ie: $this->parent =& $parent_object)

    Now, I just need to set aside a day to upgrade to CI 1.7 and DM 1.5.1 😜

  • #308 / Dec 03, 2008 9:49pm

    stensi

    109 posts

    Version 1.5.1 has been released!

    View the Change Log to see what’s changed.

    Basically, the memory issue has been fixed and the config file settings can now be correctly overridden by setting the values directly in your models.

    Enjoy 😊


    @OverZealous.com: PHP5 assigns objects by reference by default so the & is not necessary.

    You can test this by doing:

    $u1 = new User();
    $u1->get();
    
    $u2 = $u1; // notice no =&
    
    // Let's pretend the username is: foo
    echo $u1->username . "\n"; // outputs: foo
    echo $u2->username . "\n"; // outputs: foo
    
    echo "\n";
    
    // Change username in u2
    $u2->username = 'bar';
    
    echo $u1->username . "\n"; // outputs: bar (in PHP5, PHP4 will remain as foo unless =& was used)
    echo $u2->username . "\n"; // outputs: bar

    Even so, I tried testing with both = and =& and in both cases the memory issue occurred.  I solved it by reverting it back to an array containing the parent’s id and model name.
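
    A standalone version of the same check, for readers without a User model to hand. The Account class is hypothetical. It also shows where = and =& still differ for objects in PHP 5: plain assignment copies the object handle, so both variables reach the same object, while =& makes the two variables aliases of each other.

```php
<?php
// Hypothetical class used only for this demonstration.
class Account
{
    public $username = 'foo';
}

$a = new Account();

$b = $a;                 // copies the handle: $a and $b reach the same object
$b->username = 'bar';    // visible through $a as well

$c = clone $a;           // a separate object with the same property values
$c->username = 'baz';    // does not touch $a or $b

$seen_through_a = $a->username; // 'bar'

$d =& $a;                // a true reference: $d and $a are the same variable
$d = new Account();      // rebinding $d also rebinds $a; $b keeps the old object

echo $seen_through_a, ' ', $a->username, ' ', $b->username, ' ', $c->username, "\n";
```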

    It’s well worth the upgrade! 😊

  • #309 / Dec 03, 2008 10:10pm

    OverZealous

    1030 posts

    Right, I knew that. 😊  I wonder why it would increase the memory, though?  Not really a big issue, for now.

    Update: From what I can tell through my quick skimming online, GC in PHP is, effectively, stupid.  It is unable to determine that two objects which point to each other, yet have no other path of accessibility, need to be GCd.  Therefore, the act of pointing Object A ($parent) at Object B ($child), and pointing $child back to $parent, means that those objects will not be collected until the script exits, or one of those two links is manually removed.

    The only solution would be to create some kind of “destroy” method that is called manually by the application, or to unset $parent by hand in the app.
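
    A rough sketch of what such a “destroy” method could look like. The Node class and its method names are hypothetical and not taken from DataMapper. The destructor counter is only there to show that both objects are released once the cycle is broken. (PHP 5.3 later added a cycle collector, exposed through gc_collect_cycles(), but breaking the cycle by hand works with plain reference counting too.)

```php
<?php
// Hypothetical parent/child objects that point at each other.
class Node
{
    public static $destroyed = 0;

    public $parent = null;
    public $children = array();

    public function add_child(Node $child)
    {
        $child->parent = $this;      // child points back at the parent
        $this->children[] = $child;  // parent points at the child: a cycle
    }

    // Break the cycle by hand so reference counting can free both objects.
    public function destroy()
    {
        foreach ($this->children as $child) {
            $child->parent = null;
        }
        $this->children = array();
    }

    public function __destruct()
    {
        self::$destroyed++;
    }
}

$parent = new Node();
$parent->add_child(new Node());

$parent->destroy(); // the child is released here
unset($parent);     // and the parent here

echo Node::$destroyed, " objects released\n";
```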

  • #310 / Dec 03, 2008 10:24pm

    Murodese

    53 posts

    Brilliant, thanks Stensi 😊

  • #311 / Dec 03, 2008 11:51pm

    OverZealous

    1030 posts

    Preliminary Report on 1.5.1 😉

    I haven’t made the switch yet (probably will be doing that through the morning…), but I have to say, I’m really impressed with the improvements.  Some things, like using the built-in clone function (which I didn’t know existed), are much more elegant than my attempts.

    I love the fact that the “changed” values no longer require DB calls.  Much nicer!

    I’m a little confused why the count() method doesn’t use $this->db->count_all_results()?  This uses COUNT(*) on the backend, so it should be an order of magnitude faster!  There’s also ->count_all() for counting the whole table.

    Also, would it make more sense to use the already-loaded CIFV library by binding it to the DataMapper object?  (ie: Setting and using $this->form_validation instead of creating a new CIFV every time it is needed.)

    Those are fairly minor issues.

    I would, however, like one fairly important (for me) feature.  I really need the ability to hook into the model initialization routines, preferably without editing the DM class itself.  What I need are two methods: one called after DM has run through its global initialization, and one called after the model has been initialized.

    These methods would allow me to set the labels from language files, as well as some other general cleanup.

    I suggest this (most of the existing code has been hidden):

    function DataMapper() {
        // ...
        if ( ! array_key_exists($this->model, DataMapper::$common))
        {
            if ($this->model == 'datamapper')
            {
                // Set up DM ...
    +           $this->post_datamapper_init();
                return;
            }
            // Set up model ...
            // Just before storing the common model settings
    +       $this->post_model_init();
            // Now store common settings ...
        }
        // ...
    }
    // add these two empty functions
    function post_datamapper_init()
    {
    }
    function post_model_init()
    {
    }

    It only adds a few lines of code, and basically no overhead, since each empty function is only called once, during the first initialization.

    I realize I could handle it in my own constructors, but it would be difficult to determine if this was the *first* init or not.
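
    A cut-down, self-contained sketch of the idea, assuming a hypothetical BaseMapper class in place of DataMapper. Only the hook name post_model_init comes from the suggestion above. The subclass counts how often the hook fires to show that it runs on the first instantiation only.

```php
<?php
// Hypothetical base class standing in for DataMapper.
class BaseMapper
{
    protected static $common = array();

    public $model;

    public function __construct()
    {
        $this->model = strtolower(get_class($this));

        if ( ! array_key_exists($this->model, self::$common))
        {
            // First-time model setup would happen here ...

            // Hook: runs once per model, just before the common settings are stored.
            $this->post_model_init();

            self::$common[$this->model] = TRUE;
        }
    }

    // Empty by default; models override it when they need it.
    protected function post_model_init()
    {
    }
}

// Hypothetical model that uses the hook.
class User extends BaseMapper
{
    public static $init_calls = 0;

    protected function post_model_init()
    {
        // e.g. load field labels from a language file
        self::$init_calls++;
    }
}

new User();
new User();

echo User::$init_calls, "\n";
```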

    One last thing.  When the shared model elements are stored into the $common array, they are not stored by reference.  It doesn’t matter for the table name, but the $fields and $validation arrays might be getting copied there.  I’m not 100% sure about that, though.

    Thoughts?

    Thanks for putting so much time into DataMapper!

    UPDATE
    I just found one other problem, easily fixed.  In the _related method, the join query does not add $relationship_table to the current object’s ID field.  This causes problems in multiple-join / multiple-related queries.  It’s easily fixed:

    $this->db->join($relationship_table, $object->table . '.id = ' . $this->model . '_id', 'left');
    // becomes
    $this->db->join($relationship_table, $object->table . '.id = ' . $relationship_table.'.'.$this->model . '_id', 'left');
    
    // and
    $this->db->join($relationship_table, $this->table . '.id = ' . $this->model . '_id', 'left');
    // becomes
    $this->db->join($relationship_table, $this->table . '.id = ' . $relationship_table.'.'.$this->model . '_id', 'left');

    This doesn’t cause an issue at this moment, unless you are using the where_related functions I posted a while back.  Then you will see DB errors.
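
    To spell out what the change produces: as soon as a query joins two relationship tables that both contain the same {model}_id column, an unqualified column name becomes ambiguous, and qualifying it with the relationship table avoids that. The join_condition() helper below is hypothetical and only builds the string.

```php
<?php
// Hypothetical helper: build the ON clause for a join against a relationship table.
function join_condition($table, $relationship_table, $model)
{
    // Qualify the foreign key with the relationship table it lives in.
    return $table . '.id = ' . $relationship_table . '.' . $model . '_id';
}

echo join_condition('users', 'groups_users', 'user'), "\n";
```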

  • #312 / Dec 04, 2008 1:52am

    stensi

    109 posts

    Ah, I didn’t notice the count_all_results() method, but I knew of the count_all() method.  The reason I’d done it the way I did was because count_all() doesn’t allow any query clauses to work with it, such as where clauses, which I need to be able to do for related queries and self referencing relationships.  However, as count_all_results() works with query clauses, I can switch to use that, so thanks for the tip!

    I just benchmarked the difference and you’re right, it does add just that extra tiny little bit of speed, so I’ve added that for the next version.

    Hmm, I took a look at the $common values.  The first instance doesn’t get assigned by reference but the rest from then on do:

    function test()
    {
        $u1 = new User();
        $u1->get(1);
    
        $u2 = new User();
        $u2->get(1);
    
        $u3 = new User();
        $u3->get(1);
    
        echo '<pre>u1:<br>'.print_r($u1->validation['password']['rules'], TRUE).'</pre>';
        echo '<pre>u2:<br>'.print_r($u2->validation['password']['rules'], TRUE).'</pre>';
        echo '<pre>u3:<br>'.print_r($u3->validation['password']['rules'], TRUE).'</pre>';
    
        $u2->validation['password']['rules'] = array('foo' => 'bar');
    
        echo '<hr />';
    
        echo '<pre>u1:<br>'.print_r($u1->validation['password']['rules'], TRUE).'</pre>';
        echo '<pre>u2:<br>'.print_r($u2->validation['password']['rules'], TRUE).'</pre>';
        echo '<pre>u3:<br>'.print_r($u3->validation['password']['rules'], TRUE).'</pre>';
    }

    Outputs:

    u1:
    Array
    (
        [0] => required
        [1] => trim
        [min_length] => 3
        [max_length] => 40
        [2] => encrypt
    )
    
    u2:
    Array
    (
        [0] => required
        [1] => trim
        [min_length] => 3
        [max_length] => 40
        [2] => encrypt
    )
    
    u3:
    Array
    (
        [0] => required
        [1] => trim
        [min_length] => 3
        [max_length] => 40
        [2] => encrypt
    )
    
    _______________________________________
    
    u1:
    Array
    (
        [0] => required      // not by reference so unaffected
        [1] => trim
        [min_length] => 3
        [max_length] => 40
        [2] => encrypt
    )
    
    u2:
    Array
    (
        [foo] => bar         // changed by reference
    )
    
    u3:
    Array
    (
        [foo] => bar         // changed by reference
    )

    So, I’ll need to correct it to assign by reference when first loading it, which is an easy fix.  Thanks for making me investigate 😉
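
    The behaviour in that output can be reproduced without the library. In the sketch below, SharedModel is a hypothetical class and the $share_first flag only exists to switch between the current behaviour (first instance keeps its own copy) and the planned fix (first instance is bound by reference too).

```php
<?php
// Hypothetical class: instances share settings through a static $common array.
class SharedModel
{
    public static $common = array();

    public $validation;

    public function __construct($key, $share_first)
    {
        if ( ! isset(self::$common[$key]))
        {
            $this->validation = array('rules' => array('required'));

            // Plain assignment stores a copy of the array.
            self::$common[$key] = $this->validation;

            if ($share_first)
            {
                // The fix: bind the first instance to the stored array as well.
                $this->validation =& self::$common[$key];
            }
        }
        else
        {
            // Later instances are bound by reference.
            $this->validation =& self::$common[$key];
        }
    }
}

// Current behaviour: the first instance is left out of the sharing.
$u1 = new SharedModel('current', FALSE);
$u2 = new SharedModel('current', FALSE);
$u3 = new SharedModel('current', FALSE);
$u2->validation['rules'] = array('foo' => 'bar');

// With the fix: all three instances see the change.
$f1 = new SharedModel('fixed', TRUE);
$f2 = new SharedModel('fixed', TRUE);
$f3 = new SharedModel('fixed', TRUE);
$f2->validation['rules'] = array('foo' => 'bar');
```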


    I’m yet to put time into what will be the related_{clause}() methods.  I’ll be using __call() to feed the related clauses through the same private method, so I don’t have to have separate ones for each clause type (where, or_where, like, or_like, etc).  Looking at the change you need, yep, I’ll put that in since I should have had it like that originally.  Thanks again!
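
    The __call() routing could look roughly like this. It is only a sketch: RelatedQuerySketch, the related_ prefix convention and the private _related() method are assumptions made for illustration, and the real methods may end up named differently. Instead of building a query, it just records which clause was requested.

```php
<?php
// Hypothetical class showing how one private method can serve every related clause.
class RelatedQuerySketch
{
    public $clauses = array();

    // Called by PHP for any method that is not defined on the class.
    public function __call($method, $arguments)
    {
        if (strpos($method, 'related_') === 0)
        {
            // related_where => where, related_or_like => or_like, ...
            $clause = substr($method, strlen('related_'));

            return $this->_related($clause, $arguments);
        }

        throw new BadMethodCallException('Unknown method: ' . $method);
    }

    // Single place where every related clause ends up.
    private function _related($clause, $arguments)
    {
        $this->clauses[] = array('clause' => $clause, 'arguments' => $arguments);

        return $this; // allow chaining
    }
}

$q = new RelatedQuerySketch();
$q->related_where('group', 'id', 1)->related_or_like('group', 'name', 'adm');

foreach ($q->clauses as $entry)
{
    echo $entry['clause'], ': ', implode(', ', $entry['arguments']), "\n";
}
```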

  • #313 / Dec 04, 2008 2:11am

    OverZealous

    1030 posts

    Crapola, apparently some of those strtolower()s you removed were important.  PostgreSQL has case-sensitive table and field names, so I’m unable to query anything, since the model names have uppercase first letters.

    UPDATE
    Wait-a-minute - I have hand coded some of the “models”.  Let me see if I can work around the issue.

    Follow-Up
    Yes, that did the trick, changing all of my models to lowercase.  I still find it weird that PHP classes are not case sensitive.

    Also, the change to an associative $fields array really buggered up some code.  But I have that mostly squared away for now.

  • #314 / Dec 04, 2008 3:31am

    stensi

    109 posts

    Odd, I don’t know how removing those strtolower() calls could have affected you, as the $this->model value is always stored in lowercase (the singular() method changes the get_class() return value to lowercase).  The strtolower() calls weren’t actually doing anything at all, which is why I removed them.

    The only time the model has an uppercase first letter is when I use ucfirst() on it, and that is never stored in $this->model as it’s only done to create a temporary new instance (and from then on, that instance’s model value is used, which is lowercase).

    Out of interest, what sort of thing were you using the $fields array for?  Would you prefer it to be changed back so it just contains the field names only?

    ________________________________

    Version 1.5.2 has been released!

    View the Change Log to see what’s changed.

    Basically, the above-mentioned assign-by-reference issues have been corrected, count() now uses count_all_results(), and in _related() the id columns in the joins are now qualified with the relationship table.

    Sorry if you find it annoying that I’m releasing several smaller versions in quick succession, but I’m sure you’d all prefer I release bug fixes while I have the time, instead of leaving you to wait on them!

    Besides, the more releases and feedback I get, the better the product 😉

    Enjoy 😊

  • #315 / Dec 04, 2008 4:03am

    OverZealous

    1030 posts

    The “error” came from me having hand-coded the $model of certain classes, using the same name as the class (with a capital first letter).  It was more that the error was masked by the previous strtolower calls.  That’s not an issue, since I just “fixed” the model name (and a ton of other strtolower and ucfirst calls).

    I use the $fields array to handle automatic form saving.  It’s a complicated function, and very specific to my application, but it allows a subset of the $fields and $validation $fields to be automatically processed and saved from a form.  Just this second, I realized that I now could be using the $validation array exclusively, since that now contains all of the $fields as well.  I might do that.  Anyway, I fixed it by either calling array_keys() or changing my foreach loops to handle the association.  I have no problem with the associative arrays.

    Actually, I think I have the update all set, including CI 1.7.  I had to fix a lot of little things, and once I’m certain, I now have to commit some 160+ files to my repository 😉

    Of course, now I have to RE-import DM 1.5.2, AND make my few changes 😉

    BTW: my remaining tweaks to DM are:
    * Adding the letter ‘O’ to the end of the timestamps, because I need the timezone stored.
    * Adding in the previously mentioned post_datamapper_init and post_model_init method calls.

    Otherwise, I’m able to keep all of my changes within my subclass.
