silveira neto

carbon-based lifeform. virgo supercluster

Menu Close

Elgg metadata performance issue

Elgg is a open source social networking framework.

In my Elgg instance (1.7.2) I have some entities with a metadata called tags. One entity can have zero, one or many tags. Tags are strings.

I’m trying to retrieve the last updated files with the tags tagA, tagB, … tagI in a profile widget using this query like this:

$options = Array (
    "types" => "object",
    "subtypes" => "file",
    "limit" => 10,
    "full_view" => FALSE,
    "order_by" => "time_updated DESC",
    "pagination" => FALSE,
    "metadata_name_value_pairs" => Array (
                Array("name" => tags, "value" => "tagA", "operands"=> "contains"),
                Array("name" => tags, "value" => "tagB", "operands"=> "contains"),
                Array("name" => tags, "value" => "tagC", "operands"=> "contains"),
                Array("name" => tags, "value" => "tagD", "operands"=> "contains"),
                Array("name" => tags, "value" => "tagE", "operands"=> "contains"),
                Array("name" => tags, "value" => "tagF", "operands"=> "contains"),
                Array("name" => tags, "value" => "tagG", "operands"=> "contains"),
                Array("name" => tags, "value" => "tagH", "operands"=> "contains"),
                Array("name" => tags, "value" => "tagI", "operands"=> "contains")
    )
);
echo elgg_list_entities_from_metadata($options);

This is enough to put MySQL at 100% of processor use and make Elgg to stop.

I’m using phpMyAdmin to monitor the database and Wireshark to monitor the communication between Elgg and MySQL. I came to this SQL query:

SELECT COUNT(DISTINCT e.guid) AS total
FROM elgg_entities e
JOIN elgg_metadata n_table ON e.guid = n_table.entity_guid
JOIN elgg_metadata n_table1 ON e.guid = n_table1.entity_guid
JOIN elgg_metastrings msn1 ON n_table1.name_id = msn1.id
JOIN elgg_metastrings msv1 ON n_table1.value_id = msv1.id
JOIN elgg_metadata n_table2 ON e.guid = n_table2.entity_guid
JOIN elgg_metastrings msn2 ON n_table2.name_id = msn2.id
JOIN elgg_metastrings msv2 ON n_table2.value_id = msv2.id
JOIN elgg_metadata n_table3 ON e.guid = n_table3.entity_guid
JOIN elgg_metastrings msn3 ON n_table3.name_id = msn3.id
JOIN elgg_metastrings msv3 ON n_table3.value_id = msv3.id
JOIN elgg_metadata n_table4 ON e.guid = n_table4.entity_guid
JOIN elgg_metastrings msn4 ON n_table4.name_id = msn4.id
JOIN elgg_metastrings msv4 ON n_table4.value_id = msv4.id
JOIN elgg_metadata n_table5 ON e.guid = n_table5.entity_guid
JOIN elgg_metastrings msn5 ON n_table5.name_id = msn5.id
JOIN elgg_metastrings msv5 ON n_table5.value_id = msv5.id
JOIN elgg_metadata n_table6 ON e.guid = n_table6.entity_guid
JOIN elgg_metastrings msn6 ON n_table6.name_id = msn6.id
JOIN elgg_metastrings msv6 ON n_table6.value_id = msv6.id
JOIN elgg_metadata n_table7 ON e.guid = n_table7.entity_guid
JOIN elgg_metastrings msn7 ON n_table7.name_id = msn7.id
JOIN elgg_metastrings msv7 ON n_table7.value_id = msv7.id
JOIN elgg_metadata n_table8 ON e.guid = n_table8.entity_guid
JOIN elgg_metastrings msn8 ON n_table8.name_id = msn8.id
JOIN elgg_metastrings msv8 ON n_table8.value_id = msv8.id
JOIN elgg_metadata n_table9 ON e.guid = n_table9.entity_guid
JOIN elgg_metastrings msn9 ON n_table9.name_id = msn9.id
JOIN elgg_metastrings msv9 ON n_table9.value_id = msv9.id
WHERE (((msn1.string = 'tags'
         AND BINARY msv1.string = 'tagA'
         AND ((1 = 1)
              AND n_table1.enabled='yes'))
        AND (msn2.string = 'tags'
             AND BINARY msv2.string = 'tagB'
             AND ((1 = 1)
                  AND n_table2.enabled='yes'))
        AND (msn3.string = 'tags'
             AND BINARY msv3.string = 'tagC'
             AND ((1 = 1)
                  AND n_table3.enabled='yes'))
        AND (msn4.string = 'tags'
             AND BINARY msv4.string = 'tagD'
             AND ((1 = 1)
                  AND n_table4.enabled='yes'))
        AND (msn5.string = 'tags'
             AND BINARY msv5.string = 'tagE'
             AND ((1 = 1)
                  AND n_table5.enabled='yes'))
        AND (msn6.string = 'tags'
             AND BINARY msv6.string = 'tagF'
             AND ((1 = 1)
                  AND n_table6.enabled='yes'))
        AND (msn7.string = 'tags'
             AND BINARY msv7.string = 'tagG'
             AND ((1 = 1)
                  AND n_table7.enabled='yes'))
        AND (msn8.string = 'tags'
             AND BINARY msv8.string = 'tagH'
             AND ((1 = 1)
                  AND n_table8.enabled='yes'))
        AND (msn9.string = 'tags'
             AND BINARY msv9.string = 'tagI'
             AND ((1 = 1)
                  AND n_table9.enabled='yes'))))
  AND ((e.TYPE = 'object'
        AND e.subtype IN (1)))
  AND (e.site_guid IN (1))
  AND ((1 = 1)
       AND e.enabled='yes')

A special thanks to the SQL formatting service SQLformat.appspot.com.

When I try to retrieve up to 4 tags I can get a response fairly quickly but more than that the response time seems to exponentially grow.

Any clues?

The workaround I’ll try now is to retrieve entities by each tag and them sort by myself.

© 2016 silveira neto. All rights reserved.

Theme by Anders Norén.