I have a query that selects from a table of nodes, then joins a table of titles onto it. This is done by first joining an intermediate table of node IDs and title IDs (node_titles) that enables a many-to-many relationship between the first two tables. Both joins are inner so that only nodes with a correctly set up, existing title are selected. I believe this part is neat and efficient; the problem is below.

There is also a fourth table, node_parents, that provides a simple hierarchy for nodes. Each row has two fields: a node ID and the ID of the node that acts as its parent (node_id and parent_id). Some nodes have no children in this database (i.e. the node itself is not listed as a parent in any row of node_parents); these are the nodes I am trying to select.

The additional criterion for these childless nodes is that they have a specific title set up, hence the subquery first selecting from node_titles and then inner joining node_parents. The subquery also has a GROUP BY because some nodes are parents of multiple nodes, so their node_id would otherwise appear multiple times in the results. I should also point out that, because of this, the primary key for node_parents is a combination of node_id and parent_id.

The query:

SELECT  `nodes`.`node_id`,
        `titles`.`title`
FROM `nodes`
INNER JOIN `node_titles`
ON `nodes`.`node_id` = `node_titles`.`node_id`
INNER JOIN `titles`
ON `node_titles`.`title_id` = `titles`.`title_id`
WHERE `nodes`.`node_id` NOT IN
    (
    SELECT `node_titles`.`node_id`
    FROM `node_titles`
    INNER JOIN `node_parents`
    ON `node_titles`.`node_id` = `node_parents`.`parent_id`
    WHERE `node_titles`.`title_id` = 1
    GROUP BY `node_titles`.`node_id`
    )
AND `titles`.`title_id` = 1

Table sizes: nodes = ~32,000, node_titles = ~49,000, titles = 3, node_parents = ~55,000

The query takes around 16 minutes to complete. Can anybody provide any pointers? I've tried profiling the query. It doesn't show any long hangs, but it does repeat this cycle for what looks like every selected row:

| executing                      | 0.000005 |
| Copying to tmp table           | 0.515815 |
| Sorting result                 | 0.000053 |
| Sending data                   | 0.000028 |

I've also tried ditching the subquery and using a LEFT JOIN with a WHERE foo IS NOT NULL, but this still takes a very long time to process: the profiler reports ~180 seconds for 'Copying to tmp table'.

Ultimately I suspect this may be an indexing problem, but either way I'd appreciate answers that don't question the design of the query unless they are pursuing a possible cause of the slowdown (e.g. yes, the titles and nodes do need to be in a many-to-many relationship). Thanks all, and more info on request!

Remove the GROUP BY from the subquery:

SELECT  n.node_id,
        t.title
FROM    nodes n
INNER JOIN
        node_titles nt
ON      nt.node_id = n.node_id
INNER JOIN
        titles t
ON      t.title_id = nt.title_id
WHERE   n.node_id NOT IN
        (
        SELECT  nti.node_id
        FROM    node_titles nti
        INNER JOIN 
                node_parents npi
        ON      npi.parent_id = nti.node_id
        WHERE   nti.title_id = 1
        )

Create the following indexes:

node_titles (node_id, title_id)
titles (title_id)
node_parents (parent_id)
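
As a rough sketch, the indexes above could be created like this. The index names are made up for illustration, and the titles index is left commented out on the assumption (not stated in the question) that title_id is already that table's primary key:

```sql
-- Index names (ix_...) are hypothetical; use whatever fits your naming scheme.
CREATE INDEX ix_node_titles_node_title ON node_titles (node_id, title_id);
CREATE INDEX ix_node_parents_parent ON node_parents (parent_id);

-- Only needed if title_id is not already the primary key of titles (assumption):
-- CREATE INDEX ix_titles_title ON titles (title_id);

-- Check which indexes now exist on the table
SHOW INDEX FROM node_parents;
```

The separate index on parent_id matters because the primary key of node_parents is (node_id, parent_id), and a composite index cannot be used efficiently for lookups on its second column alone.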

Update:

Try this:

SELECT  n.node_id,
        t.title
FROM    nodes n
INNER JOIN
        node_titles nt
ON      nt.node_id = n.node_id
        AND nt.title_id = 1
INNER JOIN
        titles t
ON      t.title_id = nt.title_id
WHERE   n.node_id NOT IN
        (
        SELECT  parent_id
        FROM    node_parents
        )
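
If the NOT IN form is still slow, the same "no children" condition can also be written with NOT EXISTS. This is only a sketch of an alternative formulation, not something measured against the tables in the question, and it assumes an index on node_parents (parent_id) is in place:

```sql
SELECT  n.node_id,
        t.title
FROM    nodes n
INNER JOIN
        node_titles nt
ON      nt.node_id = n.node_id
        AND nt.title_id = 1
INNER JOIN
        titles t
ON      t.title_id = nt.title_id
WHERE   NOT EXISTS
        (
        -- a node is childless when no row names it as a parent
        SELECT  1
        FROM    node_parents np
        WHERE   np.parent_id = n.node_id
        )
```

Because parent_id is part of the primary key and so cannot be NULL, this should return the same rows as the NOT IN version.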

MySQL tends to have trouble with subqueries, in my experience. Try this:

SELECT  n.node_id,
        t.title
FROM    nodes n
INNER JOIN
        node_titles nt
ON      nt.node_id = n.node_id
INNER JOIN
        titles t
ON      t.title_id = nt.title_id
LEFT OUTER JOIN   
        (
        SELECT  nti.node_id
        FROM    node_titles nti
        INNER JOIN 
                node_parents npi
        ON      npi.parent_id = nti.node_id
        WHERE   nti.title_id = 1
        ) ThisTable on n.node_id = ThisTable.node_id
 WHERE ThisTable.node_id is null