Cleaning up orphaned records in MySQL
I needed to clean out some orphaned records from a MySQL database today. I thought this would be as simple as
DELETE FROM subTable WHERE ID IN (
SELECT subTable.ID from subTable
LEFT OUTER JOIN parentTable ON subTable.parentID = parentTable.ID
WHERE parentTable.ID IS NULL
)
But this resulted in an error:
ERROR 1093 (HY000): You can't specify target
table 'subTable' for update in FROM clause
In MySQL, you can’t delete (or update) a table when you are using that table in a subselect. Here is one work around (I’m sure there are many):
CREATE TEMPORARY TABLE deleteids AS (
SELECT subTable.ID from subTable
LEFT OUTER JOIN parentTable ON subTable.parentID = parentTable.ID
WHERE parentTable.ID IS NULL
);
DELETE FROM subTable WHERE ID IN (SELECT ID FROM deleteids);
DROP TABLE deleteids;
If you don’t explicitly drop your temporary table, it gets dropped automatically when you close your MySQL session.
This worked great for all the tables I needed clean up, except for one. This table has 150,000 records, with 10,000 of them being abandoned. The above solution created a query with an IN () statement containing 10,000 values!
Even when I broke it down to running 100 at a time, it was still running about 1 delete per second, and I’m sure it gets worse if you try to do more than 100 at a time. MySQL (And SQL Server as well) is not very efficient with IN() clauses. Thankfully, most (all?) subselects can be rewritten to use a join. In fact doing this way, I was able to eliminate the temp table.
DELETE subTable FROM subTable LEFT OUTER JOIN parentTable ON
subTable.parentID = parentTable.ID WHERE parentTable.ID IS NULL
This deleted all 10,000 records in two seconds.