Table of Contents

The Problem

Why it is a Problem

InnoDB and undo



Deleting in Chunks

InnoDB Chunking Recommendation

Iterating through a compound key

Reclaiming the disk space

Deleting more than half a table

Non-deterministic Replication

Replication and KILL

SBR vs RBR; Galera

Optimal reLOAD of a table


Brought to you by Rick James

The Problem

How to batch DELETE lots of rows from a large table? Here is an example of purging items older than 30 days:
If a table has a large number of rows, say millions, this statement may take minutes, maybe hours.

Any suggestions on how to speed this up?

Why it is a Problem

    ⚈  MyISAM will lock the table during the entire operation, thereby nothing else can be done with the table.
    ⚈  InnoDB won't lock the table, but it will chew up a lot of resources, leading to sluggishness.
    ⚈  InnoDB has to write the undo information to its transaction logs; this significantly increases the I/O required.
    ⚈  Replication, being asynchronous, will effectively be delayed (on Replicas) while the DELETE is running.

InnoDB and undo

To be ready for a crash, a transactional engine such as InnoDB will record what it is doing to a log file. To make that somewhat less costly, the log file is sequentially written. If the log files you have (there are usually 2) fill up because the delete is really big, then the undo information spills into the actual data blocks, leading to even more I/O.

Deleting in chunks avoids some of this excess overhead.

Limited benchmarking of total delete elapsed time shows two observations:

    ⚈  Total delete time approximately doubles above some 'chunk' size (versus below that threshold). I do not have a formula relating the log file size with the threshold cutoff.
    ⚈  Chunk size below several hundred rows is slower. This is probably because the overhead of starting/ending each chunk dominates the timing.


    ⚈  PARTITION -- Requires 5.1 and some careful setup, but is excellent for purging a time-base series.
    ⚈  DELETE in chunks -- Carefully walk through the table N rows at a time.


The idea here is to have a sliding window of partitions. Let's say you need to purge news articles after 30 days. The "partition key" would be the datetime (or timestamp) that is to be used for purging, and the PARTITIONs would be BY RANGE. Every night, a cron job would come along and decide whether to build a new partition for the next day, and drop the oldest partition.

Dropping a partition is essentially instantaneous, much faster than deleting that many rows. However, you must design the table so that the entire partition can be dropped. That is, you cannot have some items in a partition living longer than others.

PARTITION tables have a lot of restrictions, some are rather weird. You can either have no UNIQUE (or PRIMARY) key on the table, or every UNIQUE key must include the partition key. In this use case, the partition key is the datetime. It should not be the first part of the PRIMARY KEY (if you have a PRIMARY KEY).

You can PARTITION InnoDB tables. (Before Version 8.0, you could also partition MyISAM tables.)

Since two news articles could have the same timestamp, you cannot assume the partition key is sufficient for uniqueness of the PRIMARY KEY, so you need to find something else to help with that.

Reference implementation for Partition maintenance

PARTITIONing requires MySQL 5.1. MySQL docs on PARTITION

Deleting in Chunks

Although the discussion in this section talks about DELETE, it can be used for any other "chunking", such as, say, UPDATE, or SELECT plus some complex processing.

(This discussion applies to both MyISAM and InnoDB.)

When deleting in chunks, be sure to avoid doing a table scan. Also be sure to avoid OFFSET and LIMIT. The code below is good at that; it scans no more than 1001 rows in any one query. (The 1000 is tunable. If you have a very small innodb_buffer_pool_size, it should be tuned downward.)

Assuming you have news articles that need to be purged, and you have a schema something like
      ts TIMESTAMP,
      PRIMARY KEY(id)
Then, this pseudo-code is a good way to delete the rows older than 30 days:
   @a = 0
      DELETE FROM tbl
         WHERE id BETWEEN @a AND @a+999
      SET @a = @a + 1000
      sleep 1  -- be a nice guy
   UNTIL end of table
Notes (Most of these caveats will be covered later):
    ⚈  It uses the PK instead of the secondary key. This gives much better locality of disk hits, especially for InnoDB.
    ⚈  You could (should?) do something to avoid walking through recent days but doing nothing. Caution -- the code for this could be costly.
    ⚈  The 1000 should be tweaked so that the DELETE usually takes under, say, one second.
    ⚈  No INDEX on ts is needed. (This helps INSERTs a little.)
    ⚈  If your PRIMARY KEY is compound, the code gets messier. (a fix is below)
    ⚈  This code will not work without a numeric PRIMARY or UNIQUE key. (a fix is below)
    ⚈  Read on, we'll develop messier code to deal with most of these caveats.

If there are big gaps in id values (and there will after the first purge), then
   @a = SELECT MIN(id) FROM tbl
      SELECT @z := id FROM tbl WHERE id >= @a ORDER BY id LIMIT 1000,1
      If @z is null
         exit LOOP  -- last chunk
      DELETE FROM tbl
         WHERE id >= @a
           AND id <  @z
      SET @a = @z
      sleep 1  -- be a nice guy, especially in replication
   # Last chunk:
      WHERE id >= @a
That code works whether id is numeric or character, and it mostly works even if id is not UNIQUE. With a non-unique key, the risk is that you could be caught in a loop whenever @z==@a. That can be detected and fixed thus:
      SELECT @z := id FROM tbl WHERE id >= @a ORDER BY id LIMIT 1000,1
      If @z == @a
         SELECT @z := id FROM tbl WHERE id > @a ORDER BY id LIMIT 1
The drawback is that there could be more than 1000 items with a single id. In most practical cases, that is unlikely.

If you do not have a primary (or unique) key defined on the table, and you have an INDEX on ts, then consider
      DELETE FROM tbl
         ORDER BY ts   -- to use the index, and to make it deterministic
         LIMIT 1000
   UNTIL no rows deleted
This technique is NOT recommended because the LIMIT leads to a warning on replication about it being non-deterministic (discussed below).

A use case

InnoDB Chunking Recommendation

    ⚈  Have a 'reasonable' size for innodb_log_file_size.
    ⚈  Use AUTOCOMMIT=1 for the session doing the deletions.
    ⚈  Pick about 1000 rows for the chunk size.
    ⚈  Adjust the row count down if asynchronous replication (Statement Based) causes too much delay on the Replicas or hogs the table too much.

Iterating through a compound key

To perform the chunked deletes recommended above, you need a way to walk through the PRIMARY KEY. This can be difficult if the PK has more than one column in it.

To efficiently to do compound 'greater than':

Assume that you left off at ($g, $s) (and have handled that row):
   INDEX(Genus, species)
      WHERE Genus >= '$g' AND ( species  > '$s' OR Genus > '$g' )
      ORDER BY Genus, species
      LIMIT ...
Addenda: The above AND/OR works well in older versions of MySQL; this works better in newer versions:
      WHERE ( Genus = '$g' AND species  > '$s' ) OR Genus > '$g' )

A caution about using @variables for strings. If, instead of '$g', you use @g, you need to be careful to make sure that @g has the same CHARACTER SET and COLLATION as Genus, else there could be a charset/collation conversion on the fly that prevents the use of the INDEX. Using the INDEX is vital for performance. It may require a COLLATE clause on SET NAMES and/or the @g in the SELECT.

Do not use "Row constructors" until you are sure that the Optimizer optimizes them: WHERE (Genus, species) > ($g, $s)

Reclaiming the disk space

Note: Reclaiming disk space may not be necessary. After all, tomorrow's INSERTs will simply reuse the free space in the table.

MyISAM leaves gaps in the table (.MYD file); OPTIMIZE TABLE will reclaim the freed space after a big delete. But it may take a long time and lock the table.

InnoDB is block-structured, organized in a BTree on the PRIMARY KEY. An isolated deleted row leaves a block less full. A lot of deleted rows can lead to coalescing of adjacent blocks. (Blocks are normally 16KB.)

In InnoDB, there is no practical way to reclaim the freed space from ibdata1, other than to reuse the freed blocks eventually.

If you have innodb_file_per_table = 0, the only option is to dump ALL tables, remove ibdata*, restart, and reload. That is rarely worth the effort and time.

InnoDB, even with innodb_file_per_table = 1, OPTIMIZE TABLE will give space back to the OS, but you do need enough disk space for two copies of the table during the action.

Deleting more than half a table

The following technique can be used for any combination of
    ⚈  Deleting a large portion of the table more efficiently
    ⚈  Add PARTITIONing
    ⚈  Converting to innodb_file_per_table = ON
    ⚈  Defragmenting

This can be done by chunking, or (if practical) all at once:
   -- Optional:  SET GLOBAL innodb_file_per_table = ON;
   -- Do this INSERT..SELECT all at once, or with chunking:
      SELECT * FROM Main
         WHERE ...;  -- just the rows you want to keep
   RENAME TABLE main TO Old, New TO Main;
   DROP TABLE Old;   -- Space freed up here
    ⚈  You do need enough disk space for both copies.
    ⚈  You must not write to the table during the process. (Changes to Main may not be reflected in New.)
    ⚈  FOREIGN KEYs are likely to cause trouble.
    ⚈  TRIGGERs are likely to cause trouble.

See also pt-online-schema-change.

Non-deterministic Replication

Any UPDATE, DELETE, etc with LIMIT that is replicated to Replicas (via Statement Based Replication) may cause inconsistencies between the Master and Replicas. This is because the actual order of the records discovered for updating/deleting may be different on the Replica, thereby leading to a different subset being modified. To be safe, add ORDER BY to such statements. Moreover, be sure the ORDER BY is deterministic -- that is, the fields/expressions in the ORDER BY are unique.

An example of an ORDER BY that does not quite work: Assume there are multiple rows for each 'date':
   DELETE * FROM tbl ORDER BY date LIMIT 111
Given that id is the PRIMARY KEY (or UNIQUE), this will be safe:
   DELETE * FROM tbl ORDER BY date, id LIMIT 111
Unfortunately, even with the ORDER BY, MySQL has a deficiency that leads to a bogus warning in mysqld.err. See
Spurious "Statement is not safe to log in statement format." warnings

Some of the above code avoids this spurious warning by doing
   SELECT @z := ... LIMIT 1000,1;  -- not replicated
   DELETE ... BETWEEN @a AND @z;   -- deterministic
That pair of statements guarantees no more than 1000 rows are touched, not the whole table.

Replication and KILL

If you KILL a DELETE (or any? query) on the Master in the middle of its execution, what will be Replicated?

If it is InnoDB, the query should be rolled back. (Exceptions??)

In MyISAM, rows are DELETEd as the statement is executed, and there is no provision for ROLLBACK. Some of the rows will be deleted, some won't. You probably have no clue of how much was deleted. In a single server, simply run the delete again. The delete is put into the binlog, but with error 1317. Since Replication is supposed to keep the Master and Replica in sync, and since it has no clue of how to do that, Replication stops and waits for manual intervention. In a HA (High Available) system using Replication, this is a minor disaster. Meanwhile, you need to go to each Replica(s) and verify that it is stuck for this reason, then do
Then (presumably) reexecuting the DELETE will finish the aborted task.

(That is yet another reason to move all your tables from MyISAM to InnoDB.)

SBR vs RBR; Galera

"Row Based Replication" implies that the rows to be deleted are written to the binlog. The bigger the rows, and the more rows that you delete in a single "chunk", the more replication will be impacted. The suggestion of "1000" rows per chunks may need to be adjusted. The tradeoff is between how soon all the chunks are finished versus how much impact each chunk has on other things going on in replication.

If the task is to "purge old data", then speed of completion is probably not important.

Optimal reLOAD of a table

Suppose you need to repeatedly reload a table with fresh data, such as data provided from the outside.

You have a table called `real; the following will replace it with a new table containing the fresh data.
   CREATE TABLE t_new LIKE real;
   LOAD DATA INFILE new ...;
   RENAME TABLE real TO t_old,
                t_new TO real;
   DROP TABLE t_old;


    ⚈  The LOAD DATA step can be replaced by whatever process you have for importing the data.
    ⚈  The Loading is the only slow step.
    ⚈  The RENAME is atomic, so real always exists.
    ⚈  You may choose to delay the DROP in case the new data might be bad and you want to revert.
    ⚈  FOREIGN KEYs can be a hassle; it might be good not to have such.

More discussion, including some suggested privileges:
Create and swap (stackoverflow)


The tips in this document apply to MySQL, MariaDB, and Percona.

Similar - from OAK

Percona's package to do big deletes. Caution: using "--where" may be more invasive than walking through the PK as discussed above. pt-archiver

Anecdote: 2 hours vs 5 days

Stackoverflow example

Posted: 2010;   Refreshed: June, 2015;   Minor Refresh: Sep, 2017
-- Rick James

MySQL Documents by Rick James

HowTo Techniques for Optimizing Tough Tasks:

Partition Maintenance (DROP+REORG) for time series (includes list of PARTITION uses)

Big DELETEs - how to optimize -- and other chunking advice, plus a use for PARTITIONing
    Chunking lengthy DELETE/UPDATE/etc.

Data Warehouse techniques:
    Data Warehouse Overview   Summary Tables   High speed ingestion   Bulk Normalization  

Schema and code design for large Sensor database

Entity-Attribute-Value (EAV) -- a common, poorly performing, design pattern; plus an alternative

5 methods for 'Find Nearest'

Lat/Lng search to Find the nearest 10 pizza parlors
    Lat/Long representation choices

Z-Order 'find nearest'

Pagination, not with OFFSET, LIMIT

Techniques on efficiently finding a random row (On beyond ORDER BY RAND())

GUID/UUID Performance (type 1 only)

IP Range Table Performance -- or other disjoint ranges

Rollup Unique User Counts

Alter of a Huge table -- Mostly obviated by 5.6

Efficient List of Latest 10 news articles

Build and execute a Pivot SELECT (showing rows as columns)

(Groupwise Max): Efficiently find largest row(s) for each group

Other Tips, Tuning, Debugging, Optimizations, etc...

Rick's RoTs (Rules of Thumb -- lots of tips)

Datatypes and building a good schema

Memory Allocation (caching, etc)

Character Set and Collation problem solver
    Trouble with UTF-8   If you want case folding, but accent sensitivity, please file a request at .
    Python tips,   PHP tips,   other language tips
    utf8 Collations   utf8mb4 Collations on 8.0

Converting from MyISAM to InnoDB -- includes differences between them

Compound INDEXes plus other insights into the mysteries of INDEXing

Cookbook for Creating Indexes
    Many-to-many mapping table   Handler counts   wp_postmeta   UNION+OFFSET

MySQL Limits -- built-in hard limits
    767-byte INDEX limit

Galera, tips on converting to (Percona XtraDB Cluster, MariaDB 10, or manually installed)

5.7's Query Rewrite -- perhaps 5.7's best perf gain, at least for this forum's users

Analyze MySQL Performance
    Analyze VARIABLEs and GLOBAL STATUS     Analyze SlowLog

My slides from conferences
MiniFest 2021 - Rick James & Daniel Black - Answering on Stack Overflow(+comments) - MariaDB Frontlines
Percona Live 4/2017 - Rick's RoTs (Rules of Thumb) - MySQL/MariaDB
Percona Live 4/2017 - Index Cookbook - MySQL/MariaDB
Percona Live 9/2015 - PARTITIONing - MySQL/MariaDB

Contact me via LinkedIn; be sure to include a brief teaser in the Invite request:   View Rick James's profile on LinkedIn

Did my articles help you out? Like what you see? Consider donating:

☕️ Buy me a Banana Latte and bagel ($10) There is no obligation but it would put a utf8mb4 smiley 🙂 on my face, instead of the Mojibake "🙂"