PostgreSQL · AWS · Database Migration · SRE

Migrations Will Humble You: The Postgres Sequence Trap at Cutover

Ankit Bhardwaj

Database migrations are humbling. No matter how many you've done, they find a new way to get you.

We recently migrated from Aurora → RDS. We had a checklist. We had experience. We thought we nailed it.

We did not nail it.

Everything worked — that was the trap

We used pgcopydb to clone the database. Worked perfectly. Then we set up logical replication to keep Aurora and RDS in sync during the cutover window. Also worked perfectly.

Too perfectly. That smooth replication was hiding a landmine.

The sneaky part about logical replication

When logical replication copies rows, it inserts them with their IDs already baked in:

INSERT INTO my_table (id, ...) VALUES (1001, ...);
INSERT INTO my_table (id, ...) VALUES (1002, ...);

It writes the rows in directly and never calls nextval(). Logical replication carries table rows, not sequence state, so the sequence on the target has no idea those IDs exist. The data lands, but the sequence stays exactly where it was.

Think of it like someone manually writing page numbers into a book by hand, while the page-numbering machine is still parked at page 1000. The pages are there; the machine just never advanced.
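To make the mechanism concrete, here is a minimal sketch you can run on a scratch PostgreSQL database. The table demo_items and its columns are hypothetical names made up for illustration. The explicit-ID inserts stand in for what the replication apply process does on the target.

```sql
-- Hypothetical demo table; bigserial creates the sequence demo_items_id_seq.
CREATE TABLE demo_items (
    id   bigserial PRIMARY KEY,
    name text NOT NULL
);

-- Normal application insert: id comes from nextval(), so the sequence advances to 1.
INSERT INTO demo_items (name) VALUES ('from the app');

-- Replication-style inserts: ids are supplied explicitly, nextval() is never called.
INSERT INTO demo_items (id, name) VALUES (2, 'replicated row');
INSERT INTO demo_items (id, name) VALUES (3, 'replicated row');

-- The table has moved on, the sequence has not.
SELECT max(id) FROM demo_items;            -- 3
SELECT last_value FROM demo_items_id_seq;  -- 1

-- The next default insert asks the sequence for 2, which already exists,
-- so it fails with a duplicate key error on demo_items_pkey.
INSERT INTO demo_items (name) VALUES ('from the app again');
```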

The failure, right on cue at cutover

At cutover, the app starts writing to RDS and asks for a new ID:

App: "give me the next ID"
Sequence: "sure, here's 1001 👍"
Database: "1001 ALREADY EXISTS 💥"

Every insert that relied on the sequence collided with a row logical replication had already copied. Celery tasks started crashing. SQS retries piled up. Pure chaos — and all of it surfacing in production, because that's the only place the app actually started writing through the sequence.
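This kind of collision can be spotted before cutover by comparing a sequence with the table it feeds. A rough sketch, run on the target database, assuming a table called my_table with an id column backed by my_table_id_seq (both placeholder names, not from our actual schema):

```sql
-- Compare where the sequence is with where the data is.
SELECT
    s.last_value                   AS sequence_last_value,
    s.is_called                    AS sequence_is_called,
    (SELECT max(id) FROM my_table) AS table_max_id
FROM my_table_id_seq AS s;
```

If table_max_id is at or beyond the value the sequence will hand out next, the first sequence-driven insert after cutover will hit a duplicate key.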

The fix (simple, in hindsight — isn't it always)

Reset the sequence to the real maximum ID already in the table:

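-- Reset the sequence to the actual max ID in the table (my_table is a placeholder name).
-- On an empty table MAX(id) is NULL, so fall back to 1 and mark the sequence as not yet called.
SELECT setval('my_table_id_seq', COALESCE((SELECT MAX(id) FROM my_table), 1),
              EXISTS (SELECT 1 FROM my_table));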

Two things matter here:

  • Order. Run it after replication stops and before the first app write. In between is the only safe window.
  • Scope. Do it for every sequence, not just the one that happened to blow up first. The one you miss is the one that pages you at 2am.
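For the "every sequence" part, one possible approach is to let PostgreSQL generate the reset statements instead of writing them by hand. The sketch below is not the exact script we ran. It assumes every sequence you care about is owned by a serial or identity column (which is what pg_get_serial_sequence can find) and that sequences are ordinary ascending ones starting at 1. Sequences used only through a hand-written nextval() default would be missed and need separate handling.

```sql
-- Generates one setval() statement per serial/identity column outside the system schemas.
-- Review the output, then run it on the target after replication has stopped.
SELECT format(
           'SELECT setval(%L, COALESCE((SELECT max(%I) FROM %I.%I), 1), EXISTS (SELECT 1 FROM %I.%I));',
           pg_get_serial_sequence(format('%I.%I', c.table_schema, c.table_name), c.column_name::text),
           c.column_name,
           c.table_schema, c.table_name,
           c.table_schema, c.table_name
       ) AS reset_statement
FROM information_schema.columns AS c
JOIN information_schema.tables AS t
  ON t.table_schema = c.table_schema
 AND t.table_name = c.table_name
WHERE t.table_type = 'BASE TABLE'
  AND c.table_schema NOT IN ('pg_catalog', 'information_schema')
  AND pg_get_serial_sequence(format('%I.%I', c.table_schema, c.table_name), c.column_name::text) IS NOT NULL
ORDER BY c.table_schema, c.table_name, c.column_name;
```

In psql, ending the query with \gexec instead of a semicolon executes each generated statement directly. Reviewing the generated list first is the safer option during a cutover.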

The real lesson

Migrations aren't hard because the tools are bad — pgcopydb and logical replication did exactly what they promised. They're hard because every system has one quirk you won't discover until cutover, when real writes start flowing through paths your dry runs never exercised.

So: add "reset ALL sequences" to your migration checklist. You're welcome.


About the Author

Ankit Bhardwaj

Site Reliability Engineer with 12+ years in software engineering and 4+ years operating production cloud infrastructure on AWS and Kubernetes. Currently running six Kubernetes clusters at 99.99% uptime.
