
Handling Migration Errors: How Atlas Improves on golang-migrate

· 11 min read
Noa Rogoszinski
DevRel Engineer

Database migrations are fundamental to modern software development, allowing teams to evolve their database schema in a controlled and versioned manner. As applications grow and requirements change, the ability to reliably alter your database is crucial for maintaining data integrity and application stability.

Atlas was originally created to support Ent, a popular Go ORM. From the start, Ent shipped with a simple "auto-migration" feature that could set up the database schema based on the Ent schema. However, as the project grew popular, it became clear that a more robust versioned migration system was needed.

Ent's authors had hoped to add functionality based on the existing "auto-migration" engine to generate migration files, and use an off-the-shelf migration tool to apply them. The most promising candidate was golang-migrate, a widely adopted migration tool in the Go community renowned for its simplicity and wide database support. But we realized that golang-migrate, like many tools that start simple and grow popular, has its limitations, and they led us to expand on its abilities.

In this article, we’ll explore some common challenges teams face with traditional migration tools like golang-migrate, and how Atlas takes a different approach to improve the developer experience.

golang-migrate in Practice

golang-migrate operates by managing pairs of migration files defining sets of schema modifications – one file applies the changes ("up") and the other reverts them ("down"). To implement a database change, developers create new migration files outlining the steps to be taken (e.g. creating a table, altering a column).

To apply migrations, the migrate CLI tool reads these files and applies the pending changes to the target database in sequential order, starting with the first migration after the most recently applied one.
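The ordering described above can be sketched in a few lines of Go. This is a simplified illustration, not golang-migrate's implementation: it assumes file names follow the `{version}_{title}.up.sql` convention, handles only the `.sql` extension, and the file names and the `ParseUp` and `Pending` helpers are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Migration describes one "up" file parsed from a name such as
// 20250403065350_init.up.sql.
type Migration struct {
	Version uint64
	Name    string
}

// ParseUp extracts the version and title from an "up" file name.
// It returns false for "down" files and for names that do not match.
func ParseUp(file string) (Migration, bool) {
	if !strings.HasSuffix(file, ".up.sql") {
		return Migration{}, false
	}
	base := strings.TrimSuffix(file, ".up.sql")
	ver, title, ok := strings.Cut(base, "_")
	if !ok {
		return Migration{}, false
	}
	v, err := strconv.ParseUint(ver, 10, 64)
	if err != nil {
		return Migration{}, false
	}
	return Migration{Version: v, Name: title}, true
}

// Pending returns the "up" migrations newer than current, oldest first.
func Pending(files []string, current uint64) []Migration {
	var out []Migration
	for _, f := range files {
		if m, ok := ParseUp(f); ok && m.Version > current {
			out = append(out, m)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version })
	return out
}

func main() {
	// Hypothetical directory listing with one applied and one pending migration.
	files := []string{
		"20250403134507_add_orders.up.sql",
		"20250403134507_add_orders.down.sql",
		"20250403065350_init.up.sql",
		"20250403065350_init.down.sql",
	}
	for _, m := range Pending(files, 20250403065350) {
		fmt.Println(m.Version, m.Name)
	}
}
```

With the database at version 20250403065350, only the `add_orders` migration is returned, and its "down" counterpart is ignored.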

When migrations are written correctly, this is pretty straightforward, but it's unrealistic to expect that no mistakes will be made. Even the most experienced SQL programmer can make a typo or overlook a potential database lock when writing a migration. When this happens and the migration fails, you may see an error like this:

error: migration failed in line 0: create table t1 (
id int primary key,
); (details: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 3)

After fixing the error, you may intuitively rerun the migration, but now you're met with a new error:

error: Dirty database version 20250403065350. Fix and force version.

This "dirty" state means the last attempted migration didn't finish running and golang-migrate is now stuck. It won't apply future migrations until you manually resolve the issue and reset the version.

The "Dirty" State

Why does golang-migrate add the "dirty" tag to a database when a migration fails? Why not stop at printing the error message?

Before running a migration, golang-migrate sets a dirty flag on the database. If the migration succeeds, the flag is removed. If the migration fails, the flag persists. The failed migration could leave the database in an inconsistent state if at least one statement was successfully applied, so the dirty state prevents any additional migrations from being run on the database while it's in limbo. The intention is to protect the database from further unintended changes until the failed migration is fixed.
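To make the flag's lifecycle concrete, here is a minimal in-memory sketch in Go. It is a model of the behavior described above under simplifying assumptions, not golang-migrate's code: the `State` type, its `Apply` and `Force` methods, and the error values are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDirty stands in for the "Dirty database version" error.
var ErrDirty = errors.New("dirty database: fix and force version")

// errSyntax simulates a failing SQL statement.
var errSyntax = errors.New("syntax error")

// State models the two values persisted per database: a version and a dirty flag.
type State struct {
	Version int
	Dirty   bool
}

// Apply marks the target version dirty, runs the migration, and clears
// the flag only on success. run stands in for executing the SQL file.
func (s *State) Apply(version int, run func() error) error {
	if s.Dirty {
		return ErrDirty // refuse to run anything while the flag is set
	}
	s.Version, s.Dirty = version, true
	if err := run(); err != nil {
		return err // the flag stays set
	}
	s.Dirty = false
	return nil
}

// Force records a version and clears the flag without running any SQL.
func (s *State) Force(version int) {
	s.Version, s.Dirty = version, false
}

func main() {
	var s State
	ok := func() error { return nil }

	fmt.Println(s.Apply(1, func() error { return errSyntax }), s.Dirty)
	fmt.Println(s.Apply(1, ok)) // blocked: the flag is still set
	s.Force(0)                  // manual reset after cleaning up by hand
	fmt.Println(s.Apply(1, ok), s)
}
```

Note that `Force` in this model trusts the caller completely: nothing checks that the database actually matches the version being recorded.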

Possible Solutions

  • Transactions — Many commonly used databases support performing migrations using transactions. Encapsulating each migration within a transaction allows for rolling back all changes made by that migration in the event of an error. However, golang-migrate does not automatically wrap individual migration files in transactions, even for database systems that support them. Consequently, each statement within a migration file is executed independently. If an error occurs during the execution of a statement within a file, the statements that have already been successfully applied remain committed, potentially leaving the database in an inconsistent state.
  • Down migrations — As previously stated, when creating migrations with golang-migrate, the user is prompted to write both "up" and "down" instructions in two separate files. One might assume the "down" file can be applied in this scenario to revert any changes that were made by the "up" file before the error occurred; however, since the database has been tagged "dirty," golang-migrate won't apply any file to it, not even the "down" file.

The only option is to manually fix the migration and reset the "dirty" tag.
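The difference between independent execution and transactional execution can be illustrated with a toy in-memory model. This sketch assumes a database that can roll back schema changes (MySQL, for example, commits DDL statements implicitly, so a transaction would not help there). The `Stmt` and `DB` types are hypothetical stand-ins, not part of any real driver.

```go
package main

import "fmt"

// Stmt is a stand-in for one SQL statement: it either creates a table
// or fails, which is enough to show partial application.
type Stmt struct {
	Creates string
	Fails   bool
}

// DB is a toy in-memory database that only remembers table names.
type DB struct{ Tables []string }

// ExecEach runs statements independently; earlier ones stay applied on failure.
func (db *DB) ExecEach(stmts []Stmt) error {
	for i, s := range stmts {
		if s.Fails {
			return fmt.Errorf("statement %d failed", i+1)
		}
		db.Tables = append(db.Tables, s.Creates)
	}
	return nil
}

// ExecTx runs statements as one unit; on failure the snapshot is restored.
func (db *DB) ExecTx(stmts []Stmt) error {
	snapshot := append([]string(nil), db.Tables...)
	if err := db.ExecEach(stmts); err != nil {
		db.Tables = snapshot // roll back everything this migration did
		return err
	}
	return nil
}

func main() {
	migration := []Stmt{{Creates: "products"}, {Creates: "orders"}, {Fails: true}}

	a, b := &DB{}, &DB{}
	fmt.Println(a.ExecEach(migration), a.Tables) // two tables remain
	fmt.Println(b.ExecTx(migration), b.Tables)   // nothing remains
}
```

In the first case the database is left with two tables that the tool does not consider applied, which is exactly the inconsistent state the "dirty" flag guards against.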

Handling the "Dirty database version" Error

Let's go over the process using an example.

Let's say you create a migration that adds two new tables and sets up foreign key constraints:

CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
description TEXT,
price DECIMAL(10, 2) NOT NULL,
stock INT NOT NULL DEFAULT 0
);

CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_fk VARCHAR(100) NOT NULL,
product_fk VARCHAR(255) NOT NULL
);

ALTER TABLE orders
ADD CONSTRAINT fk_orders_products
FOREIGN KEY (product_fk) REFERENCES products (name) ON DELETE CASCADE;

Trying to apply the migration:

migrate -source file://db/migrations -database "mysql://root:pass@tcp(localhost:3306)/example" up 1

An error occurs:

error: migration failed in line 0: CREATE TABLE products (
# ... redacted for brevity
(details: Error 6125: Failed to add the foreign key constraint. Missing unique key for constraint 'fk_orders_products' in the referenced table 'products')

Let's see what we need to do to fix this.

Step 1: Plan a fix

First, we must locate the error, note any statements that were applied before the failure, and plan a "fix" script to undo these changes.

The first place to look is the error message, which says the error occurred when trying to add a foreign key constraint to the orders table. A foreign key is meant to reference a column with a unique constraint to maintain its integrity, and we created a foreign key that references a column without a unique constraint.

Adding the foreign key is the last statement in the migration file, so we can safely assume that the first two statements were applied successfully. This means that if we want to roll back the database to the previous step, we need to create a fix script that drops the tables that were made:

DROP TABLE orders;

DROP TABLE products;
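The reasoning behind the order of the two statements (undo the most recent change first) can be sketched as a small Go helper. This is an illustrative assumption about how one might derive such a script by hand; the `FixScript` function is hypothetical and only covers tables created by the failed migration.

```go
package main

import (
	"fmt"
	"strings"
)

// FixScript returns DROP TABLE statements for the given tables in reverse
// creation order, so the most recently created table is dropped first.
func FixScript(created []string) string {
	var b strings.Builder
	for i := len(created) - 1; i >= 0; i-- {
		fmt.Fprintf(&b, "DROP TABLE %s;\n", created[i])
	}
	return b.String()
}

func main() {
	// Tables created by the statements that ran before the failure.
	fmt.Print(FixScript([]string{"products", "orders"}))
}
```

A real migration may also alter columns, add indexes, or move data, and each of those needs its own hand-written inverse, which is why this step grows expensive quickly.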

Step 2: Fix the database

Since golang-migrate has the database marked as "dirty," the fix script must be run on the database manually.

There are many ways to run a SQL script against a database. Whichever you choose, you will need direct, privileged access to the production database that you can use ad hoc.

Step 3: Force version

After running the fix script, run migrate force to tell golang-migrate that it can now mark the database as clean again:

migrate -database "mysql://root:pass@tcp(localhost:3306)/example" -path=. force <previous version>

Finally, we are back where we started before ever running the failed migration.

Step 4: Fix the migration

Now let's fix our original migration to add the missing unique constraint to the products table:

CREATE TABLE products
(
id SERIAL PRIMARY KEY,
- name VARCHAR(255) NOT NULL,
+ name VARCHAR(255) UNIQUE NOT NULL,
description TEXT,
price DECIMAL(10, 2) NOT NULL,
stock INT NOT NULL DEFAULT 0
);

Step 5: Rerun the migration

We can now run the migration again:

migrate -source file://db/migrations -database "mysql://root:pass@tcp(localhost:3306)/example" up 1

And this time it should work:

20250403134507/u init (56.108792ms)

Summary

The golang-migrate authors set out to build a simple, general-purpose migration tool that would be easy to develop and extend. As mentioned on the repository's README:

Drivers are "dumb", migrate glues everything together and makes sure the logic is bulletproof. (Keeps the drivers lightweight, too.)

In the words of long-term maintainer dhui on Issue #438:

migrate treats migrations as opaque blobs, so parsing or interpreting the migration is out of scope.

Software engineering is the art of making trade-offs. While golang-migrate's decision to keep the drivers "dumb" and uninvolved has led to its widespread adoption and to the quick development of many drivers, it also means that the tool had to adopt a "fail fast" approach to error handling, as explained in the README:

Database drivers don't assume things or try to correct user input. When in doubt, fail.

The downside of these design decisions becomes evident when things go wrong (which can be quite common with migrations): golang-migrate can actually slow down the migration process. Here are some of the main issues we see:

  • Fragile - The tool enters a "dirty" state when encountering any form of failure and cannot automatically recover from partial failures using rollbacks.
  • Expensive Manual Fixes - The tool requires manual intervention to fix the database state. In our example, we had a migration that was relatively simple and easy to revert, so one can only imagine the implications of a migration failure on a much larger project.
  • Requires Privileged Access - A consequence of the above is that in order to solve day-to-day issues with migrations, you need direct access to the production database and are expected to run one-off, uncommitted code against it to revert changes and force the database into a "clean" state. This is a big no-no in most organizations, and for good reason.

A Different Approach: Atlas

To fill this gap, Atlas was built to be more resilient to migration failures — we wanted to create something that could handle the complexities of large teams, distributed development, and automated deployments.

To bake this into the tool, we took a different, more involved approach to migrations. It was harder to build, and it sure makes it more difficult to add support for new databases, but we think it was worth it. Here are some of the key features of Atlas that help users in more ways than just applying the migrations:

  • Automatic Rollbacks - For databases that support transactional DDL (like PostgreSQL), Atlas will automatically roll back failed migrations.
  • Statement-Level Tracking - Atlas executes migrations one statement at a time and tracks their progress. If a migration partially fails, Atlas knows where it left off and can resume from the last successful statement. No need to roll back or rerun everything from scratch.
  • Dynamic Down Migrations - Atlas's migrate down command does not rely on pre-computed down migrations. Instead, it computes the down migration dynamically based on the current state of the database. This enables Atlas users to easily recover from partial failures without needing to "break glass" and run a manual fix script.
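The idea behind statement-level tracking can be sketched as follows. This is a simplified model written for illustration, not Atlas's implementation: the `Revision` type and the `Resume` function are hypothetical, and the counter lives in memory rather than in a database table.

```go
package main

import (
	"errors"
	"fmt"
)

// errExec simulates a statement that fails on its first attempt.
var errExec = errors.New("exec failed")

// Revision records how many statements of one migration file have been applied.
type Revision struct {
	Applied int
	Total   int
}

// Resume executes the statements not yet applied and advances the counter after
// each success, so a later call continues from the first unapplied statement.
func Resume(rev *Revision, stmts []string, exec func(string) error) error {
	rev.Total = len(stmts)
	for rev.Applied < rev.Total {
		if err := exec(stmts[rev.Applied]); err != nil {
			return fmt.Errorf("statement %d of %d: %w", rev.Applied+1, rev.Total, err)
		}
		rev.Applied++
	}
	return nil
}

func main() {
	stmts := []string{"stmt 1", "stmt 2", "stmt 3"}
	failOnce := true
	exec := func(s string) error {
		if s == "stmt 3" && failOnce {
			failOnce = false
			return errExec
		}
		fmt.Println("executed:", s)
		return nil
	}

	var rev Revision
	fmt.Println(Resume(&rev, stmts, exec), rev) // stops at the third statement
	fmt.Println(Resume(&rev, stmts, exec), rev) // reruns only the third statement
}
```

Because progress is recorded per statement, the second call does not repeat the first two statements, so no manual cleanup is needed before retrying.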

To learn more about Atlas's error handling and recovery features, check out our blog post about troubleshooting migrations and the Atlas Docs.

The Atlas Experience

Atlas was built to be a developer-friendly migration tool based on modern DevOps practices. Beyond applying migrations, Atlas offers features that ease the schema migration process, which tends to get more complicated as projects and teams grow:

  • Schema as Code - Atlas allows you to define your database schema in a declarative way using HCL, SQL, or your favorite ORM.
  • Automatic Planning - Atlas automatically plans migrations based on the current state of the database and the desired state. By calculating the diff between the two, Atlas takes away the need to manually write migration scripts.
  • Automatic Code Review - Atlas automatically lints and tests migrations before applying them. This helps catch errors early and ensures that migrations are safe to apply.
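Planning by diffing two states can be shown with a deliberately tiny model. The sketch below assumes a schema is nothing more than a map from table name to its CREATE statement. The `Plan` function is hypothetical and far simpler than a real planner, which also has to compare columns, indexes, and constraints and order statements by dependency.

```go
package main

import (
	"fmt"
	"sort"
)

// Plan compares the tables that exist with the tables that should exist and
// returns the statements that move the former to the latter.
func Plan(current, desired map[string]string) []string {
	var plan []string
	for name := range current {
		if _, ok := desired[name]; !ok {
			plan = append(plan, "DROP TABLE "+name+";")
		}
	}
	for name, ddl := range desired {
		if _, ok := current[name]; !ok {
			plan = append(plan, ddl)
		}
	}
	sort.Strings(plan) // map iteration order is random; sort for stable output
	return plan
}

func main() {
	current := map[string]string{
		"products": "CREATE TABLE products (id INT PRIMARY KEY);",
		"legacy":   "CREATE TABLE legacy (id INT PRIMARY KEY);",
	}
	desired := map[string]string{
		"products": "CREATE TABLE products (id INT PRIMARY KEY);",
		"orders":   "CREATE TABLE orders (id INT PRIMARY KEY);",
	}
	for _, stmt := range Plan(current, desired) {
		fmt.Println(stmt)
	}
}
```

The same comparison works in the other direction: passing an earlier schema as `desired` yields the statements needed to go back to it, which is the intuition behind computing a down migration from the database's current state.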

Wrapping Up

golang-migrate is great when migrations run smoothly, but when things break, it often leaves you with more work and lost time.

Atlas takes a different approach: transactional safety, statement-level tracking, and dynamic rollbacks help you recover from failures gracefully and keep moving forward. It's built for modern teams, CI/CD pipelines, and driven developers who want to spend as little time debugging migrations as possible.

Check it out at atlasgo.io.


As always, we would love to hear your feedback and suggestions on our Discord server.