Automatic Google Cloud Spanner Schema Migrations with Atlas
Spanner is a fully managed, horizontally scalable, globally distributed, and strongly consistent database service offered by Google Cloud. It is designed to handle large-scale applications with high availability and low latency.
However, managing a large database schema with Spanner can be challenging due to the complexity of related data structures and the need for coordinated schema changes across multiple teams and applications.
Enter: Atlas
Atlas helps developers manage their database schema as code, abstracting away the intricacies of database schema management. With Atlas, users provide the desired state of the database schema and Atlas automatically plans the required migrations.
In this guide, we will dive into setting up Atlas for Spanner schema migrations, and introduce the different workflows available.
Prerequisites
- Docker
- A Google Cloud environment set up on your machine
- Atlas installed on your machine (instructions per platform below)

macOS + Linux

To download and install the custom release of the Atlas CLI, simply run the following in your terminal:

curl -sSf https://atlasgo.sh | ATLAS_FLAVOR="spanner" sh

Docker

To pull the Atlas image and run it as a Docker container:

docker pull arigaio/atlas-extended
docker run --rm arigaio/atlas-extended --help

If the container needs access to the host network or a local directory, use the --net=host flag and mount the desired directory:

docker run --rm --net=host \
  -v $(pwd)/migrations:/migrations \
  arigaio/atlas-extended migrate apply \
  --url "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database"

Windows

Download the custom release and move the atlas binary to a file location on your system PATH.
- An Atlas Pro account. The Spanner driver requires one; log in by running:
$ atlas login
Creating a Spanner Instance and Database
If you want to use an existing Spanner instance and database, you can skip this step.
Let's start off by spinning up a Spanner instance using the Google Cloud Console or the gcloud command line tool.
gcloud spanner instances create my-instance \
--config=regional-us-central1 \
--description="My Spanner Instance" \
--nodes=1
Next, create a Spanner database within the instance:
gcloud spanner databases create my-database \
--instance=my-instance \
--ddl="CREATE TABLE users (id INT64 NOT NULL, email STRING(255), display_name STRING(255)) PRIMARY KEY(id);"
This command creates a new Spanner database named my-database with a users table.
Inspecting the Schema
The atlas schema inspect command supports reading the database description provided by a URL and outputting it in
different formats, including Atlas DDL (default), SQL, and JSON. In this guide, we will
demonstrate the flow using both the Atlas DDL and SQL formats, as the JSON format is often used for processing the
output using jq.
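To give a feel for that jq workflow, here is a sketch that pipes a simplified stand-in for the inspect output through jq to list table names. The JSON shape below is illustrative only; consult the actual output of atlas schema inspect --format '{{ json . }}' for the exact structure.

```shell
# Illustrative only: a simplified stand-in for the real JSON inspect output.
echo '{"schemas":[{"name":"default","tables":[{"name":"users"},{"name":"posts"}]}]}' \
  | jq -r '.schemas[].tables[].name'
```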
- Atlas DDL (HCL)
- SQL
To inspect our Spanner database, use the -u flag and write the output to a file named schema.hcl:
atlas schema inspect -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
> schema.hcl
Open the schema.hcl file to view the Atlas schema that describes our database.
table "users" {
schema = schema.default
column "id" {
null = false
type = INT64
}
column "email" {
null = true
type = STRING(255)
}
column "display_name" {
null = true
type = STRING(255)
}
primary_key {
columns = [column.id]
}
}
schema "default" {
}
To inspect our Spanner database, use the -u flag and write the output to a file named schema.sql:
atlas schema inspect -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
--format '{{ sql . }}' > schema.sql
Open the schema.sql file to view the inspected SQL schema that describes our database.
-- Create "users" table
CREATE TABLE `users` (
`id` INT64 NOT NULL,
`email` STRING(255),
`display_name` STRING(255)
) PRIMARY KEY (`id`);
For in-depth details on the atlas schema inspect command, covering aspects like inspecting specific schemas, handling multiple schemas concurrently, excluding tables, and more, refer to our documentation.
Declarative Migrations
Atlas supports a workflow called declarative schema migrations. In this workflow, you first define the desired state of your database schema (in one of many supported formats and languages). Then, you let Atlas calculate the diff between the desired state and the actual state of your database. Atlas then generates the SQL commands that will bring your database to the desired state.
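As a loose analogy (this is not how Atlas works internally; its diffing is schema-aware, not textual), a plain-text diff of two hypothetical schema files conveys the idea of comparing current state against desired state:

```shell
# Not Atlas's actual diffing, just an illustration of "current vs. desired".
printf 'CREATE TABLE users (id INT64 NOT NULL) PRIMARY KEY (id);\n' > /tmp/current.sql
printf 'CREATE TABLE users (id INT64 NOT NULL) PRIMARY KEY (id);\nCREATE TABLE posts (id INT64 NOT NULL) PRIMARY KEY (id);\n' > /tmp/desired.sql
diff /tmp/current.sql /tmp/desired.sql || true  # diff exits non-zero when files differ
```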
Let's see this in action.
First, create a new file named schema.sql. This file will contain the desired state of our database in plain SQL.
-- Create "users" table
CREATE TABLE `users` (
`id` INT64 NOT NULL,
`email` STRING(255),
`display_name` STRING(255)
) PRIMARY KEY (`id`);
-- Create the posts table with a custom name for the FK constraint
CREATE TABLE `posts` (
`id` INT64 NOT NULL,
`title` STRING(255),
`body` STRING(MAX),
`author_id` INT64,
CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`)
) PRIMARY KEY (`id`);
Applying our schema
Next, let's apply this schema to our database using the atlas schema apply command.
atlas schema apply -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
--to file://schema.sql \
--dev-url "docker://spanner/latest"
If your Spanner database uses the PostgreSQL dialect, use docker://spannerpg/latest as the dev-url:
atlas schema apply -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
--to file://schema.sql \
--dev-url "docker://spannerpg/latest"
Atlas automatically detects the dialect of your target database and uses the appropriate syntax.
Atlas will connect to our target database to inspect its current state. Next, it will use the dev-database to normalize our schema and generate the SQL commands that will bring our database to the desired state:
Planning migration statements (1 in total):
-- create "posts" table:
-> CREATE TABLE `posts` (
`id` INT64 NOT NULL,
`title` STRING(255),
`body` STRING(2621440),
`author_id` INT64,
CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`) ON DELETE NO ACTION
) PRIMARY KEY (`id`)
-------------------------------------------
Analyzing planned statements (1 in total):
-- no diagnostics found
-------------------------
-- 119.9685ms
-- 1 schema change
-------------------------------------------
? Approve or abort the plan:
▸ Approve and apply
Abort
After applying the schema, Atlas confirms that the changes were applied:
-------------------------------------------
Applying approved migration (1 statement in total):
-- create "posts" table
-> CREATE TABLE `posts` (
`id` INT64 NOT NULL,
`title` STRING(255),
`body` STRING(2621440),
`author_id` INT64,
CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`) ON DELETE NO ACTION
) PRIMARY KEY (`id`)
-- ok (13.408836s)
-------------------------
-- 13.408947041s
-- 1 migration
-- 1 sql statement
Next, let's re-run the atlas schema apply command. This time, Atlas will detect that the database is
already in the desired state and will not generate any changes:
Schema is synced, no changes to be made
Altering our schema
Now, let's make some changes to our schema. Open the schema.sql file and add a new column to the users table:
CREATE TABLE `users` (
  `id` INT64 NOT NULL,
  `email` STRING(255),
  `display_name` STRING(255),
+ `bio` STRING(1024) -- New column for user biography
) PRIMARY KEY (`id`);
Next, let's re-run the atlas schema apply command. This time, Atlas will detect that the schema has changed
and will generate the needed SQL commands to bring the database to the desired state:
Planning migration statements (1 in total):
-- add column "bio" to table "users":
-> ALTER TABLE `users` ADD COLUMN `bio` STRING(1024)
-------------------------------------------
Analyzing planned statements (1 in total):
-- no diagnostics found
-------------------------
-- 120.611709ms
-- 1 schema change
-------------------------------------------
? Approve or abort the plan:
▸ Approve and apply
Abort
After applying the changes, Atlas confirms once again that the changes were applied:
-------------------------------------------
Applying approved migration (1 statement in total):
-- add column "bio" to table "users"
-> ALTER TABLE `users` ADD COLUMN `bio` STRING(1024)
-- ok (17.219869042s)
-------------------------
-- 17.219887625s
-- 1 migration
-- 1 sql statement
Visualizing our schema
One of the most useful features of Atlas is the ability to visualize your database schema. To do so, run the
atlas schema inspect command with the -w (web) flag:
atlas schema inspect -w -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database"
Atlas will ask whether you would like to create your visualization publicly (in a publicly accessible URL) or privately (in your Atlas Cloud account):
? Where would you like to share your schema visualization?:
▸ Publicly (gh.atlasgo.cloud)
Privately (giautm.atlasgo.cloud)
For this demo, let's choose the public option. Atlas will create the visualization and open it in your default browser:
See it for yourself at: https://gh.atlasgo.cloud/explore/0256b374
Pushing schemas to Atlas
Similar to how Docker images are pushed to Docker Hub, you can push your schema to Atlas Cloud for versioning, collaboration, and deployment:
atlas schema push app \
--dev-url "docker://spanner/latest"
Once pushed, Atlas prints a URL to the schema. You can then apply it to any database using the schema URL:
atlas schema apply \
-u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
--to "atlas://app" \
--dev-url "docker://spanner/latest"
This workflow allows you to manage your schema centrally and deploy it to multiple environments without having the schema files locally.
For more advanced workflows, you can use atlas schema plan to pre-plan and review migrations
before applying them. This enables teams to plan, lint, and review changes during the PR stage, edit generated
SQL if needed, and ensure no human intervention is required during deployment.
Versioned Migrations
Alternatively, the versioned migration workflow, sometimes called "change-based migrations", allows each change to the database schema to be checked in to source control and reviewed during code review. Users still benefit from Atlas intelligently planning migrations for them; however, the migrations are not applied automatically.
Creating the first migration
In the versioned migration workflow, our database state is managed by a migration directory. The migration directory holds all of the migration files created by Atlas, and the sum of all files in lexicographical order represents the current state of the database.
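Because the file names Atlas generates begin with a timestamp, lexicographical order and chronological order coincide. A quick illustration, using hypothetical file names:

```shell
# Simulated migration directory: timestamped names sort chronologically.
mkdir -p /tmp/demo_migrations
touch /tmp/demo_migrations/20240221153232_initial.sql
touch /tmp/demo_migrations/20240305101500_add_posts.sql
ls /tmp/demo_migrations  # ls sorts lexicographically, so the older file is listed first
```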
To create our first migration file, run the atlas migrate diff command with the following parameters:
- --dir: the URL of the migration directory; by default, file://migrations.
- --to: the URL of the desired state. A state can be specified using a database URL, an HCL or SQL schema, or another migration directory.
- --dev-url: a URL to a Dev Database that will be used to compute the diff.
- Atlas DDL (HCL)
- SQL
atlas migrate diff initial \
--to file://schema.hcl \
--dev-url "docker://spanner/latest"
atlas migrate diff initial \
--to file://schema.sql \
--dev-url "docker://spanner/latest"
If your Spanner database uses the PostgreSQL dialect, use docker://spannerpg/latest as the dev-url:
atlas migrate diff initial \
--to file://schema.sql \
--dev-url "docker://spannerpg/latest"
Run ls migrations, and you'll notice that Atlas has automatically created a migration directory for us, as well as
two files:
- 20240221153232_initial.sql
- atlas.sum
-- Create "users" table
CREATE TABLE `users` (
`id` INT64 NOT NULL,
`email` STRING(255),
`display_name` STRING(255)
) PRIMARY KEY (`id`);
-- Create "posts" table
CREATE TABLE `posts` (
`id` INT64 NOT NULL,
`title` STRING(255),
`body` STRING(MAX),
`author_id` INT64,
CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`)
) PRIMARY KEY (`id`);
h1:example-hash=
20240221153232_initial.sql h1:another-hash=
The migration file represents the current state of our database, and the sum file is used by Atlas to maintain the integrity of the migration directory. To learn more about the sum file, read the documentation.
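Conceptually, the sum file pins each migration file to a checksum of its content, so any out-of-band edit is detected. Atlas uses its own "h1" hash format, so the values below will not match a real atlas.sum; this is only a sketch of the idea using sha256sum:

```shell
# A content checksum per migration file, similar in spirit to atlas.sum
# (Atlas's actual "h1" hashes are computed differently).
mkdir -p /tmp/mig
printf 'CREATE TABLE t (id INT64 NOT NULL) PRIMARY KEY (id);\n' > /tmp/mig/20240221153232_initial.sql
(cd /tmp/mig && sha256sum *.sql)
```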
Pushing migration directories to Atlas
Now that we have our first migration, we can apply it to a database. There are multiple ways to accomplish this, with most methods covered in the guides section. In this example, we'll demonstrate how to push migrations to Atlas Cloud, much like how Docker images are pushed to Docker Hub.
Let's name our new migration project app and run atlas migrate push:
atlas migrate push app \
--dev-url "docker://spanner/latest"
Once the migration directory is pushed, Atlas prints a URL to the created directory.
Applying migrations
Once our app migration directory has been pushed, we can apply it to a database from any CD platform without
necessarily having our directory there.
We'll create a simple Atlas configuration file (atlas.hcl) to store the settings for our environment:
env "spanner" {
url = "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database"
migration {
dir = "atlas://app"
}
}
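If you'd rather not hard-code the database URL in atlas.hcl, the configuration language can read it from the environment with getenv. A sketch, where SPANNER_URL is a variable name chosen for this example:

```hcl
env "spanner" {
  # Hypothetical: read the URL from the SPANNER_URL environment variable.
  url = getenv("SPANNER_URL")
  migration {
    dir = "atlas://app"
  }
}
```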
The final step is to apply the migrations to the database. Let's run atlas migrate apply with the --env flag
to instruct Atlas to select the environment configuration from the atlas.hcl file:
atlas migrate apply --env spanner
After applying the migration, you should receive a link to the deployment and the database where the migration was applied.
Next Step: Setup CI/CD
Once you have your migration directory set up, the next step is to integrate Atlas into your CI/CD pipeline. Atlas provides native integrations for popular CI/CD platforms.
Wrapping up
In this guide, we demonstrated how to set up Atlas to manage your Google Cloud Spanner database schema. We covered both declarative and versioned migration workflows, and showed how to generate migrations, push them to an Atlas workspace, and apply them to your databases. Atlas has many more features to explore. To learn more, check out the Atlas documentation.
As always, we would love to hear your feedback and suggestions on our Discord server.