Automatic Google Cloud Spanner Schema Migrations with Atlas
Spanner is a fully managed, horizontally scalable, globally distributed, and strongly consistent database service offered by Google Cloud. It is designed to handle large-scale applications with high availability and low latency.
However, managing a large database schema with Spanner can be challenging due to the complexity of related data structures and the need for coordinated schema changes across multiple teams and applications.
Enter: Atlas
Atlas helps developers manage their database schema as code, abstracting away the intricacies of database schema management. With Atlas, users provide the desired state of the database schema and Atlas automatically plans the required migrations.
In this guide, we will dive into setting up Atlas for Spanner schema migrations, and introduce the different workflows available.
Prerequisites
- Docker
- A Google Cloud environment set up on your machine
- Atlas installed on your machine (instructions per platform below)

macOS + Linux

To download and install the custom release of the Atlas CLI, simply run the following in your terminal:

curl -sSf https://atlasgo.sh | ATLAS_FLAVOR="spanner" sh

Docker

To pull the Atlas image and run it as a Docker container:

docker pull arigaio/atlas-extended
docker run --rm arigaio/atlas-extended --help

If the container needs access to the host network or a local directory, use the --net=host flag and mount the desired directory:

docker run --rm --net=host \
  -v $(pwd)/migrations:/migrations \
  arigaio/atlas-extended migrate apply \
  --url "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database"

Windows

Download the custom release and move the atlas binary to a file location on your system PATH.
- An Atlas Pro account. The Spanner driver requires one; log in by running:
$ atlas login
Creating a Spanner Instance and Database
If you want to use an existing Spanner instance and database, you can skip this step.
Let's start off by spinning up a Spanner instance using the Google Cloud Console or the gcloud command line tool.
gcloud spanner instances create my-instance \
--config=regional-us-central1 \
--description="My Spanner Instance" \
--nodes=1
Next, create a Spanner database within the instance:
gcloud spanner databases create my-database \
--instance=my-instance \
--ddl="CREATE TABLE users (id INT64 NOT NULL, email STRING(255), display_name STRING(255)) PRIMARY KEY(id);"
This command creates a new Spanner database named my-database with a users table.
Inspecting the Schema
The atlas schema inspect command supports reading the database description provided by a URL and outputting it in
different formats, including Atlas DDL (default), SQL, and JSON. In this guide, we will
demonstrate the flow using both the Atlas DDL and SQL formats, as the JSON format is often used for processing the
output using jq.
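To give a feel for that jq workflow, here is a sketch that pipes a simplified stand-in for the inspect output through jq to list table names. The JSON shape below is illustrative only; consult the actual output of atlas schema inspect --format '{{ json . }}' for the exact structure.

```shell
# Illustrative only: a simplified stand-in for the real JSON inspect output.
echo '{"schemas":[{"name":"default","tables":[{"name":"users"},{"name":"posts"}]}]}' \
  | jq -r '.schemas[].tables[].name'
```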
- Atlas DDL (HCL)
- SQL
To inspect our Spanner database, use the -u flag and write the output to a file named schema.hcl:
atlas schema inspect -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
> schema.hcl
Open the schema.hcl file to view the Atlas schema that describes our database.
table "users" {
schema = schema.default
column "id" {
null = false
type = INT64
}
column "email" {
null = true
type = STRING(255)
}
column "display_name" {
null = true
type = STRING(255)
}
primary_key {
columns = [column.id]
}
}
schema "default" {
}
To inspect our Spanner database, use the -u flag and write the output to a file named schema.sql:
atlas schema inspect -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
--format '{{ sql . }}' > schema.sql
Open the schema.sql file to view the inspected SQL schema that describes our database.
-- Create "users" table
CREATE TABLE `users` (
`id` INT64 NOT NULL,
`email` STRING(255),
`display_name` STRING(255)
) PRIMARY KEY (`id`);
For in-depth details on the atlas schema inspect command, covering aspects like inspecting specific schemas, handling multiple schemas concurrently, excluding tables, and more, refer to our documentation.
Declarative Migrations
Atlas supports a workflow called declarative schema migrations. In this workflow, you first define the desired state of your database schema (in one of many supported formats and languages). Then, you let Atlas calculate the diff between the desired state and the actual state of your database. Atlas then generates the SQL commands that will bring your database to the desired state.
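As a loose analogy (this is not how Atlas works internally; its diffing is schema-aware, not textual), a plain-text diff of two hypothetical schema files conveys the idea of comparing current state against desired state:

```shell
# Not Atlas's actual diffing, just an illustration of "current vs. desired".
printf 'CREATE TABLE users (id INT64 NOT NULL) PRIMARY KEY (id);\n' > /tmp/current.sql
printf 'CREATE TABLE users (id INT64 NOT NULL) PRIMARY KEY (id);\nCREATE TABLE posts (id INT64 NOT NULL) PRIMARY KEY (id);\n' > /tmp/desired.sql
diff /tmp/current.sql /tmp/desired.sql || true  # diff exits non-zero when files differ
```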
Let's see this in action.
First, create a new file named schema.sql. This file will contain the desired state of our database in plain SQL.
-- Create "users" table
CREATE TABLE `users` (
`id` INT64 NOT NULL,
`email` STRING(255),
`display_name` STRING(255)
) PRIMARY KEY (`id`);
-- Create the posts table with a custom name for the FK constraint
CREATE TABLE `posts` (
`id` INT64 NOT NULL,
`title` STRING(255),
`body` STRING(MAX),
`author_id` INT64,
CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`)
) PRIMARY KEY (`id`);
Applying our schema
Next, let's apply this schema to our database using the atlas schema apply command.
atlas schema apply -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
--to file://schema.sql \
--dev-url "docker://spanner/latest"
If your Spanner database uses the PostgreSQL dialect, use docker://spannerpg/latest as the dev-url:
atlas schema apply -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
--to file://schema.sql \
--dev-url "docker://spannerpg/latest"
Atlas automatically detects the dialect of your target database and uses the appropriate syntax.
Atlas will connect to our target database to inspect its current state. Next, it will use the dev-database to normalize our schema and generate the SQL commands that will bring our database to the desired state:
Planning migration statements (1 in total):
-- create "posts" table:
-> CREATE TABLE `posts` (
`id` INT64 NOT NULL,
`title` STRING(255),
`body` STRING(2621440),
`author_id` INT64,
CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`) ON DELETE NO ACTION
) PRIMARY KEY (`id`)
-------------------------------------------
Analyzing planned statements (1 in total):
-- no diagnostics found
-------------------------
-- 119.9685ms
-- 1 schema change
-------------------------------------------
? Approve or abort the plan:
▸ Approve and apply
Abort
After applying the schema, Atlas confirms that the changes were applied:
-------------------------------------------
Applying approved migration (1 statement in total):
-- create "posts" table
-> CREATE TABLE `posts` (
`id` INT64 NOT NULL,
`title` STRING(255),
`body` STRING(2621440),
`author_id` INT64,
CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`) ON DELETE NO ACTION
) PRIMARY KEY (`id`)
-- ok (13.408836s)
-------------------------
-- 13.408947041s
-- 1 migration
-- 1 sql statement
Next, let's re-run the atlas schema apply command. This time, Atlas will detect that the database is
already in the desired state and will not generate any changes:
Schema is synced, no changes to be made
Altering our schema
Now, let's make some changes to our schema. Open the schema.sql file and add a new column to the users table:
CREATE TABLE `users` (
  `id` INT64 NOT NULL,
  `email` STRING(255),
  `display_name` STRING(255),
+ `bio` STRING(1024) -- New column for user biography
) PRIMARY KEY (`id`);
Next, let's re-run the atlas schema apply command. This time, Atlas will detect that the schema has changed
and will generate the needed SQL commands to bring the database to the desired state:
Planning migration statements (1 in total):
-- add column "bio" to table "users":
-> ALTER TABLE `users` ADD COLUMN `bio` STRING(1024)
-------------------------------------------
Analyzing planned statements (1 in total):
-- no diagnostics found
-------------------------
-- 120.611709ms
-- 1 schema change
-------------------------------------------
? Approve or abort the plan:
▸ Approve and apply
Abort
After applying the changes, Atlas confirms once again that the changes were applied:
-------------------------------------------
Applying approved migration (1 statement in total):
-- add column "bio" to table "users"
-> ALTER TABLE `users` ADD COLUMN `bio` STRING(1024)
-- ok (17.219869042s)
-------------------------
-- 17.219887625s
-- 1 migration
-- 1 sql statement
Visualizing our schema
One of the most useful features of Atlas is the ability to visualize your database schema. To do so, run the
atlas schema inspect command with the -w (web) flag:
atlas schema inspect -w -u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database"
Atlas will ask whether you would like to create your visualization publicly (in a publicly accessible URL) or privately (in your Atlas Cloud account):
? Where would you like to share your schema visualization?:
▸ Publicly (gh.atlasgo.cloud)
Privately (giautm.atlasgo.cloud)
For this demo, let's choose the public option. Atlas will create the visualization and open it in your default browser:
See it for yourself at: https://gh.atlasgo.cloud/explore/0256b374
Pushing schemas to Atlas
Similar to how Docker images are pushed to Docker Hub, you can push your schema to Atlas Cloud for versioning, collaboration, and deployment:
atlas schema push app \
--dev-url "docker://spanner/latest"
Once pushed, Atlas prints a URL to the schema. You can then apply it to any database using the schema URL:
atlas schema apply \
-u "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database" \
--to "atlas://app" \
--dev-url "docker://spanner/latest"
This workflow allows you to manage your schema centrally and deploy it to multiple environments without having the schema files locally.
For more advanced workflows, you can use atlas schema plan to pre-plan and review migrations
before applying them. This enables teams to plan, lint, and review changes during the PR stage, edit generated
SQL if needed, and ensure no human intervention is required during deployment.
Versioned Migrations
Alternatively, the versioned migration workflow, sometimes called "change-based migrations", allows each change to the database schema to be checked in to source control and reviewed during code review. Users still benefit from Atlas intelligently planning migrations for them; however, the migrations are not applied automatically.
Creating the first migration
In the versioned migration workflow, our database state is managed by a migration directory. The migration directory holds all of the migration files created by Atlas, and the sum of all files in lexicographical order represents the current state of the database.
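Because the file names Atlas generates begin with a timestamp, lexicographical order and chronological order coincide. A quick illustration, using hypothetical file names:

```shell
# Simulated migration directory: timestamped names sort chronologically.
mkdir -p /tmp/demo_migrations
touch /tmp/demo_migrations/20240221153232_initial.sql
touch /tmp/demo_migrations/20240305101500_add_posts.sql
ls /tmp/demo_migrations  # ls sorts lexicographically, so the older file is listed first
```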
To create our first migration file, run the atlas migrate diff command with the following parameters:
- --dir: the URL of the migration directory; by default, file://migrations.
- --to: the URL of the desired state. A state can be specified using a database URL, an HCL or SQL schema, or another migration directory.
- --dev-url: a URL to a Dev Database that will be used to compute the diff.
- Atlas DDL (HCL)
- SQL
atlas migrate diff initial \
--to file://schema.hcl \
--dev-url "docker://spanner/latest"
atlas migrate diff initial \
--to file://schema.sql \
--dev-url "docker://spanner/latest"
If your Spanner database uses the PostgreSQL dialect, use docker://spannerpg/latest as the dev-url:
atlas migrate diff initial \
--to file://schema.sql \
--dev-url "docker://spannerpg/latest"
Run ls migrations, and you'll notice that Atlas has automatically created a migration directory for us, as well as
two files:
- 20240221153232_initial.sql
- atlas.sum
-- Create "users" table
CREATE TABLE `users` (
`id` INT64 NOT NULL,
`email` STRING(255),
`display_name` STRING(255)
) PRIMARY KEY (`id`);
-- Create "posts" table
CREATE TABLE `posts` (
`id` INT64 NOT NULL,
`title` STRING(255),
`body` STRING(MAX),
`author_id` INT64,
CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`)
) PRIMARY KEY (`id`);
h1:example-hash=
20240221153232_initial.sql h1:another-hash=
The migration file represents the current state of our database, and the sum file is used by Atlas to maintain the integrity of the migration directory. To learn more about the sum file, read the documentation.
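Conceptually, the sum file pins each migration file to a checksum of its content, so any out-of-band edit is detected. Atlas uses its own "h1" hash format, so the values below will not match a real atlas.sum; this is only a sketch of the idea using sha256sum:

```shell
# A content checksum per migration file, similar in spirit to atlas.sum
# (Atlas's actual "h1" hashes are computed differently).
mkdir -p /tmp/mig
printf 'CREATE TABLE t (id INT64 NOT NULL) PRIMARY KEY (id);\n' > /tmp/mig/20240221153232_initial.sql
(cd /tmp/mig && sha256sum *.sql)
```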
Pushing migration directories to Atlas
Now that we have our first migration, we can apply it to a database. There are multiple ways to accomplish this, with most methods covered in the guides section. In this example, we'll demonstrate how to push migrations to Atlas Cloud, much like how Docker images are pushed to Docker Hub.
Let's name our new migration project app and run atlas migrate push:
atlas migrate push app \
--dev-url "docker://spanner/latest"
Once the migration directory is pushed, Atlas prints a URL to the created directory.
Applying migrations
Once our app migration directory has been pushed, we can apply it to a database from any CD platform without
necessarily having our directory there.
We'll create a simple Atlas configuration file (atlas.hcl) to store the settings for our environment:
env "spanner" {
url = "spanner://projects/YOUR_PROJECT_ID/instances/my-instance/databases/my-database"
migration {
dir = "atlas://app"
}
}
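If you'd rather not hard-code the database URL in atlas.hcl, the configuration language can read it from the environment with getenv. A sketch, where SPANNER_URL is a variable name chosen for this example:

```hcl
env "spanner" {
  # Hypothetical: read the URL from the SPANNER_URL environment variable.
  url = getenv("SPANNER_URL")
  migration {
    dir = "atlas://app"
  }
}
```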
The final step is to apply the migrations to the database. Let's run atlas migrate apply with the --env flag
to instruct Atlas to select the environment configuration from the atlas.hcl file:
atlas migrate apply --env spanner
After applying the migration, you should receive a link to the deployment and the database where the migration was applied.
Next Step: Setup CI/CD
Once you have your migration directory set up, the next step is to integrate Atlas into your CI/CD pipeline. Atlas provides native integrations for popular CI/CD platforms.
Wrapping up
In this guide, we demonstrated how to set up Atlas to manage your Google Cloud Spanner database schema. We covered both declarative and versioned migration workflows, and showed how to generate migrations, push them to an Atlas workspace, and apply them to your databases. Atlas has many more features to explore. To learn more, check out the Atlas documentation.
As always, we would love to hear your feedback and suggestions on our Discord server.