3 posts tagged with "github-actions"

GitOps for Databases, Part 2: Atlas Operator and ArgoCD

· 7 min read
Rotem Tamir
Building Atlas
info

This is the second post in a two-part tutorial, which demonstrates how to use the Atlas Operator in tandem with Atlas Cloud and ArgoCD to create a slick, modern GitOps workflow for managing your database migrations natively in Kubernetes.

In part one, we demonstrated how to initialize an Atlas project, and create a CI/CD pipeline that automatically plans, verifies and stores your database migrations in Atlas Cloud using GitHub Actions.

In this part, we will show how to deploy these migrations using the Atlas Operator and ArgoCD to demonstrate a complete GitOps workflow for database migrations.

How to GitOps your Database Migrations on Kubernetes

"We can wrap existing schema management solutions into containers, and run them in Kubernetes as Jobs. But that is SILLY. That is not how we work in Kubernetes."

- Viktor Farcic, DevOps Toolkit

As applications evolve, so do their database schemas. The practice of automating the deployment of database schema changes has evolved hand in hand with modern DevOps principles into what is known as database migrations. As part of this evolution, hundreds of "migration tools" have been created to help developers manage their database migrations. These tools range from ORM- and language-specific tools like Alembic for Python, to language-agnostic tools like Flyway and Liquibase.

When Kubernetes came along and teams started to containerize their applications, the knee-jerk reaction was to wrap these legacy tools in a container and run them as part of the application deployment process. We discussed some of the shortcomings of this approach in a recent KubeCon talk and earlier Webinar.

Atlas was created from the ground up to be a modern database migration tool that embodies modern DevOps principles and is designed to run natively in Kubernetes. The Atlas Operator enables teams to extend the native Kubernetes API with new resource types that represent database schemas and migrations. By using these capabilities it is possible to natively integrate database migrations into your GitOps workflow.

Prerequisites

  • A running Kubernetes cluster - for learning purposes, you can use Minikube, which is a tool that runs a single-node Kubernetes cluster on your laptop.
  • kubectl - a command-line tool for interacting with Kubernetes clusters.
  • Helm - a package manager for Kubernetes.
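
If you are starting from scratch, one possible way to set these up locally looks like this (a sketch assuming macOS with Homebrew; adapt to your platform):

# One way to install the prerequisites (Homebrew assumed; use your platform's package manager otherwise):
brew install minikube kubectl helm
# Start a local single-node cluster:
minikube start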

Step 1: Set up the Atlas Operator and ArgoCD

1. Install ArgoCD

To install ArgoCD run the following commands:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

Wait until all the pods in the argocd namespace are running:

kubectl wait --for=condition=ready pod --all -n argocd

kubectl will print something like this:

pod/argocd-application-controller-0 condition met
pod/argocd-applicationset-controller-69dbc8585c-6qbwr condition met
pod/argocd-dex-server-59f89468dc-xl7rg condition met
pod/argocd-notifications-controller-55565589db-gnjdh condition met
pod/argocd-redis-74cb89f466-gzk4f condition met
pod/argocd-repo-server-68444f6479-mn5gl condition met
pod/argocd-server-579f659dd5-5djb5 condition met

For more information, or if you run into any errors, refer to the Argo CD Documentation.
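
If you would like to poke around the ArgoCD UI later in this tutorial, two optional commands may help (a sketch, assuming the default installation above):

# Print the auto-generated admin password:
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d; echo
# Expose the ArgoCD API server/UI on https://localhost:8080:
kubectl port-forward svc/argocd-server -n argocd 8080:443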

2. Install the Atlas Operator

helm install atlas-operator oci://ghcr.io/ariga/charts/atlas-operator

Helm will print something like this:

Pulled: ghcr.io/ariga/charts/atlas-operator:0.3.6
Digest: sha256:7e29c15e846fa9c25164f4ad5a7cb7f25e9ead2882082f0352985e58c1976f99
NAME: atlas-operator
LAST DEPLOYED: Mon Dec 11 10:25:11 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

Wait until the atlas-operator pod is running:

kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=atlas-operator -n default

kubectl will print something like this:

pod/atlas-operator-866dfbc56d-qkkkn condition met

For more information on the installation process, refer to the Atlas Operator Documentation.
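
To confirm the operator's custom resource definitions were registered, you can run a quick optional check:

# Should list atlasmigrations.db.atlasgo.io and atlasschemas.db.atlasgo.io
kubectl get crds | grep atlasgo.io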

Step 2: Set up the Target Database

Start by deploying a simple PostgreSQL database using the following command:

kubectl apply -f https://raw.githubusercontent.com/ariga/atlas-operator/master/config/integration/databases/postgres.yaml

This will create a Deployment which runs a single (non-persistent) PostgreSQL instance and a Service that exposes it on port 5432. In addition, it will create a Secret that contains the database credentials.

Wait until the database pod is running:

kubectl wait --for=condition=ready pod -l app=postgres -n default
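
The manifest above also creates a Secret holding the database URL, which the AtlasMigration resource in the next step references as postgres-credentials / url. If you want to sanity-check the connection string (optional, assuming those names), decode it with:

# Decode the connection URL stored by the manifest above (the secret name and key are the
# ones referenced by the AtlasMigration resource in the next step).
kubectl get secret postgres-credentials -o jsonpath='{.data.url}' | base64 -d; echo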

Step 3: Create the AtlasMigration resource

In order for the Atlas Operator to know which migrations to apply, we need to create an AtlasMigration resource that points to the Atlas Cloud project we created in part one. Create a new directory called manifests in your GitHub repository. In it, create a file called atlas-migration.yaml with the following contents:

manifests/atlas-migration.yaml
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasMigration
metadata:
  name: migration
spec:
  urlFrom:
    secretKeyRef:
      key: url
      name: postgres-credentials
  cloud:
    project: "atlasdemo" # Atlas Cloud project name
    tokenFrom:
      secretKeyRef:
        name: atlas-credentials
        key: token
  dir:
    remote:
      name: "atlasdemo" # Migration directory name in your Atlas Cloud project
      tag: "1d579be616db48803bb21713fd836a9165030f18" # See below on how to obtain this value for your project.

This resource tells the Atlas Operator to apply the migrations in the atlasdemo project in Atlas Cloud to the database specified in the postgres-credentials secret. Notice that the tokenFrom field references a secret called atlas-credentials. This secret will contain the Atlas Cloud API token that we created in part one.

To create it run:

kubectl create secret generic atlas-credentials --from-literal=token=aci_<replace with your token>

Obtaining the tag field

Notice the tag field in the dir section. This field tells the Atlas Operator which version of the migrations to apply. In this case, we are telling it to apply the migrations tagged with the commit hash 1d579be616db48803bb21713fd836a9165030f18 which is the commit hash of the merge commit that merged the pull request we created in part one.

To review which tags are available for your migrations, head over to your Atlas Cloud project and click on the Tags tab. You should see something like this:

Commit and push the changes to your GitHub repository.
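
For example (assuming you are working directly on the branch that the ArgoCD Application below will track):

# Hypothetical commit message; adjust the branch and remote to your setup.
git add manifests/atlas-migration.yaml
git commit -m "add AtlasMigration manifest"
git push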

Step 4: Create the ArgoCD Application

Now that we have created the AtlasMigration resource, we can create an ArgoCD application that will deploy it. Create a file called Application.yaml in the root of your GitHub repository with the following contents:

Application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: atlas-argocd-demo
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  source:
    path: manifests
    repoURL: 'https://github.com/<your gh user>/<your repo name>'
    targetRevision: master
  destination:
    namespace: default
    server: 'https://kubernetes.default.svc'
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    retry:
      limit: 5
      backoff:
        duration: 5s
        maxDuration: 3m0s
        factor: 2
    syncOptions:
      - CreateNamespace=true

Be sure to replace the repoURL field with the URL of your GitHub repository.

info

If your repository is private, you will need to create a GitHub Personal Access Token and tell ArgoCD about it by running the following command:

export CURRENT_NS=$(kubectl config view --minify --output 'jsonpath={..namespace}')
kubectl config set-context --current --namespace=argocd
argocd repo add https://github.com/<user>/<repo> --username <user> --password ghp_<your token>
kubectl config set-context --current --namespace=$CURRENT_NS

Step 5: Deploy!

Next, apply the application manifest:

kubectl apply -f Application.yaml

Wait until the application is deployed:

kubectl wait --for=condition=ready atlasmigration/migration
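
You can also check how ArgoCD itself sees the application (optional; the name matches the Application manifest above):

# Sync and health status as reported by ArgoCD (expect "Synced" and "Healthy").
kubectl -n argocd get application atlas-argocd-demo -o jsonpath='{.status.sync.status}{"\n"}{.status.health.status}{"\n"}'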

Observe the status of the migration object:

kubectl get atlasmigration/migration -o jsonpath='{.status}' | jq

The output will look similar to:

{
  "conditions": [
    {
      "lastTransitionTime": "2023-12-11T08:38:35Z",
      "message": "",
      "reason": "Applied",
      "status": "True",
      "type": "Ready"
    }
  ],
  "lastApplied": 1702283914,
  "lastAppliedVersion": "20231206075118",
  "observed_hash": "6e4feac15a35d20c38e705428de507835c7c58d487eacc84ed012a17b002981d"
}

You can also observe the status of the migration using the Atlas Cloud UI:

Wrapping Up

Let's review the flow that we have created, from end to end:

  • Developers modify the desired state of their schema and use atlas migrate diff locally to generate a migration plan.
  • Developers commit the migration plan to their GitHub repository and create a pull request.
  • GitHub Actions runs the Atlas Continuous Integration workflow, which verifies the migration plan is correct and safe.
  • Once the pull request is merged, a GitHub Actions workflow pushes the new migration to Atlas Cloud. It is tagged with the commit hash of the merge commit.
  • When we are ready to deploy our changes to production, we change the value of the tag field in the AtlasMigration resource to the most recent tag and push this change to our GitHub repository (see the sketch after this list).
  • ArgoCD detects the change and updates our AtlasMigration resource.
  • The Atlas Operator detects the change and applies the migrations to the database.
  • The database is now up to date with the desired state of our schema!
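
As a concrete sketch of the deployment step, bumping the tag can be as simple as the following (the new SHA is a placeholder, and yq v4 is an assumption; editing the YAML by hand works just as well):

# Point the AtlasMigration resource at the new tag (placeholder SHA), then push so ArgoCD syncs it.
# yq v4 is assumed here, not required.
yq -i '.spec.dir.remote.tag = "<new merge commit SHA>"' manifests/atlas-migration.yaml
git commit -am "deploy: bump migration tag"
git push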

To summarize, in this tutorial we demonstrated how to use the Atlas Operator and ArgoCD to create a slick, modern GitOps workflow for managing your database migrations natively in Kubernetes.

As always, we would love to hear your feedback and suggestions on our Discord server.

GitOps for Databases, Part 1: CI/CD

· 10 min read
Rotem Tamir
Building Atlas
info

This is the first post in a two-part tutorial, which demonstrates how to use the Atlas Operator in tandem with Atlas Cloud and ArgoCD to create a slick, modern GitOps workflow for managing your database migrations natively in Kubernetes.

GitOps is a software development and deployment methodology that uses Git as the central repository for both code and infrastructure configuration, enabling automated and auditable deployments.

ArgoCD is a Kubernetes-native continuous delivery tool that implements GitOps principles. It uses a declarative approach to deploy applications to Kubernetes, ensuring that the desired state of the application is always maintained.

Kubernetes Operators are software extensions to Kubernetes that enable the automation and management of complex, application-specific, operational tasks with domain-specific knowledge within a Kubernetes cluster.

In this tutorial, we will use the Atlas Operator in tandem with Atlas Cloud and ArgoCD to create a slick, modern GitOps workflow for managing your database migrations natively in Kubernetes.

For the sake of brevity, we are going to split this guide into two parts:

  1. In part one, we will show how to initialize an Atlas project, and create a CI/CD pipeline that will automatically plan, verify and store your database migrations in Atlas Cloud using GitHub Actions.
  2. In part two, we will show how to deploy these migrations using the Atlas Operator and ArgoCD to demonstrate a complete GitOps workflow for database migrations.

Modern CI/CD for Database Migrations

Atlas was built to support a modern CI/CD workflow for database migrations based on the following principles:

  1. Changes to the database are planned automatically. Given the desired state of the database, the system should automatically generate a plan for how to get from the current state to the desired state.
  2. Changes to the database schema are stored in a versioned migration directory. All planned changes to the database are checked in to a versioned migration directory. This directory contains SQL scripts, which are executed in lexicographic order to apply the changes to the database.
  3. Changes to the database are validated during CI. All changes to the database are tested and evaluated against a set of governing policies.
  4. Changes to the database are deployed via automation. No manual steps are required to deploy changes to the database. All changes are deployed via a CI/CD pipeline.

To learn more about these principles, check out our guide to modern CI/CD for database migrations.

In this tutorial, we will show how to apply the fourth principle to your database migrations using the Atlas Operator and ArgoCD.

Local Setup

With the principles of modern CI/CD for database migrations in mind, let's see how we can apply them to a simple application that uses a PostgreSQL database.

Prerequisites (for part one)

  1. A GitHub Account - we are going to be setting up some GitHub Actions workflows, so you will need a GitHub account.

  2. The most recent version of Atlas. To get Atlas on Linux or macOS run:

    curl -sSf https://atlasgo.sh | sh

    For more installation options, see the docs.

  3. Docker. To install Docker, follow the instructions here.

  4. The GitHub CLI, gh. To install gh:

    brew install gh

    Follow instructions for other platforms here.

Step 1: Define our desired state

Atlas advocates for the declarative approach in which users start their work by defining the desired state of their database and let the system figure out the implementation details. Atlas supports many different ways to define the desired state of your database, called "schema loaders". In this tutorial, we will use a simple SQL file to define our desired state.

In a fresh Git repo, create a file named schema.sql with the following contents:

schema.sql
create table users (
  id int primary key,
  name varchar(255) not null unique
);

In this project, whenever we want to change the database schema, we will update this file to reflect the desired state of the database.

Step 2: Plan the initial migration

Now that we have defined our desired state, we can use the Atlas CLI to plan the initial migration. Create the following file named atlas.hcl:

env "local" {
src = "file://schema.sql"
dev = "docker://postgres/15/dev"
migration {
dir = "file://migrations"
}
format {
migrate {
diff = "{{ sql . \" \" }}"
}
}
}

Next, run the following command to plan the initial migration:

atlas migrate diff --env local

Observe two new files that were created in the migrations directory:

.
├── atlas.hcl
├── migrations
│   ├── 20231204121249.sql
│   └── atlas.sum
└── schema.sql
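
Before sharing the directory, you can optionally run Atlas's linter locally against the same dev database (a sketch; it analyzes only the most recent migration file here):

# Analyze the latest migration file using the dev database defined in the "local" env.
atlas migrate lint --env local --latest 1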

Step 3: Push our migration directory to Atlas Cloud

Atlas Cloud is a hosted service that can serve as a central repository for your database migrations. Similarly to how DockerHub is used to store and distribute Docker images, Atlas Cloud can be used to store and distribute database migration directories. Atlas Cloud has a free tier that is suitable for small teams and personal projects which you can use to follow along with this tutorial.

Log in to Atlas Cloud using the following command:

atlas login

If you do not have an existing Atlas Cloud account, you will be prompted to create one.

Next, push your migration directory to Atlas Cloud using the following command:

atlas migrate push --env local atlasdemo

This will create a new project named atlasdemo on Atlas Cloud and push your migration directory to it. Atlas will print a URL to the project page on Atlas Cloud similar to the following:

https://rotemtam85.atlasgo.cloud/dirs/4294967359

Set up GitHub Actions

In this section, we will set up a GitHub Actions workflow that will add Atlas to your CI/CD pipeline.

Create a Bot Token

To write data to your Atlas Cloud account, you will need to supply your CI/CD pipelines with an API key that has write access to your Atlas Cloud account. To learn how to create a bot token, check out our guide on the topic. Use the instructions in this guide to create a token, and make a note of it. We will use it in the next steps.

Install the Atlas Extension

To streamline this process, we have created a gh command that will create the workflow for you. To install the latest version, run:

gh extension install ariga/gh-atlas

Ensure your gh CLI has sufficient permissions

Make sure you have the necessary permissions to configure your action:

gh auth refresh -s write:packages,workflow

Create a GitHub Actions Workflow

Once installed, let's use this extension to generate our GitHub Actions workflow. Run the following command:

gh atlas init-action --token <your-bot-token> --dir-name="atlasdemo" --driver=postgres

Atlas will scan your repository (locally) for directories containing Atlas migrations and ask you which one you would like to use for CI. Select the desired directory and press "Enter":

Use the arrow keys to navigate: ↓ ↑ → ←
? choose migration directory:
▸ migrations

Atlas will then ask you which database driver this directory contains migrations for. Select the desired driver and press "Enter".

Next, the GitHub extension will save your bot token to a GitHub secret and create a pull request with the necessary configuration for the GitHub Action.


The PR contains a GitHub Actions workflow similar to this:

name: Atlas
on:
  push:
    branches:
      - master
    paths:
      - .github/workflows/ci-atlas.yaml
      - 'migrations/*'
  pull_request:
    paths:
      - 'migrations/*'
# Permissions to write comments on the pull request.
permissions:
  contents: read
  pull-requests: write
jobs:
  atlas:
    services:
      # Spin up a postgres:15 container to be used as the dev-database for analysis.
      postgres:
        image: postgres:15
        env:
          POSTGRES_DB: dev
          POSTGRES_PASSWORD: pass
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-start-period 10s
          --health-timeout 5s
          --health-retries 5
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - uses: ariga/setup-atlas@v0
        with:
          cloud-token: ${{ secrets.ATLAS_CLOUD_TOKEN_K6MJMK }}
      - uses: ariga/atlas-action/migrate/lint@v1
        with:
          dir: 'file://migrations'
          dir-name: 'atlasdemo'
          dev-url: 'postgres://postgres:pass@localhost:5432/dev?search_path=public&sslmode=disable'
        env:
          GITHUB_TOKEN: ${{ github.token }}
      - uses: ariga/atlas-action/migrate/push@v1
        if: github.ref == 'refs/heads/master'
        with:
          dir: 'file://migrations'
          dir-name: 'atlasdemo'
          dev-url: 'postgres://postgres:pass@localhost:5432/dev?search_path=public&sslmode=disable'

After reviewing the changes, merge the pull request to enable the GitHub Action.

Testing our Pipeline

Now that we have set everything up, let's test our pipeline end-to-end. To do so, we will first plan a new change to our database schema.

Edit the Desired Database Schema

Edit the schema.sql file to add a new column to the users table:

schema.sql
create table users (
  id int primary key,
  name varchar(255) not null unique,
  email varchar(255) not null unique
);

Generate a New Migration

Next, run the following command to automatically generate a new migration:

atlas migrate diff --env local add_email_column

This will create a new file in the migrations directory:

.
├── atlas.hcl
├── migrations
│   ├── 20231204121249.sql
│   ├── 20231206075118_add_email_column.sql
│   └── atlas.sum
└── schema.sql
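
If you are curious what was planned, print the new migration file. The exact statements depend on your Atlas version, but based on the lint findings discussed below it should look roughly like this (illustrative, not verbatim output):

cat migrations/20231206075118_add_email_column.sql
# Expected to contain, roughly (index/constraint names are illustrative):
#   ALTER TABLE "users" ADD COLUMN "email" character varying(255) NOT NULL;
#   CREATE UNIQUE INDEX "users_email_key" ON "users" ("email");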

Create a new Pull Request

Next, create a branch and push the changes to GitHub:

git checkout -b add-email-column
git add .
git commit -m "Add email column"
git push --set-upstream origin add-email-column

Next, use the gh CLI to create a new pull request:

gh pr create --title "migrations: add email column" --body "adding email column to users table"

Atlas Reviews the Pull Request

Based on the configuration in the GitHub Actions workflow we created, Atlas will automatically review your migration directory whenever a new pull request that affects it is opened. When Atlas is done running, it will comment on your PR with the results of the review:

Interesting! Atlas found some issues with our migration. Let's click on the report to see what they are:

Atlas warns us about two issues. The first is that adding a non-nullable varchar column "email" will fail in case the "users" table is not empty. The second is that creating an index non-concurrently causes write locks on the "users" table. Since we are in the early stages of development, we can safely ignore these issues for now. Let's merge the pull request and see what happens.

gh pr merge --squash

Atlas Pushes the Migrations to Atlas Cloud

When GitHub Actions detects a new push to the master branch, per our configuration, it runs the atlas migrate push command to push the migrations to Atlas Cloud. Once the push is complete, our schema will be updated in the Atlas Cloud schema viewer screen:

Wrapping Up Part One

That's it for part one! In this tutorial, we have shown how to use Atlas Cloud and GitHub Actions to create a slick, modern CI/CD pipeline for your database migrations. In part two, we will show how to deploy these migrations using the Atlas Operator and ArgoCD to demonstrate a complete GitOps workflow for database migrations.

As always, we would love to hear your feedback and suggestions on our Discord server.

Prevent destructive changes to your database with the Atlas GitHub Action

· 5 min read
Rotem Tamir
Building Atlas

Losing data is painful for almost all organizations. This is one of the reasons teams are very cautious when it comes to making changes to their databases. In fact, many teams set explicit policies on what kinds of changes to the database are allowed, often completely prohibiting any change that is destructive.

Destructive changes are changes to a database schema that result in loss of data. For instance, consider a statement such as:

ALTER TABLE `users` DROP COLUMN `email_address`;

This statement is considered destructive because whatever data is stored in the email_address column will be deleted from disk, with no way to recover it.

Suppose you were in charge of a team that decided to prohibit destructive changes: how would you go about enforcing such a policy? From our experience, most teams enforce policies relating to schema migrations in code review: a human engineer, preferably with some expertise in operating databases, manually reviews any proposed database migration scripts and rejects them if they contain destructive changes.

Relying on a human reviewer to enforce such a policy is both expensive (it takes time and mental energy) and error-prone. Just like manual QA is slowly being replaced with automated testing, and manual code style reviews are being replaced with linters, isn't it time that we automate the process of ensuring that changes to database schemas are safe?

Announcing the Atlas GitHub Action

Today, we're happy to announce the release of the official Atlas GitHub Action which can be used to apply migration directory linting for a bunch of popular database migration tools. golang-migrate, goose, dbmate and Atlas itself are already supported, and Flyway and Liquibase are coming soon.

If you're using GitHub to manage your source code, you're in luck. By adding a short configuration file to your repository, you can start linting your schema migration scripts today! Let's see a short example.

Setting up

Suppose we are running a website for an e-commerce business. To store the data for our website we use a MySQL database. Because the data in this database is everything to us, we use a careful versioned migrations approach where each change to the database schema is described in an SQL script and stored in our Git repository. To execute these scripts we use a popular tool called golang-migrate.
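
For context, applying these scripts with golang-migrate typically looks something like this (a sketch, assuming a local MySQL reachable with the same credentials as the dev database used in CI below):

# Apply all pending migrations in the migrations/ directory.
migrate -path migrations -database "mysql://root:pass@tcp(localhost:3306)/test" up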

The source code for this example can be found in rotemtam/atlas-action-demo.

Initially, our schema contains two tables: users and orders, documented in the first few migration files:

Create the users table:

-- create "users" table
CREATE TABLE `users` (
  `id` int NOT NULL,
  `name` varchar(100) NULL,
  PRIMARY KEY (`id`)
) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Add a unique email column:

ALTER TABLE `users` ADD COLUMN `email` varchar(255) NOT NULL, ADD UNIQUE INDEX `email_unique` (`email`);

Create the orders table, with a foreign-key referencing the users table:

-- create "orders" table
CREATE TABLE `orders` (
  `id` int NOT NULL,
  `user_id` int NOT NULL,
  `total` decimal(10) NOT NULL,
  PRIMARY KEY (`id`),
  INDEX `user_orders` (`user_id`),
  CONSTRAINT `user_orders` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON UPDATE NO ACTION ON DELETE NO ACTION
) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Installing the Atlas Action

To make sure we never accidentally delete data during schema changes, we enact a policy that prohibits destructive changes to the database. To enforce this policy, we invoke the atlas-action GitHub Action from within our continuous integration flow by adding a workflow file named .github/workflows/atlas-ci.yaml:

name: Atlas CI
on:
  # Run whenever code is changed in the master branch,
  # change this to your root branch.
  push:
    branches:
      - master
  # Run on PRs where something changed under the `path/to/migration/dir/` directory.
  pull_request:
    paths:
      - 'migrations/*'
jobs:
  lint:
    services:
      # Spin up a mysql:8.0.29 container to be used as the dev-database for analysis.
      mysql:
        image: mysql:8.0.29
        env:
          MYSQL_ROOT_PASSWORD: pass
          MYSQL_DATABASE: test
        ports:
          - "3306:3306"
        options: >-
          --health-cmd "mysqladmin ping -ppass"
          --health-interval 10s
          --health-start-period 10s
          --health-timeout 5s
          --health-retries 10
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3.0.1
        with:
          fetch-depth: 0 # Mandatory unless "latest" is set below.
      - uses: ariga/atlas-action@v0
        with:
          dir: migrations/
          dir-format: golang-migrate # Or: atlas, goose, dbmate
          dev-url: mysql://root:pass@localhost:3306/test
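
The same check can also be run locally before opening a PR (a sketch, assuming the Atlas CLI is installed and a MySQL dev database is reachable with the same credentials as the service container above):

# Lint only the most recent migration; the format query parameter tells Atlas
# that the directory uses golang-migrate's file layout.
atlas migrate lint \
  --dir "file://migrations?format=golang-migrate" \
  --dev-url "mysql://root:pass@localhost:3306/test" \
  --latest 1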

Detecting a destructive change

Next, let's see what happens when a developer accidentally proposes a destructive change, to drop a column in the orders table:

-- modify "orders" table
ALTER TABLE `orders` DROP COLUMN `total`;

This change is proposed in PR #1 in our example repo. Because we have previously set up the Atlas GitHub Action to lint our migration directory, whenever a file changes under the migrations/ directory, a workflow is triggered.

After letting our workflow complete, observe that GitHub informs us that the Atlas CI / lint check has failed:

Clicking on the "details" link, we find a detailed explanation of the cause of the failure:

Examining the Action run summary we find the following annotation:

As you can see, Atlas has detected the destructive change we proposed to apply to our database and failed our build!

Wrapping up

In this post we discussed why many teams set policies to prevent destructive changes to database schemas. We further showed how such policies can be enforced in an automated way using the official Atlas GitHub Action.

Further reading

To learn more about CI for database schema changes:

Have questions? Feedback? Find our team on our Discord server.