The Atlas Kubernetes Operator
What are Kubernetes Operators?
Kubernetes is an excellent solution for managing stateless resources. But for many stateful resources, such as databases, reconciling the desired state of a database with its actual state can be a complex task that requires specific domain knowledge. Kubernetes Operators were introduced to the Kubernetes ecosystem to help users manage complex stateful resources by codifying this domain knowledge into a Kubernetes controller.
What is the Atlas Kubernetes Operator?
The Atlas Kubernetes Operator is a Kubernetes controller that uses Atlas to manage your database schema. The Atlas Kubernetes Operator allows you to define the desired schema and apply it to your database using the Kubernetes API.
Supported workflows
Declarative schema migrations
The Atlas Kubernetes Operator supports declarative migrations.
In declarative migrations, the desired state of the database is defined by the user and the operator is responsible
for reconciling the desired state with the actual state of the database (planning and executing CREATE
, ALTER
and DROP
statements).
In this workflow, after installing the Atlas Kubernetes Operator, the user defines the desired state of the database
as an AtlasSchema
resource which either includes the desired schema directly or references a ConfigMap
that contains
it.
The operator then reconciles the desired state with the actual state of the database by planning and executing
CREATE
, ALTER
and DROP
statements.
To ensure that the operator does not accidentally introduce destructive changes to the database, the user can specify linting or diffing policies that will be applied to the computed plan before it is applied to the database.
To learn more about getting started with the Atlas Operator and declarative migrations, see the announcement blog post.
Versioned schema migrations
The Atlas Kubernetes Operator also supports versioned migrations.
In versioned migrations, the database schema is defined by a series of SQL scripts ("migrations") that are applied
in lexicographical order. The user can specify the version and migration directory to run, which can be located
on the Atlas Cloud or stored as a ConfigMap
in your Kubernetes
cluster.
In this workflow, after installing the Atlas Kubernetes Operator, the user defines the desired state of the database
as an AtlasMigration
resource which connects between a target database and a migration directory. The migration directory
may be configured as a remote directory in Atlas Cloud or as a ConfigMap
in your Kubernetes
cluster.
The operator then reconciles the desired state with the actual state of the database by applying any pending migrations on the target database.
Installation
The Atlas Kubernetes Operator is available as a Helm Chart. To install the chart with the release name atlas-operator
:
helm install atlas-operator oci://ghcr.io/ariga/charts/atlas-operator --namespace atlas-operator --create-namespace
Providing credentials
URLs
In order to manage the schema of your database, you must provide the Atlas Kubernetes Operator with an Atlas URL to your database. The Atlas Kubernetes Operator will use this URL to connect to your database and apply changes to the schema. As this string usually contains sensitive information, we recommend storing it as a Kubernetes secret.
Both AtlasSchema
and AtlasMigration
resources support defining the connection URL as a string (using the url
field)
or as a reference to a Kubernetes secret (using the urlFrom
field):
If you are connecting to a database that is running in your Kubernetes cluster, note that you should
you use a namespace-qualified DNS name to connect to it. Connections are made from the Atlas Kubernetes Operator's
namespace, so you should use a DNS name that is resolvable from that namespace. For example, to connect to the mysql
service in the app
namespace use:
mysql://user:pass@mysql.app:3306/myapp
Schema and database bound URLs
Notice that depending on your use-case, you may want to use either a schema-bound or a database-bound connection.
- Schema-bound connections are recommended for most use-cases, as they allow you to manage the schema of a database
schema (named database). This is akin to selecting a specific database using a
USE
statement. - Database-bound connections are recommended for use-cases where you want to manage multiple schemas in a single database at once.
Create a file named db-credentials.yaml
with the following contents:
- MySQL (Schema-bound)
- MySQL (DB-bound)
- PostgreSQL (Schema-bound)
- PostgreSQL (DB-bound)
apiVersion: v1
kind: Secret
metadata:
name: mysql-credentials
type: Opaque
stringData:
url: "mysql://user:pass@your.db.dns.default:3306/myapp"
This defines a schema-bound connection to a named database called myapp
.
apiVersion: v1
kind: Secret
metadata:
name: mysql-credentials
type: Opaque
stringData:
url: "mysql://user:pass@your.db.com:3306/"
This defines a database-bound connection.
apiVersion: v1
kind: Secret
metadata:
name: pg-credentials
type: Opaque
stringData:
url: "postgres://user:pass@your.db.com:5432/?search_path=myapp&sslmode=disable"
This defines a schema-bound connection to a schema called myapp
to the default database (usually postgres
).
apiVersion: v1
kind: Secret
metadata:
name: pg-credentials
type: Opaque
stringData:
url: "postgres://user:pass@your.db.com:5432/?sslmode=disable"
urlFrom
The urlFrom
field is used to load the URL of the target database from another resource such as a
Kubernetes secret.
urlFrom:
secretKeyRef:
key: url
name: mysql-credentials
url
The url
field is used to define the URL of the target database directly. This is not recommended, but may
be useful for testing purposes.
url: "mysql://user:pass@localhost:3306/myapp"
credentials
object
Alternatively, you can provide the credentials for your database as a credentials
object.
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasSchema
metadata:
name: atlasschema-postgres
spec:
credentials:
scheme: postgres
host: postgres.default
user: root
passwordFrom:
secretKeyRef:
key: password
name: postgres-credentials
database: postgres
port: 5432
parameters:
sslmode: disable
# ... rest of the resource
Note: hostFrom
and userFrom
are also supported to load the host and user from another resource.
Declarative migrations
The Atlas Operator relies on a Kubernetes custom resource called AtlasSchema
to define the desired schema of your database.
You can find the full definition for this resource on the ariga/atlas-operator repo.
This resource describes the desired state of a target database that is used in the operator's reconciliation loop.
Let's review an example of such as a resource. Create a file named atlas-schema.yaml
with the following contents:
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasSchema
metadata:
name: myapp
spec:
# Load the URL of the target database from a Kubernetes secret.
urlFrom:
secretKeyRef:
key: url
name: mysql-credentials
# Define the desired schema of the target database. This can be defined in either
# plain SQL like this example or in Atlas HCL.
schema:
sql: |
create table users (
id int not null auto_increment,
name varchar(255) not null,
email varchar(255) unique not null,
short_bio varchar(255) not null,
primary key (id)
);
# Define policies that control how the operator will apply changes to the target database.
policy:
# Define a policy that will stop the reconciliation loop if the operator detects
# a destructive change such as dropping a column or table.
lint:
destructive:
error: true
# Define a policy that omits DROP COLUMN changes from any produced plan.
diff:
skip:
drop_column: true
# Exclude a table from being managed by the operator. This is useful for resources
# that belong to other applications and are managed in other ways.
exclude:
- external_table_managed_elsewhere
Alternatively, you can use a ConfigMap
to store the desired schema of your database. To use
this approach, create a ConfigMap with the desired schema:
kind: ConfigMap
apiVersion: v1
metadata:
name: mysql-schema
data:
schema.sql: |
create table users (
id int not null auto_increment,
name varchar(255) not null,
email varchar(255) unique not null,
short_bio varchar(255) not null,
primary key (id)
);
Then, update the AtlasSchema
resource to reference the config map key:
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasSchema
metadata:
name: myapp
spec:
schema:
configMapKeyRef:
key: schema.sql
name: mysql-schema
# Skipping all other fields for brevity.
A word of caution: if you have multiple AtlasSchema
resources that target the same database, you should
ensure that the schemas defined in each resource do not overlap. Otherwise, the operator may produce
unexpected results when attempting to reconcile the state of the database.
AtlasSchema
API reference
The AtlasSchema
resource is used to define the desired state of a target database.
Available fields:
Providing credentials
See urlFrom
, url
or credentials
.
schema
The schema
field is used to define the desired schema of the target database.
This can be defined in either plain SQL or in Atlas HCL.
schema:
sql: |
create table users (
id int not null auto_increment,
name varchar(255) not null,
email varchar(255) unique
);
schema:
hcl: |
schema "myapp" {
}
table "users" {
column "id" {
type = int
}
// ...
}
Alternatively, you can reference a key in a ConfigMap
to store the desired schema of your database:
schema:
configMapKeyRef:
key: schema.sql
name: mysql-schema
Note: the .sql
suffix on the key denotes that the schema is defined in plain SQL. If you are using Atlas HCL,
you should use the .hcl
suffix instead.
policy
The policy
field is used to define policies that control how the operator will apply changes to the target database.
lint
The lint
field is used to define policies that control how the operator will lint the desired schema.
policy:
lint:
destructive:
error: true
Currently, the only lint policy available is destructive
. This policy controls what the operator will do if it
detects a destructive change such as dropping a column or table.
diff
The diff
field is used to define the diff policies that are used to plan the schema changes.
policy:
diff:
skip:
drop_column: true
Currently, the only diff policy available is skip
. This policy controls which kind of schema changes will be
skipped from the produced plan. The following options are available:
add_column
,add_foreign_key
,add_index
,add_schema
,add_table
,drop_column
,drop_foreign_key
,drop_index
,drop_schema
,drop_table
,modify_column
,modify_foreign_key
,modify_index
,modify_schema
,modify_table
exclude
The exclude
field is used to define tables that should be excluded from being managed by the operator.
This is useful for resources that belong to other applications and are managed in other ways.
exclude:
- external_table_managed_elsewhere
schemas
The schemas
field is used to define schemas that should be managed by the operator. This is used
to manage multiple schemas in a single database and works only if the connection string is database-bound.
schemas:
- schema1
- schema2
devURL
Atlas relies on a Dev Database to normalize schemas and make various
calculations. By default, the operator will spin a new database pod to use as the Dev Database.
However, if you have special requirements (such as have it preloaded with certain extensions or configuration),
you can use the devURL
field to specify a URL to use as the Dev Database.
devURL: mysql://root:password@dev-db-svc.default:3306/dev
devURLFrom
Alternatively, you can use the devURLFrom
field to specify a key in a Secret
that contains the Dev Database URL.
devURLFrom:
secretKeyRef:
key: dev-url
name: myapp
Versioned migrations
ConfigMap
The Atlas Operator supports defining the migration directory as ConfigMap
with key-value pairs where the key is the file name and the value is the content of the migration file.
apiVersion: v1
kind: ConfigMap
metadata:
name: migration-dir
data:
20230316085611.sql: |
create table users (
id int not null auto_increment,
name varchar(255) not null,
email varchar(255) unique not null,
short_bio varchar(255) not null,
primary key (id)
);
atlas.sum: |
h1:sPURLhQRvLU79Dnlaw3aiU4KVkyUVEmW+ekenqu/V2o=
20230316085611.sql h1:FKUFrD9E4ceSeBZ5owv2c05Ag8rokXGKXp53ZctbocE=
Then, update the AtlasMigration
resource to reference the config map key:
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasMigration
metadata:
name: atlasmigration-sample
spec:
urlFrom:
secretKeyRef:
key: url
name: db-credentials
dir:
configMapRef:
name: "migration-dir" # ConfigMap name of atlas-migration-configmap.yaml
Remote Directory (Atlas Cloud)
If your migration directory is connected to Atlas Cloud you can configure the operator to read it from your cloud account.
To configure your AtlasMigration
to read the directory from your cloud account, set the cloud
and dir
attributes.
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasMigration
metadata:
name: atlasmigration-sample
spec:
urlFrom:
secretKeyRef:
key: url
name: db-credentials
cloud:
tokenFrom:
secretKeyRef:
key: token
name: atlas-credentials
dir:
remote:
name: "atlas" # Migration directory name in your atlas cloud project
tag: "commit-id" # When tag is not specified, the latest tag will be used
AtlasMigration
API reference
Providing credentials
See urlFrom
, url
or credentials
.
envName
The envName
sets the environment name used for reporting runs to Atlas Cloud.
By default, the environment name is set to Kubernetes
.
envName: "production"
cloud
Cloud is used to define the Atlas Cloud configuration.
tokenFrom
The tokenFrom
field is used to load the Atlas Cloud token from another resource such as a Kubernetes secret.
cloud:
tokenFrom:
secretKeyRef:
key: token
name: atlas-credentials
dir
The dir
field is used to define the migration directory.
For a directory defined in a ConfigMap
, use configMapRef
to reference the ConfigMap
that contains the migration files.
dir:
configMapRef:
name: "migration-dir" # ConfigMap name of atlas-migration-configmap.yaml
For remote directory, use remote
to define the migration directory in Atlas Cloud.
dir:
remote:
name: "atlas"
tag: "commit-id"
Alternatively, you can use the local
field to define the migration directory inline in the AtlasMigration
resource.
dir:
local:
20230316085611.sql: |
create table users (
id int not null auto_increment,
name varchar(255) not null,
email varchar(255) unique not null,
short_bio varchar(255) not null,
primary key (id)
);
atlas.sum: |
h1:sPURLhQRvLU79Dnlaw3aiU4KVkyUVEmW+ekenqu/V2o=
20230316085611.sql h1:FKUFrD9E4ceSeBZ5owv2c05Ag8rokXGKXp53ZctbocE=
revisionsSchema
The revisionsSchema
field is used to define the schema name for the revisions table.
Details about the revisions table can be found in the Revisions Schema.
revisionsSchema: "atlas"