
3 posts tagged with "django"


The Hidden Bias of Alembic and Django Migrations (and when to consider alternatives)

· 9 min read
Rotem Tamir
Building Atlas

Python has been a top programming language for the past decade, known for its simplicity and rich ecosystem. Many companies use it to build web apps and server software, thanks to frameworks like Django and SQLAlchemy.

One of the most common (and often loathed) tasks when building backend applications is managing the database schema. As the app's data model evolves, developers need to modify the database schema to reflect those changes. This is where database schema migration tools come into play.

Why devs ❤️ Django Migrations and Alembic

As far as migration tools go, SQLAlchemy and Django have both built robust solutions for managing database schemas: Alembic and Django Migrations, which stand out as some of the best in the field. Both have been around for a while and became popular thanks to their code-first approach:

  1. First, developers define their database schema as code through Python classes, which are also used at runtime to interact with the database.
  2. Next, the migration tool automatically generates the necessary SQL to apply those changes to the database.
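To make this flow concrete, here is roughly what step two looks like with Alembic: a sketch of an autogenerated revision file for a simple user_account model. The revision identifier, file name, and columns are illustrative, not taken from a real project.

# versions/a1b2c3d4e5f6_create_user_account.py (illustrative autogenerated revision)
from alembic import op
import sqlalchemy as sa

# Revision identifiers, assigned by Alembic when the file is generated.
revision = "a1b2c3d4e5f6"
down_revision = None

def upgrade() -> None:
    # Create the table described by the Python model.
    op.create_table(
        "user_account",
        sa.Column("id", sa.Integer(), primary_key=True),
        sa.Column("name", sa.String(length=30), nullable=False),
    )

def downgrade() -> None:
    # Reverse the change when rolling back.
    op.drop_table("user_account")

Running alembic upgrade head (or manage.py migrate in Django) then applies the generated change to the database.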

For most projects, these tools work well and are a great fit, but there are some cases where you should consider looking at a specialized schema management tool. In this article, we'll explore some of the limitations of ORM-based migration tools and present Atlas, a database schema-as-code tool that integrates natively with both Django and SQLAlchemy.

The bias of ORM-based migrations

ORMs are commonly distributed with schema management tools. Without a way to set up the database schema, the ORM cannot function, so each ORM must include something that provides a viable developer experience.

The main purpose of ORMs is to abstract the database layer and deliver a roughly uniform experience across different systems (e.g., PostgreSQL, MySQL, SQL Server, etc.). As an abstraction layer, they tend to concentrate on the shared database features (such as tables, indexes, and columns) rather than on more advanced, database-specific capabilities.
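To see what this abstraction looks like in practice, here is a small sketch using SQLAlchemy Core: the same table definition compiled into DDL for two different dialects. The table and columns are illustrative.

from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import mysql, postgresql
from sqlalchemy.schema import CreateTable

metadata = MetaData()

# One portable definition, limited to features every supported backend understands.
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(30)),
)

# The same model rendered as DDL for two different databases.
print(CreateTable(users).compile(dialect=postgresql.dialect()))
print(CreateTable(users).compile(dialect=mysql.dialect()))

Anything that cannot be expressed in this portable model tends to fall outside the ORM's scope.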

Being ORM maintainers ourselves (the team behind Atlas maintains Ent), we can attest that migrations are often seen as a "necessary evil": something we have to ship, but ultimately an annoying requirement. ORMs exist to bridge code and the database - they are a runtime concern, not a CI/CD or resource-management one.

As such, ORM migration tools tend to be basic, suitable just for the common cases of reading and writing data from tables. In projects that require a more involved schema management process, you might want to consider using a specialized tool like Atlas.

Advanced database features

ORMs scratch the tip of the iceberg of database features

In many systems, the database is viewed simply as a persistence layer, but databases are capable of much more than just storing data. In recent years, there has been a growing trend of using databases for more than just CRUD operations. For example, you might want to use your database for:

  • Stored Procedures, Functions, and Triggers: Logic can be encapsulated in stored procedures or triggers that automatically execute on certain events (e.g., inserts, updates, or deletes). This ensures consistent data validation, auditing, or transformation at the database level.
  • Views and Materialized Views: Views are virtual tables generated from a SELECT query, while materialized views store the query results in a physical table that can be indexed. Materialized views can boost performance for frequently accessed or computationally expensive queries.
  • Custom Data Types: Some databases (e.g., PostgreSQL) allow defining custom data types for domain-specific constraints or storing complex structures that exceed standard built-in types.
  • Extensions: Many databases support extensions that add specialized capabilities. For example, PostGIS (an extension for PostgreSQL) provides advanced geospatial data handling and queries.
  • Row-Level Security (RLS): RLS lets you define policies to control which rows a user can see or modify. This is particularly useful for multi-tenant systems or sensitive data scenarios where granular, row-level permissions are required.
  • Sharding: Sharding distributes data across multiple database nodes (or clusters). This approach can enhance performance, fault tolerance, and scalability, especially in high-traffic, large-volume applications.
  • Enumerated Types (ENUM): ENUM types allow you to define a constrained list of valid values for a column (e.g., "small", "medium", "large"). This can help enforce data consistency and prevent invalid entries.
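ORMs rarely model these objects directly. In SQLAlchemy, for example, a common workaround is to attach raw DDL to table-creation events, which keeps the object outside the ORM's (and its migration tool's) view of the schema. A minimal sketch, assuming a user_account table and a hypothetical log_user_change() database function:

from sqlalchemy import DDL, Column, Integer, MetaData, String, Table, event

metadata = MetaData()
user_account = Table(
    "user_account",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(30)),
)

# Raw DDL attached to the table's "after_create" event. It only runs when
# metadata.create_all() creates the table, and the trigger stays invisible to
# autogenerated migrations. log_user_change() is a hypothetical function.
event.listen(
    user_account,
    "after_create",
    DDL(
        "CREATE TRIGGER user_audit AFTER UPDATE ON user_account "
        "FOR EACH ROW EXECUTE FUNCTION log_user_change()"
    ),
)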

Where Atlas comes in

ORMs typically do not provide a way to manage these advanced database features.

Using Atlas, ORM users can keep using their favorite ORM (e.g., SQLAlchemy) while also extending their data model with advanced database features. This is done using composite_schema, a feature that allows you to define your schema in multiple parts, each using a different schema source. For example:

data "external_schema" "sqlalchemy" {
program = [
"atlas-provider-sqlalchemy",
"--path", "./models",
"--dialect", "postgresql"
]
}

data "composite_schema" "example" {
// First, load the schema with the SQLAlchemy provider
schema "public" {
url = data.external_schema.sqlalchemy.url
}
// Next, load the additional schema objects from a SQL file
schema "public" {
url = "file://extra_resources.sql"
}
}

env "local" {
src = data.composite_schema.example.url
// ... other configurations
}

In this example, we define a composite schema that combines the SQLAlchemy schema with additional schema objects loaded from a SQL file. This allows you to use the full power of your database while still benefiting from the convenience of ORMs.

With composite schemas, we can define a base table with SQLAlchemy and then use a SQL file to define a materialized view that aggregates data from it for faster querying. For instance, let's define a SQLAlchemy model for a user account:

from typing import Optional

from sqlalchemy import Integer, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class User(Base):
    __tablename__ = "user_account"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str] = mapped_column(String(30))
    team_name: Mapped[Optional[str]] = mapped_column(String(30))
    points: Mapped[int] = mapped_column(Integer)

Then use plain SQL to define a materialized view that aggregates the total points per team:

CREATE MATERIALIZED VIEW team_points AS
SELECT team_name, SUM(points) AS total_points
FROM user_account
GROUP BY team_name;
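At query time, the application can still read the materialized view through SQLAlchemy, for example with a textual query. A short sketch; the connection URL is illustrative:

from sqlalchemy import create_engine, text

# Illustrative connection URL.
engine = create_engine("postgresql+psycopg2://app:secret@localhost:5432/appdb")

with engine.connect() as conn:
    # team_points is the materialized view defined above.
    result = conn.execute(text("SELECT team_name, total_points FROM team_points"))
    for team_name, total_points in result:
        print(team_name, total_points)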

Atlas will read both the SQLAlchemy model and the SQL file and generate any necessary SQL migration scripts to apply the changes to the database.

CI/CD Pipelines

Although it happens more frequently than you might hope, database schema migrations should not be executed from a developer's workstation. Running migrations in such a way is error-prone and can lead to inconsistencies between your codebase and the database schema.

Instead, migrations should be applied as part of your CI/CD pipeline. This ensures that only code that was reviewed, approved and merged is deployed to production. Additionally, it reduces the need to grant developers direct access to the production database, which can be a security and compliance risk.

Django and SQLAlchemy are unopinionated about how you run migrations in your CI/CD pipeline. They provide the basic building blocks (e.g., manage.py migrate for Django) and leave it up to you to integrate them into your pipeline.

For simple use-cases, this is fine. But as your project grows, you might find yourself needing more control over the migration process. For example, you might want to:

  1. Automate code review. Automatically verify that migrations are safe to apply before running them. Integrating automatic checks into your CI/CD pipeline can help catch issues early and prevent bad migrations from being applied.

  2. Integrate with CD systems. As systems evolve, organizations often adopt more complex deployment strategies that require advanced tooling (e.g., GitOps, Infrastructure as Code). Integrating migrations natively into these systems requires a substantial engineering effort (e.g., writing a Kubernetes Operator or a Terraform provider).

  3. Monitor schema drift. As much as we'd like to believe that production environments are air-tight, and never touched by human hands, the reality is that things happen. Monitoring schema drift can help you catch unexpected changes to your database schema and take corrective action before they cause issues.

Atlas ships with native integrations for popular CI/CD systems like GitHub Actions, GitLab CI, BitBucket Pipelines, Kubernetes, Terraform, and more. This allows you to easily integrate schema management into your existing CI/CD pipelines without having to write brittle custom scripts or plugins.

One migration tool to rule them all

If your company's tech stack is uniform and everyone is using the same ORM and database system, you might not be worried about the need to standardize on a single migration tool, but as companies grow, the tech stack can become more diverse.

This is especially true when adopting a microservices architecture, as different teams might be using different ORMs, languages, or database systems. While this is great for flexibility, it can make it very difficult for platform, security, and compliance functions to ensure that all teams are following the same best practices. This is where choosing a single migration tool can help.

Atlas is designed to be a universal schema management tool that can be used across different ORMs, languages, and database systems. It provides a consistent experience for managing database schema, regardless of the underlying technology stack.

Through a plugin system that provides bindings to different ORMs and database systems, Atlas can serve as the common denominator for schema management across your organization.

Conclusion

Django Migrations and Alembic are great tools that have served Python developers well. They make schema changes simple and work seamlessly with their respective ORMs. But ORMs focus on abstracting databases, not managing them.

For teams that need more advanced database features, better CI/CD integration, or consistency across multiple stacks — a dedicated schema management tool like Atlas can help. It works alongside ORMs, letting developers define schema as code while keeping full control over the database.

If you're running into the limits of ORM-based migrations, give Atlas a try!

Announcing v0.20: New Pricing Model, Django Support, Postgres Extensions, and more

· 7 min read

Hi everyone,

It's been a few weeks since our last version announcement and today I'm happy to share with you v0.20, which includes some big changes and exciting features:

  • New Pricing Model - As we announced earlier this month, the new pricing model took effect on March 15th. The new pricing is usage-based, offering you more flexibility and cost efficiency. Read about what prompted this change and view the new pricing plans here.
  • Django ORM Integration - Atlas now supports Django! Django is a popular Python web framework with a built-in ORM. Developers using Django can now use Atlas to automatically plan schema migrations based on the desired state of their schema, instead of crafting them by hand.
  • Support for PostgreSQL Extensions - Atlas now supports installing and managing PostgreSQL extensions.
  • Dashboards in the Cloud - The dashboard (previously the 'Projects' page) got a whole new look in Atlas Cloud. Now you can view the state of your projects and environments at a glance.
  • SQL Server is out of Beta - SQL Server is officially out of Beta! Along with this official support, we have included some new features:
    • User-Defined Types support for SQL Server - Atlas now supports two User-Defined Types: alias types and table types.
    • Azure Active Directory (AAD) Authentication for SQL Server - Connect to your SQL Server database using AAD Authentication.

Let’s dive in!

New Pricing Model

As of March 15th, there is a new pricing model for Atlas users. This change is a result of feedback we received from many teams that the previous $295/month minimum was prohibitive, and a gradual, usage-based pricing model would help them adopt Atlas in their organizations.

You can read the full reasoning for the change and a breakdown of the new pricing in this blog post.

Django ORM Integration

Django is the most popular web framework in the Python community. It includes a built-in ORM which allows users to describe their data model using Python classes. Migrations are then created using the makemigrations command and applied to the database using the migrate command.
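For instance, adding a team_name field to an existing User model would produce a migration file along these lines (the app label, file name, and field are illustrative):

# app/migrations/0002_user_team_name.py (roughly what makemigrations generates)
from django.db import migrations, models

class Migration(migrations.Migration):
    # Depends on the previous migration in the same app.
    dependencies = [("app", "0001_initial")]

    operations = [
        migrations.AddField(
            model_name="user",
            name="team_name",
            field=models.CharField(max_length=30, null=True),
        ),
    ]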

Among the many ORMs available in our industry, Django's automatic migration tool is one of the most powerful and robust. It can handle a wide range of schema changes; however, having been created in 2014, a very different era in software engineering, it naturally has some limitations.

Some of the limitations of Django's migration system include:

  1. Database Features - Because it was created to provide interoperability across database engines, Django's migration system is centered around the "lowest common denominator" of database features.

  2. Ensuring Migration Safety - Migrations are a risky business. If you're not careful, you can easily cause data loss or a production outage. Django's migration system does not provide a native way to ensure that a migration is safe to apply.

  3. Modern Deployments - Django does not provide native integration with modern deployment practices such as GitOps or Infrastructure-as-Code.

Atlas, on the other hand, lets you manage your Django applications using the Database Schema-as-Code paradigm. This means that you can use Atlas to automatically plan schema migrations for your Django project, and then apply them to your database.

Read the full guide to set up Atlas for your Django project.

Support for PostgreSQL Extensions

Postgres extensions are add-on modules that enhance the functionality of the database by introducing new objects, such as functions, data types, operators, and more.

The support for extensions has been highly requested, so we are excited to announce that they are finally available!

To load an extension, add the extension block to your schema file. For example, adding PostGIS would look similar to:

extension "postgis" {
schema = schema.public
version = "3.4.1"
comment = "PostGIS geometry and geography spatial types and functions"
}

Read more about configuring extensions in your schema here.

Dashboards in the Cloud

Atlas Cloud has a new and improved dashboard view!

When working with multiple databases, environments, or even projects - it becomes increasingly difficult to track and manage the state of each of these components. With Atlas Cloud, we aim to provide a single source of truth, allowing you to get a clear overview of each schema, database, environment, deployment and their respective statuses.

project-dashboard

Once you push your migration directory to the schema registry, you will be able to see a detailed dashboard like the one shown above.

Let’s break down what we see:

  • The usage calendar shows when changes are made to your migration directory via the migrate push command in CI.

  • The databases show the state of your target databases. This list will be populated once you have set up deployments for your migration directory. The state of the database can be one of the following:

    • Synced - the database is at the same version as the latest version of your migration directory schema.
    • Failed - the last deployment has failed on this database.
    • Pending - the database is not up to date with the latest version of your migration directory schema.

Alternatively, you can view this page per environment. This way, you can see a comprehensive list of the status of each database in each environment.

project-envs

SQL Server out of Beta

We are proud to announce that SQL Server is officially supported by Atlas! Since our release of SQL Server in Beta last August, our team has been working hard to refine and stabilize its performance.

In addition, we have added two new capabilities to the SQL Server driver.

User-Defined Types Support

In SQL Server, user-defined types (UDTs) are a way to create custom data types that group together existing data types. Atlas now supports alias types and table types.

Alias Types

Alias types allow you to create a custom data type, which can then make your code more readable and maintainable.

For example, you might want to create an alias type email_address for the VARCHAR(100) data type. Instead of repeating VARCHAR(100) throughout the code, you can use email_address everywhere, which keeps the schema consistent and easier to read.

In the schema.hcl file, you would define this like so:

type_alias "email_address" {
schema = schema.dbo
type = varchar(100)
null = false
}
table "users" {
schema = schema.dbo
column "email_address" {
type = type_alias.email_address
}
}

Table Types

Table types allow you to define a structured data type that represents a table structure. These are particularly useful for passing sets of data between stored procedures and functions. They can also be used as parameters in stored procedures or functions, allowing you to pass multiple rows of data with a single parameter.

For example, suppose we have a table type that describes the structure of an address. We can declare this type and later use it in a function:

type_table "address" {
schema = schema.dbo
column "street" {
type = varchar(255)
}
column "city" {
type = varchar(255)
}
column "state" {
type = varchar(2)
}
column "zip" {
type = type_alias.zip
}
index {
unique = true
columns = [column.ssn]
}
check "zip_check" {
expr = "len(zip) = 5"
}
}
function "insert_address" {
schema = schema.dbo
lang = SQL
arg "@address_table" {
type = type_table.address
readonly = true // The table type is readonly argument.
}
arg "@zip" {
type = type_alias.zip
}
return = int
as = <<-SQL
BEGIN
DECLARE @RowCount INT;
INSERT INTO address_table (street, city, state, zip)
SELECT street, city, state, zip
FROM @address_table;

SELECT @RowCount = @ROWCOUNT;

RETURN @RowCount;
END
SQL
}
type_alias "zip" {
schema = schema.dbo
type = varchar(5)
null = false
}

Read the documentation to learn how to use these types in Atlas.

Azure Active Directory (AAD) Authentication

Now when using SQL Server with Atlas, instead of providing your regular database URL, you can connect to your Azure instance with Azure Active Directory Authentication.

Use the fedauth parameter to specify the AAD authentication method. For more information, see the document on the underlying driver.

To connect to your Azure instance using AAD, the URL will look similar to:

azuresql://<instance>.database.windows.net?fedauth=ActiveDirectoryDefault&database=master

Wrapping up

That's it! I hope you try out (and enjoy) all of these new features and find them useful. As always, we would love to hear your feedback and suggestions on our Discord server.

Quickly visualize your Django schemas with DjangoViz

· 3 min read
DjangoViz is Deprecated

DjangoViz has been deprecated. Please refer to the Atlas Django Provider documentation for up to date instructions.

Having a visual representation of your data model can be helpful as it allows for easier comprehension of complex data structures, and enables developers to better understand and collaborate on the data model of the application they are building.

Entity relationship diagrams (ERDs) are a common way to visualize data models, by showing how data is stored in the database. ERDs are graphical representations of the entities, their attributes, and the way these entities are related to each other.

Today we are happy to announce the release of DjangoViz, a new tool for automatically creating ERDs from Django data models.

Django is an open source Python framework for building web applications quickly and efficiently. In this blog post, I will introduce DjangoViz and demonstrate how to use it for generating Django schema visualizations using the Atlas playground.


Django ORM

Django ORM is a built-in module in the Django web framework. It offers a high-level abstraction layer that enables developers to define complex application data models with ease. Instead of hand-writing tables and foreign keys in SQL, developers define their models as Python classes and the relationships between them:

from django.db import models

class User(models.Model):
    username = models.CharField(max_length=255)
    email = models.EmailField(unique=True)
    password = models.CharField(max_length=255)

class Post(models.Model):
    title = models.CharField(max_length=255)
    content = models.TextField()
    author = models.ForeignKey(User, on_delete=models.CASCADE)

When the application runs, Django translates these Python models into database schemas, mapping each model to a corresponding database table and each field to a corresponding column in the table.
When working with schemas and making changes to them, understanding the full picture from code alone can get complicated very quickly. To help developers better understand their schema, we have created DjangoViz.

Introducing DjangoViz

For the purpose of this demo, we will follow the Django getting started tutorial, and showcase how you can use DjangoViz to visualize the default models included by Django's startproject command.

First, install Django and create a new project:

pip install Django
django-admin startproject atlas_demo
cd atlas_demo

Install the DjangoViz package:

pip install djangoviz

Add DjangoViz to your Django project's INSTALLED_APPS in atlas_demo/settings.py:

INSTALLED_APPS = [
    ...,
    'djangoviz',
    ...,
]

DjangoViz supports either PostgreSQL or MySQL; in this example, we will use PostgreSQL:

Install the PostgreSQL driver:

pip install psycopg2-binary

Configure the database to work with PostgreSQL in the settings.py file:

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "postgres",
        "USER": "postgres",
        "PASSWORD": "pass",
        "HOST": "127.0.0.1",
        "PORT": "5432",
    }
}

Start a PostgreSQL container:

docker run --rm -p 5432:5432  -e POSTGRES_PASSWORD=pass -d postgres:15

Now, you can visualize your schema by running the djangoviz management command from your new project directory:

python manage.py djangoviz

You will get a public link to your visualization, which will present an ERD and the schema itself in SQL or HCL:

Here is a public link to your schema visualization:
https://gh.atlasgo.cloud/explore/ac523fef

When clicking on the link you will see the ERD of your new project:

[ERD of the new project]

Wrapping up

In this post, we discussed DjangoViz, a new tool that helps you quickly visualize Django schemas. With it, you can easily get an overview of your data model and a visual representation of your schema. We would love to hear your thoughts and feedback if you decide to give it a go!

Have questions? Feedback? Find our team on our Discord server ❤️.