Column-Level Data Lineage

Atlas Cloud traces column-level data lineage across tables, views, and datasets. Select any column to see exactly where its data comes from, which transformation expressions shape it, and what downstream objects depend on it.

Lineage is supported for PostgreSQL, MySQL, ClickHouse, CockroachDB, and Snowflake (including dynamic tables).

Atlas Cloud lineage graph showing column lineage with a pinned expression

Open the Lineage Graph

In the project overview, click the Lineage tab and select a table or view. Atlas renders the lineage graph for the selected object.

Atlas Cloud overview page with the Lineage tab selected

Selecting a table or view in the Atlas Cloud lineage graph

You can also open the lineage graph from the ERD and Docs views. In either view, use the lineage action to open the graph for the selected object:

Opening the lineage graph from the ERD view in Atlas Cloud

Atlas opens the lineage graph in a modal so you can inspect the object without leaving the current view:

Lineage graph opened in a modal from Atlas Cloud Docs or ERD

Select and Pin Columns

Use [+] and [-] to expand or collapse upstream and downstream nodes.

Expanded upstream and downstream lineage graph in Atlas Cloud

Click a column to inspect its lineage:

Selecting a column in the Atlas Cloud lineage graph

Selecting multiple columns shows their lineage together:

Selecting multiple columns in the Atlas Cloud lineage graph

Click a selected column again to pin it and highlight its lineage path. Click it once more to unpin it:

Pinning a selected column in the Atlas Cloud lineage graph

You can also pin columns that appear elsewhere in the selected lineage path:

Pinning a related column from the selected lineage path in Atlas Cloud

To remove a selected column from the view, click (-) on the column:

Deselecting a column in the Atlas Cloud lineage graph

Inspect Transformation Expressions

Atlas marks transformation steps on the lineage path with [f]. Hover over [f] to preview the expression that transforms upstream fields along the path. This is useful for understanding how derived columns like revenue calculations or status aggregations are built from source fields.

Previewing a transformation expression from the lineage path in Atlas Cloud

Clicking [f] keeps the expression card visible while you continue exploring the graph:

Pinned expression card on the Atlas Cloud lineage graph

External Datasets

When a view or query references a table that is not defined in the current schema, Atlas renders it as an external dataset node in the lineage graph. This happens when the source table lives in a different schema or database, or is managed outside of Atlas.

External dataset nodes let you trace lineage beyond the boundaries of your managed schema without losing visibility into where data comes from.

Programmatic Access

Beyond the visual graph, Atlas can export a repository's lineage from the command line for rendering, programmatic traversal, or ingestion into a lineage catalog. The atlas cloud repo lingraph command prints the graph to stdout in one of two formats:

  • Atlas format (default) - a compact node/edge graph, ideal for rendering or programmatic traversal.
  • OpenLineage (--open-lineage) - the same lineage as a list of OpenLineage RunEvents for ingestion into catalogs such as Marquez or DataHub.
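As a rough illustration of scripting the export, the sketch below wraps the command in Python. It uses only the --slug and --open-lineage flags shown in this section. The helper names lingraph_command and export_lineage are hypothetical, and the sketch assumes that the CLI is installed and authenticated and that both formats print JSON to stdout.

```python
import json
import subprocess


def lingraph_command(slug, open_lineage=False):
    """Build the argv for `atlas cloud repo lingraph` (hypothetical helper)."""
    argv = ["atlas", "cloud", "repo", "lingraph", "--slug", slug]
    if open_lineage:
        # Switch from the default Atlas format to OpenLineage RunEvents.
        argv.append("--open-lineage")
    return argv


def export_lineage(slug, open_lineage=False):
    """Run the command and parse stdout.

    Assumption: the output is valid JSON in both formats.
    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        lingraph_command(slug, open_lineage),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling export_lineage("my-repo") would then return the parsed graph, while export_lineage("my-repo", open_lineage=True) would return the event list for forwarding to a catalog.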

The machine-readable graph is also a ready source of context for AI coding agents. An agent asked to rename or drop a column can run atlas cloud repo lingraph to load the column-level dependency graph, trace every downstream view and column that reads from that column, and then rewrite them or flag the change as breaking, all without a human mapping the dependencies by hand. The compact default format is small enough to drop straight into the model's context.
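The impact analysis described above can be sketched as a breadth-first walk over column-level edges. This is a minimal sketch, not part of Atlas: the function name downstream_columns is hypothetical, and the edges are copied from the example output documented later in this section (with expr omitted for brevity).

```python
from collections import deque

# Edges copied from the example lingraph output in this section.
EDGES = [
    {"from": "external_dataset/external_source", "to": "schema/main/view/ext_view",
     "fromColumn": "name", "toColumn": "name"},
    {"from": "schema/main/table/users", "to": "schema/main/view/user_names"},
    {"from": "schema/main/table/users", "to": "schema/main/dataset/user_names/u_cte@5",
     "fromColumn": "name", "toColumn": "user_name"},
    {"from": "schema/main/dataset/user_names/u_cte@5", "to": "schema/main/view/user_names",
     "fromColumn": "user_name", "toColumn": "user_name"},
]


def downstream_columns(edges, node_id, column):
    """Return every (node id, column) that reads from the given column, in BFS order."""
    start = (node_id, column)
    seen = {start}
    queue = deque([start])
    affected = []
    while queue:
        current = queue.popleft()
        for edge in edges:
            if "fromColumn" not in edge:
                continue  # object-level edge: carries no column information
            if (edge["from"], edge["fromColumn"]) != current:
                continue
            target = (edge["to"], edge["toColumn"])
            if target not in seen:
                seen.add(target)
                affected.append(target)
                queue.append(target)
    return affected


# Columns an agent would need to review before renaming main.users.name.
for node, col in downstream_columns(EDGES, "schema/main/table/users", "name"):
    print(f"{node}.{col}")
```

An empty result, as for the id column of users in this example, suggests the column has no recorded downstream readers in the exported graph.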

The default format is a graph of nodes and edges. Take a schema with a table, a view built from a CTE, and a view that reads from an external source:

schema.sql

    CREATE TABLE main.users (
      id int NOT NULL,
      name varchar(255) NOT NULL
    );

    CREATE VIEW main.user_names AS
      WITH u_cte AS (
        SELECT u.name AS user_name FROM main.users AS u
      )
      SELECT u_cte.user_name AS user_name FROM u_cte;

    CREATE VIEW main.ext_view AS
      SELECT e.name AS name FROM external_source AS e;

    atlas cloud repo lingraph --slug my-repo

    {
      "nodes": [
        { "id": "external_dataset/external_source", "type": "external_dataset", "name": "external_source", "schema": "" },
        { "id": "schema/main/table/users", "type": "table", "name": "users", "schema": "main" },
        { "id": "schema/main/view/ext_view", "type": "view", "name": "ext_view", "schema": "main" },
        { "id": "schema/main/view/user_names", "type": "view", "name": "user_names", "schema": "main" },
        { "id": "schema/main/dataset/user_names/u_cte@5", "type": "dataset", "name": "u_cte@5", "schema": "main" }
      ],
      "edges": [
        { "from": "external_dataset/external_source", "to": "schema/main/view/ext_view", "fromColumn": "name", "toColumn": "name", "expr": "e.name AS name" },
        { "from": "schema/main/table/users", "to": "schema/main/view/user_names" },
        { "from": "schema/main/table/users", "to": "schema/main/dataset/user_names/u_cte@5", "fromColumn": "name", "toColumn": "user_name", "expr": "u.name AS user_name" },
        { "from": "schema/main/dataset/user_names/u_cte@5", "to": "schema/main/view/user_names", "fromColumn": "user_name", "toColumn": "user_name", "expr": "u_cte.user_name AS user_name" }
      ]
    }

Each node has a stable id, a type, a name, and a schema. The node types are:

  • table / view - objects managed in the schema (schema/<schema>/table|view/<name>).
  • external_dataset - a source not managed in the schema, such as external_source.
  • dataset - an intermediate dataset such as a CTE, named <view>/<cte>@<line>. For example, user_names/u_cte@5 is the u_cte CTE defined on line 5 of user_names.
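Because the ids follow a predictable pattern, a consumer can split them back into their parts. The sketch below is one way to do that, based only on the id shapes visible in the example output. NodeRef and parse_node_id are hypothetical names, and the parser assumes that schema and object names contain no "/" characters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeRef:
    type: str                   # "table", "view", "dataset", or "external_dataset"
    name: str                   # matches the node's "name" field
    schema: str = ""            # empty for external datasets
    view: Optional[str] = None  # owning view (datasets only)
    cte: Optional[str] = None   # CTE name without the @<line> suffix (datasets only)
    line: Optional[int] = None  # line number from the @<line> suffix (datasets only)


def parse_node_id(node_id: str) -> NodeRef:
    """Split a node id into its components; raise ValueError on unknown shapes."""
    parts = node_id.split("/")
    if parts[0] == "external_dataset" and len(parts) == 2:
        return NodeRef(type="external_dataset", name=parts[1])
    if parts[0] == "schema" and len(parts) == 4 and parts[2] in ("table", "view"):
        return NodeRef(type=parts[2], name=parts[3], schema=parts[1])
    if parts[0] == "schema" and len(parts) == 5 and parts[2] == "dataset":
        # Dataset ids look like schema/<schema>/dataset/<view>/<cte>@<line>.
        cte, sep, line = parts[4].rpartition("@")
        if sep and line.isdigit():
            return NodeRef(type="dataset", name=parts[4], schema=parts[1],
                           view=parts[3], cte=cte, line=int(line))
    raise ValueError(f"unrecognized node id: {node_id!r}")


print(parse_node_id("schema/main/dataset/user_names/u_cte@5"))
```

Rejecting unknown shapes with an error, rather than guessing, keeps the sketch safe if the export contains id forms not covered by this example.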

Each edge connects two nodes by their from/to ids. Column-level edges add fromColumn, toColumn, and expr (the SQL projection); an edge with only from/to is an object-level dependency, such as users to user_names.
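To make the distinction between the two edge kinds concrete, the following sketch separates them and then walks column-level edges backwards to recover where a column's data comes from, together with the expression at each hop. The function names split_edges and upstream_hops are hypothetical, and the edges are copied from the example output above.

```python
# Edges copied from the example lingraph output above.
EDGES = [
    {"from": "external_dataset/external_source", "to": "schema/main/view/ext_view",
     "fromColumn": "name", "toColumn": "name", "expr": "e.name AS name"},
    {"from": "schema/main/table/users", "to": "schema/main/view/user_names"},
    {"from": "schema/main/table/users", "to": "schema/main/dataset/user_names/u_cte@5",
     "fromColumn": "name", "toColumn": "user_name", "expr": "u.name AS user_name"},
    {"from": "schema/main/dataset/user_names/u_cte@5", "to": "schema/main/view/user_names",
     "fromColumn": "user_name", "toColumn": "user_name", "expr": "u_cte.user_name AS user_name"},
]


def split_edges(edges):
    """Return (column_level, object_level) lists of edges."""
    column_level = [e for e in edges if "fromColumn" in e and "toColumn" in e]
    object_level = [e for e in edges if "fromColumn" not in e and "toColumn" not in e]
    return column_level, object_level


def upstream_hops(edges, node_id, column):
    """Walk column-level edges backwards from a column.

    Returns a list of (source node id, source column, expr) tuples,
    nearest hop first for a linear chain.
    """
    column_level, _ = split_edges(edges)
    hops = []
    seen = set()
    stack = [(node_id, column)]
    while stack:
        target = stack.pop()
        if target in seen:
            continue  # guard against revisiting a column
        seen.add(target)
        for edge in column_level:
            if (edge["to"], edge["toColumn"]) == target:
                hops.append((edge["from"], edge["fromColumn"], edge.get("expr", "")))
                stack.append((edge["from"], edge["fromColumn"]))
    return hops


for source, col, expr in upstream_hops(EDGES, "schema/main/view/user_names", "user_name"):
    print(f"{source}.{col}  via  {expr}")
```

For the example graph, this traces user_names.user_name back through the u_cte dataset to the name column of the users table.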