Merged
17 changes: 9 additions & 8 deletions knowledge_base/app_with_database/README.md
@@ -1,25 +1,26 @@
# Databricks app with OLTP database

This example demonstrates how to define a Databricks app backed by
-an OLTP Postgres in a bundle.
+a Lakebase Autoscaling Postgres database in a bundle.

-It includes and deploys an example application that uses Python and Dash and a database instance.
-When application is started it provisions its own schema and demonstration data in the OLTP database.
+It includes and deploys an example application that uses Python and Dash and a Lakebase database project.
+When application is started it provisions its own schema and demonstration data in the Postgres database.

For more information about Databricks Apps see the [documentation](https://docs.databricks.com/aws/en/dev-tools/databricks-apps).
-For more information about Databricks database instances see the [documentation](https://docs.databricks.com/aws/en/oltp/).
+For more information about Lakebase see the [documentation](https://docs.databricks.com/aws/en/oltp/).
+For more information about managing Lakebase with bundles see the [documentation](https://docs.databricks.com/aws/en/oltp/projects/manage-with-bundles).

## Prerequisites

-* Databricks CLI v0.267.0 or above
+* Databricks CLI v1.4.0 or above

## Usage

1. Deploy the bundle:
```
databricks bundle deploy -t dev
```
-Please note that after this bundle gets deployed, the database instance starts running immediately, which incurs cost.
+Please note that after this bundle is deployed, the Lakebase project is created and incurs cost while running. Lakebase Autoscaling scales its compute down to zero when idle.

2. Run the app:
```
@@ -38,7 +39,7 @@ Alternatively, run `databricks bundle summary` to display its URL.

Run the following command to display the data generated by the app:
```
-databricks psql example-database-instance -- --dbname example_database -c "select * from holidays.holiday_requests"
+databricks psql --project example-database -- --dbname example_database -c "select * from holidays.holiday_requests"
```

5. Explore the app data:
@@ -48,7 +49,7 @@ databricks bundle open my_catalog
```

## Clean up
-To remove the provisioned resources run
+To remove the deployed resources run
```
databricks bundle destroy
```
6 changes: 3 additions & 3 deletions knowledge_base/app_with_database/resources/myapp.app.yml
@@ -8,7 +8,7 @@ resources:
       resources:
         - name: "app-db"
           description: "A database for the app to be able to connect to and query"
-          database:
-            database_name: ${resources.database_catalogs.my_catalog.database_name}
-            instance_name: ${resources.database_catalogs.my_catalog.database_instance_name}
+          postgres:
+            branch: ${resources.postgres_projects.my_project.id}/branches/production
+            database: ${resources.postgres_databases.my_database.id}
             permission: "CAN_CONNECT_AND_CREATE"
11 changes: 0 additions & 11 deletions knowledge_base/app_with_database/resources/mydb.database.yml

This file was deleted.

34 changes: 34 additions & 0 deletions knowledge_base/app_with_database/resources/mydb.postgres.yml
@@ -0,0 +1,34 @@
+resources:
+  postgres_projects:
+    my_project:
+      project_id: example-database
+      display_name: "Example app database"
+      pg_version: 16
+
+  # An owner role for the database. Lakebase requires a role when creating a
+  # database, and declaring it explicitly keeps the bundle portable across users
+  # (the auto-created project-owner role's id is derived from the creator's
+  # identity).
+  postgres_roles:
+    my_role:
+      parent: ${resources.postgres_projects.my_project.id}/branches/production
+      role_id: app-owner
+      postgres_role: app_owner
+
+  # Declare the database explicitly so it has a stable id that the app can
+  # reference. Relying on the catalog's create_database_if_missing would create a
+  # database with an auto-generated id, which the app cannot reference by path.

Contributor

Doesn't create_database_if_missing produce a stable name?

Contributor

Btw, there is also an auto-created database called databricks_postgres (or with dash for the API name).

Contributor Author

The name is stable, but the app attaches to the database by its resource path:

projects//branches//databases/<database_id>

With create_database_if_missing, the database gets an auto-generated id, so the resource isn't addressable by that path.
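
As a rough illustration of that addressing (a sketch, not taken from the PR): assuming the two empty path segments are the project id and the branch id, and that `${resources.postgres_projects.my_project.id}` resolves to `projects/<project_id>`, the path of the explicitly declared database could be composed as follows. The variable names are hypothetical; the values are the ids declared in `mydb.postgres.yml`.

```bash
# Ids declared in mydb.postgres.yml (variable names are illustrative only).
project_id="example-database"
branch_id="production"
database_id="example-database"

# Assumption: a project's resource id has the form projects/<project_id>.
branch_path="projects/${project_id}/branches/${branch_id}"

# The app references the database by this full path.
database_path="${branch_path}/databases/${database_id}"
echo "${database_path}"
```

A database created through `create_database_if_missing` would have no known `database_id` to put in the last segment, which is why the bundle declares it explicitly.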

+  postgres_databases:
+    my_database:
+      parent: ${resources.postgres_projects.my_project.id}/branches/production
+      database_id: example-database
+      postgres_database: example_database
+      role: ${resources.postgres_roles.my_role.id}
+
+  # A Unity Catalog catalog backed by the Postgres database, so its data can be
+  # browsed in Unity Catalog.
+  postgres_catalogs:
+    my_catalog:
+      catalog_id: example_database_catalog
+      branch: ${resources.postgres_projects.my_project.id}/branches/production
+      postgres_database: ${resources.postgres_databases.my_database.postgres_database}
21 changes: 11 additions & 10 deletions knowledge_base/database_with_catalog/README.md
@@ -1,14 +1,15 @@
-# OLTP database instance with a catalog
+# Lakebase database project with a catalog

-This Declarative Automation Bundles example demonstrates how to define an OLTP database instance and a database catalog.
+This Declarative Automation Bundles example demonstrates how to define a Lakebase Autoscaling database project and a Unity Catalog catalog backed by it.

-It includes and deploys an example database instance and a catalog. When data changes in the database instance, they are reflected in Unity Catalog.
+It includes and deploys an example project and a catalog. When data changes in the project's Postgres database, it is reflected in Unity Catalog.

-For more information about Databricks database instances, see the [documentation](https://docs.databricks.com/aws/en/oltp/).
+For more information about Lakebase, see the [documentation](https://docs.databricks.com/aws/en/oltp/).
+For more information about managing Lakebase with bundles, see the [documentation](https://docs.databricks.com/aws/en/oltp/projects/manage-with-bundles).

## Prerequisites

-* Databricks CLI v0.265.0 or above
+* Databricks CLI v1.0.0 or above
* `psql` client version 14 or above (only needed to run the demo data generation)

## Usage
@@ -18,23 +19,23 @@ Modify `databricks.yml`:

Run `databricks bundle deploy` to deploy the bundle.

-Please note that after this bundle gets deployed, the database instance starts running, which incurs cost.
+Please note that after this bundle is deployed, the Lakebase project is created and incurs cost while running. Lakebase Autoscaling scales its compute down to zero when idle.

Run the following queries to populate your database with sample data:

```bash
# Create a demo table:
-databricks psql my-instance -- -d my_database -c "CREATE TABLE IF NOT EXISTS hello_world (id SERIAL PRIMARY KEY, message TEXT, number INTEGER, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);"
+databricks psql --project my-project -- -d my_database -c "CREATE TABLE IF NOT EXISTS hello_world (id SERIAL PRIMARY KEY, message TEXT, number INTEGER, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);"

# Insert 100 rows of demo data:
-databricks psql my-instance -- -d my_database -c "INSERT INTO hello_world (message, number) SELECT 'Hello World #' || generate_series, generate_series FROM generate_series(1, 100);"
+databricks psql --project my-project -- -d my_database -c "INSERT INTO hello_world (message, number) SELECT 'Hello World #' || generate_series, generate_series FROM generate_series(1, 100);"

# Show generated rows:
-databricks psql my-instance -- -d my_database -c "SELECT * FROM hello_world;"
+databricks psql --project my-project -- -d my_database -c "SELECT * FROM hello_world;"
```

Open your catalog in Databricks: `databricks bundle open my_catalog`
Navigate to the `public` schema, then to the `hello_world` table, then to "Sample data" and explore your generated data.

## Clean up
-To remove the provisioned instance and catalog run `databricks bundle destroy`
+To remove the project and catalog run `databricks bundle destroy`
19 changes: 10 additions & 9 deletions knowledge_base/database_with_catalog/databricks.yml
@@ -5,16 +5,17 @@ bundle:
# host: https://myworkspace.cloud.databricks.com

 resources:
-  database_instances:
-    my_instance:
-      name: my-instance
-      capacity: CU_1
-  database_catalogs:
+  postgres_projects:
+    my_project:
+      project_id: my-project
+      display_name: "Example project"
+      pg_version: 16
+  postgres_catalogs:
     my_catalog:
-      database_instance_name: ${resources.database_instances.my_instance.name}
-      name: example_catalog
-      database_name: my_database
-      create_database_if_not_exists: true
+      catalog_id: example_catalog
+      branch: ${resources.postgres_projects.my_project.id}/branches/production
+      postgres_database: my_database
+      create_database_if_missing: true


# Defines the targets for this bundle.