Databricks’ Unity Catalog uses a three-level namespace to organize and manage data assets:

  1. Catalog: This is the top level of the namespace, roughly analogous to a database in a traditional system. It groups related data assets together.
  2. Schema (Database): Within a catalog, schemas (also called databases) provide a second layer of organization, similar to folders or directories. They contain tables, views, volumes, and models.
  3. Objects: These are the actual data assets, such as tables, views, volumes, and models, stored within schemas. Each object is addressed by its full three-part name, catalog.schema.object.
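
To make the three levels concrete, here is a minimal sketch of splitting and rebuilding a fully qualified name. The QualifiedName class is a hypothetical helper written for this illustration, not part of any Databricks library, and it assumes plain identifiers (it does not handle backtick-quoted names that contain dots).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedName:
    """A three-level name of the form catalog.schema.object (hypothetical helper)."""

    catalog: str
    schema: str
    obj: str

    @classmethod
    def parse(cls, name: str) -> "QualifiedName":
        # Assumption: identifiers are unquoted, so a plain split on "." is enough.
        parts = name.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected catalog.schema.object, got {name!r}")
        return cls(*parts)

    def __str__(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.obj}"
```

For example, QualifiedName.parse("sales_data.north.Q1_2023") yields the catalog "sales_data", the schema "north", and the object "Q1_2023", while a two-part name such as "north.Q1_2023" is rejected.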

Why use a three-level namespace?

  • Granular organization: It allows for a more structured and logical arrangement of data assets, making them easier to find and manage.
  • Reduced naming conflicts: Because an object’s full name includes its catalog and schema, identically named objects from different sources do not collide.
  • Improved security and access control: Permissions can be applied at each level, providing finer-grained control over who can access what.
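
As a rough illustration of per-level permissions, the sketch below builds the SQL GRANT statements that would give a principal read access to a single table: one privilege on the catalog, one on the schema, and one on the table itself. The function name grant_read_access and the principal, catalog, schema, and table names are hypothetical; the code only produces strings and assumes plain, unquoted identifiers.

```python
from typing import List


def grant_read_access(principal: str, catalog: str, schema: str, table: str) -> List[str]:
    """Build one GRANT statement per namespace level (hypothetical helper)."""
    grantee = f"`{principal}`"  # principals are written in backticks in SQL
    return [
        # Level 1: allow the principal to use the catalog at all.
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {grantee}",
        # Level 2: allow the principal to use one schema in that catalog.
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {grantee}",
        # Level 3: allow reads on one specific table.
        f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {grantee}",
    ]
```

In a Databricks notebook, each returned statement could be passed to spark.sql(...). Granting a privilege at a higher level instead (for example, SELECT on the whole schema) would widen access to everything beneath it.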


For example, consider a data warehouse for sales data. You could have a catalog called “sales_data”, a schema for each region (“north”, “south”, “east”, “west”), and a table in each schema for each quarter (“Q1_2023”, “Q2_2023”, and so on). The first-quarter table for the north region would then be addressed as sales_data.north.Q1_2023.
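
The layout above can be sketched as a list of SQL statements generated from the region and quarter names. This is only an illustration: the sales_ddl function and the two-column table definition (order_id, amount) are assumptions made for the example, since the text does not say what the tables contain.

```python
from typing import List

REGIONS = ["north", "south", "east", "west"]
QUARTERS = ["Q1_2023", "Q2_2023"]

# Hypothetical column list; the real tables would have their own columns.
COLUMNS = "order_id BIGINT, amount DECIMAL(12, 2)"


def sales_ddl(catalog: str = "sales_data") -> List[str]:
    """Build CREATE statements for one catalog, four schemas, and their tables."""
    statements = [f"CREATE CATALOG IF NOT EXISTS {catalog}"]
    for region in REGIONS:
        # One schema per region, inside the catalog.
        statements.append(f"CREATE SCHEMA IF NOT EXISTS {catalog}.{region}")
        for quarter in QUARTERS:
            # One table per quarter, addressed by its three-part name.
            statements.append(
                f"CREATE TABLE IF NOT EXISTS {catalog}.{region}.{quarter} ({COLUMNS})"
            )
    return statements
```

Running each statement in order (for instance with spark.sql in a notebook) would create the catalog first, then each schema, then the tables inside it, mirroring the three levels of the namespace.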

Additional notes:

  • Unity Catalog is Databricks’ centralized metadata service. It provides a unified view of your data assets across workspaces and clouds.
  • By default, tables and volumes in Unity Catalog are “managed,” meaning Databricks manages their lifecycle and file layout. However, you can also create “external” tables and volumes, which reference data in a cloud storage location that you manage yourself.
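
The difference between the two kinds of table shows up in the CREATE TABLE statement: adding a LOCATION clause makes the table external. The sketch below builds both variants as strings. The create_table_sql helper, the column list, and the storage path are hypothetical, and an external path would also need to be registered with Unity Catalog before such a statement could succeed.

```python
from typing import Optional


def create_table_sql(name: str, columns: str, location: Optional[str] = None) -> str:
    """Build a CREATE TABLE statement; a location makes the table external."""
    sql = f"CREATE TABLE {name} ({columns})"
    if location is not None:
        # With LOCATION, the table points at files in storage you manage.
        sql += f" LOCATION '{location}'"
    return sql


# Managed table: Databricks decides where the files live.
managed = create_table_sql("sales_data.north.Q1_2023", "order_id BIGINT")

# External table: the path below is a made-up example.
external = create_table_sql(
    "sales_data.north.Q1_2023_ext",
    "order_id BIGINT",
    location="s3://example-bucket/sales/north/q1_2023",
)
```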

