How do you Connect Databricks with a Storage Account?

You can connect Databricks to your Azure Storage Account in several ways, each offering a different level of security and flexibility:

Recommended Methods:

  1. OAuth 2.0 with a Microsoft Entra ID (formerly Azure Active Directory) Service Principal: This is the most secure and recommended method. It involves creating a service principal in Microsoft Entra ID and granting it access to your storage account. Databricks then uses this service principal to authenticate and access the data.
  2. Shared Access Signature (SAS) Token: SAS tokens provide granular, time-limited access to specific resources within your storage account. You can generate a SAS token with particular permissions and share it with Databricks for restricted access (a configuration sketch follows this list).
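
As a minimal sketch, the SAS method can be configured per-notebook with spark.conf.set using Hadoop's fixed-token provider; every <angle-bracket> value below is a placeholder you would substitute:

    # Authenticate to ABFS with a fixed SAS token (all <values> are placeholders).
    spark.conf.set("fs.azure.account.auth.type.<storage-account-name>.dfs.core.windows.net", "SAS")
    spark.conf.set("fs.azure.sas.token.provider.type.<storage-account-name>.dfs.core.windows.net",
                   "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider")
    # Prefer dbutils.secrets.get(...) over pasting the token in plain text.
    spark.conf.set("fs.azure.sas.fixed.token.<storage-account-name>.dfs.core.windows.net", "<sas-token>")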

Less Secure Method (Not Recommended):

  1. Account Access Keys: While you can use your storage account access keys for the connection, this is the least secure option. Account keys grant full access to your storage account, so using them directly in Databricks could expose your data to unnecessary risk. A sketch of this approach follows.
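
For completeness, here is a minimal sketch of the account-key method in a notebook (placeholders throughout); reserve it for quick tests and keep the key in a secret scope rather than in code:

    # Authenticate to ABFS with the storage account access key (least secure option).
    spark.conf.set("fs.azure.account.key.<storage-account-name>.dfs.core.windows.net", "<account-key>")
    df = spark.read.format("csv").load("abfss://<container>@<storage-account-name>.dfs.core.windows.net/<path>")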

Here’s a general outline of the steps involved in connecting Databricks to your storage account using a service principal:

  1. Create a Service Principal:
    • In Azure Active Directory, create a service principal and assign it a role with the necessary permissions to access your storage account (e.g., Storage Blob Data Contributor).
  2. Configure Databricks:
    • In Databricks, go to the “Compute” section and create a cluster.
    • Under “Advanced Options,” go to the “Spark Config” tab in the cluster configuration.
    • Add the following Spark configuration properties, replacing the placeholders with your actual values:
    • fs.azure.account.auth.type.<storage-account-name>.dfs.core.windows.net OAuth
    • fs.azure.account.oauth.provider.type.<storage-account-name>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
    • fs.azure.account.oauth2.client.id.<storage-account-name>.dfs.core.windows.net <application-id>
    • fs.azure.account.oauth2.client.secret.<storage-account-name>.dfs.core.windows.net <client-secret>
    • fs.azure.account.oauth2.client.endpoint.<storage-account-name>.dfs.core.windows.net https://login.microsoftonline.com/<directory-id>/oauth2/token
  3. Access Your Data:
    • You can now use standard Spark APIs (e.g., spark.read.format("csv").load(...)) to read and write data from/to your storage account in Databricks notebooks, as shown in the sketch below.
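
Putting the steps together, here is a minimal notebook-level sketch that sets the same OAuth properties at runtime with spark.conf.set (equivalent to the cluster Spark config above) and then reads a CSV file; every <angle-bracket> value is a placeholder, and the client secret should normally come from a Databricks secret scope:

    # Service-principal (OAuth 2.0) configuration for ABFS, set per-session in a notebook.
    account = "<storage-account-name>"
    spark.conf.set(f"fs.azure.account.auth.type.{account}.dfs.core.windows.net", "OAuth")
    spark.conf.set(f"fs.azure.account.oauth.provider.type.{account}.dfs.core.windows.net",
                   "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set(f"fs.azure.account.oauth2.client.id.{account}.dfs.core.windows.net", "<application-id>")
    # Prefer dbutils.secrets.get("<scope>", "<key>") over a hard-coded secret.
    spark.conf.set(f"fs.azure.account.oauth2.client.secret.{account}.dfs.core.windows.net", "<client-secret>")
    spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{account}.dfs.core.windows.net",
                   "https://login.microsoftonline.com/<directory-id>/oauth2/token")

    # Read a CSV file from the container once authentication is configured.
    df = spark.read.format("csv").option("header", "true").load(
        f"abfss://<container>@{account}.dfs.core.windows.net/<path>")
    df.show()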

Databricks Training Demo Day 1 Video:

 
You can find more information about Databricks Training in this Databricks Docs Link

 

Conclusion:

Unogeeks is the No.1 IT Training Institute for Databricks Training. Anyone disagree? Please drop a comment.

You can check out our other latest blogs on Databricks Training here – Databricks Blogs

Please check out our Best In Class Databricks Training Details here – Databricks Training

 Follow & Connect with us:

———————————-

For Training inquiries:

Call/Whatsapp: +91 73960 33555

Mail us at: info@unogeeks.com

Our Website ➜ https://unogeeks.com

Follow us:

Instagram: https://www.instagram.com/unogeeks

Facebook: https://www.facebook.com/UnogeeksSoftwareTrainingInstitute

Twitter: https://twitter.com/unogeeks

