Module 1: Introduction to Databricks
- What is Azure Databricks?
- Why Azure Databricks?
- How does Azure Databricks work?
- Databricks utilities
- How to integrate Azure Databricks with Azure Blob Storage?
Module 2: Spark Architecture Basics
- Overview of Spark Architecture
- The architecture of an Azure Databricks Spark cluster
- The architecture of a Spark job
Module 3: Spark RDDs
- What is an RDD?
- How to create an RDD?
- Datasets in Spark
Module 4: Read and Write Data in Azure Databricks
- Read Data in CSV and JSON Format
- Read Data in the Parquet Format
- Read Data stored in views and tables
- Write the data
Module 5: DataFrames in Azure Databricks
- DataFrames and SparkSQL
- How to create DataFrames?
- SparkSQL data types
- Data sources
- DataFrame Reader & Writer
- Schemas
- Performance
- DataFrame columns and expressions
- DataFrame actions
- DataFrame rows
Module 6: DataFrame Columns in Azure Databricks
- Column Class
- Working with Column expressions
*****DataFrames Advanced Methods in Azure Databricks*****
Module 7: Aggregation
- GroupBy
- Grouped data methods
- Aggregate functions
- Math functions
Module 8: DateTime
- Dates and timestamps
- DateTime patterns
- DateTime functions
Module 9: Complex types
- String functions
- Collection functions
Module 10: Additional functions
- Non-aggregate functions
- Na Functions
Module 11: User-defined functions
- Add to Cache
- Load from Cache
Module 12: Platform Architecture and Data Protection in Azure Databricks
- Azure Databricks platform architecture
- Perform data protection
- Secret scopes with Azure Key Vault and Databricks
- Secure access with Azure authentication and IAM
- Explain Security
Module 13: Building and Querying a Data Lake
- Open-Source Delta Lake
- How Azure Databricks manages Delta Lake
Module 14: Process Streaming Data with Azure Databricks Structured Streaming
- Azure Databricks Structured Streaming
- Performing stream processing with Structured Streaming
- Working with time windows
- Processing data from Event Hubs with Structured Streaming
Module 15: Delta Lake Architecture
- Bronze, Silver, and Gold architecture
- Performing batch and stream processing
Module 16: Creating Production Workloads on Azure Databricks with Azure Data Factory
- Scheduling Databricks jobs in a Data Factory pipeline
- Passing parameters into and out of Databricks jobs in Data Factory
Module 17: Implementing CI/CD with Azure DevOps
- What is CI/CD?
- Creating a CI/CD process with Azure DevOps
Module 18: Integrate Azure Databricks with Azure Synapse
- Executing a process (different ways of executing a process)
- Rerunning documents in process reporting
- Viewing process execution documents
- Viewing process and document logs
- Setting predefined tracking fields
- Creating custom/user-defined tracking fields
*****PROJECT: Implement Azure Databricks for a Live Project*****
Introduction to Project Use Case
- Implement Azure Databricks for a live project
Module 19: Project Work: Build Azure Databricks Components
- Understand the project requirements and come up with a design
- Configure Azure Databricks components as per the requirements
- Test the setups
Module 20: Azure Databricks Certification Guidance
- Explain the various Azure Databricks certification options
- Discuss important Azure Databricks certification exam questions
- Prepare for Azure Databricks certification
Module 21: Resume Preparation, Interview and Job Assistance
- Prepare a crisp resume as an Azure Databricks specialist
- Discuss common interview questions on Azure Databricks
- Provide Job Assistance