Explode in Databricks
Explode in Databricks
Here’s a breakdown of the explode function in Databricks, including its purpose, syntax, examples, and important considerations:
Purpose
The explode function in Databricks transforms array or map-type columns into multiple rows.
- Arrays: If you have an array column, explode takes each array element and creates a new row with that element.
- Maps: If you have a map column, explode creates a new row for each key-value pair in the map, with separate columns for the key and the value.
Syntax
The basic syntax is:
SQL
SELECT explode(column_name) AS alias_name
FROM table_name
- column_name: The name of the array or map column you want to explore.
- alias_name: An optional alias for the new column created from the exploded array elements or map keys.
Examples
1. Exploding an Array
Suppose you have a table of customers with a column of interests containing arrays of strings:
customers
+—-+——————–+
| id | interests |
+—-+——————–+
| 1 | [“music”, “sports”]|
| 2 | [“reading,” “travel”]|
+—-+——————–+
You can use Explode to create a new row for each interest:
SQL
SELECT id, explode(interests) AS interest
FROM customers
Result:
+—-+———+
| id | interest |
+—-+———+
| 1 | music |
| 1 | sports |
| 2 | reading |
| 2 | travel |
+—-+———+
2. Exploding a Map
Let’s say you have a column preferences containing maps:
preferences
+—-+———————————-+
| id | preferences |
+—-+———————————-+
| 1 | {“color”: “blue,” “city”: “NY”} |
| 2 | {“food”: “pizza“, “season”: “fall”}|
+—-+———————————-+
You can explode the map into key-value pairs:
SQL
SELECT id, explode(preferences) AS (preference_key, preference_value)
FROM preferences
Result:
+—-+—————+——————+
| id | preference_key| preference_value |
+—-+—————+——————+
| 1 | color | blue |
| 1 | city | NY |
| 2 | food | pizza |
| 2 | season | fall |
+—-+—————+——————+
Important Considerations
- Databricks SQL and Older Databricks Runtime: In these environments, you can only use explode in the SELECT list (either as the root of an expression or following a LATERAL VIEW).
- Databricks SQL and Databricks Runtime 12.2 LTS and above: Using explode directly in the SELECT list or LATERAL VIEW clause is deprecated. You should instead invoke explode as a table reference:
SQL
SELECT …
FROM table_name
LATERAL VIEW explode(array_column) table_alias AS column_alias
Databricks Training Demo Day 1 Video:
Conclusion:
Unogeeks is the No.1 IT Training Institute for Databricks Training. Anyone Disagree? Please drop in a comment
You can check out our other latest blogs on Databricks Training here – Databricks Blogs
Please check out our Best In Class Databricks Training Details here – Databricks Training
Follow & Connect with us:
———————————-
For Training inquiries:
Call/Whatsapp: +91 73960 33555
Mail us at: info@unogeeks.com
Our Website ➜ https://unogeeks.com
Follow us:
Instagram: https://www.instagram.com/unogeeks
Facebook:https://www.facebook.com/UnogeeksSoftwareTrainingInstitute
Twitter: https://twitter.com/unogeeks