Replace in Databricks



In Databricks, you can replace values in strings and DataFrame columns with either SQL functions or the DataFrame API:

1. SQL Functions:

  • replace(str, search, replace): This function replaces all occurrences of a specific substring (search) within a string (str) with another substring (replace).
    SQL
    SELECT replace('Hello world', 'world', 'Databricks'); -- Output: 'Hello Databricks'
  • regexp_replace(str, regexp, rep): This function replaces parts of a string (str) that match a regular expression (regexp) with another string (rep).
    SQL
    SELECT regexp_replace('Hello 123 world', '[0-9]+', '456'); -- Output: 'Hello 456 world'
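The difference between literal and regular-expression replacement matters most when the search text contains regex metacharacters. The following is a minimal sketch in plain Python, not Spark code: str.replace stands in for replace() and re.sub stands in for regexp_replace(). Spark evaluates patterns with Java regular expressions, so treat Python's re module here as an assumption that holds for simple patterns like these. The sample text is made up for illustration.

```python
import re

# Hypothetical sample text, chosen so that "." matters.
text = "Price: 1.50 (was 1x50)"

# Literal replacement, analogous to SQL replace(): "." is just a dot.
literal = text.replace("1.50", "2.00")

# Regex replacement, analogous to regexp_replace():
# an unescaped "." matches any character, so "1x50" is replaced too.
unescaped = re.sub(r"1.50", "2.00", text)

# Escaping the dot restricts the pattern to the literal text "1.50".
escaped = re.sub(r"1\.50", "2.00", text)

print(literal)    # Price: 2.00 (was 1x50)
print(unescaped)  # Price: 2.00 (was 2.00)
print(escaped)    # Price: 2.00 (was 1x50)
```

When the search text is a fixed string, replace() avoids the need to escape anything; reach for regexp_replace() only when you actually need a pattern.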

2. DataFrame API:

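  • withColumn() and replace(): Use withColumn() to create a column or overwrite an existing one. Combined with the replace function, it replaces substrings within a specific column. In PySpark, replace is available in pyspark.sql.functions from Spark 3.5 onward, and its string arguments are interpreted as column names, so wrap literal values in lit().
    Python
    from pyspark.sql.functions import col, lit, replace
    df = df.withColumn("new_column", replace(col("old_column"), lit("old_value"), lit("new_value")))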

Example: Replace Values in a DataFrame Column


from pyspark.sql.functions import col, lit, regexp_replace, replace

df = spark.createDataFrame([("This is a test string.",), ("Another string 123.",)], ["text"])

# Replace specific substrings (replace() takes columns, so literals are wrapped in lit())
df = df.withColumn("replaced_text", replace(col("text"), lit("string"), lit("replaced_string")))

# Replace values using regular expressions (applied to the original "text" column)
df = df.withColumn("replaced_text_regex", regexp_replace(col("text"), "[0-9]+", "456"))

df.show(truncate=False)




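+----------------------+-------------------------------+----------------------+
|text                  |replaced_text                  |replaced_text_regex   |
+----------------------+-------------------------------+----------------------+
|This is a test string.|This is a test replaced_string.|This is a test string.|
|Another string 123.   |Another replaced_string 123.   |Another string 456.   |
+----------------------+-------------------------------+----------------------+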


Replacing Data in Delta Tables

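To fully replace the data in a Delta table that concurrent operations may be using, use the following pattern. Unlike dropping and recreating the table, the replacement is committed as a single transaction, so concurrent readers keep seeing the previous version until it completes:

CREATE OR REPLACE TABLE table_name AS SELECT * FROM parquet.`/path/to/files`;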
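If you run this pattern from a notebook for more than one table, it can help to wrap it in a small function. The sketch below is one possible way to do that, not an official API: the helper name replace_table_from_parquet is hypothetical, and spark is assumed to be the SparkSession that Databricks notebooks provide (any object with a sql(str) method works here).

```python
def replace_table_from_parquet(spark, table_name, source_path):
    """Build and run a CREATE OR REPLACE TABLE ... AS SELECT statement.

    Hypothetical helper for illustration. `spark` is assumed to be a
    SparkSession, but only its .sql(str) method is used.
    """
    # Backticks delimit the path inside the SQL text, so reject them in inputs.
    if "`" in table_name or "`" in source_path:
        raise ValueError("backticks are not allowed in table_name or source_path")

    statement = (
        f"CREATE OR REPLACE TABLE {table_name} "
        f"AS SELECT * FROM parquet.`{source_path}`"
    )
    spark.sql(statement)  # runs the replacement as one statement
    return statement
```

In a notebook, the call would look like replace_table_from_parquet(spark, "table_name", "/path/to/files"). Returning the statement makes it easy to log exactly what was executed.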
