PHPFixing

Sunday, August 21, 2022

[FIXED] How to get the username inside a spark-submit task in Databricks?

Tags: apache-spark, databricks, environment-variables, scala

Issue

I'm trying to retrieve the user name inside a spark-submit task in Databricks, so that I can record in the table which user changed the data. Unfortunately, I haven't found a way that works. So far I have tried two things:

spark.sparkContext.sparkUser

and

System.getProperty("user.name")

but they both return root. Do you have any idea how to accomplish that?


Solution

If you're using Delta Lake tables, then information about the operations performed, including the user who performed them, is already captured in the table history (for example via DESCRIBE HISTORY); see the example in the Delta Lake documentation.

Databricks exposes a lot of information via spark.conf. The relevant configuration properties start with spark.databricks.clusterUsageTags., so you can filter the configuration on that prefix and look for the information you need.
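
As a minimal sketch of that filtering step: the helper below takes the configuration as a plain Map[String, String] and keeps only the entries under the prefix mentioned above. The object and method names (ClusterUsageTags, usageTags) are hypothetical and chosen for illustration; which keys actually appear depends on the cluster.

```scala
object ClusterUsageTags {
  // Prefix named in the answer above.
  val Prefix = "spark.databricks.clusterUsageTags."

  /** Keeps only the entries whose key starts with Prefix and strips the prefix from the key. */
  def usageTags(conf: Map[String, String]): Map[String, String] =
    conf.collect {
      case (key, value) if key.startsWith(Prefix) => key.stripPrefix(Prefix) -> value
    }
}
```

Inside the job, this could be called as ClusterUsageTags.usageTags(spark.conf.getAll), since spark.conf.getAll returns the session configuration as a Map[String, String]. Printing the result is a quick way to see which tags the cluster exposes.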

Keep in mind, however, that all operations in a job are performed under the identity of the job owner, even if the job is triggered by someone else.

The spark.databricks.clusterUsageTags.clusterAllTags configuration property holds a JSON string with the list of cluster tags. That list includes an Owner field containing the email of the user who owns the Databricks job.
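
A rough sketch of extracting the owner from that property follows. It assumes the JSON string is an array of objects of the form {"key": "...", "value": "..."}; that layout is an assumption, so check it against the actual value on your cluster. To stay dependency-free the sketch uses a regular expression, which does not handle escaped quotes inside values. A real JSON parser would be more robust. The names ClusterTags, parse and owner are hypothetical.

```scala
object ClusterTags {
  // Matches one {"key":"...","value":"..."} object (assumed layout, no escaped quotes).
  private val Entry =
    """\{\s*"key"\s*:\s*"([^"]*)"\s*,\s*"value"\s*:\s*"([^"]*)"\s*\}""".r

  /** Turns the JSON tag list into a key -> value map. */
  def parse(json: String): Map[String, String] =
    Entry.findAllMatchIn(json).map(m => m.group(1) -> m.group(2)).toMap

  /** Returns the Owner tag if present. */
  def owner(json: String): Option[String] = parse(json).get("Owner")
}
```

In the job, the input could come from spark.conf.getOption("spark.databricks.clusterUsageTags.clusterAllTags").flatMap(ClusterTags.owner), which yields None when either the property or the Owner tag is missing.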



Answered By - Alex Ott
Answer Checked By - Marilyn (PHPFixing Volunteer)