PHPFixing
Showing posts with label azure-data-factory-2. Show all posts

Monday, September 26, 2022

[FIXED] How to update ADF Pipeline level parameters during CICD

 September 26, 2022     azure, azure-data-factory-2, continuous-deployment, continuous-integration     No comments   

Issue

Being new to ADF CI/CD, I am currently exploring how we can update the pipeline-scoped parameters when we deploy a pipeline from one environment to another. Here is the detailed scenario:
I have a simple ADF pipeline with a copy activity moving files from one blob container to another.
Example - Below there is a copy activity, and the pipeline has two parameters named:
1- SourceBlobContainer
2- SinkBlobContainer
with their default values.

[screenshot of the copy activity and the pipeline parameters]

Here is how the dataset is configured to consume these pipeline-scoped parameters.

[screenshot of the dataset parameter configuration]

Since this is the development environment, the default values are fine. But the test environment will have containers with altogether different names (like "TestSourceBlob" & "TestSinkBlob").
That said, when CI/CD happens it should handle this by updating the default values of these parameters as part of the deployment process.

Reading the documentation, I could not find anything that handles such a use case.
Here are some links which I referred to:

  • http://datanrg.blogspot.com/2019/02/continuous-integration-and-delivery.html
  • https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment

    Thoughts on how to handle this would be much appreciated. :-)

Solution

There is another approach, as an alternative to the ARM templates located in the 'adf_publish' branch. Many companies leverage that workaround and it works great.
I have spent several days building a brand new PowerShell module to publish the whole Azure Data Factory code from your master branch or directly from your local machine. The module resolves the pain points that have existed so far in other solutions, including:

  • replacing any property in a JSON file (ADF object),
  • deploying objects in the appropriate order,
  • deploying only a subset of objects,
  • deleting objects that no longer exist in the source,
  • stopping/starting triggers, etc.

The module is publicly available in the PowerShell Gallery: azure.datafactory.tools
Source code and full documentation are on GitHub.
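
For illustration, a minimal publish step could look roughly like the sketch below. It assumes the module's Publish-AdfV2FromJson cmdlet and a per-stage configuration file that overrides properties such as a pipeline parameter's default value; the folder and resource names are made up.

    # Rough sketch only - cmdlet and parameter names follow the azure.datafactory.tools
    # documentation; all resource names here are hypothetical.
    Install-Module -Name azure.datafactory.tools -Scope CurrentUser

    $publishParams = @{
        RootFolder        = 'C:\repo\adf'          # folder with the ADF JSON files from the collaboration branch
        ResourceGroupName = 'rg-adf-test'          # hypothetical target resource group
        DataFactoryName   = 'adf-demo-test'        # hypothetical target factory
        Location          = 'North Europe'
        Stage             = 'test'                 # selects a stage config file with property replacements,
                                                   # e.g. parameters.SourceBlobContainer.defaultValue -> TestSourceBlob
    }
    Publish-AdfV2FromJson @publishParams
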
Let me know if you have any questions or concerns.



Answered By - Kamil Nowinski
Answer Checked By - Marie Seifert (PHPFixing Admin)

Monday, August 1, 2022

[FIXED] How to get static IP for Azure Data Factory Pipeline?

 August 01, 2022     azure, azure-data-factory-2, ftp     No comments   

Issue

I have a workflow/pipeline in Azure which connects to a third-party FTP server (via a linked service) and gets files on a regular basis.

It was all working fine until the third party introduced IP whitelisting, and now they are asking me to provide static IPs or a range. Unless whitelisted, I will not be able to get my pipeline working.

Now my question is: how do I provide my IP address?

I know which region my ADF runs in (North Europe) and I know my linked service uses the AutoResolve integration runtime.

Would the solution be to go with a self-hosted IR? If yes, then how will I know the IP of my IR?


Solution

Azure Data Factory seems to support static IP address ranges now. Announcement: https://techcommunity.microsoft.com/t5/azure-data-factory/azure-data-factory-now-supports-static-ip-address-ranges/ba-p/1117508

Here is the list of IPs for North Europe as per https://docs.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses

[screenshot of the Azure integration runtime IP address ranges for North Europe]
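
If you prefer to pull the current ranges yourself rather than copy them from the docs page, something like the sketch below may work. It assumes the Az.Network PowerShell module and that the regional service tag is named 'DataFactory.NorthEurope'.

    # Hedged sketch: list the published address ranges for the Data Factory service tag
    # in North Europe (tag name assumed to be 'DataFactory.NorthEurope').
    Connect-AzAccount

    $serviceTags = Get-AzNetworkServiceTag -Location northeurope
    $adfTag = $serviceTags.Values | Where-Object { $_.Name -eq 'DataFactory.NorthEurope' }
    $adfTag.Properties.AddressPrefixes   # the ranges to ask the FTP provider to whitelist
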



Answered By - thebernardlim
Answer Checked By - David Marino (PHPFixing Volunteer)

Sunday, July 24, 2022

[FIXED] How to convert Excel to JSON in Azure Data Factory?

 July 24, 2022     azure-data-factory-2, json     No comments   

Issue

I want to convert this Excel file, which contains two tables in a single worksheet:

[screenshot of the two tables in the worksheet]

into this JSON format:

{
    "parent":
    {
        "P1":"x1",
        "P2":"y1",
        "P3":"z1"
    },
    "children": [
        {"C1":"a1", "C2":"b1", "C3":"c1", "C4":"d1"},
        {"C1":"a2", "C2":"b2", "C3":"c2", "C4":"d2"},
        ...
    ]
}

And then post the JSON to a REST endpoint.

How do I perform the mapping and post to the REST service?

Also, it appears that I need to sink the JSON to a physical JSON file before I can post it as a payload to the REST service - is this physical sink step necessary, or can the JSON be held in memory?

I cannot use the Lookup activity to read in the Excel file because it is limited to 5,000 rows and 4 MB.


Solution

I managed to do it in ADF. The solution is a bit long, but you could also use Azure Functions to do it programmatically (see the sketch at the end of this answer).

Here is a quick demo that I built:

The main idea is to split the data, add headers as requested, and then re-join the data and add the relevant keys like parent and children.

ADF:

  1. added a Conditional Split to split the data (see attached pictures).
  2. added a surrogate key for each table.
  3. filtered the first row to get rid of the headers in the CSV.
  4. mapped the children's/parents' columns: renamed columns using a Derived Column activity.
  5. added a constant value in the children data flow so I can aggregate by it and convert the CSV into a complex data type.
  6. childrenArray: in a Derived Column, added a subcolumn to a new column named Children, and in the values I added the relevant columns.
  7. aggregated the children JSONs by grouping on the constant value.
  8. in the parents data flow: after mapping the columns, I created JSONs using a Derived Column (please see attached pictures).
  9. joined the children array and the parents JSONs into one table so it can be converted to the requested JSON.
  10. wrote to a cached sink (here you can do the POST request instead of writing to a sink).

DataFlow: [screenshot of the overall data flow]

Activities:

Conditional Split: [screenshot]

AddSurrogateKey:

(it's the same for the parents data flow, just change the name of the incoming stream as shown in the data flow above) [screenshot]

FilterFirstRow: [screenshot]

MapChildrenColumns: [screenshot]

MapParentColumns: [screenshot]

AddConstantValue: [screenshot]

ParentsJson: Here I added a subcolumn in the Expression Builder and set the column name as the value; this builds the parents JSON. [screenshots]

ChildrenArray: Again in a Derived Column, I added a column named "children", and in the Expression Builder I added the relevant columns. [screenshot]

Aggregate:

The purpose of this activity is to aggregate the children JSONs and build the array; without it you will not get an array. The aggregation function is collect(). [screenshots]

Join Activity: Here I added an outer join to join the parents JSON and the children array. [screenshot]

Select Relevant Columns: [screenshot]

Output: [screenshot]
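
For completeness, the programmatic route mentioned at the top (an Azure Function or a standalone script instead of a data flow) could look roughly like the sketch below. It assumes the ImportExcel PowerShell module, that the parent table sits in rows 1-2 and the children table starts at row 4 of the single worksheet, and a made-up REST endpoint; adjust the row numbers to your layout.

    # Hedged sketch: read both tables from one worksheet, shape the payload, and POST it
    # straight from memory - no intermediate JSON file is needed on this route.
    Import-Module ImportExcel

    $excelPath = 'C:\data\input.xlsx'                 # hypothetical source file
    $restUrl   = 'https://example.com/api/ingest'     # hypothetical REST endpoint

    # Assumed layout: parent header + data row in rows 1-2, children header from row 4 down.
    $parentRow = Import-Excel -Path $excelPath -StartRow 1 -EndRow 2 | Select-Object -First 1
    $childRows = Import-Excel -Path $excelPath -StartRow 4

    # Shape the payload as { parent: {...}, children: [...] } and post it.
    $payload = [ordered]@{
        parent   = $parentRow
        children = @($childRows)
    } | ConvertTo-Json -Depth 5

    Invoke-RestMethod -Uri $restUrl -Method Post -Body $payload -ContentType 'application/json'
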



Answered By - Sally Dabbah
Answer Checked By - Dawn Plyler (PHPFixing Volunteer)
Copyright © PHPFixing