Friday, May 13, 2022

[FIXED] How can I append XML parent tags to child tags?

Issue

I have a sample XML file:

<Root>
 <Parent_1>
  <Child_1> Text 1 </Child_1>
  <Child_2> Text 2 </Child_2>
 </Parent_1>
 <Parent_2>
  <Child_3> Text 3 </Child_3>
  <Child_4> Text 4 </Child_4>
 </Parent_2>

I want to append the parent tags down to hierarchy.

<Root>
 <Root_Parent_1>
  <Root_Parent_1_Child_1> Text 1 </Root_Parent_1_Child_1>
  <Root_Parent_1_Child_2> Text 2 </Root_Parent_1_Child_2>
 </Root_Parent_1>
 <Root_Parent_2>
  <Root_Parent_2_Child_3> Text 3 </Root_Parent_2_Child_3>
  <Root_Parent_2_Child_4> Text 4 </Root_Parent_2_Child_4>
 </Root_Parent_2>
</Root>

Can someone help me try to read the file and append the tags?


Solution

Basically you just want to write a traversal of the XML tree, creating new elements in the target tree as you go along, and accumulating the prefix that you want to add to the tags.

The handle_node function below processes one node in the traversal, and returns the target element that it created (assuming your file is toto.xml):

import lxml.etree as et

def handle_node(nd, prefix):
    new_prefix = f'{prefix}_{nd.tag}' if prefix != '' else nd.tag
    elem = et.Element(new_prefix)
    if len(nd) == 0:
        elem.text = nd.text
        return elem
    for k in nd:
        elem.append(handle_node(k, new_prefix))
    return elem
        
root = et.parse('toto.xml').getroot()
target = handle_node(root, '')

You can print out the target XML tree to check that the result is what you're expecting:

print(et.tostring(target, pretty_print=True).decode())


Answered By - joao
Answer Checked By - Cary Denson (PHPFixing Admin)

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.