python - How can I strip namespaces out of an lxml tree?
Following on from "Removing child elements in XML using Python" ...
Thanks to @tichodroma, I have this code:
If you can use lxml, try this:
import lxml.etree

tree = lxml.etree.parse("leg.xml")
# Remove every leg1:dog element from its parent
for dog in tree.xpath("//leg1:dog", namespaces={"leg1": "http://what.not"}):
    parent = dog.xpath("..")[0]
    parent.remove(dog)
    # Clear the parent's leading text
    parent.text = None
tree.write("leg.out.xml")
Now leg.out.xml looks like this:
<?xml version="1.0"?>
<leg1:mor xmlns:leg1="http://what.not" ocount="7">
  <leg1:order>
    <leg1:ctemp id="fo">
      <leg1:group bnum="001" ccount="4"/>
      <leg1:group bnum="002" ccount="4"/>
    </leg1:ctemp>
    <leg1:ctemp id="go">
      <leg1:group bnum="001" ccount="4"/>
      <leg1:group bnum="002" ccount="4"/>
    </leg1:ctemp>
  </leg1:order>
</leg1:mor>
How do I modify the code to remove the leg1: namespace prefix from all of the elements' tag names?
One possible way to remove the namespace prefix from each element:
from lxml import etree

def strip_ns_prefix(tree):
    # Iterate through element nodes only (skips comment nodes, text nodes, etc.)
    for element in tree.xpath('descendant-or-self::*'):
        # If the element has a prefix...
        if element.prefix:
            # ...replace the element name with its local name
            element.tag = etree.QName(element).localname
    return tree
Another version does the namespace check in the XPath expression instead of using an if statement:
from lxml import etree

def strip_ns_prefix(tree):
    # XPath query selecting only element nodes that are in a namespace
    query = "descendant-or-self::*[namespace-uri()!='']"
    # For each element returned by the XPath query above...
    for element in tree.xpath(query):
        # ...replace the element name with its local name
        element.tag = etree.QName(element).localname
    return tree
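For reference, here is a minimal end-to-end sketch of the XPath-based version applied to an in-memory document. The sample string is an abbreviated stand-in for leg.out.xml, and the call to etree.cleanup_namespaces is an addition not in the answer above: it assumes you also want to drop the xmlns:leg1 declaration, which is no longer used once the tags have been renamed.

```python
from lxml import etree


def strip_ns_prefix(root):
    # Select every element that is in some namespace.
    for element in root.xpath("descendant-or-self::*[namespace-uri()!='']"):
        # Assigning a plain local name takes the element out of its namespace.
        element.tag = etree.QName(element).localname
    return root


# Sample input (assumption): an abbreviated copy of leg.out.xml.
xml = (
    b'<leg1:mor xmlns:leg1="http://what.not" ocount="7">'
    b'<leg1:order><leg1:ctemp id="fo">'
    b'<leg1:group bnum="001" ccount="4"/>'
    b'</leg1:ctemp></leg1:order></leg1:mor>'
)

root = etree.fromstring(xml)
strip_ns_prefix(root)
# Remove namespace declarations that no element or attribute uses any more.
etree.cleanup_namespaces(root)
result = etree.tostring(root).decode()
print(result)
```

Without the cleanup step the element names are already unprefixed, but the serialized root element would likely still carry the xmlns:leg1 declaration. Note that this only renames elements; namespaced attributes, if any, are left untouched.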