I recently came across a scenario where I had to remove all the comments from an XML file before processing it with grep, sed, awk and other shell utilities.
Sed proved to be a handy tool for removing both single-line and multiline comments from XML files.
Sample XML file (assuming the filename is sample.xml):
<?xml version="1.0" encoding="ISO-8859-1"?>
<!--
If the message tag does not contain a definition of a property,
the default value will be used.
-->
<message>
<value>reference</value>
</message>
<!-- some comment -->
<!-- another comment -->
<!--
This is another multiline comment.
line
-->
The command below removes all the comments from the sample.xml file:
$ cat sample.xml | sed '/<!--.*-->/d' | sed '/<!--/,/-->/d'
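The same two deletions can also be run in a single sed process, which makes it easier to see what each expression does. This is only a minimal sketch: the wrapper name strip_comment_lines is hypothetical, and the here-document is a shortened stand-in for sample.xml.

```shell
# strip_comment_lines is a hypothetical wrapper name, not part of the original command.
strip_comment_lines() {
  # 1st expression: delete any line that holds a complete <!-- ... --> comment.
  # 2nd expression: delete every line from an opening <!-- through the closing -->.
  sed -e '/<!--.*-->/d' -e '/<!--/,/-->/d' "$@"
}

# Shortened stand-in for sample.xml, fed on standard input.
strip_comment_lines <<'EOF'
<?xml version="1.0" encoding="ISO-8859-1"?>
<!--
multiline
comment
-->
<message>
<value>reference</value>
</message>
<!-- some comment -->
EOF
```

On the real file this would be called as strip_comment_lines sample.xml. Both expressions delete whole lines, so anything that shares a line with a comment is dropped as well.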
Result:
<?xml version="1.0" encoding="ISO-8859-1"?>
<message>
<value>reference</value>
</message>
Cheers,
make world open.
4 comments:
When I try this I get the error:
sed: -e expression #1, char 1: unknown command: `<'
cat sample| sed '//d'| sed '//d'
This one works. I think he forgot to add another / after the second sed.
This doesn't work on this sample
NOT COMMENT
It removes the "NOT COMMENT" even though it shouldn't. It is because you assume there is only one comment per line.
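One possible way around the one-comment-per-line assumption is to remove the comment spans themselves instead of deleting whole lines. The following is a sketch only, assuming GNU sed; the helper name strip_xml_comments and the input line are made up for illustration. It relies on the XML rule that a comment body may not contain "--", so a match can never run past the first closing "-->".

```shell
# strip_xml_comments is a hypothetical helper name; GNU sed is assumed.
strip_xml_comments() {
  # :a ... $!{ N; ba }  appends every remaining line, so the pattern space
  #                     ends up holding the whole input.
  # s/<!--(...)*-->//g  removes each comment span, on one line or across lines;
  #                     the body is any run of non-dashes or a dash followed
  #                     by a non-dash, so it cannot contain "--".
  sed -E -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' \
      -e 's/<!--([^-]|-[^-])*-->//g' "$@"
}

# Illustrative input: two comments on one line with text between them.
printf '%s\n' '<a><!-- one -->NOT COMMENT<!-- two --></a>' | strip_xml_comments
# prints: <a>NOT COMMENT</a>
```

This is still plain text matching, not XML parsing: comment-like text inside a CDATA section or a quoted string would be removed too, and lines that held only a comment are left behind as blank lines.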
This regex is not complete. It fails, for example, if the file includes a string whose content looks like a comment line.
""