bash - Parsing and manipulating a log file with embedded XML
I have a log file that has embedded XML amongst the normal stdout output, as follows:
2015-05-06 04:07:37.386 [info]process:102 - application submitted ==== 1 <application><firstname>test</firstname><studentssn>123456789</studentssn><address>123 test street</address><parentssn>123456780</parentssn><applicationid>2</applicationid></application>
2015-05-06 04:07:39.386 [info] process:103 - application completed ==== 1
2015-05-06 04:07:37.386 [info]process:104 - application submitted ==== 1 <application><firstname>test2</firstname><studentssn>323456789</studentssn><address>234 test street</address><parentssn>123456780</parentssn><applicationid>2</applicationid></application>
2015-05-06 04:07:39.386 [info] process:105 - application completed ==== 1
My objective is to parse the file and replace all occurrences of personal data with ***. Therefore, the desired output after running the script on the log above should be:
2015-05-06 04:07:37.386 [info]process:102 - application submitted ==== 1 <application><firstname>***</firstname><studentssn>***</studentssn><address>*******</address><parentssn>*********</parentssn> <applicationid>2</applicationid></application>
2015-05-06 04:07:39.386 [info] process:103 - application completed ==== 1
2015-05-06 04:07:37.386 [info]process:104 - application submitted ==== 1 <application><firstname>***</firstname><studentssn>*********</studentssn><address>*****</address><parentssn>*********</parentssn> <applicationid>2</applicationid></application>
2015-05-06 04:07:39.386 [info] process:105 - application completed ==== 1
Thanks in advance.
Create a file foo.sed with this content (one command per line):
s|<firstname>[^<]*</firstname>|<firstname>***</firstname>|
s|<studentssn>[^<]*</studentssn>|<studentssn>***</studentssn>|
s|<address>[^<]*</address>|<address>***</address>|
s|<parentssn>[^<]*</parentssn>|<parentssn>***</parentssn>|
and try it with GNU sed:
sed -f foo.sed log_file > new_file
or edit the file "in place":
sed -i -f foo.sed log_file
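If you'd rather not keep a separate script file, the four rules can probably be folded into one command. This is only a sketch, assuming GNU sed (it relies on -E and on a backreference inside the pattern) and assuming the tag names are exactly the four from the question; mask_pii is a made-up function name, not anything standard. The g flag is added so that every matching element on a line is masked, not just the first one.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads log text on stdin, writes the masked text to stdout.
mask_pii() {
  # -E enables extended regexes; \1 refers back to whichever tag name matched.
  # '#' is the delimiter, so neither '/' nor the alternation '|' needs escaping.
  # g replaces every matching element on the line.
  sed -E 's#<(firstname|studentssn|address|parentssn)>[^<]*</\1>#<\1>***</\1>#g'
}

# Sample line copied from the log in the question.
sample='2015-05-06 04:07:37.386 [info]process:102 - application submitted ==== 1 <application><firstname>test</firstname><studentssn>123456789</studentssn><address>123 test street</address><parentssn>123456780</parentssn><applicationid>2</applicationid></application>'

masked=$(printf '%s\n' "$sample" | mask_pii)
printf '%s\n' "$masked"
```

To run it over the whole file, something like mask_pii < log_file > new_file should do. If you go the in-place route instead, GNU sed also accepts a backup suffix, e.g. sed -i.bak -f foo.sed log_file, which keeps the original as log_file.bak in case a pattern masks more than intended.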