regex - How do I get Perl to print n lines following a specific string?
I have a large file and want to pull out the atom symbols and coordinates of the equilibrium geometry. The desired information is displayed below:
 ***** equilibrium geometry located *****
 coordinates of atoms (angs)
 atom   charge       x              y              z
 -----------------------------------------------------------
 c           6.0   0.8438492825  -2.0554543742   0.8601734285
 c           6.0   1.7887997955  -1.2651150894   0.4121141006
 n           7.0   1.3006136046   0.0934593194   0.2602148346
Note: after the coordinates finish there is a blank line.
So far I have patched together some code that makes sense to me, but it produces errors and I am not sure why. It expects one file argument when the script is called, reads the file line by line, and sets $start to 1 when it sees the string containing "equilibrium geometry", which triggers recording of the symbols and coordinates. It then keeps saving lines that match the coordinate format into $geom until it sees a blank line, which ends the recording.
#!/usr/bin/perl
$num_args = $#ARGV + 1;
if ($num_args != 1) {
    print "\nmust supply gamess .log file.\n";
    exit;
}
$file = $ARGV[0];
open FILE, "<", $file;
$start = 0;
$geom="";

while (<FILE>) {
    $line = $_;
    if ( $line eq "\n" && ($start == 1) ) {
        $start = 0;
    }
    if ( $start == 1 && $line =~ m/\s+[a-z]+\s+[0-9\.]+\s+[0-9\.\-]+\s+[0-9\.\-]+\s+[0-9\.\-]+/ ) {
        $line =~ s/^\s+//;
        @coordinates = split(/\s+/,$line);
        $geom=$coordinates[0],$coordinates[3],$coordinates[4],$coordinates[5];
    }
    if ( $line =~ m/\s+\*+ equilibrium geometry located\s\*+\s+) {
        $geom = "";
        $start = 1;
    }
}
print $geom;
Error message: Unrecognized character \xC2; marked by <-- HERE after <-- HERE near column 1 at ./perl-grep line 5.
There is an invisible character on line 13.
I created a file containing that line (by cut and paste) and added one line above it by retyping it:
$geom="";
$geom="";
They look the same but are not (the second line is the buggy one):
[tmp]=> cat x | perl -ne '$line = $_; $hex = unpack "H*"; print "$hex $line" '
2467656f6d3d22223b0a $geom="";
2467656f6d3d22223be280a80a $geom="";
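You can see there is one extra character when you hex-examine the file: the bytes e2 80 a8 after the semicolon, which are the UTF-8 encoding of U+2028 (LINE SEPARATOR). => Remove that single line and retype it.
By the way, there is another issue in the file: the closing '/' of the regexp is missing: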
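If the script is long, hunting for the bad line by eye is tedious. Here is a minimal sketch of a scanner that reports every line containing non-ASCII bytes, together with their hex values. The helper name `find_non_ascii` is made up for this example, and the demo string simply reproduces the two `$geom="";` lines shown above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return [line_number, hex_bytes] for every line
# of $text that contains bytes outside the 7-bit ASCII range.
sub find_non_ascii {
    my ($text) = @_;
    my @hits;
    my $n = 0;
    for my $line (split /\n/, $text, -1) {
        $n++;
        my @runs;
        # Collect each run of non-ASCII bytes and render it as hex.
        while ($line =~ /([^\x00-\x7F]+)/g) {
            push @runs, unpack("H*", $1);
        }
        push @hits, [ $n, join(" ", @runs) ] if @runs;
    }
    return @hits;
}

# Demo: a clean line, then the same line followed by bytes e2 80 a8.
my $sample = "\$geom=\"\";\n\$geom=\"\";\xE2\x80\xA8\n";
for my $hit (find_non_ascii($sample)) {
    printf "line %d: %s\n", $hit->[0], $hit->[1];
}
```

To check a real script, read it as raw bytes first (for example with `open my $fh, '<:raw', $path` and a slurp) and pass the contents to the helper, so that no decoding layer hides the stray bytes.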
if ( $line =~ m/\s+\*+ equilibrium geometry located\s\*+\s+) {
But I guess there is still work to do to finish the script, because I don't see its purpose yet ;)
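For what it's worth, here is one possible way the script could be finished. This is a sketch under assumptions, not a tested GAMESS parser. It assumes the output should be "symbol x y z" per line. It matches case-insensitively because the exact case of the log is not certain. It also stops at the first blank line after at least one coordinate line, so a blank line straight after the header does not end the block early. The sub name `extract_geometry` is made up for the example. Compared with the posted code, it closes the regexp, appends to `$geom` with `.=` instead of using the comma operator, and takes x, y, z from the third to fifth columns.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a log from a filehandle and return the atoms
# of the last equilibrium-geometry block as "symbol x y z" lines.
sub extract_geometry {
    my ($fh) = @_;
    my ($recording, $geom) = (0, "");
    while (my $line = <$fh>) {
        if ($line =~ /\*+\s*equilibrium geometry located\s*\*+/i) {
            # Header found: start a fresh block (a later block wins).
            ($recording, $geom) = (1, "");
            next;
        }
        next unless $recording;
        if ($line =~ /^\s*$/) {
            # A blank line ends the block once coordinates were seen.
            $recording = 0 if $geom ne "";
            next;
        }
        # Columns: symbol, nuclear charge, x, y, z.
        if ($line =~ /^\s*([a-z]+)\s+[0-9.]+\s+(-?[0-9.]+)\s+(-?[0-9.]+)\s+(-?[0-9.]+)\s*$/i) {
            $geom .= "$1 $2 $3 $4\n";
        }
    }
    return $geom;
}

# Sample taken from the question, used when no file is given.
my $sample = <<'END';
 ***** equilibrium geometry located *****
 coordinates of atoms (angs)
 atom   charge       x              y              z
 -----------------------------------------------------------
 c           6.0   0.8438492825  -2.0554543742   0.8601734285
 c           6.0   1.7887997955  -1.2651150894   0.4121141006
 n           7.0   1.3006136046   0.0934593194   0.2602148346

 trailing text that is ignored
END

my $fh;
if (@ARGV) {
    open $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
} else {
    open $fh, '<', \$sample or die "Cannot open in-memory sample: $!\n";
}
print extract_geometry($fh);
close $fh;
```

Run without arguments, it prints the three atoms from the sample; run with a log file name, it reads that file instead.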