C# - Regex lookahead discard a match
I'm trying to make a regex match that discards the lookahead completely.
\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
This is the match, tested on regex101.
But when an email starts with -, _ or ., it should not match at all, rather than match with the initial symbols removed. Any ideas are welcome. I've been searching for the past half hour and can't figure out how to drop the entire email when it starts with those symbols.
You can use a word boundary near @ and a positive lookbehind to check whether we are at the beginning of the string or right after whitespace, then check that the first symbol is not inside the unwanted class [^\s\-_.]:
(?<=^|\s)[^\s\-_.]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*
See the demo.
List of matches:
support@github.com s.miller@mit.edu j.hopking@york.ac.uk steve.parker@soft.de info@company-hotels.org kiki@hotmail.co.uk no-reply@github.com s.peterson@mail.uu.net info-bg@software-software.software.academy
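To try the lookbehind pattern from C#, a minimal sketch could look like the following. It assumes .NET 5 or later (top-level statements); the helper name FindEmails and the sample string are made up for illustration and are not part of the original question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Pattern from the answer above, unchanged.
string pattern =
    @"(?<=^|\s)[^\s\-_.]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*";

// Hypothetical helper: returns every address the pattern accepts in the input.
string[] FindEmails(string input) =>
    Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

// Made-up sample: three of the five addresses start with a forbidden symbol.
string sample =
    "support@github.com -bad@mail.com _x@y.org .dot@site.net s.miller@mit.edu";

foreach (string email in FindEmails(sample))
    Console.WriteLine(email);
```

With this sample, only support@github.com and s.miller@mit.edu are printed. The addresses starting with -, _ or . are skipped entirely, because the lookbehind prevents a match from starting in the middle of a token.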
Additional notes on usage and alternative notation
Note that it is best practice to use as few escaped chars as possible in a regex, so [^\s\-_.] can be written as [^\s_.-], with the hyphen at the end of the character class still denoting a literal hyphen, not a range. Also, if you plan to use the pattern in other regex engines, you might run into difficulties with alternation in the lookbehind, so you can replace (?<=\s|^) with the equivalent (?<!\S), a negative lookbehind that fails when the previous character is a non-whitespace character. See this regex:
(?<!\S)[^\s_.-]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*
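As a quick sanity check, the two notations can be compared side by side in C#. This is only a sketch under the same assumptions as before (.NET 5 or later, top-level statements); the variable names, the FindAll helper and the sample text are invented for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Shared remainder of the pattern, identical in both variants.
string tail = @"\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*";

// Variant 1: alternation inside a positive lookbehind, escaped hyphen.
string withAlternation = @"(?<=^|\s)[^\s\-_.]" + tail;
// Variant 2: negative lookbehind for a non-whitespace char, hyphen moved last.
string withNegation = @"(?<!\S)[^\s_.-]" + tail;

// Hypothetical helper: collects all match values for a given pattern.
string[] FindAll(string pattern, string input) =>
    Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

// Made-up sample; the second address starts with a hyphen and should be dropped.
string text = "no-reply@github.com\n-skip@me.org kiki@hotmail.co.uk";

string[] a = FindAll(withAlternation, text);
string[] b = FindAll(withNegation, text);

Console.WriteLine(string.Join(", ", b));
Console.WriteLine(a.SequenceEqual(b));
```

On this sample both variants return no-reply@github.com and kiki@hotmail.co.uk, so the second line prints True. One sample is of course not a proof of equivalence, only an illustration.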
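And last but not least, if you need to use it in JavaScript or other languages whose regex engine does not support lookbehind, replace (?<!\S) / (?<=\s|^) with a (non)capturing group (\s|^), wrap the whole email part of the pattern in a set of capturing parentheses, and use the language's means to grab the contents of the email group. With the pattern below as written, that is group 2; if the leading group is made non-capturing, (?:\s|^), it is group 1:
(\s|^)([^\s_.-]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*)
See the regex demo.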
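The group-based variant is meant for engines without lookbehind, but the extraction step looks much the same in any language. Here is a sketch in C#, kept in the same language as the question. It assumes .NET 5 or later, and the ExtractEmails helper and the sample line are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Pattern without lookbehind: group 1 consumes the leading whitespace
// (or is empty at the start of the string), group 2 holds the address.
string grouped =
    @"(\s|^)([^\s_.-]\w*(?:[-+.]\w+)*\b@\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*)";

// Hypothetical helper: returns the contents of group 2 for every match.
List<string> ExtractEmails(string input)
{
    var result = new List<string>();
    foreach (Match m in Regex.Matches(input, grouped))
    {
        // m.Value may include the leading whitespace, so read the group instead.
        result.Add(m.Groups[2].Value);
    }
    return result;
}

// Made-up sample; the middle address starts with a dot and should be skipped.
string line = "info@company-hotels.org .hidden@site.net steve.parker@soft.de";

foreach (string email in ExtractEmails(line))
    Console.WriteLine(email);
```

This prints info@company-hotels.org and steve.parker@soft.de. Reading the whole match instead of the group would return the second address with its leading space, which is why the extra capturing parentheses are needed.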