Tech Notes

RegEx

https://www.youtube.com/watch?v=sa-TUpSx1JA

Meta Characters

Meta Character    Meaning of the meta character when searching
. (dot)           any character except a newline
\d                any digit: 0-9
\D                not a digit (0-9)
\w                any word character: a-z, A-Z, 0-9, _
\W                not a word character
\s                whitespace: space, tab, newline
\S                not whitespace
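
As a rough illustration of the meta characters above, here is a minimal sketch using Python's built-in re module. The sample string and the variable names are made up for this example.

```python
import re

# Hypothetical sample text; the "\t" here is a real tab character.
text = "Call 555-1234 now\tor_later"

digits = re.findall(r"\d", text)       # every digit, one at a time
non_word = re.findall(r"\W", text)     # anything that is not a-z, A-Z, 0-9 or _
whitespace = re.findall(r"\s", text)   # spaces, tabs, newlines
any_char = re.findall(r".", "a\nb")    # the dot skips the newline

print(digits)
print(non_word)
print(whitespace)
print(any_char)
```

Note that the underscore in "or_later" counts as a word character, so \W does not pick it up.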

Anchors: these don't match a character. They match the invisible position before or after a character. Anchors are used in conjunction with meta characters.

Anchor    Meaning of the anchor when searching
\b        word boundary: the position between a word character and a non-word character (e.g. the start of a line, or next to a space)
\B        not a word boundary
^         beginning of the string
$         end of the string
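
To see that anchors match positions rather than characters, here is a small sketch with Python's re module. The sample string is invented for illustration, and the lists hold the index where each match starts.

```python
import re

# Hypothetical sample: "cat" appears as a whole word twice and once inside "concat".
text = "cat concat cat"

# \b on both sides: only "cat" as a standalone word.
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]

# \B in front: only "cat" that continues an existing word ("concat").
inside_word = [m.start() for m in re.finditer(r"\Bcat", text)]

# ^ and $ pin the match to the start and the end of the string.
at_start = re.search(r"^cat", text) is not None
at_end = re.search(r"cat$", text) is not None

print(whole_words, inside_word, at_start, at_end)
```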

Matchers:

Matcher    Meaning of the matcher when searching
[ ]        matches any one of the characters in the brackets
[^ ]       matches any one character NOT in the brackets
|          either or
( )        group
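
A short sketch of the matchers above, again assuming Python's re module. All sample strings are hypothetical.

```python
import re

# [ ] matches any single character from the set.
vowels = re.findall(r"[aeiou]", "regex")

# [^ ] matches any single character that is NOT in the set.
non_digits = re.findall(r"[^0-9]", "a1b2")

# | matches either the left or the right alternative.
pets = re.findall(r"cat|dog", "cat, dog, cow")

# ( ) groups part of the pattern so it can be read back separately.
m = re.search(r"(\d\d)-(\d\d)", "due 12-25")
month, day = m.group(1), m.group(2)

print(vowels, non_digits, pets, month, day)
```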

Quantifiers:

Quantifier    Meaning of the quantifier when searching
*             0 or more
+             1 or more
?             0 or 1
{3}           exactly 3 occurrences
{3,4}         a range of occurrences (minimum 3, maximum 4)
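
The quantifiers can be compared side by side with a sketch like the one below (Python's re module, made-up sample strings). Each quantifier applies to the item immediately before it.

```python
import re

# * allows zero or more "b" after the "a".
star = re.findall(r"ab*", "a ab abbb")

# + needs at least one "b".
plus = re.findall(r"ab+", "a ab abbb")

# ? makes the "u" optional.
optional = re.findall(r"colou?r", "color colour")

# {3} needs exactly three digits; \b stops it matching part of a longer number.
exactly_three = re.findall(r"\b\d{3}\b", "12 123 1234")

# {3,4} accepts three or four digits.
three_or_four = re.findall(r"\b\d{3,4}\b", "12 123 1234 12345")

print(star, plus, optional, exactly_three, three_or_four)
```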

RegEx in Python

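Why use a raw string (r"string") when writing a regex? By default, Python treats \t and \n inside a string literal as a tab and a newline. A raw string keeps the backslash exactly as written, so Python gives no special treatment to \n, \t, etc. and the regex engine receives the pattern unchanged.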

print("\ttab")
	tab
print(r"\ttab")
\ttab
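
The same effect matters for regex patterns. In the sketch below (sample strings are hypothetical), "\b" in a normal string is Python's backspace character, so only the raw version reaches the re module as a word-boundary anchor.

```python
import re

text = "a cat sat"  # hypothetical sample

# Raw string: the regex engine sees backslash + b, i.e. a word boundary.
raw_matches = re.findall(r"\bcat\b", text)

# Normal string: Python turns "\b" into a single backspace character first,
# so the pattern looks for backspace + "cat" + backspace and finds nothing.
plain_matches = re.findall("\bcat\b", text)

# The difference is visible in the string lengths as well.
raw_len = len(r"\b")    # two characters: backslash and b
plain_len = len("\b")   # one character: backspace

print(raw_matches, plain_matches, raw_len, plain_len)
```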
