正则表达式学习笔记

正则表达式是对字符串操作的一种逻辑公式，就是用事先定义好的一些特定字符、及这些特定字符的组合，组成一个“规则字符串”，这个“规则字符串”用来表达对字符串的一种过滤逻辑。摘自百度百科。
常用的正则表达式：
. 匹配任意字符（.转义）
[a-z] 表示字符集,a-z中的任意一个字母,如[0-9] [A-Z] [a-zA-Z0-9]等
[^a-z]除了小写字母外的任意一个字符
|管道符表示或，如p(ython|hp)
w 表示字母、数字或下划线
d 表示任意数字
( )? 表示括号中的内容可能但不一定出现的，如(http://)?heartofocean..cn能匹配http://heartofocean.cn 和heartofocean.cn
( )* 允许括号中的内容出现N次，N≥0
( )+允许括号中的内容出现N次，N≥1
( ){x,y} 允许括号中的内容出现N次，x≤N≤y
(?=exp) 匹配exp前面的位置
(?<=exp) 匹配exp后面的位置
(?!exp) 匹配后面跟的不是exp的位置
(?<!exp) 匹配前面不是exp的位置
(?#comment) 注释
(exp) 匹配exp,并捕获文本到自动命名的组里
(?<name>exp) 匹配exp,并捕获文本到名称为name的组里，也可以写成(?’name’exp)

Regular Expression HOWTO

d: Matches any decimal digit; this is equivalent to the class [0-9].
D: Matches any non-digit character; this is equivalent to the class [^0-9].
s: Matches any whitespace character; this is equivalent to the class [ tnrfv].
S: Matches any non-whitespace character; this is equivalent to the class [^ tnrfv].
w: Matches any alphanumeric character; this is equivalent to the class [a-zA-Z0-9_].
W: Matches any non-alphanumeric character; this is equivalent to the class [^a-zA-Z0-9_].

发表回复 取消回复

发表回复取消回复