Two notes on PHP regular expressions
1. Handling "greedy" quantifiers
By default, the quantifiers are "greedy", that is, they
match as much as possible (up to the maximum number of
permitted times), without causing the rest of the pattern to
fail. The classic example of where this gives problems is in
trying to match comments in C programs. These appear between
the sequences /* and */ and within the sequence, individual
* and / characters may appear. An attempt to match C
comments by applying the pattern
/\*.*\*/
to the string
/* first comment */ not comment /* second comment */
fails, because it matches the entire string due to the
greediness of the .* item.
However, if a quantifier is followed by a question mark,
then it ceases to be greedy, and instead matches the minimum
number of times possible, so the pattern
/\*.*?\*/
does the right thing with the C comments: each comment is
matched separately.
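The difference can be checked with a minimal sketch like the one below. It assumes a subject string modeled on the one above and uses # as the pattern delimiter (a choice made here only so the slashes need no escaping); the variable names are illustrative.

```php
<?php
// Two C comments with plain text between them.
$subject = '/* first comment */ not comment /* second comment */';

// Greedy: .* runs on to the last */ in the string, giving one long match.
preg_match_all('#/\*.*\*/#', $subject, $greedy);

// Lazy: .*? stops at the first */ it can reach, so each comment
// is matched on its own.
preg_match_all('#/\*.*?\*/#', $subject, $lazy);

print_r($greedy[0]);   // one element: the whole subject string
print_r($lazy[0]);     // two elements: the two comments
```

preg_match_all is used rather than preg_match so that every match is collected, which makes the one-match versus two-match difference visible.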
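Summary:
A trailing ? and the /U modifier both make a quantifier non-greedy, but when they appear together they cancel each other out and the quantifier is greedy again.
For example: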
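<?php
// One "/*" opener followed by two "*/" closers.
$a = "asdf/*asdfaldsfasdf*/asfdasldf;kfldsj*/asfddsaf";
$pattern = "/\/\*.*?\*\//";      // lazy: stops at the first "*/"
//$pattern = "/\/\*.*\*\//U";    // U alone: also lazy
//$pattern = "/\/\*.*?\*\//U";   // ? plus U: greedy again, runs to the last "*/"
preg_match($pattern, $a, $match);
print_r($match);
?>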
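To see the three variants side by side without commenting lines in and out, something like the following sketch could be used. The $patterns and $results arrays and their labels are illustrative names, not part of the original snippet.

```php
<?php
// Same subject as in the example above.
$a = "asdf/*asdfaldsfasdf*/asfdasldf;kfldsj*/asfddsaf";

$patterns = [
    'lazy'       => '/\/\*.*?\*\//',   // ? makes .* lazy
    'greedy + U' => '/\/\*.*\*\//U',   // U inverts the default, so .* is lazy
    'lazy + U'   => '/\/\*.*?\*\//U',  // U inverts .*? back to greedy
];

$results = [];
foreach ($patterns as $label => $pattern) {
    preg_match($pattern, $a, $match);
    $results[$label] = $match[0];
    echo $label, ': ', $match[0], "\n";
}
```

The first two patterns stop at the first */, while the third runs on to the last */, which is the "cancel each other out" behaviour described in the summary.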
2. Assertions
\w+(?=;)
matches a word followed by a semicolon, but does not include
the semicolon in the match, and
foo(?!bar)
matches any occurrence of "foo" that is not followed by
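"bar". Note that the apparently similar pattern
(?!foo)bar
does not find an occurrence of "bar" that is preceded by
something other than "foo"; it finds any occurrence of "bar"
whatsoever, because the assertion (?!foo) is always true
when the next three characters are "bar". A lookbehind
assertion is needed to achieve this effect.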
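Summary:
(?!...) is a lookahead: it only tests what follows the current position, as in bar(?!foo); (?!foo)bar is pointless.
(?<!...) is a lookbehind: it only tests what precedes the current position, as in (?<!foo)bar; foo(?<!bar) is pointless.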
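The assertions above can be tried out with a short sketch such as this one. The subject strings are made up for illustration, and PREG_OFFSET_CAPTURE is used only to show where in the subject each match was found.

```php
<?php
// Positive lookahead: a word followed by ';', with the ';' left out of the match.
preg_match('/\w+(?=;)/', 'alpha beta; gamma', $m1);

// Negative lookahead: "foo" that is not followed by "bar".
preg_match_all('/foo(?!bar)/', 'foobar foobaz', $m2, PREG_OFFSET_CAPTURE);

// Misplaced lookahead: where "bar" begins, the next three characters
// are "bar" and never "foo", so the assertion always succeeds.
preg_match_all('/(?!foo)bar/', 'foobar bazbar', $m3);

// Negative lookbehind: "bar" that is not preceded by "foo".
preg_match_all('/(?<!foo)bar/', 'foobar bazbar', $m4, PREG_OFFSET_CAPTURE);

print_r($m1);      // "beta"
print_r($m2[0]);   // "foo" at offset 7 only
print_r($m3[0]);   // both occurrences of "bar"
print_r($m4[0]);   // "bar" at offset 10 only
```

Comparing the third and fourth results shows why the lookahead has to come after the text it qualifies and the lookbehind before it.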