对于C11中的正则表达式的使用

Regular Expression Special Characters

"."---Any single character(a "wildcard")

"["---Begin character class

"]"---End character class

"{"---Begin count

"}"---End count

"("---Begin grouping

")"---End grouping

"\"---Next character has a special meaning

"*"---Zero or more

"+"---One or more

"?"---Optional(zero or one)

"!"---Alternative(or)

"^"---Start of line; negation

"$"---End of line

Example:

case 1:

^A*B+C?$

explain 1:

以A开头，有多个或者没有B，有至少一个C，之后有没有都可以，结束。

A pattern can be optional or repeated(the default is exactly once) by adding a suffix:

Repetition

{n}---Exactly n times;

{n,}---no less than n times;

{n,m}---at least n times and at most m times;

*---Zero or more , that is , {0,}

+---One or more, that is ,{1,}

?---Optional(zero or one), that is {0,1}

Example:

case 1:

A{3}B{2,4}C*

explain 1:

AAABBC or AAABBB

A suffix ? after any of the repetition notations makes the pattern matcher "lazy" or "non-greedy".

That is , when looking for a pattern, it will look for the shortest match rather than the lonest.

By default, the pattern matcher always looks for the longest match (similar to C++‘s Max rule).

Consider:

?ababab

The pattern (ab)*matches all of "ababab". However, (ab)*? matches only the first "ab".

The most common character classifications have names:

Character Classes

alnum --- Any alphanumeric character

alpha --- Any alphanumeric character

blank --- Any whitespace character that is not a line separator

cntrl --- Any control character

d --- Any decimal digit

digit --- Any decimal digit

graph --- Any graphical character

lower --- Any lowercase character

print --- Any printable character

punct --- Any punctuation character

s --- Any whitespace character

space --- Any whitespace character

upper --- Any uppercase charater

w --- Any word character(alphnumeric characters plus the underscore)

xdigit --- Any hexadecimal digit character

Several character classes are supported by shorthand notation:

Character Class Abbreviations

\d --- A decimal digit --- [[:digit:]]

\s --- A space (space tab,...) --- [[:space:]]

\w --- A letter(a-z) or digit(0-9) or underscore(_) --- [_[:alnum:]]

\D --- Not \d --- [^[:digit:]]

\S --- Not \s --- [^[:space:]]

\W --- Not \w --- [^_[:alnum:]]

In addition, languages supporting regular expressions often provide:

Nonstandard (but Common) Character Class Abbreviations

\l --- A lowercase character --- [[:lower:]]

\u --- An uppercase character --- [[:upper;]]

\L --- Not \l --- [^[:lower:]]

\U --- Not \u --- [^[:upper:]]

Note the doubling of the backslash to include a backslash in an ordinary string literal.

As usual, backslashes can denote special charaters:

Special Characters

\n --- Newline

\t --- Tab

\\ --- One backslash

\xhh -- Unicode characters expressed using twp hexadecimal digits

\uhhh --- Unicode characters expressed using four hexadecimal digits

To add to the opportunites for confusion, two further logically differents uses of the backslash are provided:

Special Characters

\b --- The first or last character of a word (a "boundary character")

\B --- Not a \b

\i --- The ith sub_match in this pattern

Here are some examples of patterns:

Ax* ? ?//A,Ax,Axxxx

Ax+ ? ?//Ax,Axxx not A

\d-?\d ?//1-2,12 not 1--2

\w{2}-d{4,5} ? ?//Ab-1234,XX54321,22-5432

(\d*:)?(\d+) ? ? //12:3, 1:23, 123, :123 Not 123:

(bs|BS) ? ? ? ? //bs ,BS Not bS

[aeiouy] ? ? ? ?//a,o,u An English vowel, not x

[^aeiouy] ? ? ? //x,k Not an English vowel, not e

[a^eiouy] ? ? ? //a,^,o,u An Engish vowel or ^

下面是测试代码：

#include <iostream>
#include <regex>

using namespace std;

int main()
{
    const char* reg_esp = "^A*B+C?$";
    regex rgx(reg_esp);
    cmatch match;
    const char* target = "AAAAAAAAABBBBBBBBC";
    if(regex_search(target,match,rgx))
    {
        for(size_t a = 0;a < match.size();a++)
            cout << string(match[a].first,match[a].second) << endl;
    }
    else
        cout << "No Match Case !" << endl;
    return 0;
}

对于C11中的正则表达式的使用

时间： 2024-10-25 14:04:13

对于C11中的正则表达式的使用的相关文章

JavaScript中的正则表达式（终结篇）

JavaScript中的正则表达式(终结篇) 在之前的几篇文章中,我们了解了正则表达式的基本语法,但那些语法不是针对于某一个特定语言的.这篇博文我们将通过下面几个部分来了解正则表达式在JavaScript中的使用: JavaScript对正则表达式的支持程度支持正则表达式的RegExp类型 RegExp的实例属性 RegExp的实例方法 RegExp的构造函数属性简单的应用第一部分:JavaScript对正则表达式的支持程度之前我介绍了正则表达式的基本语法,如果大家不是很了解可以先看下面

Python中re(正则表达式)模块函数学习

今天学习了Python中有关正则表达式的知识.关于正则表达式的语法,不作过多解释,网上有许多学习的资料.这里主要介绍Python中常用的正则表达式处理函数. 方法/属性作用 match() 决定 RE 是否在字符串刚开始的位置匹配 search() 扫描字符串,找到这个 RE 匹配的位置 findall() 找到 RE 匹配的所有子串,并把它们作为一个列表返回 finditer() 找到 RE 匹配的所有子串,并把它们作为一个迭代器返回 match() 函数只检查 RE 是否在字符串开始处匹配

Coursera-Getting and Cleaning Data-week4-R语言中的正则表达式以及文本处理

Coursera-Getting and Cleaning Data-Week4 Thursday, January 29, 2015 补上第四周笔记,以及本次课程总结. 第四周课程主要针对text进行处理.里面包括 1.变量名的处理 2.正则表达式 3.日期处理(参见swirl lubridate包练习) 首先,变量名的处理,奉行两个原则,1)统一大小写tolower/toupper:2)去掉在导入数据时,因为特殊字符导致的合并变量 3)不要重复:4)少用代码缩写使用的函数包括替换查找:

SQL Server中使用正则表达式

SQL Server 2005及以上版本支持用CLR语言(C# .NET.VB.NET)编写过程.触发器和函数,因此使得正则匹配,数据提取能够在SQL中灵活运用,大大提高了SQL处理字符串,文本等内容的灵活性及高效性. 操作步骤: 1.新建一个SQL Server项目(输入用户名,密码,选择DB),新建好后,可以在属性中更改的 2.新建一个类“RegexMatch.cs”,选择用户定义的函数可以看到,该类为一个部分类:public partial class UserDefinedFuncti

String的replaceAll方法中的正则表达式用法

项目里面需要对已手机号码进行如下的显示比如15088688388 要显示为150****8388的效果实现这个简单的效果方法有很多我想试试用正则表达式去实现查了点资料最终试出来以下方法可行 System.out.println("15088688388".replaceAll("(\\d{3})(\\d{4})","$1****")); 输出结果:150****8388 首先对replaceAll方法的第一个参数进行解释第一个参数

MySQL中的正则表达式

正则表达式是用正则表达式语言来建立基本字符的匹配 .是正则表达式语言中的一个特殊的字符,它表示匹配任意一个字符在LIKE和REGEXP之间有一个重要的差别,LIKE匹配整个列,如果被匹配的文本仅在列值中出现,LIKE将不会找到它,相应的行也不会被返回(除非使用通配符) 而REGEXP在列值内匹配,如果被匹配的文本在列值中出现,REGEXP将会找到它,相应的行也会被返回. MySQL中的正则表达式匹配默认不区分大小写,为区分大小写,可使用BINARY关键字,在REGEXP后面加上BINARY即

java 中的正则表达式

一. 什么是正则表达式? 正则表达式是一种文本模式,包括普通字符(例如,a 到 z 之间的字母)和特殊字符(称为“元字符”).模式描述在搜索文本时要匹配的一个或多个字符串. 二.正则表达式语法参考: https://msdn.microsoft.com/zh-cn/library/ae5bf541(VS.80).aspx. 三. 正则表达式规则的例子: /^(/d{3}-|/d{4}-)?(/d{8}|/d{7})?$/ //国内电话 /^[1-9]*[1-9][0-9]*$/

C++、Java、JavaScript中的正则表达式

编程思想之正则表达式什么是正则表达式? 正则表达式(Regular Expression)就是用某种模式去匹配一类字符串的公式.如你要在一篇文章中查找第一个字是"罗"最后一个字是"浩"的三个字的姓名,即"罗*浩":那么"罗*浩"就是公式,也称作模式(Pattern),这篇文章就是要匹配的串(或叫文本text).再如,你要检查输入的一个字符串是否是126邮箱的格式,你得制定一个规则去查检,这种规则就是正则表达式. 从入门开始

Python学习-38.Python中的正则表达式（二）

在Python中,正则表达式还有较其他编程语言有特色的地方.那就是支持松散正则表达式了. 在某些情况,正则表达式会写得十分的长,这时候,维护就成问题了.而松散正则表达式就是解决这一问题的办法. 用上一次分组的代码作为例子: 1 import re 2 userinput = input("please input test string:") 3 m = re.match(r'(\d{3,4})-(\d{8})',userinput) 4 if m: 5 print('区号:' + m