Database Basics: Searching with Regular Expressions (Usage Tips)

2023-11-08

Using MySQL Regular Expressions

For the demonstrations below, we create a database named crashcourse:

CREATE DATABASE crashcourse DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

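crashcourse is the name I chose for the database; every example below uses it.

We also need to create some tables in the crashcourse database and insert some records into them. The relevant SQL files are:

Create the tables: create.sql
Insert the data: populate.sql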

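SQL keywords are case-insensitive by default. Regular expression matching in MySQL is also case-insensitive by default for ordinary (non-binary) strings, which is why a pattern such as "[123] Ton" below matches "1 ton".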

select vend_id, prod_name from products where prod_name regexp "1000" order by prod_name;

This matches the rows whose prod_name contains the string '1000'.

select vend_id, prod_name from products where prod_name regexp ".000" order by prod_name;

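This uses the regular expression .000. The . is a special character in the regular expression language: it matches any single character, so both 1000 and 2000 match and are returned.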
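The behavior of . can be checked without the products table by applying REGEXP to string literals. This is a minimal sketch; the literals are only stand-ins modelled on the product names, and the column aliases are made up for illustration.

```sql
-- REGEXP returns 1 when the pattern matches anywhere in the string, 0 otherwise.
SELECT 'JetPack 1000' REGEXP '.000' AS matches_1000,  -- 1: '1' fills the . position
       'JetPack 2000' REGEXP '.000' AS matches_2000,  -- 1: '2' fills the . position
       '000'          REGEXP '.000' AS too_short;     -- 0: . needs a character before 000
```

Because . must consume exactly one character, the three-character string '000' does not match.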

select vend_id, prod_name from products where prod_name regexp "1000|2000" order by prod_name;

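This statement uses the regular expression 1000|2000. The | is the regular expression OR operator: it matches either alternative, so both 1000 and 2000 match and are returned.

Using | is functionally similar to using OR conditions in a SELECT statement, and several OR conditions can be folded into a single regular expression. For example, '1000|2000|3000' matches 1000 or 2000 or 3000.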
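One detail worth checking is whitespace: spaces inside a pattern are literal characters, so padding the alternatives changes what they match. A small sketch on a stand-in literal (the aliases are illustrative only):

```sql
SELECT 'JetPack 2000' REGEXP '1000|2000|3000'     AS no_spaces,    -- 1: matches the alternative 2000
       'JetPack 2000' REGEXP '1000 | 2000 | 3000' AS with_spaces;  -- 0: the alternatives are '1000 ', ' 2000 ', ' 3000'
```

In the second pattern the middle alternative requires a space after 2000, which the string does not have.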

select vend_id, prod_name from products where prod_name regexp "[123] Ton" order by prod_name;

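This uses the regular expression [123] Ton. [123] defines a set of characters and means "match 1 or 2 or 3", so 1 ton and 2 ton both match and are returned (there is no 3 ton).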

select vend_id, prod_name from products where prod_name regexp "[^123] Ton" order by prod_name;

A ^ at the start of a set negates it: [^123] matches any single character except 1, 2, and 3.

select vend_id, prod_name from products where prod_name regexp "[1-5] Ton" order by prod_name;

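This uses the regular expression [1-5] Ton. [1-5] defines a range and matches any digit from 1 to 5, so three rows are returned. Note that .5 ton is among them: the pattern is not anchored, so the 5 in .5 ton matches [1-5].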
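The three kinds of set (plain, negated, range) can be compared side by side on literals. This sketch assumes the connection uses a case-insensitive collation, which is the usual default; the strings and aliases are stand-ins for illustration.

```sql
-- Assumes a case-insensitive default collation, so 'Ton' matches 'ton'.
SELECT '1 ton anvil'  REGEXP '[123] Ton'  AS in_set,      -- 1: '1' is in [123]
       '.5 ton anvil' REGEXP '[123] Ton'  AS not_in_set,  -- 0: '5' is not in [123]
       '.5 ton anvil' REGEXP '[^123] Ton' AS negated,     -- 1: '5' is anything but 1, 2, 3
       '.5 ton anvil' REGEXP '[1-5] Ton'  AS in_range;    -- 1: '5' falls in the range 1-5
```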

select vend_id, prod_name from products where prod_name regexp "\\." order by prod_name;

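\\. matches a literal ., so only one row is retrieved. This is called escaping: every character that has a special meaning in a regular expression must be escaped this way to be matched literally. That includes ., |, [], and the other special characters used so far. Two backslashes are needed because MySQL itself interprets one and the regular expression library interprets the other.

You will often find yourself matching digits, all alphabetic characters, or all alphanumeric characters. To make this easier, you can use predefined character sets known as character classes. Table 9-2 lists the character classes and what they mean.

You may need to find every number regardless of how many digits it contains, or find a word while also allowing for a trailing s if there is one, and so on. This can be done with the regular expression repetition metacharacters listed in Table 9-3.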
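The difference between . and \\. shows up clearly on literals. A minimal sketch, assuming the NO_BACKSLASH_ESCAPES SQL mode is not enabled (the default); strings and aliases are illustrative.

```sql
SELECT 'JetPack 1000' REGEXP '.'   AS unescaped,    -- 1: . matches any character
       'JetPack 1000' REGEXP '\\.' AS escaped,      -- 0: the string contains no literal period
       '.5 ton anvil' REGEXP '\\.' AS escaped_hit;  -- 1: the leading period matches
```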
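As a brief illustration of character classes, the sketch below tries three common POSIX classes on made-up literals. Each class name goes inside a bracket set, hence the doubled brackets.

```sql
SELECT 'abc'  REGEXP '[[:digit:]]' AS no_digit,   -- 0: no digit anywhere
       'abc1' REGEXP '[[:digit:]]' AS has_digit,  -- 1: '1' is a digit
       '123'  REGEXP '[[:alpha:]]' AS has_alpha,  -- 0: no letter anywhere
       '---'  REGEXP '[[:alnum:]]' AS has_alnum;  -- 0: neither letter nor digit
```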

select vend_id, prod_name from products where prod_name regexp "\\([0-9] sticks?\\)" order by prod_name;

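In the regular expression \\([0-9] sticks?\\): \\( matches (, [0-9] matches any digit (1 and 5 in this example), sticks? matches stick and sticks (the ? after the s makes the s optional, because ? matches zero or one occurrence of whatever character precedes it), and \\) matches ). Without ?, matching both stick and sticks would be very difficult.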
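To see what the ? contributes, the same pattern can be tried with and without it. The literals below are stand-ins modelled on the product names, and the aliases are illustrative.

```sql
SELECT 'TNT (1 stick)'  REGEXP '\\([0-9] sticks?\\)' AS singular,   -- 1: the optional s is absent
       'TNT (5 sticks)' REGEXP '\\([0-9] sticks?\\)' AS plural,     -- 1: the optional s is present
       'TNT (5 sticks)' REGEXP '\\([0-9] stick\\)'   AS without_q;  -- 0: 'stick' must be followed by ')'
```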

select vend_id, prod_name from products where prod_name regexp "[[:digit:]]{4}" order by prod_name;
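In the statement above, [[:digit:]] is a set matching any digit and {4} requires exactly four occurrences of the preceding element, so the pattern matches any four consecutive digits. A rough check on literals (the strings, including 'Model 12345', are hypothetical):

```sql
SELECT 'JetPack 1000'   REGEXP '[[:digit:]]{4}' AS four_digits,  -- 1: '1000' is four digits in a row
       'TNT (5 sticks)' REGEXP '[[:digit:]]{4}' AS one_digit,    -- 0: only a single digit
       'Model 12345'    REGEXP '[[:digit:]]{4}' AS five_digits;  -- 1: unanchored, so '1234' suffices
```

The last case is a reminder that {4} alone does not mean "exactly four digits in total"; without anchors or delimiters, a longer run of digits also matches.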

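To match text at a specific position, use the anchors listed in Table 9-4. For example, what if you want to find all products whose names start with a number (including numbers that begin with a decimal point)? A simple search for [0-9\\.] (or [[:digit:]\\.]) will not work, because it looks for matches anywhere in the text. The solution is the ^ anchor, as shown here: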

select vend_id, prod_name from products where prod_name regexp "^[0-9\\.]" order by prod_name;

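^ matches the start of the string. So ^[0-9\\.] matches . or a digit only when it is the first character of the string. Without ^, four more rows would be retrieved (those with digits in the middle of the name).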
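The effect of the ^ anchor can be sketched on literals as follows; the strings are stand-ins for product names and the aliases are illustrative.

```sql
SELECT '.5 ton anvil' REGEXP '^[0-9\\.]' AS starts_with_dot,    -- 1: first character is a period
       '1 ton anvil'  REGEXP '^[0-9\\.]' AS starts_with_digit,  -- 1: first character is a digit
       'JetPack 1000' REGEXP '^[0-9\\.]' AS digit_later,        -- 0: first character is 'J'
       'JetPack 1000' REGEXP '[0-9\\.]'  AS unanchored;         -- 1: a digit appears later in the string
```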