Redis Core Principles and Practice: RediSearch, a High-Performance Full-Text Search Engine

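RediSearch is a high-performance full-text search engine that runs on a Redis server as a Redis Module (an extension module).

Its main features are:

  1. Full-text indexing of multiple fields in a document
  2. High-performance incremental indexing
  3. Document ranking (provided manually by the user at index time)
  4. Complex boolean queries with AND, OR, and NOT operators between sub-queries
  5. Optional query clauses
  6. Prefix-based search
  7. Field weights
  8. Auto-complete suggestions (with fuzzy prefix suggestions)
  9. Exact phrase search
  10. Stemming-based query expansion in many languages
  11. Custom functions for query expansion and scoring
  12. Limiting a search to specific document fields
  13. Numeric filters and ranges
  14. Geo filtering using Redis's own geo commands
  15. Unicode support (requires the UTF-8 character set)
  16. Retrieval of full document content or of IDs only
  17. Document deletion and updating, with index garbage collection
  18. Partial updates and conditional document updates

Installation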

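As with the Bloom filter module covered earlier, we can install and start RediSearch with Docker, the approach recommended by the RediSearch project. The command below also names the container "myredis" so that later commands can refer to it:

    docker run --name myredis -p 6379:6379 redislabs/redisearch:latest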

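If the installation and startup succeed, the output looks like the figure below:

Once the server is running, use redis-cli to check that the RediSearch module was loaded. Start redis-cli through Docker with:

    docker exec -it myredis redis-cli

Here "myredis" is the name of the container running the Redis server. The result is: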

    127.0.0.1:6379> module list
    1) 1) "name"
       2) "ft"
       3) "ver"
       4) (integer) 10610
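The "ver" value in this reply is a single integer. As a hedged illustration, the Java sketch below decodes it on the assumption that the version is packed as major * 10000 + minor * 100 + patch, under which 10610 would read as 1.6.10. The class ModuleVersion and its format method are names made up for this example; they are not part of Redis or RediSearch.

```java
// Hypothetical helper: decodes a module version integer such as the one
// returned by MODULE LIST, ASSUMING the layout major*10000 + minor*100 + patch.
final class ModuleVersion {
    static String format(int ver) {
        int major = ver / 10000;        // leading digits
        int minor = (ver / 100) % 100;  // middle two digits
        int patch = ver % 100;          // last two digits
        return major + "." + minor + "." + patch;
    }

    public static void main(String[] args) {
        System.out.println(format(10610)); // prints 1.6.10
    }
}
```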

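The returned array contains "ft", which means the RediSearch module was loaded successfully.

Installing from source

If you prefer not to use Docker, you can also build RediSearch from source:

    git clone https://github.com/RedisLabsModules/RediSearch.git
    cd RediSearch    # enter the module directory
    make all

After the build finishes, start Redis and load the RediSearch module with the following command (run from the Redis source directory; the relative path assumes RediSearch was cloned next to it):

    src/redis-server redis.conf --loadmodule ../RediSearch/src/redisearch.so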

Usage

We start by working with RediSearch through redis-cli.

Creating an index and its fields

    127.0.0.1:6379> ft.create myidx schema title text weight 5.0 desc text
    OK
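To make the argument layout of the command above easier to see, here is a minimal Java sketch that assembles the same FT.CREATE arguments from a map of text fields and their weights. The class FtCreateArgs and its build method are hypothetical helpers written for this illustration, not part of RediSearch or of any client library. The sketch only builds the argument list and does not talk to Redis.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds the argument list for FT.CREATE with TEXT fields.
final class FtCreateArgs {
    static final double DEFAULT_WEIGHT = 1.0;

    static List<String> build(String index, Map<String, Double> textFields) {
        List<String> args = new ArrayList<>();
        args.add("FT.CREATE");
        args.add(index);
        args.add("SCHEMA");
        for (Map.Entry<String, Double> field : textFields.entrySet()) {
            args.add(field.getKey());
            args.add("TEXT");
            // WEIGHT is optional; leave it out when the field uses the default of 1.0.
            if (field.getValue() != DEFAULT_WEIGHT) {
                args.add("WEIGHT");
                args.add(String.valueOf(field.getValue()));
            }
        }
        return args;
    }

    public static void main(String[] args) {
        Map<String, Double> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("title", 5.0);
        fields.put("desc", 1.0);
        System.out.println(String.join(" ", build("myidx", fields)));
        // prints: FT.CREATE myidx SCHEMA title TEXT WEIGHT 5.0 desc TEXT
    }
}
```

A LinkedHashMap is used so that the fields come out in the order they were declared, matching the order in the redis-cli command.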

Here "myidx" is the ID of the index. The index has two fields, "title" and "desc"; "weight" sets a field's weight, which defaults to 1.0.

Adding content to the index

    127.0.0.1:6379> ft.add myidx doc1 1.0 fields title "He urged her to study English" desc "good idea"
    OK

Here "doc1" is the document ID (docid) and "1.0" is the document's score.

Searching by keyword

    127.0.0.1:6379> ft.search myidx "english" limit 0 10
    1) (integer) 1
    2) "doc1"
    3) 1) "title"
       2) "He urged her to study English"
       3) "desc"
       4) "good idea"

The keyword "english", which appears in the title field, returned the one document that matches the query.

Chinese search

First, add a document with Chinese content to the index:

    127.0.0.1:6379> ft.add myidx doc2 1.0 language "chinese" fields title "Java 14 发布了!新功能速览" desc "Java 14 在 2020.3.17 日发布正式版了,但现在很多公司还在使用 Java 7 或 Java 8"
    OK

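Note: the document language must be set to Chinese here by passing language "chinese". The default language is English, and without this setting Chinese queries are not supported (they return no results).

Now search in the same way as before:

    127.0.0.1:6379> ft.search myidx "正式版"
    1) (integer) 0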

Nothing was found, because we did not specify the language of the search. The language has to be given not only when a document is saved but also when it is queried:

    127.0.0.1:6379> ft.search myidx "发布了" language "chinese"
    1) (integer) 1
    2) "doc2"
    3) 1) "desc"
       2) "Java 14 \xe5\x9c\xa8 2020.3.17 \xe6\x97\xa5\xe5\x8f\x91\xe5\xb8\x83\xe6\xad\xa3\xe5\xbc\x8f\xe7\x89\x88\xe4\xba\x86\xef\xbc\x8c\xe4\xbd\x86\xe7\x8e\xb0\xe5\x9c\xa8\xe5\xbe\x88\xe5\xa4\x9a\xe5\x85\xac\xe5\x8f\xb8\xe8\xbf\x98\xe5\x9c\xa8\xe4\xbd\xbf\xe7\x94\xa8 Java 7 \xe6\x88\x96 Java 8"
       3) "title"
       4) "Java 14 \xe5\x8f\x91\xe5\xb8\x83\xe4\xba\x86\xef\xbc\x81\xe6\x96\xb0\xe5\x8a\x9f\xe8\x83\xbd\xe9\x80\x9f\xe8\xa7\x88"
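The \xNN sequences in the reply above are how redis-cli prints the UTF-8 bytes of the Chinese text. As a rough illustration, the Java sketch below turns such an escaped string back into readable text. The class EscapedReply and its decode method are hypothetical names for this example, and the sketch handles only \xNN escapes, not the other escapes redis-cli can emit.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes \xNN byte escapes, as printed by redis-cli, into UTF-8 text.
final class EscapedReply {
    static String decode(String escaped) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i = 0;
        while (i < escaped.length()) {
            // A \xNN escape is four characters: backslash, 'x', and two hex digits.
            if (escaped.startsWith("\\x", i) && i + 4 <= escaped.length()) {
                bytes.write(Integer.parseInt(escaped.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                // Any other character is copied through as its UTF-8 bytes.
                int codePoint = escaped.codePointAt(i);
                byte[] plain = new String(Character.toChars(codePoint))
                        .getBytes(StandardCharsets.UTF_8);
                bytes.write(plain, 0, plain.length);
                i += Character.charCount(codePoint);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The escaped title from the reply above.
        String title = "Java 14 \\xe5\\x8f\\x91\\xe5\\xb8\\x83\\xe4\\xba\\x86\\xef\\xbc\\x81"
                + "\\xe6\\x96\\xb0\\xe5\\x8a\\x9f\\xe8\\x83\\xbd\\xe9\\x80\\x9f\\xe8\\xa7\\x88";
        System.out.println(decode(title));
    }
}
```

In an interactive session, starting redis-cli with the --raw option is a simpler way to see the text unescaped.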

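As the result shows, the Chinese content was found successfully.

Deleting a document from the index

    127.0.0.1:6379> ft.del myidx doc1
    (integer) 1

A document is deleted by giving the index and the document ID.

Deleting an index

The "ft.drop" command deletes an entire index:

    127.0.0.1:6379> ft.drop myidx
    OK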

Viewing index details

The "ft.info" command returns information about an index:

    127.0.0.1:6379> ft.info myidx
     1) index_name
     2) myidx
     3) index_options
     4) (empty list or set)
     5) fields
     6) 1) 1) title
           2) type
           3) TEXT
           4) WEIGHT
           5) "5"
        2) 1) desc
           2) type
           3) TEXT
           4) WEIGHT
           5) "1"
     7) num_docs
     8) "2"
     9) max_doc_id
    10) "2"
    11) num_terms
    12) "9"
    13) num_records
    14) "18"
    15) inverted_sz_mb
    16) "0.000102996826171875"
    17) total_inverted_index_blocks
    18) "29"
    19) offset_vectors_sz_mb
    20) "1.71661376953125e-05"
    21) doc_table_size_mb
    22) "0.000164031982421875"
    23) sortable_values_size_mb
    24) "0"
    25) key_table_size_mb
    26) "8.0108642578125e-05"
    27) records_per_doc_avg
    28) "9"
    29) bytes_per_record_avg
    30) "6"
    31) offsets_per_term_avg
    32) "1"
    33) offset_bits_per_record_avg
    34) "8"
    35) gc_stats
    36)  1) bytes_collected
         2) "0"
         3) total_ms_run
         4) "16"
         5) total_cycles
         6) "14"
         7) avarage_cycle_time_ms
         8) "1.1428571428571428"
         9) last_run_time_ms
        10) "2"
        11) gc_numeric_trees_missed
        12) "0"
        13) gc_blocks_denied
        14) "0"
    37) cursor_stats
    38) 1) global_idle
        2) (integer) 0
        3) global_total
        4) (integer) 0
        5) index_capacity
        6) (integer) 128
        7) index_total
        8) (integer) 0
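The top level of this reply is a flat list that alternates between a property name and its value. As a minimal sketch of how a client might read it, the Java code below pairs the entries up into a map and looks up "num_docs". The class InfoReply and its toMap method are hypothetical helpers for this illustration, and the sample list is a hand-copied subset of the reply above rather than live output.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: pairs up a flat [name, value, name, value, ...] reply.
final class InfoReply {
    static Map<String, Object> toMap(List<?> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of entries");
        }
        Map<String, Object> info = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            // Even positions hold names, odd positions hold the matching values.
            info.put(String.valueOf(flat.get(i)), flat.get(i + 1));
        }
        return info;
    }

    public static void main(String[] args) {
        // A hand-copied subset of the FT.INFO reply shown above.
        List<Object> reply = Arrays.asList(
                "index_name", "myidx",
                "num_docs", "2",
                "max_doc_id", "2",
                "num_terms", "9");
        Map<String, Object> info = toMap(reply);
        long numDocs = Long.parseLong(String.valueOf(info.get("num_docs")));
        System.out.println("num_docs = " + numDocs); // prints num_docs = 2
    }
}
```

Nested values such as "fields" or "gc_stats" stay as they are in the map; a fuller reader would recurse into them.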

Here "num_docs" is the number of documents stored in the index.

Code in practice

RediSearch is supported by client libraries in several programming languages.

In this article we use JRediSearch to implement full-text search. First add the JRediSearch dependency to pom.xml:

    <!-- https://mvnrepository.com/artifact/com.redislabs/jredisearch -->
    <dependency>
        <groupId>com.redislabs</groupId>
        <artifactId>jredisearch</artifactId>
        <version>1.3.0</version>
    </dependency>

The complete code is as follows:

    import io.redisearch.client.AddOptions;
    import io.redisearch.client.Client;
    import io.redisearch.Document;
    import io.redisearch.SearchResult;
    import io.redisearch.Query;
    import io.redisearch.Schema;

    public class RediSearchExample {
        public static void main(String[] args) {
            // Connect to the Redis server and bind the client to the index "myidx"
            Client client = new Client("myidx", "127.0.0.1", 6379);
            // Define the index schema: title (weight 5.0) and desc (weight 1.0)
            Schema schema = new Schema()
                    .addTextField("title", 5.0)
                    .addTextField("desc", 1.0);
            // Drop any existing index; passing true avoids an error if it is missing
            client.dropIndex(true);
            // Create the index
            client.createIndex(schema, Client.IndexOptions.Default());
            // Set the document language to Chinese
            AddOptions addOptions = new AddOptions();
            addOptions.setLanguage("chinese");
            // Build the document
            Document document = new Document("doc1");
            document.set("title", "天气预报");
            document.set("desc", "今天的天气很好,是个阳光明媚的大晴天,有蓝蓝的天空和白白的云朵。");
            // Add the document to the index
            client.addDocument(document, addOptions);
            // Build the query
            Query q = new Query("天气")       // search term
                    .setLanguage("chinese")   // query language must also be Chinese
                    .limit(0, 5);             // return at most the first 5 results
            // Run the query
            SearchResult res = client.search(q);
            // Print the matching documents
            System.out.println(res.docs);
        }
    }

Running the program above produces:

    [{"id":"doc1","score":1.0,"properties":{"title":"天气预报","desc":"今天的天气很好,是个阳光明媚的大晴天,有蓝蓝的天空和白白的云朵。"}}]

The Chinese document that was added is returned correctly.

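Summary

In this article we started RediSearch both with Docker and by building it from source. To use RediSearch's full-text search you must first create an index, then add documents to the index, and then run the ft.search command. To search Chinese content, set the language to Chinese both when adding documents and when querying, by specifying language "chinese".

References & acknowledgements

Official site: http://redisearch.io

Project: https://github.com/RediSearch/RediSearch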
