
Testing the output of different Elasticsearch analyzers

Posted by zhangchap on 2023-04-01 in Diary

from elasticsearch import Elasticsearch

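# Connect to a local node. The URL is an assumption; adjust it for your cluster.
# (Newer client versions require an explicit host instead of Elasticsearch().)
es = Elasticsearch("http://localhost:9200")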

# Sample query, roughly "the best-reviewed cars at around 100,000 yuan"
text = "10万左右口碑最好的车"

# Analyze the text with Elasticsearch's built-in standard analyzer
tokens = es.indices.analyze(index="new_cars", body={'text': text, 'analyzer': 'standard'})

print("Tokens from the standard analyzer:")
for token in tokens['tokens']:
    print(token['token'])

# Analyze the text with ik_max_word (finest-grained split; requires the IK analysis plugin)
tokens = es.indices.analyze(index="new_cars", body={'text': text, 'analyzer': 'ik_max_word'})

print("\nTokens from the ik_max_word analyzer:")
for token in tokens['tokens']:
    print(token['token'])

# Analyze the text with ik_smart (coarsest-grained split; requires the IK analysis plugin)
tokens = es.indices.analyze(index="new_cars", body={'text': text, 'analyzer': 'ik_smart'})

print("\nTokens from the ik_smart analyzer:")
for token in tokens['tokens']:
    print(token['token'])
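The three blocks above differ only in the analyzer name, so the comparison can be folded into one loop. The following is a minimal sketch of that idea, not part of the original script: `token_strings`, `compare_analyzers` and `fake_analyze` are hypothetical names, and `fake_analyze` is a made-up stand-in that only mimics the `{"tokens": [{"token": ...}]}` shape of the analyze response so the example runs without a cluster. It does not reproduce real analyzer output.

```python
from typing import Callable, Dict, Iterable, List


def token_strings(response) -> List[str]:
    """Pull the token text out of an analyze-style response."""
    return [item["token"] for item in response["tokens"]]


def compare_analyzers(
    analyze: Callable[[str, str], dict],
    text: str,
    analyzers: Iterable[str],
) -> Dict[str, List[str]]:
    """Run `text` through each analyzer and collect the token lists by name."""
    return {name: token_strings(analyze(text, name)) for name in analyzers}


def fake_analyze(text: str, analyzer: str) -> dict:
    # Hypothetical stand-in for Elasticsearch, for illustration only:
    # "per_char" emits one token per non-space character,
    # any other name splits on whitespace.
    if analyzer == "per_char":
        parts = list(text.replace(" ", ""))
    else:
        parts = text.split()
    return {"tokens": [{"token": p, "position": i} for i, p in enumerate(parts)]}


if __name__ == "__main__":
    result = compare_analyzers(fake_analyze, "new cars", ["per_char", "whitespace"])
    for name, tokens in result.items():
        print(f"{name}: {' | '.join(tokens)}")
```

Against a real cluster, the same loop should work by passing a small wrapper around the call used above, for example `lambda text, analyzer: es.indices.analyze(index="new_cars", body={"text": text, "analyzer": analyzer})`, together with the list `["standard", "ik_max_word", "ik_smart"]`.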


