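Deploying Elasticsearch and Kibana is omitted here. The examples below assume the IK and pinyin analysis plugins are installed.
Create the index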
PUT /nba_20220101
{
  "mappings": {
    "properties": {
      "age": {"type": "integer"},
      "birthDay": {"type": "date"},
      "birthDayStr": {"type": "keyword"},
      "code": {"type": "text"},
      "country": {"type": "keyword"},
      "countryEn": {"type": "keyword"},
      "displayAffiliation": {"type": "text"},
      "displayName": {
        "type": "text",
        "analyzer": "ik_max_word_pinyin",
        "fields": {
          "suggest": {
            "type": "completion",
            "analyzer": "ik_smart_pinyin",
            "preserve_separators": true,
            "preserve_position_increments": true,
            "max_input_length": 50
          }
        }
      },
      "displayNameEn": {"type": "text"},
      "draft": {"type": "long"},
      "heightValue": {"type": "float"},
      "jerseyNo": {"type": "keyword"},
      "playYear": {"type": "long"},
      "playerId": {"type": "keyword"},
      "position": {"type": "text"},
      "schoolType": {"type": "text"},
      "teamCity": {"type": "text"},
      "teamCityEn": {"type": "text"},
      "teamConference": {"type": "keyword"},
      "teamConferenceEn": {"type": "keyword"},
      "teamName": {"type": "keyword"},
      "teamNameEn": {"type": "keyword"},
      "weight": {"type": "text"}
    }
  },
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default": {"type": "ik_max_word"}
        }
      }
    },
    "analysis": {
      "analyzer": {
        "ik_smart_pinyin": {"type": "custom", "tokenizer": "ik_smart", "filter": ["my_pinyin"]},
        "ik_max_word_pinyin": {"type": "custom", "tokenizer": "ik_max_word", "filter": ["my_pinyin"]}
      },
      "filter": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_separate_first_letter": false,
          "keep_full_pinyin": true,
          "keep_original": true,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "remove_duplicated_term": true
        }
      }
    }
  }
}
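Custom ik_smart_pinyin and ik_max_word_pinyin analyzers
- The two custom analyzers combine the my_pinyin filter with the ik_smart and ik_max_word tokenizers, respectively.
- my_pinyin defines the pinyin token filter.
- The displayName field uses ik_max_word_pinyin so that it can be searched by both pinyin and Chinese characters, and it adds a suggest sub-field for search completion.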
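How the two custom analyzers are assembled can be sketched in Python. This is a minimal sketch that only builds the `analysis` settings as a plain dict and checks that every filter an analyzer refers to is actually defined. The helper names `build_analysis` and `undefined_filters` are hypothetical, and nothing is sent to Elasticsearch.

```python
import json


def build_analysis():
    """Build the `analysis` settings used by the index above (plain dict)."""
    my_pinyin = {
        "type": "pinyin",
        "keep_separate_first_letter": False,
        "keep_full_pinyin": True,
        "keep_original": True,
        "limit_first_letter_length": 16,
        "lowercase": True,
        "remove_duplicated_term": True,
    }

    def custom(tokenizer):
        # A custom analyzer is one tokenizer followed by a chain of token filters.
        return {"type": "custom", "tokenizer": tokenizer, "filter": ["my_pinyin"]}

    return {
        "analyzer": {
            "ik_smart_pinyin": custom("ik_smart"),
            "ik_max_word_pinyin": custom("ik_max_word"),
        },
        "filter": {"my_pinyin": my_pinyin},
    }


def undefined_filters(analysis):
    """Return filter names used by analyzers but missing under `filter`.

    Assumption: only user-defined filters are used; built-in filters such as
    `lowercase` would need to be whitelisted before relying on this check.
    """
    defined = set(analysis.get("filter", {}))
    used = set()
    for analyzer in analysis.get("analyzer", {}).values():
        used.update(analyzer.get("filter", []))
    return sorted(used - defined)


analysis = build_analysis()
print(json.dumps(analysis, indent=2))
print(undefined_filters(analysis))
```

A check like this catches a misspelled filter name before the index is created, which is cheaper than debugging a rejected PUT request.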
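Optional parameters of the pinyin plugin
- keep_first_letter: keeps the first letter of each syllable, e.g. 刘德华 > ldh. Default: true.
- keep_separate_first_letter: keeps the first letters as separate terms, e.g. 刘德华 > l, d, h. Default: false. Note: these terms are so frequent that query results may become too fuzzy.
- limit_first_letter_length: maximum length of the first_letter result. Default: 16.
- keep_full_pinyin: keeps the full pinyin of each character, e.g. 刘德华 > [liu, de, hua]. Default: true.
- keep_joined_full_pinyin: keeps the joined full pinyin, e.g. 刘德华 > [liudehua]. Default: false.
- keep_none_chinese: keeps non-Chinese letters and digits in the result. Default: true.
- keep_none_chinese_together: keeps non-Chinese letters together. Default: true, e.g. DJ音乐家 -> DJ, yin, yue, jia; when set to false, DJ音乐家 -> D, J, yin, yue, jia. Note: keep_none_chinese must be enabled first.
- keep_none_chinese_in_first_letter: keeps non-Chinese letters in the first-letter term, e.g. 刘德华AT2016 -> ldhat2016. Default: true.
- keep_none_chinese_in_joined_full_pinyin: keeps non-Chinese letters in the joined full pinyin, e.g. 刘德华2016 -> liudehua2016. Default: false.
- none_chinese_pinyin_tokenize: splits non-Chinese letters into separate pinyin terms when they form pinyin. Default: true, e.g. liudehuaalibaba13zhuanghan -> liu, de, hua, a, li, ba, ba, 13, zhuang, han. Note: keep_none_chinese and keep_none_chinese_together must be enabled first.
- keep_original: also keeps the original input. Default: false.
- lowercase: lowercases non-Chinese letters. Default: true.
- trim_whitespace: default: true.
- remove_duplicated_term: removes duplicated terms to save index space, e.g. de的 > de. Default: false. Note: position-related queries may be affected.
- ignore_pinyin_offset: since 6.0, offsets are strictly checked and overlapping tokens are not allowed. With this parameter, overlapping tokens are allowed by ignoring their offsets. Note that all position-related queries and highlighting will then be incorrect, so use multi-fields with different settings for different query purposes. Set it to false if offsets are needed. Default: true.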
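The effect of a few of these options can be illustrated with a toy Python function. This is not the plugin's implementation: the `READINGS` table is hand-written for 刘德华 only, the function name `pinyin_terms` is hypothetical, and the order of the emitted terms is an arbitrary choice of this sketch.

```python
# Hand-written readings for this example only; the real plugin ships its own dictionary.
READINGS = {"刘": "liu", "德": "de", "华": "hua"}


def pinyin_terms(text, keep_first_letter=True, keep_full_pinyin=True,
                 keep_joined_full_pinyin=False, keep_original=False,
                 limit_first_letter_length=16):
    """Toy emulation of a few pinyin filter options for text covered by READINGS."""
    syllables = [READINGS[ch] for ch in text]
    terms = []
    if keep_full_pinyin:
        # One term per character: liu, de, hua
        terms.extend(syllables)
    if keep_joined_full_pinyin:
        # All syllables joined into one term: liudehua
        terms.append("".join(syllables))
    if keep_original:
        terms.append(text)
    if keep_first_letter:
        # First letter of each syllable, truncated to the configured length: ldh
        first = "".join(s[0] for s in syllables)
        terms.append(first[:limit_first_letter_length])
    return terms


print(pinyin_terms("刘德华"))
print(pinyin_terms("刘德华", keep_full_pinyin=False, keep_joined_full_pinyin=True))
```

With the defaults listed above, the first call yields the per-character syllables plus the first-letter term; the second swaps the per-character syllables for the joined form.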
Import data
POST /nba_20220101/_doc/566
{
  "countryEn": "Croatia",
  "teamName": "快船",
  "birthDay": 858661200000,
  "country": "克罗地亚",
  "teamCityEn": "LA",
  "code": "ivica_zubac",
  "displayAffiliation": "Croatia",
  "displayName": "伊维察 祖巴茨哥哥",
  "schoolType": "",
  "teamConference": "西部",
  "teamConferenceEn": "Western",
  "weight": "108.9 公斤",
  "teamCity": "洛杉矶",
  "playYear": 3,
  "jerseyNo": "40",
  "teamNameEn": "Clippers",
  "draft": 2016,
  "displayNameEn": "Ivica Zubac",
  "heightValue": 2.16,
  "birthDayStr": "1997-03-18",
  "position": "中锋",
  "age": 22,
  "playerId": "1627826"
}
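When there are many players to import, the single-document request above is usually replaced by the `_bulk` API. The sketch below only builds the NDJSON body in Python. The helper name `bulk_payload` is hypothetical, and using `playerId` as the document `_id` is an assumption of this sketch (the request above uses 566 instead).

```python
import json


def bulk_payload(index, docs, id_field="playerId"):
    """Serialize docs as NDJSON for the _bulk API: an action line, then a source line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc[id_field]}}))
        # ensure_ascii=False keeps Chinese text readable in the payload.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The _bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# A trimmed version of the document above, for illustration only.
players = [
    {"playerId": "1627826", "displayName": "伊维察 祖巴茨哥哥", "teamName": "快船"},
]
print(bulk_payload("nba_20220101", players), end="")
```

The resulting string would be sent as the body of `POST /_bulk` with the `Content-Type: application/x-ndjson` header.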
Verify that Chinese-character analysis works
GET /nba_20220101/_search
{"query": {"match": {"displayName": "伊维察"}}
}

Verify that pinyin analysis works
GET /nba_20220101/_search
{"query": {"match": {"displayName": "yi wei"}}
}
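If a match query returns nothing, it can help to look at the tokens an analyzer actually produces through the `_analyze` API. The following Python sketch builds such a request without sending it. The address `http://localhost:9200` (a local node without authentication) and the helper name `analyze_request` are assumptions of this sketch.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no authentication


def analyze_request(index, analyzer, text):
    """Build (but do not send) a POST <index>/_analyze request."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{ES_URL}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = analyze_request("nba_20220101", "ik_max_word_pinyin", "伊维察")
print(req.get_method(), req.full_url)

# To run it against a live cluster:
# with urllib.request.urlopen(req) as resp:
#     tokens = [t["token"] for t in json.load(resp)["tokens"]]
```

Running the same text through ik_max_word_pinyin and ik_smart_pinyin makes it easy to compare what is indexed for search with what is indexed for completion.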

Verify that Chinese-character completion works
POST /nba_20220101/_search
{"suggest": {"my-suggestion": {"text": "伊维","completion": {"field": "displayName.suggest"}}}
}
Verify that pinyin completion works
POST /nba_20220101/_search
{"suggest": {"my-suggestion": {"text": "yi wei","completion": {"field": "displayName.suggest"}}}
}
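On the client side, the suggestions have to be pulled out of the `suggest` section of the response. The sketch below shows one way to do that in Python. The helper names `completion_body` and `suggested_texts` are hypothetical, and the `sample` response is a hand-made, trimmed illustration rather than real output.

```python
def completion_body(prefix, field="displayName.suggest", name="my-suggestion"):
    """Build a completion-suggester request body like the ones above."""
    return {"suggest": {name: {"text": prefix, "completion": {"field": field}}}}


def suggested_texts(response, name="my-suggestion"):
    """Collect option texts from a suggest response, preserving their order."""
    texts = []
    for entry in response.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            texts.append(option["text"])
    return texts


# Hypothetical, trimmed response for illustration; a real one carries more fields.
sample = {
    "suggest": {
        "my-suggestion": [
            {
                "text": "yi wei",
                "offset": 0,
                "length": 6,
                "options": [{"text": "伊维察 祖巴茨哥哥", "_id": "566"}],
            }
        ]
    }
}

print(completion_body("yi wei"))
print(suggested_texts(sample))
```

Returning an empty list when the `suggest` section is missing keeps the caller simple when a prefix has no matches.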
