Performs the analysis process on a text and returns the token breakdown of the text

Analyze

Performs the analysis process on a text and returns the token breakdown of the text.

It can be used without specifying an index, against one of the many built-in analyzers:

GET _analyze
{
  "analyzer" : "standard",
  "text" : "this is a test"
}

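As a rough client-side illustration, the same request can be assembled from Python with only the standard library. This is a minimal sketch, not an official client: the helper name build_analyze_request and the host http://localhost:9200 are assumptions, and POST is used here because the endpoint accepts it as well as GET. The sketch only builds the request; the commented lines show how it could be sent to a running node.

```python
import json
import urllib.request


def build_analyze_request(body, index=None, host="http://localhost:9200"):
    """Build (but do not send) an HTTP request for the _analyze endpoint.

    `host` is an assumed local node address; `index` is optional.
    """
    # With an index the path is /<index>/_analyze, otherwise /_analyze.
    path = f"/{index}/_analyze" if index else "/_analyze"
    return urllib.request.Request(
        host + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_analyze_request({"analyzer": "standard", "text": "this is a test"})
print(request.get_method(), request.full_url)

# Sending the request requires a node reachable at `host`:
# with urllib.request.urlopen(request) as response:
#     for token in json.load(response)["tokens"]:
#         print(token["token"])
```

Passing index="twitter" to the helper targets a specific index instead of the index-less endpoint.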

If the text parameter is provided as an array of strings, it is analyzed as a multi-valued field.

GET _analyze
{
  "analyzer" : "standard",
  "text" : ["this is a test", "the second text"]
}


A custom transient analyzer can also be built out of a tokenizer, token filters, and char filters. Token filters use the shorter filter parameter name:

GET _analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "text" : "this is a test"
}


GET _analyze
{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "char_filter" : ["html_strip"],
  "text" : "this is a <b>test</b>"
}

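To make the order of the three stages concrete, here is a small pure-Python imitation of the request above: char filters run on the raw text first, then the tokenizer splits it, then token filters run on the tokens. This is only an illustration under stated assumptions, not how Elasticsearch implements analysis. In particular, the html_strip function below is a crude regex stand-in for the real char filter, and all function names are hypothetical.

```python
import re


def html_strip(text):
    # Char filter stage (rough approximation): drop anything that looks like a tag.
    return re.sub(r"<[^>]+>", "", text)


def keyword_tokenizer(text):
    # Tokenizer stage: the keyword tokenizer emits the whole input as a single token.
    return [text]


def lowercase(tokens):
    # Token filter stage: lowercase every token.
    return [token.lower() for token in tokens]


def analyze(text, char_filters, tokenizer, token_filters):
    # 1. char filters transform the raw text
    for char_filter in char_filters:
        text = char_filter(text)
    # 2. the tokenizer turns the text into tokens
    tokens = tokenizer(text)
    # 3. token filters transform the token list
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


print(analyze("this is a <b>test</b>", [html_strip], keyword_tokenizer, [lowercase]))
# prints ['this is a test']
```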

Deprecated in 5.0.0.

Use filter/char_filter instead of filters/char_filters. The token_filters parameter has been removed.

Custom tokenizers, token filters, and character filters can be specified in the request body as follows:

GET _analyze
{
  "tokenizer" : "whitespace",
  "filter" : ["lowercase", {"type": "stop", "stopwords": ["a", "is", "this"]}],
  "text" : "this is a test"
}

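The filter array above mixes a filter referenced by name ("lowercase") with one defined inline as an object. The following sketch mimics that request locally to show what each entry contributes. It is a simplified imitation, assuming that splitting on whitespace and exact-match stopword removal are close enough for illustration. The names whitespace_tokenizer and make_filter are hypothetical and support only the two filters used here.

```python
def whitespace_tokenizer(text):
    # Split on runs of whitespace, roughly like the whitespace tokenizer.
    return text.split()


def make_filter(definition):
    """Turn a filter name or an inline definition into a function over tokens."""
    if definition == "lowercase":
        return lambda tokens: [token.lower() for token in tokens]
    if isinstance(definition, dict) and definition.get("type") == "stop":
        stopwords = set(definition["stopwords"])
        return lambda tokens: [token for token in tokens if token not in stopwords]
    raise ValueError(f"unsupported filter in this sketch: {definition!r}")


body = {
    "tokenizer": "whitespace",
    "filter": ["lowercase", {"type": "stop", "stopwords": ["a", "is", "this"]}],
    "text": "this is a test",
}

tokens = whitespace_tokenizer(body["text"])
for definition in body["filter"]:
    # Filters are applied in the order they are listed.
    tokens = make_filter(definition)(tokens)

print(tokens)  # prints ['test']
```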

It can also run against a specific index:

GET twitter/_analyze
{
  "text" : "this is a test"
}


The above runs an analysis on the text "this is a test", using the default index analyzer associated with the twitter index. An analyzer parameter can also be provided to use a different analyzer:

GET twitter/_analyze
{
  "analyzer" : "whitespace",
  "text" : "this is a test"
}


Also, the analyzer can be derived based on a field mapping, for example:

GET twitter/_analyze
{
  "field" : "obj1.field1",
  "text" : "this is a test"
}


This causes the analysis to use the analyzer configured in the mapping for obj1.field1 (or, if none is configured, the default index analyzer).

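The fallback described above can be pictured with a short sketch. It walks a mapping's nested properties along the dotted field path and returns the field's analyzer, or the index default when the field has none. This illustrates the rule only and is not Elasticsearch's actual lookup code. The function name resolve_analyzer, the sample mapping, and the "standard" default are assumptions made for the example.

```python
def resolve_analyzer(properties, field_path, index_default="standard"):
    """Return the analyzer mapped for a dotted field path, else the index default."""
    node = {"properties": properties}
    for part in field_path.split("."):
        # Object fields nest their sub-fields under "properties".
        node = node.get("properties", {}).get(part)
        if node is None:
            return index_default  # the field is not mapped at all
    return node.get("analyzer", index_default)


# Hypothetical mapping in which obj1.field1 uses the whitespace analyzer.
properties = {
    "obj1": {"properties": {"field1": {"type": "text", "analyzer": "whitespace"}}}
}

print(resolve_analyzer(properties, "obj1.field1"))  # prints whitespace
print(resolve_analyzer(properties, "obj1.other"))   # prints standard
```
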
Deprecated in 5.1.0. Request parameters are deprecated and will be removed in the next major release. Please use JSON parameters instead of request parameters.

All parameters can also be supplied as request parameters. For example:

GET /_analyze?tokenizer=keyword&filter=lowercase&text=this+is+a+test

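Because request parameters are deprecated in favour of a JSON body, a query string like the one above can be translated mechanically. The sketch below shows one way to do that with the Python standard library. The helper name query_to_body is hypothetical, and the sketch assumes that filter and char_filter values are comma-separated lists while every other parameter is a single string.

```python
import json
from urllib.parse import parse_qs, urlsplit

# Parameters that are written as arrays in the JSON body (assumption of this sketch).
LIST_PARAMS = {"filter", "char_filter"}


def query_to_body(url):
    """Convert the query string of an _analyze URL into an equivalent JSON body."""
    params = parse_qs(urlsplit(url).query)  # also decodes '+' as a space
    body = {}
    for name, values in params.items():
        if name in LIST_PARAMS:
            # Repeated or comma-separated values become one array.
            body[name] = [item for value in values for item in value.split(",")]
        else:
            body[name] = values[0]
    return body


body = query_to_body("/_analyze?tokenizer=keyword&filter=lowercase&text=this+is+a+test")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON body of a request to _analyze in place of the query string.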

For backwards compatibility, we also accept the text parameter as the body of the request, provided it doesn't start with {:

curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&filter=lowercase&char_filter=reverse' -d 'this is a test' -H 'Content-Type: text/plain'

Deprecated in 5.1.0. Passing the text parameter as the body of the request is deprecated, and this feature will be removed in the next major release. Please use the JSON text parameter instead.

Time: 2024-10-10 06:27:34
