Getting Started with Elasticsearch, a Distributed Search Engine

Official site: https://www.elastic.co/products/elasticsearch/

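I. Features

1. Chinese word segmentation (through analysis plugins)

2. Full-text search over multiple data sources

3. Distributed

4. Open-source search engine built on Lucene

5. RESTful API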

II. Resources

smartcn, the default Chinese analyzer: https://github.com/elasticsearch/elasticsearch-analysis-smartcn
mmseg: https://github.com/medcl/elasticsearch-analysis-mmseg
ik: https://github.com/medcl/elasticsearch-analysis-ik
pinyin, a pinyin analyzer that can suggest Chinese text from pinyin input: https://github.com/medcl/elasticsearch-analysis-pinyin
stconvert, converts between Simplified and Traditional Chinese: https://github.com/medcl/elasticsearch-analysis-stconvert
elasticsearch-servicewrapper: https://github.com/elasticsearch/elasticsearch-servicewrapper
Elastic HQ, a monitoring tool for Elasticsearch: http://www.elastichq.org
elasticsearch-rtf: https://github.com/medcl/elasticsearch-rtf

III. Installation

Server: Linux (CentOS 6.x)
Java: JDK 1.8.0
elasticsearch: 2.3.1
elasticsearch-jdbc (data source importer): 2.3.1
IK Analysis (Chinese word segmentation plugin): 1.9.1

1. Install Java

yum install java-1.8.0-openjdk

2. Install Elasticsearch

# Create the repo file (elasticsearch.repo); ">" overwrites, so rerunning does not duplicate the section
cat > /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=https://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
EOF

# Import the GPG key, then install:
rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
yum install elasticsearch

3. Create the data and log directories

mkdir -p  /data/elasticsearch/data
mkdir -p  /data/elasticsearch/logs
chown -R elasticsearch /data/elasticsearch/data
chown -R elasticsearch /data/elasticsearch/logs

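4. Write the configuration file (/etc/elasticsearch/elasticsearch.yml)

# Cluster name (must be identical on every node in the same cluster)
cluster.name: my-application
# Node name (must differ on every node)
node.name: node-1
# Data directory
path.data: /data/elasticsearch/data
# Log directory
path.logs: /data/elasticsearch/logs
# Address to bind to (0.0.0.0 listens on all interfaces)
network.host: 0.0.0.0
# HTTP port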
http.port: 9200
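
As a rough sketch, the settings above can be rendered by a small shell helper, which makes it easier to give each node its own name. The function name `write_es_config` and its arguments are hypothetical and not part of Elasticsearch; the paths and values are the ones used in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render the settings from this section into a config file.
# Usage: write_es_config <target-file> <cluster-name> <node-name>
write_es_config() {
    local target="$1" cluster="$2" node="$3"
    # ">" replaces the file, so the helper can be rerun safely.
    cat > "$target" <<EOF
cluster.name: ${cluster}
node.name: ${node}
path.data: /data/elasticsearch/data
path.logs: /data/elasticsearch/logs
network.host: 0.0.0.0
http.port: 9200
EOF
}
```

For example, running `write_es_config /etc/elasticsearch/elasticsearch.yml my-application node-1` as root would produce the same settings as the file shown above, without the comments.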

IV. Installing IK

1. Install Maven

wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
yum install apache-maven

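2. Download the IK source

git clone https://github.com/medcl/elasticsearch-analysis-ik
cd elasticsearch-analysis-ik
# Check out the release that matches the versions listed above (IK 1.9.1)
git checkout v1.9.1

3. Build the plugin package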

mvn clean
mvn compile
mvn package

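# Unpack the built release straight into the Elasticsearch plugin directory
mkdir -p /usr/share/elasticsearch/plugins/ik
unzip target/releases/elasticsearch-analysis-ik-*.zip -d /usr/share/elasticsearch/plugins/ik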

4. Configure the dictionaries (IK ships with a Sogou dictionary)

Edit /usr/share/elasticsearch/plugins/ik/config/ik/IKAnalyzer.cfg.xml:

<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>

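Copy the jar into the plugins/analysis-ik directory of Elasticsearch, then copy the extracted ik directory (configuration, dictionaries, and so on) into the Elasticsearch config directory. Then edit elasticsearch.yml and append this line:

index.analysis.analyzer.ik.type : "ik"

Restart the service:

service elasticsearch restart

Then load data and create an index.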
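
To make the index-creation step concrete, here is a minimal sketch of an index body that applies the `ik` analyzer configured above to one field, written in the Elasticsearch 2.x mapping syntax. The helper name `ik_index_body` and the field name `content` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an index-creation body that analyzes one string
# field with the "ik" analyzer.
# Usage: ik_index_body <type-name> <field-name>
ik_index_body() {
    local type="$1" field="$2"
    cat <<EOF
{
  "mappings": {
    "${type}": {
      "properties": {
        "${field}": { "type": "string", "analyzer": "ik" }
      }
    }
  }
}
EOF
}

# Example (needs a running node, so it is left commented out):
# ik_index_body external content | curl -XPUT 'localhost:9200/customer' -d @-
```

The body is generated by a function rather than pasted inline so the same mapping can be reused for several fields or types.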

V. elasticsearch-jdbc

1. Using the feeder approach

wget http://xbib.org/repository/org/xbib/elasticsearch/importer/elasticsearch-jdbc/2.3.1.0/elasticsearch-jdbc-2.3.1.0-dist.zip
unzip elasticsearch-jdbc-2.3.1.0-dist.zip

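Create the import script import.sh:

# Point this at the directory created by the unzip above
export JDBC_IMPORTER_HOME=/elasticsearch-jdbc-2.3.1.0

bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
# Pipe the import definition to the importer; "index" and "type" inside
# "jdbc" name the target Elasticsearch index and type
echo '{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://127.0.0.1:3306/dbtest",
        "user" : "root",
        "password" : "123456",
        "sql" : "select * from test_tb",
        "index" : "customer",
        "type" : "external"
    }
}' | java \
    -cp "${lib}/*" \
    -Dlog4j.configurationFile=${bin}/log4j2.xml \
    org.xbib.tools.Runner \
    org.xbib.tools.JDBCImporter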
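
The script above hard-codes the database password. One possible variation, sketched here, reads the credentials from environment variables and prints the same definition. The function name `render_importer_definition` and the variables `DB_USER` and `DB_PASSWORD` are hypothetical names chosen for this example; the importer itself does not define them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the importer definition, taking credentials from
# the environment instead of the script text.
# Values are inserted verbatim, so they must not contain double quotes or backslashes.
render_importer_definition() {
    # Abort with a message if either variable is unset or empty.
    : "${DB_USER:?DB_USER is not set}"
    : "${DB_PASSWORD:?DB_PASSWORD is not set}"
    cat <<EOF
{
  "type" : "jdbc",
  "jdbc" : {
    "url" : "jdbc:mysql://127.0.0.1:3306/dbtest",
    "user" : "${DB_USER}",
    "password" : "${DB_PASSWORD}",
    "sql" : "select * from test_tb",
    "index" : "customer",
    "type" : "external"
  }
}
EOF
}

# In import.sh the echo '{...}' part would become (left commented out here):
# render_importer_definition | java -cp "${lib}/*" \
#     -Dlog4j.configurationFile=${bin}/log4j2.xml \
#     org.xbib.tools.Runner org.xbib.tools.JDBCImporter
```

With this variation the script can be committed or shared without exposing the password, at the cost of having to export the two variables before each run.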

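Test:

curl 'localhost:9200/customer/external/_search?pretty&q=*'

2. Using a river

Rivers were deprecated in Elasticsearch 1.5 and removed in 2.0, so this approach only works on 1.x; the steps below use 1.4.2.

# Install Elasticsearch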
curl -OL https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.2.zip

cd $ES_HOME
unzip path/to/elasticsearch-1.4.2.zip

# Install the JDBC plugin
./bin/plugin --install jdbc --url http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-river-jdbc/1.4.0.6/elasticsearch-river-jdbc-1.4.0.6-plugin.zip

# Download the MySQL driver, unpack it, and copy the jar into the plugin directory
curl -o mysql-connector-java-5.1.33.zip -L 'http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.33.zip/from/http://cdn.mysql.com/'
unzip mysql-connector-java-5.1.33.zip
cp mysql-connector-java-5.1.33/mysql-connector-java-5.1.33-bin.jar $ES_HOME/plugins/jdbc/
chmod 644 $ES_HOME/plugins/jdbc/*

# Start Elasticsearch
./bin/elasticsearch

# Stop the river
curl -XDELETE 'localhost:9200/_river/my_jdbc_river/'

JDBC plugin parameters

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://localhost:3306/test",
        "user" : "",
        "password" : "",
        "sql" : "select * from orders",
        "index" : "myindex",
        "type" : "mytype",
        ...
    }
}'

Multiple river sources are also possible if an array is passed to the jdbc field:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
     <river parameters>
    "type" : "jdbc",
    "jdbc" : [ {
         <river definition 1>
    }, {
         <river definition 2>
    } ]
}'

A complete example:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
     "type" : "jdbc",
     "jdbc" : {
         "driver" : "com.mysql.jdbc.Driver",
         "url" : "jdbc:mysql://localhost:3306/test",
         "user" : "root",
         "password" : "123456",
         "sql" : "select * from test.student;",
         "interval" : "30",
         "index" : "test",
         "type" : "student"
     }
 }'

Check whether Elasticsearch has synced the data:

curl -XGET 'localhost:9200/test/student/_search?pretty&q=*'
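
A document count is often easier to check in a script than a full search response. The sketch below assumes the response of the `_count` endpoint contains a `"count"` field, as it does in Elasticsearch 1.x and 2.x. `extract_count` is a hypothetical helper name, and the sed pattern is a simplification rather than a real JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a _count response on stdin and print the number
# stored in its "count" field (prints nothing if the field is absent).
extract_count() {
    sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Example against a live node (left commented out because it needs a running server):
# curl -s 'localhost:9200/test/student/_count' | extract_count
```

Comparing the printed number with `select count(*) from test.student` on the MySQL side gives a quick sanity check that the sync has caught up.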

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html


Date: 2024-10-21 00:18:08
