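Solr 4.5 schema.xml configuration file

The schema.xml configuration file defines the structure of the index, much like a table definition in a database.

Opening schema.xml for the first time, you may be intimidated by the dense markup. There is no need to panic: it really comes down to two things, field and fieldType.

1. field: similar to a column in a database table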

<fields>
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" omitNorms="true" default="df"/>
  <!-- ... omitted ... -->
  <field name="_version_" type="long" indexed="true" stored="true"/><!-- Best not to delete this field. If you must, comment out updateLog in solrconfig.xml, but that is not recommended. -->
</fields>

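Attributes:
(1) name: the field name.
(2) type: the field type. This is not a Java type but one of the fieldType definitions described below.
(3) indexed: whether to index the field. true: Solr indexes the field, and only indexed fields can be searched or sorted on; false: not indexed.
(4) stored: whether to store the original value. Set it to true when the field must be displayed on a page (returned in results), otherwise false.
(5) required: whether the field is mandatory. true: adding a document with no value for this field raises an exception; false: optional.
(6) multiValued: whether the field can hold more than one value.
(7) omitNorms: whether to omit norms for this field. true disables length normalization and index-time boosting for the field and saves memory. It does not control analysis: for exact matching on the whole value, use a non-analyzed type such as string (solr.StrField) with indexed="true"; omitNorms="true" is a common companion setting on such fields.
(8) default: the value used when a document does not supply one.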
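
To make these attributes concrete, here is a small sketch of two field declarations. The field names sku and tags are hypothetical and exist only to illustrate the attributes above; string and text_general are the type names used elsewhere in this article.

```xml
<!-- Hypothetical fields, for illustration only -->
<!-- Exact-match identifier: not analyzed (string type), searchable, returned in results -->
<field name="sku" type="string" indexed="true" stored="true" required="true" omitNorms="true"/>
<!-- Full-text field that may hold several values per document; searchable but not returned -->
<field name="tags" type="text_general" indexed="true" stored="false" multiValued="true"/>
```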

2. fieldType: field types

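<types>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
  <!-- ... omitted ... -->
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</types>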

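Notes:
(1) name: the type name; the type attribute of a <field> refers to this name.
(2) class: the Solr class that implements the type, for example solr.StrField or solr.TextField.
(3) <analyzer type="index"> defines the tokenizer and filters used when building the index.
(4) <analyzer type="query"> defines the tokenizer and filters used when searching.
(5) <tokenizer/> defines the tokenizer.
(6) <filter/> defines a filter.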
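
As a sketch of how a tokenizer and a filter combine, the type below treats the whole value as a single token and lowercases it, which gives case-insensitive exact matching. The type name lowercase_exact is hypothetical; solr.KeywordTokenizerFactory and solr.LowerCaseFilterFactory are standard Solr analysis classes. A single <analyzer> with no type attribute is applied at both index time and query time.

```xml
<!-- Hypothetical type name; the analysis classes are standard Solr factories -->
<fieldType name="lowercase_exact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Emit the entire input as one token instead of splitting it into words -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Lowercase that token so "ABC-1" and "abc-1" match -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```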

3. uniqueKey

<uniqueKey>id</uniqueKey>
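This is similar to the id of a row in a database table. It is best to define one field in the Solr index that uniquely identifies each document. Solr uses it mainly to delete a document, and also to overwrite an existing document when one with the same key is added again.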
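
Because the uniqueKey identifies a document, deletion by ID can be expressed as an XML update message posted to the core's update handler. A minimal sketch, assuming a document whose id is the hypothetical value 1001:

```xml
<!-- Remove the document whose uniqueKey (id) equals 1001; the value is hypothetical -->
<delete>
  <id>1001</id>
</delete>
```

The deletion becomes visible to searches only after a commit.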

4. <copyField/>

<copyField source="cat" dest="text"/>
In real projects, to simplify querying, we often merge several searchable fields into a single field.

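Example:

In product search, a keyword should match not only the product title but also the product keywords and description. When building the index, you can therefore copy the title, keywords, and description into a single field named text and search that field directly.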

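<fields>
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
  <field name="title" type="text_general" indexed="true" stored="true"/>
  <field name="keywords" type="text_general" indexed="true" stored="true" omitNorms="true"/>
  <field name="description" type="string" indexed="true" stored="true" multiValued="true"/>
  <!-- Destination of the copyField rules below; it must be declared, and multiValued because several sources are copied into it -->
  <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
</fields>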

<copyField source="title" dest="text"/>
<copyField source="keywords" dest="text"/>
<copyField source="description" dest="text"/>
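
To search the merged field without naming it in every query, the request handler in solrconfig.xml can declare it as the default search field. A sketch, assuming the usual /select handler and the text field name from the example above:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- df = default field: unqualified query terms are matched against "text" -->
    <str name="df">text</str>
  </lst>
</requestHandler>
```

With this in place, a query such as q=solr is run against the text field, and therefore against the copied title, keywords, and description.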

For further details, study the schema.xml file itself.

Permalink: http://www.luoshengsha.com/213.html

Date: 2024-12-28 23:08:02
