Lucene Full-Text Search: Create, Delete, Update, and Query Examples

  Creating an index

The previous post already covered the general flow of creating an index with Lucene, so here is only a brief recap:

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
Directory directory = FSDirectory.open(new File("/tmp/testindex"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
IndexWriter iwriter = new IndexWriter(directory, config);
Document doc = new Document();
String text = "This is the text to be indexed.";
doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
// The document must be handed to the writer, otherwise nothing is indexed
iwriter.addDocument(doc);
iwriter.close();

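  1 Create a Directory that points to the index location

  2 Create an analyzer and an IndexWriter

  3 Create Document objects to hold the data and add them to the IndexWriter

  4 Close the IndexWriter, which commits the changes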

    /**
     * Build the index
     *
     * @throws Exception
     */
    public static void index() throws Exception {

        String text1 = "hello,man!";
        String text2 = "goodbye,man!";
        String text3 = "hello,woman!";
        String text4 = "goodbye,woman!";

        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text1", Store.YES));
        doc1.add(new TextField("content", text1, Store.YES));
        indexWriter.addDocument(doc1);

        Document doc2 = new Document();
        doc2.add(new TextField("filename", "text2", Store.YES));
        doc2.add(new TextField("content", text2, Store.YES));
        indexWriter.addDocument(doc2);

        Document doc3 = new Document();
        doc3.add(new TextField("filename", "text3", Store.YES));
        doc3.add(new TextField("content", text3, Store.YES));
        indexWriter.addDocument(doc3);

        Document doc4 = new Document();
        doc4.add(new TextField("filename", "text4", Store.YES));
        doc4.add(new TextField("content", text4, Store.YES));
        indexWriter.addDocument(doc4);

        indexWriter.commit();
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Index creation took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }

  Adding to an index incrementally

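Lucene supports incremental indexing: new documents can be added without affecting what has already been indexed, and Lucene merges the index files automatically at an appropriate time.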

    /**
     * Add a document to the existing index
     *
     * @throws Exception
     */
    public static void insert() throws Exception {
        String text5 = "hello,goodbye,man,woman";
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text5", Store.YES));
        doc1.add(new TextField("content", text5, Store.YES));
        indexWriter.addDocument(doc1);

        indexWriter.commit();
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Adding to the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }
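
To make the idea of "add now, merge later" more concrete, here is a toy model in plain Java. It is only a sketch and not how Lucene is implemented: the class SegmentSketch, the threshold MAX_SEGMENTS, and the "merge everything into one segment" rule are all invented for illustration, and Lucene's real merge policies are considerably more elaborate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (hypothetical, not Lucene code): every commit writes a new
// immutable segment; when there are too many segments they are merged.
class SegmentSketch {
    // Hypothetical threshold chosen for this example only
    private static final int MAX_SEGMENTS = 3;
    private final List<List<String>> segments = new ArrayList<>();

    // Adding a batch never rewrites existing segments
    void commit(List<String> batch) {
        segments.add(new ArrayList<>(batch));
        if (segments.size() > MAX_SEGMENTS) {
            mergeAll();
        }
    }

    // Merging changes the file layout, not the set of documents
    private void mergeAll() {
        List<String> merged = new ArrayList<>();
        for (List<String> segment : segments) {
            merged.addAll(segment);
        }
        segments.clear();
        segments.add(merged);
    }

    int segmentCount() {
        return segments.size();
    }

    int docCount() {
        int n = 0;
        for (List<String> segment : segments) {
            n += segment.size();
        }
        return n;
    }

    public static void main(String[] args) {
        SegmentSketch index = new SegmentSketch();
        index.commit(Arrays.asList("text1", "text2", "text3", "text4"));
        index.commit(Arrays.asList("text5"));
        System.out.println(index.segmentCount() + " segments, " + index.docCount() + " docs");
    }
}
```

The point of the sketch is that a later commit leaves earlier segments untouched, and that a merge changes how many segments exist but not how many documents are searchable.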

  Deleting from an index

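Deletion also goes through IndexWriter, using its deleteDocuments method. Passing a Term deletes every document that contains that term in the given field. If you only want to delete a single document, it is best to define a unique ID field and delete by that field.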

    /**
     * Delete from the index
     *
     * @param str the term to delete by, matched against the "filename" field
     * @throws Exception
     */
    public static void delete(String str) throws Exception {
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        indexWriter.deleteDocuments(new Term("filename",str));  

        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Deleting from the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }
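
The difference between deleting by a keyword and deleting by a unique ID can be illustrated with a small stand-alone model. This is a sketch in plain Java, not Lucene code: the class DeleteByTermSketch, its methods, and the crude tokenizer are hypothetical stand-ins for an index, deleteDocuments, and an analyzer. In Lucene 4.x itself, an exact-match ID would typically be indexed as a StringField rather than a TextField.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy model (hypothetical, not Lucene code) of delete-by-term semantics.
class DeleteByTermSketch {
    private static final class Doc {
        final String id;
        final String content;
        Doc(String id, String content) {
            this.id = id;
            this.content = content;
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    void add(String id, String content) {
        docs.add(new Doc(id, content));
    }

    // Crude stand-in for an analyzer: lowercase, split on non-alphanumerics
    private static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"));
    }

    // Deletes EVERY document whose content contains the term
    int deleteByContentTerm(String term) {
        int before = docs.size();
        docs.removeIf(d -> tokens(d.content).contains(term));
        return before - docs.size();
    }

    // Deletes only the document with exactly this ID
    int deleteById(String id) {
        int before = docs.size();
        docs.removeIf(d -> d.id.equals(id));
        return before - docs.size();
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        DeleteByTermSketch index = new DeleteByTermSketch();
        index.add("text1", "hello,man!");
        index.add("text2", "goodbye,man!");
        index.add("text3", "hello,woman!");
        System.out.println("deleted by term: " + index.deleteByContentTerm("hello"));
        System.out.println("deleted by id: " + index.deleteById("text2"));
    }
}
```

Deleting by the content term "hello" removes two documents at once, whereas deleting by ID removes at most one, which is why a unique ID field is the safer handle for single-document deletes.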

  Updating an index

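Lucene has no true in-place update. You can update the documents that match a term in a given field, but in essence Lucene first deletes the matching documents and then adds the new document.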

    /**
     * Update the index
     *
     * @throws Exception
     */
    public static void update() throws Exception {
        String text1 = "update,hello,man!";
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text1", Store.YES));
        doc1.add(new TextField("content", text1, Store.YES));

        // Deletes the documents whose "filename" contains "text1", then adds doc1
        indexWriter.updateDocument(new Term("filename", "text1"), doc1);

        // close() also commits the pending changes
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Updating the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }
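
The "delete, then add" behavior has two consequences that are easy to miss: an update whose term matches nothing still adds the new document, and an update whose term matches several documents replaces all of them with one. The following plain-Java sketch models this. It is not Lucene code: UpdateSketch and its methods are hypothetical names that only mimic the shape of addDocument, deleteDocuments, and updateDocument, with exact string matching standing in for term matching.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical, not Lucene code): update = delete + add.
class UpdateSketch {
    private final List<Map<String, String>> docs = new ArrayList<>();

    static Map<String, String> doc(String filename, String content) {
        Map<String, String> fields = new HashMap<>();
        fields.put("filename", filename);
        fields.put("content", content);
        return fields;
    }

    void addDocument(Map<String, String> doc) {
        docs.add(doc);
    }

    // Removes every document whose field equals the given value
    int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.removeIf(d -> value.equals(d.get(field)));
        return before - docs.size();
    }

    // No in-place modification: matching documents are dropped,
    // then the new document is appended
    void updateDocument(String field, String value, Map<String, String> newDoc) {
        deleteDocuments(field, value);
        addDocument(newDoc);
    }

    int size() {
        return docs.size();
    }

    List<String> contents() {
        List<String> out = new ArrayList<>();
        for (Map<String, String> d : docs) {
            out.add(d.get("content"));
        }
        return out;
    }

    public static void main(String[] args) {
        UpdateSketch index = new UpdateSketch();
        index.addDocument(doc("text1", "hello,man!"));
        index.addDocument(doc("text2", "goodbye,man!"));
        index.updateDocument("filename", "text1", doc("text1", "update,hello,man!"));
        System.out.println(index.size() + " docs: " + index.contents());
    }
}
```

In this model the updated document ends up at the end of the list rather than in its old position, which mirrors the fact that an updated document is a newly added one.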

  Searching the index by keyword

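Lucene offers many kinds of queries, which are not covered in detail here. A search returns an array of ScoreDoc hits, somewhat like a ResultSet. For each hit we can load the document and read the stored values we want by field name.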

    /**
     * Keyword search
     *
     * @param str the query string, parsed against the "content" field
     * @throws Exception
     */
    public static void search(String str) throws Exception {
        directory = FSDirectory.open(new File(INDEX_DIR));
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        DirectoryReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);

        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content",analyzer);
        Query query = parser.parse(str);

        ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
        for (int i = 0; i < hits.length; i++) {
            Document hitDoc = isearcher.doc(hits[i].doc);
            System.out.println(hitDoc.get("filename"));
            System.out.println(hitDoc.get("content"));
        }
        ireader.close();
        directory.close();
    }
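
The two-step pattern in the method above (first get hits that carry only a document number and a score, then load stored fields by name) can be sketched without Lucene. The class below is a toy inverted index in plain Java. Everything in it is an assumption made for illustration: SearchSketch, Hit, the tokenizer, and the scoring rule (number of occurrences of the term) are hypothetical and much simpler than what Lucene does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy inverted index (hypothetical, not Lucene code).
class SearchSketch {
    // A hit is only a handle: document number plus score
    static final class Hit {
        final int doc;
        final int score;
        Hit(int doc, int score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Stored fields, addressed by document number
    private final List<Map<String, String>> stored = new ArrayList<>();
    // term -> (document number -> occurrences of the term in "content")
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();

    void add(String filename, String content) {
        int docId = stored.size();
        Map<String, String> fields = new HashMap<>();
        fields.put("filename", filename);
        fields.put("content", content);
        stored.add(fields);
        for (String term : content.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(term, t -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    // Returns at most maxHits hits, highest score first, ties by doc number
    List<Hit> search(String term, int maxHits) {
        List<Hit> hits = new ArrayList<>();
        Map<Integer, Integer> docs =
                postings.getOrDefault(term.toLowerCase(Locale.ROOT), new HashMap<>());
        for (Map.Entry<Integer, Integer> e : docs.entrySet()) {
            hits.add(new Hit(e.getKey(), e.getValue()));
        }
        hits.sort(Comparator.comparingInt((Hit h) -> h.score).reversed()
                .thenComparingInt(h -> h.doc));
        return new ArrayList<>(hits.subList(0, Math.min(maxHits, hits.size())));
    }

    // Second step: read a stored field of a hit by field name
    String get(int doc, String field) {
        return stored.get(doc).get(field);
    }

    public static void main(String[] args) {
        SearchSketch index = new SearchSketch();
        index.add("text1", "hello,man!");
        index.add("text2", "goodbye,man!");
        index.add("text3", "hello,woman!");
        for (Hit hit : index.search("man", 1000)) {
            System.out.println(index.get(hit.doc, "filename"));
            System.out.println(index.get(hit.doc, "content"));
        }
    }
}
```

Note that in this model a search for "man" does not match "hello,woman!", because "woman" is a different token. Whole-token matching of this kind is also what a plain term query gives you.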

  Full code

package test;

import java.io.File;
import java.util.Date;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.TextField;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class TestLucene {
    // Directory where the index is stored
    private static String INDEX_DIR = "D:\\luceneIndex";
    private static Analyzer analyzer = null;
    private static Directory directory = null;
    private static IndexWriter indexWriter = null;

    public static void main(String[] args) {
        try {
//            index();
            search("man");
//            insert();
//            delete("text5");
//            update();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    /**
     * Update the index
     *
     * @throws Exception
     */
    public static void update() throws Exception {
        String text1 = "update,hello,man!";
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text1", Store.YES));
        doc1.add(new TextField("content", text1, Store.YES));

        // Deletes the documents whose "filename" contains "text1", then adds doc1
        indexWriter.updateDocument(new Term("filename", "text1"), doc1);

        // close() also commits the pending changes
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Updating the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }
    /**
     * Delete from the index
     *
     * @param str the term to delete by, matched against the "filename" field
     * @throws Exception
     */
    public static void delete(String str) throws Exception {
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        indexWriter.deleteDocuments(new Term("filename",str));  

        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Deleting from the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }
    /**
     * Add a document to the existing index
     *
     * @throws Exception
     */
    public static void insert() throws Exception {
        String text5 = "hello,goodbye,man,woman";
        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text5", Store.YES));
        doc1.add(new TextField("content", text5, Store.YES));
        indexWriter.addDocument(doc1);

        indexWriter.commit();
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Adding to the index took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }
    /**
     * Build the index
     *
     * @throws Exception
     */
    public static void index() throws Exception {

        String text1 = "hello,man!";
        String text2 = "goodbye,man!";
        String text3 = "hello,woman!";
        String text4 = "goodbye,woman!";

        Date date1 = new Date();
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        directory = FSDirectory.open(new File(INDEX_DIR));

        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_CURRENT, analyzer);
        indexWriter = new IndexWriter(directory, config);

        Document doc1 = new Document();
        doc1.add(new TextField("filename", "text1", Store.YES));
        doc1.add(new TextField("content", text1, Store.YES));
        indexWriter.addDocument(doc1);

        Document doc2 = new Document();
        doc2.add(new TextField("filename", "text2", Store.YES));
        doc2.add(new TextField("content", text2, Store.YES));
        indexWriter.addDocument(doc2);

        Document doc3 = new Document();
        doc3.add(new TextField("filename", "text3", Store.YES));
        doc3.add(new TextField("content", text3, Store.YES));
        indexWriter.addDocument(doc3);

        Document doc4 = new Document();
        doc4.add(new TextField("filename", "text4", Store.YES));
        doc4.add(new TextField("content", text4, Store.YES));
        indexWriter.addDocument(doc4);

        indexWriter.commit();
        indexWriter.close();

        Date date2 = new Date();
        System.out.println("Index creation took: " + (date2.getTime() - date1.getTime()) + "ms\n");
    }

    /**
     * Keyword search
     *
     * @param str the query string, parsed against the "content" field
     * @throws Exception
     */
    public static void search(String str) throws Exception {
        directory = FSDirectory.open(new File(INDEX_DIR));
        analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
        DirectoryReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);

        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content",analyzer);
        Query query = parser.parse(str);

        ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
        for (int i = 0; i < hits.length; i++) {
            Document hitDoc = isearcher.doc(hits[i].doc);
            System.out.println(hitDoc.get("filename"));
            System.out.println(hitDoc.get("content"));
        }
        ireader.close();
        directory.close();
    }
}
Time: 2024-11-03 21:12:22
