References:
http://wiki.apache.org/solr/ExtractingRequestHandler#Sending_documents_to_Solr
/update
The standard update request handler. It accepts documents in XML, JSON, CSV, or javabin format.
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">uuid</str>
  </lst>
</requestHandler>
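As a rough illustration of how a client might talk to this handler, the sketch below builds a JSON add request for /update using only the Python standard library. The base URL http://localhost:8983/solr, the helper names build_json_update and send, and the title field are assumptions made for the example, not part of the configuration above. Whether an id must be supplied depends on what the uuid update chain does in your setup.

```python
import json
import urllib.request

# Assumed base URL; adjust host, port, and core name for your installation.
SOLR_BASE = "http://localhost:8983/solr"


def build_json_update(docs, base=SOLR_BASE, commit=True):
    """Build (without sending) a POST request that adds `docs` via /update.

    `docs` is a list of dicts; the field names are hypothetical and must
    match fields defined in your schema.
    """
    url = base + "/update"
    if commit:
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling send(build_json_update([{"title": "hello"}])) against a running Solr instance would submit one document. Building the request separately from sending it keeps the URL and body easy to inspect before anything goes over the network.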
/update/extract
Files in formats other than the standard ones listed above can be indexed through this handler.
Dependencies
<lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" />
<lib dir="../../contrib/extraction/lib" regex=".*\.jar" />
General configuration
<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="xpath">/xhtml:html/xhtml:body/descendant:node()</str>
    <str name="capture">content</str>
    <str name="fmap.meta">attr_meta_</str>
    <str name="uprefix">attr_</str>
    <str name="lowernames">true</str>
    <str name="update.chain">uuid</str>
  </lst>
</requestHandler>
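To show how a file might be sent to this handler, here is a minimal sketch that prepares a POST request carrying the raw file bytes, in the same spirit as the curl examples on the wiki page referenced above. The base URL, the helper name build_extract_request, and the sample id value are assumptions for illustration. literal.<field> is the handler's parameter for attaching a fixed field value to the extracted document. Passing literal.id may be unnecessary if the uuid update chain already assigns the unique key in your setup.

```python
import urllib.parse
import urllib.request

# Assumed base URL; adjust host, port, and core name for your installation.
SOLR_BASE = "http://localhost:8983/solr"


def build_extract_request(data, content_type, literals=None,
                          base=SOLR_BASE, commit=True):
    """Build (without sending) a POST request for /update/extract.

    data         -- raw bytes of the file to index (PDF, Word, HTML, ...)
    content_type -- MIME type of the file, sent as the Content-Type header
    literals     -- optional dict of field values, sent as literal.<field>
    """
    params = {"literal." + name: value
              for name, value in (literals or {}).items()}
    if commit:
        params["commit"] = "true"
    url = base + "/update/extract"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

A caller would read the file in binary mode, pass its bytes and MIME type to build_extract_request, and hand the result to urllib.request.urlopen. The defaults configured above (xpath, capture, fmap.meta, uprefix, lowernames) apply on the server side, so the client does not need to repeat them.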
Time: 2024-10-29 19:11:19