Build a custom search engine with PHP (6)

    each primary key returned by searchd

  The last configuration step is to build the index. Listing 8 shows the index for the catalog data source.

  Listing 8. One possible index describing the catalog data source

index catalog
{
    source = catalog
    path = /var/data/sphinx/catalog
    morphology = stem_en

    min_word_len = 3
    min_prefix_len = 0
    min_infix_len = 3
}

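  The first setting, source, points to the named data source in the sphinx.conf file. The second, path, defines where the index data is stored; by convention, Sphinx indexes are kept in /var/data/sphinx. The third, morphology, enables English stemming for the index. The last three settings (min_word_len, min_prefix_len, and min_infix_len) tell the indexer to index only words of three or more characters and to create infix indexes for every substring of those words that is at least three characters long. (For ease of reference, Listing 9 shows the complete sample sphinx.conf file for Body Parts.)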
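
  To make the effect of min_infix_len = 3 more concrete, the following bash sketch lists every substring of a word that is at least three characters long. The function name infixes is hypothetical and is not part of Sphinx; it only illustrates which fragments become searchable, not how the indexer stores them.

```shell
#!/usr/bin/env bash
# Illustration only: "infixes" is a hypothetical helper, not a Sphinx command.
# Print every substring of WORD that is at least MIN characters long.
infixes() {
  local word=$1 min=${2:-3}
  local len=${#word} start size
  for ((start = 0; start < len; start++)); do
    # Grow the substring from MIN characters up to the end of the word.
    for ((size = min; start + size <= len; size++)); do
      printf '%s\n' "${word:start:size}"
    done
  done
}

# List the fragments of "brake" that a three-character minimum would cover.
infixes "brake" 3
```

  A word shorter than the minimum produces no output at all, which mirrors the way min_word_len = 3 keeps very short words out of the index.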

  Listing 9. Sample sphinx.conf for Body Parts

source catalog
{
    type = mysql

    sql_host = localhost
    sql_user = reaper
    sql_pass = s3cr3t
    sql_db = body_parts
    sql_sock = /var/run/mysqld/mysqld.sock
    sql_port = 3306

    # indexer query
    # document_id MUST be the very first field
    # document_id MUST be positive (non-zero, non-negative)
    # document_id MUST fit into 32 bits
    # document_id MUST be unique

    sql_query = \
        SELECT \
            id, partno, description, \
            assembly, model \
        FROM \
            Inventory

    sql_group_column = assembly
    sql_group_column = model

    # document info query
    # ONLY used by search utility to display document information
    # MUST be able to fetch document info by its id, therefore
    # MUST contain '$id' macro

    sql_query_info = SELECT * FROM Inventory WHERE id=$id
}

index catalog
{
    source = catalog
    path = /var/data/sphinx/catalog
    morphology = stem_en

    min_word_len = 3
    min_prefix_len = 0
    min_infix_len = 3
}

searchd
{
    port = 3312
    log = /var/log/searchd/searchd.log
    query_log = /var/log/searchd/query.log
    pid_file = /var/log/searchd/searchd.pid
}

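  The searchd section at the bottom configures the searchd daemon itself. The entries in that section are self-explanatory. query.log is especially useful: it records each search as it runs, along with its results, such as the number of documents searched and the total number of matches.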
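
  The comments in Listing 9 require every document_id to be positive, to fit into 32 bits, and to be unique. As a rough sketch, those rules could be checked before indexing with a small shell function. Here check_ids is a hypothetical helper, not part of Sphinx, and the ceiling of 4294967295 assumes that "fit into 32 bits" means an unsigned 32-bit integer.

```shell
#!/usr/bin/env bash
# Illustration only: "check_ids" is a hypothetical helper, not a Sphinx tool.
# Reads candidate document ids from standard input, one per line, reports
# every id that breaks a rule, and returns non-zero if any rule was broken.
check_ids() {
  LC_ALL=C awk '
    # Rule: positive (non-zero, non-negative) integer.
    $0 !~ /^[0-9]+$/ || $0 + 0 < 1 { printf "not a positive integer: %s\n", $0; bad = 1; next }
    # Rule: fits into 32 bits (assumed unsigned, so at most 4294967295).
    $0 + 0 > 4294967295 { printf "does not fit in 32 bits: %s\n", $0; bad = 1; next }
    # Rule: unique across the whole input.
    seen[$0 + 0]++ { printf "duplicate id: %s\n", $0; bad = 1 }
    END { exit bad }
  '
}

# Example: the second 2 is reported as a duplicate.
printf '%s\n' 1 2 2 3 | check_ids || echo "some ids need attention"
```

  In practice the ids could be piped in from the database, for example from the output of a SELECT id FROM Inventory query run with the mysql client, but that wiring is left out here.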

  You are now ready to build the index for the Body Parts application. To do so, perform the following steps:

  Create the directory structure /var/data/sphinx by typing $ sudo mkdir -p /var/data/sphinx
  Assuming MySQL is running, run the indexer to create the index, using the code shown below.

  Listing 10. Creating the index

$ sudo /usr/local/bin/indexer --config /usr/local/etc/sphinx.conf --all
Sphinx 0.9.7
Copyright (c) 2001-2007, Andrew Aksyonoff

using config file '/usr/local/etc/sphinx.conf'...
indexing index 'catalog'...
collected 8 docs, 0.0 MB
sorted 0.0 Mhits, 82.8% done
total 8 docs, 149 bytes
total 0.010 sec, 14900.00 bytes/sec, 800.00 docs/sec
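
  The throughput figures on the last line of the output appear to be the totals divided by the elapsed time. As a quick sanity check, the sketch below redoes that arithmetic with the numbers reported above; the function name throughput is hypothetical and is not part of Sphinx.

```shell
#!/usr/bin/env bash
# Illustration only: "throughput" is a hypothetical helper, not a Sphinx tool.
# Usage: throughput BYTES DOCS SECONDS
# Divides the totals by the elapsed time, as the summary line seems to do.
throughput() {
  LC_ALL=C awk -v b="$1" -v d="$2" -v s="$3" \
    'BEGIN { printf "%.2f bytes/sec, %.2f docs/sec\n", b / s, d / s }'
}

# Totals taken from the indexer output: 149 bytes, 8 docs, 0.010 sec.
throughput 149 8 0.010
# prints: 14900.00 bytes/sec, 800.00 docs/sec
```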

