1、按官方的标准安装,我的系统是CentOS(64位),默认安装了Python 2.6.6,如果没有安装还需要先安装Python 2.6.6,版本不要错;官方安装地址:http://www.coreseek.cn/products-install/install_on_bsd_linux/ 安装时注意在./config的时候加上对python的支持,因为我们要使用的是Oracle数据源
相关代码:
$ cd csft-3.2.14
$ ./configure –prefix=/usr/local/coreseek –without-unixodbc –with-mmseg –with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ –with-mmseg-libs=/usr/local/mmseg3/lib/ –with-python
$ make && make install
2、安装cx_Oracle:下载cx_Oracle-5.1.2-11g-py26-1.x86_64.rpm
执行安装包
rpm -ivh cx_Oracle-5.1.2-11g-py26-1.x86_64.rpm
3、下载安装instantclient-basic-linux.x64-11.2.0.3.0.zip,我下载的是zip包,当然我认为也可以下载rpm包,而且不需要自己配置会更合适一些
4、打开~下的.bash_profile,添加下面两句(如果下载的rpm包,则不需要做如下操作
增加以下两行代码
LD_LIBRARY_PATH=/root/instantclient_11_2;
export LD_LIBRARY_PATH
5、在python下测试import cx_Oracle,如没有报错,则说明python连接oracle已经没有错误了
6、配置csft的config文件/root/coreseek-3.2.14/testpack/etc/csft_demo_python_pyoracle.conf
文件如下:
python
{
path = /root/coreseek-3.2.14/testpack/etc/pysource
path = /root/coreseek-3.2.14/testpack/etc/pysource/csft_demo_pyoracle
}
#源定义
source python_oracle
{
type = python
name = csft_demo_pyoracle.MainSource
}
#index定义
index python_oracle
{
source = python_oracle
path = /root/coreseek-3.2.14/testpack/var/data/python_oracle
docinfo = extern
mlock = 0
morphology = none
min_word_len = 1
html_strip = 0
#中文分词配置,详情请查看:http://www.coreseek.cn/products-install/coreseek_mmseg/
charset_dictpath = /usr/local/mmseg3/etc/
charset_type = zh_cn.utf-8
}
#全局index定义
indexer
{
mem_limit = 128M
}
#searchd服务定义
searchd
{
listen = 9312
read_timeout = 5
max_children = 30
max_matches = 1000
seamless_rotate = 0
preopen_indexes = 0
unlink_old = 1
pid_file = /root/coreseek-3.2.14/testpack/var/log/searchd_python_oracle.pid
log = /root/coreseek-3.2.14/testpack/var/log/searchd_python_oracle.log
query_log = /root/coreseek-3.2.14/testpack/var/log/query_python_oracle.log
}
7、设置/root/coreseek-3.2.14/testpack/etc/pysource/csft_demo_pyoracle/__init__.py,
文件内容如下
# -*- coding:utf-8 -*-from os import path
import os
os.environ[‘NLS_LANG’] = ‘SIMPLIFIED CHINESE_CHINA.UTF8’
import sys
import cx_Oracleclass MainSource(object):
def __init__(self, conf):
self.conf = conf
self.idx = 0
self.data = []
self.conn = None
self.cur = Nonedef GetScheme(self): #获取结构,docid、文本、整数
return [
(‘id’ , {‘docid’:True, } ),
(‘title’, { ‘type’:’text’} ),
(‘abstract’, { ‘type’:’text’} ),
(‘search_type’, {‘type’:’integer’} ),
]
def GetFieldOrder(self): #字段的优先顺序
return [(‘title’, ‘abstract’)]def Connected(self): #如果是数据库,则在此处做数据库连接
if self.conn==None:
self.conn = cx_Oracle.connect(‘yzjs/a123456@10.166.166.222/genome’)
self.cur = self.conn.cursor()
sql = ‘SELECT t.id as id, t.title, t.abstract, t.search_type as search_type from V_SPHINX_INDEX t where t.title is not null and t.abstract is not null’
self.cur.execute(sql)
for rows in self.cur:
item = []
item.append(rows[0])
item.append(rows[1])
item.append(rows[2].read())
item.append(rows[3])
self.data.append(item)
passdef NextDocument(self): #取得每一个文档记录的调用
if self.idx < len(self.data):
item = self.data[self.idx]
self.docid = self.id = item[0] #’docid’:True
self.title = item[1]#.decode(“GBK”).encode(“UTF-8”)
self.abstract = item[2]#.decode(“GBK”).encode(“UTF-8”)
self.search_type = item[3]
self.idx += 1
return True
else:
return Falseif __name__ == “__main__”: #直接访问演示部分
conf = {}
source = MainSource(conf)
source.Connected()
while source.NextDocument():
print “id=%d, subject=%s” % (source.id, source.abstract)#.decode(“UTF-8”))
pass
#eof
8、然后根据下面三段脚本,可以方便索引,启动,热索引数据
索引
/usr/local/coreseek/bin/indexer -c /root/coreseek-3.2.14/testpack/etc/csft_demo_python_pyoracle.conf –all
启动
/usr/local/coreseek/bin/searchd -c /root/coreseek-3.2.14/testpack/etc/csft_demo_python_pyoracle.conf
停止
/usr/local/coreseek/bin/searchd -c /root/coreseek-3.2.14/testpack/etc/csft_demo_python_pyoracle.conf –stop
重编译
/usr/local/coreseek/bin/indexer -c /root/coreseek-3.2.14/testpack/etc/csft_demo_python_pyoracle.conf –all –rotate
9、引入sphinxapi.php,写如下代码测试
$cl = new SphinxClient ();
$cl -> SetServer ( ‘192.168.42.148’, 9312);
$cl -> SetConnectTimeout ( 3 );
$cl -> SetArrayResult ( true );
$cl -> SetMatchMode ( SPH_MATCH_ANY);
$data[‘res’] = $res = $cl -> Query ( $keyword, ‘*’ );