Full-text search with acts_as_ferret

Installing the required software

# gem install ferret
# emerge mecab =mecab-ruby-0.92  ## if you are not on Gentoo, install MeCab and its Ruby binding however suits your platform

$ cd $RAILS_ROOT
$ ruby script/plugin install svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret

$RAILS_ROOT/lib/mecab_analyzer.rb

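require 'rubygems'
require 'ferret'
require 'MeCab'
include Ferret

# Ferret analyzer that hands tokenization over to MeCab.
class MecabAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis

  # use_surface: when true, index the surface form as written;
  # otherwise index the base form reported by MeCab.
  def initialize(use_surface = false)
    @use_surface = use_surface
  end

  # Called by Ferret for each field; returns a token stream over str.
  def token_stream(field, str)
    MecabTokenizer.new(str, @use_surface)
  end
end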

$RAILS_ROOT/lib/mecab_tokenizer.rb

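require 'rubygems'
require 'ferret'
require 'MeCab'

# Token stream that walks MeCab's node list and emits one Ferret token per node.
class MecabTokenizer
  attr_reader :text

  def initialize(str, use_surface = false)
    @mecab = MeCab::Tagger.new
    @use_surface = use_surface
    self.text = str
  end

  # (Re)start tokenization on a new string.
  def text=(str)
    @text = str
    @n = @mecab.parseToNode(str)
    @n = @n.next # skip the leading BOS node
    @pos = 0
  end

  # Returns the next token, or nil once the EOS node is reached.
  def next
    return nil if @n.stat == MeCab::MECAB_EOS_NODE
    features = @n.feature.split(/,/)
    # features[6] is the base form in IPA-dictionary style output
    t = @use_surface ? @n.surface : features[6]
    # no base form available: fall back to the lower-cased surface
    t = @n.surface.downcase if t == '*'  # added on 09/05/2007
    # rlength includes any whitespace in front of the node
    token = Ferret::Analysis::Token.new(t, @pos, @pos + @n.rlength)
    @pos += @n.rlength
    @n = @n.next
    token
  end
end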
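
The pull protocol the tokenizer follows (call next until it returns nil, advancing a running offset by each node's rlength) can be tried without MeCab or Ferret installed. The following is only a rough sketch under stated assumptions: FakeNode, SketchToken, SketchTokenizer, sketch_tokens and SAMPLE_NODES are hypothetical names invented here, and the feature strings are hand-written in IPA-dictionary style rather than taken from real MeCab output.

```ruby
# Stand-in for a MeCab node: surface form, comma-separated feature string,
# and rlength (byte length including any leading whitespace).
# All names in this sketch are hypothetical; nothing here calls MeCab or Ferret.
FakeNode = Struct.new(:surface, :feature, :rlength)
SketchToken = Struct.new(:text, :start_offset, :end_offset)

class SketchTokenizer
  def initialize(nodes, use_surface = false)
    @nodes = nodes
    @use_surface = use_surface
    @index = 0
    @pos = 0
  end

  # Returns the next token, or nil once the node list is exhausted.
  def next
    node = @nodes[@index]
    return nil if node.nil?
    base = node.feature.split(',')[6]
    text = @use_surface ? node.surface : base
    # Fall back to the lower-cased surface when no base form is given.
    text = node.surface.downcase if text.nil? || text == '*'
    token = SketchToken.new(text, @pos, @pos + node.rlength)
    @pos += node.rlength
    @index += 1
    token
  end
end

# Drains a tokenizer the way a consumer of the stream would.
def sketch_tokens(nodes, use_surface = false)
  stream = SketchTokenizer.new(nodes, use_surface)
  tokens = []
  while (token = stream.next)
    tokens << token
  end
  tokens
end

# Hand-written sample nodes (assumed values, not real analyzer output).
SAMPLE_NODES = [
  FakeNode.new('走っ', '動詞,自立,*,*,五段・ラ行,連用タ接続,走る,ハシッ,ハシッ', 6),
  FakeNode.new('た', '助動詞,*,*,*,特殊・タ,基本形,た,タ,タ', 3),
  FakeNode.new('Rails', '名詞,一般,*,*,*,*,*', 6) # one leading space + 5 bytes
]
```

Calling sketch_tokens(SAMPLE_NODES) yields base forms with offsets that accumulate across nodes, while sketch_tokens(SAMPLE_NODES, true) keeps the surface forms as written.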

The model you want to search

class Article < ActiveRecord::Base
  acts_as_ferret(:fields => [:title, :content],
                 :store_class_name => true,
                 :ferret => {
                   :or_default => false,
                   :analyzer => MecabAnalyzer.new()
                 })
end

With this in place, you can run a full-text search with a call such as Article.find_by_contents('単語').
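
The :or_default => false setting in the model above asks Ferret's query parser to combine the terms of a multi-word query with AND instead of OR. The toy sketch below illustrates only that difference; it does not use Ferret, and matches_query?, search_ids and DOCS are hypothetical names with made-up data.

```ruby
# Toy model of the default-operator choice; this is not how Ferret is
# implemented, only an illustration of AND versus OR matching.
def matches_query?(doc_terms, query_terms, or_default)
  if or_default
    # OR: a single matching term is enough
    query_terms.any? { |term| doc_terms.include?(term) }
  else
    # AND: every query term must be present
    query_terms.all? { |term| doc_terms.include?(term) }
  end
end

# Made-up documents, already split into terms.
DOCS = {
  1 => %w[全文 検索 エンジン],
  2 => %w[検索 結果],
  3 => %w[全文 公開]
}

# Returns the ids of the documents that match the query terms.
def search_ids(query_terms, or_default)
  DOCS.select { |_id, terms| matches_query?(terms, query_terms, or_default) }.keys
end
```

With AND semantics, search_ids(%w[全文 検索], false) keeps only the document containing both terms; passing true instead accepts any document containing either one.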