LRBlog

Logical Reality Design: Web Design and Software Development

Fixing problems with sphinx search

July 24, 2008

I’ve been working a lot this week with sphinx and ultrasphinx on a project that’s a fork of Insoshi. Insoshi is in the process of switching search from ferret to sphinx, and sphinx has been integrated into the Insoshi edge branch.

I’ve had dozens of problems; in fact, it’s fair to say I’ve spent upwards of 15 hours just debugging ultrasphinx and getting my tests to pass. Here are the three main problems and how I fixed each one.

This should be useful to anyone upgrading Insoshi to the sphinx version, or to anyone else trying to get ultrasphinx working in their Rails project. I definitely don’t recommend starting with this post if you’re just starting out with sphinx. Instead, go read this much better introductory tutorial from the guys over at Insoshi. Then if you have problems, come back here and you may find solutions.

Getting search tests (or specs) to pass with sphinx

This one is pretty simple, in retrospect, but it can be frustrating and opaque if you are used to ferret.  Unlike ferret, sphinx (at least via ultrasphinx) runs only via a daemon.   Where acts_as_ferret uses a daemon only for the production environment and just accesses the index files directly in test or development, ultrasphinx can only get to the indexes through the daemon.

So, to run your tests, you just build the indexes for the test environment, start the daemon, and run the tests. In this case, I’m running the specs for Insoshi’s searches controller:

From the command line in $RAILS_ROOT:

rake db:test:prepare
rake ultrasphinx:configure RAILS_ENV=test
rake ultrasphinx:index RAILS_ENV=test
rake ultrasphinx:daemon:start RAILS_ENV=test
script/spec spec/controllers/searches_controller_spec.rb

The problem, of course, is that it doesn’t work! The reason is that db:test:prepare creates the structure of your database but doesn’t load any of your fixtures as data, so the test db is empty. When you run the index command, an empty index is built. You can see this in the output of the index command, which will look something like this:

collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.078 sec, 0.00 bytes/sec, 0.00 docs/sec

Ultrasphinx has built an empty index.

The solution

The solution, believe it or not, is to run the tests, let them fail, re-index, and run the tests again (Many thanks to Long Nguyen at Insoshi for helping me figure this one out):

rake db:test:prepare
rake ultrasphinx:configure RAILS_ENV=test
rake ultrasphinx:index RAILS_ENV=test
rake ultrasphinx:daemon:start RAILS_ENV=test
script/spec spec/controllers/searches_controller_spec.rb   #FAIL!!
rake ultrasphinx:index RAILS_ENV=test
script/spec spec/controllers/searches_controller_spec.rb   #PASS!!

The first attempt to run the specs loads the fixtures, and leaves them in the database, thus letting the subsequent index command build an actual index.
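
If you get tired of retyping that sequence, it can be wrapped in a small script. This is only a sketch: the method name sphinx_test_commands and the --run switch are made up for this example, and the commands themselves are exactly the ones listed above. By default it just prints the commands; run it from $RAILS_ROOT with --run to actually execute them.

```ruby
# Hypothetical helper: returns the commands from this post, in order.
def sphinx_test_commands(spec_file)
  index = "rake ultrasphinx:index RAILS_ENV=test"
  spec  = "script/spec #{spec_file}"
  [
    "rake db:test:prepare",
    "rake ultrasphinx:configure RAILS_ENV=test",
    index, # builds an empty index: the test db has no rows yet
    "rake ultrasphinx:daemon:start RAILS_ENV=test",
    spec,  # expected to fail, but leaves the fixtures in the database
    index, # re-index now that the fixture rows exist
    spec   # this run should pass
  ]
end

commands = sphinx_test_commands("spec/controllers/searches_controller_spec.rb")

if ARGV.include?("--run")
  # Keep going after a failure: the first spec run is supposed to fail.
  commands.each do |cmd|
    puts "$ #{cmd}"
    system(cmd)
  end
else
  puts commands
end
```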

Running sphinx for both test and development environments at the same time

The next big challenge was enabling behavior-driven development. I like to work with autotest and growl running constantly in the background. But this was tough to do with sphinx, because the daemon needed to be stopped and re-started, and the index re-created for each environment, alternately running all of the above commands either with or without RAILS_ENV=test.

The solution is to set up your ultrasphinx base configuration to completely separate the test and development indexes and to let the daemons for the two environments listen on different ports. I had tried something like this and come close, but not quite, when Long at Insoshi again bailed me out. You need to change the port (in two places), and the paths of the logs, pidfile, and index directories so that test and development daemons are using entirely separate resources. Here’s a diff between my default.conf (the lines marked <) and my test.conf (the lines marked >):

33c33
<   port = 3312
---
>   port = 3322
35c35
<   log = log/searchd.log
---
>   log = log/searchd_test.log
39c39
<   pid_file = log/searchd.pid
---
>   pid_file = log/searchd_test.pid
50c50
<   server_port = 3312
---
>   server_port = 3322
57c57
<   sql_range_step = 5000   
---
>   sql_range_step = 999999999   
64c64
<   path = sphinx
---
>   path = sphinx_test

The sql_range_step is related to the next issue, which is that sphinx does not play well with foxy fixtures. Anyway, make the above changes and you should be able to run test and development sphinx daemons at the same time:

rake db:test:prepare
rake ultrasphinx:configure
rake ultrasphinx:configure RAILS_ENV=test
rake ultrasphinx:index
rake ultrasphinx:index RAILS_ENV=test
rake ultrasphinx:daemon:start
rake ultrasphinx:daemon:start RAILS_ENV=test

If it worked, you should see separate indexes in $RAILS_ROOT/sphinx and $RAILS_ROOT/sphinx_test, and two daemons running, which you can confirm with ps waux | grep searchd:

evan      1339   0.0  0.0    78100    292 s000  S     5:37PM   0:00.52 searchd --config <YOUR_RAILS_ROOT>/config/ultrasphinx/test.conf
evan      1326   0.0  0.0    78100    292 s000  S     5:36PM   0:00.68 searchd --config <YOUR_RAILS_ROOT>/config/ultrasphinx/development.conf
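
Another quick check is to see whether something is accepting connections on each port. The sketch below assumes both daemons listen on localhost and uses the port numbers from the diff above (3312 for development, 3322 for test); the port_open? helper is a name I made up for this example, not part of ultrasphinx.

```ruby
require "socket"

# Returns true if something accepts TCP connections on host:port.
def port_open?(host, port)
  TCPSocket.new(host, port).close
  true
rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH, Errno::ETIMEDOUT
  false
end

# Ports taken from the diff above; adjust if your config differs.
{ "development" => 3312, "test" => 3322 }.each do |env, port|
  status = port_open?("127.0.0.1", port) ? "listening" : "not listening"
  puts "#{env} searchd (port #{port}): #{status}"
end
```

This only tells you that something is listening on the port, not that it is searchd, so it complements the ps check rather than replacing it.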

Getting sphinx to play well with foxy fixtures

The next problem I discovered was that on some machines, but not others, running my search specs would result in these weird errors:

1)
ActiveRecord::RecordNotFound in 'SearchesController Person searches should search by name'
Couldn't find Person with ID=328556765
/var/www/domains/unithrive/vendor/plugins/ultrasphinx/lib/ultrasphinx/search/internals.rb:308:in `reify_results'
/var/www/domains/unithrive/vendor/plugins/ultrasphinx/lib/ultrasphinx/search/internals.rb:286:in `each'
/var/www/domains/unithrive/vendor/plugins/ultrasphinx/lib/ultrasphinx/search/internals.rb:286:in `reify_results'
/var/www/domains/unithrive/vendor/plugins/ultrasphinx/lib/ultrasphinx/search.rb:362:in `run'
/var/www/domains/unithrive/vendor/plugins/ultrasphinx/lib/ultrasphinx/search/internals.rb:352:in `perform_action_with_retries'
/var/www/domains/unithrive/vendor/plugins/ultrasphinx/lib/ultrasphinx/search.rb:342:in `run'
/var/www/domains/unithrive/app/controllers/searches_controller.rb:38:in `index'
./spec/controllers/searches_controller_spec.rb:51:
script/spec:4:

When I poked into this “Couldn’t find Person with ID=328556765” error, it seemed like sphinx was almost working. The index was set up, and the search was finding someone in the index during the test. Ultrasphinx was passing back the id 328556765, which didn’t exist in the database. So why would Sphinx “find” a record in its index but then pass back an ID for a database record that didn’t exist?

And furthermore, why would it work on one machine, but not on another?

The breakthrough came when I checked what the actual database ids were for this particular record, with Person.find_by_name("fixture's name").id. On machines where it worked, the id was a huge number (as it generally is with foxy fixtures), but on machines where it didn’t work, the id was an even bigger number.

Sphinx tries to make sure that every item that gets indexed has a different id in the index, and it does this by multiplying all of your ids by N, where N is the number of models getting indexed, and adding an offset of 0 for the first model, 1 for the second, etc. This guarantees that every record from every table will have a unique id. In the case of this application, all of my Person records were getting indexed by sphinx as (Person#id * 4 + 2).

Danger, Will Robinson: 32-bit int rollover!

The problem is that foxy fixtures generate their own ids from a hash of the fixture label, and those ids can be anywhere in the 32-bit unsigned integer space. But Sphinx also stores ids as 32-bit unsigned integers. This means if you happen to get a large fixture id, and then sphinx multiplies it by 4 (or whatever; it could be higher if you have more indexed models), your id will roll over and come out as (id * N + n) % (2^32). Sphinx will store that result, and then when it finds the record in a search, it will try to recreate the original id by subtracting n and dividing by N … giving you the wrong id. Your test will fail to find the record.
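
Here is the arithmetic as a small Ruby sketch. It just replays the formula described above with N = 4 and n = 2; it is not ultrasphinx’s actual code, and the method names are invented for the example. The large id is also hypothetical: I picked a value that happens to wrap around to the 328556765 from the error message.

```ruby
UINT32_MODULUS = 2**32

# Database id -> Sphinx document id, wrapping at 32 bits.
def sphinx_doc_id(record_id, model_count, model_offset)
  (record_id * model_count + model_offset) % UINT32_MODULUS
end

# Sphinx document id -> database id (the reverse step).
def record_id_from(doc_id, model_count, model_offset)
  (doc_id - model_offset) / model_count
end

small_id = 1_000         # fits comfortably, round-trips fine
large_id = 1_402_298_589 # hypothetical fixture id, larger than 2**30

[small_id, large_id].each do |id|
  doc_id    = sphinx_doc_id(id, 4, 2)
  recovered = record_id_from(doc_id, 4, 2)
  puts "id #{id} -> doc id #{doc_id} -> recovered #{recovered}"
end
# id 1000 -> doc id 4002 -> recovered 1000
# id 1402298589 -> doc id 1314227062 -> recovered 328556765
```

With N = 4, any id of 2^30 or more overflows, and the recovered id is off by a multiple of 2^30.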

Incidentally, this problem with foxy fixtures is why your test.base file needs the line sql_range_step = 999999999. Sphinx builds indexes by searching a few ids at a time. But the ids generated by foxy fixtures are so big that if sphinx only collects them in ranges of 5000 at a time, it will take forever to find them all.
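
To get a feel for the difference, here is a rough count of how many ranged queries it takes to walk the id space. This is a simplification that assumes sphinx steps from the smallest id to the largest in fixed-size chunks; the method name and the maximum id of 2^30 are made up for the example.

```ruby
# Approximate number of ranged queries needed to cover min_id..max_id.
def range_query_count(min_id, max_id, step)
  (max_id - min_id) / step + 1
end

min_id = 1
max_id = 2**30 # hypothetical largest fixture id

puts range_query_count(min_id, max_id, 5_000)       # 214749
puts range_query_count(min_id, max_id, 999_999_999) # 2
```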

After some googling, I found that these issues are discussed in a thread over at RubyForge.

The solution

I’m working on a plugin that monkeypatches foxy fixtures to create sequential, low-numbered IDs. In the meantime, you can just compile sphinx to support 64-bit ids, which should give you plenty of headroom to handle foxy fixture ids multiplied by N in sphinx*:

In your sphinx source directory:

./configure --enable-id64
make
sudo make install

That should do it. Let me know in comments if any of this information helped you.

*At least until you start approaching 2^32 models in your application, that is.

Installing Insoshi on a Dreamhost Account

May 17, 2008

With the Insoshi social networking platform rapidly gaining in popularity, I thought it might be useful to folks to know how to install it on the ever-popular Dreamhost shared account. If you need a Dreamhost account, please consider using the promo code “LRDESIGN” when you sign up. It will save you $50 on your first year of membership, and will help me with my site hosting expenses so I can keep this blog going.

So, there are still some possible drawbacks to this approach, but I was able to get a running install of Insoshi on my Dreamhost account with this sequence. You might want to read to the bottom of this post to learn about the difficulties with acts_as_ferret and Dreamhost before you commit to running your insoshi site on DH. Hopefully these problems will have a solution soon, and I’ll update this post if/when they do.

If you use this method and it works (or doesn’t!) please let me know in comments.

1) Set up a domain

Use the DH control panel to create a new fully hosted domain for your Insoshi site, for example yourdomain.com or insoshi.yourdomain.com. I will use “insoshi.yourdomain.com” through the rest of this post to indicate the domain that you want to use to run insoshi. I set up http://insoshi.lrdesign.com/ in the process of writing this post, but I can’t guarantee that it will stay up.

When you set up the domain:

  • Make sure that fastcgi support is selected.
  • Set “specify your web directory” to point to /home/username/insoshi.yourdomain.com/public/

2) Set up mysql databases

  1. In the dreamhost panel, select “Goodies -> Manage MySQL”
  2. Scroll down to “create a new mysql database”

I used these example settings:

  • database name: insoshi
  • use hostname: mysql.yourdomain.com (use “new hostname” to create this if you do not already have it)
  • New user: insoshi
  • New password:

If you want to run tests, you should create a second database called insoshi_test; you can leave the other settings the same.

3) Download the tarball of the current insoshi distribution:

cd
wget http://insoshi.com/home/tarball

You’ll get a tarball with a name like “insoshi-insoshi-e1fd8b8e440c9f3ab34161d4e87de78e956c1012.tar.gz”. Unzip the tarball and copy the contents to the directory you want the website to appear in:

tar xzf insoshi-insoshi-e1fd8b8e440c9f3ab34161d4e87de78e956c1012.tar.gz
cp -r insoshi-insoshi-e1fd8b8e440c9f3ab34161d4e87de78e956c1012/* insoshi.yourdomain.com/

4) Set up your database.yml file

cd ~/insoshi.yourdomain.com/config
cp database.example database.yml

Edit ~/insoshi.yourdomain.com/config/database.yml and make it look like the following, where <password> is the password you chose in step 2 and host is the MySQL hostname you set up there (mysql.lrdesign.com in my case):

development:
  adapter: mysql
  database: insoshi
  username: insoshi
  password: <password>
  host: mysql.lrdesign.com
  port: 3306

# Warning: The database defined as 'test' will be erased and
# re-generated from your development database when you run 'rake'.
# Do not set this db to the same as development or production.
test:
  adapter: mysql
  database: insoshi_test
  username: insoshi
  password: <password>
  host: mysql.lrdesign.com
  port: 3306

production:
  adapter: mysql
  database: insoshi
  username: insoshi
  password: <password>
  host: mysql.lrdesign.com
  port: 3306
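
Since the test database gets wiped, it’s worth double-checking that it really is a different database from the other two. Here’s a minimal Ruby sketch of that check; the test_db_isolated? helper is invented for this example, and the inline sample only mirrors the database names from the file above.

```ruby
require "yaml"

# True when the test database name differs from development and production.
def test_db_isolated?(config)
  test_db = config.fetch("test").fetch("database")
  %w[development production].none? do |env|
    config.key?(env) && config[env]["database"] == test_db
  end
end

# Inline sample, trimmed to the keys that matter for this check.
sample = YAML.safe_load(<<~CONFIG)
  development:
    database: insoshi
  test:
    database: insoshi_test
  production:
    database: insoshi
CONFIG

puts test_db_isolated?(sample) # true
```

To check your real file instead of the sample, load it with YAML.load_file("config/database.yml") and pass the result to the same method.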

5) Run the insoshi install script.

This will migrate the database and do some insoshi-specific setup. It’s also an excellent way to check that you’ve configured your database.yml correctly.

cd ~/insoshi.yourdomain.com
rake install

If it works, you should see a bunch of migrations (22 as of the current version of insoshi). If not, go back and figure out what’s wrong with your database.yml file. :)

6) Get rails working

These instructions are an adaptation of the instructions at the Dreamhost Wiki page about Rails.

Generate a dummy rails app and copy the dispatch scripts to your insoshi install:

cd ~
rails dummy
cp dummy/public/dispatch.* insoshi.yourdomain.com/public/

Edit the ~/insoshi.yourdomain.com/public/.htaccess file to enable fastCGI. Change the line with dispatch.cgi to look like this:

RewriteRule ^(.*)$ dispatch.fcgi [QSA,L]

Change the permissions on the public directory and the dispatch files:

cd ~/insoshi.yourdomain.com
chmod 755 public
chmod 755 public/dispatch.*  

I also had to make my log files writeable:

chmod a+w log/
chmod a+w log/*

7) Start the ferret server

I was not able to get insoshi to run in production mode on Dreamhost at first because in production mode it needs the ferret server (text search) to be running or it will refuse to load some of the models. (In test and development mode, acts_as_ferret will access the ferret databases directly, so this problem only appears in production). Based on this post, I found I could get it working by running:

script/ferret_server start -e production

At this point, I was able to load Insoshi in my browser.

Unfortunately, there are still some problems

I am pretty confident that running ferret_server this way will not be a long-term solution, because Dreamhost kills any processes that you leave running for more than a few hours. When ferret_server goes down it takes your site with it, so this will probably only keep your insoshi site up for a few hours before you have to restart ferret_server. This basically means there’s no good way to use the rails plugin acts_as_ferret on a DH shared account, and unfortunately Insoshi depends on AAF.

Possible Workarounds

You could try putting the startup command in a cron script, to restart ferret_server when Dreamhost kills it, for example:

0,15,30,45 * * * * cd ~/insoshi.yourdomain.com; script/ferret_server start -e production

would attempt to start the ferret server every fifteen minutes. But that might run afoul of Dreamhost server policies (does anyone know for sure?), and in any case your site would still be down in between the time DH killed the ferret_server process and your cron job started again.

You can also alter config/ferret_server.yml to have ferret treat production mode the same as development, directly accessing the ferret database and bypassing ferret_server entirely. However, if you get concurrent access with multiple users, you are very likely to get a corrupted ferret database with that approach.

Hopefully, some permanent solutions?

The Insoshi guys are working on replacing ferret with Sphinx, and that may be a permanent solution to this problem.

You may also want to consider lobbying Dreamhost to allow users to run persistent processes like ferret_server. If you are a Dreamhost subscriber you can vote for this feature by following this link to Dreamhost’s Policies Suggestions and voting for “Be able to run simple, persistent scripts!”.