
Full-Text Search for Ruby on Rails with Litesearch

Julian Rubisch


In this post, we'll turn to the last piece of the puzzle in LiteStack: Litesearch.

As an example, we will equip a prompts index page with a search bar to query a database for certain prompts. We will generate a couple of fake records to test our search functionality against.

Let's get to it!

What Is Litesearch?

Litesearch is a convenience wrapper built around FTS5, SQLite's virtual table-based full-text search module.

We'll dive into the mechanics a bit later. For now, we will assume that Litesearch is a Ruby module providing a simple API to perform text searches against a SQLite database. This works in standalone mode, but we will focus on the ActiveRecord integration, of course.

Configure the Prompt Model for Litesearch

The single addition we must make to our prompt model is a search index schema definition. To do this, we have to include the Litesearch::Model module in our model and call the litesearch class method to add fields to the schema:

diff
 class Prompt < ApplicationRecord
   include AccountScoped
+  include Litesearch::Model
   # ...
+  litesearch do |schema|
+    schema.fields [:title]
+  end
   # ...
 end

You can also target associations like so, and change the tokenizer used for indexing:

ruby
litesearch do |schema|
  schema.field :account_name, target: "accounts.name"
  schema.tokenizer :porter
end

Note: Currently, ActionText fields are not supported.

Let's quickly try this out in the Rails console:

ruby
> Current.account = Account.first
> Prompt.search("teddy")
  Prompt Load (7.4ms)  SELECT prompts.*, -prompts_search_idx.rank AS search_rank FROM "prompts" INNER JOIN prompts_search_idx ON prompts.id = prompts_search_idx.rowid AND rank != 0 AND prompts_search_idx MATCH 'teddy' WHERE "prompts"."account_id" = ? ORDER BY prompts_search_idx.rank  [["account_id", 1]]
=>
[#<Prompt:0x0000000105f80fb0
  id: 1,
  title: "A cute teddy bear",
  prompt_image: <...>,
  account_id: 1,
  created_at: Fri, 12 Jan 2024 10:47:08.604031000 UTC +00:00,
  updated_at: Fri, 12 Jan 2024 10:47:41.321896000 UTC +00:00,
  content_type: "image/jpeg",
  search_rank: 1.0e-06>]

Remember to set Current.account first. Our prompt model is scoped to an account, so without it, the search returns an empty result set.

Impressive! By changing only 4 lines of code, we already have a crude working version of full-text search.

Add a Typeahead Search Bar to Our Ruby on Rails Application

Next up, we'll combine a few of the techniques we've reviewed to implement snappy typeahead searching. Before we do that, though, let's generate more sample data. I will use the popular faker gem to do that:

sh
$ bundle add faker --group development
$ bin/rails console

Drop into a Rails console and create 50 sample prompts. I'm re-using the first prompt's image data here. Also, note that I'm again setting the Current.account first.

ruby
> Current.account = Account.first
* 50.times do
*   Prompt.create(title: "a #{Faker::Adjective.positive} #{Faker::Creature::Animal.name}", content_type: "image/png", prompt_image: Prompt.first.prompt_image)
> end

To prepare our user interface for reactive searching, we will wrap the prompts grid in a Turbo frame. This frame will be replaced every time the search query changes.

diff
 <!-- app/views/prompts/index.html.erb -->
 <h1>Prompts</h1>
+<%= turbo_frame_tag :prompts, class: "grid" do %>
-<div id="prompts" class="grid">
   <% @prompts.each do |prompt| %>
     <%= link_to prompt do %>
       <%= render "index", prompt: prompt %>
     <% end %>
   <% end %>
+<% end %>
-</div>
 <%= link_to "New prompt", new_prompt_path %>

The PromptsController needs to be updated to filter prompts if a query parameter is passed in:

diff
 # app/controllers/prompts_controller.rb
 class PromptsController < ApplicationController
   # ...
   def index
     @prompts = Prompt.all
+
+    @prompts = @prompts.search(params[:query]) if params[:query].present?
   end
   # ...
 end

Next, let's rig up the search bar in the prompt index view. For this, we'll use a Shoelace input component:

diff
 <!-- app/views/prompts/index.html.erb -->
 <h1>Prompts</h1>
+<section>
+  <sl-input name="search" type="search" placeholder="Search for a prompt title" clearable>
+    <sl-icon name="search" slot="suffix"></sl-icon>
+  </sl-input>
+</section>
 <%= turbo_frame_tag :prompts, class: "grid" do %>
   <% @prompts.each do |prompt| %>
     <%= link_to prompt do %>
       <%= render "index", prompt: prompt %>
     <% end %>
   <% end %>
 <% end %>
 <%= link_to "New prompt", new_prompt_path %>

To implement typeahead searching, we must add a bit of custom JavaScript to app/javascript/application.js:

diff
 // app/javascript/application.js
 // Entry point for the build script in your package.json
 import "@hotwired/turbo-rails";
 import "./controllers";
 import "trix";
 import "@rails/actiontext";
 import { setBasePath } from "@shoelace-style/shoelace";
 setBasePath("/");
+
+document
+  .querySelector("sl-input[name=search]")
+  .addEventListener("keyup", (event) => {
+    document.querySelector(
+      "#prompts"
+    ).src = `/prompts?query=${encodeURIComponent(event.target.value)}`;
+  });

This tiny JavaScript snippet does little more than place a keyup listener on our search field, and update the Turbo Frame's src attribute afterward. The input's value is added as the query parameter. Turbo Frame's default behavior performs the rest of the magic: reloading when the src attribute changes, with the updated content fetched from the server.


Excursus: Highlighting Search Results Using a Turbo Event in Rails

Unlike pg_search, Litesearch doesn't currently ship with a native highlighting solution, but it is pretty easy to build one ourselves using the turbo:before-frame-render event:

diff
 // Entry point for the build script in your package.json
 import "@hotwired/turbo-rails";
 import "./controllers";
 import "trix";
 import "@rails/actiontext";
 import { setBasePath } from "@shoelace-style/shoelace";
 setBasePath("/");
 document
   .querySelector("sl-input[name=search]")
   .addEventListener("keyup", (event) => {
     document.querySelector(
       "#prompts"
     ).src = `/prompts?query=${encodeURIComponent(event.target.value)}`;
   });
+document
+  .querySelector("turbo-frame#prompts")
+  .addEventListener("turbo:before-frame-render", (event) => {
+    // Pause rendering so we can modify the incoming frame first
+    event.preventDefault();
+
+    const newHTML = event.detail.newFrame.innerHTML;
+
+    const query = document.querySelector("sl-input[name=search]").value;
+    if (query) {
+      // Escape regex metacharacters so input like "(" or "." is matched literally
+      const escapedQuery = query.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
+      event.detail.newFrame.innerHTML = newHTML.replace(
+        new RegExp(`(${escapedQuery})`, "ig"),
+        "<em>$1</em>"
+      );
+    }
+
+    // Continue rendering with the modified frame
+    event.detail.resume();
+  });

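This leverages a nifty, somewhat hidden Turbo feature: intercepting rendering. The turbo:before-render and turbo:before-frame-render events support pausing rendering and modifying the HTML returned from the server before it is inserted. Here, we use this to wrap each occurrence of the search query in an <em> tag. Keep in mind that the replacement runs over the frame's raw HTML, so a query that happens to match a tag or attribute name would be wrapped as well.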
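If touching only the visible text matters, the same wrapping could also be done on the server while rendering each title. The following is a minimal plain-Ruby sketch of that idea, not something Litesearch provides: the helper name highlight_query is hypothetical, and it assumes the text is a plain string such as a prompt title.

```ruby
require "erb"

# Hypothetical helper (not part of Litesearch or Rails): wraps every
# case-insensitive occurrence of `query` in <em>, HTML-escaping all
# other content so user-supplied text cannot inject markup.
def highlight_query(text, query)
  query = query.to_s.strip
  return ERB::Util.html_escape(text) if query.empty?

  # Regexp.escape makes characters like "." or "(" match literally.
  pattern = /(#{Regexp.escape(query)})/i

  # Splitting on a pattern with a capture group keeps the matches in the
  # result: even indices are the text in between, odd indices are matches.
  text.split(pattern).each_with_index.map do |part, index|
    escaped = ERB::Util.html_escape(part)
    index.odd? ? "<em>#{escaped}</em>" : escaped
  end.join
end

puts highlight_query("A cute teddy bear", "TEDDY")
```

In a Rails view, the returned string would still have to be marked as HTML-safe before being rendered, which is only reasonable here because every fragment is escaped first.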

Under the Hood: Litesearch for Ruby on Rails Explained

We've covered the basics of activating and configuring Litesearch for your LiteStack-powered Ruby on Rails application. As you might have guessed, there's a lot more potential hidden here.

So let's briefly examine how Litesearch wraps around and leverages SQLite's built-in full-text search module, FTS5.

Virtual Tables in SQLite

First, let's discuss the notion of virtual tables in SQLite. Since there's no direct counterpart in the PostgreSQL or MySQL realm, it pays off to learn about these.

From the vantage point of a user issuing an SQL statement against the database, a virtual table is a transparent proxy that adheres to the interface of a table. In the background, however, every query or update invokes a callback method of the virtual table object instead of reading from or writing to the database file.

In short, a virtual table is something you reach for when you want to access "foreign" data without leaving the domain of your database connection. Apart from full-text search, other examples include geospatial indices or accessing a different file format, such as CSV.

SQLite's FTS5 Full-Text Search Extension

At its core, SQLite's full-text search engine is a virtual table.
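To build some intuition for what such a table does behind its interface, here is a toy in-memory analogy in plain Ruby. It is only a sketch: the class ToySearchIndex and its insert and match methods are hypothetical names, and the real FTS5 index structure and ranking function are far more sophisticated. The point is that writes update an index of tokens rather than storing rows, and reads are answered from that index.

```ruby
# Toy analogy for a contentless full-text index (not SQLite's actual API).
class ToySearchIndex
  def initialize
    # token => { rowid => number of occurrences in that row }
    @postings = Hash.new { |hash, token| hash[token] = Hash.new(0) }
  end

  # "Write callback": the text itself is not stored, only its tokens.
  def insert(rowid, text)
    tokenize(text).each { |token| @postings[token][rowid] += 1 }
  end

  # "Read callback": rowids containing every query token, ordered by the
  # total number of occurrences (a crude stand-in for a real rank).
  def match(query)
    tokens = tokenize(query)
    return [] if tokens.empty?

    rowids = tokens.map { |token| @postings.fetch(token, {}).keys }.reduce(:&)
    rowids.sort_by do |rowid|
      [-tokens.sum { |token| @postings[token][rowid] }, rowid]
    end
  end

  private

  # Very rough tokenizer: lowercase, split on anything non-alphanumeric.
  def tokenize(text)
    text.downcase.scan(/[[:alnum:]]+/)
  end
end

index = ToySearchIndex.new
index.insert(1, "A cute teddy bear")
index.insert(2, "A grumpy bear chasing a teddy bear")
index.insert(3, "A brilliant otter")

p index.match("teddy")
p index.match("bear teddy")
p index.match("otter bear")
```

Because only rowids come back from the index, the caller still has to join them to the real table to get at the records, which is what Litesearch's search method does with prompts.id and the index's rowid.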

The table definition used by Litesearch in ActiveRecord mode looks like this:

ruby
"CREATE VIRTUAL TABLE #{name} USING FTS5(#{col_names}, content='', contentless_delete=1, tokenize='#{tokenizer_sql}')"

name is the index name (it defaults to "#{table_name}_search_idx"), and col_names are the fields we set in our Litesearch schema definition.

We will now briefly look at tokenizers.

Tokenizers

To allow for efficient indexing, a full-text search engine employs a helper utility to split the payload into tokens: a tokenizer. FTS5 ships with several built-in tokenizers, including the following three (newer SQLite versions also add a trigram tokenizer):

  • unicode61 (default): Punctuation and whitespace characters (e.g. ",", ".") are treated as separators. Text is split at those characters, and the resulting runs of connected characters (usually words) become the tokens. In the wild, you might also encounter the remove_diacritics option, which controls how marks added to letters, as in "á" or "à", are handled. By default, these diacritics are removed, so "á", "à", and "a" are regarded as equivalent.
  • ascii: Similar to unicode61, but all non-ASCII characters are always considered token characters, and case folding only applies to ASCII letters. There is no remove_diacritics option.
  • porter: A wrapper tokenizer that applies Porter stemming to the tokens produced by an underlying tokenizer (unicode61 by default). This essentially means that different inflections of a word match each other, i.e., "search", "searched", and "searching" will be considered related.
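The difference between the first two tokenizers is easiest to see on a string containing accented characters. The following plain-Ruby functions are rough approximations written for illustration only; their names are hypothetical, and the real tokenizers are implemented in C inside SQLite and cover many more cases. Stemming is left out, since a faithful Porter implementation would not fit in a short sketch.

```ruby
# unicode61-like behaviour (approximation): fold case, strip diacritics,
# and split on everything that is not a letter or a digit.
def unicode61_like_tokens(text)
  text
    .unicode_normalize(:nfd) # decompose "á" into "a" plus a combining accent
    .gsub(/\p{Mn}/, "")      # drop the combining marks, i.e. the diacritics
    .downcase
    .scan(/[[:alnum:]]+/)
end

# ascii-like behaviour (approximation): only ASCII letters are case-folded,
# ASCII punctuation and whitespace separate tokens, and every non-ASCII
# character counts as a token character.
def ascii_like_tokens(text)
  text.downcase(:ascii).scan(/(?:[a-z0-9]|[^[:ascii:]])+/)
end

sample = "Cr\u00e8me br\u00fbl\u00e9e, s'il vous pla\u00eet!"

p unicode61_like_tokens(sample)
p ascii_like_tokens(sample)
```

With the first function, a search for "creme" and one for "crème" end up as the same token, whereas the second keeps the accented characters as they are.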

FTS5 Search Interface

To enable a convenient experience, Litesearch exposes a search class method. Essentially, this method joins the model's table to the associated search index and issues a MATCH query. Results are then ordered according to the search rank and returned:

ruby
def search(term)
  self.select(
    "#{table_name}.*"
  ).joins(
    "INNER JOIN #{index_name} ON #{table_name}.id = #{index_name}.rowid AND rank != 0 AND #{index_name} MATCH ", Arel.sql("'#{term}'")
  ).select(
    "-#{index_name}.rank AS search_rank"
  ).order(
    Arel.sql("#{index_name}.rank")
  )
end

Currently, Litesearch doesn't expose more of FTS5's search syntax, but you can learn more about it in FTS5's documentation.
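One practical consequence of the method as quoted above is that the term ends up inside the MATCH expression as-is, so characters with a meaning in FTS5's query syntax (double quotes, "*", "-", column filters such as "title:") are interpreted rather than searched for. A defensive option is to turn the raw input into a list of quoted FTS5 strings first. The two helpers below are a hypothetical sketch and not part of Litesearch. They rely only on two documented quoting rules: FTS5 strings are wrapped in double quotes with embedded double quotes doubled, and SQL string literals double any embedded single quotes.

```ruby
# Hypothetical helper: turn free-form user input into an FTS5 query in
# which every whitespace-separated word is a quoted string. Quoted strings
# separated by whitespace are combined with an implicit AND.
def fts5_query(input)
  input
    .split
    .map { |word| '"' + word.gsub('"', '""') + '"' }
    .join(" ")
end

# Hypothetical helper: embed a value in a single-quoted SQL string literal.
def sql_string_literal(value)
  "'" + value.gsub("'", "''") + "'"
end

puts fts5_query("teddy bear")
puts fts5_query('say "hi"')
puts sql_string_literal(fts5_query("rock'n'roll"))
```

Blank input yields an empty string here, so a caller would still want to skip the search in that case, as the controller above does with params[:query].present?.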

Wrapping Up

This concludes our series on LiteStack. In this post, we discovered Litesearch, the full-text search engine built into LiteStack. We learned how to configure an ActiveRecord model to expose search fields and other options to an SQLite text search index.

We then flexed our Hotwire muscles to build a simple reactive search interface into our UI.

Finally, we explored some of the inner workings of full-text search in SQLite to get a better understanding of what powers it, its benefits, and its limitations.

Happy coding!



Our guest author Julian is a freelance Ruby on Rails consultant based in Vienna, specializing in Reactive Rails. Part of the StimulusReflex core team, he has been at the forefront of developing cutting-edge HTML-over-the-wire technology since 2020.
