Vector Database with Rails using pgvector

August 13, 2023

Ruby on Rails, Vector Database, Machine Learning, Search Algorithms, pgvector, postgres

Imagine a system that not only stores data but grasps its very essence. A system that can swiftly navigate through myriad data points to pinpoint the most relevant. That’s the power of vector databases. They’re tailor-made for QA systems, acting as memory boosters for large language models. Their ability to link questions to relevant answers showcases their pivotal role in modern AI. I’m going to show you how to set up your very own vector database using Ruby on Rails so you can ask your data questions!

What is a Vector Database? #

Databases have been our go-to solution for storing and retrieving information for decades. Traditional systems like PostgreSQL have even ventured into full text search capabilities, adeptly handling keyword searches. Tools like Elasticsearch emerged, scaling these keyword search capabilities for larger datasets. But as powerful as they are, they still operate within the confines of exact keyword matching and predefined criteria.

Vector databases are designed to handle multi-dimensional data vectors. They go beyond the static record treatment typical of traditional databases, viewing data as dynamic points in multi-dimensional spaces. The beauty lies in their querying mechanism: rather than seeking exact matches, they find data points that are “close” in this vast space. This nuance makes them perfect for tasks like similarity searches.
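
To make “close” concrete, here is a minimal plain-Ruby sketch of a similarity search over a handful of toy three-dimensional vectors. The vectors, labels, and helper names (cosine_similarity, nearest) are made up for illustration; a real vector database performs the same kind of ranking over far more dimensions, usually with an index instead of a linear scan.

```ruby
# Toy "embeddings": hand-picked 3-dimensional vectors, not real model output.
DOCUMENTS = {
  "fox"     => [0.9, 0.1, 0.0],
  "bear"    => [0.8, 0.3, 0.1],
  "journey" => [0.0, 0.2, 0.9]
}.freeze

# Cosine similarity: values near 1.0 mean the vectors point the same way,
# values near 0.0 mean they have little in common.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  dot / (norm.call(a) * norm.call(b))
end

# Rank every stored label by similarity to the query vector (linear scan).
def nearest(query, limit: 2)
  DOCUMENTS
    .sort_by { |_label, vector| -cosine_similarity(query, vector) }
    .first(limit)
    .map(&:first)
end

puts nearest([1.0, 0.2, 0.0]).inspect
```

The query vector points in roughly the same direction as the "fox" and "bear" vectors, so those two labels rank ahead of "journey" even though no value matches exactly.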

Use Cases: #

There are certain scenarios where vector databases outshine their traditional counterparts:

Similarity Search: Whether it’s finding similar images, documents, or products, vector databases excel in tasks that require identifying patterns and similarities within data.

Semantic Search: For applications that require understanding context rather than exact keywords, like searching for articles or papers based on abstract ideas, vector databases make your search more intuitive and relevant.

Machine Learning Model Integration: With AI models churning out vectorized outputs, integrating these with vector databases allows for seamless storage, retrieval, and further analysis.

Choosing Your Vector Database: #

There are a few popular choices for warehousing vectors: Chroma DB, Weaviate, and the combination of Postgres with the pgvector extension. Each has its unique strengths, but my personal favorite, and the one I’ll be using, is pgvector. It integrates with my current infrastructure, scales with Postgres, and I don’t have to manage yet another database.

Setting up Rails with pgvector #

Prerequisites: #

Ruby on Rails Environment: Ensure you’ve got Rails installed.
PostgreSQL Setup: You’ll need PostgreSQL installed since we’re leveraging the pgvector extension.
Familiarity with Database Operations: Basic understanding of migrations and Active Record will be beneficial.

Setting Up: #

Step 1: New Rails Project

If you’re starting from scratch, initiate a new Rails project:

rails new vector_project -d postgresql

Step 2: Incorporate the Neighbor Gem

The neighbor gem makes working with vectors in Rails a breeze.
The ruby-openai gem lets us generate embeddings and talk directly to the Chat API.
The dotenv-rails gem lets us use a .env file to set up our API key in development/test environments. You might use something different.
The tiktoken_ruby gem is optional but becomes a lifesaver in production environments where you need to be mindful of your tokens.

Add them to your Gemfile:

gem 'neighbor'
gem 'ruby-openai'
gem 'dotenv-rails'
gem 'tiktoken_ruby'

Run bundle install to finalize the installation.

Step 3: Integrate pgvector

With PostgreSQL in place, we’ll integrate the pgvector extension:

rails generate migration AddPgVectorExtension

In the generated migration file:

def change
  enable_extension 'vector'
end

Step 4: Setting up Environment Variables

Create a .env file in your root directory.
Add your OpenAI API key:

OPENAI_API_KEY=your_api_key_here

Remember to add .env to .gitignore to keep your credentials safe from prying eyes.

Step 5: Configure Neighbor and OpenAI

In an initializer (config/initializers/openai.rb), configure OpenAI settings:

# The ruby-openai gem defines the OpenAI constant and calls the key access_token
OpenAI.configure do |config|
  config.access_token = ENV.fetch("OPENAI_API_KEY")
end
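
Using ENV.fetch in the initializer rather than ENV[] is a deliberate choice: a missing OPENAI_API_KEY stops the app at boot instead of surfacing later as an authentication error. The small sketch below contrasts the two lookups; DEMO_KEY, lenient_lookup, and strict_lookup are hypothetical names used only for this illustration.

```ruby
# Hypothetical variable name, chosen so that it is certainly unset.
DEMO_KEY = "DEMO_UNSET_OPENAI_KEY"

# ENV[] quietly returns nil for a missing variable.
def lenient_lookup(name)
  ENV[name]
end

# ENV.fetch raises KeyError for a missing variable; here the error is
# rescued only so the sketch can print what happened.
def strict_lookup(name)
  ENV.fetch(name)
rescue KeyError => e
  "boot would stop here: #{e.class}"
end

puts lenient_lookup(DEMO_KEY).inspect
puts strict_lookup(DEMO_KEY)
```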

Step 6: Set Up OpenAI Service Module

I like making a service for interactions with OpenAI. Here’s how to set it up:

Create a new file in the services directory of your Rails application. If the directory doesn’t exist, you can create it:

mkdir app/services
touch app/services/openai_service.rb

Copy and paste the provided OpenaiService module into openai_service.rb.

# frozen_string_literal: true

# The OpenaiService module enables interaction with OpenAI models to generate text responses and create text embeddings.
module OpenaiService
  # Sends a chat request. When a block is given, it is used as the streaming
  # handler and the raw response is returned.
  def self.chat(parameters, &block)
    client = OpenAI::Client.new

    parameters[:stream] = block if block_given?

    response = client.chat(parameters: parameters)

    return response if block_given?

    prompt_tokens = response.dig('usage', 'prompt_tokens')
    completion_tokens = response.dig('usage', 'completion_tokens')
    total_tokens = response.dig('usage', 'total_tokens')
    content = response.dig('choices', 0, 'message', 'content')

    {
      content:,
      prompt_tokens:,
      completion_tokens:,
      total_tokens:
    }
  end

  # Returns the embedding vector for the given text along with token usage.
  def self.embeddings(data)
    return if data.empty?

    client = OpenAI::Client.new

    response = client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: data
      }
    )

    prompt_tokens = response.dig('usage', 'prompt_tokens')
    total_tokens = response.dig('usage', 'total_tokens')
    embedding = response.dig('data', 0, 'embedding')

    raise 'Empty embedding' if embedding.nil?

    {
      embedding:,
      prompt_tokens:,
      total_tokens:
    }
  end
end

This module provides two key methods: chat for talking to the OpenAI Chat API, and embeddings to generate vector embeddings of given text using the Text Embedding Ada 002 model.
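
The dig calls in the embeddings method are easier to follow against a concrete payload. The hash below is a hand-written stand-in shaped like an embeddings response (the values are invented, and a real text-embedding-ada-002 vector has 1536 numbers rather than three). parse_embedding_response is a hypothetical helper that mirrors the extraction logic in the module without calling the API.

```ruby
# Hand-written stand-in for an embeddings API response; values are invented.
SAMPLE_RESPONSE = {
  "data" => [
    { "index" => 0, "embedding" => [0.01, -0.02, 0.03] }
  ],
  "usage" => { "prompt_tokens" => 9, "total_tokens" => 9 }
}.freeze

# Hypothetical helper mirroring the extraction done in OpenaiService.embeddings.
def parse_embedding_response(response)
  embedding = response.dig("data", 0, "embedding")
  # dig returns nil when any key along the path is missing, for example
  # when the payload carries an "error" key instead of "data".
  raise "Empty embedding" if embedding.nil?

  {
    embedding: embedding,
    prompt_tokens: response.dig("usage", "prompt_tokens"),
    total_tokens: response.dig("usage", "total_tokens")
  }
end

result = parse_embedding_response(SAMPLE_RESPONSE)
puts result[:embedding].length
puts result[:total_tokens]
```

Because dig never raises on a missing key, the explicit nil check is what turns a malformed or error payload into a visible failure.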

Step 7: Generate the Chunk Scaffold

To quickly generate the required MVC components for our Chunk model, we’ll use the Rails scaffold generator:

rails generate scaffold Chunk data:string embedding:vector token_count:integer

Step 8: Apply the Migration

Now, let’s modify the generated migration. Here’s how the updated migration should look:

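class CreateChunks < ActiveRecord::Migration[6.1]
  def change
    create_table :chunks, id: :uuid do |t|
      t.timestamps null: false, default: -> { 'NOW()' }, index: true

      t.string :data, null: false
      # neighbor takes the vector size as limit:, and 1536 matches text-embedding-ada-002
      t.vector :embedding, limit: 1536
      t.integer :token_count, null: false

      # using: and opclass: are index options, so they belong on the index
      t.index :embedding, using: :ivfflat, opclass: :vector_ip_ops
    end
  end
end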

Run the migration:

rails db:migrate

Step 9: Model Configuration

After creating the scaffold, navigate to the app/models/chunk.rb file.

Add Associations and Scopes:

The has_neighbors :embedding line allows us to perform vector-based search operations on the Chunk model based on the embedding field.

The scopes :with_embedding and :without_embedding let us filter records that have or don’t have vector embeddings set, respectively.

class Chunk < ApplicationRecord
  has_neighbors :embedding

  scope :with_embedding, -> { where.not(embedding: nil) }
  scope :without_embedding, -> { where(embedding: nil) }
end

Add Validations:

These validations ensure that both the data and token_count fields have values before the record is saved:

validates :data, presence: true
validates :token_count, presence: true

Add Instance Methods:

The nearest method is a convenience wrapper that returns the nearest neighbors of the current record based on the embedding field. It uses inner product as the distance metric, which suits unit-length vectors such as OpenAI embeddings.

The set_token_count method calculates the token count for the data field before the record is created. This tells us how many tokens the data would consume if it were passed to OpenAI models like GPT-4.

def nearest
  nearest_neighbors(:embedding, distance: :inner_product)
end

def set_token_count
  self.token_count = Tiktoken.encoding_for_model("gpt-4").encode(data).length
end

Add Callback:

Set the token count before validation when a new record is being created:

before_validation :set_token_count, on: :create

Final Model:

Once you add all the modifications, your Chunk model will look like:

class Chunk < ApplicationRecord
  has_neighbors :embedding

  scope :with_embedding, -> { where.not(embedding: nil) }
  scope :without_embedding, -> { where(embedding: nil) }

  validates :data, presence: true
  validates :token_count, presence: true

  before_validation :set_token_count, on: :create

  def nearest
    nearest_neighbors(:embedding, distance: :inner_product)
  end

  def set_token_count
    self.token_count = Tiktoken.encoding_for_model("gpt-4").encode(data).length
  end
end

Note: The docs for the neighbor gem mention that it is best to use inner_product for OpenAI embeddings.
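
The reason inner product works well here is that OpenAI embeddings are normalized to length 1, and for unit-length vectors the inner product equals the cosine similarity while skipping the division by the norms. The plain-Ruby sketch below checks this on made-up vectors; normalize, dot, and cosine are illustrative helpers, not part of neighbor or pgvector.

```ruby
# Scale a vector so that its length is 1.
def normalize(v)
  length = Math.sqrt(v.sum { |x| x * x })
  v.map { |x| x / length }
end

# Inner (dot) product of two vectors of equal size.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Cosine similarity: the dot product divided by both vector lengths.
def cosine(a, b)
  dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
end

a = normalize([3.0, 4.0, 0.0])
b = normalize([1.0, 2.0, 2.0])

# For unit-length vectors the two numbers agree up to rounding error.
puts dot(a, b)
puts cosine(a, b)
```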

Step 10: Creating the Embeddings Job

In order to offload the task of syncing embeddings from OpenAI, we’ll utilize ActiveJob. This will ensure our application remains responsive even when waiting for the OpenAI response.

Generate the job:

rails generate job SyncEmbeddings

In the generated file (app/jobs/sync_embeddings_job.rb):

class SyncEmbeddingsJob < ApplicationJob
  queue_as :default

  # chunk_id is optional: without it the job only backfills missing embeddings.
  def perform(chunk_id = nil)
    if chunk_id
      chunk = Chunk.find(chunk_id)
      embedding = OpenaiService.embeddings(chunk.data)
      chunk.update!(embedding: embedding[:embedding])
    end

    # find_each loads records in batches instead of all at once
    Chunk.without_embedding.find_each do |chunk|
      embedding = OpenaiService.embeddings(chunk.data)
      chunk.update!(embedding: embedding[:embedding])
    end
  end
end

This job uses our OpenaiService to sync embeddings. If you pass a chunk_id that isn’t nil, it syncs that chunk’s embedding first. Either way, it then syncs every chunk that still lacks an embedding.
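
The job’s control flow can be exercised without Rails, a database, or the network. The sketch below is a plain-Ruby stand-in: FakeChunk, FAKE_EMBEDDER, and sync_embeddings are hypothetical names invented for this illustration, and the fake embedder simply returns the text length as a one-element vector.

```ruby
# In-memory stand-in for the Chunk model; not an Active Record class.
FakeChunk = Struct.new(:id, :data, :embedding)

# Stand-in for OpenaiService.embeddings: same hash shape, with the text
# length as a one-element "vector".
FAKE_EMBEDDER = ->(text) { { embedding: [text.length.to_f] } }

# Mirrors the job: sync the requested chunk first, then backfill the rest.
def sync_embeddings(chunks, embedder, chunk_id = nil)
  if chunk_id
    chunk = chunks.find { |c| c.id == chunk_id }
    raise KeyError, "no chunk with id #{chunk_id}" if chunk.nil?

    chunk.embedding = embedder.call(chunk.data)[:embedding]
  end

  # Only chunks that still lack an embedding are touched here.
  chunks.select { |c| c.embedding.nil? }.each do |c|
    c.embedding = embedder.call(c.data)[:embedding]
  end
  chunks
end

chunks = [FakeChunk.new(1, "fox", nil), FakeChunk.new(2, "bear", [0.5])]
sync_embeddings(chunks, FAKE_EMBEDDER)
puts chunks.map(&:embedding).inspect
```

Chunk 2 keeps its existing vector because only records without an embedding are backfilled, while passing its ID explicitly would recompute it.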

Step 11: Populating the Database and Testing Vector Searches

Using Rails Console:

Open the Rails Console with the following command:

rails c

Create Some Chunks:

Now, let’s create some Chunk records with random data:

Chunk.create(data: "The quick brown fox jumps over the lazy dog.")
Chunk.create(data: "The quick brown bear jumps over the crazy cat.")
Chunk.create(data: "Lorem ipsum dolor sit amet, consectetur adipiscing elit.")
Chunk.create(data: "A journey of a thousand miles begins with a single step.")
Chunk.create(data: "The quick brown fox takes a nap.")

Here, we’ve created five Chunk records.

Run the Sync Job:

Before you can search for similar vectors, the embeddings need to be populated. This is where the sync job comes in: it requests an embedding from OpenAI for each Chunk that lacks one and saves it. Passing nil as the chunk_id tells it to sync every chunk without an embedding:

SyncEmbeddingsJob.perform_now(nil)

Check that the embedding field is now populated for each Chunk, for example by confirming that Chunk.without_embedding.count returns 0.

Find Nearest Neighbors:

Now, let’s retrieve the nearest neighbors for the first Chunk record:

chunk = Chunk.first
nearest_chunks = chunk.nearest

nearest_chunks will contain the other Chunk records sorted by similarity (based on the embedding field) to the first Chunk. Since two other records share the “quick brown” phrasing with it, you should see them at the top of this list.

Inspecting the Results:

You can print out the data from the nearest neighbors for easier inspection:

nearest_chunks.each { |c| puts c.data }

This should display the content of the chunks, starting with the ones most similar to the first chunk.

Advanced Features #

I was going to add a section on talking directly to ChatGPT, feeding your nearest neighbors into the prompt as context, and updating your UI in real time with Hotwire, but this post got kind of long. If you want me to add a section on it, or you ran into any issues getting this code to work, hit me up on LinkedIn and I might consider adding it. It should be fairly straightforward with this foundation, though!
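
As a rough starting point for the context-feeding idea, here is a plain-Ruby sketch of how retrieved chunks might be packed into chat messages under a token budget. ContextChunk, build_messages, the budget value, and the system prompt wording are all assumptions made up for this illustration; it also assumes the chunks arrive sorted most-similar first, as nearest returns them.

```ruby
# Stand-in for Chunk records; only the two fields used below.
ContextChunk = Struct.new(:data, :token_count)

# Keep adding chunks until the token budget is spent, then build chat
# messages around the selected context.
def build_messages(question, chunks, token_budget: 50)
  used = 0
  context = chunks.take_while do |chunk|
    used += chunk.token_count
    used <= token_budget
  end

  [
    { role: "system", content: "Answer using only this context:\n" + context.map(&:data).join("\n") },
    { role: "user", content: question }
  ]
end

chunks = [
  ContextChunk.new("The quick brown fox jumps over the lazy dog.", 10),
  ContextChunk.new("The quick brown fox takes a nap.", 12),
  ContextChunk.new("A journey of a thousand miles begins with a single step.", 9)
]

messages = build_messages("What does the fox do?", chunks, token_budget: 25)
puts messages.first[:content]
```

The resulting array could then be passed along the lines of OpenaiService.chat({ model: "gpt-4", messages: messages }), where the model name is only an example.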

Conclusion #

Using vector databases with Rails has unlocked powerful, dynamic data handling capabilities. When combined with tools like pgvector and OpenAI, the results are nothing short of impressive.