Vector Database with Rails using pgvector
August 13, 2023
Imagine a system that not only stores data but grasps its very essence. A system that can swiftly navigate through myriad data points to pinpoint the most relevant. That’s the power of vector databases. They’re tailor-made for QA systems, acting as memory boosters for large language models. Their ability to link questions to the most relevant answers showcases their pivotal role in modern AI. I’m going to show you how you can set up your very own vector database using Ruby on Rails so you can ask your data questions!
What is a Vector Database? #
Databases have been our go-to solution for storing and retrieving information for decades. Traditional systems like PostgreSQL have even ventured into full text search capabilities, adeptly handling keyword searches. Tools like Elasticsearch emerged, scaling these keyword search capabilities for larger datasets. But as powerful as they are, they still operate within the confines of exact keyword matching and predefined criteria.
Vector databases are designed to handle multi-dimensional data vectors. They go beyond the static record treatment typical of traditional databases, viewing data as dynamic points in multi-dimensional spaces. The beauty lies in their querying mechanism: rather than seeking exact matches, they find data points that are “close” in this vast space. This nuance makes them perfect for tasks like similarity searches.
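To make “close” concrete, here is a minimal pure-Ruby sketch of a similarity search. The three-dimensional vectors and their labels are made up for illustration (real embeddings have hundreds or thousands of dimensions), and cosine similarity is just one of several possible closeness measures.

```ruby
# Toy 3-dimensional "embeddings"; the values are invented for this example.
DOCS = {
  "cats purr"     => [0.9, 0.1, 0.0],
  "dogs bark"     => [0.8, 0.3, 0.1],
  "stock markets" => [0.0, 0.2, 0.9]
}.freeze

# Sum of pairwise products of two equal-length vectors.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Cosine of the angle between two vectors: 1.0 means same direction.
def cosine_similarity(a, b)
  dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
end

# Return document labels ordered from most to least similar to the query.
def nearest(query, docs)
  docs.sort_by { |_label, vec| -cosine_similarity(query, vec) }.map(&:first)
end

puts nearest([1.0, 0.0, 0.0], DOCS).inspect
```

A vector database does the same ranking, but with indexes that avoid comparing the query against every stored vector.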
Use Cases: #
There are certain scenarios where vector databases outshine their traditional counterparts:
Similarity Search: Whether it’s finding similar images, documents, or products, vector databases excel in tasks that require identifying patterns and similarities within data.
Semantic Search: For applications that require understanding context rather than exact keywords, like searching for articles or papers based on abstract ideas, vector databases make your search more intuitive and relevant.
Machine Learning Model Integration: With AI models churning out vectorized outputs, integrating these with vector databases allows for seamless storage, retrieval, and further analysis.
Choosing Your Vector Database: #
There are a few popular choices for warehousing vectors: Chroma DB, Weaviate, and the combination of Postgres with the pgvector extension. Each has its unique strengths, but my personal favorite, and the one I’ll be using, is pgvector. It integrates with my current infrastructure, scales well, and I don’t have to manage yet another database.
Setting up Rails with pgvector #
Prerequisites: #
- Ruby on Rails Environment: Ensure you’ve got Rails installed.
- PostgreSQL Setup: You’ll need PostgreSQL installed since we’re leveraging the pgvector extension.
- Familiarity with Database Operations: Basic understanding of migrations and Active Record will be beneficial.
Setting Up: #
Step 1: New Rails Project
If you’re starting from scratch, initiate a new Rails project:
rails new vector_project -d postgresql
Step 2: Incorporate the Neighbor Gem
- The neighbor gem makes working with vectors in Rails a breeze.
- The ruby-openai gem lets us generate embeddings and talk directly to the Chat API.
- dotenv-rails lets us use a .env file to set up our API key in development/test environments. You might use something different.
- tiktoken_ruby is optional but becomes a lifesaver in production environments where you need to be mindful of your tokens.
Add them to your Gemfile:
gem 'neighbor'
gem 'ruby-openai'
gem 'dotenv-rails'
gem 'tiktoken_ruby'
Run bundle install to finalize the installation.
Step 3: Integrate pgvector
With PostgreSQL in place, we’ll integrate the pgvector extension:
rails generate migration AddPgVectorExtension
In the generated migration file:
def change
  enable_extension 'vector'
end
Step 4: Setting up Environment Variables
- Create a .env file in your root directory.
- Add your OpenAI API key:
OPENAI_API_KEY=your_api_key_here
Remember to add .env to .gitignore to keep your credentials safe from prying eyes.
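Step 5: Configure OpenAI
In an initializer (config/initializers/openai.rb), configure the OpenAI client. Neighbor needs no configuration of its own:
OpenAI.configure do |config|
  # ruby-openai calls this setting access_token; the value is still your API key.
  config.access_token = ENV.fetch("OPENAI_API_KEY")
end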
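The initializer reads the key with ENV.fetch rather than ENV[]. A small sketch of the difference, using a hypothetical variable name that is deliberately left unset:

```ruby
# Hypothetical variable name, removed here so the demo behaves the same everywhere.
name = "DEMO_OPENAI_KEY_NOT_SET"
ENV.delete(name)

# ENV[] returns nil silently, so the problem only surfaces at request time.
silent = ENV[name]

# ENV.fetch with a second argument returns an explicit default instead.
fallback = ENV.fetch(name, "placeholder")

# ENV.fetch without a default raises KeyError as soon as the app boots.
error = begin
  ENV.fetch(name)
  nil
rescue KeyError => e
  e
end

puts silent.inspect
puts fallback
puts error.class
```

Failing at boot is usually easier to debug than a nil key that only causes trouble on the first API call.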
Step 6: Set Up OpenAI Service Module
I like making a service for interactions with OpenAI. Here’s how to set it up:
- Create a new file in the services directory of your Rails application. If the directory doesn’t exist, you can create it:
mkdir app/services
touch app/services/openai_service.rb
- Copy and paste the provided OpenaiService module into openai_service.rb.
# frozen_string_literal: true

# The OpenaiService module enables interaction with OpenAI models to generate text responses and create text embeddings.
module OpenaiService
  def self.chat(parameters, &block)
    client = OpenAI::Client.new
    # When a block is given, stream the response chunks to it.
    parameters[:stream] = block if block_given?
    response = client.chat(
      parameters: parameters
    )
    return response if block_given?

    prompt_tokens = response.dig('usage', 'prompt_tokens')
    completion_tokens = response.dig('usage', 'completion_tokens')
    total_tokens = response.dig('usage', 'total_tokens')
    content = response.dig('choices', 0, 'message', 'content')
    {
      content:,
      prompt_tokens:,
      completion_tokens:,
      total_tokens:
    }
  end

  def self.embeddings(data)
    return if data.empty?

    client = OpenAI::Client.new
    response = client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: data
      }
    )
    prompt_tokens = response.dig('usage', 'prompt_tokens')
    total_tokens = response.dig('usage', 'total_tokens')
    embedding = response.dig('data', 0, 'embedding')
    raise 'Empty embedding' if embedding.nil?

    {
      embedding:,
      prompt_tokens:,
      total_tokens:
    }
  end
end
This module provides two key methods: chat for talking to the OpenAI Chat API, and embeddings to generate vector embeddings of given text using the Text Embedding Ada 002 model.
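To see what the embeddings method pulls out of a response, here is a sketch that runs the same dig calls against a hand-written hash. The sample hash is an assumption: it is trimmed to the keys the service reads, the numbers are invented, and a real embedding has 1536 values. The extract_embedding helper is a hypothetical name used only for this example.

```ruby
# Hypothetical, trimmed-down response in the shape the service reads.
sample_response = {
  "data" => [{ "index" => 0, "embedding" => [0.01, -0.02, 0.03] }],
  "usage" => { "prompt_tokens" => 5, "total_tokens" => 5 }
}

# Mirrors the parsing done in OpenaiService.embeddings, without any API call.
def extract_embedding(response)
  embedding = response.dig("data", 0, "embedding")
  # dig returns nil when any key along the path is missing, e.g. on an error response.
  raise "Empty embedding" if embedding.nil?

  {
    embedding: embedding,
    prompt_tokens: response.dig("usage", "prompt_tokens"),
    total_tokens: response.dig("usage", "total_tokens")
  }
end

result = extract_embedding(sample_response)
puts result[:embedding].length
puts result[:total_tokens]
```

Because dig never raises on a missing key, the explicit nil check is what turns an error response into a visible failure.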
Step 7: Generate the Chunk Scaffold
To quickly generate the required MVC components for our Chunk model, we’ll use the Rails scaffold generator:
rails generate scaffold Chunk data:string embedding:vector token_count:integer
Step 8: Apply the Migration
Now, let’s modify the generated migration. Here’s how the updated migration should look:
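class CreateChunks < ActiveRecord::Migration[6.1]
  def change
    create_table :chunks, id: :uuid do |t|
      t.timestamps null: false, default: -> { 'NOW()' }, index: true
      t.string :data, null: false
      # neighbor sets the number of dimensions with limit; ada-002 embeddings have 1536.
      t.vector :embedding, limit: 1536
      t.integer :token_count, null: false
    end
    # The index method and operator class are index options, so they go on add_index.
    add_index :chunks, :embedding, using: :ivfflat, opclass: :vector_ip_ops
  end
end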
Run the migration:
rails db:migrate
Step 9: Model Configuration
After creating the scaffold, navigate to the app/models/chunk.rb file.
- Add Associations and Scopes:
The has_neighbors :embedding line allows us to perform vector-based search operations on the Chunk model based on the embedding field.
The scopes :with_embedding and :without_embedding let us filter records that have or don’t have vector embeddings set, respectively.
class Chunk < ApplicationRecord
  has_neighbors :embedding
  scope :with_embedding, -> { where.not(embedding: nil) }
  scope :without_embedding, -> { where(embedding: nil) }
end
- Add Validations:
These validations ensure that both the data and token_count fields have values before the record is saved:
validates :data, presence: true
validates :token_count, presence: true
- Add Instance Methods:
The nearest method is a convenience wrapper that returns the nearest neighbors of the current record based on the embedding field. It uses inner product as the distance metric, which suits embeddings that are normalized to unit length, as OpenAI’s are.
The set_token_count method calculates the token count for the data field before the record is created. This way we know how many tokens the data would consume if it were passed to OpenAI models like GPT-4.
def nearest
  nearest_neighbors(:embedding, distance: :inner_product)
end

def set_token_count
  self.token_count = Tiktoken.encoding_for_model("gpt-4").encode(data).length
end
- Add Callback:
Set the token count before validation when a new record is being created:
before_validation :set_token_count, on: :create
- Final Model:
Once you add all the modifications, your Chunk model will look like:
class Chunk < ApplicationRecord
  has_neighbors :embedding

  scope :with_embedding, -> { where.not(embedding: nil) }
  scope :without_embedding, -> { where(embedding: nil) }

  validates :data, presence: true
  validates :token_count, presence: true

  before_validation :set_token_count, on: :create

  def nearest
    nearest_neighbors(:embedding, distance: :inner_product)
  end

  def set_token_count
    self.token_count = Tiktoken.encoding_for_model("gpt-4").encode(data).length
  end
end
Note: The docs for the neighbor gem mention that it is best to use inner_product for OpenAI embeddings.
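The reason inner product works here is that OpenAI embeddings are normalized to length 1, and for unit-length vectors the inner product equals the cosine similarity while being cheaper to compute. A small pure-Ruby check, using made-up two-dimensional vectors:

```ruby
def inner_product(a, b)
  a.zip(b).sum { |x, y| x * y }
end

def magnitude(a)
  Math.sqrt(inner_product(a, a))
end

# Scale a vector to length 1.
def unit(a)
  m = magnitude(a)
  a.map { |x| x / m }
end

def cosine_similarity(a, b)
  inner_product(a, b) / (magnitude(a) * magnitude(b))
end

a = unit([3.0, 4.0])
b = unit([1.0, 2.0])

# For unit vectors both measures agree (up to floating point error).
puts inner_product(a, b)
puts cosine_similarity(a, b)

# For vectors that are not normalized they differ.
puts inner_product([3.0, 4.0], [1.0, 2.0])
```

If you switch to an embedding model that does not normalize its output, cosine distance is the safer choice.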
Step 10: Creating the Embeddings Job
In order to offload the task of syncing embeddings from OpenAI, we’ll utilize ActiveJob. This will ensure our application remains responsive even when waiting for the OpenAI response.
- Generate the job:
rails generate job SyncEmbeddings
- In the generated file (app/jobs/sync_embeddings_job.rb):
class SyncEmbeddingsJob < ApplicationJob
  queue_as :default

  # chunk_id is optional; with nil the job only back-fills missing embeddings.
  def perform(chunk_id = nil)
    if chunk_id
      chunk = Chunk.find(chunk_id)
      embedding = OpenaiService.embeddings(chunk.data)
      chunk.update!(embedding: embedding[:embedding])
    end

    # Back-fill every chunk that still has no embedding.
    Chunk.without_embedding.each do |chunk|
      embedding = OpenaiService.embeddings(chunk.data)
      chunk.update!(embedding: embedding[:embedding])
    end
  end
end
This job leverages our OpenaiService to sync embeddings. If you provide a chunk_id that isn’t nil, it will sync that chunk’s embedding first; either way, it then syncs all chunks that don’t have embeddings yet.
Step 11: Populating the Database and Testing Vector Searches
- Using Rails Console:
Open the Rails Console with the following command:
rails c
- Create Some Chunks:
Now, let’s create some Chunk records with sample data:
Chunk.create(data: "The quick brown fox jumps over the lazy dog.")
Chunk.create(data: "The quick brown bear jumps over the crazy cat.")
Chunk.create(data: "Lorem ipsum dolor sit amet, consectetur adipiscing elit.")
Chunk.create(data: "A journey of a thousand miles begins with a single step.")
Chunk.create(data: "The quick brown fox takes a nap.")
Here, we’ve created five Chunk records.
- Run the Sync Job:
Before you can search for similar vectors, the embeddings need to be populated. This is where the sync job comes into play: it calculates and stores the vector embedding for each Chunk. Run it, passing nil so it syncs every chunk that is still missing an embedding:
SyncEmbeddingsJob.perform_now(nil)
When it finishes, check that the embedding field is populated for each Chunk.
- Find Nearest Neighbors:
Now, let’s retrieve the nearest neighbors for the first Chunk record:
chunk = Chunk.first
nearest_chunks = chunk.nearest
nearest_chunks will contain a list of Chunk records sorted by similarity (based on the embedding field) to the first Chunk. Given that two of the other records are also about a quick brown animal, you should see them at the top of this list.
- Inspecting the Results:
You can print out the data from the nearest neighbors for easier inspection:
nearest_chunks.each { |c| puts c.data }
This should display the content of the chunks, starting with the ones most similar to the first chunk.
Advanced Features #
I was going to add a section on talking directly to ChatGPT, feeding your nearest neighbors into the prompt as context, and updating your UI in real time with Hotwire, but this post got kind of long. If you want me to add a section on it, or you ran into any issues getting this code to work, hit me up on LinkedIn and I might consider adding it. It should be fairly straightforward with this foundation, though!
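For a rough idea of the context-feeding part, here is a minimal sketch of how the parameters hash could be assembled. Everything in it is an assumption for illustration: build_chat_parameters and FakeChunk are hypothetical names, the system prompt wording is arbitrary, and the streaming and Hotwire pieces are left out entirely.

```ruby
# Stand-in for Chunk records; only the data attribute matters here.
FakeChunk = Struct.new(:data)

# Build the hash that a chat call expects: a model name plus a list of messages.
def build_chat_parameters(question, chunks, model: "gpt-4")
  # Number the chunks so the model can tell the context passages apart.
  context = chunks.map.with_index(1) { |chunk, i| "#{i}. #{chunk.data}" }.join("\n")
  {
    model: model,
    messages: [
      { role: "system", content: "Answer using only the context below.\n\nContext:\n#{context}" },
      { role: "user", content: question }
    ]
  }
end

chunks = [
  FakeChunk.new("The quick brown fox takes a nap."),
  FakeChunk.new("A journey of a thousand miles begins with a single step.")
]

parameters = build_chat_parameters("What does the fox do?", chunks)
puts parameters[:messages].first[:content]
```

In the app, the chunks would come from embedding the question with OpenaiService.embeddings and passing that vector to Chunk.nearest_neighbors(:embedding, vector, distance: "inner_product"), and the resulting hash would go to OpenaiService.chat.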
Conclusion #
Using vector databases with Rails has unlocked powerful, dynamic data handling capabilities. When combined with tools like pgvector and OpenAI, the results are nothing short of impressive.