A Python Library for Using PostgreSQL as a Vector Database in AI Applications

Introducing the Timescale Vector Python client library: a new library for storing, indexing, and querying vector embeddings in PostgreSQL. Easily store millions of embeddings using PostgreSQL as a vector database. Complete with optimized schema, batch ingestion, hybrid search, and time-based vector search. Learn more about its key features. And then take it for a spin: try Timescale Vector today, free for 90 days.

Python is the lingua franca of AI. And today, it gets even better for building AI applications with PostgreSQL as a vector database. Introducing the Timescale Vector Python client library, which enables Python developers to easily store, index, and query millions of vector embeddings using PostgreSQL.

The Python client library is the simplest way to integrate Timescale Vector’s best-in-class similarity search and hybrid search performance into your generative AI application.

Here’s an overview of how the Timescale Vector Python client makes it easier than ever to build AI applications with PostgreSQL:

  • Optimized schema for vectors and metadata
  • Performant batch ingestion of vectors
  • Create Timescale Vector (DiskANN), HNSW (Hierarchical Navigable Small Worlds), and IVFFlat (Inverted File Flat) indexes in one line of code
  • Semantic search and hybrid search
  • ANN search with time-based filtering of vectors
  • A foundation for Retrieval Augmented Generation (RAG) with time-based context retrieval

In the remainder of this post, we’ll delve into each of these points with code examples!

How to Access the Timescale Vector Python Library

To get started with the Timescale Vector Python client, sign up for the Timescale cloud PostgreSQL platform, create a new database, and then run the following in your terminal:

pip install timescale_vector

Then follow the up and running with Timescale Vector tutorial (be sure to download the .env file with your database credentials; you’ll need it to follow along).

Use the Timescale Vector Python library with a cloud PostgreSQL database, free for 90 days.

  • Three-month free trial for Timescale Vector: To make it easy to test and develop your applications with Timescale Vector, we’re giving new Timescale customers a 90-day extended trial. You won’t be charged for any cloud PostgreSQL databases you spin up during your trial period. Try Timescale Vector for free.
  • Special early access pricing: Existing Timescale customers can use Timescale Vector for free during the early access period.

Optimized PostgreSQL Schema for Storing Vectors and Metadata

Timescale Vector Python client creates an optimized schema to efficiently store vector embeddings and associated metadata for fast search and retrieval. All you need to create a table is a Timescale service URL, your table name, and the dimension of the vectors you want to store.

import os
from timescale_vector import client

# Service URL from your Timescale database credentials (in the .env file)
TIMESCALE_SERVICE_URL = os.environ["TIMESCALE_SERVICE_URL"]

# Table information
TABLE_NAME = "company_documents"
EMBEDDING_DIMENSIONS = 1536

# Create client object
vec = client.Async(TIMESCALE_SERVICE_URL,
                   TABLE_NAME,
                   EMBEDDING_DIMENSIONS)

# Create the table; the library handles the schema!
await vec.create_tables()

The create_tables() function will create a table with the following schema:

id | metadata | contents | embedding

  • id is the UUID that uniquely identifies each vector.
  • metadata is a JSONB column that stores the metadata associated with each vector.
  • contents is the text column that stores the content we want vectorized.
  • embedding is the vector column that stores the vector embedding representation of the content.

Performant Batch Ingestion of Vectors With PostgreSQL

Most Generative AI applications require inserting tens of thousands of records (embeddings plus metadata) into a table at a time. Timescale Vector makes it easy to batch ingest these records without extra configuration using the .upsert() method:

# batch upsert vectors into table
await vec.upsert(records)
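For reference, each element of records maps onto the four-column schema above: a UUID id, a metadata dictionary, the content string, and the embedding. Here is a minimal sketch (the metadata keys and the toy three-dimensional embeddings are illustrative; real embeddings in this post have 1,536 dimensions):

```python
import uuid

# Each record is an (id, metadata, contents, embedding) tuple.
# uuid.uuid1() encodes the current time, which Timescale Vector
# later uses for time-based partitioning and filtering.
records = [
    (uuid.uuid1(), {"author": "jane"}, "Project X shipped v1.0", [0.1, 0.2, 0.3]),
    (uuid.uuid1(), {"author": "amir"}, "Project X roadmap update", [0.2, 0.1, 0.4]),
]
```

A list like this can then be passed straight to await vec.upsert(records).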

Create Timescale Vector (DiskANN), HNSW, and IVFFlat ANN Indexes in One Line of Code

With a single line of code, you can create indices on your vectors to speed up similarity search on millions of embeddings.

The Timescale Vector Python library supports the timescale-vector index, inspired by the DiskANN algorithm, which achieves 3x faster search than the specialized vector database Weaviate and a 40% to 1,590% performance improvement over pgvector when performing ANN searches on one million OpenAI embeddings.

Timescale Vector’s new index outperforms specialized vector database Weaviate by 243% and all existing PostgreSQL index types when performing approximate nearest neighbor searches at 99% recall on one million OpenAI vector embeddings

You can create a Timescale Vector (DiskANN) index in a single line of code:

# Create a timescale vector (DiskANN) search index on the embedding column
await vec.create_embedding_index(client.TimescaleVectorIndex())

What’s more, the library also supports pgvector’s HNSW and IVFFlat indexing algorithms, along with smart defaults for all three index types. Advanced users can, of course, specify index parameters when creating an index via the index creation method arguments.

# Create HNSW search index on the embedding column
await vec.create_embedding_index(client.HNSWIndex())

# Create IVFFLAT search index on the embedding column
await vec.create_embedding_index(client.IvfflatIndex())

Similarity Search and Hybrid Vector Search in PostgreSQL

The Timescale Vector Python library provides a method for easy similarity search.

As a refresher, similarity search is where we find the vectors most similar in meaning to our query vector—more similar vectors are closer to each other, while less similar vectors are further away in the N-dimensional embedding space. Without indexes, this will default to performing exact nearest neighbor (KNN) search, but with the indexes discussed above enabled, you’ll perform approximate nearest neighbor (ANN) search.

# define search query and query_embedding
query_string = "What's new with Project X"
query_embedding = get_embeddings(query_string)

# search table for similar vectors to query_embedding
records = await vec.search(query_embedding)


In addition to simple similarity search (without metadata filters), the Timescale Vector Python library makes it simple to perform hybrid search on your vectors and metadata, where you not only query by vector similarity but also by an additional metadata filter or LIMIT.

Filters can be specified as a dictionary where all fields and their values are matched exactly. You can also specify a list of dictionaries that uses OR semantics such that a row is returned if it matches any of the dictionaries.
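As a sketch, the two filter shapes described above might look like this (the metadata keys and the search call in the comment are illustrative; check the library documentation for the exact parameter name):

```python
# AND semantics: every key/value pair must match the row's metadata exactly
filter_exact = {"author": "jane", "category": "news"}

# OR semantics: a row matches if it satisfies any one of the dictionaries
filter_any_of = [{"category": "news"}, {"category": "blog"}]

# Hypothetical usage, following the search calls in this post:
# records = await vec.search(query_embedding, limit=5, filter=filter_any_of)
```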

We also support using more advanced metadata filters using Predicates. (See our documentation for more details.)

Our optimized schema design creates a GIN index on the metadata, allowing optimized searches for many metadata queries.

Similarity Search With Time Filters

Timescale Vector optimizes time-based vector search queries, leveraging the automatic time-based partitioning and indexing of Timescale’s hypertables.

Time-based filtering is useful to efficiently find recent embeddings, constrain vector search by a time range or document age, and store and retrieve large language model (LLM) responses and chat history with ease. Time-based semantic search also enables you to use RAG with time-based context retrieval to give users more useful LLM responses.

To use efficient time-based similarity search via the Timescale Vector Python library, first create your client with the time_partition_interval argument set to the time range you want your data partitioned by:

from datetime import timedelta

# Table information
TABLE_NAME = "commit_history"
EMBEDDING_DIMENSIONS = 1536

# Partition interval
TIME_PARTITION_INTERVAL = timedelta(days=7)

# Create client object
vec = client.Async(TIMESCALE_SERVICE_URL,
                   TABLE_NAME,
                   EMBEDDING_DIMENSIONS,
                   time_partition_interval=TIME_PARTITION_INTERVAL)

# Create table
await vec.create_tables()

In the code block above, we set the time_partition_interval argument in the client creation function to enable automatic time-based partitioning of the table. This will partition the table into time-based chunks and create indexes on the time-based chunks to speed up time-based queries.

Each partition will consist of data for the specified length of time. We use seven (7) days for simplicity, but you can pick whatever value makes sense for your use case. For example, if you query recent vectors frequently, you might want to use a smaller time delta like one (1) day, or if you query vectors over a decade-long time period then you might want to use a larger time delta like six (6) months or one (1) year.
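To make that trade-off concrete, here are a few hypothetical interval choices (the values are illustrative, not recommendations):

```python
from datetime import timedelta

# Mostly querying the last few days of embeddings: small partitions
interval_recent = timedelta(days=1)

# General-purpose default used in this post
interval_default = timedelta(days=7)

# Decade-spanning archive queried over wide ranges: large partitions
interval_archive = timedelta(days=365)
```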

Once we’ve created the table with time partitioning enabled, we can perform time-based similarity searches as follows:

from datetime import datetime

# Time filter variables for query
# Start date = 1 August 2023, 22:10:35
start_date = datetime(2023, 8, 1, 22, 10, 35)
# End date = 30 August 2023, 22:10:35
end_date = datetime(2023, 8, 30, 22, 10, 35)

# Similarity search with time filter
records_time_filtered = await vec.search(query_embedding, limit=3,
                                         uuid_time_filter=client.UUIDTimeRange(start_date, end_date))

This will ensure our similarity search only returns vectors that have times between the start_date and end_date.

Here’s some intuition for why Timescale Vector’s time-based partitioning speeds up ANN queries with time-based filters:

Timescale Vector partitions the data by time and creates ANN indexes on each partition individually. Then, during search, we perform a three-step process:

  • Step 1: filter out partitions that don’t match the time predicate.
  • Step 2: perform the similarity search on all matching partitions.
  • Step 3: combine all the results from each partition in step 2, rerank, and filter out results by time.
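The three steps above can be sketched in plain Python (this is a toy, in-memory illustration, not the library’s actual implementation; the brute-force distance scan stands in for each partition’s ANN index search):

```python
def search_time_partitions(partitions, query_vec, start, end, limit):
    # Step 1: discard partitions that cannot overlap the time predicate
    candidates = [p for p in partitions
                  if p["end"] >= start and p["start"] <= end]

    # Step 2: search each matching partition (a real system would use
    # that partition's ANN index; here we simply collect every row)
    rows = [row for p in candidates for row in p["rows"]]

    # Step 3: merge results, apply the exact time filter, and rerank
    # by squared Euclidean distance to the query vector
    rows = [r for r in rows if start <= r["time"] <= end]
    rows.sort(key=lambda r: sum((a - b) ** 2
                                for a, b in zip(r["embedding"], query_vec)))
    return rows[:limit]
```

Partitions outside the queried window are skipped entirely in step 1, which is where the speedup comes from.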

Timescale Vector leverages TimescaleDB’s hypertables, which automatically partition vectors and associated metadata by a timestamp. This enables efficient querying on vectors by both similarity to a query vector and time, as partitions not in the time window of the query are ignored, making the search a lot more efficient by filtering out whole swaths of data in one go.

A note on preparing data for time-based partitioning

Timescale Vector uses the DateTime portion of a UUID v1 to determine which partition a given row should be placed in.

  • If you want the current date and time associated with your vectors, you can create a new UUID v1 for each record that you want to insert:
import uuid

# A new UUID v1 encodes the current timestamp
id = uuid.uuid1()

  • If you want a date or time in the past to be associated with your vectors, you can use our handy uuid_from_time() function to generate a uuid v1 from a Python datetime object, and then use that as your id for your vector when you insert it into the PostgreSQL database:
from datetime import datetime
from timescale_vector import client

# Generate a UUID v1 whose time component is a date in the past
id = client.uuid_from_time(datetime(2023, 8, 1))

For example, in this tutorial for Timescale Vector, we extract the dates from our metadata and turn them into UUID v1s, which we then use as the id part of our record when we ingest into the PostgreSQL table:

from datetime import datetime
from timescale_vector import client

id = client.uuid_from_time(datetime(2023, 8, 1))
await vec.upsert([(id, {"key": "val"}, "the brown fox", [1.0, 1.2])])

Retrieval Augmented Generation With Time-Based Context Retrieval

Let’s put everything together and look at a simplified example of how you can use the Timescale Vector Python library to power retrieval augmented generation where the context retrieved is constrained to a given time range.

Generation: In the example below, we define get_completion_from_messages(), which makes a call to an LLM and returns a completion response for a given prompt.

Time-based context retrieval: We define get_top_similar_docs(), which takes a given query embedding and returns the top three most similar rows in our table whose associated times fall between start_date and end_date.

Finally, we put it all together in process_user_message, which takes a user_input, like a question, as well as a start and end date, and returns a retrieval augmented response from the LLM using the time-based context retrieved from the records in our table.

import openai

# Make an LLM call and get completion for a given set of messages
def get_completion_from_messages(messages, model="gpt-4-0613", temperature=0, max_tokens=1000):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return response.choices[0].message["content"]

# Get top 3 most similar document sections within a time range
async def get_top_similar_docs(query_embedding, start_date, end_date):
    # Get the top most similar documents within the time range
    top_docs = await vec.search(query_embedding, limit=3,
                                uuid_time_filter=client.UUIDTimeRange(start_date, end_date))
    return top_docs

# Construct a context string from the related docs and get a completion
async def process_user_message(user_input, all_messages, start_date, end_date):

    # Get documents related to the user input
    related_docs = await get_top_similar_docs(get_embeddings(user_input), start_date, end_date)

    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": f"{user_input}"},
        {"role": "assistant", "content": f"Relevant information: \n {related_docs[0]} \n {related_docs[1]} \n {related_docs[2]}"},
    ]

    final_response = get_completion_from_messages(all_messages + messages)
    return final_response

This is a simple example of a powerful concept: using time-based context retrieval in your RAG applications can help provide more relevant answers to your users, and it applies to any dataset with both a natural language and a time component.

Timescale Vector uniquely enables this thanks to its efficient time-based similarity search capabilities, and taking advantage of it in your Python applications is easy thanks to the Timescale Vector Python client library.

Resources and Next Steps

Now that you’ve learned the foundational concepts of Timescale Vector’s Python library, install it for your next project that uses LLMs in Python:

pip install timescale_vector

And then continue your learning journey with our tutorials and guides.

And a reminder:

  • Three-month free trial for Timescale Vector: To make it easy to test and develop your applications with Timescale Vector, we’re giving new Timescale customers a 90-day extended trial. You won’t be charged for any cloud PostgreSQL databases you spin up during your trial period. Try Timescale Vector for free.
  • Special early access pricing: Existing Timescale customers can use Timescale Vector for free during the early access period.