Vector Databases – Where You Should Use Them Sourav Gupta September 14, 2023

Rashmi, an ailing elderly woman who lives in a small Indian town, has been prescribed several medicines by her doctors. She relies on her son to procure a week’s worth of her daily medication. Sometimes, however, her son has to travel to nearby towns for a few days in a row on work-related trips. Because of her ill health, Rashmi cannot go alone to the medical stores with the last used strip of medicines and re-order them. A few pharmacists provide home delivery if the total bill exceeds Rs 100, but since Rashmi cannot read the English names of the medicines, she cannot place an order with them over the phone. On such occasions, she has to depend on her neighbours or relatives to bring her the required medicines, which have to be taken three times a day.

Rashmi’s son suggested that instead of depending on his availability or the kindness of her well-wishers, she could use digital technology to make her life easier. After installing the mobile application of India’s leading online chemist on her phone, her son created a profile for Rashmi and saved their home address. He then taught her how to upload an image of her medicines and select Cash on Delivery (CoD) for each order. The mobile application automatically identifies the names of the required medicines from the uploaded image and arranges for them to be delivered to Rashmi’s address.

Though Rashmi had only ever used her mobile phone for making and receiving calls, she found this process of ordering online with just a photograph extremely convenient. Now, not only does Rashmi order her own medicines online, but she has also persuaded many of her peers to follow her example and embrace technology, even at their advanced age.

From the tens of thousands of medicines sold in India, how was the mobile application able to identify the correct stock keeping unit (SKU) and send it to Rashmi? The use of vector databases enabled the application not just to find the right medicines but also to identify alternative medicines efficiently.

Understanding Vector Databases

Vector databases have seen high levels of adoption in the world of Artificial Intelligence (AI) and Machine Learning (ML). They are very popular in use cases involving recommendation systems, where they can help you find something similar to what you want. For example, if you loved The Chef’s Secret by Crystal King and want to find other novels that are centred around Italian cuisine, you will find it difficult to identify them using a regular library information system. That’s where a book recommendation system which leverages a vector database will be extremely useful.

How Do Vector Databases Work?

Vector databases store and organise data as numerical lists called vectors. A vector is an ordered list of numbers that has both a magnitude and a direction (left part of the figure above). When we search for something, the user’s question is first converted into a vector representation, called an embedding, typically by an ML embedding model that either sits in front of the database or is integrated with it (right part of the figure above). The vector database then performs a vector similarity search and retrieves the closest matches, usually with a method called Approximate Nearest Neighbour (ANN) search. ANN algorithms compare data points to find the ones most similar to your query without examining every stored vector, trading a small amount of accuracy for speed (middle part of the figure above).
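The similarity search described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular vector database is implemented: it compares the query with every stored vector (an exact search, not ANN), it assumes cosine similarity as the measure, and the three-dimensional embeddings and item names are invented for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, items, k=2):
    """Return the k (item_id, score) pairs most similar to the query."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings: real ones come from an embedding model and
# usually have hundreds of dimensions, not three.
catalogue = {
    "paracetamol_500mg": [0.9, 0.1, 0.0],
    "paracetamol_650mg": [0.8, 0.2, 0.1],
    "vitamin_d3": [0.0, 0.2, 0.9],
}
query_embedding = [0.95, 0.05, 0.0]  # stands in for the embedded user query

results = top_k(query_embedding, catalogue, k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

The two paracetamol entries rank first because their vectors point in nearly the same direction as the query, which is the sense in which they are "similar".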

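To indicate how well each result matches your request, the vector database returns a similarity score with it. As a rule of thumb, when the score is a similarity on a 0-to-1 scale, a value above 0.75 or 0.8 means the result is quite likely to be what you’re looking for. The right cut-off depends on the embedding model and the similarity metric, so it should be tuned for each use case.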
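Applying such a cut-off is a simple filter over the search results. The sketch below assumes the scores are similarities where higher means more similar; the function name, the document IDs and the scores are hypothetical.

```python
def filter_by_score(matches, threshold=0.75):
    """Keep only (item_id, score) matches whose score reaches the threshold."""
    return [(item_id, score) for item_id, score in matches if score >= threshold]

# Hypothetical search output: (item_id, similarity score).
matches = [("doc_a", 0.91), ("doc_b", 0.78), ("doc_c", 0.42)]

confident = filter_by_score(matches, threshold=0.75)
print(confident)
```

If the database reports a distance instead of a similarity, lower values are better and the comparison has to be reversed.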

Vector databases offer several capabilities:

  1. They support semantic search, which means they can understand the context or meaning of search terms and are not limited to just exact word matches. This makes vector databases well suited for recommendation systems, content discovery, and question-answering systems.
  2. Vector databases are specifically designed for quick and efficient searches of similar items in high-dimensional spaces, a task that traditional relational databases struggle with.
  3. ML algorithms transform an item into a numerical representation that captures its different attributes or features. This representation is called a vector embedding. The vector database stores these embeddings and searches them to discover other items that have a similar meaning to a given item.
  4. Vector databases are excellent at handling multimedia data, including images, audio and video. These data types are transformed into high-dimensional vectors to allow for effective similarity search and retrieval.
  5. In the field of Natural Language Processing (NLP), vector databases store high-dimensional vectors that represent words, sentences or documents.
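To give a feel for how a search in high-dimensional space can avoid comparing the query with every stored vector, here is a toy approximate nearest-neighbour index based on random hyperplanes (a form of locality-sensitive hashing). It is a sketch under stated assumptions: the class and method names are invented, only one hash table is used, and production vector databases typically rely on more elaborate index structures.

```python
import math
import random

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class HyperplaneLSH:
    """Toy ANN index: vectors on the same side of every plane share a bucket."""

    def __init__(self, dim, num_planes=4, seed=0):
        rng = random.Random(seed)
        # Each hyperplane passes through the origin and is defined by a random normal.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]
        self.buckets = {}

    def _signature(self, vec):
        # One boolean per hyperplane: which side of the plane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes
        )

    def add(self, item_id, vec):
        self.buckets.setdefault(self._signature(vec), []).append((item_id, vec))

    def query(self, vec):
        """Return the most similar item in the query's bucket, or None if it is empty."""
        candidates = self.buckets.get(self._signature(vec), [])
        if not candidates:
            return None
        # Only the candidates in this one bucket are compared with the query.
        return max(candidates, key=lambda c: cosine_similarity(c[1], vec))[0]

index = HyperplaneLSH(dim=3)
index.add("a", [1.0, 0.0, 0.0])
index.add("b", [0.0, 1.0, 0.0])
index.add("c", [0.0, 0.0, 1.0])
print(index.query([2.0, 0.0, 0.0]))  # same direction as "a", so same bucket
```

The search is approximate because a true nearest neighbour that happens to fall in a different bucket is missed. That is the accuracy given up in exchange for looking at fewer vectors.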

Using Vector Databases

At Prescience Decision Solutions, we have used vector databases in different projects that involved NLP, image processing, and large language models (LLMs). Vector databases work best alongside LLMs when the application has to find the best-matching documents in a considerable corpus, for either information retrieval or recommendations. If all the available documents are represented as vectors, the vector database can retrieve the most relevant documents rapidly and with high accuracy for the LLM to work with.
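One common way to combine the two is to place the retrieved documents into the prompt that is sent to the LLM. The sketch below shows only that assembly step and is not a description of any specific project. It assumes the vector database has already returned (text, score) pairs; the function name, prompt wording and placeholder documents are hypothetical.

```python
def build_prompt(question, retrieved, max_docs=3):
    """Assemble an LLM prompt from a question and retrieved (text, score) pairs."""
    # Highest-scoring documents first, truncated to max_docs.
    best = sorted(retrieved, key=lambda pair: pair[1], reverse=True)[:max_docs]
    context = "\n".join(
        f"[{rank}] {text}" for rank, (text, _score) in enumerate(best, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical retrieval results with invented similarity scores.
retrieved = [
    ("Placeholder text of document A", 0.62),
    ("Placeholder text of document B", 0.91),
    ("Placeholder text of document C", 0.35),
]

prompt = build_prompt("Which document is most relevant?", retrieved, max_docs=2)
print(prompt)
```

Limiting the prompt to the top few documents keeps it short and leaves out weak matches such as document C.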
