VECTOR DATABASES
Vector databases store data as high-dimensional vectors: numerical representations of an item's features or attributes. Each vector has a set number of dimensions, ranging from tens to thousands depending on the complexity and granularity of the data. The vectors are generated by embedding raw data such as text, images, audio, or video.
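To make the idea of "embedding" concrete, here is a toy sketch that turns text into a fixed-length vector by hashing each word into a bucket (the hashing trick). This is only a stand-in: real embeddings usually come from trained models, and the function name `toy_embed` and the choice of 8 dimensions are assumptions made for illustration.

```python
import hashlib


def toy_embed(text, dims=8):
    """Map text to a fixed-length vector by hashing words into buckets.

    Illustration only: a trained embedding model would capture meaning,
    while this just counts words per hash bucket.
    """
    vec = [0.0] * dims
    for word in text.lower().split():
        # hashlib gives a stable hash across runs (built-in hash() does not)
        digest = hashlib.md5(word.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dims
        vec[bucket] += 1.0
    return vec


vector = toy_embed("vector databases store vectors")
print(len(vector))  # 8
```

Whatever the input length, the output always has `dims` entries, which is what lets very different items be compared in the same vector space.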

ADVANTAGES
- Similarity Search: Computes how similar two objects are by comparing their vector embeddings.
- Fast Retrieval of Data: Distances between vectors (Euclidean, Manhattan, Cosine, or Chebyshev) let us group and rank data by closeness, which supports fast retrieval.
- Improved query performance.
- Highly scalable and flexible.
- High-Dimensional Search: Searches across vectors with many dimensions, which gives us a wide range of data to operate on.
For example, we can use a vector database to:
- Find images that are similar to a given image based on their visual content and style
- Find documents that are similar to a given document based on their topic and sentiment
- Find products that are similar to a given product based on their features and ratings
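The four distance measures named under Advantages can be sketched in plain Python as follows. This is a minimal illustration, not how any particular vector database implements them; the function names and the sample vectors are assumptions.

```python
import math


def euclidean(a, b):
    # Straight-line distance between the two points
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute differences along each dimension
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute difference along any single dimension
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Cosine of the angle between the vectors; assumes neither is all zeros
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 3.0]
print(euclidean(a, b))  # 5.0
print(manhattan(a, b))  # 7.0
print(chebyshev(a, b))  # 4.0
print(cosine_similarity(a, b))
```

Note that the first three are distances (smaller means closer), while cosine similarity is a similarity (larger means closer).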

QUERY VECTOR
To run a similarity search, we use a query vector that represents the information we are looking for. The query vector can be derived from:
- The same type of data as the stored vectors (e.g., using an image as a query for an image database)
- A different type of data (e.g., using text as a query for an image database)
We then need a similarity measure that calculates how close or distant two vectors are in the vector space. It can be based on various metrics, such as cosine similarity, Euclidean distance, Hamming distance, or the Jaccard index.
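Hamming distance and the Jaccard index are typically applied to binary vectors or sets rather than real-valued embeddings. A minimal sketch, with function names and sample data chosen only for illustration:

```python
def hamming_distance(a, b):
    # Number of positions at which the two equal-length vectors differ
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def jaccard_index(a, b):
    # Size of the intersection divided by size of the union
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention assumed here: two empty sets are identical
    return len(set_a & set_b) / len(union)


print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))  # 2
print(jaccard_index({"red", "round", "sweet"}, {"red", "round", "sour"}))  # 0.5
```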

The result of the similarity search is a ranked list of the vectors with the highest similarity scores relative to the query vector. We can then access the raw data associated with each vector from the original source or index.
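The whole flow (score every stored vector against the query, rank, then look up the raw data) can be sketched as a brute-force search. The names `search`, `index`, and `raw_data`, and the two-dimensional sample vectors, are hypothetical; production systems typically rely on approximate indexes rather than scanning every vector like this.

```python
import math


def cosine_similarity(a, b):
    # Assumes neither vector is all zeros
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, index, k=2):
    """Return the k (score, item_id) pairs most similar to the query."""
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest score first
    return scored[:k]


# Hypothetical stored vectors and the raw data they were derived from
index = {"doc_a": [1.0, 0.0], "doc_b": [0.7, 0.7], "doc_c": [0.0, 1.0]}
raw_data = {"doc_a": "text of A", "doc_b": "text of B", "doc_c": "text of C"}

results = search([1.0, 0.1], index, k=2)
for score, item_id in results:
    # Map each ranked vector back to its raw data
    print(item_id, round(score, 3), raw_data[item_id])
```

With this sample data the query points mostly along the first axis, so `doc_a` ranks first and `doc_b` second.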
APPLICATIONS
- Natural language processing
- Computer vision
- Recommendation systems
- Areas requiring semantic understanding and matching of data
- Large language models (LLMs): a vector database can store information about topics, keywords, facts, opinions, and sources related to the desired domain or genre, which an LLM can draw on (for example, through an AI plugin) to generate more relevant and coherent text
POPULAR VDBs
Pinecone, Milvus, Chroma, Weaviate, Deep Lake, Qdrant, Vespa, etc.
