May 01, 2025

Image Similarity Search using Vector embeddings and Cosine similarity

Image embeddings capture visual features such as shapes, colors, objects, and textures.

Image embedding models are typically CNN- or Vision Transformer-based.

Image Embedding Generation:

  • Instead of embedding text, we embed images using a pre-trained vision model
  • Popular models include Vision Transformers (ViT), ResNet, EfficientNet, and CLIP (a short CLIP sketch follows this list)
  • The model extracts features from images and converts them to dense vector representations
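
As a quick illustration of the CLIP option mentioned above, the sketch below uses the Hugging Face CLIPModel and CLIPProcessor classes to produce an image embedding; the checkpoint name and the helper function are illustrative choices, separate from the main example further down.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def get_clip_image_embedding(image_path):
    # CLIP maps images into a shared image-text embedding space (512 dimensions for this checkpoint)
    image = Image.open(image_path).convert("RGB")
    inputs = clip_processor(images=image, return_tensors="pt")
    with torch.no_grad():
        features = clip_model.get_image_features(**inputs)
    return features[0].numpy()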

Core Similarities with Text Search:

  • Both methods convert unstructured data (text/images) into vector representations
  • Both use similarity metrics (typically cosine similarity) to find the closest matches (a worked example follows this list)
  • Both can be stored in vector databases like AstraDB for efficient retrieval
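
To make the metric concrete, the small NumPy sketch below computes cosine similarity directly, the dot product of two vectors divided by the product of their norms; the toy three-dimensional vectors are made up purely for illustration.

import numpy as np

def cosine_sim(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||): 1.0 means same direction, 0.0 means orthogonal
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([0.9, 0.1, 0.3])  # hypothetical embedding of image A
b = np.array([0.8, 0.2, 0.4])  # hypothetical embedding of image B
print(cosine_sim(a, b))        # ~0.98, i.e. very similar
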
Applications:

  • Product recommendations (visually similar products)
  • Reverse image search
  • Finding duplicate or near-duplicate images (see the threshold sketch after this list)
  • Content moderation (finding similar inappropriate content)
  • Medical image analysis (finding similar cases)
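
For the near-duplicate use case above, a common pattern is to flag image pairs whose cosine similarity exceeds a threshold; the 0.95 cutoff in the sketch below is an arbitrary illustrative value that would need tuning per dataset.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def find_near_duplicates(embeddings, paths, threshold=0.95):
    # Pairwise cosine similarity over all catalog embeddings
    sims = cosine_similarity(np.array(embeddings))
    pairs = []
    for i in range(len(paths)):
        for j in range(i + 1, len(paths)):
            if sims[i, j] >= threshold:
                pairs.append((paths[i], paths[j], sims[i, j]))
    return pairs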

Python code -

import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from PIL import Image
import torch
from transformers import AutoImageProcessor, AutoModel

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Initialize image embedding model - using ViT (Vision Transformer)
model_name = "google/vit-base-patch16-224"
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).to(device)
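# ViT-base has a hidden size of 768, so each image embedding below is a 768-dimensional vector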

# Function to get image embedding
def get_image_embedding(image_path):
    image = Image.open(image_path).convert("RGB")  # convert to RGB so grayscale/RGBA files don't break the processor
    inputs = processor(images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    # Use CLS token as the image embedding
    embedding = outputs.last_hidden_state[:, 0, :].cpu().numpy().flatten()
    return embedding

# List of image paths
image_paths = [
    "Indigo Nation Men Plain Orange Shirts.jpg",
    "ADIDAS Men Navy Blue Shirts.jpg",
    "Indigo Nation Men Price catch Blue Shirts .jpg",
    "Puma Men's Foundation Grey Polo T-shirt.jpg",
    "Indigo Nation Men  Bling Pink Shirts.jpg"
]

# Query image
query_image = "Indigo Nation Men Plain Orange Shirts.jpg"

# Get embedding for the query image
query_embedding = get_image_embedding(query_image)

# Store all embeddings and their corresponding images
all_embeddings = []
for image_path in image_paths:
    embedding = get_image_embedding(image_path)
    all_embeddings.append(embedding)

# Convert to numpy arrays for similarity calculation
query_embedding_np = np.array(query_embedding).reshape(1, -1)
all_embeddings_np = np.array(all_embeddings)

# Calculate cosine similarity between query and all images
similarities = cosine_similarity(query_embedding_np, all_embeddings_np).flatten()

# Create a DataFrame to display results
results = pd.DataFrame({
    'Image': image_paths,
    'Similarity Score': similarities
})

# Sort by similarity score in descending order
results = results.sort_values('Similarity Score', ascending=False)

print(f"Query image: {query_image}")
print("\nSimilarity Search Results:")
print(results)

# Find the most similar image, excluding the query itself
# (the query image is also in the catalog, so it would otherwise match itself with a score of ~1.0)
masked_similarities = np.where(np.array(image_paths) == query_image, -np.inf, similarities)
most_similar_idx = np.argmax(masked_similarities)
print(f"\nMost similar image: \"{image_paths[most_similar_idx]}\" with similarity score: {similarities[most_similar_idx]:.4f}")


