<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Thoughts in Code]]></title><description><![CDATA[Insights on tech, AI, and entrepreneurship from my journey as a developer and innovator.]]></description><link>https://www.ayarshabeer.com</link><image><url>https://substackcdn.com/image/fetch/$s_!3EI6!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9331e52a-80a4-402d-81fc-92acf8e49d03_1280x1280.png</url><title>Thoughts in Code</title><link>https://www.ayarshabeer.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 06 Apr 2026 20:31:47 GMT</lastBuildDate><atom:link href="https://www.ayarshabeer.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Shabeer Ayar]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[ayarshabeer@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[ayarshabeer@substack.com]]></itunes:email><itunes:name><![CDATA[Shabeer Ayar]]></itunes:name></itunes:owner><itunes:author><![CDATA[Shabeer Ayar]]></itunes:author><googleplay:owner><![CDATA[ayarshabeer@substack.com]]></googleplay:owner><googleplay:email><![CDATA[ayarshabeer@substack.com]]></googleplay:email><googleplay:author><![CDATA[Shabeer Ayar]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[A Deep Dive into RAG Approaches: Making AI Systems Work Better with Knowledge]]></title><description><![CDATA[Ever try to explain something complicated and wish you had all your reference books handy?]]></description><link>https://www.ayarshabeer.com/p/a-deep-dive-into-rag-approaches-making</link><guid 
isPermaLink="false">https://www.ayarshabeer.com/p/a-deep-dive-into-rag-approaches-making</guid><dc:creator><![CDATA[Shabeer Ayar]]></dc:creator><pubDate>Mon, 13 Jan 2025 08:17:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ego3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ego3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ego3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png 424w, https://substackcdn.com/image/fetch/$s_!Ego3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png 848w, https://substackcdn.com/image/fetch/$s_!Ego3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png 1272w, https://substackcdn.com/image/fetch/$s_!Ego3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ego3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png" width="1456" height="530" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:530,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:100345,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ego3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png 424w, https://substackcdn.com/image/fetch/$s_!Ego3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png 848w, https://substackcdn.com/image/fetch/$s_!Ego3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png 1272w, https://substackcdn.com/image/fetch/$s_!Ego3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6300b9-fa1b-4a54-a399-baa5f4612ecb_2719x989.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Ever try to explain something complicated and wish you had all your reference books handy? That's what RAG does for AI - it helps AI systems give better answers by checking reliable sources first. Let's explore how different RAG approaches work and when to use them.</p><h2>The Foundation: Basic RAG Approaches</h2><h3>Flat Retrieval: The Simple but Effective Approach</h3><p>Think of flat retrieval like searching through a stack of papers where everything's treated equally. The system looks at each document to find relevant information.</p><p><strong>How it works:</strong></p><pre><code># Simple example of flat retrieval
documents = ["doc1.txt", "doc2.txt", "doc3.txt"]
query = "How do computers work?"

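# Hypothetical similarity function, assumed for illustration only
# (a real system would compare embeddings, e.g. cosine similarity)
def calculate_similarity(query, doc):
    q = set(query.lower().split())
    d = set(doc.lower().replace(".txt", "").split())
    return len(q.intersection(d)) / max(len(q), 1)
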
# Search through all documents equally
threshold = 0.3     # minimum similarity score to keep a document
relevant_docs = []

for doc in documents:
    similarity = calculate_similarity(query, doc)
    if similarity &gt; threshold:
        relevant_docs.append(doc)</code></pre><p><strong>Pros:</strong></p><ul><li><p>Easy to set up and maintain</p></li><li><p>Works well with smaller document collections</p></li><li><p>No complex organization needed</p></li></ul><p><strong>Cons:</strong></p><ul><li><p>Gets slower as your document collection grows</p></li><li><p>Might miss important context</p></li><li><p>Can be less precise than other methods</p></li></ul><p><strong>Best for:</strong> Small to medium-sized knowledge bases where quick setup is important.</p><h3>Hierarchical RAG: The Organized Library</h3><p>Picture a library with main sections, subsections, and individual books. That's hierarchical RAG - it organizes knowledge in levels.</p><p><strong>How it works:</strong></p><pre><code># Example of hierarchical organization
knowledge_base = {
    "technology": {
        "computers": {
            "hardware": ["cpu.txt", "memory.txt"],
            "software": ["os.txt", "apps.txt"]
        }
    }
}

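# Hypothetical helpers, assumed for illustration (the post leaves them undefined)
def find_category(query):
    # Naive routing: pick the sub-category whose name appears in the query
    computers = knowledge_base["technology"]["computers"]
    for name, docs in computers.items():
        if name in query.lower():
            return docs
    return []

def search_documents(category, query):
    # Placeholder ranking: return the matched category's documents as-is
    return list(category)
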
def search_hierarchical(query):
    # First find relevant category
    category = find_category(query)
    # Then search within that category
    return search_documents(category, query)</code></pre><p><strong>Pros:</strong></p><ul><li><p>Faster searches in large document collections</p></li><li><p>Better context awareness</p></li><li><p>More organised knowledge structure</p></li></ul><p><strong>Cons:</strong></p><ul><li><p>Takes more time to set up</p></li><li><p>Needs regular maintenance</p></li><li><p>Might miss connections between categories</p></li></ul><p><strong>Best for:</strong> Large knowledge bases where search speed matters.</p><h2>Advanced Approaches</h2><h3>Hybrid RAG: The Best of Both Worlds</h3><p>Hybrid RAG combines different search methods, like using both keywords and meaning to find information.</p><p><strong>Real-world example:</strong></p><pre><code>def hybrid_search(query):
    # Get results from keyword search (e.g. BM25) - placeholder helper
    keyword_results = search_keywords(query)

    # Get results from semantic search over embeddings - placeholder helper
    semantic_results = search_semantic(query)

    # Combine and rank results (e.g. reciprocal rank fusion)
    final_results = combine_results(keyword_results, semantic_results)
    return final_results</code></pre><p><strong>Pros:</strong></p><ul><li><p>More accurate results</p></li><li><p>Handles different types of queries well</p></li><li><p>More robust search capability</p></li></ul><p><strong>Cons:</strong></p><ul><li><p>More complex to implement</p></li><li><p>Needs more computing power</p></li><li><p>Can be harder to debug</p></li></ul><p><strong>Best for:</strong> Systems that need high accuracy and can handle the extra complexity.</p><h3>Memory-Augmented RAG: The System That Remembers</h3><p>This approach remembers previous interactions to give better answers over time.</p><p><strong>Example of how it works:</strong></p><pre><code>class ConversationMemory:
    def __init__(self):
        self.history = []
    
    def add_interaction(self, query, response):
        self.history.append({
            'query': query,
            'response': response,
            'timestamp': time.time()  # stdlib time module (requires: import time)
        })
    
    def get_relevant_history(self, current_query):
        # Simple relevance check: keep interactions whose query shares words
        # with the current query (a real system would compare embeddings)
        query_words = set(current_query.lower().split())
        return [h for h in self.history
                if query_words.intersection(h['query'].lower().split())]
    'symptoms': symptoms_database,
    'treatments': treatments_database,
    'research_papers': research_papers
}

def diagnose_assist(patient_symptoms):
    # Placeholder helper - a real system would search the databases above
    relevant_cases = search_medical_knowledge(patient_symptoms)
    return generate_diagnosis_suggestions(relevant_cases)</code></pre><h3>Legal Research Example</h3><p>Law firms might use hierarchical RAG to search through case law:</p><pre><code>legal_database = {
    'criminal_law': {
        'precedents': [...],
        'statutes': [...]
    },
    'civil_law': {
        'contracts': [...],
        'torts': [...]
    }
}</code></pre><h2>Choosing the Right Approach</h2><p>The best RAG approach depends on your needs:</p><ul><li><p>For a small company website: Simple flat retrieval might be enough</p></li><li><p>For a medical system: Domain-specific RAG with hierarchical organization</p></li><li><p>For a customer service bot: Memory-augmented RAG to remember customer history</p></li></ul><h2>What's Next?</h2><p>RAG systems keep getting better. New approaches like multi-modal RAG (handling text and images) and cross-lingual RAG (working across languages) are making these systems more powerful.</p><p>Remember, the goal is to help AI give better answers. Pick the approach that matches your needs, data size, and technical capabilities.</p><p>Want to try implementing one of these approaches? Let me know, and I can help you get started with more specific code examples!</p>]]></content:encoded></item><item><title><![CDATA[Why Less is More: The Case Against RAG in AI Knowledge Systems]]></title><description><![CDATA[A new research paper "Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks" (arxiv.org/abs/2412.15605) suggests we might be overcomplicating how AI systems access knowledge.]]></description><link>https://www.ayarshabeer.com/p/why-less-is-more-the-case-against</link><guid isPermaLink="false">https://www.ayarshabeer.com/p/why-less-is-more-the-case-against</guid><dc:creator><![CDATA[Shabeer Ayar]]></dc:creator><pubDate>Thu, 02 Jan 2025 12:48:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!WCSO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!WCSO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WCSO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png 424w, https://substackcdn.com/image/fetch/$s_!WCSO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png 848w, https://substackcdn.com/image/fetch/$s_!WCSO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png 1272w, https://substackcdn.com/image/fetch/$s_!WCSO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WCSO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png" width="1222" height="812" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:812,&quot;width&quot;:1222,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:202871,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WCSO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png 424w, https://substackcdn.com/image/fetch/$s_!WCSO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png 848w, https://substackcdn.com/image/fetch/$s_!WCSO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png 1272w, https://substackcdn.com/image/fetch/$s_!WCSO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b5d0352-d6c4-40ee-9fc4-88c1c9b2ffd5_1222x812.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>A new research paper "Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks" (arxiv.org/abs/2412.15605) suggests we might be overcomplicating how AI systems access knowledge. Instead of the popular RAG (Retrieval-Augmented Generation) approach, researchers propose a simpler solution called CAG (Cache-Augmented Generation) that could work better in many cases.</p><p><strong>The Problem with RAG</strong></p><p>Current AI systems often use RAG to answer questions that need specific knowledge. Think of RAG like an AI assistant that needs to look up information in a database before answering your question. 
While this works, it has some issues:</p><ul><li><p>It's slow because the system has to search for information each time</p></li><li><p>Sometimes it grabs the wrong information</p></li><li><p>The whole system is pretty complex to set up and maintain</p></li></ul><p><strong>A Simpler Way: Cache-Augmented Generation</strong></p><p>The researchers suggest loading all the needed information into the AI's memory upfront instead of searching for it each time. It's like giving the AI a cheat sheet before the test rather than having it flip through a textbook for every question.</p><p><strong>This approach:</strong></p><ul><li><p>Works faster because there's no searching needed</p></li><li><p>Avoids mistakes from grabbing wrong information</p></li><li><p>Is simpler to set up and run</p></li></ul><p><strong>The Results</strong></p><p>The team tested their idea on two common question-answering tasks (SQuAD and HotPotQA). The results? The simpler cache approach worked as well or better than traditional RAG systems. 
Plus, it was way faster - in some cases, answering questions up to 40 times quicker.</p><p><strong>Important Considerations and Limitations</strong></p><p>While CAG shows promise, it's important to understand its limitations:</p><p><strong>Data Management Challenges:</strong></p><ul><li><p>Needs more memory upfront to store all information</p></li><li><p>Can be tricky to manage with large datasets</p></li><li><p>RAG might be more efficient with memory usage</p></li></ul><p><strong>Staying Current:</strong></p><ul><li><p>CAG uses preloaded data, so it can't access new information easily</p></li><li><p>RAG can pull fresh data in real-time</p></li><li><p>Better for situations where information changes often</p></li></ul><p><strong>Scaling Issues:</strong></p><ul><li><p>Might struggle with very large knowledge bases</p></li><li><p>RAG handles big data better by only grabbing what it needs</p></li><li><p>Performance could drop as data grows</p></li></ul><p><strong>Best Use Cases</strong></p><p><strong>CAG works best when:</strong></p><ul><li><p>You have a manageable amount of information to work with</p></li><li><p>The knowledge base doesn't change much</p></li><li><p>Speed is important</p></li><li><p>You want a simpler system</p></li></ul><p><strong>RAG might be better when:</strong></p><ul><li><p>You're working with constantly updating information</p></li><li><p>Your knowledge base is massive</p></li><li><p>You need flexible access to different data sources</p></li><li><p>Real-time data access is crucial</p></li></ul><p><strong>What This Means for the Future</strong></p><p>As AI models get better at handling more information at once, this approach could become even more useful. We might see a hybrid approach where systems use CAG for stable, frequently-accessed information and RAG for dynamic, real-time data needs.</p><p>This research from Brian J Chan, Chao-Ting Chen, Jui-Hung Cheng, and Hen-Hsen Huang reminds us that there's no one-size-fits-all solution in AI. 
The key is choosing the right tool for your specific needs.</p><p>Want to learn more? Check out the full paper on <a href="https://arxiv.org/html/2412.15605v1">https://arxiv.org/html/2412.15605v1</a></p>]]></content:encoded></item><item><title><![CDATA[Building a Simple Document Q&A System with RAG: A Step-by-Step Guide]]></title><description><![CDATA[Have you ever wanted to chat with your documents?]]></description><link>https://www.ayarshabeer.com/p/building-a-simple-document-q-and</link><guid isPermaLink="false">https://www.ayarshabeer.com/p/building-a-simple-document-q-and</guid><dc:creator><![CDATA[Shabeer Ayar]]></dc:creator><pubDate>Mon, 16 Dec 2024 10:45:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ef4a997b-f065-4cf7-abca-ff7a7e157e61_3004x1588.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever wanted to chat with your documents? Let's look at a document Q&amp;A system that uses RAG (Retrieval Augmented Generation) to help you do just that. This app lets you upload documents and ask questions about them in plain English.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;a5b8e804-b99c-4502-a288-3a781d8b4700&quot;,&quot;duration&quot;:null}"></div><p></p><h2>What Can It Do?</h2><ul><li><p>Upload multiple documents (PDF, Word, Markdown)</p></li><li><p>Ask questions in natural language</p></li><li><p>Get detailed answers with source citations</p></li><li><p>Process documents once and reuse them for many questions</p></li></ul><h2>How It Works</h2><p>Let's break down the main parts of the system:</p><h3>1. 
Document Processing</h3><p>When you upload a document, the app:</p><ul><li><p>Splits it into smaller chunks that are easier to process</p></li><li><p>Creates embeddings (numerical representations) of each chunk using the MiniLM model</p></li><li><p>Stores these embeddings in Pinecone, a vector database</p></li><li><p>Keeps track of which files it has processed to avoid duplicate work</p></li></ul><h3>2. Question Answering</h3><p>When you ask a question:</p><ul><li><p>The app converts your question into an embedding</p></li><li><p>Searches Pinecone for the most relevant document chunks</p></li><li><p>Sends these chunks along with your question to GPT-4</p></li><li><p>Returns an answer based on the document content</p></li></ul><h3>3. Key Components</h3><p>The system uses several modern tools:</p><ul><li><p>Streamlit for the web interface</p></li><li><p>LangChain for document processing and RAG pipeline</p></li><li><p>Sentence Transformers for creating embeddings</p></li><li><p>Pinecone for storing and searching document chunks</p></li><li><p>OpenAI's GPT-4 for generating answers</p></li></ul><p>Here's a sample of what happens when you ask a question:</p><pre><code><code># User asks: "What are the main features of the product?"

# 1. Question -&gt; Embedding
question = "What are the main features of the product?"
question_embedding = embedding_model.encode(question)

# 2. Find Relevant Chunks
relevant_docs = vector_store.similarity_search(question_embedding)

# 3. Generate Answer
answer = llm.generate_answer(question, relevant_docs)
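
# 4. Show Sources (sketch, assuming each retrieved chunk carries "source" metadata)
for doc in relevant_docs:
    print(doc.metadata["source"])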
</code></code></pre><h2>Technical Deep Dive</h2><p>The app uses a three-step RAG process:</p><ol><li><p><strong>Retrieval</strong>: The system finds relevant information from your documents using semantic search. It compares the meaning of your question with stored document chunks.</p></li><li><p><strong>Augmentation</strong>: It takes the retrieved chunks and adds them as context to your question. This gives the language model specific information to work with.</p></li><li><p><strong>Generation</strong>: GPT-4 generates an answer using only the provided context, ensuring responses are grounded in your documents.</p></li></ol><h3>Smart Features</h3><ul><li><p><strong>Deduplication</strong>: The app remembers which files it has processed to avoid duplicate work</p></li><li><p><strong>Chunk Management</strong>: Documents are split with overlap to maintain context</p></li><li><p><strong>Source Tracking</strong>: Every answer comes with references to source documents</p></li><li><p><strong>Configurable Retrieval</strong>: You can adjust how many document chunks to use per query</p></li></ul><h2>Setting Up Your Own Instance</h2><p>To run this app, you'll need:</p><ul><li><p>OpenAI API key (<a href="https://platform.openai.com/api-keys">openai.com</a>)</p></li><li><p>Pinecone API key and index (<a href="https://www.pinecone.io/">pinecone.io</a>)</p></li><li><p>Python environment with required packages</p><p></p></li></ul><h2>Getting Started</h2><ol><li><p>Clone the repository</p></li><li><p>Install requirements</p></li><li><p>Set up your API keys</p></li><li><p>Run the Streamlit app</p></li></ol><p>Demo : <a href="https://doc-rag.streamlit.app/">https://doc-rag.streamlit.app/</a></p><p>[Source Code: <a href="https://github.com/ayarshabeer/doc-qa">Github</a>]</p><h2>Technical Requirements</h2><ul><li><p>Python 3.11+</p></li><li><p>Streamlit</p></li><li><p>LangChain</p></li><li><p>Sentence Transformers</p></li><li><p>Pinecone</p></li><li><p>OpenAI API 
access</p></li></ul><p>This Q&amp;A system shows how modern AI tools can make document interaction more natural and efficient. By combining RAG with a user-friendly interface, we've created a practical tool for anyone who needs to quickly find information in their documents.</p><p>Would you like me to explain any part in more detail?</p>]]></content:encoded></item><item><title><![CDATA[UV: Making Python Package Management Fast and Simple]]></title><description><![CDATA[Python package management has always been a bit slow.]]></description><link>https://www.ayarshabeer.com/p/uv-making-python-package-management</link><guid isPermaLink="false">https://www.ayarshabeer.com/p/uv-making-python-package-management</guid><dc:creator><![CDATA[Shabeer Ayar]]></dc:creator><pubDate>Thu, 12 Dec 2024 12:54:31 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/95cac3c8-97cc-4224-84c7-403b949ae7d3_2536x1424.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Python package management has always been a bit slow. If you've worked on Python projects, you know the wait when running <code>&#8220;pip install&#8221;</code>. But there's good news - UV is here to change that.</p><h2>What is UV?</h2><p>UV is a new Python package manager written in Rust. It does the same job as pip but much faster. Think of it as pip's speedier cousin.</p><h2>Why Should You Care?</h2><p>Simple - it saves time. Here's what UV does better:</p><ol><li><p>It's really fast - about 10-100x faster than pip</p></li><li><p>Works with your existing project setup</p></li><li><p>Handles both packages and Python versions</p></li></ol><h2>Getting Started</h2><p>First, let's install UV:</p><pre><code><code>curl -LsSf https://astral.sh/uv/install.sh | sh
</code></code></pre><p>Now, let's see UV in action. Here's how to create a new project:</p><pre><code><code># Create a new virtual environment
uv venv

# Activate it (on Unix systems)
source .venv/bin/activate

# Install some packages
uv pip install flask pandas
</code></code></pre><h2>Real-World Example</h2><p>Let's say you're building a web app. Here's how UV makes it easier:</p><pre><code><code># Create a requirements.txt file
uv pip freeze &gt; requirements.txt

# Install dependencies in a new environment
uv pip sync requirements.txt
</code></code></pre><h2>Key Features That Matter</h2><h3>Fast Installation</h3><p>UV installs packages in parallel. What used to take minutes now takes seconds.</p><h3>Smart Caching</h3><p>UV remembers what you've installed before. No more downloading the same packages over and over.</p><h3>Python Version Management</h3><p>Need Python 3.10 for one project and 3.11 for another? UV handles that:</p><pre><code><code>uv python install 3.11
uv python install 3.10
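
# Pin this project to one of them (writes a .python-version file)
uv python pin 3.11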
</code></code></pre><h2>Common Tasks Made Simple</h2><h3>Working with Requirements Files</h3><pre><code><code># Generate requirements.txt
uv pip freeze &gt; requirements.txt

# Install from requirements.txt
uv pip install -r requirements.txt
</code></code></pre><h3>Managing Virtual Environments</h3><pre><code><code># Create a venv
uv venv

# Remove a venv
rm -rf .venv
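
# Create a venv with a specific interpreter version
uv venv --python 3.11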
</code></code></pre><h2>When to Use UV</h2><p>UV is great for:</p><ul><li><p>New Python projects where you want fast setup</p></li><li><p>CI/CD pipelines where speed matters</p></li><li><p>Projects with lots of dependencies</p></li><li><p>Teams that need consistent Python versions</p></li></ul><h2>Should You Switch?</h2><p>If you're happy with pip, you don't need to switch right away. But if you want faster package installation and simpler Python version management, UV is worth trying.</p><h2>Tips for Success</h2><ul><li><p>Keep your requirements.txt up to date</p></li><li><p>Use UV's cache to speed up installations</p></li><li><p>Try UV in a test project first</p></li><li><p>Remember UV works alongside pip - you don't have to choose one or the other</p></li></ul><h2>Wrapping Up</h2><p>UV makes Python package management faster and simpler. It's not trying to replace everything - it just does common tasks better. Give it a try on your next project. Checkout official doc: <a href="https://docs.astral.sh/uv">https://docs.astral.sh/uv</a></p><p>Remember: You can still use your familiar pip commands with UV. 
Just add <code>uv pip</code> instead of <code>pip</code> and you're good to go.</p>]]></content:encoded></item><item><title><![CDATA[Understanding RAG (Retrieval-Augmented Generation) Techniques: A Beginner’s Guide]]></title><description><![CDATA[If you&#8217;ve ever used a chatbot or a smart assistant and been impressed by how well it answers your questions, there&#8217;s a good chance Retrieval-Augmented Generation (RAG) was involved.]]></description><link>https://www.ayarshabeer.com/p/understanding-rag-retrieval-augmented</link><guid isPermaLink="false">https://www.ayarshabeer.com/p/understanding-rag-retrieval-augmented</guid><dc:creator><![CDATA[Shabeer Ayar]]></dc:creator><pubDate>Thu, 05 Dec 2024 09:04:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1a736385-5ed9-4618-9a60-144af581b9bb_1005x511.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you&#8217;ve ever used a chatbot or a smart assistant and been impressed by how well it answers your questions, there&#8217;s a good chance Retrieval-Augmented Generation (RAG) was involved. RAG is a simple but powerful concept: it retrieves relevant information (like from a database or documents) and uses that to generate an answer. Let&#8217;s break down some cool variations of RAG, what makes each special, and when to use them.</p><h4><strong>1. Simple RAG</strong></h4><p>At its core, Simple RAG is the bread and butter of this method. Imagine you&#8217;re searching for the best pizza in town. Simple RAG would go through restaurant reviews and pick out the most relevant information for you. 
Then, it uses that info to give you a summarised answer.</p><p>For example:</p><p>&#8226; <strong>Query:</strong> &#8220;What are the benefits of eating fruits?&#8221;</p><p>&#8226; <strong>Simple RAG&#8217;s Answer:</strong> &#8220;Fruits are rich in vitamins, fiber, and antioxidants, which support a healthy immune system.&#8221;</p><p>It&#8217;s straightforward but relies on having good data to fetch from.</p><div><hr></div><p><strong>2. Simple RAG with Memory</strong></p><p>Now let&#8217;s add memory to the mix. If you&#8217;ve been chatting about your favourite books, Simple RAG with Memory remembers your previous mentions. It uses that context to make follow-up conversations smarter and more personalised.</p><p>Example:</p><p>&#8226; <strong>You:</strong> &#8220;Tell me about Harry Potter.&#8221;</p><p>&#8226; <strong>Follow-up:</strong> &#8220;What other books are similar to it?&#8221;</p><p>&#8226; <strong>RAG with Memory:</strong> &#8220;You might enjoy &#8216;Percy Jackson&#8217; since it also explores magic and young heroes.&#8221;</p><p>This approach is great for maintaining meaningful interactions, especially in customer support or personal assistants.</p><div><hr></div><p><strong>3. Branched RAG</strong></p><p>Sometimes, one step of retrieval isn&#8217;t enough. Branched RAG takes things further by doing multiple rounds of searching, refining its results each time.</p><p>Imagine you&#8217;re planning a trip:</p><p>&#8226; <strong>Step 1:</strong> Search for top tourist spots in Paris.</p><p>&#8226; <strong>Step 2:</strong> Find restaurants near those spots.</p><p>&#8226; <strong>Step 3:</strong> Combine results into a travel plan.</p><p>This method is like asking follow-up questions during a conversation to get better details.</p><div><hr></div><p><strong>4. HyDE (Hypothetical Document Embedding)</strong></p><p>HyDE gets creative! Before searching, it imagines what the perfect answer would look like. 
Then, it uses this imagined answer to guide its search for real documents.</p><p>Think of it like guessing what your dream home might look like, then browsing listings to match that vision.</p><p>Example:</p><p>&#8226; <strong>Query:</strong> &#8220;How do plants grow?&#8221;</p><p>&#8226; <strong>HyDE&#8217;s Ideal Answer:</strong> &#8220;Plants grow through photosynthesis and require sunlight, water, and nutrients.&#8221;</p><p>&#8226; <strong>Search Results:</strong> Finds documents matching this description for a more accurate response.</p><div><hr></div><p><strong>5. Adaptive RAG</strong></p><p>What if some questions are easy, and others are super tricky? Adaptive RAG is smart enough to switch strategies depending on the difficulty.</p><p>&#8226; <strong>Easy Query:</strong> &#8220;What&#8217;s 2+2?&#8221; -&gt; Directly answers.</p><p>&#8226; <strong>Hard Query:</strong> &#8220;What&#8217;s the history of the Internet?&#8221; -&gt; Spends more time retrieving and generating a detailed response.</p><p>This adaptability makes it useful for dynamic environments like teaching tools or research assistance.</p><div><hr></div><p><strong>6. Corrective RAG (CRAG)</strong></p><p>Ever gotten a chatbot response that felt a bit off? CRAG fixes that. It fact-checks its own answers against retrieved information and improves the response if needed.</p><p>Example:</p><p>&#8226; <strong>Initial Answer:</strong> &#8220;Dinosaurs lived 10 million years ago.&#8221;</p><p>&#8226; <strong>Fact-Check:</strong> &#8220;Dinosaurs actually lived about 65 million years ago.&#8221;</p><p>&#8226; <strong>Corrected Answer:</strong> &#8220;Dinosaurs lived about 65 million years ago.&#8221;</p><p>This technique is perfect for ensuring accuracy in sensitive fields like medicine or law.</p><div><hr></div><p><strong>7. Self-RAG</strong></p><p>Here&#8217;s where things get introspective. Self-RAG evaluates its own answers, finds flaws, and fixes them. 
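</p><p>A minimal sketch of that evaluate-and-revise loop, with toy stand-ins for the model calls (<code>generate</code>, <code>critique</code>, and <code>revise</code> are invented placeholders here, not a real Self-RAG implementation):</p>

```python
# Self-RAG style loop: draft an answer, critique it, revise until it passes.
# All three "model" functions are toy stand-ins for real LLM calls.

def generate(query: str) -> str:
    return "Dinosaurs lived a long time ago."  # deliberately vague first draft

def critique(answer: str) -> list[str]:
    flaws = []
    if "long time ago" in answer:  # toy flaw detector
        flaws.append("too vague: give a concrete timeframe")
    return flaws

def revise(answer: str, flaws: list[str]) -> str:
    return "Dinosaurs lived about 65 million years ago."

def self_rag(query: str, max_rounds: int = 3) -> str:
    answer = generate(query)
    for _ in range(max_rounds):
        flaws = critique(answer)
        if not flaws:  # critique came back clean, so stop early
            break
        answer = revise(answer, flaws)
    return answer

print(self_rag("When did dinosaurs live?"))
```

<p>In a real system each of those functions would be a model call, and the loop keeps refining until the critique passes or the round limit is hit.</p><p>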
It&#8217;s like writing an essay, re-reading it, and improving the weak parts.</p><p>For example, if the first response was vague, Self-RAG might dive back into its sources and refine the answer until it&#8217;s crystal clear.</p><div><hr></div><p><strong>8. Agentic RAG</strong></p><p>This is the superhero of RAGs. Agentic RAG doesn&#8217;t just stop at answering questions&#8212;it solves problems step by step. It&#8217;s like having a digital assistant that plans your day, books appointments, and handles emails, all while chatting with you.</p><p>Example:</p><p>&#8226; <strong>Query:</strong> &#8220;Help me plan a weekend trip.&#8221;</p><p>&#8226; <strong>Agentic RAG&#8217;s Actions:</strong></p><p>1. Finds destinations.</p><p>2. Suggests travel options.</p><p>3. Books hotels and activities based on your preferences.</p><p>This is ideal for complex tasks that require multiple steps and decision-making.</p><div><hr></div><p><strong>The Role of Vector Databases</strong></p><p>All these RAG techniques rely on something crucial: the ability to find relevant data quickly. That&#8217;s where vector databases come in. 
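</p><p>The similarity lookup they provide can be sketched with plain cosine similarity (the three-dimensional vectors and document labels below are invented for illustration; real embeddings are learned and have hundreds of dimensions):</p>

```python
import math

# Toy "embeddings": invented 3-dimensional vectors standing in for real,
# learned embedding vectors.
docs = {
    "fruit benefits": [0.9, 0.1, 0.0],
    "balanced diets": [0.8, 0.2, 0.1],
    "car maintenance": [0.0, 0.1, 0.9],
}

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query_vec, k=2):
    # Rank documents by similarity to the query vector, most similar first.
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
    return ranked[:k]

# A query vector sitting near the "healthy eating" region of the space.
print(nearest([0.85, 0.15, 0.05]))  # the food-related documents rank first
```

<p>A real vector database does the same ranking at scale, using approximate nearest-neighbour indexes instead of a full scan.</p><p>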
They store data in a way that makes it easy to find &#8220;similar&#8221; pieces of information based on your query.</p><p>For example, if your query is &#8220;healthy eating,&#8221; a vector database might find documents about fruits, vegetables, and balanced diets&#8212;even if those documents don&#8217;t explicitly use the words &#8220;healthy eating.&#8221;</p><div><hr></div><p><strong>When to Use Each RAG Technique</strong></p><p>&#8226; <strong>Simple RAG:</strong> Quick, straightforward answers.</p><p>&#8226; <strong>Simple RAG with Memory:</strong> For ongoing, personalised conversations.</p><p>&#8226; <strong>Branched RAG:</strong> When you need multi-step research.</p><p>&#8226; <strong>HyDE:</strong> For vague or open-ended questions.</p><p>&#8226; <strong>Adaptive RAG:</strong> For mixed difficulty queries.</p><p>&#8226; <strong>Corrective RAG:</strong> Where accuracy is a must.</p><p>&#8226; <strong>Self-RAG:</strong> To ensure high-quality answers.</p><p>&#8226; <strong>Agentic RAG:</strong> For tasks that require planning and execution.</p><p><strong>Conclusion</strong></p><p>RAG and its variations are transforming how we interact with AI. Whether it&#8217;s answering simple questions or tackling complex tasks, there&#8217;s a RAG technique for every scenario. The beauty lies in its ability to blend retrieval with generation, ensuring that answers are both relevant and insightful.</p><p>Got questions or want to dive deeper into these techniques? Let&#8217;s chat in the comments! 
&#128522;</p>]]></content:encoded></item><item><title><![CDATA[Making AI Conversations Smarter with Model Context Protocol]]></title><description><![CDATA[Hey there!]]></description><link>https://www.ayarshabeer.com/p/making-ai-conversations-smarter-with</link><guid isPermaLink="false">https://www.ayarshabeer.com/p/making-ai-conversations-smarter-with</guid><dc:creator><![CDATA[Shabeer Ayar]]></dc:creator><pubDate>Wed, 27 Nov 2024 09:28:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!N8jE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there!</p><p>When large language models first hit the scene, interacting with them was&#8230; clunky, to say the least. Remember having to copy and paste code into a text box just to get them to respond? Yeah, it worked, but barely.</p><p>Developers quickly realized that this wasn&#8217;t cutting it. Custom integrations popped up to improve how context was handled, but they were all over the place&#8212;each tool or app had its own system, which made everything fragmented and time-consuming to build.</p><p>Enter the <strong>Model Context Protocol (MCP)</strong>: a simple, universal solution to make AI interactions smoother, more efficient, and way less of a headache for developers.</p><div><hr></div><p><strong>The Problem MCP Solves</strong></p><p>Here&#8217;s the deal: when working with AI, you often need it to interact with local and remote resources&#8212;think databases, APIs, or even just the ongoing chat history.</p><p>Back in the day, you had to create custom pipelines to load this context into the model. Every project needed its own unique setup, and there was no standard way of doing it. 
This led to:</p><p>&#8226; <strong>Extra Work</strong>: Reinventing the wheel for every project.</p><p>&#8226; <strong>Fragmentation</strong>: Every app did things differently, making collaboration tricky.</p><p>&#8226; <strong>Inconsistent Results</strong>: Custom systems often didn&#8217;t work as well as they should have.</p><p>MCP fixes this by introducing a <strong>universal protocol</strong> that works across applications, making it easier to manage and load context efficiently.</p><div><hr></div><p><strong>What is the Model Context Protocol?</strong></p><p>At its core, MCP is a standardized way to handle context when working with large language models. Whether your resources are local (files, user inputs) or remote (APIs, cloud databases), MCP ensures the AI model has everything it needs to function properly.</p><p>Here&#8217;s what it brings to the table:</p><p>1. <strong>Universal Integration</strong>: One protocol to handle context across all your applications.</p><p>2. <strong>Efficient Context Management</strong>: Load only the relevant information, keeping interactions streamlined.</p><p>3. <strong>Seamless AI Interaction</strong>: Simplifies how your app communicates with the AI, whether it&#8217;s pulling data from a local file or querying a cloud API.</p><div><hr></div><p><strong>Why MCP is a Game-Changer</strong></p><p>MCP isn&#8217;t just about making life easier for developers (though it does). It&#8217;s also about improving the overall user experience. Here&#8217;s why it stands out:</p><p>1. <strong>No More Reinventing the Wheel</strong></p><p>With MCP, you don&#8217;t need to build custom pipelines for every project. It&#8217;s a plug-and-play solution that just works.</p><p>2. <strong>Better Resource Access</strong></p><p>Need your AI to pull data from a file on the user&#8217;s computer <em>and</em> query a remote database? MCP handles it all seamlessly.</p><p>3. 
<strong>Consistency Across Apps</strong></p><p>Because it&#8217;s a universal protocol, MCP ensures consistent behavior no matter where or how you&#8217;re using it.</p><p>4. <strong>Focus on What Matters</strong></p><p>Developers can spend more time building features and less time wrestling with context-loading issues.</p><div><hr></div><p><strong>How MCP Works</strong></p><p>Here&#8217;s a quick overview of how MCP simplifies AI interactions:</p><p>1. <strong>Define the Context</strong></p><p>Start by deciding what information the AI needs&#8212;this could be user inputs, files, or API responses. MCP makes it easy to structure and package this context.</p><p>2. <strong>Load the Context</strong></p><p>Use MCP to send this context to the AI model. Whether the resource is local or remote, the protocol handles the specifics.</p><p>3. <strong>Dynamic Updates</strong></p><p>As new information becomes available (e.g., the user provides additional input), MCP lets you update the context in real time without skipping a beat.</p><div><hr></div><p><strong>A Practical Example</strong></p><p>Let&#8217;s say you&#8217;re building a code assistant. Here&#8217;s how MCP helps:</p><p>&#8226; <strong>Without MCP</strong>: You&#8217;d need to create custom code to load files, process user inputs, and pull API data, then send it to the model in a format it understands.</p><p>&#8226; <strong>With MCP</strong>: You define the context (e.g., the file paths and user input), and MCP takes care of the rest&#8212;no need to rebuild the same pipelines for every project.</p><div><hr></div><p><strong>Getting Started with MCP</strong></p><p>If this sounds like something you need (and let&#8217;s be real, it probably is), the <a href="https://modelcontextprotocol.io/quickstart">MCP Quickstart Guide</a> is a great place to begin. 
It walks you through:</p><p>&#8226; Installing the MCP library.</p><p>&#8226; Setting up your first context object.</p><p>&#8226; Sending and updating context with simple API calls.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!N8jE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!N8jE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png 424w, https://substackcdn.com/image/fetch/$s_!N8jE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png 848w, https://substackcdn.com/image/fetch/$s_!N8jE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png 1272w, https://substackcdn.com/image/fetch/$s_!N8jE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!N8jE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png" width="1456" height="1152" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:136043,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!N8jE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png 424w, https://substackcdn.com/image/fetch/$s_!N8jE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png 848w, https://substackcdn.com/image/fetch/$s_!N8jE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png 1272w, https://substackcdn.com/image/fetch/$s_!N8jE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aef3dee-04e7-40ac-8099-de6499440da0_1464x1158.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p><strong>Final Thoughts</strong></p><p>The Model Context Protocol solves a real problem for developers: making AI smarter and easier to work with by standardizing how context is managed. It cuts out the messy, repetitive work of building custom solutions and lets you focus on what you do best&#8212;building great applications.</p><p>If you&#8217;re working with large language models, MCP is worth checking out. Your future self (and your users) will thank you.</p><p>Happy coding! 
&#128522;</p>]]></content:encoded></item><item><title><![CDATA[Why Text2SQL Falls Short: How TAG is Revolutionizing AI-Powered Database Queries]]></title><description><![CDATA[Unlocking deeper insights with a blend of AI reasoning and traditional data management systems]]></description><link>https://www.ayarshabeer.com/p/why-text2sql-falls-short-how-tag</link><guid isPermaLink="false">https://www.ayarshabeer.com/p/why-text2sql-falls-short-how-tag</guid><dc:creator><![CDATA[Shabeer Ayar]]></dc:creator><pubDate>Tue, 01 Oct 2024 11:11:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!GSh5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today's data-driven world, the idea of querying databases with natural language is fascinating. Imagine being able to ask any question in plain English and instantly getting the right answer, no SQL queries or technical skills needed. This is where technologies like Text2SQL come into play, aiming to translate natural language into database queries. But as incredible as that sounds, it&#8217;s not quite enough for the complex questions businesses face daily. Enter Table-Augmented Generation (TAG), a new approach designed to tackle this challenge by combining the strengths of AI and traditional database systems.</p><h3>What is Text2SQL?</h3><p>Text2SQL is a system that translates natural language questions into SQL queries. It works pretty well for simple database queries, especially those that have direct relational equivalents like "What is the total sales for Q1?". It does a great job handling these types of questions. However, real-world questions often go beyond what SQL can express. 
Users don&#8217;t just want numbers&#8212;they want insights, patterns, or explanations that involve reasoning, world knowledge, and sometimes, complex data relationships.</p><h3>The Problem with Current Methods</h3><p>While Text2SQL is a step forward, it has limitations. It can only answer questions that fit neatly into SQL queries. For instance, let&#8217;s say you want to know, &#8220;Why did my sales drop in the last quarter?&#8221; or &#8220;What are customers saying about product X in their reviews?&#8221;. These are not simple lookups. They require a deeper understanding of the data, the ability to analyze trends, and sometimes even perform sentiment analysis. Text2SQL falls short in these scenarios because it doesn&#8217;t have the capability to reason over unstructured data or connect external world knowledge.</p><p>Similarly, another approach called Retrieval-Augmented Generation (RAG) tries to address this by retrieving a few relevant pieces of information and using AI to generate answers. 
But RAG is also limited to small, simple queries that can be solved with basic lookups.</p><h3>Enter TAG: A New Paradigm</h3><p>TAG, or Table-Augmented Generation, is a more flexible and powerful way of answering natural language questions over databases. It works in three steps:</p><ol><li><p><strong>Query Synthesis</strong>: TAG first takes the natural language question and creates an executable query based on the database schema.</p></li><li><p><strong>Query Execution</strong>: It then runs this query to retrieve the relevant data from the database.</p></li><li><p><strong>Answer Generation</strong>: Finally, TAG uses the retrieved data along with AI to generate a coherent and useful answer in natural language.</p></li></ol><p>Unlike Text2SQL or RAG, TAG can handle much more complex queries. It can incorporate reasoning, context, and even external knowledge not explicitly stored in the database. This allows it to answer a broader range of questions with higher accuracy.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GSh5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GSh5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png 424w, https://substackcdn.com/image/fetch/$s_!GSh5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png 848w, 
https://substackcdn.com/image/fetch/$s_!GSh5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png 1272w, https://substackcdn.com/image/fetch/$s_!GSh5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GSh5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png" width="722" height="742" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02bcad13-1274-4682-9911-551d9a471e03_722x742.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:742,&quot;width&quot;:722,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:104013,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GSh5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png 424w, https://substackcdn.com/image/fetch/$s_!GSh5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png 848w, 
https://substackcdn.com/image/fetch/$s_!GSh5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png 1272w, https://substackcdn.com/image/fetch/$s_!GSh5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02bcad13-1274-4682-9911-551d9a471e03_722x742.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">An example TAG implementation for answering the user&#8217;s natural language question over a table about movies. 
The TAG pipeline proceeds in three stages: query synthesis, query execution, and answer generation.</figcaption></figure></div><p></p><h3>Why TAG is Better</h3><p>What makes TAG stand out is its ability to combine the raw computational power of databases with the reasoning abilities of AI. Databases are great at handling large-scale, structured data and performing exact computations like aggregations or filtering. On the other hand, AI models excel at semantic reasoning&#8212;understanding meaning from unstructured data like text, images, or external world knowledge.</p><p>For example, if you ask, "Which customer reviews of product X are positive?", TAG can retrieve the reviews from the database and use AI to determine whether each review is positive or negative. Similarly, if you ask, "What are the trends in retail for the last quarter?", TAG can combine data from the database with external knowledge about the retail sector to give a more nuanced answer.</p><h3>The Road Ahead</h3><p>The research on TAG is still developing, but early results are promising. TAG systems outperform current methods in both accuracy and execution time, especially for complex queries. While common approaches like Text2SQL might correctly answer 20% of queries, TAG-based systems are hitting success rates as high as 65%. This significant improvement highlights the potential of TAG to transform how we interact with data.</p><p>By unifying AI and database capabilities, TAG offers an exciting way forward for businesses and users who need more than just simple database lookups. 
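</p><p>As a toy end-to-end sketch of those three stages, here is query synthesis, query execution, and answer generation over an in-memory SQLite table (<code>synthesize_query</code> and <code>compose_answer</code> are invented stand-ins for the model calls a real TAG system would make):</p>

```python
import sqlite3

# Stage 1 (stand-in): a real system would have an LLM write this SQL
# from the question and the database schema.
def synthesize_query(question: str) -> str:
    return "SELECT body FROM reviews WHERE product = 'X'"

# Stage 3 (stand-in): a real system would have an LLM reason over the rows;
# here a crude keyword check plays the part of sentiment analysis.
def compose_answer(question: str, rows) -> str:
    positive = [r[0] for r in rows if "love" in r[0] or "great" in r[0]]
    return f"{len(positive)} of {len(rows)} reviews of product X are positive."

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (product TEXT, body TEXT)")
conn.executemany("INSERT INTO reviews VALUES (?, ?)", [
    ("X", "I love this, great battery life"),
    ("X", "Stopped working after a week"),
])

question = "Which customer reviews of product X are positive?"
sql = synthesize_query(question)       # Stage 1: query synthesis
rows = conn.execute(sql).fetchall()    # Stage 2: query execution
print(compose_answer(question, rows))  # Stage 3: answer generation
```

<p>The database does what it is good at (exact retrieval), and the model layer does what it is good at (interpreting the question and the rows).</p><p>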
Whether you're analyzing customer feedback, tracking sales performance, or exploring complex datasets, TAG brings us one step closer to the dream of truly intelligent, natural language interfaces for data.</p>]]></content:encoded></item><item><title><![CDATA[The Beginner’s Guide to Vector Databases: Understanding the Basics]]></title><description><![CDATA[As technology advances, so does the way we handle data.]]></description><link>https://www.ayarshabeer.com/p/the-beginners-guide-to-vector-databases</link><guid isPermaLink="false">https://www.ayarshabeer.com/p/the-beginners-guide-to-vector-databases</guid><dc:creator><![CDATA[Shabeer Ayar]]></dc:creator><pubDate>Sat, 01 Jun 2024 09:39:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!iXQe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As technology advances, so does the way we handle data. Traditional databases are great for numbers and straightforward data, but they fall short when it comes to more complex stuff like images, videos, and large text files. 
That&#8217;s where vector databases come in, providing a smart solution for today&#8217;s data challenges. Let&#8217;s break down what vector databases are, their pros and cons, what they&#8217;re used for, and how to choose the right one for your needs.</p><h4><strong>What is a Vector Database?</strong></h4><p>Imagine if you could search for data not by exact match, but by similarity. That&#8217;s what vector databases do. They store information as vectors&#8212;basically, long lists of numbers that a computer uses to understand and compare different pieces of data. This makes it super easy to find things that are similar to each other, even in a huge sea of information.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iXQe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iXQe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iXQe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iXQe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!iXQe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iXQe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg" width="735" height="751" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:751,&quot;width&quot;:735,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:177952,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iXQe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iXQe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iXQe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!iXQe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faceeb0d6-e34c-4630-872e-baacf36ff5ee_735x751.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Illustration of a 3D vector space where data points represent words grouped by similarity. 
For instance, &#8216;Wolf&#8217;, &#8216;Dog&#8217;, and &#8216;Cat&#8217; are closer together in space, while &#8216;Apple&#8217; and &#8216;Banana&#8217; form a distinct cluster, demonstrating semantic relationships in a vector database.</p><h4><strong>Benefits of Vector Databases</strong></h4><p>1. <strong>Quick Searches for Similar Items</strong>: If you&#8217;ve ever used a website that suggests products or movies based on what you like, you&#8217;ve seen vector databases in action. They&#8217;re fantastic at quickly finding items that resemble each other.</p><p>2. <strong>Handles Lots of Data Easily</strong>: These databases are built to manage large amounts of data smoothly, which is perfect for businesses dealing with tons of information.</p><p>3. <strong>Works Well with AI</strong>: Vector databases fit nicely with AI and machine learning, making them a go-to for applications that use these technologies.</p><h4><strong>Downsides of Vector Databases</strong></h4><p>1. <strong>Complex to Manage</strong>: Setting up and maintaining these databases can be tricky unless you really know what you&#8217;re doing.</p><p>2. 
<strong>Requires Strong Hardware</strong>: They need powerful computers to run effectively, which can get expensive.</p><p>3. <strong>Limited in Flexibility</strong>: If you need to run complex queries or combine different types of data, vector databases might not be the best choice.</p><h4><strong>Popular Vector Databases</strong></h4><p>Here are some well-known vector databases:</p><p>&#8226; <strong>Pinecone</strong>: Known for its scalability and ease of use, Pinecone is a good choice for businesses that need robust, scalable vector search capabilities.</p><p>&#8226; <strong>Milvus</strong>: An open-source vector database that supports multiple similarity metrics and is designed for high performance in large-scale environments.</p><p>&#8226; <strong>Weaviate</strong>: An open-source vector search engine that integrates seamlessly with machine learning models and offers GraphQL and RESTful interfaces.</p><p>&#8226; <strong>Faiss by Facebook AI</strong>: Primarily a library for efficient similarity search, Faiss is often used in combination with databases to handle vector data effectively.</p><p>&#8226; <strong>Annoy by Spotify</strong>: Another library focused on nearest neighbor search, useful for building custom vector database solutions.</p><h4><strong>When to Use Vector Databases</strong></h4><p>1. <strong>Finding Similar Content</strong>: Whether it&#8217;s searching for similar images or recommending music based on what you already like, vector databases make these tasks a breeze.</p><p>2. <strong>Custom Recommendations</strong>: Online stores and streaming services use these databases to suggest products or shows you might enjoy.</p><p>3. <strong>Spotting Fraud</strong>: In banking, vector databases help spot unusual patterns that could indicate fraud.</p><p>4. 
<strong>Understanding Human Language</strong>: They help chatbots and search engines understand and respond to natural language more effectively.</p><p>For example, the idea of <em><strong>&#8220;King - Man + Woman = Queen&#8221;</strong></em> in language processing shows how vector representations can capture relationships and similarities between words.</p><h4><strong>Choosing the Right Vector Database</strong></h4><p>Here&#8217;s what to consider:</p><p>1. <strong>Size and Growth of Your Data</strong>: Make sure the database can handle your current data and any increase in the future.</p><p>2. <strong>Speed Needs</strong>: Think about how fast you need the system to respond to queries.</p><p>3. <strong>Compatibility</strong>: Check whether the database works well with other systems you&#8217;re using.</p><p>4. <strong>Budget</strong>: Consider both the purchase cost and what you&#8217;ll spend on running the hardware.</p><h4><strong>Conclusion</strong></h4><p>Vector databases are changing the game for businesses that deal with complex, varied data types. By choosing the right database, you can enhance your applications with fast, relevant data retrieval, making your operations smoother and more efficient. Whether you&#8217;re recommending products, detecting fraud, or improving customer interactions, vector databases can provide the tools you need to succeed.</p>]]></content:encoded></item></channel></rss>