<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Talking Tech]]></title><description><![CDATA[We speak fluent binary. We talk the language of technology]]></description><link>https://talkingtech.io/</link><image><url>https://talkingtech.io/favicon.png</url><title>Talking Tech</title><link>https://talkingtech.io/</link></image><generator>Ghost 5.79</generator><lastBuildDate>Tue, 14 Apr 2026 22:59:25 GMT</lastBuildDate><atom:link href="https://talkingtech.io/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Quantum Computing: From Lab Bench to Real-World Impact in 2026]]></title><description><![CDATA[By 2026, quantum computing moved from theoretical promise to practical reality. Quantum advantage became real in specific applications that classical systems couldn't solve.]]></description><link>https://talkingtech.io/quantum-computing-from-lab-bench-to-real-world-impact-in-2026/</link><guid isPermaLink="false">69de40bc0806f20541ab447e</guid><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Tue, 14 Apr 2026 13:31:33 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2026/04/Quantum-breakthroughs-in-a-futuristic-lab.png" medium="image"/><content:encoded><![CDATA[<h3 id="quantum-computing-from-lab-bench-to-real-world-impact-in-2026">Quantum Computing: From Lab Bench to Real-World Impact in 2026</h3>
<img src="https://talkingtech.io/content/images/2026/04/Quantum-breakthroughs-in-a-futuristic-lab.png" alt="Quantum Computing: From Lab Bench to Real-World Impact in 2026"><p>By 2026, quantum computing moved from theoretical promise to practical reality. Quantum advantage became real in specific applications that classical systems couldn&apos;t solve.</p>
<h4 id="what-changed">What changed?</h4>
<p>The problem with early quantum computing was its fragility. Quantum states collapsed easily. Decoherence destroyed calculations. It was basic science at best, unfulfilled promise at worst.</p>
<p>By 2026, the technology matured dramatically. Error correction breakthroughs stabilized qubits. Hybrid quantum-classical approaches became standard. Real businesses integrated quantum processors.</p>
<h4 id="the-milestone-quantum-advantage-became-real">The milestone: Quantum Advantage Became Real</h4>
<p>Researchers demonstrated quantum computers outperforming classical systems in specific tasks. This isn&apos;t theoretical anymore&#x2014;it&apos;s measurable. Companies running quantum advantage applications report:</p>
<ul>
<li>5x speedup in optimization problems</li>
<li>10x improvement in material simulations</li>
<li>Drug discovery timelines compressed from years to months</li>
</ul>
<h4 id="three-breakthrough-areas">Three breakthrough areas:</h4>
<ol>
<li>
<p><strong>Materials Science</strong> - Quantum simulation models molecules and materials at atomic scale. Drug development timelines compressed from years to months.</p>
</li>
<li>
<p><strong>Logistics &amp; Manufacturing</strong> - Quantum optimization solves routing and scheduling problems classical computers couldn&apos;t handle. Supply chains optimized. Factories reconfigured.</p>
</li>
<li>
<p><strong>Finance</strong> - Quantum algorithms process risk models and pricing calculations faster. Portfolio optimization improved. Algorithmic trading enhanced.</p>
</li>
</ol>
<h4 id="the-timeline-quantum-safe-cryptography-gains-urgency">The timeline: Quantum-Safe Cryptography Gains Urgency</h4>
<p>Q-Day is near: the point at which quantum computers become powerful enough to break today&apos;s encryption. When it arrives, nearly all digital communications will be at risk.</p>
<p>The response is already underway:</p>
<ul>
<li>Post-quantum cryptography standards adopted</li>
<li>Hybrid encryption schemes deployed</li>
<li>Financial institutions lead migration</li>
<li>Government agencies accelerate protection</li>
</ul>
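<p>To make the hybrid encryption idea from the list above concrete, here is a minimal sketch in Python. The classical and post-quantum shared secrets are stubbed with random bytes; in practice they would come from an ECDH exchange and a post-quantum KEM such as ML-KEM, via whichever library your stack provides. The point is the pattern, not a production implementation: the session key depends on both secrets, so breaking either scheme alone is not enough.</p>
<pre><code>import hashlib
import hmac
import os

def hkdf_sha256(key_material: bytes, info: bytes, length: int = 32) -&gt; bytes:
    # Minimal HKDF (RFC 5869) with an empty salt; enough for illustration.
    prk = hmac.new(b&quot;\x00&quot; * 32, key_material, hashlib.sha256).digest()
    okm, block, counter = b&quot;&quot;, b&quot;&quot;, 1
    while len(okm) &lt; length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Stand-ins: real deployments take these from an ECDH exchange and a post-quantum KEM.
classical_shared_secret = os.urandom(32)
pq_shared_secret = os.urandom(32)

# The derived session key depends on BOTH secrets.
session_key = hkdf_sha256(classical_shared_secret + pq_shared_secret,
                          info=b&quot;hybrid-key-demo&quot;)
print(session_key.hex())</code></pre>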
<p>The technology landscape evolved.</p>
<p>Quantinuum&apos;s H2 quantum processor combined with Microsoft&apos;s software stack created a hybrid quantum-classical approach. Major vendors entered the race globally, with American, Chinese, and European tech leaders all pushing the boundaries.</p>
<p>The business adoption is accelerating.</p>
<p>Enterprises aren&apos;t waiting for perfection. They&apos;re integrating quantum where it matters most: pharma companies are testing it for drug discovery, banks for risk modeling, and logistics firms for optimization.</p>
<p>The talent challenge is real.</p>
<p>Organizations lack quantum expertise. Training programs are emerging and certifications are gaining credibility. The skills gap mirrors the one zero trust faced in its early years.</p>
<p>The standards debate matters.</p>
<p>IEEE and other bodies are addressing quantum computing standards. Without standards, interoperability suffers. Without frameworks, investment hesitates.</p>
<p>2026 is the turning point.</p>
<p>Quantum advantage is real. Quantum security is urgent. Quantum adoption is accelerating. The lab has moved to the production floor.</p>
<p>The infrastructure is maturing.</p>
<p>Cloud platforms provide quantum access. Software stacks abstract the hardware. Developers write quantum code without quantum physicists.</p>
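<p>As a small illustration of how approachable the tooling has become, the sketch below builds a two-qubit circuit with the open-source Qiskit SDK. It only constructs and prints the circuit; submitting it to real hardware depends on which cloud provider and backend you use.</p>
<pre><code>from qiskit import QuantumCircuit

# Build a two-qubit Bell-state circuit without touching real hardware.
qc = QuantumCircuit(2, 2)
qc.h(0)                      # put qubit 0 into superposition
qc.cx(0, 1)                  # entangle qubit 0 with qubit 1
qc.measure([0, 1], [0, 1])   # read both qubits out

print(qc.draw())             # inspect the circuit; a cloud backend would run it</code></pre>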
<p>The range of future applications is expanding.</p>
<ul>
<li>Quantum machine learning for AI integration</li>
<li>Quantum simulation for climate modeling</li>
<li>Quantum-enhanced detection of financial fraud</li>
<li>Quantum optimization for supply chains</li>
</ul>
<p>2026 proves quantum computing is no longer experimental. It&apos;s the new frontier. Organizations embracing quantum will have a competitive advantage. The question isn&apos;t whether quantum works&#x2014;it&apos;s how to use it.</p>
<p>The bottom line: quantum computing achieved real advantage in 2026. The race is on globally. The future belongs to those who adapt.</p>
<p>The future is quantum.</p>
]]></content:encoded></item><item><title><![CDATA[Beyond Lines of Code: How Repository Intelligence Is Transforming AI-Driven Development]]></title><description><![CDATA[The AI landscape is shifting from generative to relational intelligence. GitHub's Mario Rodriguez describes 2026 as a year of repository intelligence]]></description><link>https://talkingtech.io/beyond-lines-of-code-how-repository-intelligence-is-transforming-ai-driven-development-2/</link><guid isPermaLink="false">69db74a60806f20541ab4462</guid><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Sun, 12 Apr 2026 10:35:09 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2026/04/feature_image-22.png" medium="image"/><content:encoded><![CDATA[<h2 id="the-ai-landscape-is-shifting-from-generative-to-relational-intelligence-githubs-mario-rodriguez-describes-2026-as-a-year-of-repository-intelligence">The AI landscape is shifting from generative to relational intelligence. GitHub&apos;s Mario Rodriguez describes 2026 as a year of repository intelligence.</h2>
<img src="https://talkingtech.io/content/images/2026/04/feature_image-22.png" alt="Beyond Lines of Code: How Repository Intelligence Is Transforming AI-Driven Development"><p>This isn&apos;t about AI writing more code. It&apos;s about AI understanding the entire context.</p>
<h2 id="what-is-repository-intelligence">What is repository intelligence?</h2>
<p>It&apos;s the emerging paradigm where AI understands not just lines of code, but the relationships, dependencies, and historical evolution of codebases. GitHub&apos;s repository intelligence leverages millions of repositories to teach models about how code actually works in practice.</p>
<h2 id="why-the-shift">Why the shift?</h2>
<p>The problem with previous AI tools was they treated code as isolated lines. Repository intelligence treats code as a living system&#x2014;understanding how a function interacts across modules, how historical commits inform current decisions, and how code patterns evolve over time.</p>
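<p>To see why that framing matters, here is a toy sketch of the underlying idea: mine import relationships and recent commit history so a model sees structure and evolution rather than isolated lines. It is an illustration of the concept, not GitHub&apos;s implementation.</p>
<pre><code>import re
import subprocess
from collections import defaultdict
from pathlib import Path

def build_repo_context(root: str) -&gt; dict:
    # Toy repository graph: which module imports what, plus recent history.
    imports = defaultdict(set)
    for path in Path(root).rglob(&quot;*.py&quot;):
        text = path.read_text(errors=&quot;ignore&quot;)
        for match in re.finditer(r&quot;^\s*(?:from|import)\s+([\w\.]+)&quot;, text, re.M):
            imports[path.stem].add(match.group(1))

    # Recent commit subjects give the model a sense of how the code evolved.
    log = subprocess.run(
        [&quot;git&quot;, &quot;-C&quot;, root, &quot;log&quot;, &quot;--oneline&quot;, &quot;-n&quot;, &quot;20&quot;],
        capture_output=True, text=True,
    ).stdout.splitlines()

    return {&quot;imports&quot;: dict(imports), &quot;recent_commits&quot;: log}</code></pre>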
<h2 id="three-key-capabilities">Three key capabilities:</h2>
<ol>
<li>
<p><strong>Contextual Understanding</strong> - AI that knows your codebase&apos;s unique patterns, naming conventions, and architectural decisions.</p>
</li>
<li>
<p><strong>Change Prediction</strong> - Predicting how modifications to one module affect the entire system.</p>
</li>
<li>
<p><strong>Historical Learning</strong> - Learning from past mistakes and successful implementations across your organization&apos;s repositories.</p>
</li>
</ol>
<h2 id="githubs-repository-intelligence-specifically">GitHub&apos;s repository intelligence specifically:</h2>
<p>The sheer volume matters. With hundreds of millions of repositories, the data enables models to learn from real-world code patterns rather than synthetic training data.</p>
<p>The result is AI that doesn&apos;t just write code&#x2014;it writes code that works.</p>
<h2 id="the-impact-is-already-visible">The impact is already visible.</h2>
<p>Companies using repository intelligence report:</p>
<ul>
<li>40% reduction in onboarding time for new developers</li>
<li>30% improvement in code quality metrics</li>
<li>25% faster feature delivery</li>
</ul>
<p>But the benefits go beyond productivity. The deeper insight comes from understanding the &apos;why&apos; behind code decisions. Repository intelligence can explain historical decisions, predict future risks, and suggest architectural improvements.</p>
<h2 id="the-challenge-is-data-quality">The challenge is data quality.</h2>
<p>The model only works if you feed it real, clean data. Organizations need to provide consistent code documentation and maintain well-structured repositories.</p>
<h2 id="the-technology-is-maturing">The technology is maturing.</h2>
<p>In early 2026, Claude Opus 4.5 and Claude Sonnet 4.5 emerged as top performers for coding work. Anthropic&apos;s shift toward developer-focused models reflects industry consensus: developers need tools that work.</p>
<h2 id="the-future-is-already-here">The future is already here.</h2>
<p>Repository intelligence transforms AI from a code generator to a code architect. The question isn&apos;t whether AI can write better code&#x2014;it&apos;s whether we can use it to write better systems.</p>
<p>The bottom line: repository intelligence is no longer experimental. It&apos;s the new standard. Organizations integrating this technology will have a competitive advantage in software quality, delivery speed, and developer experience.</p>
]]></content:encoded></item><item><title><![CDATA[Beyond Perimeter: How Zero Trust Architecture Is Redefining Security]]></title><description><![CDATA[What was once theoretical is now practical. Zero Trust isn't just about adding more layers. It's about a complete reversal of assumptions. Never trust, always verify. Every access request is evaluated based on policy regardless of source.]]></description><link>https://talkingtech.io/beyond-perimeter-how-zero-trust-architecture-is-redefining-security/</link><guid isPermaLink="false">69d9755f0806f20541ab4444</guid><category><![CDATA[Cyber Security]]></category><category><![CDATA[Zero Trust]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Fri, 10 Apr 2026 22:20:42 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2026/04/ChatGPT-Image-Apr-11--2026--02_19_01-AM.png" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2026/04/ChatGPT-Image-Apr-11--2026--02_19_01-AM.png" alt="Beyond Perimeter: How Zero Trust Architecture Is Redefining Security"><p>The old model of security was built on two assumptions that no longer hold: trust the inside, distrust the outside. In 2026, that&apos;s dead.</p>
<p>Enter Zero Trust.</p>
<p>What was once theoretical is now practical. Zero Trust isn&apos;t just about adding more layers. It&apos;s about a complete reversal of assumptions. Never trust, always verify. Every access request is evaluated based on policy regardless of source.</p>
<p>Why the shift? Three realities:</p>
<ol>
<li>Remote work is permanent. Employees access data from anywhere.</li>
<li>Cloud environments have expanded beyond IT&apos;s control.</li>
<li>Supply chains are interconnected. A compromise in a third-party can breach your internal systems.</li>
</ol>
<p>The architecture requires continuous verification. Identity is the primary factor. Devices must be authenticated. Data access must be limited. The principle of least privilege applies everywhere.</p>
<p>This means breaking down the perimeter concept. There is no internal zone that&apos;s safe. Every access point is a potential threat. Every user is a potential threat.</p>
<p>The implementation involves multiple components. Multi-factor authentication becomes ubiquitous. Continuous monitoring tracks behavioral patterns. Micro-segmentation limits lateral movement. Zero Trust Network Access extends to all devices.</p>
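<p>A minimal sketch of the policy idea, with hypothetical roles and resources: every request is checked for identity, device posture, and least privilege before anything is granted.</p>
<pre><code>from dataclasses import dataclass

@dataclass
class AccessRequest:
    user: str
    role: str
    device_compliant: bool
    mfa_passed: bool
    resource: str

# Hypothetical least-privilege policy: each role lists the only resources it may touch.
POLICY = {
    &quot;developer&quot;: {&quot;git&quot;, &quot;ci&quot;},
    &quot;finance&quot;: {&quot;ledger&quot;},
}

def evaluate(req: AccessRequest) -&gt; bool:
    # Every request is verified on every access; nothing inside the network is trusted.
    if not req.mfa_passed:            # identity must be strongly verified
        return False
    if not req.device_compliant:      # device posture is checked continuously
        return False
    return req.resource in POLICY.get(req.role, set())   # least privilege

print(evaluate(AccessRequest(&quot;alice&quot;, &quot;developer&quot;, True, True, &quot;git&quot;)))      # True
print(evaluate(AccessRequest(&quot;alice&quot;, &quot;developer&quot;, True, True, &quot;ledger&quot;)))   # False</code></pre>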
<p>But the real story isn&apos;t just about technology. It&apos;s about culture. Zero Trust requires that every actor understands their role. Every action matters. Every assumption can be wrong.</p>
<p>Organizations that have adopted Zero Trust see measurable improvements. Less damage when security incidents do occur. Faster detection and response. Better compliance and audit outcomes. A reduced attack surface from the start.</p>
<p>The challenge is adoption. Legacy systems need modernization. New tools must be integrated. Training is essential. Culture must shift. It&apos;s a marathon, not a sprint.</p>
<p>The bottom line: perimeter security has failed. Zero Trust is the new standard. Organizations that adopt it early will have a competitive advantage. Organizations that wait may find themselves defending against attacks that bypass traditional controls.</p>
<p>The technology is ready. The time is now.</p>
]]></content:encoded></item><item><title><![CDATA[The Future is Agentic: Why AI Agents Will Transform 2026]]></title><description><![CDATA[The AI hype cycle of 2025 left many in the "trough of disillusionment." Generative AI tools became ubiquitous, yet few saw the transformative spark. Enter 2026, where the real revolution begins: **agentic AI**.]]></description><link>https://talkingtech.io/the-future-is-agentic-why-ai-agents-will-transform-2026/</link><guid isPermaLink="false">69d7dd4b0806f20541ab4412</guid><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Thu, 09 Apr 2026 17:13:23 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2026/04/Gemini_Generated_Image_qpms9tqpms9tqpms.png" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2026/04/Gemini_Generated_Image_qpms9tqpms9tqpms.png" alt="The Future is Agentic: Why AI Agents Will Transform 2026"><p>The AI hype cycle of 2025 left many in the &quot;trough of disillusionment.&quot; Generative AI tools became ubiquitous, yet few saw the transformative spark. Enter 2026, where the real revolution begins: <strong>agentic AI</strong>.</p>
<h2 id="beyond-chatbots">Beyond Chatbots</h2>
<p>The fundamental shift isn&apos;t about better text generation&#x2014;it&apos;s about <strong>action</strong>. While generative AI chatbots could draft emails and summarize documents, AI agents can now execute end-to-end workflows.</p>
<p>At MIT, researchers describe agentic AI as systems that &quot;perceive, plan, act, and learn.&quot; Unlike traditional chatbots that wait for explicit prompts, agents autonomously execute multi-step tasks with memory, tool access, and adaptive reasoning.</p>
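<p>A minimal sketch of that loop, with placeholder names throughout (the tools and the <code>call_model</code> stub are illustrative, not any particular vendor&apos;s API):</p>
<pre><code># Placeholder tools; a real agent would wire these to real systems.
TOOLS = {
    &quot;search&quot;: lambda q: &quot;results for &quot; + q,
    &quot;calculator&quot;: lambda expr: str(eval(expr)),   # demo only, never eval untrusted input
}

memory = []

def call_model(prompt: str) -&gt; dict:
    # Stand-in for an LLM call that returns a tool choice and its input.
    return {&quot;tool&quot;: &quot;calculator&quot;, &quot;tool_input&quot;: &quot;6 * 7&quot;, &quot;done&quot;: True}

def run_agent(goal: str, max_steps: int = 5) -&gt; str:
    observation = goal
    for _ in range(max_steps):                        # perceive, plan, act, learn
        decision = call_model(goal + &quot; | last: &quot; + observation)
        observation = TOOLS[decision[&quot;tool&quot;]](decision[&quot;tool_input&quot;])   # act
        memory.append((decision[&quot;tool&quot;], observation))                  # learn
        if decision[&quot;done&quot;]:
            break
    return observation

print(run_agent(&quot;What is 6 times 7?&quot;))   # 42</code></pre>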
<h2 id="the-2026-reality">The 2026 Reality</h2>
<p>Microsoft&apos;s latest analysis highlights seven trends defining AI&apos;s next chapter:</p>
<ol>
<li><strong>Agentic automation</strong> replacing manual workflows</li>
<li><strong>Interoperable agents</strong> breaking vendor lock-in</li>
<li><strong>Hardened governance</strong> with security-audited releases</li>
<li><strong>AI-driven cybersecurity</strong> at enterprise scale</li>
<li><strong>Autonomous robotics</strong> in manufacturing</li>
<li><strong>Massive infrastructure investments</strong> (billions for data centers)</li>
<li><strong>ROI-focused adoption</strong> demanding immediate value</li>
</ol>
<p>Amazon&apos;s millionth robot, coordinated by DeepFleet AI, exemplifies this shift&#x2014;improving warehouse efficiency by 10% through autonomous decision-making.</p>
<h2 id="why-it-matters">Why It Matters</h2>
<p>The business case is compelling:</p>
<ul>
<li><strong>Faster time-to-value</strong>: Companies demand immediate ROI, not years of R&amp;D</li>
<li><strong>Rapid prototyping</strong>: Accelerated product development cycles</li>
<li><strong>Human augmentation</strong>: Pair programming with AI agents</li>
<li><strong>End-to-end automation</strong>: From scheduling to execution</li>
</ul>
<h2 id="the-architecture-shift">The Architecture Shift</h2>
<p>According to Stack AI&apos;s 2026 guide, workflow architecture now matters more than ever. The key questions:</p>
<ul>
<li>How much autonomy should agents have?</li>
<li>What happens when things go wrong?</li>
<li>How do we ensure safety at scale?</li>
</ul>
<p>Oracle&apos;s recent AI Agent Studio updates answer these with capabilities for &quot;workflow orchestration, content intelligence, contextual memory, and ROI measurement.&quot;</p>
<h2 id="looking-ahead">Looking Ahead</h2>
<p>The next decade won&apos;t just be about AI&#x2014;it&apos;ll be about <strong>agency</strong>. Systems that think, plan, and act will reshape industries from healthcare to logistics. The organizations that thrive will be those that learn to work <em>with</em> agentic AI, not against it.</p>
<p>The question isn&apos;t whether AI will become agent-based&#x2014;it already is. The question is: are you ready?</p>
]]></content:encoded></item><item><title><![CDATA[Prompt Engineering in 2026: The Skills That Actually Matter]]></title><description><![CDATA[In 2026, the conversation has moved beyond "how do I write a better prompt?" to "how do I design systems that work with AI agents autonomously?"]]></description><link>https://talkingtech.io/prompt-engineering-in-2026-the-skills-that-actually-matter-2/</link><guid isPermaLink="false">69d55f3f0806f20541ab437f</guid><category><![CDATA[Prompt Engineering]]></category><category><![CDATA[AI coding]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Tue, 07 Apr 2026 19:58:30 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2026/04/Futuristic-tech-and-prompt-engineering.png" medium="image"/><content:encoded><![CDATA[<h2 id="the-great-pivot">The Great Pivot</h2>
<img src="https://talkingtech.io/content/images/2026/04/Futuristic-tech-and-prompt-engineering.png" alt="Prompt Engineering in 2026: The Skills That Actually Matter"><p>In 2023, &quot;prompt engineering&quot; was the buzzword of the decade. We were all obsessed with finding that perfect prompt, crafting our golden questions to coax AI into spitting out exactly what we wanted. It was impressive work, sure, but it was just the warm-up.</p>
<p>By late 2024, we started noticing something strange. The gains from hand-crafted prompts were getting harder to find. The AI was getting smarter at figuring out what we needed, even when we weren&apos;t being super specific. And then came 2025, and the landscape shifted completely.</p>
<p>Today, in 2026, the conversation has moved beyond &quot;how do I write a better prompt?&quot; to &quot;how do I design systems that work with AI agents autonomously?&quot;</p>
<h2 id="from-single-shots-to-multi-agent-workflows">From Single Shots to Multi-Agent Workflows</h2>
<p>The old days of single-shot prompting are largely a relic now. Sure, you can still use isolated prompts for quick tasks, but the real value is in <strong>orchestration</strong>.</p>
<p>The prompt engineering of 2026 isn&apos;t about finding the perfect one-shot prompt anymore. It&apos;s about:</p>
<ul>
<li>Designing clear <strong>agent protocols</strong> that specify roles, goals, and handoffs</li>
<li>Creating <strong>multi-agent workflows</strong> where specialized agents collaborate</li>
<li>Building <strong>feedback loops</strong> that allow agents to self-correct</li>
<li>Establishing <strong>context management</strong> systems that preserve information across iterations</li>
</ul>
<p>Think of it less like prompting and more like <strong>software architecture</strong>. You&apos;re not writing a request to a function; you&apos;re designing a system of functions that talk to each other.</p>
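<p>Here is a deliberately tiny sketch of what that looks like in practice. Everything in it is hypothetical: <code>call_llm</code> stands in for whatever model API you use, and the handoff is just a structured dictionary that one agent produces and the next one validates.</p>
<pre><code>def call_llm(system: str, user: str) -&gt; str:
    # Placeholder for a real model call.
    return &quot;[&quot; + system + &quot;] response to: &quot; + user

def research_agent(task: str) -&gt; dict:
    notes = call_llm(&quot;You are a researcher. Return terse bullet notes.&quot;, task)
    # The handoff is an explicit contract, not free-form text.
    return {&quot;task&quot;: task, &quot;notes&quot;: notes, &quot;status&quot;: &quot;ready_for_writing&quot;}

def writer_agent(handoff: dict) -&gt; str:
    assert handoff[&quot;status&quot;] == &quot;ready_for_writing&quot;   # protocol check
    return call_llm(&quot;You are a writer. Draft prose from the notes.&quot;, handoff[&quot;notes&quot;])

draft = writer_agent(research_agent(&quot;Summarise zero trust for executives&quot;))
print(draft)</code></pre>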
<h2 id="the-new-skill-stack">The New Skill Stack</h2>
<p>If you&apos;re building your career for 2026, here&apos;s what you should be focusing on:</p>
<h3 id="1-agent-orchestration-design">1. <strong>Agent Orchestration Design</strong></h3>
<p>This is the new frontier. You need to be able to:</p>
<ul>
<li>Define clear agent roles and responsibilities</li>
<li>Design handoff protocols between agents</li>
<li>Create feedback mechanisms for agent collaboration</li>
<li>Handle failure modes and recovery strategies</li>
</ul>
<h3 id="2-prompt-quality-control">2. <strong>Prompt Quality Control</strong></h3>
<p>The prompt of 2026 is about <strong>consistency and reliability</strong>, not just cleverness. You need to:</p>
<ul>
<li>Ensure prompts work across different agents and contexts</li>
<li>Create fallback mechanisms for when prompts fail</li>
<li>Build in human-in-the-loop validation points</li>
<li>Measure and track prompt effectiveness</li>
</ul>
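<p>Measuring prompt effectiveness can start very small. The sketch below (all names hypothetical, with a stubbed model call) scores each prompt variant against a fixed test set and reports a pass rate, which is enough to compare variants over time.</p>
<pre><code>def call_llm(prompt: str, ticket: str) -&gt; str:
    # Stub standing in for a real model call.
    return &quot;REFUND&quot; if &quot;money back&quot; in ticket.lower() else &quot;OTHER&quot;

TEST_CASES = [
    (&quot;I want my money back&quot;, &quot;REFUND&quot;),
    (&quot;Where is my parcel?&quot;, &quot;OTHER&quot;),
]

PROMPT_VARIANTS = {
    &quot;v1&quot;: &quot;Classify the ticket as REFUND or OTHER.&quot;,
    &quot;v2&quot;: &quot;You are a support triage bot. Answer with exactly REFUND or OTHER.&quot;,
}

def score(prompt: str) -&gt; float:
    hits = sum(call_llm(prompt, text) == label for text, label in TEST_CASES)
    return hits / len(TEST_CASES)

for name, prompt in PROMPT_VARIANTS.items():
    print(name, score(prompt))   # keep the variant with the best pass rate</code></pre>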
<h3 id="3-context-engineering">3. <strong>Context Engineering</strong></h3>
<p>The ability to manage and manipulate context is super important. This includes:</p>
<ul>
<li>Summarization for long contexts</li>
<li>Pruning irrelevant information</li>
<li>Creating context hierarchies</li>
<li>Managing agent memory and state</li>
</ul>
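<p>A small sketch of the pruning and summarization ideas above, using a crude word count as a stand-in for a real tokenizer: recent turns are kept verbatim, older ones are collapsed into a summary so the whole context fits a budget.</p>
<pre><code>def rough_tokens(text: str) -&gt; int:
    return len(text.split())            # crude proxy for a real tokenizer

def summarise(turns: list) -&gt; str:
    return &quot;Earlier discussion covered: &quot; + &quot;; &quot;.join(t[:40] for t in turns)

def assemble_context(history: list, budget: int = 20) -&gt; str:
    kept, used = [], 0
    for turn in reversed(history):      # newest turns are the most valuable
        cost = rough_tokens(turn)
        if used + cost &gt; budget:
            break
        kept.insert(0, turn)
        used += cost
    older = history[: len(history) - len(kept)]
    summary = summarise(older) if older else &quot;&quot;
    return &quot;\n&quot;.join(s for s in [summary] + kept if s)

history = [&quot;turn %d: some details&quot; % i for i in range(1, 12)]
print(assemble_context(history))</code></pre>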
<h2 id="the-practical-takeaway">The Practical Takeaway</h2>
<p>The &quot;best prompt&quot; in 2026 is the one that fits into a <strong>robust, multi-agent system</strong>. It&apos;s not about the magic bullet. It&apos;s about building systems that can:</p>
<ul>
<li>Handle ambiguity gracefully</li>
<li>Self-correct when things go wrong</li>
<li>Collaborate across specialized functions</li>
<li>Evolve and adapt over time</li>
</ul>
<h2 id="what-this-means-for-you">What This Means for You</h2>
<p>If you&apos;re an individual contributor: Focus on learning how to integrate AI agents into your existing workflows. The skill that separates good AI users from great ones is the ability to design <strong>robust, multi-stage processes</strong> that use multiple AI capabilities.</p>
<p>If you&apos;re building a career in AI: Stop thinking about &quot;prompt engineering&quot; as a standalone skill. Start thinking about <strong>agent systems design</strong> &#x2014; the ability to architect workflows that leverage multiple AI agents working together.</p>
<p>The field of &quot;prompt engineering&quot; hasn&apos;t disappeared; it&apos;s evolved into something much more complex and powerful. And if you want to be at the cutting edge in 2026 and beyond, you need to be ready to think about systems, not just prompts.</p>
]]></content:encoded></item><item><title><![CDATA[Setting Up OpenClaw Locally with Ollama (and What I Learned Along the Way)]]></title><description><![CDATA[The idea was simple: run a capable, private AI assistant with GPU acceleration and a clean web interface.
In reality, it turned into a deep dive into agent systems, model limitations, and performance tuning.]]></description><link>https://talkingtech.io/setting-up-openclaw-locally-with-ollama-and-what-i-learned-along-the-way/</link><guid isPermaLink="false">69cf6fbf0806f20541ab422b</guid><category><![CDATA[AI]]></category><category><![CDATA[OpenClaw]]></category><category><![CDATA[Ollama]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Fri, 03 Apr 2026 07:52:47 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2026/04/AI-interface-and-robotic-connection.png" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2026/04/AI-interface-and-robotic-connection.png" alt="Setting Up OpenClaw Locally with Ollama (and What I Learned Along the Way)"><p>I recently set out to build a fully local AI agent using OpenClaw and Ollama on my Proxmox server. The idea was simple: run a capable, private AI assistant with GPU acceleration and a clean web interface.</p><p>In reality, it turned into a deep dive into agent systems, model limitations, and performance tuning.</p><p>Here&#x2019;s a detailed breakdown of my setup, the challenges I faced, and what actually made things work.</p><hr><h2 id="%E2%9A%99%EF%B8%8F-my-setup">&#x2699;&#xFE0F; My Setup</h2><ul><li><strong>Proxmox host</strong></li><li><strong>Ubuntu VM</strong> &#x2192; running Ollama with GPU passthrough (RTX 2080)</li><li><strong>LXC container</strong> &#x2192; running OpenClaw</li><li><strong>Cloudflare Tunnel</strong> &#x2192; exposing the UI externally</li></ul><p>This separation allowed me to isolate workloads:</p><ul><li>GPU-heavy inference in the VM</li><li>lightweight orchestration in the container</li></ul><hr><h2 id="%F0%9F%9A%A7-challenges-i-faced">&#x1F6A7; Challenges I Faced</h2><h3 id="1-openclaw-service-failing-to-start">1. OpenClaw Service Failing to Start</h3><p>The systemd service kept crashing with errors related to the working directory:</p><p>Changing to the requested working directory failed: No such file or directory</p><p><strong>Fix:</strong></p><ul><li>Corrected the <code>WorkingDirectory</code> path in the service file</li><li>Ensured proper permissions for the OpenClaw user</li></ul><hr><h3 id="2-telegram-bot-not-responding">2. Telegram Bot Not Responding</h3><p>Even after setting up the bot via BotFather, OpenClaw showed:</p><p>access not configured</p><p><strong>Fix:</strong></p><ul><li>Properly paired the device via OpenClaw</li><li>Approved the chat after initiating a message</li><li>Verified bot token and chat ID mapping</li></ul><hr><h3 id="3-large-model-memory-limitations">3. Large Model Memory Limitations</h3><p>I initially tried running large models like:</p><p>qwen2.5-coder:32b</p><p>But it required ~50GB RAM, which was not feasible locally.</p><p><strong>Lesson:</strong></p><blockquote>Stick to smaller models unless you have high-end hardware.</blockquote><hr><h3 id="4-requests-timing-out-in-openclaw">4. Requests Timing Out in OpenClaw</h3><p>This was the biggest issue.</p><ul><li>Direct <code>curl</code> calls to Ollama were fast</li><li>But OpenClaw requests kept timing out</li></ul><p>Logs revealed:</p><ul><li>repeated tool call failures</li><li>retries inside the agent loop</li><li>eventual timeouts</li></ul><hr><h3 id="5-gpu-confusion">5. 
GPU Confusion</h3><p>At one point, I thought the GPU wasn&#x2019;t working because responses were slow.</p><p>After checking <code>nvidia-smi</code>:</p><ul><li>VRAM usage was high &#x2705;</li><li>GPU utilization was near 0% &#x274C;</li></ul><p>This led to an important realization:</p><blockquote>The bottleneck wasn&#x2019;t inference &#x2014; it was the agent orchestration.</blockquote><p>OpenClaw was spending most of its time preparing and retrying requests rather than actually generating tokens.</p><hr><h3 id="6-model-compatibility-with-tools">6. Model Compatibility with Tools</h3><p>Not all models behaved the same in an agent setup.</p><p>Here&#x2019;s what I observed:</p><ul><li><strong>DeepSeek models</strong> &#x2192; no tool support</li><li><strong>Phi3</strong> &#x2192; very fast, but unreliable tool handling</li><li><strong>Mistral</strong> &#x2192; supports tools, but noticeably slower</li><li><strong>LLaMA 3.x</strong> &#x2192; mixed performance</li></ul><p><strong>Key takeaway:</strong></p><blockquote>Model choice matters more than raw size or speed.</blockquote><hr><h3 id="7-cloudflare-tunnel-latency">7. Cloudflare Tunnel Latency</h3><p>Using a Cloudflare tunnel added extra latency and sometimes affected WebSocket behavior.</p><p>Accessing OpenClaw locally was consistently faster.</p><hr><h2 id="%F0%9F%92%A1-what-actually-fixed-it">&#x1F4A1; What Actually Fixed It</h2><p>After a lot of trial and error, these changes made the biggest difference:</p><hr><h3 id="1-choosing-the-right-model">1. Choosing the Right Model</h3><p>Instead of chasing the biggest or fastest model, I focused on balance:</p><ul><li>tool compatibility</li><li>response consistency</li><li>acceptable speed</li></ul><hr><h3 id="2-reducing-maxtokens">2. Reducing <code>maxTokens</code></h3><p>Limiting output length had an immediate impact:</p><p>&quot;maxTokens&quot;: 60-100</p><p>This reduced generation time and improved responsiveness significantly.</p><hr><h3 id="3-adjusting-context-window-size">3. Adjusting Context Window Size</h3><p>Another major improvement came from tuning the context window.</p><p>Large context windows:</p><ul><li>increase memory usage</li><li>slow down token processing</li><li>add unnecessary overhead</li></ul><p>By keeping the context window smaller and more focused, I was able to:</p><ul><li>reduce latency</li><li>improve overall throughput</li><li>make responses more consistent</li></ul><hr><h3 id="4-disabling-tools-game-changer-for-speed">4. Disabling Tools (Game Changer for Speed)</h3><p>When I disabled tools:</p><ul><li>no more retries</li><li>no agent loops</li><li>instant responses</li></ul><p><strong>Tradeoff:</strong></p><ul><li>lost memory and automation features</li></ul><p>But for general chat and quick responses, this made the system feel dramatically faster.</p><hr><h3 id="5-understanding-the-real-bottleneck">5. 
Understanding the Real Bottleneck</h3><p>The biggest realization from this setup was:</p><blockquote>It wasn&#x2019;t GPU, network, or even the model &#x2014; it was the agent loop.</blockquote><p>OpenClaw introduces:</p><ul><li>structured reasoning</li><li>tool execution cycles</li><li>validation and retries</li></ul><p>All of which add latency, even on powerful hardware.</p><hr><h2 id="%F0%9F%9A%80-final-setup-what-i-recommend">&#x1F680; Final Setup (What I Recommend)</h2><p>After all the experimentation, here&#x2019;s the setup that worked best for me:</p><h3 id="fast-mode-daily-use">Fast Mode (Daily Use)</h3><ul><li>lightweight model</li><li>tools disabled</li><li>low <code>maxTokens</code></li><li>optimized context window</li></ul><p><strong>Result:</strong></p><ul><li>fast, responsive experience</li><li>ideal for chat and coding</li></ul><hr><h3 id="agent-mode-when-needed">Agent Mode (When Needed)</h3><ul><li>tool-capable model</li><li>tools enabled</li><li>controlled token limits</li><li>slightly higher latency</li></ul><p><strong>Result:</strong></p><ul><li>more powerful workflows</li><li>automation and memory support</li></ul><hr><h2 id="%F0%9F%A7%A0-key-takeaways">&#x1F9E0; Key Takeaways</h2><ul><li>Local AI setups involve real tradeoffs between speed and capability</li><li>GPU acceleration helps, but orchestration matters more</li><li>Agent frameworks introduce significant overhead</li><li>Model compatibility with tools is critical</li><li>Tuning parameters like <code>maxTokens</code> and context window can drastically improve performance</li></ul><hr><h2 id="final-thoughts">Final Thoughts</h2><p>This project gave me a much deeper understanding of how modern AI systems actually work under the hood.</p><p>It&#x2019;s not just about running a model &#x2014; it&#x2019;s about how everything around it is orchestrated.</p><p>If you&#x2019;re building a similar setup, my advice would be:</p><blockquote>Start simple, measure everything, and optimize step by step.</blockquote>]]></content:encoded></item><item><title><![CDATA[Building Modern Web Applications: Lessons Learned from Implementing Module Federation]]></title><description><![CDATA[Lessons learned from implementing Module Federation in a modern web app — from tackling integration hurdles to improving team workflows. Practical tips to help you avoid pitfalls and make the most of a modular architecture.]]></description><link>https://talkingtech.io/building-a-modern-web-application-lessons-learned-from-implementing-module-federation/</link><guid isPermaLink="false">6895da229928ca05253d7ae1</guid><category><![CDATA[frontend]]></category><category><![CDATA[microfrontend]]></category><category><![CDATA[module federation]]></category><category><![CDATA[nextjs]]></category><category><![CDATA[web application]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Fri, 08 Aug 2025 11:55:50 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2025/08/Microfrontends.png" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2025/08/Microfrontends.png" alt="Building Modern Web Applications: Lessons Learned from Implementing Module Federation"><p>I was recently playing around to modernize web applications by implementing a micro-frontend (MFE) architecture using Module Federation. This project involved creating a distributed system where multiple independent applications work together to create a seamless user experience. 
Here are the key lessons learned along the way.</p><h2 id="%F0%9F%8F%97%EF%B8%8F-architecture-overview">&#x1F3D7;&#xFE0F; Architecture Overview</h2><p>The application is built&#xA0;around a&#xA0;Shell + MFE architecture&#xA0;with four main components:</p><ul><li>&#x1F3E0; Shell Application&#xA0;- The orchestrator that manages the overall flow</li><li>&#x1F510; Authentication MFE&#xA0;- Handles user authentication and verification</li><li>&#x1F4B3; Feature MFE&#xA0;- Manages core functionality and processing</li><li>&#x1F6E1;&#xFE0F; Verification MFE&#xA0;- Handles identity verification and compliance</li></ul><h2 id="%F0%9F%8E%AF-key-lessons-learned">&#x1F3AF; Key Lessons Learned</h2><p></p><h3 id="module-federation-requires-careful-react-management">Module Federation Requires Careful React Management</h3><p>Challenge: Ensuring React is properly available across all MFEs was one of our&#xA0;biggest&#xA0;hurdles. </p><p>Solution: Implementing a robust React availability system:</p><pre><code>const ensureReactAvailable = () =&gt; {
  if (typeof window !== &quot;undefined&quot;) {
    if (!(window as any).React) {
      (window as any).React = React;
    }
    if (!(window as any).ReactDOM) {
      import(&apos;react-dom&apos;).then((ReactDOM) =&gt; {
        (window as any).ReactDOM = ReactDOM;
      });
    }
  }
};</code></pre><p>Lesson: Module Federation requires explicit React sharing, and you&#xA0;need to handle both client and server-side scenarios carefully.</p><h3 id="environment-isolation-is-critical">Environment Isolation is Critical</h3><p>Challenge: Different teams working on different MFEs needed isolated development environments. </p><p>Solution: Implementing a sophisticated cookie isolation. </p><pre><code># Development: Port-specific cookies
app-session-token_port_3001

# Staging: Subdomain-specific cookies  
app-session-token_staging

# Production: Standard cookies
app-session-token</code></pre><p>Lesson: Environment isolation prevents conflicts and allows teams to work independently without affecting each other.</p><h3 id="error-handling-must-be-comprehensive">Error Handling Must Be Comprehensive</h3><p>Challenge: Module Federation loading failures can break the entire application.</p><p>Solution: Implementing multiple layers of error handling </p><pre><code>const loadAuthMFE = async () =&gt; {
  try {
    // Primary: Module Federation
    if ((window as any).__FEDERATION__) {
      return await (window as any).__FEDERATION__.instance.loadRemote(&quot;auth/AuthMFE&quot;);
    }
    
    // Fallback: Direct import
    return await import(&quot;auth/AuthMFE&quot;);
  } catch (error) {
    // Graceful degradation
    return { 
      default: () =&gt; &lt;ErrorComponent message={error.message} /&gt; 
    };
  }
};
</code></pre><p>Lesson: Always have fallback mechanisms and graceful degradation strategies.</p><h3 id="state-management-across-mfes-is-complex">State Management Across&#xA0;MFEs is&#xA0;Complex</h3><p>Challenge: Sharing&#xA0;state across&#xA0;multiple independent applications. </p><p>Solution: Centralized state management in the shell with careful prop passing:</p><pre><code>const mfeProps = {
  someState: props.initialData?.someState,
  globalConfig: {
    someFunc: globalConfig.someFunc,
  },
};</code></pre><p>Lesson: Design your&#xA0;data flow carefully&#xA0;and ensure all&#xA0;MFEs have access to the&#xA0;data they&#xA0;need.</p><h3 id="build-performance-requires-optimization">Build Performance Requires Optimization</h3><p>Challenge: Building multiple applications with&#xA0;shared dependencies.</p><p>Solution: Using Turbo for <code>monorepo</code> builds and&#xA0;implemented careful dependency&#xA0;management:</p><pre><code>{
  &quot;build&quot;: {
    &quot;dependsOn&quot;: [&quot;^build&quot;, &quot;check-types&quot;],
    &quot;inputs&quot;: [&quot;src/**&quot;, &quot;components/**&quot;, &quot;*.config.js&quot;],
    &quot;outputs&quot;: [&quot;.next/**&quot;]
  }
}</code></pre><p>Lesson: <code>Monorepo</code> tooling like Turbo is essential for maintaining&#xA0;fast&#xA0;build times.</p><h3 id="css-architecture-needs-centralization">CSS Architecture&#xA0;Needs Centralization</h3><p>Challenge: Styling consistency across&#xA0;multiple independent applications.</p><p>Solution: Centralized&#xA0;styling in the&#xA0;shell with minimal&#xA0;MFE-specific&#xA0;CSS:</p><pre><code>//&#xA0;Shell&#xA0;manages&#xA0;all&#xA0;global&#xA0;styles

import &quot;@acme/ui-core/styles.css&quot;;
import &quot;@acme/design-system&quot;;

//&#xA0;MFEs&#xA0;have&#xA0;minimal&#xA0;globals.css </code></pre><p>Lesson: Centralize as much styling as&#xA0;possible to&#xA0;maintain&#xA0;consistency.</p><h3 id="typescript-configuration-is-critical">TypeScript Configuration is Critical</h3><p>Challenge: Type safety across multiple applications with shared types.</p><p>Solution: Shared TypeScript configurations and careful type definitions:</p><pre><code>{
  &quot;extends&quot;: &quot;@acme/typescript-config/nextjs.json&quot;,
  &quot;compilerOptions&quot;: {
    &quot;baseUrl&quot;: &quot;.&quot;,
    &quot;paths&quot;: {
      &quot;@/*&quot;: [&quot;./*&quot;],
      &quot;@acme/component-library&quot;: [&quot;../../packages/component-library/src&quot;]
    }
  }
}</code></pre><p>Lesson: Invest&#xA0;in proper TypeScript setup early -&#xA0;it pays dividends in maintainability.</p><h3 id="development-workflow-requires-coordination">Development Workflow Requires Coordination</h3><p>Challenge: Multiple teams&#xA0;working on different&#xA0;MFEs&#xA0;simultaneously.</p><p>Solution: Clear development workflow with team-specific environments: </p><pre><code>#&#xA0;Team&#xA0;A&#xA0;uses&#xA0;devA
NODE_ENV=devA pnpm dev
# Team B uses devB
NODE_ENV=devB pnpm dev</code></pre><p>Lesson: Establish clear development workflows and environment&#xA0;isolation early.</p><h2 id="%F0%9F%9B%A0%EF%B8%8F-technical-implementation-insights">&#x1F6E0;&#xFE0F; Technical Implementation&#xA0;Insights</h2><h3 id="module-federation-configuration">Module Federation&#xA0;Configuration</h3><p>Each&#xA0;MFE exposes its&#xA0;main component: </p><pre><code>// apps/auth/next.config.js
new NextFederationPlugin({
  name: &quot;auth&quot;,
  filename: &quot;static/chunks/remoteEntry.js&quot;,
  exposes: {
    &quot;./AuthMFE&quot;: &quot;./components/AuthMFE.tsx&quot;,
  },
  shared: {
    react: { singleton: true, eager: true },
    &quot;react-dom&quot;: { singleton: true, eager: true },
  },
})</code></pre><h3 id="error-boundary-implementation">Error Boundary Implementation</h3><p>Implementing comprehensive&#xA0;error&#xA0;boundaries for each&#xA0;MFE:</p><pre><code>&lt;ErrorBoundary
  onError={(error, errorInfo) =&gt; {
    console.error(&quot;&#x1F4A5; Auth MFE Render Error:&quot;, error, errorInfo);
  }}
  fallback={
    &lt;ErrorScreen
      title=&quot;Authentication Component Error&quot;
      message=&quot;The authentication component encountered an error.&quot;
      showIcon={false}
    /&gt;
  }
&gt;
  &lt;AuthComponent {...mfeProps} /&gt;
&lt;/ErrorBoundary&gt;</code></pre><h2 id="%F0%9F%93%88-performance-optimizations">&#x1F4C8; Performance Optimizations</h2><h3 id="bundle-splitting">Bundle Splitting</h3><p>Implementing careful bundle splitting to optimize loading: </p><pre><code>splitChunks: {
  chunks: &apos;all&apos;,
  cacheGroups: {
    vendor: {
      test: /[\\/]node_modules[\\/]/,
      name: &apos;vendors&apos;,
      chunks: &apos;all&apos;,
      priority: 10,
    },
    componentLibrary: {
      test: /[\\/]node_modules[\\/]@acme[\\/]component-library[\\/]/,
      name: &apos;component-library&apos;,
      chunks: &apos;all&apos;,
      priority: 20,
    },
  },
}</code></pre><h3 id="lazy-loading">Lazy Loading</h3><p>MFEs are loaded on-demand to&#xA0;improve initial page load: </p><pre><code>const AuthMFE = lazy(() =&gt; import(&quot;auth/AuthMFE&quot;));
const FeatureMFE = lazy(() =&gt; import(&quot;feature/FeatureMFE&quot;));</code></pre><h2 id="%F0%9F%9A%A8-common-pitfalls-and-solutions">&#x1F6A8; Common Pitfalls&#xA0;and Solutions</h2><h3 id="1-react-singleton-issues">1.  React&#xA0;Singleton Issues</h3><p>Problem: Multiple React instances causing&#xA0;hydration&#xA0;errors.</p><p>Solution: Ensure&#xA0;React is shared as a&#xA0;singleton across all MFEs.</p><h3 id="2-css-conflicts">2.  CSS&#xA0;Conflicts</h3><p>Problem: Styling&#xA0;conflicts between MFEs.</p><p>Solution: Centralize global styles and&#xA0;use CSS modules for component-specific styles.</p><h3 id="3-build-performance">3. Build Performance</h3><p>Problem: Slow&#xA0;builds with multiple applications.</p><p>Solution: Use Turbo for parallel builds and implement proper caching.</p><h3 id="4-development-complexity">4. Development Complexity</h3><p>Problem: Complex local development setup.</p><p>Solution: Create comprehensive scripts and documentation for team onboarding.</p><h2 id="%F0%9F%8E%AF-best-practices-established">&#x1F3AF; Best Practices Established</h2><ol><li>Always have fallback mechanisms&#xA0;for Module Federation loading</li><li>Implement comprehensive error boundaries&#xA0;for each MFE</li><li>Use TypeScript strictly&#xA0;across&#xA0;all applications</li><li>Centralize&#xA0;shared dependencies&#xA0;in packages</li><li>Implement proper environment isolation&#xA0;for team development</li><li>Document everything&#xA0;- especially the&#xA0;integration points</li><li>Test cross-MFE functionality&#xA0;thoroughly</li><li>Monitor performance&#xA0;and bundle&#xA0;sizes regularly</li></ol><h2 id="%F0%9F%94%AE-future-considerations">&#x1F52E; Future Considerations</h2><p>As we continue to evolve this architecture, we&apos;re considering:</p><ul><li>Runtime Module Federation&#xA0;for even more flexibility</li><li>Advanced caching strategies&#xA0;for better performance</li><li>Automated testing&#xA0;for cross-MFE integration</li><li>Performance monitoring&#xA0;and alerting</li><li>Advanced error tracking&#xA0;and recovery mechanisms</li></ul><h2 id="conclusion">Conclusion</h2><p>Implementing Module Federation for a Nextjs application was a challenging but rewarding journey. The key to success was:</p><ul><li>Careful planning&#xA0;of the&#xA0;architecture</li><li>Comprehensive error&#xA0;handling&#xA0;at every level</li><li>Proper tooling&#xA0;for development and build&#xA0;processes</li><li>Clear communication&#xA0;and documentation</li><li>Iterative&#xA0;improvement&#xA0;based on real-world usage</li></ul><p>The result is a modern,&#xA0;scalable application architecture&#xA0;that allows teams to work&#xA0;independently while maintaining a cohesive user experience. The lessons learned here can be applied to any micro-frontend architecture implementation.</p>]]></content:encoded></item><item><title><![CDATA[AI in Your Terminal: A Deep Dive into Claude Code and Gemini CLI]]></title><description><![CDATA[Compare Claude Code and Gemini CLI in this in-depth 2025 review. 
Discover which AI-powered CLI tool is better for coding, automation, and productivity.]]></description><link>https://talkingtech.io/ai-in-your-terminal-a-deep-dive-into-claude-code-and-gemini-cli/</link><guid isPermaLink="false">686902fe33fa1c0535fa6254</guid><category><![CDATA[Context Coding]]></category><category><![CDATA[AI coding]]></category><category><![CDATA[Gemini CLI]]></category><category><![CDATA[Claude Code]]></category><category><![CDATA[Vibe Coding]]></category><category><![CDATA[Agentic coding]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Sat, 05 Jul 2025 11:20:37 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2025/07/geminivsclaudecode.png" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2025/07/geminivsclaudecode.png" alt="AI in Your Terminal: A Deep Dive into Claude Code and Gemini CLI"><p>Two of the most innovative AI tools for developers in 2025 are <strong>Claude Code</strong> by Anthropic and <strong>Gemini CLI</strong> by Google. Both deliver AI-powered coding and terminal interaction&#x2014;but cater to slightly different developer needs. Let&#x2019;s break down the latest features, strengths, weaknesses, and which is best for what.</p><h2 id="pros-cons">Pros &amp; Cons:</h2><blockquote class="kg-blockquote-alt"><strong>Claude Code (Anthropic)</strong></blockquote><h4 id="pros">Pros:</h4><ul><li>Deep codebase awareness: Excellent at understanding large and complex projects.</li><li>IDE + Terminal integration: Works with VS Code, JetBrains, and terminal natively.</li><li>Smart Git workflows: Can generate commits, refactor across files, and understand diffs.</li><li>High-quality completions: Powered by Claude Opus 4, with long-context reasoning and code integrity.</li><li>Memory, undo, and logging built-in.</li><li>Robust SDKs for TypeScript and Python.</li></ul><h4 id="cons">Cons:</h4><ul><li>Closed source: Not open for contributions or customization.</li><li>Limited to Claude&#x2019;s models: No third-party plugin ecosystem (yet).</li><li>CLI-only is more code-centric: Less general-purpose compared to Gemini.</li></ul><blockquote class="kg-blockquote-alt"><strong>Gemini CLI (Google)</strong></blockquote><h4 id="pros-1">Pros:</h4><ul><li>Open-source &amp; extensible: Under Apache 2.0 license, community-driven.</li><li>Reason-and-act agent loop: Ideal for chaining tasks and intelligent automation.</li><li>Massive context window (1M tokens): Great for long files, conversations, or context-rich tasks.</li><li>Multimodal support: Ties into Veo, Imagen, and Google Search.</li><li>Built-in terminal UX: Friendly for both developers and technical creators.</li><li>Free-tier (preview): Generous rate limits: 60 RPM, 1,000 req/day.</li></ul><h4 id="cons-1">Cons:</h4><ul><li>Preview-stage: May be less stable or polished than Claude Code.</li><li>No deep codebase integration (yet): Best for task-oriented or isolated coding help.</li><li>Limited IDE integration (for now): Most interactions are CLI-only.</li></ul><h2 id="comparison-table">Comparison Table:</h2>
<!--kg-card-begin: html-->
<table data-start="2189" data-end="3607" class="w-fit min-w-(--thread-content-width)"><tr data-start="2189" data-end="2317"><td data-start="2189" data-end="2219" data-col-size="sm">Feature</td><td data-start="2219" data-end="2266" data-col-size="sm">Claude Code</td><td data-start="2266" data-end="2317" data-col-size="md">Gemini CLI</td></tr><tr data-start="2447" data-end="2575"><td data-start="2447" data-end="2477" data-col-size="sm"><strong data-start="2449" data-end="2460">License</strong></td><td data-col-size="sm" data-start="2477" data-end="2524">Closed source</td><td data-col-size="md" data-start="2524" data-end="2575">Open-source (Apache 2.0)</td></tr><tr data-start="2576" data-end="2704"><td data-start="2576" data-end="2606" data-col-size="sm"><strong data-start="2578" data-end="2590">Best for</strong></td><td data-col-size="sm" data-start="2606" data-end="2653">Full-project coding, Git workflows</td><td data-col-size="md" data-start="2653" data-end="2704">Versatile terminal automation &amp; code snippets</td></tr><tr data-start="2705" data-end="2833"><td data-start="2705" data-end="2735" data-col-size="sm"><strong data-start="2707" data-end="2716">Model</strong></td><td data-col-size="sm" data-start="2735" data-end="2782">Claude Opus 4 / Sonnet 4</td><td data-col-size="md" data-start="2782" data-end="2833">Gemini 2.5 Pro</td></tr><tr data-start="2834" data-end="2962"><td data-start="2834" data-end="2864" data-col-size="sm"><strong data-start="2836" data-end="2858">Codebase Awareness</strong></td><td data-col-size="sm" data-start="2864" data-end="2911">Deep (multi-file refactoring, memory)</td><td data-col-size="md" data-start="2911" data-end="2962">Light (single task-focused)</td></tr><tr data-start="2963" data-end="3091"><td data-start="2963" data-end="2993" data-col-size="sm"><strong data-start="2965" data-end="2981">Context Size</strong></td><td data-col-size="sm" data-start="2993" data-end="3040">Large (unspecified, very capable)</td><td data-col-size="md" data-start="3040" data-end="3091">1 million tokens</td></tr><tr data-start="3092" data-end="3220"><td data-start="3092" data-end="3122" data-col-size="sm"><strong data-start="3094" data-end="3108">Multimodal</strong></td><td data-col-size="sm" data-start="3122" data-end="3169">No</td><td data-col-size="md" data-start="3169" data-end="3220">Yes (text + image/video/gen via plugins)</td></tr><tr data-start="3221" data-end="3349"><td data-start="3221" data-end="3251" data-col-size="sm"><strong data-start="3223" data-end="3238">SDK Support</strong></td><td data-col-size="sm" data-start="3251" data-end="3298">TypeScript, Python</td><td data-col-size="md" data-start="3298" data-end="3349">Plugin-based agent model</td></tr><tr data-start="3350" data-end="3478"><td data-start="3350" data-end="3380" data-col-size="sm"><strong data-start="3352" data-end="3367">Integration</strong></td><td data-col-size="sm" data-start="3380" data-end="3427">IDEs + Terminal</td><td data-col-size="md" data-start="3427" data-end="3478">CLI-first (early IDE integrations in progress)</td></tr><tr data-start="3479" data-end="3607"><td data-start="3479" data-end="3509" data-col-size="sm"><strong data-start="3481" data-end="3494">Stability</strong></td><td data-col-size="sm" data-start="3509" data-end="3556">GA (1.0+)</td><td data-col-size="md" data-start="3556" data-end="3607">Preview / Experimental</td></tr></table>
<!--kg-card-end: html-->
<p></p><h2 id="verdict-which-one-is-best-for-what">Verdict: Which One is Best for What?</h2>
<!--kg-card-begin: html-->
<table data-start="3658" data-end="4509" class="w-fit min-w-(--thread-content-width)"><thead data-start="3658" data-end="3771"><tr data-start="3658" data-end="3771"><th data-start="3658" data-end="3683" data-col-size="sm">Category</th><th data-start="3683" data-end="3699" data-col-size="sm">Winner</th><th data-start="3699" data-end="3771" data-col-size="md">Why?</th></tr></thead><tbody data-start="3886" data-end="4509"><tr data-start="3886" data-end="4023"><td data-start="3886" data-end="3914" data-col-size="sm"><strong data-start="3888" data-end="3912">Code Quality &amp; Depth</strong></td><td data-col-size="sm" data-start="3914" data-end="3933"><strong data-start="3916" data-end="3931">Claude Code</strong></td><td data-col-size="md" data-start="3933" data-end="4023">Excels at deep edits, large refactors, commit messages, understanding large codebases.</td></tr><tr data-start="4024" data-end="4143"><td data-start="4024" data-end="4052" data-col-size="sm"><strong data-start="4026" data-end="4046">Open Development</strong></td><td data-col-size="sm" data-start="4052" data-end="4072"><strong data-start="4054" data-end="4068">Gemini CLI</strong></td><td data-col-size="md" data-start="4072" data-end="4143">Fully open-source, extensible, and community-driven.</td></tr><tr data-start="4144" data-end="4265"><td data-start="4144" data-end="4172" data-col-size="sm"><strong data-start="4146" data-end="4172">Versatility (non-code)</strong></td><td data-col-size="sm" data-start="4172" data-end="4192"><strong data-start="4174" data-end="4188">Gemini CLI</strong></td><td data-col-size="md" data-start="4192" data-end="4265">Can write, summarize, search, scaffold projects, even generate media.</td></tr><tr data-start="4266" data-end="4387"><td data-start="4266" data-end="4296" data-col-size="sm"><strong data-start="4268" data-end="4295">Integration &amp; Stability</strong></td><td data-col-size="sm" data-start="4296" data-end="4315"><strong data-start="4298" data-end="4313">Claude Code</strong></td><td data-col-size="md" data-start="4315" data-end="4387">Stable GA release with editor + terminal support.</td></tr><tr data-start="4388" data-end="4509"><td data-start="4388" data-end="4417" data-col-size="sm"><strong data-start="4390" data-end="4415">Experimental Features</strong></td><td data-col-size="sm" data-start="4417" data-end="4437"><strong data-start="4419" data-end="4433">Gemini CLI</strong></td><td data-col-size="md" data-start="4437" data-end="4509">Supports multimodal, web search, and more via MCP.</td></tr></tbody></table>
<!--kg-card-end: html-->
<h2 id="final-thoughts">Final Thoughts:</h2><ul><li>Pick Claude Code if you want a serious AI coding companion that integrates deeply with your codebase, understands your Git workflow, and supports structured development inside IDEs.</li><li>Pick Gemini CLI if you want an AI-powered shell companion for general tasks, fast scaffolding, open extensibility, and multimedia integration &#x2014; and you&apos;re okay being part of its growing preview stage.</li></ul><p>Both tools push the boundaries of what an AI in your terminal can do &#x2014; but they serve different developer mindsets. Choose based on whether you need precision and polish (Claude) or freedom and flexibility (Gemini).</p>]]></content:encoded></item><item><title><![CDATA[Why Model Context Matters: Understanding the Rise of MCP in AI]]></title><description><![CDATA[How structured context is transforming AI from stateless tools to intelligent, memory-driven systems]]></description><link>https://talkingtech.io/why-model-context-matters-understanding-the-rise-of-mcp-in-ai/</link><guid isPermaLink="false">684330dff7c0d005141c5b6c</guid><category><![CDATA[AI]]></category><category><![CDATA[Generative AI]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Fri, 06 Jun 2025 18:45:12 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2025/06/rise_of_mcp-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2025/06/rise_of_mcp-1.png" alt="Why Model Context Matters: Understanding the Rise of MCP in AI"><p><em>How structured context is transforming AI from stateless tools to intelligent, memory-driven systems</em></p><h2 id="introduction">Introduction</h2><p>Language models have become shockingly capable &#x2014; they can summarize books, write code, and carry on conversations that <em>feel</em> intelligent. But under the hood, they&#x2019;re still just... guessing the next token.</p><p>What gives these models <em>real usefulness</em> isn&apos;t just their raw capability &#x2014; it&apos;s <strong>context</strong>.</p><p>And that&#x2019;s where <strong>Model Context Protocol (MCP)</strong> comes in.</p><p>MCP is emerging as a fundamental layer in AI architecture. It&apos;s how developers give language models <strong>memory</strong>, <strong>identity</strong>, <strong>tools</strong>, and <strong>goal-awareness</strong>. It turns dumb-but-powerful token predictors into <strong>stateful, smart assistants</strong>.</p><p>In this post, we&#x2019;ll break down why <strong>Model Context matters</strong>, what MCP is, and how it&#x2019;s powering the next generation of AI applications.</p><h2 id="the-problem-with-stateless-ai">The Problem with Stateless AI</h2><p>Large language models (LLMs) like GPT-4 are <strong>stateless by design</strong>. Each prompt is treated in isolation. 
The model has no idea who you are, what you want, or what happened five minutes ago &#x2014; unless you tell it again.</p><p>This creates obvious limitations:</p><ul><li>Repetition: You have to reintroduce yourself and your goals.</li><li>Short-term memory: Models can &#x201C;forget&#x201D; earlier parts of a conversation.</li><li>No personalization: Every session starts from zero.</li></ul><p>This is like using a web app that doesn&#x2019;t remember your login, settings, or history &#x2014; frustrating and inefficient.</p><hr><h2 id="what-is-model-context-protocol-mcp">What is Model Context Protocol (MCP)?</h2><p><strong>Model Context Protocol (MCP)</strong> is a structured way of passing additional context into language models. It defines what the model <em>should know</em> at runtime &#x2014; beyond just the user&#x2019;s prompt.</p><p>It&#x2019;s not a formal standard (yet), but many advanced AI systems &#x2014; including OpenAI&#x2019;s GPTs, LangChain agents, and enterprise AI stacks &#x2014; are already implementing MCP-like architectures.</p><h3 id="mcp-typically-includes">MCP typically includes:</h3>
<!--kg-card-begin: html-->
<div class="gist"><table><thead><tr><th>Component</th><th>Purpose</th></tr></thead><tbody><tr><td><strong>System Instructions</strong></td><td>Role definition and behavioral tuning (e.g., &#x201C;You are a helpful tax advisor.&#x201D;)</td></tr><tr><td><strong>Memory</strong></td><td>Persistent knowledge about the user, goals, or history</td></tr><tr><td><strong>Session Context</strong></td><td>Recent conversation turns, temporary instructions</td></tr><tr><td><strong>Tool Access</strong></td><td>Metadata about callable functions (e.g., SQL query tools, browsers, interpreters)</td></tr><tr><td><strong>Identity / Role</strong></td><td>User identity, role, or access level info</td></tr></tbody></table></div>
<!--kg-card-end: html-->
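<p>To make the table above concrete, here is a minimal Python sketch of how these components might be carried as one structured object and flattened into a model call. The class and field names are illustrative only, not part of any formal specification.</p><pre><code>from dataclasses import dataclass, field

@dataclass
class ModelContext:
    system_instruction: str                       # role definition and behavioural tuning
    memory: dict = field(default_factory=dict)    # persistent facts about the user or project
    session: list = field(default_factory=list)   # recent conversation turns
    tools: list = field(default_factory=list)     # metadata about callable functions
    identity: dict = field(default_factory=dict)  # user identity, role, access level

def build_messages(ctx: ModelContext, user_prompt: str) -> list:
    &quot;&quot;&quot;Flatten the structured context into a chat-style message list.&quot;&quot;&quot;
    system_text = ctx.system_instruction
    if ctx.memory:
        system_text += &quot;\nKnown facts: &quot; + str(ctx.memory)
    return (
        [{&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: system_text}]
        + ctx.session                              # earlier {&quot;role&quot;: ..., &quot;content&quot;: ...} turns
        + [{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: user_prompt}]
    )
</code></pre><p>The <code>tools</code> and <code>identity</code> fields are deliberately left out of the message text here; in practice they are usually passed to the model API separately (for example as function or tool definitions) rather than inlined into the prompt.</p>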
<hr><h2 id="from-prompts-to-protocols-why-the-shift">From Prompts to Protocols: Why the Shift?</h2><p>Before MCP, developers tried to cram everything into a giant prompt. That had downsides:</p><ul><li>Token bloat (you&#x2019;d hit context limits fast)</li><li>Hard to debug or update</li><li>No separation of concerns</li></ul><p>MCP emerged as a clean separation between <strong>context</strong> and <strong>conversation</strong>. It lets you <strong>modularize state</strong>:</p><pre><code>{
  &quot;mcp&quot;: {
    &quot;system_instruction&quot;: &quot;You are an AI Linux assistant.&quot;,
    &quot;memory&quot;: {
      &quot;user_os&quot;: &quot;Ubuntu 22.04&quot;,
      &quot;prefers_logs&quot;: &quot;journalctl&quot;
    },
    &quot;tools&quot;: [
      { &quot;name&quot;: &quot;get_logs&quot;, &quot;description&quot;: &quot;Fetch logs from a server...&quot; }
    ],
    &quot;session_id&quot;: &quot;abc-123&quot;
  },
  &quot;prompt&quot;: &quot;Why is Gunicorn failing to write a PID file?&quot;
}
</code></pre><p>Now your AI system becomes:</p><ul><li>Easier to reason about</li><li>Easier to scale and debug</li><li>Capable of <em>evolving</em> with the user</li></ul><h2 id="what-mcp-unlocks">What MCP Unlocks</h2><p>Here&#x2019;s what becomes possible when you adopt MCP in your AI system:</p><h3 id="personalized-interactions">Personalized Interactions</h3><p>Store memory like:</p><pre><code>{ &quot;user_name&quot;: &quot;Alex&quot;, &quot;favorite_framework&quot;: &quot;FastAPI&quot; }
</code></pre><p>Now the model can tailor recommendations without re-asking every time.</p><h3 id="long-term-memory">Long-Term Memory</h3><p>Remember past sessions, project status, or decisions made by the user.</p><h3 id="tool-oriented-reasoning">Tool-Oriented Reasoning</h3><p>Expose model-callable functions like:</p><ul><li><code>run_sql(query)</code></li><li><code>fetch_logs(service, range)</code></li><li><code>deploy_service(env)</code></li></ul><p>Let the model <em>plan</em> and then <em>act</em> via tools.</p><h3 id="multi-agent-collaboration">Multi-Agent Collaboration</h3><p>Pass structured context between agents (planner &#x2192; executor, QA bot &#x2192; code generator) using MCP as a shared state.</p><hr><h2 id="mcp-in-the-real-world">MCP in the Real World</h2><h3 id="openai-gpts-custom-gpts">OpenAI GPTs (Custom GPTs)</h3><p>Every custom GPT uses an internal MCP layer:</p><ul><li>Instructions = system role</li><li>Memory = persistent per-user facts</li><li>Tools = enabled functions</li><li>Files &amp; APIs = augment context and capabilities</li></ul><h3 id="langchain-semantic-kernel">LangChain &amp; Semantic Kernel</h3><p>These frameworks implement &#x201C;chains,&#x201D; &#x201C;agents,&#x201D; and &#x201C;memories&#x201D; &#x2014; all forms of MCP abstraction:</p><ul><li><code>ConversationBufferMemory</code>, <code>VectorStoreRetrieverMemory</code>, etc.</li><li>Agent inputs = system + tools + intermediate steps</li></ul><h3 id="autogen-crewai">AutoGen / CrewAI</h3><p>Multi-agent orchestration relies on MCP-style handoffs &#x2014; letting one agent know what another just did.</p><hr><h2 id="how-to-design-your-own-mcp-layer">How to Design Your Own MCP Layer</h2><p>You don&#x2019;t need a whole framework to use MCP. Here&#x2019;s how to start:</p><ol><li><strong>Define your system role</strong> (<code>system_instruction</code>)</li><li><strong>Store persistent user memory</strong> in a DB or Redis</li><li><strong>Build a context assembler</strong> that combines:<ul><li>memory</li><li>system prompt</li><li>recent history</li><li>available tools</li></ul></li><li><strong>Inject that structure into your model call</strong></li></ol><p>Simple example (Python pseudo-code):</p><pre><code>context = assemble_mcp(user_id, session_id)
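# assemble_mcp is your own helper (step 3 above): it merges memory, the system
# prompt, recent history and available tools for this user and session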
response = client.chat.completions.create(   # client = OpenAI(), from the official openai SDK
  model=&quot;gpt-4&quot;,
  messages=build_prompt(context, user_input)  # merge the assembled context with the new user input
)
</code></pre><hr><h2 id="best-practices">Best Practices</h2><ul><li><strong>Separate long-term vs session memory</strong></li><li><strong>Use schemas or type systems</strong> for tools/functions</li><li><strong>Respect token budgets</strong> (truncate old turns or summarize)</li><li><strong>Secure tool execution</strong> &#x2014; don&#x2019;t let the model call anything directly without validation</li><li><strong>Audit context</strong> &#x2014; log what memory and tools were passed per session</li></ul><h2 id="the-future-of-mcp">The Future of MCP</h2><p>MCP isn&#x2019;t just a temporary workaround &#x2014; it&#x2019;s a <strong>new programming model</strong> for intelligent systems.</p><p>As models gain long-term memory, better planning, and richer tool ecosystems, MCP will evolve into:</p><ul><li><strong>Standardized schemas</strong> (e.g., OpenMCP)</li><li><strong>Shared memory across apps/agents</strong></li><li><strong>Context negotiation</strong> between services</li><li><strong>User-controlled memory UIs</strong> (&quot;What do you know about me?&quot;)</li></ul><hr><h2 id="tldr">TL;DR</h2><ul><li>AI isn&#x2019;t smart without <strong>context</strong></li><li><strong>MCP</strong> provides a structured way to inject memory, role, tools, and session data</li><li>It&#x2019;s already powering advanced agents, GPTs, and orchestration frameworks</li><li>Start thinking not just about <strong>prompts</strong>, but about <strong>protocols</strong></li></ul><h2 id="next-up">Next up?</h2><p>Let me know if you&apos;d like:</p><ul><li>A follow-up on <strong>&quot;How to Build Your Own MCP Layer&quot;</strong> with real code</li><li>An explainer on <strong>tool integration and security</strong></li><li>A diagram-based summary of how MCP flows into model APIs</li></ul>]]></content:encoded></item><item><title><![CDATA[Project Corsa:  Microsoft's New Native TypeScript Compiler]]></title><description><![CDATA[Microsoft's Project Corsa is a new native TypeScript compiler built for speed and efficiency. It dramatically reduces compilation time while staying fully compatible with TypeScript, offering faster builds and improved developer productivity.]]></description><link>https://talkingtech.io/new-native-compiler-for-typescript/</link><guid isPermaLink="false">67d5b2f8923fb905225aaa85</guid><category><![CDATA[TypeScript]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[Corsa]]></category><category><![CDATA[Microsoft]]></category><category><![CDATA[Golang]]></category><category><![CDATA[Compiler]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Sat, 15 Mar 2025 17:23:55 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2025/03/DALL-E-2025-03-15-21.20.37---A-futuristic-illustration-of-a-high-performance-compiler-represented-as-a-glowing--ultra-fast-digital-engine.-The-scene-has-a-cyberpunk-aesthetic--wit.webp" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2025/03/DALL-E-2025-03-15-21.20.37---A-futuristic-illustration-of-a-high-performance-compiler-represented-as-a-glowing--ultra-fast-digital-engine.-The-scene-has-a-cyberpunk-aesthetic--wit.webp" alt="Project Corsa:  Microsoft&apos;s New Native TypeScript Compiler"><p>Microsoft has embarked on a significant transformation of its TypeScript programming language by developing a native compiler and toolset, aiming to enhance performance and developer productivity. 
This initiative involves porting the existing TypeScript compiler from JavaScript/TypeScript to Go, promising substantial improvements in compilation speed and resource efficiency.</p><h3 id="the-need-for-a-native-compiler">The Need for a Native Compiler:</h3><p>As TypeScript has grown in popularity, developers working on large-scale projects have encountered performance bottlenecks, including slow build times and high memory consumption. These challenges can hinder development workflows and affect productivity. Recognizing these issues, Microsoft has undertaken the task of creating a native implementation of the TypeScript compiler to address these performance concerns.</p><h3 id="impressive-performance-gains">Impressive Performance Gains:</h3><p>The native compiler has demonstrated remarkable performance enhancements across various codebases:</p>
<!--kg-card-begin: html-->
<table class="gist" style="background-color: black; color: white;"><thead style="background-color: black; color: white"><tr style="background-color: black; color: white"><th style="background-color: black; color: white">Codebase</th><th style="background-color: black; color: white">Size (LOC)</th><th style="background-color: black; color: white">JavaScript Compiler Time</th><th style="background-color: black; color: white">Native Compiler Time</th><th style="background-color: black; color: white">Speedup</th></tr></thead><tbody style="background: black !important; color: white"><tr><td style="background-image: none;">VS Code</td><td>1,505,000</td><td>77.8 seconds</td><td>7.5 seconds</td><td style="background-image: none;">10.4x</td></tr><tr><td style="background-image: none;">Playwright</td><td>356,000</td><td>11.1 seconds</td><td>1.1 seconds</td><td style="background-image: none;">10.1x</td></tr><tr><td style="background-image: none;">TypeORM</td><td>270,000</td><td>17.5 seconds</td><td>1.3 seconds</td><td style="background-image: none;">13.5x</td></tr><tr><td style="background-image: none;">date-fns</td><td>104,000</td><td>6.5 seconds</td><td>0.7 seconds</td><td style="background-image: none;">9.5x</td></tr><tr><td style="background-image: none;">tRPC</td><td>18,000</td><td>5.5 seconds</td><td>0.6 seconds</td><td style="background-image: none;">9.1x</td></tr><tr><td style="background-image: none;">rxjs</td><td>2,100</td><td>1.1 seconds</td><td>0.1 seconds</td><td style="background-image: none;">11.0x</td></tr></tbody></table>
<!--kg-card-end: html-->
<p>These results indicate that the native compiler can reduce build times by approximately 10 times, significantly enhancing the efficiency of development processes.</p><h3 id="impact-on-developer-experience">Impact on Developer Experience:</h3><p>Beyond faster build times, the native compiler offers additional benefits:</p><ul><li><strong>Improved Editor Performance</strong>: Developers can expect quicker editor startup times and more responsive code navigation features, such as renaming variables and finding references. For instance, loading the entire VS Code project in an editor has been reduced from 9.6 seconds to 1.2 seconds using the native language service.</li><li><strong>Reduced Memory Usage</strong>: Preliminary observations suggest that the native implementation consumes roughly half the memory compared to the current JavaScript-based compiler, contributing to a smoother development experience.</li></ul><h3 id="timeline-and-future-developments">Timeline and Future Developments:</h3><p>Microsoft plans to release a preview of the native compiler capable of command-line type-checking by mid-2025, with a feature-complete solution for project builds and language services expected by the end of the year. This native implementation is slated to be part of TypeScript 7.0 upon reaching parity with the existing compiler. In the interim, Microsoft will continue to maintain the JavaScript-based compiler through the 6.x releases to ensure stability and support for ongoing projects.</p><h3 id="community-engagement-and-feedback">Community Engagement and Feedback:</h3><p>Microsoft is actively seeking feedback from the developer community to refine the native compiler. Developers are encouraged to build and run the Go code from the new repository, which is available under the same license as the existing TypeScript codebase. Regular updates will be provided as new functionality becomes available for testing.</p><p>The development of a native TypeScript compiler represents a significant advancement in addressing performance challenges associated with large-scale TypeScript projects. By leveraging a native implementation, Microsoft aims to provide developers with faster build times, improved editor responsiveness, and reduced memory usage, thereby enhancing the overall development experience. 
This initiative underscores Microsoft&apos;s commitment to evolving TypeScript in alignment with the growing demands of modern software development.</p><p>For a more in-depth discussion on this topic, you might find the following resources insightful:</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://devblogs.microsoft.com/typescript/typescript-native-port/?ref=talkingtech.io"><div class="kg-bookmark-content"><div class="kg-bookmark-title">A 10x Faster TypeScript - TypeScript</div><div class="kg-bookmark-description">Embarking on a native port of the existing TypeScript compiler and toolset to achieve a 10x performance speed-up.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://devblogs.microsoft.com/typescript/wp-content/uploads/sites/11/2018/10/Microsoft-Favicon.png" alt="Project Corsa:  Microsoft&apos;s New Native TypeScript Compiler"><span class="kg-bookmark-author">TypeScript</span><span class="kg-bookmark-publisher">Anders Hejlsberg</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://devblogs.microsoft.com/typescript/wp-content/uploads/sites/11/2018/08/typescriptfeature.png" alt="Project Corsa:  Microsoft&apos;s New Native TypeScript Compiler"></div></a></figure><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/pNlq-EVld70?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen title="A 10x faster TypeScript"></iframe></figure><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[How to Use Ollama to Host Your Own DeepSeek LLM Locally]]></title><description><![CDATA[With the increasing demand for privacy, security, and control over AI models, hosting your own large language models (LLMs) like DeepSeek locally has become a viable option. ]]></description><link>https://talkingtech.io/how-to-use-ollama-to-host-your-own-deepseek-llm-locally/</link><guid isPermaLink="false">6799d7601a49fb0463d4e07c</guid><category><![CDATA[AI]]></category><category><![CDATA[Generative AI]]></category><category><![CDATA[DeepSeek]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Wed, 29 Jan 2025 07:30:08 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2025/01/DALL-E-2025-01-29-11.29.46---A-futuristic-home-server-setup-showcasing-a-local-AI-model-being-hosted-using-Ollama.-The-scene-features-a-sleek-workstation-with-multiple-monitors-di.webp" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2025/01/DALL-E-2025-01-29-11.29.46---A-futuristic-home-server-setup-showcasing-a-local-AI-model-being-hosted-using-Ollama.-The-scene-features-a-sleek-workstation-with-multiple-monitors-di.webp" alt="How to Use Ollama to Host Your Own DeepSeek LLM Locally"><p>With the increasing demand for privacy, security, and control over AI models, hosting your own large language models (LLMs) locally has become a viable option. <strong>Ollama</strong> is a powerful tool that simplifies this process, allowing users to run and interact with open-source LLMs on their local machines efficiently. 
This guide will walk you through the installation, configuration, and usage of Ollama to host your own DeepSeek LLM.</p><h2 id="what-is-ollama">What is Ollama?</h2><p>Ollama is a lightweight framework that enables users to run large language models locally with minimal setup. It supports various open-source models such as Llama, Mistral, and more. Ollama is designed to be user-friendly, making it an excellent choice for developers, researchers, and AI enthusiasts who want to leverage LLMs without relying on cloud-based services.</p><h2 id="prerequisites">Prerequisites</h2><p>Before installing Ollama, ensure that your system meets the following requirements:</p><ul><li><strong>Operating System:</strong> macOS or Linux (Windows support via WSL2)</li><li><strong>Hardware:</strong> A modern CPU with AVX2 support (GPU acceleration recommended but not mandatory)</li><li><strong>Memory:</strong> At least 16GB RAM for optimal performance</li></ul><h3 id="installation">Installation:</h3><h3 id="macos">macOS</h3><ol><li>Open a terminal window.</li></ol><p>Run the following command to install Ollama:</p><pre><code>brew install ollama</code></pre><p>Verify the installation:</p><pre><code>ollama --version</code></pre><h3 id="linux">Linux</h3><p>Download and install Ollama using the install script:</p><pre><code>curl -fsSL https://ollama.ai/install.sh | sh</code></pre><p>Verify the installation:</p><pre><code>ollama --version</code></pre><h3 id="windows-via-wsl2">Windows (via WSL2)</h3><ol><li>Install <strong>Windows Subsystem for Linux 2 (WSL2)</strong>.</li><li>Follow the Linux installation steps inside WSL2.</li><li>Use <code>ollama</code> commands within your WSL2 terminal.</li></ol><h2 id="downloading-and-running-models">Downloading and Running Models</h2><p>Once installed, you can start using Ollama to download and run models.</p><h3 id="listing-available-models">Listing Available Models</h3><p>To see the models you have downloaded locally, run:</p><pre><code>ollama list</code></pre><h3 id="downloading-a-model">Downloading a Model</h3><p>To download a model, use:</p><pre><code>ollama pull deepseek-r1:1.5b</code></pre><p>You can replace <code>deepseek-r1:1.5b</code> with any supported model, such as <code>llama2</code>.</p><h3 id="running-a-model-interactively">Running a Model Interactively</h3><p>To start an interactive chat session with the model:</p><pre><code>ollama run deepseek-r1:1.5b</code></pre><p>This will allow you to enter prompts and receive responses from the model in real-time.</p><h2 id="hosting-the-model-as-an-api">Hosting the Model as an API</h2><p>Ollama provides a simple way to expose the model as an API for integration into applications.</p><h3 id="starting-the-api-server">Starting the API Server</h3><p>Run the following command to start a local API server:</p><pre><code>ollama serve</code></pre><p>This will expose an endpoint (typically on port 11434) that applications can use to send requests to the model.</p><h3 id="making-api-requests">Making API Requests</h3><p>You can interact with the API using <code>curl</code> or any HTTP client:</p><pre><code>curl -X POST http://localhost:11434/api/generate -d &apos;{&quot;model&quot;: &quot;deepseek-r1:1.5b&quot;, &quot;prompt&quot;: &quot;Hello, world!&quot;}&apos;</code></pre><p>The response will contain the model&apos;s generated output.</p><h2 id="customizing-models">Customizing Models</h2><p>You can also customize a model&apos;s behaviour by creating a custom <code>Modelfile</code>:</p><pre><code>FROM deepseek-r1:1.5b
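# temperature controls randomness: lower values give more focused, repeatable answers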
PARAMETER temperature 0.7</code></pre><p>Save this as <code>Modelfile</code> and build it using:</p><pre><code>ollama create my-custom-model -f Modelfile</code></pre><p>Now, you can run your customized model with:</p><pre><code>ollama run my-custom-model</code></pre><h2 id="conclusion">Conclusion</h2><p>Ollama makes it easy to host and run LLMs locally, providing privacy, control, and reduced latency compared to cloud-based solutions. Whether you&apos;re a developer building AI-powered applications or a researcher exploring LLM capabilities, Ollama is a powerful tool that streamlines the process.</p><p>By following this guide, you can quickly set up and deploy your own DeepSeek LLMs, ensuring you have full control over your AI experience.</p><p>Want to see an example? <a href="https://gpt.talkingtech.io/?ref=talkingtech.io" rel="noreferrer">&quot;Some-GPT&quot;</a> on TalkingTech.io is now self-hosted and using the DeepSeek-R1:7b model. </p>]]></content:encoded></item><item><title><![CDATA[Open AI GPT-4o mini - A new cost effective model]]></title><description><![CDATA[OpenAI has unveiled a game-changer: GPT-4o mini! This powerful yet budget-friendly AI model is poised to revolutionize the world of AI applications by making it much more accessible.]]></description><link>https://talkingtech.io/open-ai-gpt-4o-mini-is-announced/</link><guid isPermaLink="false">669a128b8eed040413fe9f98</guid><category><![CDATA[AI]]></category><category><![CDATA[Open AI]]></category><category><![CDATA[GPT-4]]></category><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Fri, 19 Jul 2024 07:30:38 GMT</pubDate><media:content url="https://images.unsplash.com/photo-1666597107756-ef489e9f1f09?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wxMTc3M3wwfDF8c2VhcmNofDE3fHxBSXxlbnwwfHx8fDE3MjEzNzQwMzF8MA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=2000" medium="image"/><content:encoded><![CDATA[<img src="https://images.unsplash.com/photo-1666597107756-ef489e9f1f09?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wxMTc3M3wwfDF8c2VhcmNofDE3fHxBSXxlbnwwfHx8fDE3MjEzNzQwMzF8MA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=2000" alt="Open AI GPT-4o mini - A new cost effective model"><p>OpenAI has unveiled a game-changer: GPT-4o mini! This powerful yet budget-friendly AI model is poised to revolutionize the world of AI applications by making it much more accessible. Not only is GPT-4o mini remarkably affordable at 15 cents per million input tokens and 60 cents per million output tokens, but it also outperforms previous models in key areas like chat conversations. Benchmarked at an impressive 82% on the MMLU test, GPT-4o mini surpasses even GPT-4 in chat preferences. 
This marks a significant leap forward in affordability and functionality, making GPT-4o mini over 60% cheaper than GPT-3.5 Turbo.</p><p>Here are some highlights from the announcement.</p><ul><li><strong>Intelligence:</strong>&#xA0;GPT-4o mini outperforms GPT-3.5 Turbo in textual intelligence (scoring 82% on MMLU compared to 69.8%) and multimodal reasoning.</li><li><strong>Price:</strong>&#xA0;GPT-4o mini is more than 60% cheaper than GPT-3.5 Turbo, priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens (a million tokens is roughly the equivalent of 2,500 pages in a standard book).</li><li><strong>Modalities:</strong>&#xA0;GPT-4o mini currently supports text and vision capabilities, with support for audio and video inputs and outputs planned for the future.</li><li><strong>Languages:</strong>&#xA0;GPT-4o mini has improved multilingual understanding over GPT-3.5 Turbo across a wide range of non-English languages.</li></ul><p>Because GPT-4o mini is both affordable and speedy, it&apos;s perfect for several situations:</p><ul><li>Handling large amounts of data: Need to analyze a whole codebase or a long chat history? No problem!</li><li>Working on a tight budget: Tasks like summarizing lengthy documents become much more cost-effective.</li><li>Delivering quick responses: Think real-time customer service chatbots that need to answer fast.</li></ul><p>Just like its bigger brother, GPT-4o, the mini version has a knowledge cutoff of October 2023 and a context window of 128,000 tokens. Plus, it can generate responses of up to 16,000 output tokens at a time. And to make it even better, OpenAI is rolling out fine-tuning for GPT-4o mini in the coming days!</p><p>Want to learn more about GPT-4o mini? Visit the official announcement <a href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/?ref=talkingtech.io" rel="noreferrer">here</a></p>]]></content:encoded></item><item><title><![CDATA[Open AI 4o (omni) - All you need to know]]></title><description><![CDATA[Open AI recently announced their new large language model gpt-4o(omni)]]></description><link>https://talkingtech.io/open-ai-4o-whats-new/</link><guid isPermaLink="false">664598ad6a39b504071996e8</guid><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Thu, 16 May 2024 05:35:11 GMT</pubDate><media:content url="https://talkingtech.io/content/images/2024/05/GPT-4o-Video_Card.png.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://talkingtech.io/content/images/2024/05/GPT-4o-Video_Card.png.jpg" alt="Open AI 4o (omni) - All you need to know"><p>OpenAI recently announced their new large language model <code>gpt-4o(omni)</code>.</p><h3 id="capabilities"><strong>Capabilities:</strong> </h3><p>GPT-4o is a major upgrade over previous models, bringing GPT-4 level intelligence in a faster package. It&apos;s also what they&apos;re calling a &quot;multimodal&quot; AI, meaning it can understand and respond to text, images, and even video. For example, you could show it a picture of a dish from a foreign menu and it could translate the text, tell you about the food, and even suggest similar dishes.</p><h3 id="accessibility"><strong>Accessibility:</strong> </h3><p>OpenAI is rolling out GPT-4o access in phases. It&apos;s already available to some paid users of their ChatGPT service, with free users getting a limited version. They plan on making it more widely available soon.</p><h3 id="safety"><strong>Safety:</strong> </h3><p>Safety is a big focus for OpenAI with GPT-4o. 
They&apos;ve built safety features into the model itself and have additional safeguards in place to prevent misuse.</p><p>You can view demo videos and learn more about GPT-4o here:</p><p><a href="https://openai.com/index/hello-gpt-4o/?ref=talkingtech.io">https://openai.com/index/hello-gpt-4o/</a></p>]]></content:encoded></item><item><title><![CDATA[Deploying Flask (Python) Apps to Ubuntu with Gunicorn and Nginx]]></title><description><![CDATA[This article describes all the steps necessary to deploy a Flask (Python framework for creating web applications) application on Ubuntu with Gunicorn and Nginx.]]></description><link>https://talkingtech.io/deploying-flask-apps-to-ubuntu-gunicorn-nginx/</link><guid isPermaLink="false">6607299a6a39b504071994cf</guid><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Thu, 11 Apr 2024 12:18:00 GMT</pubDate><media:content url="https://images.unsplash.com/photo-1601897690942-bcacbad33e55?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wxMTc3M3wwfDF8c2VhcmNofDI4fHxzaGlwcGluZ3xlbnwwfHx8fDE3MTI4Mzc1MTJ8MA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=2000" medium="image"/><content:encoded><![CDATA[<img src="https://images.unsplash.com/photo-1601897690942-bcacbad33e55?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wxMTc3M3wwfDF8c2VhcmNofDI4fHxzaGlwcGluZ3xlbnwwfHx8fDE3MTI4Mzc1MTJ8MA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=2000" alt="Deploying Flask (Python) Apps to Ubuntu with Gunicorn and Nginx"><p>I was recently playing around with creating web apps using Python, where I came across Flask, a micro framework for Python for building web applications. Flask is renowned for its straightforward design, which gives developers the freedom to select the elements they desire and customize their apps to meet their needs.</p><p>While developing web apps using Flask is pretty much straightforward, being new to Python and Flask, I struggled a little when trying to deploy the apps for production on Ubuntu. </p><p>Though there are many articles already written and available on the internet to help with the deployment process, none of them provide concrete information describing each necessary step. </p><p>In this article I will try to describe all the steps necessary to deploy a Flask application on Ubuntu with Gunicorn and Nginx.</p><p>This article assumes that you have already created your web application using Flask locally and have the following things set up before you start the deployment process.</p><ol><li>A server with Ubuntu 22 or higher installed and a non-root user with sudo privileges.</li><li>You have Nginx installed on the server. </li><li>You have a domain name configured to point to your server.</li><li>You have familiarity with the WSGI specification, which the Gunicorn server will use to communicate with your Flask application.</li></ol><p>Once you have everything ready, you can start the deployment process.</p><h3 id="installing-python-and-other-necessary-components">Installing Python and other necessary components</h3><p>The first step is to install Python and the other required libraries and components from the Ubuntu repositories on your Ubuntu server.</p><p>We will need the Python package manager <code>pip</code> to manage Python packages and libraries. You also need to install the Python development files necessary to build some of the Gunicorn components.</p><p>Run the following command on the terminal to update the local package index and install the packages.</p><pre><code>sudo apt update
sudo apt install python3-pip python3-dev build-essential libssl-dev libffi-dev python3-setuptools</code></pre><h3 id="setting-up-the-virtual-environment">Setting up the Virtual Environment </h3><p>In this step you&#x2019;ll set up a virtual environment in order to isolate the Flask application from the other Python files on your system.</p><p>Run the following command in the terminal to install the Python virtual environment package.</p><pre><code>sudo apt install python3-venv</code></pre><p>Once installed, move to your application directory and set up the environment.</p><pre><code>cd YOUR_PROJECT
python3 -m venv YOUR_PROJECT_VENV</code></pre><p>Here YOUR_PROJECT is the directory name for your Flask application (assuming you have already created a Flask web application) and YOUR_PROJECT_VENV is the name given to the virtual environment for your application. You can replace these with your choice of names.</p><p>Now that the virtual environment is created, you need to activate it. Run the following command in the terminal to do so.</p><pre><code>source YOUR_PROJECT_VENV/bin/activate</code></pre><p>Once you do this, your command prompt will show that you are in the virtual environment by displaying <code>(YOUR_PROJECT_VENV)</code> in the terminal. </p><h3 id="installing-flask-and-gunicorn">Installing Flask and Gunicorn</h3><p>Now that we have set up the virtual environment, we need to install the framework and the server to run our Flask application. </p><p>Run the following command in your virtual environment to install Flask and Gunicorn.</p><pre><code>(YOUR_PROJECT_VENV) $ pip install gunicorn flask</code></pre><p>Once we have installed the required libraries, let&apos;s run our application on <code>localhost</code> to make sure everything is working as expected. </p><p>But before doing that we need to open port <code>5000</code> (the default port for Flask applications) on the firewall, so our Flask application can be served from that port. Run the following command to do so.</p><pre><code>sudo ufw allow 5000</code></pre><p>OK, we are ready to serve our app locally. Let&apos;s do it.</p><pre><code>(YOUR_PROJECT_VENV) $ python YOUR_PROJECT_MAIN_FILE.py</code></pre><p>If successful, the terminal will show that your application is running on port 5000:</p><pre><code>Running on http://0.0.0.0:5000/</code></pre><p>You can navigate to this address (replace the IP address with your server&apos;s IP address) in your browser and see your application running.</p><p>OK, first milestone achieved: we are able to run the Flask application locally on the Ubuntu server. But this is not the end, as we still need to set up the production environment for our app. Let&apos;s dive into it in the next step.</p><h3 id="setup-wsgi">Setup WSGI</h3><p>You might be wondering what WSGI is and why we need it to run a Flask web application. WSGI is the Web Server Gateway Interface, a specification that describes how a web server communicates with web applications. Why do we need it? Because a traditional web server (Nginx in this case) does not understand or have any way to run Python applications directly. I am not going into details about WSGI; you can learn more about it by searching on Google &#x1F604; Let&apos;s just set up WSGI.</p><p>Let&apos;s create an entry point for our application, which will tell the Gunicorn server where to start and how to interact with our application.</p><pre><code>nano ~/YOUR_PROJECT/wsgi.py</code></pre><p>The above command will create a file named <code>wsgi.py</code> at the root of our application and open it for editing in the nano editor. Copy and paste the following into the editor and adjust it to your setup. </p><pre><code>from YOUR_PROJECT import app
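# &quot;app&quot; is the Flask instance created in your project package (e.g. in YOUR_PROJECT/__init__.py)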

if __name__ == &quot;__main__&quot;:
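    # only used when running this file directly for local testing; Gunicorn imports &quot;app&quot; itself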
    app.run()</code></pre><p>Save and close the editor once you are done.</p><h3 id="configuring-gunicorn">Configuring Gunicorn</h3><p>Once we have set up the WSGI entry point for our application, it&apos;s time to set up the Gunicorn web server. </p><p>Gunicorn stands for &apos;Green Unicorn&apos; and it&apos;s a Python WSGI HTTP Server for UNIX. </p><p>I think this definition of Gunicorn is enough for the scope of this article. You can learn more about it if you want <a href="https://gunicorn.org/?ref=talkingtech.io" rel="noreferrer">here</a>. </p><p>Before we move on we need to check that Gunicorn can serve the application correctly.</p><p>We can do this by simply passing it the name of our entry point. This is constructed as the name of the module (minus the&#xA0;<code>.py</code>&#xA0;extension), plus the name of the callable within the application. In our case, this is&#xA0;<code>wsgi:app</code>. Also specify the interface and port to bind to so that the application will be started on a publicly available interface. Run the following command to test it out.</p><pre><code>(YOUR_PROJECT_VENV) $ cd ~/YOUR_PROJECT
(YOUR_PROJECT_VENV) $ gunicorn --bind 0.0.0.0:5000 wsgi:app</code></pre><p>If you don&apos;t see any error on the console and see something like the following in your terminal, that means Gunicorn is able to serve your application &#x1F604;</p><pre><code>Listening at: http://0.0.0.0:5000 </code></pre><p>You can check that the application is running by opening the address in your browser (replacing the IP address with your server&apos;s IP address).</p><p>OK, now we are in good shape as we know that our application is ready to be served to the public. We don&apos;t need the virtual environment anymore, so you can deactivate it by running the following command.</p><pre><code>(YOUR_PROJECT_VENV) $ deactivate</code></pre><p>From here on we will not be using the virtual environment through the terminal; instead we will use the system&#x2019;s commands. We will need to do a little setup for that.</p><p>We need to create a systemd service unit file. Creating a systemd unit file will allow Ubuntu&#x2019;s init system to automatically start Gunicorn and serve the Flask application whenever the server boots. </p><p>In case you don&apos;t know much about systemd: it is the system and service manager used by modern Linux distributions. Want to learn more about systemd? <a href="https://systemd.io/?ref=talkingtech.io" rel="noreferrer">Here</a> is a comprehensive resource for that.</p><p>Let&apos;s create a systemd unit file with the <code>.service</code> extension in the <code>/etc/systemd/system</code> directory. </p><pre><code>sudo nano /etc/systemd/system/YOUR_PROJECT.service</code></pre><p>The above command will create the service file and open it in the nano editor, where we have to define the service.</p><p>Let&apos;s start with the <code>Unit</code> section. </p><pre><code>[Unit]
Description=Gunicorn instance serving YOUR_PROJECT
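# ordering: start this unit after the network has come up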
After=network.target</code></pre><p>In the <code>Service</code> section we define the following things:</p><ol><li>The <code>User</code> under which the process will run. This can be a regular user account since it owns all of the relevant files.</li><li>The <code>Group</code> ownership, set to the&#xA0;<code>www-data</code>&#xA0;group so that Nginx can communicate easily with the Gunicorn processes.</li><li>The <code>WorkingDirectory</code>, which is the root path of our web application.</li><li>The <code>Path</code> environment variable, pointing to our app&apos;s virtual environment. </li><li>And the last thing is the command to start the service; <code>ExecStart</code> does this.</li></ol><p>Overall, the <code>Service</code> section of the file should look something like this.</p><pre><code>[Service]
User=YOUR_USER
Group=www-data
WorkingDirectory=/PATH_TO_YOUR_APP_DIRECTORY
Environment=&quot;PATH=/PATH_TO_YOUR_PROJECT_VENV/bin&quot;
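# use the virtual environment&apos;s gunicorn binary so the service sees the project&apos;s packages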
ExecStart=/PATH_TO_YOUR_PROJECT_VENV/bin/gunicorn --workers 3 --preload --bind unix:/var/run/YOUR_APPLICATION.sock -m 007 wsgi:app</code></pre><p>Remember to replace the values with your own.</p><p>Here is a breakdown of the start command:</p><ol><li>Start 3 worker processes (this can be adjusted to your project&apos;s needs by changing the <code>--workers</code> parameter).</li><li>Load the application code before the worker processes are forked. This saves some RAM and speeds up server boot times; it is what the <code>--preload</code> parameter does.</li><li>Create and bind to a Unix socket file,&#xA0;<code>YOUR_APPLICATION.sock</code>, within the /var/run directory. (This is important: I initially tried to put the socket file in my application directory and the system was not able to access it. You need to keep it in the /var/run directory.)</li><li>And lastly we are specifying the WSGI entry point <code>wsgi:app</code>.</li></ol><p>The last section to add to the service file is the <code>Install</code> section.</p><pre><code>[Install]
WantedBy=multi-user.target</code></pre><p>This defines how and when the service unit will be started at boot. We want this service to start when the regular multi-user system is up and running.</p><p>That&apos;s mostly it for the service unit file. Let&apos;s save the file and close the editor.</p><p>At this point we can start the Gunicorn service we created and enable it so that it starts at boot. Run the following commands to do so.</p><pre><code>sudo systemctl start YOUR_PROJECT
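sudo systemctl status YOUR_PROJECT   # optional check: should report the service as active (running)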
sudo systemctl enable YOUR_PROJECT</code></pre><p>If you don&apos;t see any errors on the terminal, it means the service has started successfully. The <code>status</code> command will also list the Gunicorn worker processes that were started.</p><p>Phew! We are finally done setting up the systemd service to start serving our app using Gunicorn at server startup. </p><p>Now we will move on to the last section of this article, which is setting up the Nginx server to point to our application running on our server.</p><h3 id="setup-nginx">Setup Nginx</h3><p>We need to configure the Nginx server to pass all requests to our application, and this will be done using the socket file. </p><p>Let&apos;s create the Nginx configuration file. Run the following command.</p><pre><code>sudo nano /etc/nginx/sites-available/YOUR_PROJECT
    listen 80;
    server_name YOUR_DOMAIN_NAME;

    location / {
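        # hand every request off to the Gunicorn socket defined in the systemd unit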
        include proxy_params;
        proxy_pass http://unix:/var/run/YOUR_APPLICATION.sock;
    }
}</code></pre><p>Let&apos;s save and close. Our application is now available for Nginx to serve on port 80. But one last thing remains, and that is to enable the application in Nginx. This is easy. You can just run the following command to create a symlink for the configuration file in the <code>sites-enabled</code> directory, from which Nginx loads the enabled sites. </p><pre><code>sudo ln -s /etc/nginx/sites-available/YOUR_PROJECT /etc/nginx/sites-enabled</code></pre><p>At this point you can test that your Nginx configuration has no syntax issues by running the following command.</p><pre><code>sudo nginx -t</code></pre><p>If no error is reported, it means we are good to have Nginx start serving our application. Let&apos;s do so by restarting Nginx so it takes into account our modified configuration. Fingers crossed &#x1F91E;</p><pre><code>sudo systemctl restart nginx</code></pre><p>Hurray! We have concluded our deployment of the Flask application on the Ubuntu server. One last thing to do is to close port&#xA0;<code>5000</code> on the firewall and allow full access to the Nginx server.</p><pre><code>sudo ufw delete allow 5000
sudo ufw allow &apos;Nginx Full&apos;</code></pre><p>You should be able to see your application being served from your domain at this point. All the best.</p>]]></content:encoded></item><item><title><![CDATA[Creating an AI Chatbot using OpenAI Assistants API and Flowise]]></title><description><![CDATA[AI chatbots are becoming very popular these days. If you are running a business and have an online presence like a website or a mobile app, you can create an AI chatbot to engage the visitors.]]></description><link>https://talkingtech.io/creating-an-ai-chatbot-using-openai/</link><guid isPermaLink="false">66047740fb7a0e0440919734</guid><dc:creator><![CDATA[Majid Hussain]]></dc:creator><pubDate>Wed, 27 Mar 2024 22:04:34 GMT</pubDate><media:content url="https://images.unsplash.com/photo-1485827404703-89b55fcc595e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wxMTc3M3wwfDF8c2VhcmNofDF8fGJvdHxlbnwwfHx8fDE3MTE1NzcxMDJ8MA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=2000" medium="image"/><content:encoded><![CDATA[<img src="https://images.unsplash.com/photo-1485827404703-89b55fcc595e?crop=entropy&amp;cs=tinysrgb&amp;fit=max&amp;fm=jpg&amp;ixid=M3wxMTc3M3wwfDF8c2VhcmNofDF8fGJvdHxlbnwwfHx8fDE3MTE1NzcxMDJ8MA&amp;ixlib=rb-4.0.3&amp;q=80&amp;w=2000" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise"><p>AI chatbots are becoming very popular these days. If you are running a business and have an online presence like a website or a mobile app, you can create an AI chatbot to engage the visitors. These AI chatbots are not like traditional scripted chatbots. They can be trained on your own private data and can provide near to  accurate information to your website visitors and potentially convert them into clients. They are not just limited to providing the information to visitors but are capable of engaging them and can also collect some very useful information, that can be used by businesses to generate leads. </p><p>In November 2023 OpenAI announced OpenAI Assistants API along-with new GPT-4 Turbo model having enhanced capabilities then their existing models. </p><h2 id="open-ai-assistants-api">Open AI Assistants API</h2><p>OpenAI&apos;s Assistants API let&apos;s developers build agent-like experiences within their own applications.  An assistant is a purpose-built AI that has specific instructions.  They can leverage extra knowledge, and can call different models and tools to perform tasks.&#xA0;I am not gonna dive deep into Assistants API today instead we will try to create an AI Chatbot using Assistants API and Flowise. You can read more about Assistant&apos;s API <a href="https://platform.openai.com/docs/assistants/overview/agents?context=with-streaming&amp;ref=talkingtech.io" rel="noreferrer">here</a></p><h2 id="flowise-ai">Flowise AI</h2><p>If you are a developer you can create your own very customized AI apps using Assistants API. They have a very easy to understand documentation available. But it takes time and what if you don&apos;t know how to code. Well you are in luck. Flowise is an open source tool, which helps you build AI apps with zero to little coding knowledge. </p><p>As per their documentation &quot;Flowise is a low-code/no-code drag &amp; drop tool with the aim to make it easy for people to visualize and build LLM apps.&quot; It doesn&apos;t only provide extensive integration with OpenAI but has a lot other cool features as well.  Click <a href="https://docs.flowiseai.com/?ref=talkingtech.io" rel="noreferrer">here</a> to learn more about Flowise. 
</p><h2 id="creating-a-chatbot">Creating A Chatbot</h2><p>Let&apos;s dive into creating a chatbot using Assistants API and Flowise.</p><p>First thing we need to do is to create an API Key at OpenAI platform. So if you haven&apos;t signed up at OpenAI, signup now and then navigate to <a href="https://platform.openai.com/api-keys?ref=talkingtech.io">https://platform.openai.com/api-keys</a> to generate an API key.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-12.40.23-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1036" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-12.40.23-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-12.40.23-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-12.40.23-AM.png 1600w, https://talkingtech.io/content/images/size/w2400/2024/03/Screenshot-2024-03-28-at-12.40.23-AM.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Generate API key at OpenAI platform</span></figcaption></figure><p>Here we can create an Assistant too as you can see Assistants menu item in the screenshot above, but same can be done via FlowsAI interface after setting up the OpenAI API key. We are aiming to create Assistant from FlowsAI interface. </p><p>Click on &quot;Create new secret key&quot; and copy the key and keep it safe somewhere for now. </p><p>Now let&apos;s move on to Flowise website to create the chatbot. Click <a href="https://docs.flowiseai.com/getting-started?ref=talkingtech.io" rel="noreferrer">here</a> to navigate to documentation for Flowise. </p><p>Remember Flowise is not limited to chatbots only, you can build a customized LLM ochestration flow, a chatbot, an agent with all the integrations available in Flowise.</p><p>You can get a running instance of Flowise on any of the cloud providers mentioned <a href="https://docs.flowiseai.com/configuration/deployment?ref=talkingtech.io" rel="noreferrer">here</a> in matter of minutes, but for the purpose of this guide I will setup a local instance only via NPM.</p><p>Make sure you have latest NodeJS installed on your machine.</p><p>Run following command in terminal </p><pre><code>npm install -g flowise
</code></pre>
<p>Once installed, you can start the Flowise server:</p><pre><code>npx flowise start
</code></pre>
<p>That&apos;s it. A local instance of Flowise is ready to play with &#x1F603;</p><p>Navigate to <a href="http://localhost:3000/?ref=talkingtech.io">http://localhost:3000/</a> and you will see following interface.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.06.18-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1138" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.06.18-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.06.18-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.06.18-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.06.18-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Flowise web interface</span></figcaption></figure><p>Note: In order to use Flowise in your production web apps, you will need a hosted instance.</p><p>First thing we need to do is to set the credentials. In our case we have OpenAI API key that we generated in previous step. Click on &quot;Credentials&quot; to go to credentials screen, then click &quot;Add Credential&quot; button. A popup screen will appear with a list of providers. Search for OpenAI using search box and click on it. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.11.12-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1091" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.11.12-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.11.12-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.11.12-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.11.12-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Adding credentials</span></figcaption></figure><p>A popup screen will appear in order to enter the API key. give it a name for identification purpose. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.14.56-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1001" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.14.56-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.14.56-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.14.56-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.14.56-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Add OpenAI API Key</span></figcaption></figure><p>Great! Credentials are stored. Now let&apos;s create the Assistant. 
Click on &quot;Assistants&quot; from the menu to go to Assistants screen and click &quot;Add&quot; to add a new Assistant.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.19.25-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1089" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.19.25-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.19.25-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.19.25-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.19.25-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Adding Assistant</span></figcaption></figure><p>Fill in the information as shown in the screen. </p><ol><li>Name: You can give it any name you want. </li><li>Description:  Add some description about what your chat will do. </li><li>Model: Select the model you want to use. (Note OpenAI has different prices for different models.) </li><li>Credentials: Select the credentials that we created in previous step</li><li>Assistant Instructions: This is very important field. You have to enter the instructions for the Assistant. Tell the AI assistant what it will be doing, How it will communicate with users. You can give it a name also. </li><li>File Upload (optional) : You can upload a file containing any information that you want your chatbot to use during the conversation with users. </li></ol><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.19.39-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1088" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.19.39-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.19.39-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.19.39-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.19.39-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Creating an Assistant</span></figcaption></figure><p>You can learn more about all these fields in OpenAI documentation as well as Flowise documentation. Once filled click &quot;Add&quot; button to create the Assistant.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.40.18-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="769" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.40.18-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.40.18-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.40.18-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.40.18-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Math Tutor Assistant</span></figcaption></figure><p>Once Assistant is created. 
Next step is to create a chatflow. From the left menu click on Chatflows and click on &quot;Add New&quot; button.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.41.33-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="738" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.41.33-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.41.33-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.41.33-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.41.33-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Chatflows screen</span></figcaption></figure><p>This will open the canvas screen. In order to add a flow to canvas click on &quot;+&quot; sign on left side of the screen which will pop open a dialog like below.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.35.18-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="986" height="1922" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.35.18-AM.png 600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.35.18-AM.png 986w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Select assistant</span></figcaption></figure><p>Search for &quot;OpenAI&quot; in search box and click and drag &quot;OpenAI Assistant&quot; on the canvas.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.36.40-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1083" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.36.40-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.36.40-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.36.40-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.36.40-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">create chatflow</span></figcaption></figure><p>Select the Assistant we created in previous step and click on &quot;Save&quot; icon from top-right corner.</p><p>Once saved you can click on the chat icon to interact and test your chatbot. 
Try asking some questions.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.37.33-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1079" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.37.33-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.37.33-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.37.33-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.37.33-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Test chatbot</span></figcaption></figure><p>Well! our chatbot is almost ready to use now. As a last step we need to integrate it in our website. You can click on &quot;&lt;/&gt;&quot; icon on top-right corner to see the options on how to integrate it.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.38.01-AM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1066" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-28-at-1.38.01-AM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-28-at-1.38.01-AM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-28-at-1.38.01-AM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-28-at-1.38.01-AM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Copy embed code for chatbot</span></figcaption></figure><p> You can copy the embed code and paste it in your website, and the chatbot will start appearing on your website.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-27-at-10.49.11-PM.png" class="kg-image" alt="Creating an AI Chatbot using OpenAI Assistants API and Flowise" loading="lazy" width="2000" height="1196" srcset="https://talkingtech.io/content/images/size/w600/2024/03/Screenshot-2024-03-27-at-10.49.11-PM.png 600w, https://talkingtech.io/content/images/size/w1000/2024/03/Screenshot-2024-03-27-at-10.49.11-PM.png 1000w, https://talkingtech.io/content/images/size/w1600/2024/03/Screenshot-2024-03-27-at-10.49.11-PM.png 1600w, https://talkingtech.io/content/images/2024/03/Screenshot-2024-03-27-at-10.49.11-PM.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Techi - the talking tech bot :)</span></figcaption></figure><p> Note that this embed code also gives you some customization options in order to change the look and feel of your chatbot. You can see the list of all the options by clicking <a href="https://github.com/FlowiseAI/FlowiseChatEmbed?ref=talkingtech.io" rel="noreferrer">here</a></p>]]></content:encoded></item></channel></rss>