Tag: LLM

  • Testing Chrome’s Prompt API: On-Device AI for the Web

    The web just took another step toward local, browser-based intelligence — and I’m here for it.

    I recently signed up for the Prompt API trial in Chrome, and I’ve been diving into what it could mean for the future of interactive web experiences. For those not familiar, the Prompt API gives developers access to on-device language models directly through the browser. That means we can build AI-powered tools without relying on external APIs or sending data to the cloud — it all runs locally, right where the user is.
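
    The exact API surface is still settling during the trial, so names may change, but the basic shape is a session you create against the on-device model and then prompt. Here is a minimal sketch, assuming the session-based interface described in Chrome's current docs:

    TypeScript
    // Minimal sketch of the Prompt API's session-based shape (experimental; names may change).
    // LanguageModel is declared as `any` because official type definitions aren't shipped yet.
    declare const LanguageModel: any;

    async function askLocalModel(question: string): Promise<string | null> {
      // Check whether the on-device model can run on this machine.
      const availability = await LanguageModel.availability();
      if (availability === "unavailable") return null;

      // Create a session and prompt it; everything runs locally in the browser.
      const session = await LanguageModel.create();
      const answer: string = await session.prompt(question);
      session.destroy(); // free on-device resources when done
      return answer;
    }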

    Why I’m Excited

    For years, we’ve watched AI integrations live mostly on servers — calling APIs, managing tokens, and handling latency. The idea that we can now tap into a browser-level AI model opens up an entirely new playground for building responsive, privacy-friendly, and highly personalized web apps.

    It’s still early, but the possibilities are endless.

    What I’m Experimenting With

    As part of the trial, I’m already thinking through a few prototype projects that could take advantage of this local AI layer:

    • Product Comparison Tool
      Imagine selecting two products on a site and instantly getting a human-like breakdown of pros, cons, and suitability — all processed right in your browser (see the rough sketch after this list).
    • Smart Shopping Cart
      A cart that helps you make smarter decisions. Instead of just holding products, it could suggest alternatives, bundle recommendations, or tell you if you’re missing something commonly paired with your picks.
    • AI-Powered FAQ System
      Instead of static questions and answers, the FAQ could understand context. Users could ask natural questions, and the browser would generate helpful, brand-specific answers based on local content.
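
    To make the product comparison idea concrete, here is a rough, hypothetical sketch that reuses the session shape from the snippet above; the Product type and the prompt wording are placeholders, not a real implementation:

    TypeScript
    // Hypothetical product comparison against an on-device session (illustrative only).
    declare const LanguageModel: any;

    interface Product {
      name: string;
      price: number;
      features: string[];
    }

    async function compareProducts(a: Product, b: Product): Promise<string> {
      const session = await LanguageModel.create();
      const prompt = [
        "Compare these two products for a shopper. List pros, cons, and who each one suits best.",
        `Product A: ${a.name} ($${a.price}) - ${a.features.join(", ")}`,
        `Product B: ${b.name} ($${b.price}) - ${b.features.join(", ")}`,
      ].join("\n");
      const result = await session.prompt(prompt);
      session.destroy();
      return result;
    }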

    The Future of On-Device Intelligence

    This feels like a major shift — not just for performance, but for privacy and accessibility. You don’t need an API key, a backend pipeline, or even a connection to OpenAI or Gemini. Everything happens in the browser, using the user’s device capabilities.

    That opens doors for lighter, faster, and more compliant AI experiences, especially in industries where data sensitivity matters.

    If you’re a developer, I recommend checking out the Prompt API GitHub repo and the official Chrome developer docs.

    I’ll be sharing updates as I build out my first demos — so stay tuned for more experiments soon on bradbartell.dev.

  • The True Cost of LLMs — and How to Build Smarter with Ollama + Supabase

    Over the past few years, the cost of training large language models (LLMs) has skyrocketed. Models like GPT-4 are estimated to cost $20M–$100M+ just to train once, with projections of $1B per run by 2027. Even “smaller” foundation models like GPT-3 required roughly $4.6M in compute.

    That’s out of reach for nearly every company. But the good news? You don’t need to train a new LLM from scratch to harness AI in your business. Instead, you can run existing models locally and pair them with a vector database to bring in your company’s knowledge.

    This approach — Retrieval Augmented Generation (RAG) — is how many startups and internal tools are building practical, affordable AI systems today.


    Training vs. Using LLMs

    • Training from scratch
      • Requires thousands of GPUs, months of compute, and millions of dollars.
      • Only feasible for major labs (OpenAI, Anthropic, DeepMind, etc.).
    • Running + fine-tuning existing models
      • Can be done on commodity cloud servers — or even a laptop.
      • Cost can drop from millions to just hundreds or thousands of dollars.

    The trick: instead of teaching a model everything, let it “look things up” in your own database of knowledge.


    Ollama: Running LLMs Locally

    Ollama makes it easy to run open-source LLMs on your own hardware.

    • It supports models like LLaMA, Mistral, and Gemma.
    • You can run it on a laptop (Mac/Windows/Linux) or in a Docker container. I like to run it in Docker on my machine; it’s the easiest way to control costs while building and testing.
    • Developers can expose endpoints to applications with a simple API.

    Instead of paying per token to OpenAI or Anthropic, you run the models yourself, with predictable costs.

    Bash
    # Example: pull and run LLaMA 3.2 with Ollama
    ollama pull llama3.2
    ollama run llama3.2
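
    Once the Ollama service is running, an application can talk to its local HTTP API. A minimal sketch, assuming Ollama's default address (http://localhost:11434) and the non-streaming /api/generate endpoint:

    TypeScript
    // Call a locally running Ollama server from an application.
    // Assumes the default port 11434 and that llama3.2 has already been pulled.
    async function generate(prompt: string): Promise<string> {
      const res = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          model: "llama3.2",
          prompt,
          stream: false, // return a single JSON object instead of a token stream
        }),
      });
      const data = await res.json();
      return data.response; // the generated text
    }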

    Supabase: Your Vector Database

    When you add RAG into the mix, you need somewhere to store embeddings of your documents. That’s where Supabase comes in:

    • Supabase is a Postgres-based platform with the pgvector extension built in.
    • You can store text embeddings (numerical representations of text meaning).
    • With SQL or RPC calls, you can run similarity searches (using the <-> distance operator) to fetch the most relevant chunks of data.

    For example, embedding your FAQs:

    SQL
    -- Enable the pgvector extension (once per database)
    CREATE EXTENSION IF NOT EXISTS vector;

    CREATE TABLE documents (
      id bigserial PRIMARY KEY,
      content text,
      -- Dimension must match your embedding model
      -- (1536 for OpenAI text-embedding-3-small; nomic-embed-text is typically 768)
      embedding vector(1536)
    );

    -- Search for the documents closest to a query embedding
    -- ($1 is the query embedding your application passes in)
    SELECT content
    FROM documents
    ORDER BY embedding <-> $1
    LIMIT 5;

    This gives your LLM the ability to retrieve your data before generating answers.
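
    Before you can retrieve anything, you need to embed your content and insert it into that table. A minimal indexing sketch, assuming Ollama's nomic-embed-text model for embeddings and the supabase-js client; the project URL and key are placeholders:

    TypeScript
    // Embed text with a local Ollama model and store it in the documents table above.
    import { createClient } from "@supabase/supabase-js";

    // Placeholders: use your own Supabase project URL and key.
    const supabase = createClient("https://YOUR-PROJECT.supabase.co", "YOUR-SERVICE-ROLE-KEY");

    async function embed(text: string): Promise<number[]> {
      // Requires `ollama pull nomic-embed-text` beforehand.
      const res = await fetch("http://localhost:11434/api/embeddings", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
      });
      const { embedding } = await res.json();
      return embedding;
    }

    async function indexDocument(content: string): Promise<void> {
      const embedding = await embed(content);
      // pgvector columns accept a plain number array through supabase-js.
      const { error } = await supabase.from("documents").insert({ content, embedding });
      if (error) throw error;
    }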

    RAG in Action: The Flow

    1. User asks a question → “What’s our refund policy?”
    2. The system embeds the query using nomic-embed-text (via Ollama) or OpenAI embeddings.
    3. Supabase vector search finds the closest matching policy docs.
    4. Ollama LLM uses both the question + retrieved context to generate a grounded answer.

    Result: Instead of the model hallucinating, it answers confidently with your company’s real data.
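
    Here is a minimal sketch of that flow, reusing the embed() helper and Supabase client from the indexing snippet above. The match_documents RPC is a hypothetical Postgres function you would define to wrap the similarity query shown earlier:

    TypeScript
    // End-to-end RAG sketch: embed the question, retrieve context, generate a grounded answer.
    // Assumes embed() and supabase from the indexing sketch, plus a hypothetical
    // match_documents(query_embedding, match_count) function wrapping the <-> query above.
    async function answerQuestion(question: string): Promise<string> {
      // 1. Embed the user's question.
      const queryEmbedding = await embed(question);

      // 2. Vector search in Supabase for the closest matching chunks.
      const { data: docs, error } = await supabase.rpc("match_documents", {
        query_embedding: queryEmbedding,
        match_count: 5,
      });
      if (error) throw error;

      // 3. Ask the local model, grounded in the retrieved context.
      const context = (docs ?? []).map((d: { content: string }) => d.content).join("\n---\n");
      const res = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          model: "llama3.2",
          prompt: `Answer using only this context:\n${context}\n\nQuestion: ${question}`,
          stream: false,
        }),
      });
      const data = await res.json();
      return data.response;
    }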

    Cost Reality Check

    • Training GPT-4: $50M+
    • Running Ollama with a 7B–13B parameter model: a few hundred dollars per month in compute (or effectively free on hardware you already own).
    • Using Supabase for vector search: low monthly costs, scales with usage.

    For most businesses, this approach is orders of magnitude cheaper and far faster to implement.

    Final Thoughts

    Building your own GPT-4 is impossible for most organizations. But by combining:

    • Ollama (local LLM runtime)
    • Supabase + pgvector (semantic search layer)
    • RAG pipelines

    …you can get the power of custom AI at a fraction of the cost.

    The future isn’t about every company training billion-dollar models — it’s about smart teams leveraging open-source LLMs and vector databases to make AI truly useful inside their workflows.

    Interested in this for your company? Feel free to reach out on LinkedIn; I’ve built this kind of system for Modere and one of my freelance clients, and I’d be happy to build one for you.

  • How Social Media Companies Can Use AI Without Losing Human Control

    AI is changing the way businesses work — and social media is no exception. Agencies are under constant pressure to deliver content faster, track trends in real time, and respond to audiences across multiple platforms. But in all the hype around automation, one principle often gets lost: AI should never replace human creativity and judgment.

    Instead, think of AI as a digital assistant that handles the repetitive, data-heavy tasks and gives you a head start on creative thinking. Humans remain in control, making the final decisions, ensuring brand safety, and applying the nuance that no algorithm can capture.

    Here are three practical ways social media companies can use AI to enhance their work — while always keeping people in the driver’s seat.


    1. Creative Campaign Ideation

    The challenge:

    Brainstorming campaign ideas is a cornerstone of social media marketing, but it can be time-consuming. Teams can spend hours trying to crack the “big idea,” only to end up circling the same concepts.

    How AI helps:

    AI can dramatically speed up the ideation process by:

    • Generating dozens of campaign angles from a single prompt.
    • Suggesting different creative formats (short-form video, Instagram carousels, LinkedIn thought pieces).
    • Tailoring ideas to audience segments (teen lifestyle, small business owners, B2B decision-makers, etc.).

    Where humans come in:

    The team takes these raw AI-generated ideas and applies strategy, creativity, and brand voice. Humans filter out what won’t resonate, refine what has potential, and ensure that the concepts align with client goals. AI provides volume and variety — humans provide vision.


    2. Social Listening & Insight Generation

    The challenge:

    Audiences move fast, and conversations can shift overnight. Agencies need to understand what’s trending, how competitors are positioning themselves, and where opportunities exist — but manually monitoring these signals across multiple platforms can eat up entire days.

    How AI helps:

    AI-powered monitoring tools can:

    • Track mentions, hashtags, and brand sentiment at scale.
    • Spot emerging trends before they hit the mainstream.
    • Highlight unusual spikes in competitor activity or audience engagement.

    Where humans come in:

    AI surfaces the noise; humans decide what matters. A strategist interprets the signals, applies market context, and recommends how to act. For example, AI might detect a sudden surge in conversation around sustainable fashion. A human marketer decides if it’s worth jumping in, how to align it with brand values, and whether it’s appropriate for the campaign calendar.


    3. Customer Engagement & Support

    The challenge:

    On social media, audiences expect instant responses — but agencies can’t realistically have humans available 24/7 to handle every comment, DM, or inquiry.

    How AI helps:

    AI chatbots and response assistants can:

    • Handle routine questions (“What are your hours?” “Where can I buy this?”).
    • Direct users to helpful resources or FAQs.
    • Flag urgent or sensitive conversations for human follow-up.

    Where humans come in:

    Community managers review and step in for anything that requires empathy, nuance, or strategic decision-making. When conversations escalate — such as customer complaints, influencer inquiries, or brand reputation issues — only a human can respond with the judgment and care needed. AI handles scale; humans handle relationships.


    Why Balance Matters

    AI is powerful, but it’s not perfect. Left unchecked, it can generate off-brand content, misinterpret conversations, or mishandle sensitive interactions. That’s why the best social media companies use AI as an assistant, not a replacement.

    By letting AI take on the repetitive tasks — brainstorming raw ideas, monitoring chatter, drafting first responses — agencies free up their human teams to do what they do best: create compelling campaigns, build client trust, and foster authentic connections with audiences.

  • Future-Proofing Your Content Strategy with llms.txt

    Search is evolving—and fast. With the rise of generative AI and large language models (LLMs), how your content is found, interpreted, and used is shifting from traditional keyword-based search engines to conversational AI platforms. In this new era, visibility isn’t just about ranking #1 on Google—it’s about being the source LLMs cite, summarize, or paraphrase in their responses. That’s where llms.txt comes in.

    What Is llms.txt?

    The llms.txt file is a new standard being proposed as a way for website owners to communicate how their content should be accessed and used by large language models like ChatGPT, Google Gemini, Claude, and others. It’s a simple text file placed at the root of your domain, similar to robots.txt, but with a focus on LLMs rather than search engine crawlers.

    For example:

    https://bradbartell.dev/llms.txt

    This file lets you:

    • Allow or block LLMs from training on or referencing your content
    • Specify conditions for use (like attribution or licensing terms)
    • Signal your openness to AI systems in a transparent, machine-readable way

    How Is llms.txt Different from robots.txt?

    While both llms.txt and robots.txt are used to guide automated systems, they serve different purposes:

    Feature           | robots.txt                                | llms.txt
    Primary Audience  | Web crawlers (e.g., Googlebot, Bingbot)   | Large language models (e.g., ChatGPT, Gemini)
    Focus             | Search indexing and crawling behavior     | AI training and content usage
    Syntax            | Standard directives like Disallow, Allow  | Emerging conventions for AI content governance
    Current Adoption  | Widely implemented and recognized         | Still emerging, but gaining attention

    robots.txt tells search engine crawlers which pages they may crawl. llms.txt goes a step further by addressing whether your content can be used in training datasets or real-time generative answers.

    Why It Matters for the Future of SEO and AI Search

    As AI becomes the front door to more digital experiences, how LLMs interpret and use your content will define your visibility. This includes:

    • Whether your content is cited in AI-generated summaries
    • How accurate or up-to-date AI answers are when referring to your site
    • The ability to control or monetize the use of your original content

    By proactively adding llms.txt, you demonstrate digital maturity and readiness to engage with AI systems on your terms.

    How to Implement llms.txt

    1. Create a plain text file named llms.txt.
    2. Add directives or policy notes, such as:
    User-Agent: *
    Allow: /
    Attribution: required
    Licensing: CC-BY-NC
    Contact: ai@yoursite.com
    3. Upload it to the root of your domain (e.g., https://yoursite.com/llms.txt).
    4. Monitor adoption and adjust policies as standards evolve.

    Conclusion: Stay Ahead of the Curve

    The introduction of llms.txt is more than a technical tweak—it’s a strategic move. As more AI models crawl, synthesize, and present content, your site’s policies should keep pace. By embracing llms.txt, you’re not just protecting your content—you’re positioning your brand to thrive in the next wave of search and discovery.