Blog

Agents Wielding Rails

What if the future of software is built not by humans… but by agents?

And what if the fastest way to make that future real, today, is to hand those agents a battle-tested tool that’s already shaped the modern web?

That tool is Ruby on Rails. And if you understand why Rails is so powerful, especially in the hands of a smart, agentic LLM. You’ll see why I believe this idea is not just interesting..

It’s world-shaping.

The Premise: Agents Can Code

With the rise of GPT-4, Claude, and open-source LLMs, we’ve entered the age of AI software engineers. These models can:

Read and write code
Modify existing files
Chain together commands
Run in loops with planning and self-correction
And most importantly: ship working features

But just because they can code doesn’t mean they’re fast, reliable, or production-ready. What separates “cool demos” from real shipped software is the framework they’re operating in.

And this is where Rails comes in.

Rails: The Weapon of Maximum Productivity

Ruby on Rails is one of the most token-efficient and high-leverage frameworks ever created. It’s not just faster for human developers — it’s also easier for LLMs to reason about, execute within, and build production-grade apps.

Here’s why:

1. Convention Over Configuration = Predictability

LLMs love patterns. Rails is full of them. Everything from file structure to naming conventions is standardized. An LLM doesn’t have to guess where to write code — it knows.

2. Token-Terseness = More Context, Less Bloat

Rails lets you do more with fewer tokens:

rubyCopyEditvalidates :email, presence: true, uniqueness: true

Compare that to the verbose alternatives in Java or Go. This means agents can “see” more of your app at once — and reason more effectively.

3. Scaffolds & Generators = Toolable Workflows

You can build a working CRUD interface with a single command:

rails g scaffold Post title:string body:text

For an agent, this is pure gold. It’s like calling a function that expands into hundreds of lines of boilerplate instantly.

4. Batteries Included = Fewer External Dependencies

Rails comes with:

ActiveRecord (ORM)
ActionMailer (email)
ActiveJob (background jobs)
ActiveStorage (file uploads)
Built-in testing, caching, asset pipeline, and more.

Agents don’t waste time configuring libraries. They use what’s already there.

5. Readable, Declarative DSLs

Rails reads like English:

has_many :comments, dependent: :destroy

That’s not just good for humans. It’s good for LLMs, too.

Rails vs. Other Stacks for Agents

Framework	Agent Friendliness	Why it Slows Agents Down
Rails	🔥🔥🔥	Convention, terseness, full-stack
Django	🔥🔥	Similar to Rails, but slightly less opinionated
Node (Express)	🔥	Fragmented, unstructured, too many decisions
Go	🔥	Verbose, requires fine-grained planning
Java (Spring)	🧊	Config-heavy, non-obvious wiring

Rails is simply faster — not just for humans, but for machines trying to build working software autonomously.

Why This Matters

The world doesn’t need another no-code tool or website builder.

What we need is an agentic software stack — a way for AI to not just help write code, but to architect, modify, test, and deploy entire systems.

Rails is uniquely well-suited to this job:

It provides structure.
It encodes best practices.
It minimizes the token tax.
It comes with decades of documentation and StackOverflow gold.

And most importantly: it’s production-proven.

The Rails Thesis

If LLMs are the developers of the future, Rails is the fastest path from thought to product.

Not React. Not Next.js. Not Go. Not Bun.

Rails.

Because Rails isn’t just a web framework.. it’s an opinionated, token-terse operating system for building full-stack apps. That’s exactly what LLMs need.

If you give an agent access to a Rails repo, a terminal, and structured memory, it can ship software faster than many humans.

And that’s not theoretical. It’s already happening.

Final Thought

The future isn’t code-free.

The future is agent-coded, and Rails is their weapon of choice.

If you’re building agents, investing in tooling, or trying to accelerate AI-native software development, you’d be crazy to ignore Rails.

It’s not old tech.

It’s perfectly shaped for what’s coming.

July 1, 2025

LlamaBot as a super tool

Turning LlamaBot into a full blown MCP client

Why am I wanting LlamaBot to become an MCP Client, similar to Claude and ChatGPT?

I don’t want to rely on Claude and ChatGPT to be my only access to the MCP server world. (I want control over the full stack).
I want access to the client-side agentic workflows. (ReAct, CodeAct, etc).

I want other people to have access to the client-side agentic workflows (so that Claude & OpenAI aren’t black-box magic. I assume that they’re mostly implementing a similar type of ReAct and CodeAct type of agent workflow out of the box).

Creating TypeScript/React front-end for LlamaBot

I decided I want conversation threads for LlamaBot, so we can have multiple, unique persistent conversations with our agent (similar to other Chat agents like ChatGPT and Claude Sonnet).

While having a basic HTML front-end for our initial project made sense, if we’re turning this into a full-blown MCP client, including multiple conversation threads, previewing LlamaBot projects, etc. Then it makes sense that we branch into something that put our project on stable ground. Hence, React with TypeScript.

I actually haven’ t personally used TypeScript previously, but I’m a big fan of React (my first startup’s front-end relied heavily on React and React-Native).

React is amazing because of it’s hook and state update propagation — makes for 10X cleaner and re-usable front-end code once you understand it.

Typed languages in general are nice, because they provide compilation level checking that prevents bugs before run-time. Although I haven’ t used TypeScript extensively, I’m excited to implement it into LlamaBot, because it will lead to a more stable user experience and allow us to build in some amazing functionality to the front-end.

Being a lazy vibe-coder, I decided to let Cursor Agent take the first stab at the entire setup of creating our front-end with TypeScript and React. Let’s see how it does!

^ Cursor banging out a TypeScript/React front-end effortlessly.

Cursor and Claude Sonnet 4 coming up with a banger design for the interface.

It wouldn’t be LlamaBot without our beloved mascot staring down the user with his piercing gaze.

Let’s display our list of Agents that the user can select & run (pulled from langgraph.json).

Adding front-end to detect tool calls and format it as a message:

Next step, getting LlamaBot to write these in as “artifacts”, similar to how Claude creates artifacts.

I created a new folder structure “artifacts” that can house individual projects.

From here, we can equip our agent to write directly to artifacts/<artifact_id>/page.html, artifacts/<artifact_id>/assets/script.js,and artifacts/<artifact_id>/assets/styles.css

We could also have a model.py file, and a controller.py file, that could allow backend functionality for our front-end to interact with (maybe even giving it the ability to trigger additional agent flows and display output!)

One example of this would be a storybook generator side project that I’ve worked on previously, that generates “chapters” of text, and then an associated audio recording of the chapters, and pictures to go along with it.

A very fun project that lived in it’s own FastAPI application and used LangGraph, but once we have artifacts properly working, we could have LlamaBot recreate it as an artifact! (More to come on this)

June 6, 2025
Giving our Coding Agent Playwright as a Tool Call
Playwright is a powerful tool for our AI Agent. Playwright can launch a browser that can be controlled via Python code.

Source

It can do the following:
1. Query HTML/CSS elements and interact with the browser dom.
2. Take screenshots of the page it’s viewing.
3. View JavaScript logs
Why is giving Playwright to our coding agent useful?
1. Our agent could use Playwright to take screenshots and recreate it as HTML/CSS.
2. Our agent could inspect HTML page DOMs & styling.
3. Our agent could dynamically test the code we’re building in real time, (as if it were a human).
  - It could detect if our page wasn’t loading, read JavaScript logs, and feed it back into the agent in a type of “self-healing” verification loop.
    
    (This is what real web software developers do, they load the browser and test the app in real time as they’re building it out).
Challenges with Directly Copying HTML/CSS Source

While we can do that, these files can often be extremely massive. And they are transpiled

While directly copying HTML and CSS from existing websites seems straightforward, this comes with challenges. Modern web pages, particularly those built with no-code platforms or advanced page builders (like Wix, SquareSpace, Elementor, etc.), typically produce large, complex, and heavily transpiled codebases. This includes heavy, excessive CSS styles, many nested elements, and hugely bloated JavaScript files.

This causes:

Massive File Sizes: Transpiled code from visual builders is enormous, making it difficult for Language Learning Models (LLMs) or agents to parse efficiently. Current LLMs have input token limits, restricting the amount of content they can understand and process at once.

Edibility: We’re not looking to infringe on copyrighted work, so we’ll need to be able to make our own derivations from this, ideally using the LLM. But copied code from these tools often lacks readability and is challenging to edit or debug, leading to difficulties for the LLM to understand it, let alone make effective changes.

Instead, a vision first approach, along with passing in a parsed down version of their HTML structure helps generate clean, understandable, and editable code, overcoming these direct-copy challenges effectively.

Using AI Vision for Web Page Cloning is Now Possible Due to the Latest Advancements in AI Vision Reasoning Capabilities.

AI models have gotten really good at visually reasoning about image information. Take OpenAI’s announcement back in December 2024, about O3’s breakthrough in visual reasoning tasks, such as the ARC AGI data-set.

ARC AGI is an evaluation set to compare AI systems performance against. It was intended to be a set of questions and tasks that would be very challenging for AI models to solve, and it’s creators didn’t anticipate a solution to appear as rapidly as it did.

See the announcement here:

We want to test the models ability to learn new skills on the fly. … ARC AGI version 1 took 5 years to go from 0% [solved] to 5% [solved]. However today, O3 has scored a new state of the art score that we have verified. O3 was able to score 87.5%, human performance is comparable at 85% threshold, so being above this is a major milestone.

Gregory Kamradt, president of the ARC foundation.

Source: InfoQ

Source: InfoQ

Given these breakthroughs, an AI model like O3 should be able to reason about the image we give it, and provide a very clear verbal representation of webpage, that can then be passed in to another LLM to create the HTML/CSS code.

Our Approach to Cloning WebPages using AI, and our Agent Architecture:

Here’s the video going over the implementation!

06/04/2025, 7:59PM:

Feeling a little discouraged! I made the decision to add the playwright screenshot in as a two-step tool-calling process. (two new additional tools).

That means our agent has the following tool calls at its disposal:
1. Write HTML
2. Write CSS
3. Write JavaScript
4. get_screenshot_and_html_content_using_playwright
5. clone_and_write_html_to_file
There are two main problems happening right now.
1. The LLM is correctly picking “get_screenshot_and_html_content_using_playwright” when I send in a prompt like
Please clone this webpage: https://fieldrocket.us

2. The LLM is not including image sources for some reason, even though the trimmed_html that we get from playwright, does indeed have the image src tags included in the HTML.

Furthermore, our tracing is lame because when we get into our clone_and_write_html_to_file, we aren’t using langchain_openai sdk, so it’s not logging the LLM input & output in LangSmith (making it harder to observe & debug)

But, roughly 30% of the time, it’s jumping straight from the get_screenshot tool call, into the write_html tool call, rather than going to the clone_and_write_html_to_file.

It does make me wonder: what does this @tool decorator even do?

Is the LLM just seeing the function name of the tool call, or is it also seeing the comment just below the method signature? In the LangChain academy course on LangGraph, Lance doesn’t specify. But he has the comment in there right below the signature, so I assumed the LLM could see it.

According to this guide

You must include a docstring which will serve as the function description and help LLM in understanding its use.

Which is what I assumed, and how Lance appeared to present it in the LangChain academy course.

One workaround that could work, is collapsing the two separate tool-calls into a single one. That way the LLM isn’t having to make one right decisions, just a single right decision.

I bet that would solve this first problem.

I now collapsed the two separate tools into one:
```
@tool
def get_screenshot_and_html_content_using_playwright(url: str) -> tuple[str, list[str]]:
    """
    Get the screenshot and HTML content of a webpage using Playwright. Then, generate the HTML as a clone, and save it to the file system. 
    """
    html_content, image_sources = asyncio.run(capture_page_and_img_src(url, "assets/screenshot-of-page-to-clone.png"))

    llm = ChatOpenAI(model="o3")

    # Getting the Base64 string
    base64_image = encode_image("assets/screenshot-of-page-to-clone.png")

    print(f"Making our call to o3 vision right now")
    
    response = llm.invoke(
        messages=[
            SystemMessage(content="""
                ### SYSTEM
You are “Pixel-Perfect Front-End”, a senior web-platform engineer who specialises in
 * redesigning bloated, auto-generated pages into clean, semantic, WCAG-conformant HTML/CSS
 * matching the *visual* layout of the reference screenshot to within ±2 px for all major breakpoints

When you reply you MUST:
1. **Think step-by-step silently** (“internal reasoning”), then **output nothing but the final HTML inside a single fenced code block**.
2. **Inline zero commentary** – the code block is the entire answer.
3. Use **only system fonts** (font-stack: `Roboto, Arial, Helvetica, sans-serif`) and a single `<style>` block in the `<head>`.
4. Avoid JavaScript unless explicitly asked; replicate all interactions with pure HTML/CSS where feasible.
5. Preserve all outbound links exactly as provided in the RAW_HTML input.
7. Ensure the layout is mobile-first responsive (Flexbox/Grid) and maintains the same visual hierarchy:  
   e.g) **header ➔ main (logo, search box, buttons, promo) ➔ footer**.

### USER CONTEXT
You will receive two payloads:

**SCREENSHOT** – a screenshot of the webpage.  
**RAW_HTML** – the stripped, uglified DOM dump (may include redundant tags, hidden dialogs, etc.).

### TASK
1. **Infer the essential visual / UX structure** of the page from SCREENSHOT.  
2. **Cross-reference** with RAW_HTML only to copy:
   * anchor `href`s & visible anchor text
   * any aria-labels, alt text, or titles that improve accessibility.
3. **Discard** every element not visible in the screenshot (menus, dialogs, split-tests, inline JS blobs).
4. Re-create the page as a **single HTML document** following best practices described above.

### OUTPUT FORMAT
Return one fenced code block starting with <!DOCTYPE html> and ending with </html>
No extra markdown, no explanations, no leading or trailing whitespace outside the code block.
                 
                 Here is the trimmed down HTML:
                 {trimmed_html_content}
            """),
            HumanMessage(content=f"Here is the trimmed down HTML: {trimmed_html_content}"),
            HumanMessage(content=f"data:image/jpeg;base64,{base64_image}")
        ]
    )

    breakpoint()

    with open("/Users/kodykendall/SoftEngineering/LLMPress/Simple/LlamaBotSimple/page.html", "w") as f:
        f.write(response.content)
    
    return "Cloned webpage written to file"
```
Let’s try it now.
June 4, 2025
Part 18: LangGraph Pre-Built Components: An Easier Way to Build Agents

Previously when building our coding agent, LlamaBot, we made a very simple Agent workflow, as described by this diagram here:

This is an intuitive & simple implementation of a coding agent that can take a user’s initial message, decide if the user wants to write code or not, create a “design plan”, and then write code that gets saved to the file system in order to implement that plan.

I like this approach because it’s simple, straightforward, and easy to understand.

There’s a natural progression of thinking from first principles that arrives at this simple agent workflow. That thought process can be seen in real time, by watching videos 1-16 of building our coding agent from scratch, here: https://www.youtube.com/watch?v=AadSBNKglMM&list=PLjxwvRWwj8anN2aTUhX2P0oKc0ghnhXHQ&pp=gAQB

We use LangGraph because it helps us build more reliable agents. One definition of an agent is allowing the LLM to decide the control flow of our application. By representing our application as a set of nodes and edges, we give the LLM autonomy to decide what code should be executed next.

There’s a fatal flaw of our current implementation: the LLM has limited authority to decide the control flow of our application.

For our current implementation, we are on the far-left of this curve. Our current implementation is essentially just a router.

And this is great, because it’s very reliable and simple.

BUT, if we want to build a more impressive and capable coding agent, we need to give the LLM more autonomy.

Our current agent only writes a single file to our file-system right now, into page.html.

This is simple, but limited.

What if we wanted our agent to be able to write the styles.css or script.js into separate files in our file system?

Under our current paradigm, we would need to add two separate nodes, and 2 additional LLM prompts, and an additional 4 edges to make this workflow work.

New Agent Architecture (ReAct agent architecture)

June 1, 2025
Exploring MCP for LlamaBot, Part 1
Source: Hugging Face

MCP Server:

My process of setting up Claude Desktop to act as an MCP Client, setting up MCP.

Adding our own mcp_server.py into LlamaBot. Here’s a basic mcptool call to compute body mass index.

I had to open up this file path for Claude desktop with cursor to configure MCP
```
cursor /Users/kodykendall/Library/Application Support/Claude/claude_desktop_config.json
```
And add this JSON a configuration:
```
{
    "mcpServers": {
      "filesystem": {
        "command": "/Users/kodykendall/SoftEngineering/LLMPress/Simple/LlamaBotSimple/mcp/run_mcp_server.sh",
        "args": []
      }
    }
  } 
```
I had to wrap the Python execution inside a bash script, so that it was executable and so it would activate the venv for LlamaBot.
```
#!/bin/bash

# Get the directory where this script is located
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"

# Activate the virtual environment
source "$PROJECT_ROOT/venv/bin/activate"

# Run the MCP server with all passed arguments
python "$SCRIPT_DIR/mcp_server.py" "$@" 
```
Then, when opening Claude Desktop, when it attempted to use the BMI tool-call, Claude desktop asked for permission to execute it.

It calculated my BMI and Claude told me I’m overweight. Dang! I’m 25.66 BMI, and the normal weight is 24.9.

But good 👍, that means that Claude (and other MCP clients) can consume our BMI request that’s in mcp_client.py

I was wondering: is this accessing the MCP server via the piping input/output through standard in/out running locally on my mac, or through HTTP protocol?

Our implementation uses stdio as the default (very nice if you’re running other programs locally that are in separate processes from your agent. This means you could expose tool-calling across many different applications on your computer, that don’t have to be running under the same process on your operating system).

Great, Claude is using the MCP tools, but how can I integrate MCP client with my own project, so that LlamaBot can act as a client and use these same tools?

Before doing that, I’m going to focus on getting tool calling working with something more relevant to LlamaBot: writing some code to our file-system.

Goal: Can we get Claude Desktop to write code to our file-system?

Yes.

Sweet! Not too shabby.

This begs the question. If we can have an MCP Client like Claude Desktop write code directly to our file-system, then do we need the LlamaBot chat.html interface in the first place?

Yes, we do. For 2 reasons:
1. We don’t know what Claude is doing under the hood.
2. Customizing & creating multi-step agentic workflows isn’t possible with Claude currently
Claude is using tool-calling under the hood, and implementing some sort of agent architecture. But we don’t know exactly how Claude is operating under the hood. Is it just simple tool calling? Or do they have some sort of reasoning & acting agent loop (ReAct) architecture in there? We have no idea.

We’ll continue exploring implementing MCP client, and MCP servers into LlamaBot.
May 31, 2025
05.29.25 – LlamaBot User Input Area Expansion UI/UX
This is a silent coding session, because I’m sitting at the airport, but want to document the changes I’m making to LlamaBot still.

Also sometimes, it’s nice to not have to be in front of the camera talking. I’ve found myself making less progress lately because of feeling like I can’t work on the camera unless I’m recording myself in real time.

This is my compromise! It also means I can spare the nice people sitting next to me at the airport from hearing me monologue about LangChain, LlamaBot, and the future of user experience.

Alright, jumping into it.

Goal:
- Fix our user input field and cuts off text that’s long. (It needs to expand horizontally)
OpenAI’s = ❤️

Changing it quickly with cursor:

Done!

Here’s the PR: https://github.com/KodyKendall/LlamaBot/pull/3

PS: I added a CLA to protect the long-term viability of the project 👍
May 29, 2025
WordPress and the intersection of AI
When I talk about WordPress, people here in San Francisco usually give me a puzzled look.

“Did you know WordPress powers 43% of all websites and growing?”

I tell them.

“Yes, it’s absolutely absurd. Who uses WordPress anymore? Framer and WebFlow are going to crush WordPress”

Meanwhile, Framer and WebFlow collectively have less than 3% of the market share for CMS platforms, and WordPress has over 60%. Sorry, Framer and WebFlow are not going to overtake WordPress anytime soon.

While I’ve moved past the idea that WordPress will be the major vehicle for the AI future (that idea was something I believed briefly when I first realized the enormity of WordPress, the power of open source communities, and the staying power/switching costs of software), I still deeply admire WordPress, and think there’s an interesting story to be told about WordPress and AI.

The original thought was that AI could penetrate into WordPress’s ecosystem and be a large surface area for introducing AI into the day to day happenings of the world. While this will likely still happen, it will happen everywhere. Not just WordPress.

The release of NLWeb from Microsoft is a great example of how quickly the world is going to uptake WordPress – within days of it opening, multiple people (myself included) began building plugins to WordPress to turn it into NLWeb servers (allowing AI Agents to search and query existing WordPress sites, as well as regular users having an AI chatbot that could ask any question of the WordPress site, and have it surface the relevant content).

So no, the reason I keep talking about WordPress isn’t because I think it will be a large focal point of the AI future. It’s because it’s a useful and inspiring analogy about what the future of open source could hold for the age of AI agents.

I’m inspired by Matt Mullenweg. Despite the negative press & controversy that’s surrounded him, he’s the one who introduced me to the idea and concept of the power of open source software.

He visited my hometown Salt Lake City, and spoke about open source at the Silicone Slopes Summit in 2023. I was in the audience, and I remember being shocked when he told us how his little project he started when he was 19, WordPress, took over the world.

He was talking a different language that I wasn’t used to when it came to startups: personal freedoms, the right to ownership, democratizing publishing. This was lofty language that resonated with a founder in his early 20s.

He challenged the audience to consider building open source software, and that maybe they could build something that had as large of an impact on the world as WordPress.

By making WordPress extremely easy to download, install, and deploy, he was giving a gift to anybody who wanted to start a blog and publish their ideas. This helped spark a revolution, and later helped small business owners publish critical information about their business, that allowed them to show up on placed like Google.

But that was 2004, and my obsession began with WordPress in 2024. In 20 years, how did WordPress become this massive behemoth that has an $800B world-wide ecosystem?

And yet, when I first tried WordPress, even though I have a degree in Computer Science and have been using computers daily since 4 or 5 years old, it was confusing and clunky.

And yet, it’s still so dominant.

This is where I understand people’s misunderstandings: WordPress? Really? But Framer, WebFlow, Wix, SquareSpace, they’re all so much easier!

There’s no way WordPress will be around for long.

And this is the question I’ve obsessed over for all of 2024. And I think I’ve found the answer. WordPress isn’t going away. I’ve boiled it down to three things:
1. Because it’s open source
2. It has a huge community
3. It has a large and open 3rd party marketplace that creates huge network effects and causes massive switching costs.
And all 3 of these things, actually give WordPress a competitive advantage over all of it’s competitors.

When Microsoft announced NLWeb, you had multiple indie hackers/developers jumping at the opportunity, to introduce the protocol into WordPress’s ecosystem. I’m sure this was happening before the top executives at Framer, Webflow, Wix, or Squarespace had even realized what NLWeb actually is. And this is why WordPress actually is a better, more durable tool than any of it’s competitors.

It’s infinitely customizable and flexible, and this means there is more deep innovation happening on it’s platform than anywhere else. This means that new innovative ideas can actually be rolled out faster to WordPress users than it’s competitor’s users. For example, our AI coding agent, LlamaPress, has a slick integration with WordPress that means that WordPress has a better AI native building & editing experience than any other page builder out there currently. WordPress literally has: prompt to website creation, with all the flexibility that comes with it, before it’s competitors.

Because some random hacker in San Francisco decided to code up a 3rd party integration with it. Matt Mullenweg and I have never talked 1 on 1. But in a way, I’m an employee and asset of Automattic. All the 3rd party WordPress developers of the world are. Because in their sum, Automattic has an army across the globe of people working non-stop 24/7, providing free time, energy, innovation, and intellect, to helping increase WordPress’s capabilities through 3rd party extensions and plugins.

And that’s why open source, deep communities, and 3rd party extensibility holds deep competitive advantages over closed source proprietary systems.

This is also why I think LangChain and LangGraph are going to be big winners in the new AI agentic world. I’m betting big on LangChain and LangGraph.

But that does beg the question: is WordPress the right analogy to compare against LangChain and LangGraph? And also, where would open weight models such as Llama and Deepseek fit into this?

I will write a future article going deeper into this idea. But briefly, here’s how I view things.
1. Open LLMs are analogous to the Linux operating system. They open up a base-layer of possibility where you’re not locked into a proprietary vendor.
2. LangChain and LangGraph are analogous to other open source software, such as PHP, Apache Web Server, and MySQL. LAMP is the “wrappers/orchestrators” of the operating system, just as LangChain/LangGraph are the “wrappers/orchestrators” of the LLM.
3. What is the equivalent of WordPress? WordPress is essentially a “wrapper” of the LAMP stack, right?
The equivalent of WordPress is LlamaPress. Will this bear into fruition? Will LlamaPress one day support a $800B world-wide ecosystem of builders? The future is uncertain, there is no clear answer yet.

But I know that 2024 looked a lot like 2004. And in 2004, WordPress was a tiny community, of raving, excited contributors, who all probably didn’t conceive of taking over the world. And that’s what LlamaPress currently is.

We are a wrapper of LangGraph, LangChain, and LLMs, so that non-technical users can publish their own micro web-apps and micro AI agents. They can self-host LlamaPress on their own servers at home, and they can build a software and AI system that they truly own.

What an exciting time to be alive as a builder.
May 28, 2025

The Unstoppable Proliferation of Technology

Proliferation is the default of technology. General purpose technologies become waves when they diffuse widely. Without an epic and near-controlled global diffusion, it’s not a wave; it’s a historical curiosity.

Once diffusion starts, however, the process echoes throughout history, from agriculture’s spread throughout the Eurasian landmass to the slow scattering of water mils out from the Roman Empire across Europe.

Once a technology gets traction, once a wave starts building, the historical pattern we saw with cars is clear.

When Gutenberg invented the printing press around 1440, there was only a single example in Europe: his original in Mainz, Germany. But just fifty years later a thousand presses spread across the Continent. Books themselves, one of the most influential technologies in history, multiplied with explosive speed.

Or take electricity. The first electricity power stations debuted in London and New York in 1882, Milan and St. Petersburg in 1883, and Berlin in 1884. Unsurprisingly, consumer technologies exhibit a similar trend. Alexander Graham Bell introduced the telephone in 1876, By 1900, America had 600,000 telephones. Ten years later there were 5.8 million. Increasing quality joins decreasing prices in this picture. A primitive TV sold in $1000 in 1950 would cost just $8 in 2023. though, of course, TVs today are infinitely better and so cost more.

Proliferation is catalyzed by two forces: demand and the resulting cost decreases, each of which drives technology to become even better and cheaper. Of course, behind technological breakthroughs are people.

They labor at improving technology in workshops, labs, and garages, motivated by money, fame, and often knowledge itself. Technologists, innovators, and entrepreneurs get better by doing and crucially by copying.Copying is a critical driver of diffusion. Mimicry spurs competition, and technologies improve further.

Economies of scale kick in and reduce costs. Civilization’s appetite for userful and cheaper technologies is boundless. This will not change.

Mustafa Suleyman, Founder of DeepMind
Page 31, The Coming Wave.

Nine Additional Historical Examples of the Proliferation of Technology

Below are nine clear-cut cases where a single invention or platform tipped from “curiosity” to worldwide wave, echoing the diffusion dynamic described in the excerpt.

1. Medieval water-mills

Roman engineers introduced the water-mill to Europe in the 1st century CE, but the real take-off waited for the early Middle Ages. By 1086 the Domesday Book already lists ≈5,600 mills in England alone, implying tens of thousands across Europe — a three-order-of-magnitude jump in a few centuries.

2. Steam engines in the Industrial Revolution

Thomas Savery’s 1698 pump was “engine #1.” By 1800 Britain had ~2,300 working engines spread over mining, textiles and transport — enough to transform national energy use and industrial output. UCI Social Sciences

3. U.S. railroads

Track length grew from 23 miles in 1830 to 254,000 miles by 1916 — more than a ten-thousand-fold expansion that rewired the American economy and settled the continent.

4. Household electrification

Only ~10 % of U.S. homes were wired in 1910; by 1930 the figure had leapt to ≈70 %, thanks to falling generator costs and the network effects of appliance makers.

5. Penicillin

Before WWII the world’s supply was measured in hundreds of millions of units; by late-1945 U.S. firms were turning out ≈650 billion units every month — a 1,600-fold jump in two years that made antibiotics a global standard of care.

6. Personal computers

Annual shipments exploded from 48 000 units in 1977 to 125 million in 2001; installed base passed one billion machines in 2008, cementing the PC as a general-purpose platform.

7. The Internet

Worldwide penetration doubled from 30 % of humanity in 2010 to 60 % in 2020—an extra 2½ billion users in a single decade.

8. Smartphones

The iPhone launched in 2007; by 2024 there are ≈7.16 billion smartphone subscriptions, essentially one per adult on the planet, with further growth still forecast.

9. Solar photovoltaic modules

As cumulative output doubled again and again, module prices crashed from $106 /W in 1976 to <$0.40 /W in 2019 (≈99 % decline). The plummeting cost turned solar from space-probe niche to the largest source of new generating capacity worldwide.

10. Generative-AI: ChatGPT

OpenAI’s chatbot hit 100 million monthly users in just two months after launch — the fastest consumer-app adoption ever recorded, out-pacing even TikTok.

Why they all snowballed
Across very different eras, the same feedback loop appears: initial breakthrough → early adopters prove value → copy-cat entrants & competition drive scale → scale slashes unit cost and boosts quality → huge new demand pulls the technology into ubiquity. The historical record suggests that once diffusion reaches a critical mass, “proliferation is indeed the default.”

How Open Source Software Accelerates Proliferation of Technological Adoption

A live open-source “wave”: Linus Torvalds Writes Linux

1991 – one grad-student prototype.
2025 – Linux runs ≈ 96 % of the world’s top one-million web servers and ≥ 90 % of super-computers, and its kernel underpins every Android phone on the planet. That jump from hobby OS to the default internet substrate mirrors the pattern in your excerpt: free code → relentless copying → economies of scale → ubiquity.

WordPress — open-source diffusion on steroids

Milestone	What happened	Why it mattered
May 2003	Fork of the abandoned b2/cafelog blog engine ships under the GPL.	Zero licensing barriers; anyone can copy, hack, re-host.
July 2011	WordPress already powers 50 million sites.	Confirms the wave is real; gives plug-in/theme creators a market. Weblog Tools Collection
May 2025	*43.5 % of all* websites (≈ 529 million) now run WordPress; it commands 61 % of the CMS market**. WPZOOM	Past Gutenberg- press levels of diffusion; effectively the default publishing layer for the web.
Ecosystem scale	70 k + plugins, 30 k + themes; countless SaaS hosts, agencies, and product companies. WPZOOM Colorlib	Copy-and-compete loop drives quality up, cost down; every plug-in author is an incremental R&D lab.

How WordPress rode (and fueled) the proliferation loop

Demand pull – Small businesses, bloggers, newsrooms all needed cheap, flexible publishing. A single-click installer and a theme got them online in minutes instead of paying for bespoke CMS builds.
GPL-powered mimicry – Because the core and every derivative must stay GPL, copying isn’t just legal — it’s expected. That norm produced tens of thousands of plugins and themes, each someone’s attempt to out-innovate the last.
Economies of scale – As install base ballooned, hosting companies optimized specifically for WordPress, pushing running costs toward zero and performance up. Managed-WP hosting, page-builder plugins, and WooCommerce (itself GPL) stacked more value on top of the same free core.
Feedback spiral – More users → more plugin revenue → more developers → better tooling → more users. The very thing Gutenberg’s printing press kicked off in the 15th century repeats here in software form.

Bottom line: WordPress is the open-source textbook case of technology’s default toward proliferation — from one fork in 2003 to nearly half the web by 2025 — driven by zero-cost replication, a rabid developer community, and the economic flywheel that open licensing unlocks.

What is the future of Open Source technology amidst the Coming Wave of AI?

1 Large-Language Models: the “source weights” wave

Open-weight releases have reached critical mass:

Meta’s Llama series. Llama 3 (and, weeks ago, the 70 B-parameter “3.3” refresh) ships full checkpoint downloads via Hugging Face; requests are auto-approved within hours. GitHub
Mistral’s Mixtral line. Apache-2 licensed, 8×22 B MoE and code-tuned variants deliver GPT-4-class performance while remaining completely royalty-free. Mistral AI Documentation
Coming soon: even OpenAI has confirmed an “open-weight” model for mid-2025, a signal that open sourcing is no longer fringe. WIRED

Why it matters next:

Hardware + quantization (gguf/llama.cpp) put 8–70 B models on consumer GPUs and phones; the cost of running frontier LLMs is collapsing.
Specialisation loops (LoRA, QLoRA, DPO) let any lab or startup turn a general model into a niche expert for ≪ $10 k.
Governance battles now focus on compute controls rather than banning open weights entirely. WIRED

Result: open LLMs are on track to become the Linux layer of AI — the default substrate every other product is built on.

2 Open-source orchestration frameworks (LangChain → LangGraph)

LangChain stabilised with v0.2 (May 2024), splitting into small, composable packages and adding first-class tool-calling, tracing, and eval hooks. Langchain

LangGraph then extended LangChain with a graph-of-states model for multi-agent, fault-tolerant workflows (think Airflow, but for LLM calls). AWS is already teaching customers how to deploy LangGraph graphs on Bedrock. Amazon Web Services, Inc. Its GitHub repo has become the de-facto reference for “resilient agent” design patterns. GitHub

Trajectory:

2023	2024	2025-2026 (expected)
Prompt chains	Tool-calling, RAG pipelines	Full agent graphs with tracing, rollback, cost & carbon budgets built-in
Single-thread	Concurrency via LangGraph	Distributed micro-agent swarms across GPU clusters
Ad-hoc logging	LangSmith / AgentOps observability	SLO dashboards & policy enforcement layers GitHub

3 Next layer up: open frameworks that create agents

With orchestration commoditised, the community is racing to ship “agent builders” that sit on top:

Project	Built on	What it gives you
Flowise	LangChainJS	Drag-and-drop UI; ship a bespoke RAG or chat agent in minutes. GitHub
CrewAI	Python (independent)	Role-based “crew” pattern for collaborative multi-agent tasks. GitHub
Microsoft Autogen	Optional LangChain adapter	Programmatic framework + Autogen Studio no-code GUI; enterprises prototype multi-agent workflows, then export to code. GitHub
AgentOps	LangChain/LangGraph plug-ins	Monitoring, evals and cost tracking for anything you build. GitHub

Where this is heading

Domain-specific starter kits. Expect OSS templates for “legal research agent,” “WordPress marketing copilot,” etc., each bundling tools + guardrails.
Marketplace economics. Plugins, data connectors, tool APIs will trade like WordPress themes did — a low-friction path to monetise niche know-how.
Local-first & privacy. Quantised Llama-class models plus LangGraph on-device mean SMEs can run powerful workflows without sending data to the cloud.
Safety layers baked-in. Guardrails.ai-style validation and the new “constitutional decoding” patches will ship as default nodes, not after-thoughts.

Stacking the AI-Agent universe the way the web was stacked

Web-era layer (≈2000s)	Role	AI-era analogue	Role
Linux kernel	Free, modifiable operating substrate that every higher layer builds on.	Open-weight LLMs (Llama 3, Mixtral, Gemma-it, etc.)	The raw cognition layer: freely downloadable weights anyone can fine-tune or quantize.
Apache / PHP / MySQL (the LAMP middle tier)	Runtime, networking, and data plumbing that turns static bits into dynamic applications.	LangChain + LangGraph	Orchestration fabric that turns raw LLM calls into stateful, tool-using, multi-step agents and workflows.
WordPress	A user-facing UI/UX so non-coders can publish, extend with plugins, and run entire businesses on the web stack.	LlamaPress	The missing application-layer CMS for agents: a radically open-source builder that lets technical and non-technical users spin up custom software tools and AI agents on top of LangChain/LangGraph, just as easily as WordPress let them spin up websites.

Bottom line:
If open-source LLMs are the Linux of the emerging AI platform and LangChain/LangGraph are its PHP/MySQL/Apache, then LlamaPress aims to be the WordPress—the approachable, plugin-friendly front door through which the next hundred million creators build tailor-made software and autonomous agents.

The big picture

History says that once a technology’s unit-cost crashes and copying is friction-free, proliferation is inevitable. Open-weight LLMs have crossed that threshold; open orchestration frameworks are doing the same for workflow logic; and a Cambrian explosion of agent-builder toolkits is lowering the bar from “developer” to “power user.”

By the late-2020s, spinning up a bespoke AI workforce could be as routine — and as openly sourced — as launching a WordPress site today. The democratization of AI is not a distant goal; it is the next wave, already appearing.

May 13, 2025