The Computer That Works How You Thought Computers Worked
Steve Ruiz demos tldraw.computer - a visual programming language where language models become the execution engine. See graph-based AI workflows that execute multimodal computations, return editable structures instead of pixels, and enable true human-AI collaboration on the canvas.
I want a computer that works the way that I thought a computer worked before I knew how a computer works.
— Steve Ruiz, on the core philosophy behind tldraw.computer (00:15:30)
The Philosophy: Intuitive Computing
Steve Ruiz built tldraw.computer around a simple but revolutionary idea: computing should work how we intuitively expect it to work. No code, no syntax - just connect inputs to outputs and let AI figure out the rest.
Why This Matters
This represents a fundamental shift from code-first to intent-first interaction. Instead of learning syntax, users express their intent visually, and language models figure out the implementation. This makes AI accessible to non-programmers while enabling sophisticated workflows that traditional code cannot handle.
React All The Way Down
The entire tldraw canvas is built with React, creating a truly composable platform where any web content can become part of your drawing. YouTube videos, CodeSandbox editors, Figma files, even Excel spreadsheets - all embeddable and interactive.
It's just normal web stuff. It's like react all the way down.
Watch (00:01:46)
Universal Embedding
YouTube, Figma, CodeSandbox, Excel - any web content can be embedded directly into the canvas with full interactivity.
Recursive Embedding
tldraw can even embed itself - drawing inside tldraw inside tldraw, demonstrating the canvas's composability.
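Because every shape on the canvas is ultimately a React component, embedded web content can be modeled as ordinary shape data. A minimal sketch of that idea, using a simplified, hypothetical record shape (the real tldraw schema is richer):

```typescript
// Hypothetical, simplified shape record; the real tldraw shape schema differs.
type EmbedShape = {
  id: string;
  type: "embed";
  x: number;
  y: number;
  props: { url: string; w: number; h: number };
};

// An embed shape can point at any web content...
const youtube: EmbedShape = {
  id: "shape:yt1",
  type: "embed",
  x: 100,
  y: 100,
  props: { url: "https://www.youtube.com/embed/VIDEO_ID", w: 560, h: 315 },
};

// ...including another tldraw canvas, which is what makes
// recursive embedding (tldraw inside tldraw) possible.
const recursive: EmbedShape = {
  id: "shape:tldraw1",
  type: "embed",
  x: 700,
  y: 100,
  props: { url: "https://www.tldraw.com", w: 800, h: 600 },
};
```

Since an embed is just data plus a React renderer, the canvas doesn't need special cases per provider: anything with a URL can become a shape.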
The Paradigm Shift: LLMs as Execution Engines
In traditional programming, code executes deterministically. In tldraw.computer, language models become the execution engine, enabling "nonlinear thinking" that can handle ambiguity, context, and creative problem-solving.
Revolutionary Insight
"The execution here is not being done in code. The execution is being done by a language model. Language models are capable of this kind of like nonlinear thinking."
Watch (00:12:03)
Why This Changes Everything
This represents a paradigm shift from deterministic computation to probabilistic reasoning. When LLMs execute, they can handle fuzzy inputs, make creative leaps, infer context, and gracefully handle ambiguity. "Two plus octopus" doesn't crash the system - it infers octopus might have 8 legs and returns 10. This "soft computing" enables applications that traditional code cannot handle.
Self-Scripting Nodes That Execute
tldraw.computer uses a node-based visual programming interface where each node generates its own script, accepts inputs, produces outputs, and pipes data to the next node. The graph itself becomes a program executed by AI.
This graph is going to execute. Right now, the instruction is creating a script for itself. And then it just executed the script.
Watch (00:09:03)
Self-Scripting
Each node creates its own prompt script based on its role in the workflow.
Data Flow
Nodes accept inputs, process them via LLM, and pipe structured outputs to downstream nodes.
Multimodal
Text, images, audio, camera feeds - all flow seamlessly through the same graph.
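The pipeline described above can be sketched as a tiny interpreter that walks the graph in dependency order and hands each node's prompt to a model. Everything here is illustrative, not tldraw.computer's actual API; the model is injected as a function so the same graph could run against Gemini, GPT-4, or a stub:

```typescript
// A graph of nodes, each executed by a language model rather than by code.
// All names and shapes here are an illustrative sketch, not tldraw.computer's API.
type Model = (prompt: string) => Promise<string>;

interface GraphNode {
  id: string;
  instruction: string; // what this node should do, in natural language
  inputs: string[];    // ids of upstream nodes whose outputs feed this one
}

// Execute nodes in dependency order, piping each output downstream.
async function runGraph(nodes: GraphNode[], model: Model): Promise<Map<string, string>> {
  const outputs = new Map<string, string>();
  const pending = [...nodes];
  while (pending.length > 0) {
    // Pick any node whose inputs are all resolved (a simple topological walk).
    const ready = pending.findIndex(n => n.inputs.every(i => outputs.has(i)));
    if (ready === -1) throw new Error("cycle or missing input in graph");
    const [node] = pending.splice(ready, 1);
    // Each node "scripts itself": its prompt combines its own instruction
    // with the outputs of its upstream nodes.
    const context = node.inputs.map(i => outputs.get(i)).join("\n");
    outputs.set(node.id, await model(`${node.instruction}\n${context}`));
  }
  return outputs;
}
```

With a stub model in place of a real LLM, you can watch data flow node to node, which is also how a graph like this becomes testable and debuggable.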
Make Real: Draw It, Then Build It
Before tldraw.computer, there was "Make Real" - a feature that sends screenshots of hand-drawn wireframes to GPT-4 with vision and asks it to generate working React code. It was an early example of a tool that lets non-programmers turn a sketch into working software.
What if we could take the diagrams that we were drawing, the wireframes that we were drawing, and we could just kind of make them real?
Watch (00:03:47)
Iterative Bug Fixing
When the generated app has a bug, users can take a screenshot, annotate it directly on the canvas, and send it back to the model with the original source. The AI fixes the specific bug - a powerful workflow for visual debugging.
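That round trip - wireframe in, code out, annotated screenshot back in - can be sketched as request construction against an OpenAI-style vision chat API. The field names follow that API's format; treat the structure as an assumption about the approach, not tldraw's actual implementation:

```typescript
// Sketch of a "Make Real" style request: a wireframe screenshot, and
// optionally the previously generated code plus an annotated bug screenshot.
// Message shape follows the OpenAI-style vision chat format (an assumption).
type Part =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

function makeRealRequest(wireframePng: string, priorCode?: string, bugShotPng?: string) {
  const parts: Part[] = [
    { type: "text", text: "Turn this wireframe into a working React app." },
    { type: "image_url", image_url: { url: wireframePng } },
  ];
  // On a second pass, send the original source plus the annotated screenshot
  // so the model can fix the specific bug the user circled.
  if (priorCode && bugShotPng) {
    parts.push(
      { type: "text", text: `Here is the current code:\n${priorCode}\nFix the bug circled in this screenshot.` },
      { type: "image_url", image_url: { url: bugShotPng } },
    );
  }
  return { messages: [{ role: "user", content: parts }] };
}
```

The key design point is that the annotation lives on the canvas: the user debugs by drawing, and the drawing itself becomes model input.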
Watch bug fix demo (00:06:30)
Three-State Boolean: Yes, No, or Maybe
Traditional programming uses binary booleans (true/false). tldraw.computer embraces the probabilistic nature of LLMs with three-valued logic: yes, no, or maybe. This acknowledges uncertainty rather than forcing false precision.
So we have a boolean value of yes, no, or maybe.
Watch (00:14:06)
Embracing Ambiguity
Language models don't always return binary answers. Three-valued logic gracefully handles uncertainty, partial matches, and context-dependent responses.
Practical Application
Demo shows a pop song sorter that classifies songs into "good," "bad," or "maybe" - enabling nuanced recommendations that traditional boolean logic cannot express.
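Three-valued logic has a precise, well-studied shape (Kleene logic), and it composes: "maybe" propagates through combinators instead of being forced into true or false. A minimal sketch - the names and the routing helper are illustrative:

```typescript
// Kleene-style three-valued logic: "maybe" propagates uncertainty
// instead of forcing a binary answer. Names here are illustrative.
type Tril = "yes" | "no" | "maybe";

function and(a: Tril, b: Tril): Tril {
  if (a === "no" || b === "no") return "no"; // one definite no decides it
  if (a === "yes" && b === "yes") return "yes";
  return "maybe";                            // uncertainty propagates
}

function or(a: Tril, b: Tril): Tril {
  if (a === "yes" || b === "yes") return "yes"; // one definite yes decides it
  if (a === "no" && b === "no") return "no";
  return "maybe";
}

// A sorter like the pop-song demo can route on all three branches,
// sending uncertain cases to a "review" pile rather than misfiling them.
function route(verdict: Tril): "good" | "bad" | "review" {
  return verdict === "yes" ? "good" : verdict === "no" ? "bad" : "review";
}
```

Note the asymmetry: `and("no", "maybe")` is definitely `"no"`, but `and("yes", "maybe")` stays `"maybe"` - exactly the graceful handling of partial answers the demo relies on.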
Canvas as Collaborative Interface
Unlike AI image generators that paint pixels (Midjourney, DALL-E), tldraw returns structured shape data that becomes editable canvas elements. This creates a true shared workspace where humans and AI can both manipulate objects.
It's not painting pixels in the way that Midjourney would. It's doing it as text, kind of returning a structure that I can map into shapes on the canvas.
Watch (00:16:47)
Why This Distinction Matters
When AI returns structured shapes instead of raster images, humans can edit, refine, and iterate on AI-generated content. The "draw a cat" demo creates an editable vector cat that can be modified. Ask the AI to "make the cat blow out the candle," and it infers context and updates the drawing accordingly. This bidirectional collaboration is impossible with pixel-based generation.
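"Structure over pixels" amounts to the model replying with shape data that can be parsed and placed on the canvas. A hypothetical sketch, using a simplified stand-in for tldraw's real shape records:

```typescript
// The model returns text describing shapes, not pixels. Parsing that text
// yields editable canvas objects a human can keep modifying.
// This schema is a simplified stand-in for tldraw's real shape records.
type Shape =
  | { type: "ellipse"; x: number; y: number; w: number; h: number }
  | { type: "line"; points: [number, number][] };

function shapesFromModelReply(reply: string): Shape[] {
  // Assumes the model was prompted to answer with a JSON array of shapes;
  // anything outside the known shape types is dropped.
  const parsed = JSON.parse(reply) as Shape[];
  return parsed.filter(s => s.type === "ellipse" || s.type === "line");
}
```

Because the result is structure rather than raster, it can be edited by hand, re-serialized, and sent back to the model ("make the cat blow out the candle") - the bidirectional loop the section describes.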
The Killer Use Case
Turning Children's Drawings Into Stories
"The killer use case for this, if it's not immediately obvious, is turning your daughter's drawings and stuff into pictures and stories and piping them all around."
This humanizes the technology and shows how AI can enhance creativity and emotional connection. Take a child's drawing, pipe it through tldraw.computer to generate an illustration, add speech synthesis to narrate a story, and create a magical experience that bridges imagination and reality.
Watch (00:13:14)
Technical Architecture & Demos
Steve demonstrated multiple working systems built on tldraw's canvas architecture, developed in collaboration with Google for the Gemini 2 launch. Each demo showcases different capabilities of the platform.
Ad Generator Demo
A graph that takes "AI engineer conference" as input and generates: (1) commercial text script, (2) speech audio, (3) illustrated image, all flowing through connected nodes. Each node self-scripts and executes via LLM.
Watch (00:09:03)
Multimodal Arithmetic: Two Plus Octopus
Demonstrates nonlinear thinking: given inputs "2" and "octopus" (not a number), the LLM infers octopus might have 8 legs, performs 2 + 8, and returns 10. This context-aware reasoning is impossible with traditional code.
Watch (00:12:12)
Draw Fast: Real-Time Image Generation
Uses latent consistency models to generate images in near real-time. As you sketch, the generated image updates. Shows how drawing becomes a dynamic, iterative process with AI.
Watch (00:06:56)
Built With Gemini Flash
tldraw.computer runs on Gemini Flash, whose fast, multimodal processing enables the real-time demos shown throughout the presentation. Early access to Google's latest models made the project possible.
Key Takeaways
LLMs as Universal Computers
Language models can perform any computation through prompting, making them general-purpose execution engines for visual programming.
Visual Programming for AI
Graph-based workflows make AI composable, debuggable, and accessible to non-technical users without sacrificing power.
Structure Over Pixels
Generating editable shapes instead of raster images enables true human-AI collaboration and iterative refinement.
Embracing Ambiguity
Three-valued logic (yes/no/maybe) acknowledges the probabilistic nature of LLMs and enables nuanced applications.
Most Revolutionary Insight: The idea that language models themselves become the execution engine fundamentally reimagines what "programming" means in the age of AI. We're moving from deterministic computation to probabilistic reasoning, from code to intent, from syntax to semantics.
"I think we're only just scratching the surface of what can be done with this paradigm." — Steve Ruiz (00:18:26)