Use Cloudflare As Your AI's Personal Computer With 1 Click. A Manual For Building Your Internet Of Compute.
That's right, one click in the SeqPU notebook provisions a complete private Cloudflare Computer in three seconds: Sandbox, KV, R2, D1, Vectorize, Workers AI, Browser, Email, Tunnel, all bound to a per-user Worker under your keychain.
You write code. Two buttons make it fire. Hit Schedule and it runs on cron. Hit Publish and it becomes a headless API, a UI site, or a bot. Same code, your choice of surface. Need a GPU, a dedicated CPU run, or your local rig? Publish that work as its own tool on the hardware it needs and call it from your Cloudflare computer. That is the chaining pattern.
1. Quick Start
- Go to SeqPU.com. Sign up with Google or email. You get free SeqPU credits on signup — no credit card.
- Open a notebook. Click Cloudflare ⚡ in the hardware panel.
- Three seconds later, your private edge stack is provisioned.
- Write code in a notebook cell — or open Claude Desktop with the SeqPU MCP server enabled and describe what you want built.
- Hit Schedule (cron) or Publish (headless API / UI / bot). Or both.
2. What You Got — Your Cloudflare Computer
The Cloudflare ⚡ button provisions, per-user, idempotently:
- A per-user customer Worker in a Cloudflare Workers for Platforms dispatch namespace. Serves your published tools at tools.seqpu.com/u/{user-id}/{tool-id}. Isolated to you.
- A private Workers KV namespace for fast key-value state.
- A private R2 bucket with optional jurisdiction lock — EU, FedRAMP, or FedRAMP-high — chosen once at provisioning, permanent.
- A private D1 SQLite database with full SQL, transactions, indices, 5 GB free.
- Three private Vectorize indexes at 1,536 / 1,024 / 768 dimensions. The SDK auto-dispatches by vector length — OpenAI text-embedding-3-small (1,536-dim, your key) goes to the 1,536 index, Workers AI bge-large / bge-m3 / qwen3-embedding (1,024-dim, free) to the 1,024 index, bge-base / embeddinggemma (768-dim, free) to the 768 index. Same seqpu.vectors.add(id, vec) call — you never pick the index.
- A private Durable Object for stateful coordination and schedule alarms.
- Account-level bindings for Workers AI, Browser Rendering, Email Routing, and Cloudflare Tunnel.
- A keychain — secrets injected as plaintext env bindings at sandbox spawn, scoped per script, never visible to any LLM that calls the script.
- A hardware-isolated sandbox with Python 3.11, Node.js 22, and Bun — ready to import numpy, pandas, matplotlib, requests, httpx, openai, anthropic, beautifulsoup4, pyyaml, and python-dotenv, plus git, curl, and jq in shell.
You never touch the Cloudflare dashboard, run wrangler init, mint an API token, or write a config file. The button moves from Setting up edge instance… to Edge ready. That's the entire infra step.
3. The SDK — How Your Code Talks To Your Cloudflare Computer
Inside the Sandbox, import seqpu exposes the back-channel to your Worker's bindings. The script writes the call. SeqPU resolves your user from the service token and routes to your customer Worker. The LLM never sees the credentials.
import seqpu
# Object storage on your R2 bucket
seqpu.storage.put("notes/2026-05-13.md", text, content_type="text/markdown")
text = seqpu.storage.get("notes/2026-05-13.md").text
url = seqpu.storage.url("notes/2026-05-13.md", ttl=3600)
# Semantic search on your Vectorize index
seqpu.vectors.add("doc-1", embedding, metadata={"title": "..."})
hits = seqpu.vectors.query(query_embedding, top_k=5)
# Workers AI inference on Cloudflare's edge
response = seqpu.models.complete(
prompt="Summarize this in three sentences.",
model="llama-3.1-8b", # SeqPU shorthand; @cf/meta/llama-3-8b-instruct also works
)
# Manage Cloudflare Tunnels (then call them via standard requests)
tunnel = seqpu.tunnels.create("home-server")
seqpu.tunnels.list()
# Queue async work to another script
seqpu.queue.publish("downstream-queue", {"job_id": 1234})
Keychain secrets are available as environment variables: os.environ["ANTHROPIC_API_KEY"], scoped per script bundle, only for the secrets you have authorized. Add OPENAI_API_KEY to use the default 1,536-dim seqpu.models.embed() (pass-through, no SeqPU markup); skip it and pass model="bge-base" for free 768-dim Workers AI embeddings.
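A minimal sketch of the embed-and-dispatch flow, using only the SDK calls documented in this section (the text and metadata values are illustrative):
import os
import seqpu

# Free Workers AI embedding: bge-base returns a 768-dim vector,
# so vectors.add routes it to the 768 index automatically.
vec_768 = seqpu.models.embed("edge-native semantic search", model="bge-base")
seqpu.vectors.add("note-1", vec_768, metadata={"source": "workers-ai"})

# With OPENAI_API_KEY in the keychain, the default embed() is
# text-embedding-3-small: 1,536-dim, so it lands in the 1,536 index.
if "OPENAI_API_KEY" in os.environ:
    vec_1536 = seqpu.models.embed("edge-native semantic search")
    seqpu.vectors.add("note-2", vec_1536, metadata={"source": "openai"})

# Queries dispatch the same way: the index is picked by the query vector's length.
hits = seqpu.vectors.query(vec_768, top_k=3)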
4. The Two Paths — Schedule And Publish
You write code in a notebook cell. You then have two buttons:
Schedule fires on cron:
- Morning operators
- Hourly polls
- Every-15-min monitors
- Daily reports
Publish turns the script into a callable surface:
- API Endpoint (REST + MCP)
- UI site (HTML template)
- Bot (Telegram today; Discord/Slack soon)
The same script can be both scheduled AND published. One file, two ways to fire. Every surface you add is one more click.
5. Path A — Scheduling Jobs
How
- Write your script in the notebook.
- Pick the hardware tier in the GPU selector (defaults to Cloudflare edge).
- Save any required secrets to your keychain.
- Hit Schedule. Pick a preset or enter custom cron.
- Pick your timezone (IANA name — America/Los_Angeles, Europe/London, etc. DST handled automatically).
- Hit Save. Your agent is now firing on schedule.
Preset frequencies in the schedule modal
Presets cover the common cases — every 5 minutes through weekly on a specific day. Custom takes any cron expression you want.
Cron cheat sheet
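Fields, left to right: minute, hour, day of month, month, day of week. A few standard expressions worth keeping handy (all fire in your schedule's timezone):
| Expression | Fires |
|---|---|
| */5 * * * * | Every 5 minutes |
| 0 * * * * | Hourly, on the hour |
| 0 7 * * 1-5 | Weekdays at 7 AM |
| */15 9-18 * * 1-5 | Every 15 minutes, 9 AM to 6 PM, weekdays |
| 0 10 * * * | Daily at 10 AM |
| 0 9 * * 1 | Mondays at 9 AM |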
What runs
Each fire spawns your Sandbox on the hardware tier you selected, runs your code to completion, writes results, and tears down. Transient failures retry inside the Durable Object alarm wrapper so they do not silently die.
Cost
Default Cloudflare edge schedule: active-CPU billed, a typical fire costs around $0.00002. A daily agent: 30 fires/mo × $0.00002 = $0.0006/month in compute. Schedules on Modal hardware (dedicated CPU $0.05/hr, GPUs up to B200 $6.25/hr) bill at hardware-tier × wall-clock seconds. See Section 11 for the full rate sheet.
6. Path B — Publish (Headless API / UI / Bot)
Publishing turns your script into a callable surface. Hit Publish. Pick a mode:
- API Endpoint (REST + MCP)
- UI site (HTML template)
- Bot (Telegram today; Discord/Slack soon)
The Inputs & Outputs Contract
Inputs: When someone calls your API, SeqPU takes the JSON body and injects each field as a variable in your script. POST {"prompt": "a cat"} → your script gets a variable called prompt with value "a cat". Or use INPUTS.get("name") as a dict fallback.
Outputs: Your script writes results to /outputs/. SeqPU reads those files and returns them in the API response.
Your Script
# prompt and style are auto-injected from the API call body
result = generate(prompt, style=style)
with open("/outputs/response.txt", "w") as f:
    f.write(result)
How Callers Invoke It
curl -X POST "https://api.seqpu.com/v1/tools/execute" \
-H "CF-Access-Client-Id: YOUR_CLIENT_ID" \
-H "CF-Access-Client-Secret: YOUR_CLIENT_SECRET" \
-H "Content-Type: application/json" \
-d '{
"toolId": "your-tool-id",
"inputs": {"prompt": "a cat", "style": "anime"}
  }'
The Response
{
"success": true,
"jobId": "abc-123",
"status": "completed",
"outputs": {
"response.txt": {
"url": "https://seqpu...modal.run?...",
"contentType": "text/plain"
}
},
"executionTime": 12.5,
"cost": 0.0045
}
MCP comes for free. Publishing as an API endpoint automatically exposes the tool to MCP-aware clients (Claude Desktop, Cursor, Codex, OpenAI's MCP client). The tool name, description, and input schema you set become the MCP tool spec.
The HTML Attribute Contract
- data-seqpu-input="id" — marks an input element. Its value is auto-injected as a variable named id in your script.
- data-seqpu-output="id" — marks an output element. Reads /outputs/{id}.{ext}.
- data-seqpu-type="image|video|audio|file|json" — tells the UI how to render the output.
- data-seqpu-generate — the button that triggers your script. User clicks → inputs collected → script runs → outputs rendered.
HTML Template
<input type="text" data-seqpu-input="prompt" placeholder="Describe an image..." /> <button data-seqpu-generate>Generate</button> <img data-seqpu-output="result" data-seqpu-type="image" />
Your Script
# prompt is auto-injected from the input above
image = pipe(prompt).images[0]
image.save("/outputs/result.png")
Flow: User clicks Generate → prompt sent to your selected hardware → script runs → result.png appears in the <img>.
The Bot Contract
Your script needs:
- task — auto-injected variable containing the user's message text.
- seqpu.notify(message, chat_id=telegram_chat_id) — the call that sends the response back.
Auto-injected per fire:
- telegram_chat_id — the chat to reply to. Pass this to seqpu.notify().
- context — conversation history as JSON. Parse with json.loads(context) for memory across turns.
Your Script
import json
import anthropic
# task and telegram_chat_id are auto-injected
history = json.loads(context) if context else []
history.append({"role": "user", "content": task})
client = anthropic.Anthropic() # uses ANTHROPIC_API_KEY from keychain
response = client.messages.create(
model="claude-haiku-4-5",
max_tokens=1024,
messages=history,
)
reply = response.content[0].text
seqpu.notify(reply, chat_id=telegram_chat_id)
Set up the bot by saving the platform token to your keychain (TELEGRAM_BOT_TOKEN, DISCORD_BOT_TOKEN, SLACK_WEBHOOK_URL) and pointing the platform's webhook at SeqPU.
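For Telegram, pointing the webhook at SeqPU is one call to the Bot API's setWebhook method. A minimal sketch; the webhook URL shape shown here is an assumption, so copy the real one from the Publish dialog:
import os
import requests

token = os.environ["TELEGRAM_BOT_TOKEN"]  # from your keychain

# Hypothetical URL shape; use the exact webhook URL Publish gives you.
webhook_url = "https://tools.seqpu.com/u/YOUR_USER_ID/YOUR_TOOL_ID/webhook"

resp = requests.post(
    f"https://api.telegram.org/bot{token}/setWebhook",
    json={"url": webhook_url},
)
print(resp.json())  # {"ok": true, ...} once Telegram accepts it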
7. Chaining — Publish A Tool On The Hardware It Needs, Call It From Cloudflare
When the work outgrows the edge — a GPU for transcription, an H100 for fine-tuning, a frontier API call — you publish the heavy piece as its own tool on the hardware it needs, then call that tool from your Cloudflare script.
The chaining pattern in 4 steps
- Write the heavy piece as its own script (e.g. a Whisper transcription script).
- Pick the right hardware in the GPU selector (e.g. T4).
- Publish it as an API Endpoint. You get back a tool ID and a callable URL.
- Call it from your Cloudflare orchestrator script via HTTP POST (or via the SDK). The orchestrator stays on the cheap edge. The heavy work runs on Modal hardware only for the seconds it needs.
From your Cloudflare orchestrator:
import requests
import os
# Auth: SEQPU_CLIENT_ID is auto-injected at sandbox spawn.
# SEQPU_CLIENT_SECRET is something you mint once (Settings → Service Tokens)
# and store in your keychain — it does NOT auto-inject.
response = requests.post(
"https://api.seqpu.com/v1/tools/execute",
headers={
"CF-Access-Client-Id": os.environ["SEQPU_CLIENT_ID"],
"CF-Access-Client-Secret": os.environ["SEQPU_CLIENT_SECRET"],
"Content-Type": "application/json",
},
json={
"toolId": "whisper-transcribe",
"inputs": {"audio_url": audio_url},
},
)
result = response.json()
transcription_url = result["outputs"]["transcription.txt"]["url"]
Same pattern for every engine. You compose your own internet of compute by publishing what you need, on the hardware you need, and calling it from your Cloudflare computer.
8. The Engines You Chain To
Engine 1: Workers AI inference inside your edge fire.
response = seqpu.models.complete(
prompt="Summarize this email thread in three sentences.",
model="llama-3.1-8b",
)
# Free Workers AI embeddings (768-dim or 1024-dim, bundled in edge fire cost):
embedding = seqpu.models.embed("text to embed", model="bge-base")
# Default seqpu.models.embed() uses OpenAI text-embedding-3-small (1536-dim,
# pass-through to your OPENAI_API_KEY) — see Engine 2.
Engine 2: Frontier LLM APIs, called with keys from your keychain.
import anthropic
import os
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
response = client.messages.create(
model="claude-sonnet-4",
max_tokens=2048,
messages=[{"role": "user", "content": diff_text}],
)
result = response.content[0].text
API keys live in your keychain, injected at spawn. The LLM that called your script never sees them. These calls run inside your edge fire as a network roundtrip — the script idles on await while the API responds, and the edge bill stops during the wait. The provider tokens are a separate per-token charge against your credits.
Engine 3: Modal GPU and dedicated-CPU tools you published (the chaining pattern from Section 7).
# In your Cloudflare orchestrator:
response = requests.post(
"https://api.seqpu.com/v1/tools/execute",
headers={...}, # your service token
json={"toolId": "image-gen-flux", "inputs": {"prompt": prompt}},
)
image_url = response.json()["outputs"]["result.png"]["url"]
Hardware tiers: dedicated CPU ($0.05/hr), T4 ($0.59/hr), L4 ($0.80), A10G ($1.10), L40S ($1.95), A100 40GB ($2.10), A100 80GB ($2.50), H100 ($3.95), H200 ($4.54), B200 ($6.25). Spins up, your tool runs, spins down. Pay only for the seconds it was alive.
Engine 4: Your own hardware, reached over Cloudflare Tunnel.
import requests
# Manage tunnels from your customer Worker
tunnel = seqpu.tunnels.create("home-server")
print(tunnel["installCommand"]) # paste on your server — only manual step
seqpu.tunnels.list()
seqpu.tunnels.status(tunnel["tunnelId"])
# Then call your local rig over the tunnel using standard HTTP
result = requests.post(
f"https://{tunnel['publicHostname']}/api/query",
json={"query": "SELECT * FROM customers WHERE health_score < 0.4"},
).json()
Your data stays on your hardware. The tunnel reaches it from inside the Worker without exposing it to the public internet. You don't even leave your script — seqpu.tunnels.create() returns the exact cloudflared install command to paste on your server. That's the only manual step.
Engine 5: Your other SeqPU tools, called synchronously or queued.
# Synchronous tool call
response = requests.post("https://api.seqpu.com/v1/tools/execute",
headers={...}, json={"toolId": "your-other-tool", "inputs": {...}})
# Or async via queue
seqpu.queue.publish("downstream-queue", {"job": "process_result"})
9. Four Worked Examples
Example 1: Morning Operator (overnight email digest to Slack).
Build: Click Cloudflare ⚡. 40 lines: Gmail API, seqpu.models.complete() (Llama 8B handles the summarization in the same fire), Slack POST. Hit Schedule → "Weekdays at 7 AM" or 0 7 * * 1-5. Done.
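A condensed sketch of that shape. The mail fetch is elided (any Gmail client works) and the helper name is hypothetical; SLACK_WEBHOOK_URL comes from your keychain, and seqpu.models.complete() is used as documented in Section 3:
import os
import requests
import seqpu

# Overnight email bodies, fetched however you like (Gmail API, IMAP, ...).
emails = fetch_overnight_emails()  # hypothetical helper: bring your own

# Llama 8B on Workers AI summarizes inside the same edge fire.
summary = seqpu.models.complete(
    prompt="Summarize these emails as a morning briefing:\n\n" + "\n---\n".join(emails),
    model="llama-3.1-8b",
)

# Slack incoming webhook: a POST with a JSON "text" field.
requests.post(os.environ["SLACK_WEBHOOK_URL"], json={"text": summary})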
Example 2: PR Coach (Claude-scored checks every 15 minutes during work hours).
Build: Same notebook. Add an Anthropic API call (it's an HTTP roundtrip from inside the edge sandbox — script idles on await while Claude responds, active-CPU billing stops during the wait). Hit Schedule → Custom → */15 9-18 * * 1-5. Or ask Claude through MCP to build it.
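A sketch of the poll-and-score loop. The repo path, the GITHUB_TOKEN keychain entry, and the URGENT heuristic are illustrative; the GitHub pulls endpoint and the Anthropic call are standard:
import os
import anthropic
import requests

# Open PRs via the GitHub REST API (GITHUB_TOKEN saved in your keychain).
prs = requests.get(
    "https://api.github.com/repos/YOUR_ORG/YOUR_REPO/pulls",
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
).json()

client = anthropic.Anthropic()  # ANTHROPIC_API_KEY from keychain
for pr in prs:
    verdict = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=256,
        messages=[{
            "role": "user",
            "content": f"Score this PR for quality and security. "
                       f"Start with URGENT if action is needed.\n\n"
                       f"{pr['title']}\n{pr.get('body') or ''}",
        }],
    ).content[0].text
    if verdict.startswith("URGENT"):
        requests.post(os.environ["SLACK_WEBHOOK_URL"],
                      json={"text": f"{pr['html_url']}: {verdict}"})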
Example 3: Telegram voice-note transcriber (bot on the edge, Whisper on a T4).
Build: Whisper needs a GPU only for the 2 seconds it runs; everything else stays on the edge. Three steps:
- Write a Whisper script. GPU selector → T4. Publish as API Endpoint. Inputs: audio_url. Outputs: transcription.txt. Save the tool ID.
- Write the Telegram bot script on Cloudflare edge. It downloads the voice, uploads to R2, POSTs to your Whisper tool, calls seqpu.notify() with the result.
- Publish the bot script as Bot → Telegram. Point your BotFather webhook at the SeqPU URL.
Example 4: Customer Health (on-prem database, edge scoring, Claude narrative).
Build: One script, the engines that fit each job. seqpu.tunnels.create("on-prem-db") + requests.post() for local MySQL through Tunnel. seqpu.models.complete() for bulk Workers AI scoring (free with the edge fire). anthropic.messages.create() for the narrative on top-50. Hit Schedule → "Daily at 10 AM" (or 0 10 * * *). Done.
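A sketch of the multi-engine composition. The tunnel name and SQL query come from this manual's own tunnel example; the one-line scoring prompt and top-50 selection are illustrative:
import anthropic
import requests
import seqpu

# Engine 4: reach the on-prem MySQL through the tunnel.
tunnel = seqpu.tunnels.create("on-prem-db")
rows = requests.post(
    f"https://{tunnel['publicHostname']}/api/query",
    json={"query": "SELECT * FROM customers WHERE health_score < 0.4"},
).json()

# Engine 1: bulk Workers AI scoring, free with the edge fire.
scored = [
    (row, seqpu.models.complete(prompt=f"One-line churn-risk note: {row}",
                                model="llama-3.1-8b"))
    for row in rows
]

# Engine 2: Claude writes the narrative over the first 50 (selection illustrative).
client = anthropic.Anthropic()  # ANTHROPIC_API_KEY from keychain
report = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=2048,
    messages=[{"role": "user", "content": f"Customer health report:\n{scored[:50]}"}],
).content[0].text

# Persist to your R2 bucket.
seqpu.storage.put("reports/customer-health.md", report, content_type="text/markdown")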
10. MCP Autonomy — You Describe The Outcome. Claude Builds It.
You don't have to write code yourself. With the SeqPU MCP server enabled in Claude Desktop, you describe the agent in plain English. Claude has 13 platform-level tools plus one auto-generated seqpu_run_on_{tier} tool per hardware tier — 25-plus tools total at runtime:
| Tool | What It Does |
|---|---|
| seqpu_run · seqpu_run_on_* | Execute code on default tier or any named hardware tier |
| seqpu_tools_list / _call | Discover your published tools and invoke them |
| seqpu_publish_as_api / _ui / _bot | Publish current code as tool, UI, or bot |
| seqpu_schedule_create / _list / _cancel | Manage scheduled jobs |
| seqpu_credits_balance / _topup | Credit visibility & Stripe top-up |
| seqpu_keychain_list / _put | Manage your secrets |
You type:
Watch my GitHub PRs every 30 minutes during weekday work hours. Score each one for quality and security. Slack me if anything is urgent.
Claude runs autonomously: lists keychain, asks for missing secrets, writes the script, publishes it, schedules the cron, replies with the audit link and cost estimate. You never opened a terminal.
11. Pricing
Nothing on this stack is free. Every byte of memory, every active CPU-second, every GPU-second, every storage operation, every API roundtrip is metered. SeqPU pays the bills behind the scenes: Cloudflare for the edge, Modal for dedicated CPU and GPU tiers, and every provider we add next. You see one credit balance. Scale is our problem. Your project is yours.
Layer 1: Cloudflare Edge
Always on. Active-CPU billed. You pay only when the CPU is actually computing, not when the script is idling on an await or waiting on a network response.
- A typical edge fire (script wakes, hits an API, processes, writes, sleeps) costs around $0.00002 from your credits.
- Idle is free.
- Workers AI inference and your Anthropic / OpenAI / Mistral API calls all happen FROM the edge. The script idles on await during network roundtrips; the edge bill stops during the wait.
- Provider tokens (Anthropic, OpenAI, Mistral) bill per-token against your credits, separate from the edge fire cost.
What the $0.00002 actually pays for: Cloudflare's underlying meters
Nothing here is invented. Every Cloudflare resource is metered at Cloudflare's published rates (May 2026). SeqPU pays those bills behind the scenes; you see one number. Here is the breakdown for a typical edge fire on the basic Container instance (1/4 vCPU, 1 GiB RAM, 4 GB disk):
| Cloudflare resource | Published rate | Per fire | Cost |
|---|---|---|---|
| Workers for Platforms request | $0.30 / million | 1 | $0.0000003 |
| Worker CPU time | $0.02 / million ms | ~10 ms | $0.0000002 |
| Container vCPU (active) | $0.000020 / vCPU-sec | ~1 vCPU-sec | $0.0000050 |
| Container memory (basic = 1 GiB) | $0.0000025 / GiB-sec | ~3 GiB-sec | $0.0000075 |
| Container disk (basic = 4 GB) | $0.00000007 / GB-sec | ~12 GB-sec | $0.0000008 |
| Durable Object request (alarm wrapper) | $0.15 / million | 1 | $0.0000002 |
| Durable Object duration | $12.50 / million GB-s | ~0.0004 GB-s | $0.0000050 |
| Subtotal per fire | – | – | ~$0.000019 |
A typical fire lands around $0.00002. Heavier fires (longer wall-clock, more memory time, more storage ops) cost proportionally more.
Layer 2: Modal (heavy compute, by the second)
For workloads that outgrow the edge: voice transcription, image gen, fine-tuning, heavy data processing, anything that needs a dedicated CPU or a GPU. Publish the heavy work as its own tool on the right Modal tier; call it from your Cloudflare edge computer.
| Hardware | Rate | VRAM | What It's For |
|---|---|---|---|
| Dedicated CPU | $0.05/hr | – | Heavy data crunching, long CPU work that doesn't fit the edge |
| T4 | $0.59/hr | 16 GB | Whisper, small-model inference |
| L4 | $0.80/hr | 24 GB | Balanced. Image gen, medium inference |
| A10G | $1.10/hr | 24 GB | Faster than L4 for compute-bound work |
| L40S | $1.95/hr | 48 GB | Large image gen, 13B to 30B inference |
| A100 40GB | $2.10/hr | 40 GB | Pro. Fine-tuning, large inference |
| A100 80GB | $2.50/hr | 80 GB | Pro+. 70B inference, fine-tuning |
| H100 | $3.95/hr | 80 GB | Flagship. Large-batch training and inference |
| H200 | $4.54/hr | 141 GB | Next-gen. The biggest single-GPU workloads |
| B200 | $6.25/hr | 192 GB | Ultimate. Only when the work actually needs it |
Billed per second wall-clock. Spin up, run, spin down. Pay only for the seconds it was alive. Idle is free.
Examples. A 2-second T4 fire: $0.59 / 3600 × 2 = $0.00033. A 10-minute dedicated Modal CPU run: $0.05 / 60 × 10 = $0.0083. Whether the work hit Cloudflare edge or a dual B200 on Modal, SeqPU paid the underlying provider bill from one invoice and debited your credits one-for-one.
Three-agent personal ops department, rolled up
All three run on Cloudflare edge. Compute is a rounding error; the bill is mostly Claude API tokens.
| Script | Cadence | Layer | Compute | LLM API | Total |
|---|---|---|---|---|---|
| Morning Operator | Weekdays 7 AM | Edge | $0.00044 | $0.40 | $0.40 |
| PR Coach | 15 min, work hrs | Edge | $0.0144 | $3.60 | $3.61 |
| Customer Health | Daily 10 AM | Edge | $0.0006 | $0.66 | $0.66 |
| Total | – | – | $0.015 | $4.66 | $4.68/mo |
Three autonomous agents, 24/7, for under $5/month. Annual: about $56.
Publishing tools for others: marketplace markup
When you publish a tool, you can set a markup percentage on the base compute cost (capped at 30%). End users who call your tool pay base + markup; you earn the markup per run.
Example: a Flux image-gen tool on L4, expected runtime 8 seconds.
- Base compute cost: $0.80/hr × 8 sec / 3600 = $0.0018
- Your markup: 30% (the cap), so +$0.00054
- End user pays: $0.00234 per run
- You earn: $0.00054 per run
Set the markup to 0% if you're publishing tools for yourself; set it up to 30% to monetize.
The Cloudflare computer is the orchestrator. Schedule fires it on cron. Publish makes it callable as a headless API, a UI, or a bot. From there it reaches Workers AI, LLM APIs, Modal tools you published on the dedicated CPU or GPU they need, your local rigs — every engine paying only for the seconds it runs.
We bring the fly swatter, not the bazooka — and the bazooka when the bazooka is what the work actually needs, only for the seconds it runs.
SeqPU pays Cloudflare, Modal, and every provider we add next, behind the scenes. You buy credits and focus on your project. We handle the scale.
The cloud computer is for the AI. The home computer stays yours.
References
Cloudflare Containers pricing · Workers for Platforms · KV · R2 · D1 · Vectorize · Workers AI · Cloudflare Tunnel · SeqPU — Encapsulated Agentic Architecture · SeqPU Docs