Running Your Own IPFS Gateway

Tags: ipfs, kubo, nginx, self-hosting, homelab

In Hello, IPFS I mentioned, almost in passing, that this site is “pinned on my own Kubo gateway node.” That post was the what: flat files, content-addressed, resolved through ENS. This one is the how — the node that actually serves and pins the bytes.

A personal site that doesn’t depend on a single host still needs a host to pin and serve it. Mine runs on my own hardware behind a reverse proxy, and the config below is the real shape of it (with my internal names and addresses swapped out). The lessons are the interesting part anyway, and the best one cost me an afternoon.

The chain: Caddy → nginx → Kubo

Three pieces in the chain, each doing one job, with a firewall and internal DNS in front of them:

internet/LAN ──> firewall ──> Caddy ──> nginx ──> Kubo
                 (LAN only)   (TLS,     (routing,  (IPFS host)
                              wildcard   access     API :5001
                              cert)      control)   gateway :8080
  • A firewall sits at the perimeter. The gateway is a LAN-only service, so inbound traffic to it from the internet is dropped at the edge — none of what follows is reachable from outside the network in the first place.
  • Internal DNS is what makes the names work. The gateway hostname (and the per-CID origins below) resolve to the proxy only on the LAN — queries never leave the network, and there are no public DNS records pointing at any of this. It’s also where .eth resolution hooks in, which I’ll get to.
  • Caddy sits at the edge and terminates TLS. It holds a wildcard certificate from Let’s Encrypt via a DNS-01 challenge, so every subdomain — including the gateway and the per-CID origins below — is covered by one cert. A guard rejects anything that isn’t coming from the LAN.
  • nginx handles routing and a second, independent layer of access control.
  • Kubo — the reference IPFS implementation — runs on a separate host. This is where the DAG actually lives.

Running Kubo

The node itself is a dozen lines of Docker Compose:

services:
  ipfs:
    image: ipfs/kubo:latest
    container_name: ipfs
    restart: unless-stopped
    environment:
      IPFS_PROFILE: server,pebbleds
    ports:
      - 4001:4001/tcp       # swarm (TCP)
      - 4001:4001/udp       # swarm (UDP/QUIC)
      - 5001:5001           # API
      - 8080:8080           # gateway
    volumes:
      - ipfs_data:/data/ipfs

volumes:
  ipfs_data: {}

A few things worth pointing at:

  • IPFS_PROFILE: server,pebbleds. The server profile disables local network discovery (mDNS) and NAT port-mapping, which is correct for a box in a rack that isn’t trying to find peers on the LAN. pebbleds switches the datastore to PebbleDB.
  • 4001 is the swarm port (how the node talks to other IPFS peers), exposed for both TCP and QUIC. 5001 is the admin API and 8080 the gateway — both reachable only from the LAN, never the internet (see below).

Compose gets the daemon running; the configuration is applied separately and declaratively, so a rebuild always lands in the same state. The interesting bits:

# Open CORS on the API and gateway (the API is locked down at the network
# layer instead — see the security section).
ipfs config --json API.HTTPHeaders.Access-Control-Allow-Origin '["*"]'
ipfs config --json Gateway.HTTPHeaders.Access-Control-Allow-Origin '["*"]'

# Subdomain gateway for the public hostname.
ipfs config --json Gateway.PublicGateways '{
  "gateway.example.com": {
    "UseSubdomains": true,
    "Paths": ["/ipfs", "/ipns"],
    "NoDNSLink": false
  }
}'

# Faster provider lookups.
ipfs config --json Routing.AcceleratedDHTClient true

# Let the node resolve .eth names itself, via a DoH endpoint.
ipfs config --json DNS.Resolvers '{"eth.": "https://your-ens-resolver.example/dns-query"}'

That last one is my favorite, and it deserves more than a passing mention.

ENS names live on Ethereum, not in DNS. A .eth name has no authoritative nameserver, so a stock resolver has no idea what to do with mysticryuujin.eth — it’ll just NXDOMAIN. To bridge that gap I run my own DNS-over-HTTPS resolver for the eth. zone, backed by an Ethereum node. When a query for a .eth name arrives, it reads that name straight out of the ENS registry on-chain — the name’s resolver contract, its contenthash, any DNS records it publishes — and answers as if it were serving an ordinary zone. (This is the same bridge the public eth.limo service provides; I just run my own so the lookups never leave the network.)

Two places consume it. Kubo’s DNS.Resolvers points the eth. zone at that endpoint, so the node resolves ENS itself — ipfs name resolve /ipns/mysticryuujin.eth works directly, no public gateway in the loop. And the internal DNS resolver forwards the eth. zone to the same place, so every machine on the LAN can browse .eth names natively. The whole stack speaks Ethereum naming without anything special in the application layer.
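For context, Kubo resolves /ipns/<dns-name> paths through DNSLink: it looks up a TXT record at _dnslink.<name> and expects a value of the form dnslink=/ipfs/<cid> or dnslink=/ipns/<key>. The sketch below shows only that parsing step in POSIX shell. It assumes the ENS bridge answers with a DNSLink-style TXT record, which is an assumption about the bridge rather than something stated above. parse_dnslink is a hypothetical helper, not part of Kubo.

```shell
# parse_dnslink: hypothetical helper. Extracts the content path from a
# DNSLink TXT value and fails on anything that is not a DNSLink record.
parse_dnslink() {
  txt=$1
  # dig prints TXT data wrapped in double quotes; strip them if present.
  txt=${txt#\"}
  txt=${txt%\"}
  case "$txt" in
    dnslink=/ipfs/?*|dnslink=/ipns/?*)
      printf '%s\n' "${txt#dnslink=}"
      ;;
    *)
      return 1
      ;;
  esac
}
```

On the LAN, the input could come from something like dig +short TXT _dnslink.mysticryuujin.eth, assuming the internal resolver forwards the eth. zone as described.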

Two nginx server blocks, on purpose

Here’s where it gets interesting. There are two nginx server blocks, and the split matters for both correctness and security.

# --- Apex: IPFS admin API + path-style gateway ---
server {
  listen 80;
  server_name gateway.example.com;

  # ... real-IP recovery + private-IP allowlist (see security section) ...

  proxy_set_header Host $http_host;
  proxy_read_timeout 600;
  client_max_body_size 0;

  # Kubo HTTP API (port 5001) — full node control.
  location /api {
    proxy_http_version 1.1;
    proxy_buffering off;
    proxy_request_buffering off;
    proxy_pass http://ipfs-host:5001;
  }

  # Path-style gateway; Kubo 301-redirects /ipfs/<cid> to the subdomain origin.
  location / {
    proxy_pass http://ipfs-host:8080;
  }
}

# --- Subdomain origins: per-CID / per-IPNS content (origin-isolated) ---
server {
  listen 80;
  server_name ~^.+\.ipfs\.gateway\.example\.com$
              ~^.+\.ipns\.gateway\.example\.com$;

  # ... same real-IP + allowlist ...

  # Pass the original Host so Kubo's subdomain routing resolves the CID/IPNS key.
  proxy_set_header Host $http_host;

  location / {
    proxy_pass http://ipfs-host:8080;
  }
}

The apex block exposes two things: the admin API at /api (proxied to Kubo’s :5001), and the path-style gateway at / (proxied to :8080).

The wildcard block matches <anything>.ipfs.gateway.example.com (and the .ipns variant) and forwards only / to the gateway. This is origin isolation: each CID gets served from its own subdomain origin, so the browser’s same-origin policy keeps one piece of content from poking at another’s localStorage, cookies, or service workers. It’s the whole reason subdomain gateways exist.

Why is /api only on the apex? Because if the wildcard block also exposed it, then https://<cid>.ipfs.gateway.example.com/api would hit the Kubo admin API instead of serving that CID’s /api path. You’d be one URL away from letting arbitrary content reach your node’s control plane. Keep the admin API on exactly one name.

End to end, a path request flows like this: you hit gateway.example.com/ipfs/<cid>, Kubo 301-redirects you to <cid>.ipfs.gateway.example.com, the wildcard cert already covers that name, nginx forwards it with the original Host header intact, and Kubo reads the CID straight out of the hostname.
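The hostname Kubo redirects to can be sketched as a small shell function. A DNS label is case-insensitive and limited to 63 characters, so the CID in the subdomain has to be a lowercase encoding such as base32 CIDv1. subdomain_origin is a hypothetical helper for illustration, and the CID strings used with it are placeholders; it only builds the URL and does not contact the gateway.

```shell
# subdomain_origin: hypothetical helper. Builds the per-CID origin URL
# for a subdomain gateway host, e.g. https://<cid>.ipfs.<host>/
subdomain_origin() {
  cid=$1
  host=$2
  lower=$(printf '%s' "$cid" | tr '[:upper:]' '[:lower:]')
  # DNS is case-insensitive, so a mixed-case CID (e.g. base58 CIDv0)
  # cannot survive as a hostname label.
  if [ -z "$cid" ] || [ "$cid" != "$lower" ]; then
    echo "CID must be a lowercase encoding such as base32 CIDv1" >&2
    return 1
  fi
  # A single DNS label is at most 63 characters.
  if [ "${#cid}" -gt 63 ]; then
    echo "CID is longer than one DNS label (63 chars)" >&2
    return 1
  fi
  printf 'https://%s.ipfs.%s/\n' "$cid" "$host"
}
```

Two different CIDs produce two different hostnames, and therefore two different browser origins, which is the isolation property described above.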

The war story: why bulk uploads deadlocked

Now the part that cost me real time.

When I first stood this up, deploys hung. Adding a single file through the HTTPS endpoint worked fine. Running ipfs add -r against the whole built dist/ directory — a few hundred files — would just sit there forever.

The first fix was a band-aid: route the bulk upload around the proxy entirely and hit the Kubo API directly on the LAN. The deploy script still carries the escape hatch:

# IPFS_ADD_API optionally overrides the endpoint for the bulk upload (the
# Kubo add API is bidirectionally streaming and needs a non-buffering proxy).
IPFS_ADD_API="${IPFS_ADD_API:-$IPFS_API}"
CID="$(command ipfs --api "$IPFS_ADD_API" add -r -Q --cid-version 1 --pin dist)"

Point IPFS_ADD_API straight at the box on the LAN and the add goes through — no proxy, no problem. That got deploys working, but it bugged me: why did HTTPS break, and only for bulk adds?

The answer: nginx’s defaults are exactly wrong for a streaming API. Two of them, specifically.

First, client_max_body_size defaults to 1m. Upload more than a megabyte and nginx rejects the request with 413 Request Entity Too Large. Easy to spot, easy to fix.

The subtler one is buffering. The Kubo add API is bidirectionally streaming: as the client uploads the multipart body, the server streams back per-file progress events. With nginx’s response buffering on, those events pile up unread in nginx’s buffers, that back-pressure stalls the upstream, and once the buffers fill the whole exchange deadlocks. Single-file adds squeaked through only because so few events fit in the buffers that nothing ever blocked.

The real fix, in the /api location:

location /api {
  proxy_http_version 1.1;
  proxy_buffering off;          # don't buffer the streamed response
  proxy_request_buffering off;  # don't buffer the streamed upload
  proxy_pass http://ipfs-host:5001;
}

plus client_max_body_size 0; (no limit) and proxy_read_timeout 600; on the server block. With buffering off in both directions, the upload and the progress stream flow concurrently the way Kubo expects, and bulk adds go right through HTTPS. The same requirement applies to anything else streaming — ipfs dag export, ipfs log tail, pubsub.

If you take one thing from this post: a reverse proxy in front of the Kubo API must disable buffering. It’s not optional, and the failure mode (works for small things, hangs for big ones) is built to waste your afternoon.
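One way to catch this failure mode early is a smoke test that pushes many small files through the proxy with a time limit, so a stalled stream fails instead of hanging. The sketch below assumes GNU coreutils timeout is available. make_fixture and bulk_add_through_proxy are hypothetical helpers, and the file count needed to trigger a stall on a given proxy is a guess.

```shell
# make_fixture: hypothetical helper. Writes COUNT small files into DIR so
# the add produces many per-file events, like a built dist/ directory.
make_fixture() {
  dir=$1
  count=$2
  mkdir -p "$dir"
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'fixture file %s\n' "$i" > "$dir/file-$i.txt"
    i=$((i + 1))
  done
}

# bulk_add_through_proxy: hypothetical helper. Adds DIR via the API at
# API_ADDR without pinning. GNU timeout exits with 124 if the add is
# still running after LIMIT seconds (default 120).
bulk_add_through_proxy() {
  api=$1
  dir=$2
  limit=${3:-120}
  timeout "$limit" ipfs --api "$api" add -r -Q --cid-version 1 --pin=false "$dir"
}
```

A run might look like make_fixture /tmp/ipfs-smoke 300 followed by bulk_add_through_proxy "$IPFS_API" /tmp/ipfs-smoke, where the path is a placeholder. An exit status of 124 would suggest the proxy is still buffering.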

Don’t expose the API

The Kubo API at :5001 is full node control. Pin anything, unpin anything, read the config, rewrite the config, shut it down. And as you saw above, I run it with wide-open CORS. That’s fine — but only because the API is never reachable from the internet.

Lockdown is defense-in-depth at three layers. The firewall drops inbound internet traffic to the gateway before it reaches anything. The edge proxy rejects any request whose real client IP isn’t on the LAN. And then nginx does it again, independently:

# Recover the real client IP from the upstream proxy's X-Forwarded-For.
set_real_ip_from 10.0.0.0/8;
set_real_ip_from 172.16.0.0/12;
set_real_ip_from 192.168.0.0/16;
real_ip_header X-Forwarded-For;
real_ip_recursive on;

# Private IPs only.
allow 10.0.0.0/8;
allow 172.16.0.0/12;
allow 192.168.0.0/16;
allow 127.0.0.0/8;
deny all;

Three independent gates, all private-only. The open CORS header is harmless when nothing off-LAN can complete a request in the first place. If you take two things from this post: never put a Kubo :5001 API anywhere the public internet can reach it.
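To see what the real-IP recovery and allow list do together, here is a rough shell model of the decision. It is a simplification and not nginx’s implementation: it handles only well-formed IPv4 dotted quads, and it skips the check nginx makes first, that the connecting peer is itself listed in set_real_ip_from. All three function names are hypothetical.

```shell
# is_trusted_proxy: true for the three set_real_ip_from ranges above.
is_trusted_proxy() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.*)
      second=${1#172.}
      second=${second%%.*}
      # 172.16.0.0/12 covers second octets 16 through 31.
      case "$second" in
        1[6-9]|2[0-9]|3[01]) return 0 ;;
      esac
      ;;
  esac
  return 1
}

# real_client_ip: models real_ip_recursive. Walks X-Forwarded-For from the
# right and prints the first hop that is not a trusted proxy, or the
# leftmost hop if every one is trusted.
real_client_ip() {
  xff=$(printf '%s' "$1" | tr -d ' \t')
  while [ -n "$xff" ]; do
    addr=${xff##*,}
    case "$xff" in
      *,*) rest=${xff%,*} ;;
      *)   rest= ;;
    esac
    if [ -z "$rest" ] || ! is_trusted_proxy "$addr"; then
      printf '%s\n' "$addr"
      return 0
    fi
    xff=$rest
  done
  return 1
}

# is_allowed: models the allow/deny list (private ranges plus loopback).
is_allowed() {
  case "$1" in
    127.*) return 0 ;;
  esac
  is_trusted_proxy "$1"
}
```

Walking from the right matters: a client can put any address it likes at the left of X-Forwarded-For, but the hop appended by the edge proxy is the one that gets evaluated. In this model, a header of "10.1.1.1, 203.0.113.7, 10.0.0.5" resolves to 203.0.113.7, which the allow list then denies.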

A name that doesn’t change: IPNS

Content addressing has a catch. Every time the site changes — a new post, a fixed typo — the CID changes too. That’s by design: the address is a hash of the bytes. But it means the “address of my site” is a moving target, and ENS records live on-chain. If my CID changed on every deploy and ENS pointed straight at it, every deploy would be an Ethereum transaction with real gas. That’s absurd for fixing a typo.

IPNS is the fix. An IPNS name is derived from a public key, and you publish a signed record that points that name at a CID. Republish whenever you like to point it somewhere new. The name is permanent; what it resolves to is mutable, which is exactly the indirection a mutable site on an immutable filesystem needs.

The key is generated once, on the node:

ipfs key gen --type=ed25519 mysticryuujin
# k51qzi5uqu5d...   <- the IPNS name (a libp2p public-key CID)

That k51... string is the name I publish under, forever. It lives on the node and is worth backing up: lose the private key and you lose the ability to ever update that name again.

Then every deploy ends by signing a fresh record under that same key:

ipfs name publish --key=mysticryuujin --lifetime=72h --ttl=1m "/ipfs/$CID"
  • --key selects the keypair from above, so every deploy republishes the same name — only the CID it targets changes.
  • --lifetime=72h is how long the signed record stays valid before it expires out of the DHT. As long as I republish well inside that window (every deploy does), the name never goes dark.
  • --ttl=1m is a caching hint — how long resolvers may cache the answer. Short, so a fresh deploy shows up quickly instead of being pinned to a stale CID by someone’s cache.

And here’s the payoff: ENS only has to be told about this once. The site’s ENS contenthash is set to ipns://k51... — the name, not a CID. That single on-chain transaction is the only gas I ever pay for content. After it, the deploy pipeline is just add + name publish, no chain involved, and the resolution chain reads end to end:

mysticryuujin.eth ──> ENS contenthash ──> ipns://k51… ──> latest CID ──> bytes
   (on-chain, set once)        (republished every deploy)   (on my node)

Closing the loop

This is the node my deploy script talks to. Everything in Hello, IPFS (astro build → ipfs add --cid-version 1 --pin → ipfs name publish) runs against the gateway’s HTTPS API, the very endpoint described above. The last line of every deploy is a sanity check that the gateway already serves the fresh CID:

curl -fsS -o /dev/null -w '%{http_code}' "$IPFS_GATEWAY/ipfs/$CID/"

If that returns 200, the new build is live and pinned on my own hardware before it ever touches an external pinning service.
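If the check ever races the add, a small retry wrapper is one option. This is a sketch rather than part of the deploy script described here: retry and gateway_serves are hypothetical helpers. The -L flag is added on the assumption that $IPFS_GATEWAY may be the subdomain-enabled hostname, where a path-style URL first answers with a redirect to the per-CID origin.

```shell
# retry: hypothetical helper. Runs a command up to ATTEMPTS times,
# sleeping DELAY seconds between failed tries.
retry() {
  attempts=$1
  delay=$2
  shift 2
  try=1
  while ! "$@"; do
    if [ "$try" -ge "$attempts" ]; then
      return 1
    fi
    try=$((try + 1))
    sleep "$delay"
  done
}

# gateway_serves: hypothetical helper. True when GATEWAY answers 200 for
# /ipfs/CID/ after following any redirects.
gateway_serves() {
  code=$(curl -fsSL -o /dev/null -w '%{http_code}' "$1/ipfs/$2/") || return 1
  [ "$code" = "200" ]
}
```

The deploy could then end with retry 5 3 gateway_serves "$IPFS_GATEWAY" "$CID", where five attempts and a three-second delay are arbitrary starting values.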

Everything above is my setup, with the internal names filed off. If you want to actually stand up a stack like this, I packaged the patterns into an open-source, educational reference: spirens (“Sovereign Portal for IPFS Resolution via Ethereum Naming Services”). Clone it, add a domain, fill in a .env, and spirens up brings up a reverse proxy, a Kubo gateway, a local-first Ethereum RPC, and — handling the .eth resolution I described above — dweb-proxy, the same ENS→IPFS bridge that powers eth.limo. It’s a learning on-ramp, not a turnkey production deploy: read every config before you run it.

The permanent web, served from a box in my house. 🌐
