Remote Caching Setup

A remote cache turns a build artifact produced once — on any machine, by any contributor — into a hit for everyone else who runs the same task with the same inputs. Without it, every CI job and every fresh git clone re-runs work that has already been computed elsewhere, and your build minutes scale with the number of people on the team rather than the number of meaningful changes. This page covers how to stand up a remote cache for a JavaScript monorepo, how the cache key is derived, how to secure it against poisoning, and how to wire it into CI so that the first job to build a package warms the cache for every job that follows.

Remote caching is one of the highest-leverage pieces of a Monorepo Architecture & Orchestration setup, and it only works if the task definitions feeding it are deterministic. The cache is downstream of your pipeline: it stores whatever your task runner tells it to store, keyed by whatever inputs your task runner decides are relevant. Get the Turborepo Pipeline Configuration wrong — a missing outputs glob, a volatile file leaking into inputs — and the remote cache faithfully stores garbage or never gets a hit. Decide which runner you are committing to first by reading Choosing a Monorepo Task Runner; the cache topology differs between Turborepo and Nx.

[Figure: Remote cache topology. Developer workstations (local .turbo cache, read + write) and ephemeral CI runners (no local cache) share one remote cache, keyed by task hash and holding signed artifacts, backed by an S3 or blob object store with TTL eviction. A local hit avoids the network, a local miss falls back to the remote store, and writes only flow from protected branches.]
One cache, two readers: developers and ephemeral CI runners share artifacts keyed by a deterministic task hash, with writes scoped to protected branches.

The problem statement

Every machine that builds a package computes the same dist/ from the same source. A remote cache makes that computation happen once. The hard parts are not turning the feature on — they are guaranteeing that two machines derive the same cache key for the same logical work, and stopping an untrusted machine from writing a poisoned artifact under a key that a trusted machine will later read. Everything below serves those two goals.

Architecture and prerequisites

Decide the cache topology and network boundaries before you integrate caching with any task runner. A remote cache is an HTTP service in front of an object store; the runner uploads a tar of a task's outputs under a key, and downloads it later when the key matches.
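
As a rough illustration of that upload/download exchange, the helpers below sketch it with curl. They assume the service follows Turborepo's /v8/artifacts/{hash} route with a slug query parameter and bearer-token auth; CACHE_URL, the my-team fallback, and the function names are hypothetical placeholders rather than part of any tool.

```shell
# Hypothetical helpers. CACHE_URL, TURBO_TEAM and TURBO_TOKEN are assumed
# to be provided by the caller's environment.
CACHE_URL="${CACHE_URL:-https://cache.example.com}"

# Build the artifact URL for a task hash ($1).
artifact_url() {
  printf '%s/v8/artifacts/%s?slug=%s\n' "$CACHE_URL" "$1" "${TURBO_TEAM:-my-team}"
}

# Upload a tarball of task outputs ($2) under its hash ($1).
put_artifact() {
  curl --fail --silent --show-error -X PUT \
    -H "Authorization: Bearer $TURBO_TOKEN" \
    -H "Content-Type: application/octet-stream" \
    --data-binary "@$2" "$(artifact_url "$1")"
}

# Download the artifact for hash $1 into file $2.
# curl exits non-zero on a 404, which the caller treats as a cache miss.
get_artifact() {
  curl --fail --silent --show-error \
    -H "Authorization: Bearer $TURBO_TOKEN" \
    -o "$2" "$(artifact_url "$1")"
}
```

In practice the task runner performs these requests for you; the sketch is only meant to make the shape of the protocol concrete when you are debugging a proxy or a self-hosted server.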

Pre-deployment checklist:

  • Terminate TLS 1.2+ at the load balancer or edge proxy; the cache protocol carries tokens.
  • Allow-list the IP ranges of your CI runners and, optionally, developer egress.
  • Rotate access tokens automatically (90-day maximum) through your secrets manager.
  • Confirm round-trip latency from CI to the cache endpoint is low (the check below targets under 50 ms); a slow cache that you still wait on is worse than no cache.

# Verify the TLS handshake and certificate validity window
openssl s_client -connect cache.example.com:443 -servername cache.example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -dates

# Measure connect and first-byte latency to the cache health endpoint (target < 50ms)
curl -o /dev/null -s -w "connect: %{time_connect}s  ttfb: %{time_starttransfer}s\n" \
  https://cache.example.com/health

If those numbers are healthy, the cache is reachable and trusted; the remaining work is configuring the runner to talk to it.

Tool-specific configuration

The runner decides what to cache, where the outputs live, and which inputs feed the hash. Declare these explicitly. Implicit caching of non-deterministic artifacts is the single most common cause of "cache miss in CI but hit locally" reports.

Turborepo (v2.0+)

Keep these task definitions in lockstep with your Turborepo Pipeline Configuration so the same outputs and inputs govern both local and remote behavior.

{
  "$schema": "https://turbo.build/schema.json",
  "remoteCache": {
    "enabled": true,
    "signature": true,
    "timeout": 30
  },
  "tasks": {
    "build": {
      "outputs": ["dist/**", ".next/**", "!.next/cache/**"],
      "inputs": ["src/**", "package.json", "tsconfig.json"]
    },
    "test": {
      "outputs": ["coverage/**"],
      "inputs": ["src/**", "tests/**"]
    }
  }
}

Critical fields: signature: true enables HMAC verification of every artifact, so a reader rejects anything not signed with the shared key. timeout (seconds, in v2) caps how long the runner waits on a degraded endpoint before falling back to local execution. Explicit outputs globs prevent partial restoration — and the !.next/cache/** exclusion keeps a volatile framework cache out of the artifact. Turborepo v2 renamed pipeline to tasks; on v1 the same block lives under pipeline.

Nx (v17+)

Configure the workspace runner to match your Nx Workspace Architecture.

{
  "tasksRunnerOptions": {
    "default": {
      "runner": "nx-cloud",
      "options": {
        "cacheableOperations": ["build", "test", "lint"]
      }
    }
  }
}

Critical fields: cacheableOperations must exclude e2e and any network-dependent task. The access token is deliberately absent from the file: Nx Cloud reads it from the NX_CLOUD_ACCESS_TOKEN environment variable at runtime, so a plaintext token never needs to be committed.

Cache internals: how the key is derived

A remote cache is keyed, not by your branch or your timestamp, but by a hash of everything that could change a task's output. For a single package's build task the hash folds together the hashed contents of every file matched by inputs, the resolved values of every variable in env, the hashes of upstream dependencies declared via dependsOn (the ^build chain), the package manager lockfile, and the runner's own version. Two machines that agree on all of those derive the same key and therefore share the same artifact.
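
To make the idea concrete, here is a minimal sketch of a content-derived key. It is not Turborepo's or Nx's actual hashing algorithm; task_key and its argument layout are hypothetical, it assumes sha256sum is available, and it assumes the lockfile sits inside the hashed directory alongside the sources.

```shell
# task_key DIR ENV_VALUES UPSTREAM_HASHES RUNNER_VERSION
# Hashes every file under DIR in a stable (sorted) order using relative
# paths only, then folds in env values, upstream hashes and runner version.
task_key() (
  cd "$1" || exit 1
  {
    find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      sha256sum "$f"   # emits "<content hash>  ./relative/path"
    done
    printf 'env:%s\nupstream:%s\nrunner:%s\n' "$2" "$3" "$4"
  } | sha256sum | cut -d' ' -f1
)
```

Because only relative paths and file contents enter the hash, two checkouts in different directories produce the same key, while a single extra file or a changed environment value produces a different one.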

This is why determinism matters more than the network. If a .env.local, an absolute path, or a build timestamp leaks into the hashed set, two machines compute different keys and never share. Chasing those mismatches is the subject of Fixing Turborepo Remote Cache Misses, which walks through diffing local and CI hashes to find the contaminating input.

SaaS versus self-hosted topology

Before wiring credentials you must decide where the artifacts physically live, because that choice drives your security model, your latency, and your bill.

| Dimension | Managed SaaS | Self-hosted |
|---|---|---|
| Setup effort | Minimal: a token and a team slug | You run an HTTP service and an object store |
| Data residency | Artifacts leave your network | Stays inside your perimeter |
| Latency | Depends on provider region | You place it next to your runners |
| Cost model | Per-seat or per-bandwidth | Storage + egress you control |
| Best for | Small teams, public code | Regulated data, large bandwidth, air-gapped CI |

The deciding question is usually data residency: if build outputs may contain anything you are contractually barred from sending to a third party, you self-host. Latency is the secondary factor: a cache one region away can be slower to fetch than rebuilding a small package, which silently erodes the benefit. Teams that land on the self-hosted side should follow Self-Hosting a Turborepo Remote Cache for the server and storage layout; everyone else can set TURBO_TOKEN and TURBO_TEAM for the managed service and move on.
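
The latency trade-off can be estimated with back-of-the-envelope arithmetic. The sketch below assumes fetch time is roughly one round trip plus transfer time (artifact size over bandwidth); fetch_or_rebuild is a hypothetical helper and the model ignores decompression and connection setup.

```shell
# fetch_or_rebuild RTT_MS ARTIFACT_MB BANDWIDTH_MBPS BUILD_SECONDS
# Prints "fetch" when restoring from the cache is estimated to be faster
# than rebuilding, otherwise "rebuild".
fetch_or_rebuild() {
  awk -v rtt="$1" -v mb="$2" -v mbps="$3" -v build="$4" 'BEGIN {
    fetch = rtt / 1000 + (mb * 8) / mbps   # seconds: latency + transfer
    if (fetch < build) print "fetch"; else print "rebuild"
  }'
}
```

For example, a 5 MB artifact over a 100 Mbit/s link with a 200 ms round trip takes about 0.6 s to fetch, so a package that builds in 0.3 s is cheaper to rebuild; a 50 MB artifact that takes 30 s to build is clearly worth fetching.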

Execution strategy

In CI, the first job to build a given package uploads the artifact; every subsequent job — in the same workflow or a later one — restores it. The flags below make that explicit and bounded.

# Build the affected graph, continuing past failures, with a bounded worker pool
pnpm exec turbo run build \
  --filter='...[origin/main]' \
  --continue \
  --concurrency=4

# Inspect what was hit vs. missed without running anything
turbo run build --dry=json | jq '.tasks[] | {id: .taskId, cache: .cache.status}'

Set --concurrency to the runner's vCPU count; over-provisioning turns parallel uploads into an I/O bottleneck. --continue lets independent tasks finish even when one fails, so a single broken package does not starve the rest of the cache.
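
One way to follow that sizing advice without hard-coding a number is to read the core count at runtime. This is a minimal sketch assuming a Linux or macOS runner; detect_concurrency is a hypothetical helper name and the fallback of 4 is arbitrary.

```shell
# Print the number of online CPUs, falling back to 4 if it cannot be detected.
detect_concurrency() {
  nproc 2>/dev/null \
    || getconf _NPROCESSORS_ONLN 2>/dev/null \
    || echo 4
}

# Usage (not executed here):
#   pnpm exec turbo run build --concurrency="$(detect_concurrency)"
```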

Security and isolation

A shared cache is a shared trust boundary. The threat is poisoning: an attacker (or a buggy job) writes a malicious artifact under a key that a trusted build will later read as a hit.

  • Artifact signing. Set signature: true (Turborepo) and provision a TURBO_REMOTE_CACHE_SIGNATURE_KEY so readers reject any artifact not signed with the shared secret.
  • Branch-scoped writes. Grant write only to main and release/*. Pull-request and fork builds run read-only — they may benefit from the cache but can never populate it.
  • Token hygiene. Issue short-lived tokens via OIDC federation rather than long-lived secrets stored in repository settings, and rotate the signature key on a schedule.
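
The signing idea can be illustrated with a shared-secret HMAC. The sketch below does not reproduce Turborepo's exact tag format; sign_artifact and verify_artifact are hypothetical helpers, openssl is assumed to be installed, and passing the key as an argument exposes it to the process list, which is acceptable for a demo only.

```shell
# sign_artifact KEY FILE -> hex HMAC-SHA256 tag over the file contents
sign_artifact() {
  # The hex digest is the last field regardless of the OpenSSL version's label.
  openssl dgst -sha256 -hmac "$1" < "$2" | awk '{ print $NF }'
}

# verify_artifact KEY FILE TAG -> exit 0 if the tag matches, 1 otherwise
# Note: a plain string comparison is not constant-time.
verify_artifact() {
  [ "$(sign_artifact "$1" "$2")" = "$3" ]
}
```

A reader holding the same key recomputes the tag and rejects the artifact on mismatch, so a writer without the key cannot produce an artifact that trusted builds will accept.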

CI/CD integration

The workflow below injects cache credentials as masked secrets, scopes write access by branch, and degrades gracefully when the cache is unreachable.

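# .github/workflows/ci.yml
name: ci
on:
  push:
    branches: [main]
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
      TURBO_TEAM: ${{ vars.TURBO_TEAM }}
      TURBO_REMOTE_CACHE_SIGNATURE_KEY: ${{ secrets.CACHE_SIGNATURE_KEY }}
      # Pull requests read the cache but must not write to it.
      # TURBO_REMOTE_ONLY is not the right switch: it only bypasses the local cache.
      # Newer Turborepo releases express the same intent via TURBO_CACHE (e.g. remote:r).
      TURBO_REMOTE_CACHE_READ_ONLY: ${{ github.event_name == 'pull_request' }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2  # lets change-based filters such as '...[HEAD^1]' diff against the previous commit
      # pnpm must be installed before setup-node can cache its store;
      # assumes the version is pinned via "packageManager" in package.json
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'pnpm'
      - run: pnpm install --frozen-lockfile
      - name: Build with remote cache
        run: pnpm exec turbo run build --continue --concurrency=4
        timeout-minutes: 15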

The fetch-depth: 2 line is load-bearing once you add change-based filtering: the filter needs at least the previous commit to compute a diff, and diffing against origin/main additionally requires that ref to be fetched (for example with fetch-depth: 0). The timeout-minutes guard prevents a hung cache connection from holding a runner indefinitely. For the deeper tuning that turns a working cache into a fast one (compression, parallel uploads, warming strategies), see Optimizing Turborepo Remote Cache for CI. Teams that cannot send artifacts to a SaaS endpoint should follow Self-Hosting a Turborepo Remote Cache to run the protocol behind their own object store.
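
The branch-scoping rule can also be computed in a script step rather than an expression. The sketch below takes the values GitHub Actions exposes as GITHUB_EVENT_NAME and GITHUB_REF; cache_access is a hypothetical helper, and the main and release/* patterns simply mirror the policy described in the security section.

```shell
# cache_access EVENT_NAME REF -> "read-write" or "read-only"
cache_access() {
  # Pull requests never write, whatever branch they target.
  if [ "$1" = "pull_request" ]; then
    echo read-only
    return
  fi
  case "$2" in
    refs/heads/main | refs/heads/release/*) echo read-write ;;
    *) echo read-only ;;
  esac
}

# Usage (not executed here), assuming Turborepo honours
# TURBO_REMOTE_CACHE_READ_ONLY in your version:
#   [ "$(cache_access "$GITHUB_EVENT_NAME" "$GITHUB_REF")" = "read-only" ] \
#     && export TURBO_REMOTE_CACHE_READ_ONLY=true
```

Defaulting to read-only for anything unrecognised keeps the failure mode safe: an unexpected ref can slow a build down but cannot populate the cache.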

Validation and performance tuning

Verify hit rates, confirm signature enforcement, and put an eviction policy in place so the store does not grow without bound.

# Hit/miss breakdown for a build, machine-readable
turbo run build --dry=json | jq '.tasks[] | {id: .taskId, cache: .cache.status}'

# Reset the Nx local cache when debugging a suspected stale hit
npx nx reset

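# Confirm signature enforcement. Artifacts are signed with a shared HMAC key, so there is
# no public key to verify against. Instead, run read-only with a deliberately wrong key and
# the local cache bypassed, and expect every task to execute rather than restore.
TURBO_REMOTE_CACHE_SIGNATURE_KEY=deliberately-wrong \
TURBO_REMOTE_ONLY=true TURBO_REMOTE_CACHE_READ_ONLY=true \
  turbo run build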

Tuning parameters:

  • Concurrency. Pin --concurrency to the runner vCPU count.
  • Compression. Turborepo v2+ streams zstd-compressed artifacts natively; enable transport compression at the proxy if you self-host.
  • Eviction. Use tiered retention: main 90 days, feature branches 14 days, failed or unverified runs 0 days.
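
The tiered retention rule is simple enough to encode directly in whatever job prunes the object store. A minimal sketch, where retention_days and the status labels are hypothetical names and the day counts come from the tiers above:

```shell
# retention_days BRANCH STATUS -> number of days to keep an artifact
# STATUS is one of: ok, failed, unverified (labels are illustrative).
retention_days() {
  case "$2" in
    failed | unverified) echo 0; return ;;
  esac
  case "$1" in
    main) echo 90 ;;
    *) echo 14 ;;
  esac
}
```

Checking the status before the branch means a failed or unverified run on main is still dropped immediately.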

Reasoning about cache hits in CI

Once the cache is live, most operational questions reduce to "why did (or didn't) this task hit." The answer is always in the hash, and you can read it directly rather than guessing. A dry run prints the resolved hash and the cache status for every task in the graph; the summary file (written by --summarize) records the inputs that fed each hash. Together they let you trace any miss back to a concrete cause without rebuilding.

# Per-task hash and hit/miss for the whole graph
turbo run build --dry=json \
  | jq '.tasks[] | {task: .taskId, hash: .hash, status: .cache.status}'

Three patterns cover almost every report. A task that hits locally but misses in CI is almost always a hashed input that varies between the two environments — a machine-specific path, an unpinned Node.js version, a stray .env.local. A task that misses on every run regardless of environment usually has no outputs declared, so there is nothing to store and restore. And a whole graph that invalidates after an unrelated edit points at something over-broad in globalDependencies or a too-greedy inputs glob. Each of these has a fix in the configuration above; the diagnostic step is simply to read the hash rather than assume.
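
For the first pattern, comparing per-task hashes from the two environments narrows the search quickly. The sketch below assumes each environment has saved "taskId hash" pairs, one per line, for instance with jq -r '.tasks[] | "\(.taskId) \(.hash)"' applied to the dry-run JSON; hash_mismatches is a hypothetical helper.

```shell
# hash_mismatches LOCAL_FILE CI_FILE
# Prints the IDs of tasks present in both files whose hashes differ.
hash_mismatches() {
  awk 'NR == FNR { local_hash[$1] = $2; next }
       ($1 in local_hash) && local_hash[$1] != $2 { print $1 }' "$1" "$2"
}
```

The earliest task in dependency order that appears in the output is usually the one with the contaminated input; tasks downstream of it differ only because they inherit its hash.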

When the mismatch is subtle, the full walkthrough of diffing local and CI hashes lives in Fixing Turborepo Remote Cache Misses, and the throughput-oriented tuning that follows a clean hit rate is in Optimizing Turborepo Remote Cache for CI.

Common pitfalls and mitigation

| Mistake | Impact | Resolution |
|---|---|---|
| Hardcoding cache tokens in turbo.json | Credential leak; supply-chain exposure | Inject via masked CI secrets or OIDC federation |
| Missing explicit outputs globs | Partial or corrupt artifact restoration | Declare exact output globs per task |
| Unrestricted write on pull requests | Cache poisoning from untrusted code | Run forks and PRs with the remote cache read-only (TURBO_REMOTE_CACHE_READ_ONLY); TURBO_REMOTE_ONLY does not block writes |
| No artifact signing | Readers trust unverified artifacts | Set signature: true and provision a signature key |
| Volatile files in inputs | Hash drift; perpetual misses | Exclude .env.local, logs, OS binaries from inputs |
| Ignoring timeout on a degraded endpoint | Silent CI hangs | Cap timeout and let the build fall back to local execution |

Frequently Asked Questions

How do I prevent cache poisoning in a shared monorepo? Enable artifact signing so readers reject any artifact not signed with the shared key, grant write access only to protected branches, and run pull-request and fork builds in read-only mode so untrusted code can never populate the cache.

Why is the cache key the same hash on every machine, and what breaks it? The key is a hash of the task's matched inputs, declared environment variables, upstream dependency hashes, the lockfile, and the runner version, so machines that agree on all of those derive the same key. It breaks when a machine-specific or time-varying file leaks into the hashed set; at that point two machines compute different keys and never share an artifact.

Can I run a remote cache without a SaaS provider? Yes. The remote cache protocol is an HTTP contract you can serve yourself in front of S3 or any blob store, with TLS, IP allow-listing, and token rotation. See Self-Hosting a Turborepo Remote Cache for a working setup.

How do I handle cache misses on the very first CI run? A cold cache simply executes the task and uploads the result, so the first run costs about the same as a no-cache build plus the upload. To avoid every contributor paying that cost, run a scheduled job on main that warms the cache for the critical dependency graph.
