Spotify System Design: The One Detail Interviewers Expect—CDN + Signed URLs

![CDN + Signed URLs diagram](https://bugfree-s3.s3.amazonaws.com/mermaid_diagrams/image_1776618969925.png "CDN + Signed URLs"){:width="700"}

Spotify System Design: CDN + Short‑Lived Signed URLs

When designing a Spotify‑like streaming system, the single most important detail interviewers expect you to call out is: never stream audio directly from your application servers. Instead, store audio files in object storage (S3 or equivalent) and serve them through a CDN. Your streaming service should only authenticate and authorize the user, enforce subscription rules, then return a short‑lived signed URL that points to the CDN edge.

This approach achieves three key goals:

  • Dramatically reduces load on your app servers and origin: the CDN serves most of the traffic.
  • Lowers global latency — users fetch from nearby CDN edge locations.
  • Limits casual piracy: URLs expire quickly, so a leaked link stops working soon after it is issued.

Core request flow (state this clearly in interviews):

Client → API Gateway → Auth Service → Streaming Service → signed CDN URL returned to the client → client fetches audio from the CDN edge
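
A minimal sketch of the Streaming Service step in this flow, in Python. The session table, subscription tiers, `PREMIUM_ONLY_TRACKS`, the `.ogg` path layout, and `cdn.example.com` are all hypothetical. The signer is passed in as a placeholder so the sketch stays focused on the authorize-then-hand-off decision rather than on any particular CDN's signing scheme.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class User:
    user_id: str
    subscription: str  # hypothetical tiers: "free" or "premium"


# Hypothetical session store standing in for the Auth Service.
SESSIONS = {
    "token-abc": User("u1", "premium"),
    "token-free": User("u2", "free"),
}

# Hypothetical catalog rule: some tracks require a premium plan.
PREMIUM_ONLY_TRACKS = {"track-42"}


def authenticate(session_token: str) -> Optional[User]:
    """Auth Service step: map a session token to a user, or None."""
    return SESSIONS.get(session_token)


def handle_play_request(
    session_token: str, track_id: str, sign: Callable[[str], str]
) -> Tuple[int, str]:
    """Streaming Service step: authorize, then return a signed CDN URL.

    Returns (http_status, body). No audio bytes pass through this function.
    """
    user = authenticate(session_token)
    if user is None:
        return 401, "not authenticated"
    if track_id in PREMIUM_ONLY_TRACKS and user.subscription != "premium":
        return 403, "subscription does not allow this track"
    return 200, sign(f"/audio/{track_id}.ogg")


def fake_sign(path: str) -> str:
    """Placeholder signer; a real one would append an expiry and a signature."""
    return f"https://cdn.example.com{path}?sig=PLACEHOLDER"


status, body = handle_play_request("token-abc", "track-42", fake_sign)
print(status, body)
```

The client then issues its audio requests directly against the returned URL, so the app fleet only ever handles this small JSON-sized exchange.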

Why this architecture works

  • Origin offload: The CDN caches content on edge nodes. The origin (object store or app servers) is hit only on cache misses.
  • Fast auth separation: The streaming service does the minimal work — validate user/session and subscription level — then delegates heavy lifting to the CDN.
  • Short lifetime for security: Signed URLs or signed cookies expire, reducing the window for unauthorized sharing.
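
To make the expiry idea concrete, here is a minimal sketch of HMAC-based URL signing and the matching edge-side check, in Python. This is a generic scheme for illustration, not the format any specific CDN uses (each provider defines its own parameters and signing method). `SECRET`, `CDN_HOST`, and the `expires`/`sig` query parameter names are assumptions.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret-rotate-me"     # assumption: key shared by signer and edge
CDN_HOST = "https://cdn.example.com"  # hypothetical CDN hostname


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and expiry, so neither can be altered."""
    message = f"{path}?expires={expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int = 60, now: Optional[float] = None) -> str:
    """Streaming service side: build a URL that is valid for ttl_seconds."""
    issued = int(time.time() if now is None else now)
    expires = issued + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{CDN_HOST}{path}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Edge side: reject missing, expired, or tampered signatures."""
    current = time.time() if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    if current > expires:
        return False
    expected = _signature(parts.path, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())


url = sign_url("/audio/track-42.ogg", ttl_seconds=60, now=1000)
print(verify_url(url, now=1030))  # inside the 60-second window
print(verify_url(url, now=1061))  # past the expiry
```

Because the expiry is part of the signed message, a client cannot extend a link's lifetime by editing the `expires` parameter; it has to go back to the streaming service for a fresh URL.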

Implementation notes & tradeoffs

  • Where to store audio: Use object storage (Amazon S3, GCS) behind the CDN. Do not route audio bytes through your app fleet.
  • Type of signed access:
    • S3 presigned URLs (good for direct S3 access).
    • CDN-backed signed URLs or signed cookies (CloudFront, Fastly, Akamai) preferred because they keep the origin private and let the edge validate the signature.
  • Token lifetime: Keep it short, often 30 seconds to a few minutes, depending on playback buffering and resumability needs. Shorter lifetimes tighten security but force the player to request a fresh URL more often when seeking or resuming.
  • Range requests & chunking: Support HTTP Range headers so players can seek and resume without fetching the whole file. Also enables partial caching at edges.
  • Cache control: Use appropriate Cache-Control headers and CDN TTLs. For frequently accessed tracks, set long TTLs; for personalized or frequently updated content, shorten TTLs.
  • Origin protection: Lock your object storage bucket so only the CDN (or your origin service) can read objects. Use origin access control (for example CloudFront's OAC, or its older origin access identities) or private backends.
  • DRM & license servers: If strong piracy protection is required, add a DRM/license server for decryption keys; signed URLs alone are not full DRM.
  • Key management & signing: Rotate signing keys, use HMAC or asymmetric signing, and keep signing code minimal inside the streaming service.
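
For the range-request point above, the following Python sketch shows how a single-range `Range` header might be resolved to byte offsets. In practice the CDN and object store handle this for you; the sketch only illustrates what a seek looks like on the wire. The function names are hypothetical, and multi-range requests are deliberately left unsupported.

```python
import re
from typing import Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=([0-9]*)-([0-9]*)$")


def resolve_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) offsets, or None if the range cannot be served."""
    match = _RANGE_RE.match(header.strip())
    if match is None or size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the object.
        suffix = int(last)
        if suffix == 0:
            return None
        return max(size - suffix, 0), size - 1
    start = int(first)
    if start >= size:
        return None  # a server would answer 416 Range Not Satisfiable
    # An open-ended or oversized end is clamped to the last byte.
    end = min(int(last), size - 1) if last else size - 1
    if end < start:
        return None
    return start, end


def content_range(start: int, end: int, size: int) -> str:
    """Value of the Content-Range header sent with a 206 response."""
    return f"bytes {start}-{end}/{size}"


# A player seeking into a 5,000-byte object asks for everything from byte 4000 on.
span = resolve_range("bytes=4000-", 5000)
if span is not None:
    print(content_range(span[0], span[1], 5000))
```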

Operational considerations

  • Metrics to track: cache hit ratio, origin request rate, edge latency, signed URL error rates, and unauthorized access attempts.
  • Cost: Serving through a CDN cuts origin compute and origin egress but adds CDN egress charges. At scale it is usually cheaper overall because far fewer requests reach the origin.
  • Edge logic: Use edge rules (headers, redirects, geofencing) at the CDN to enforce policy and tune delivery close to the user.
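
As a rough worked example of why cache hit ratio is the first metric to watch, the Python sketch below estimates how much traffic still reaches the origin. The request rates and hit counts are made-up illustrative numbers, not measurements.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of edge requests served without contacting the origin."""
    total = hits + misses
    if total == 0:
        raise ValueError("no requests recorded")
    return hits / total


def origin_requests_per_second(total_rps: float, hit_ratio: float) -> float:
    """Only cache misses are forwarded to the origin."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return total_rps * (1.0 - hit_ratio)


# Illustrative assumption: 100,000 requests/s at the edge.
ratio = cache_hit_ratio(hits=950, misses=50)
for label, r in (("measured", ratio), ("degraded", 0.80)):
    print(label, round(origin_requests_per_second(100_000, r)))
```

With these assumed numbers, a drop in hit ratio from 95% to 80% means roughly four times as many requests reach the origin, which is why a falling hit ratio deserves an alert of its own.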

How to say it in an interview (concise script)

"I would store audio in object storage and serve it through a CDN. The streaming service only authenticates and authorizes the client, then returns a short‑lived signed CDN URL. Flow: Client → Gateway → Auth → Streaming Service → signed URL → CDN edge. This reduces origin load, lowers latency globally, and limits piracy because links expire. For DRM‑level protection, add a license server for decryption keys."

Wrap up

Mentioning this CDN + signed URL pattern shows you understand scalability, global performance, and basic security tradeoffs. If asked, be ready to discuss TTL choices, range requests, how to protect the origin, and where DRM fits into the picture.
