
Leaf-Spine Architecture Data Center (DC) Design

Leaf-Spine Architecture
Summary Insights:
  • In today's topic I will cover some theoretical concepts and a practical design of a Data Center spine-leaf architecture. It may not cover everything, but I will explain with one example how servers connect to leaf and spine switches, and how to calculate the quantity of leaf/spine switches, NICs (network interface cards), etc. in a Data Center.
  • East-west traffic & North-south traffic

Before starting the spine-leaf architecture example, let me explain some concepts and terminology first.

Leaf-Spine Architecture:

It is also called spine-leaf design architecture. This is the modern two-tier design of the Data Center, consisting of leaf and spine switches only, unlike the traditional three-tier design that consists of Access, Aggregation/Distribution, and Core layer switches.

It reduces latency, improves scalability (it expands easily when required), achieves equal-cost multi-path (ECMP) forwarding, and uses bandwidth efficiently because every link has the same path cost, from Server–>Leaf–>Spine–>Leaf–>Server.

The spine-leaf Data Center design is shown below; you can see the advantages in the note on the right side as well. Further on, I will explain the difference between traditional and leaf-spine designs, so just keep reading.

Leaf-Spine Design

Due to the high volume of east-west traffic (between servers horizontally inside the DC) in the Data Center, leaf-spine is the better option, as server-to-server and server-to-storage communication takes only two hops: Leaf–>Spine–>Leaf.

North–South Traffic:

North-south traffic is traffic coming into or going out of the Data Center; the flow is between outside clients and the servers inside the Data Center.

North-South Traffic

East–West Traffic:

East-west traffic stays inside the Data Center; the flow is between servers in the DC, or between servers and storage (VM to VM).

East-West traffic

Difference Between ToR and Leaf:

Both terms are used interchangeably. All servers in a rack are connected to a switch usually installed at the top of the rack, so that short fiber cables (1 or 2 meters) can be used to connect all servers. The leaf is the part of the leaf-spine architecture to which all servers connect directly.

The leaf switch is also called a Top of Rack (ToR) switch; all servers in the rack connect to the switch at the top of the rack.

The picture below shows the location of the Top of Rack (ToR) switch in the rack.

Top of Rack (ToR)

Oversubscription rate:

It’s the ratio between the server-side bandwidth connected to the leaf switch (southbound traffic) and the uplink bandwidth from leaf to spine (northbound traffic). For example, if the server-side bandwidth on a leaf switch is 400G and the uplink from the leaf switch to the spine is 400G, then the oversubscription is 400G/400G = 1:1.

If the server side is 400G and the leaf uplink to the spine is 200G, then the oversubscription is

400G/200G = 2:1
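
To make the ratio calculation concrete, here is a minimal Python sketch (the helper name is my own, not from any vendor tool) that reduces the downlink/uplink bandwidth ratio to the familiar N:1 form:

```python
from math import gcd

def oversubscription(downlink_gbps: int, uplink_gbps: int) -> str:
    """Return the leaf oversubscription ratio as reduced 'N:M' text."""
    g = gcd(downlink_gbps, uplink_gbps)
    return f"{downlink_gbps // g}:{uplink_gbps // g}"

print(oversubscription(400, 400))  # 1:1 (downlink equals uplink)
print(oversubscription(400, 200))  # 2:1 (downlink is double the uplink)
```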

Customer Project Requirement:

Let’s move to the example below, which shows how to design and calculate the server, switch, and NIC quantities.

Rack Layout

Customer Project:
  • There is a cloud company, ReadTech, requiring ~50 virtual machines (VMs) per server to run different applications, 24 servers per rack, and 1 ToR (leaf) switch per rack. The initial plan is 30 racks/cabinets. They want a strict 1:1 oversubscription rate (downlink bandwidth = uplink bandwidth), redundancy via dual-connected 25G ports on each server, and low cost.

For the above requirements, these are the questions from the customer side:

  • How many leaf and spine switches are required?
  • What capacity do the leaf and spine switches need, i.e., how many ports and at what speed?
  • How many NIC cards are needed on each server?
  • How many 25G and 100G ports are needed on the servers and switches?

Let’s design and calculate against the customer requirements in the different sections below.

Analysis:

As per the above customer requirements, below is the summary analysis.

  • Server NIC: dual 25 GbE links as per the requirement (active/active via LACP)
  • 24 servers per rack, each with two 25G ports: 24 × 2 = 48 ports
  • Rack/cabinet: 30 cabinets in total, one ToR/leaf switch per cabinet

ToR switch: each rack has one ToR. To reduce cost while keeping ToR/leaf redundancy, we'll pair adjacent racks (Rack-1 with Rack-2, Rack-3 with Rack-4, and so on) so each server can dual-home across two ToRs; for example, a Rack-1 server will connect to both the Rack-1 ToR and the Rack-2 ToR.

  • Leaf model example: 48×25G downlinks to the servers + 12×100G uplinks toward the spine switches.
  • Spine model example: 30 leaf switches in total, each leaf needing 1,200G of uplink; to reduce cost, each leaf will connect with 2×100G links to each spine, so 12/2 = 6 spine switches.
  • Each leaf connects to each spine with 2 links, so 30 × 2 = 60 100G ports are required on each spine.
  • We will therefore choose a spine switch with 64×100G ports, leaving 4 spare ports for uplink connectivity.

Server & Leaf Downlink Math:

Per rack:

  • Servers: 24
  • Per server: 2 × 25 G = 50 G (active/active LACP)
  • Total leaf downlink: 24 × 50 G = 1,200 G (1.2 Tb/s)

Port Quantity:

  • 24 servers × 2×25G = 48×25G ports on the ToR → exactly matches a 48×25G leaf.
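
The per-rack port and bandwidth math above can be sanity-checked with a short Python sketch (the values are this example's assumptions, not a general rule):

```python
# Per-rack downlink math for the example design.
servers_per_rack = 24
ports_per_server = 2          # dual-homed 25G NICs, active/active LACP
port_speed_gbps = 25

leaf_25g_ports = servers_per_rack * ports_per_server      # 48 x 25G ports on the ToR
rack_downlink_gbps = leaf_25g_ports * port_speed_gbps     # 1,200 G (1.2 Tb/s)

print(leaf_25g_ports, rack_downlink_gbps)  # 48 1200
```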

Meeting the 1:1 Oversubscription Rate:

To meet 1:1, each leaf must have uplink capacity = 1,200 G toward the spines.

  • Leaf uplinks available: 12×100G = 1,200 G
  • Leaf downlink toward the servers: 1,200 G per leaf
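
A quick check, under the same assumptions, that the chosen leaf (48×25G down, 12×100G up) really hits 1:1:

```python
# Verify the 1:1 oversubscription target per leaf.
downlink_gbps = 48 * 25    # server-facing 25G ports
uplink_gbps = 12 * 100     # spine-facing 100G ports

assert downlink_gbps == uplink_gbps == 1200
print("Oversubscription is 1:1")
```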

How Many Spines? (And why 6 is ideal):

Every leaf must connect to every spine. There are two clean ways to reach 1,200 G per leaf:

  • Option A (chosen): 6 spines, 2×100G from each leaf to each spine
    • Per leaf uplinks: 6 spines × 2×100G = 12×100G = 1,200 G
    • Per spine downlinks: 30 leaf switches × 2×100G = 60×100G used on each spine
    • Fits a 64×100G spine (4 ports spare for uplink connectivity)
  • Option B: 12 spines, 1×100G from each leaf to each spine
    • Per leaf uplinks: 12 × 1×100G = 1,200 G
    • Per spine downlinks: 30 leaves × 1×100G = 30×100G used (lots of spare ports)
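
The two options can be compared with a small sketch (illustrative only) that checks the per-leaf uplink and the 100G ports consumed on each spine:

```python
# Compare the two ways to reach 1,200 G of uplink per leaf.
leaves = 30
target_uplink_gbps = 1200

for spines, links_to_each_spine in [(6, 2), (12, 1)]:
    per_leaf_uplink = spines * links_to_each_spine * 100    # Gb/s per leaf
    ports_per_spine = leaves * links_to_each_spine          # 100G ports used per spine
    print(spines, per_leaf_uplink == target_uplink_gbps, ports_per_spine)

# Output:
# 6  True 60   -> fits a 64x100G spine with 4 spare ports
# 12 True 30   -> also 1:1, but twice the number of chassis
```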

Final Design of Project:

Here’s how we designed it:

  • Scale target: 30 racks/cabinets (≈ 720 servers, ~50 VMs each)
  • Per rack math: 24 servers × (2×25G) = 1.2 Tb/s downlink
  • Leaf ports: 48×25G to servers, 12×100G to spines ⇒ 1.2 Tb/s uplink (1:1)
  • Fabric rule: every leaf connects to every spine
  • Two clean ways to hit 1:1:

  • 6 spines × (2×100G per spine) → 12×100G/leaf ✅
  • 12 spines × (1×100G per spine) → 12×100G/leaf (works, but more boxes)

Why we chose 6 spines:


Same throughput, half the chassis, simpler cabling/control plane, fits 64×100G spines (≈60 leaf links + spare), and acceptable failure impact (1 spine down ≈ -16.7% per-leaf capacity during maintenance).
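
The failure-impact figure is a simple back-of-the-envelope calculation; here is a sketch of it for the chosen 6-spine layout:

```python
# Per-leaf uplink capacity lost when one of the 6 spines is down.
spines, links_per_spine, link_speed_gbps = 6, 2, 100

full_uplink = spines * links_per_spine * link_speed_gbps            # 1,200 G
degraded_uplink = (spines - 1) * links_per_spine * link_speed_gbps  # 1,000 G

loss = (full_uplink - degraded_uplink) / full_uplink
print(f"Per-leaf capacity loss: {loss:.1%}")  # 16.7%
```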

The next post will deep-dive into a comparison of traditional and leaf-spine architectures with the help of diagrams.

If you have any questions or want to share your experience and knowledge about spine-leaf design or east-west and north-south traffic, I will be happy to hear from you in the comment section.

If you want to learn about VMware vCenter virtualization, check the link below:

VMware ESXI Virtualization

For the Hyper-V virtualization setup, check the link below:

Microsoft Hyper-V
