GPU Rack Assembly: The Complete Process from Bare Metal to Production
TL;DR
A phase-by-phase guide to building a production-ready GPU compute rack from individual components (rails, servers, switches, power distribution, cabling, and cooling infrastructure), from planning through testing and commissioning.
What GPU Rack Assembly Actually Involves
GPU rack assembly is the physical process of building a production-ready GPU compute rack from individual components: rails, servers, switches, power distribution, cabling, and cooling infrastructure. It is not the same as plugging a GPU into a PCIe slot. At hyperscale, GPU rack assembly is a multi-day process per rack requiring specialized tooling, trained technicians, and a systematic methodology that ensures every connection is correct, tested, and documented before the rack enters production.
The term covers a range of deployment scenarios. At one end, it includes assembling individual GPU servers (like the 8U NVIDIA DGX H100 or 4U HGX-based platforms from Dell, Supermicro, and ASUS) into a 42U or 48U rack with top-of-rack switching, power distribution, and cable management. At the other end, it includes deploying rack-scale systems like the NVIDIA GB200 NVL72 and GB300 NVL72, where the entire rack arrives as an integrated unit and must be connected to facility power, cooling, and network infrastructure.
Leviathan Systems performs GPU rack assembly for every current NVIDIA platform, from H100 air-cooled systems through GB300 NVL72 liquid-cooled rack-scale deployments. This guide documents our assembly process.
Pre-Assembly: Planning and Procurement
Rack Layout Design
Every GPU deployment begins with a rack layout document that specifies exactly what goes where. The layout defines server positions (measured in rack units from the bottom), switch positions, PDU placement, cable management arm locations, and blank panel positions.
For GPU racks, layout design is constrained by power density, weight distribution, and cooling airflow. Heavy GPU servers should be positioned low in the rack to maintain a safe center of gravity. Servers with the highest power draw should be distributed to avoid concentrating heat load on any single section of the rack. Cable management must account for the cable volume at each server position — an 8-GPU server like the DGX H100 has over 20 cable connections per unit.
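The bottom-heavy placement rule can be sketched as a simple placement routine. This is an illustrative sketch, not a real layout tool; the unit names, heights, and weights are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    name: str
    height_u: int     # rack units the chassis occupies
    weight_lb: float

def place_bottom_heavy(units, rack_height_u=42):
    """Assign starting positions heaviest-first from the bottom of the
    rack (U1), keeping the center of gravity low; returns {name: start_U}."""
    layout = {}
    next_u = 1
    for unit in sorted(units, key=lambda u: u.weight_lb, reverse=True):
        if next_u + unit.height_u - 1 > rack_height_u:
            raise ValueError(f"rack full before placing {unit.name}")
        layout[unit.name] = next_u
        next_u += unit.height_u
    return layout
```

A real layout tool would also spread the highest-draw servers to balance heat load and reserve switch and PDU positions, per the constraints above.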
The rack layout document also specifies the cable schedule: every connection from every port, including fiber type, connector type, cable length, and label identifier. The cable schedule is created before any physical work begins and is the authoritative reference throughout the assembly process.
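A cable schedule is ultimately structured data, which makes it machine-checkable before any physical work begins. A minimal sketch with a uniqueness check on labels; the column names and the two entries are hypothetical, not a standard format.

```python
import csv
import io

# Hypothetical schedule fragment; real schedules vary by shop.
SCHEDULE = """\
label,from_device,from_port,to_device,to_port,media,connector,length_m
R01-C001,gpu-node-01,ibp0,leaf-sw-01,1,MMF OM4,OSFP,3
R01-C002,gpu-node-01,ibp1,leaf-sw-01,2,MMF OM4,OSFP,3
"""

def load_schedule(text):
    """Parse the schedule and reject duplicate cable labels, since the
    label is the key every later test result is recorded against."""
    rows = list(csv.DictReader(io.StringIO(text)))
    labels = [row["label"] for row in rows]
    assert len(labels) == len(set(labels)), "duplicate cable labels"
    return rows
```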
Bill of Materials Verification
Before assembly begins, every component listed in the bill of materials must be physically present and inspected. This includes servers, switches, PDUs, cables, fiber patch panels, cable management hardware, rack accessories, labels, and consumables.
Missing components are the most common cause of assembly delays. A single missing DAC cable or fiber jumper can halt progress on a rack for days while the replacement is sourced and shipped. The BOM verification process includes opening every box, verifying model numbers and serial numbers against the purchase order, and checking for shipping damage.
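The verification step reduces to a set comparison between the purchase order and what was actually received. A minimal sketch, assuming serial numbers are tracked per model number (the model strings below are made up):

```python
def verify_bom(purchase_order, received):
    """Both arguments: {model_number: set_of_serials}.
    Returns (missing, unexpected) dicts keyed by model number."""
    missing, unexpected = {}, {}
    for model, serials in purchase_order.items():
        gap = serials - received.get(model, set())
        if gap:
            missing[model] = gap
    for model, serials in received.items():
        extra = serials - purchase_order.get(model, set())
        if extra:
            unexpected[model] = extra
    return missing, unexpected
```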
Staging Area Setup
The assembly staging area must be clean, well-lit, and climate-controlled. GPU components are sensitive to electrostatic discharge (ESD), so the staging area requires ESD-safe flooring, grounded workbenches, and technicians wearing ESD wrist straps at all times.
The staging area should have sufficient space to lay out all components for a single rack simultaneously. This allows the assembly team to verify the complete BOM, organize components by installation order, and pre-stage cables by length and type before beginning physical assembly.
Phase 1: Mechanical Assembly
Rail Installation
Rail installation is the foundation of the rack build. Every rail must be level, properly secured to the rack posts, and positioned at the exact rack unit specified in the layout document. Rail position errors compound — a single rail installed one rack unit off shifts every subsequent component and invalidates the cable schedule.
Rails for GPU servers are typically tool-less snap-in designs, but the high weight of GPU servers (a DGX H100 weighs approximately 287 pounds) means that rail security is critical. After installation, each rail pair must be tested by applying downward force at the front and rear to verify both attachment points are fully engaged.
Server rails must also be verified for compatibility with the specific server model. While most enterprise servers use common four-post rail kits, some GPU platforms (particularly HGX-based systems with non-standard form factors) require platform-specific rails that have different mounting hole patterns.
Server Installation
With rails in place, servers are installed starting from the bottom of the rack and working upward. This maintains a low center of gravity throughout the assembly process and prevents working above heavy components.
Each server is lifted into position using a minimum of two technicians for units weighing over 50 pounds. For servers exceeding 100 pounds (most 4U+ GPU servers), a mechanical server lift is required. Under no circumstances should a technician attempt to install a GPU server alone — the risk of equipment damage and personal injury is too high.
After sliding the server into the rack, verify that the rail latch engages on both sides, the server is fully seated at the rear, and the front bezel or faceplate aligns with adjacent components. Connect the cable management arm if applicable.
Switch Installation
Top-of-rack switches are installed after the servers they serve. For GPU racks using InfiniBand (Quantum-2 or Quantum-X800) or high-speed Ethernet (Spectrum-X with ConnectX-8 SuperNICs), the switch is typically positioned at the top of the rack to minimize cable lengths to the aggregation layer above.
For racks with both management network and high-speed data network connections, separate switches may be installed for each network. The management switch (typically 1GbE or 10GbE) handles BMC/iDRAC connectivity, while the high-speed switch handles GPU-to-GPU traffic within the rack and across the fabric.
PDU Installation
Power distribution units are installed in the rear vertical mounting positions of the rack. For GPU racks, redundant PDUs (A-feed and B-feed) are standard. Each PDU must be rated for the total rack power draw plus a 20% headroom margin.
Monitored or switched PDUs are preferred over basic PDUs because they provide per-outlet power monitoring, which is essential for troubleshooting power anomalies and verifying balanced load distribution across phases.
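The sizing rule above is simple arithmetic, worth sketching because the redundancy requirement is easy to miss: with A-feed and B-feed PDUs, each feed must be able to carry the whole rack alone if its partner fails.

```python
def min_pdu_rating_kw(server_draws_kw, headroom=0.20):
    """Minimum rating for EACH of the redundant PDUs: total rack draw
    plus the 20% headroom margin, since either feed must carry the
    full load by itself after a feed failure."""
    return sum(server_draws_kw) * (1 + headroom)
```

For example, four servers drawing 10.2 kW each call for PDUs rated for roughly 49 kW per feed, not 24.5 kW.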
Phase 2: Power Cabling
Power Cable Routing
Every server receives redundant power connections — one to the A-feed PDU and one to the B-feed PDU. Power cables are routed through the rear cable management channels and secured with hook-and-loop fasteners, not zip ties, which must be cut off for any change and complicate future servicing.
Cable routing follows a specific order: power cables first, then management network, then high-speed data network. This layering prevents power cables from crossing over or compressing data cables, which can cause electromagnetic interference and signal degradation.
Each power cable must be fully seated in both the server power supply and the PDU outlet. Power connections are verified by applying gentle outward pressure to confirm the retention mechanism is engaged. A single loose power connection will cause the server to fail over to single-feed power, eliminating redundancy without generating an obvious alarm in many configurations.
Power Verification
Before energizing, verify the following at every power connection: correct cable routing matches the cable schedule, both A-feed and B-feed connections are present at every server, PDU circuit breakers are in the OFF position, and the facility power feed is confirmed at the correct voltage and phase rotation.
Energize the rack in stages: first the PDUs (verify voltage at PDU output), then the management switches, then servers one at a time starting from the bottom. Monitor PDU power draw at each stage and compare against expected values. Anomalies must be investigated before energizing additional equipment.
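The stage-by-stage comparison of measured versus expected draw can be automated. A minimal sketch; the 15% tolerance is an assumed value for illustration, not a standard.

```python
def check_stage_draw(stage, measured_kw, expected_kw, tolerance=0.15):
    """Flag any energization stage whose measured PDU draw deviates from
    the expected value by more than the tolerance fraction; anomalies
    must be investigated before the next stage is energized."""
    deviation = abs(measured_kw - expected_kw) / expected_kw
    if deviation > tolerance:
        raise RuntimeError(f"{stage}: draw {measured_kw} kW deviates "
                           f"{deviation:.0%} from expected {expected_kw} kW")
    return deviation
```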
Phase 3: Network Cabling
Management Network
The management network connects every server's BMC or management controller (Dell iDRAC, Supermicro IPMI, etc.) to the management switch. These connections use Cat6A copper cables for runs under 100 meters.
Management network cabling is straightforward but must be done with the same discipline as high-speed cabling. Label every cable at both ends. Route cables cleanly through management channels. Verify link status on every port after connecting.
High-Speed Data Network
The high-speed data network is where cabling complexity increases dramatically. Each GPU server may have 8 or more high-speed ports (100GbE, 200GbE, or 400GbE) that must be connected to the top-of-rack switch or directly to other servers in a leaf-spine or fat-tree topology.
For InfiniBand deployments, cable type selection is critical. HDR InfiniBand (200Gbps) uses QSFP56 connectors. NDR InfiniBand (400Gbps) uses OSFP connectors. The physical connector type must match the transceiver installed in both the server NIC and the switch port. Mixing connector types or transceiver generations is a common cabling error.
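The generation-to-connector mapping is small enough to encode as a lookup, which makes the "mixed connector" error mechanically checkable against the cable schedule. A minimal sketch covering only the two generations named above:

```python
# Connector family per InfiniBand generation, per the text above.
CONNECTOR = {"HDR": "QSFP56", "NDR": "OSFP"}

def link_ok(nic_gen, nic_conn, sw_gen, sw_conn):
    """A link is valid only if each end uses the connector its
    generation specifies AND both ends match each other."""
    return (CONNECTOR.get(nic_gen) == nic_conn
            and CONNECTOR.get(sw_gen) == sw_conn
            and nic_conn == sw_conn)
```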
Fiber cables must be inspected and cleaned before every insertion. At 400Gbps data rates, a single particle of dust on a fiber end face can cause bit errors that degrade training performance. Use a fiber inspection scope to verify end face cleanliness and a one-click cleaner for remediation.
NVLink Cabling (Rack-Scale Systems)
For rack-scale systems like the GB200 NVL72 and GB300 NVL72, NVLink cabling connects compute trays to NVLink switch trays within the rack. This cabling is typically pre-installed at the factory, but verification is required on site.
NVLink cables use proprietary high-density connectors and must be installed with precise insertion force. Under-seated connectors are the most common NVLink failure mode. Use the manufacturer's insertion force gauge to verify proper seating on every connection, then run NVLink bandwidth tests to confirm full link width.
Phase 4: Cooling System Integration
Air-Cooled Racks
For air-cooled GPU racks (H100, H200, most HGX B200 configurations), cooling integration focuses on airflow management. Hot aisle/cold aisle containment must be properly sealed with blanking panels in all unused rack positions. Any gap in the containment allows hot exhaust air to recirculate to server intakes, causing thermal throttling.
Verify that all server fans are operational and set to the correct cooling profile. GPU servers generate significantly more heat than standard compute servers and require aggressive fan curves. A 4U GPU server with 8 GPUs can dissipate 10kW or more, generating a concentrated heat plume at the rear of the rack that must be captured by the hot aisle containment or rear-door heat exchanger.
Liquid-Cooled Racks
For liquid-cooled systems (GB200 NVL72, GB300 NVL72, and some HGX B200/B300 configurations), cooling integration is a multi-step process involving CDU connection, manifold routing, pressure testing, filling, bleeding, and flow verification. This process is detailed in our separate guide on liquid cooling for GPU deployments.
Phase 5: Testing and Commissioning
Physical Inspection
Before any electronic testing, perform a final physical inspection of the completed rack. Verify all screws are torqued, all cables are properly routed and secured, all labels are present and legible, all blanking panels are installed, and all cable management arms move freely without catching on adjacent cables.
Power-On and POST
Power on all servers and verify successful POST completion. Monitor BMC consoles for any hardware errors, thermal warnings, or component failures during POST. All GPUs must be detected by the operating system and report correct specifications (memory size, compute capability, NVLink connectivity).
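GPU detection can be spot-checked by parsing the output of `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`. A sketch of the check; the sample output and the 80,000 MiB floor are illustrative values, not authoritative specs.

```python
# Illustrative sample of the CSV query output for a 2-GPU node.
SAMPLE = """\
NVIDIA H100 80GB HBM3, 81559 MiB
NVIDIA H100 80GB HBM3, 81559 MiB
"""

def check_inventory(csv_text, expected_count, min_mem_mib):
    """Verify GPU count and per-GPU memory size from the query output."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, mem = [field.strip() for field in line.split(",")]
        gpus.append((name, int(mem.split()[0])))   # "81559 MiB" -> 81559
    assert len(gpus) == expected_count, \
        f"expected {expected_count} GPUs, saw {len(gpus)}"
    for name, mem in gpus:
        assert mem >= min_mem_mib, f"{name} reports only {mem} MiB"
    return gpus
```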
Cable Certification
Test every fiber and copper connection using calibrated test equipment. Fiber connections require OTDR testing and insertion loss measurement. Copper connections require full cable certification to TIA-568 standards. Document all test results in the cable management database.
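Test results can be joined back onto the cable schedule by label, so that untested or out-of-budget links surface immediately. A sketch; the 1.9 dB budget is a placeholder, since the real loss budget depends on the transceiver spec and link length.

```python
def attach_results(schedule, results, budget_db=1.9):
    """schedule: list of dicts keyed by cable 'label'; results: {label:
    measured insertion loss in dB}. Annotates each row and returns the
    labels that are untested or exceed the loss budget."""
    failed = []
    for row in schedule:
        loss = results.get(row["label"])
        row["loss_db"] = loss
        if loss is None or loss > budget_db:
            failed.append(row["label"])
    return failed
```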
GPU Stress Testing
Run GPU diagnostic tools (NVIDIA DCGM, gpu-burn, or NCCL benchmarks) across all GPUs in the rack for a sustained period. Minimum burn-in duration is 24 hours under full load. Monitor GPU temperatures, clock speeds, memory errors, and NVLink bandwidth throughout the test. Any GPU that shows errors, thermal throttling, or performance degradation must be replaced and the test repeated.
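The pass/fail decision over burn-in telemetry reduces to threshold checks over periodic samples. A minimal sketch; the temperature and clock thresholds are assumed for illustration, and real limits come from the platform's thermal and clock specifications.

```python
def scan_burn_in(samples, max_temp_c=85, min_clock_mhz=1500):
    """samples: iterable of (gpu_id, temp_c, sm_clock_mhz, ecc_errors)
    collected periodically during the 24-hour burn-in. Returns the sorted
    set of GPUs that overheated, throttled, or logged memory errors."""
    flagged = set()
    for gpu, temp, clock, ecc in samples:
        if temp > max_temp_c or clock < min_clock_mhz or ecc > 0:
            flagged.add(gpu)
    return sorted(flagged)
```

Any GPU returned by a scan like this is replaced and the full burn-in repeated, per the policy above.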
Documentation Package
Assemble the complete documentation package: as-built rack layout, cable schedule with test results, GPU validation reports, power consumption data, thermal performance data, and photographs of the completed rack from all four sides. This package is the deliverable that accompanies every rack Leviathan assembles.
Scaling: Multi-Rack and Cluster Deployments
Single rack assembly takes 2-5 days depending on platform complexity and cooling type. Multi-rack deployments require parallel assembly teams and careful coordination with facility infrastructure (power, cooling, network) that is being built out simultaneously.
For cluster-scale deployments (10+ racks), Leviathan deploys dedicated project management to coordinate the assembly schedule, manage component staging, track testing progress, and ensure that the inter-rack network fabric is built in parallel with individual rack assembly. The goal is to complete cluster-level integration testing (all-reduce benchmarks across the full GPU count) within days of the final rack being commissioned.
GPU Rack Assembly Services
Leviathan Systems provides GPU rack assembly for all current NVIDIA platforms, from single racks to hyperscale deployments. Our teams are currently deploying H100, GB200, and GB300 NVL72 infrastructure across the United States, with active projects in Texas and California.
Contact us to discuss your deployment. We respond within 48 hours with a scope assessment and timeline estimate.