Precision Lab

Exploring precision-performance tradeoffs in numerical computing

How does reduced floating-point precision affect algorithm convergence?

Modern GPU accelerators support FP8, FP16, FP32, and FP64 arithmetic with vastly different throughput characteristics. This project explores precision-performance tradeoffs through eigenvalue computation using the power method algorithm.

Key Findings

Precision Floors Matter

Each precision level has a natural convergence floor determined by its machine epsilon. FP8 stagnates around 10⁻², FP16 around 10⁻³, FP32 around 10⁻⁷, and FP64 beyond 10⁻¹⁵.

Cascading Wins

Starting at FP8 and escalating to higher precisions (FP8→FP16→FP32→FP64) achieves 2-3× speedup over FP64-only while reaching the same final accuracy.
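The cascading schedule can be sketched in NumPy. Note that NumPy has no native FP8 type, so this sketch cascades FP16→FP32→FP64; the switching thresholds (a small multiple of each format's floor) are illustrative assumptions, not the project's tuned values:

```python
import numpy as np

def cascaded_power_method(A, tol=1e-12, max_iter=5000):
    """Power iteration that escalates precision as each format's floor is hit.

    Sketch of the cascading idea only: FP16 -> FP32 -> FP64, switching when
    the relative residual ||Av - lam*v|| / |lam| stalls near the current
    format's convergence floor.
    """
    # (dtype, relative-residual threshold) pairs -- thresholds are illustrative.
    schedule = [(np.float16, 1e-2), (np.float32, 1e-5), (np.float64, tol)]
    v = np.random.default_rng(0).standard_normal(A.shape[0])
    lam = 0.0
    for dtype, floor in schedule:
        Ad = A.astype(dtype)
        v = v.astype(dtype)
        v /= np.linalg.norm(v)
        for _ in range(max_iter):
            w = Ad @ v
            lam = float(v @ w)                        # Rayleigh quotient estimate
            residual = float(np.linalg.norm(w - lam * v)) / abs(lam)
            v = w / np.linalg.norm(w)
            if residual < floor:                      # floor reached: escalate
                break
    return lam, v.astype(np.float64)
```

Each phase does its cheap early work at low precision and hands a warm-started vector to the next, which is where the reported 2-3× speedup over FP64-only iteration comes from.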

Early Progress is Cheap

Lower precisions make rapid progress in early iterations. FP8 and FP16 quickly reduce error from 25% to 1%, setting up efficient refinement in higher precisions.

Residual Norm is Key

Using residual norm ||Av - λv|| as the convergence metric reveals the true precision limits of each format and enables accurate precision-switching decisions.
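In NumPy terms, the metric and a switching rule might look like the following (illustrative sketch; the safety factor is an assumption, not a value taken from the project):

```python
import numpy as np

def residual_norm(A, v, lam):
    """Residual norm ||Av - lam*v||: a direct measure of how well (lam, v)
    satisfies the eigenvalue equation, unlike the change in the eigenvalue
    estimate, which can stall misleadingly at low precision."""
    return float(np.linalg.norm(A @ v - lam * v))

def should_escalate(residual, eps, safety=10.0):
    """Escalate precision once the residual nears the current format's floor
    (here assumed to be a small multiple of its machine epsilon)."""
    return residual < safety * eps
```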

Methodology

Power Method Algorithm

The power method is an iterative algorithm for computing the dominant (largest-magnitude) eigenvalue of a matrix. Starting from a random vector, it repeatedly applies the matrix and normalizes the result, converging to the eigenvector associated with that eigenvalue.

This algorithm is ideal for studying precision effects because:

  • Its convergence rate is governed by the eigenvalue ratio λ₂/λ₁, controlled here through κ = λ₁/λ₂
  • Each iteration compounds floating-point roundoff errors
  • Precision limits manifest as convergence stagnation
  • It's widely used in practice (PageRank, PCA, spectral clustering)
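A minimal NumPy sketch of the iteration described above, with the dtype as a parameter so the same code exposes each format's convergence floor (defaults are illustrative, not the project's settings):

```python
import numpy as np

def power_method(A, dtype=np.float64, tol=1e-10, max_iter=1000, seed=0):
    """Basic power iteration: repeatedly apply A and normalize.

    Returns the dominant-eigenvalue estimate, the eigenvector, and the
    residual history, whose plateau reveals the dtype's precision floor.
    """
    Ad = A.astype(dtype)
    v = np.random.default_rng(seed).standard_normal(A.shape[0]).astype(dtype)
    v /= np.linalg.norm(v)
    lam, history = 0.0, []
    for _ in range(max_iter):
        w = Ad @ v
        lam = float(v @ w)                              # Rayleigh quotient
        history.append(float(np.linalg.norm(w - lam * v)))
        v = w / np.linalg.norm(w)
        if history[-1] < tol:
            break
    return lam, v, history
```

Running this at FP16, FP32, and FP64 on the same matrix and overlaying the residual histories reproduces the stagnation behavior discussed under Key Findings.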

Experiment Parameters

Matrix Size: 1024×1024
Condition Number: κ = 100
True Eigenvalue: λ ≈ 100
Convergence Metric: residual norm ||Av − λv||

Precision Formats

FP8: ε ≈ 0.125
FP16: ε ≈ 9.77×10⁻⁴
FP32: ε ≈ 1.19×10⁻⁷
FP64: ε ≈ 2.22×10⁻¹⁶
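The FP16-FP64 entries can be checked directly with NumPy's `finfo`. FP8 is not a native NumPy type; the ε ≈ 0.125 entry follows from a 3-bit mantissa (this sketch assumes the E4M3 variant):

```python
import numpy as np

# Machine epsilon for the native NumPy formats in the table above.
for dtype in (np.float16, np.float32, np.float64):
    print(dtype.__name__, np.finfo(dtype).eps)

# FP8 E4M3 (4 exponent / 3 mantissa bits) is not in NumPy; its epsilon
# follows from the mantissa width alone: eps = 2**-3 = 0.125.
fp8_eps = 2.0 ** -3
```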

About This Project

Precision Lab is an educational research project demonstrating precision-performance tradeoffs in numerical computing. The power method serves as a concrete example to explore how reduced floating-point precision affects algorithm convergence.

This work is motivated by modern GPU architectures that offer dramatically different throughput for different precisions. Understanding when and how to use lower precisions is critical for performance optimization in scientific computing and machine learning.