New Study Tests Emerging Number Formats in Eigenvalue Computations

Computing eigenvalues and eigenvectors of large, sparse matrices is central to modern applications ranging from web search and network analysis to structural modeling and quantum simulations. A study to be presented at The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC25) explores how emerging number formats perform in this critical task, focusing on the implicitly restarted Arnoldi method (IRAM), the core eigensolver behind ARPACK and widely used in scientific software such as MATLAB.
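The basic (non-restarted) Arnoldi iteration at the heart of IRAM can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's implementation: ARPACK layers implicit restarting, deflation, and shift selection on top of this core loop.

```python
import numpy as np

def arnoldi(A, v0, m):
    """Build an m-step Arnoldi decomposition A V_m ~ V_m H_m.

    Returns V (orthonormal Krylov basis) and H (upper Hessenberg);
    the eigenvalues of the leading m x m block of H (the Ritz values)
    approximate extremal eigenvalues of A.
    """
    n = len(v0)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:           # breakdown: invariant subspace found
            return V[:, : j + 1], H[: j + 1, : j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

# Example: with a full-length run, the Ritz values recover the spectrum.
A = np.diag(np.arange(1.0, 6.0))          # eigenvalues 1..5
V, H = arnoldi(A, np.ones(5), 5)
print(np.sort(np.linalg.eigvals(H[:5, :5]).real))  # ~ [1, 2, 3, 4, 5]
```

Every step of this loop (matrix-vector product, inner products, norms) is performed in the working arithmetic, which is why the choice of number format directly affects the orthogonality of the basis and the stability of the computed Ritz values.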

The research compared four alternatives to IEEE 754 floating-point arithmetic—OFP8, bfloat16, posit, and takum—using matrices drawn from two major real-world collections: the SuiteSparse Matrix Collection, which includes diverse matrices with varying conditioning, and the Network Repository, a large archive of graphs from practical applications.

The results show a clear divergence in performance among the formats. OFP8 (E4M3, E5M2), though attractive for efficiency, proved unsuitable for general-purpose computations due to limited precision and instability during Arnoldi iterations. Bfloat16 consistently outperformed float16 by offering a larger dynamic range, though it still showed weaknesses for clustered or near-zero eigenvalues. Posit arithmetic, with its tapered-precision design, delivered notable improvements in stability and accuracy but suffered from edge-case weaknesses. Most significantly, takum arithmetic—a refinement of posit—achieved the strongest overall performance, maintaining stability on challenging matrices where other formats failed.
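The dynamic-range advantage of bfloat16 over float16 mentioned above can be demonstrated with a short stdlib-only sketch (not from the study). It simulates bfloat16 by truncating a float32 to its top 16 bits, a common software approximation; real hardware typically rounds rather than truncates.

```python
import math
import struct

def to_bfloat16(x: float) -> float:
    """Round-trip x through simulated bfloat16: keep the top 16 bits of
    a float32 (same 8-bit exponent as float32, 7-bit fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

def to_float16(x: float) -> float:
    """Round-trip x through IEEE 754 float16 (struct's 'e' format);
    magnitudes beyond ~6.5e4 overflow."""
    try:
        return struct.unpack(">e", struct.pack(">e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)

# float16 has more fraction bits but a tiny exponent range, so a
# moderate magnitude like 1e30 already overflows to infinity:
print(to_float16(1e30))   # inf
# bfloat16 inherits float32's exponent range and represents it,
# though coarsely (relative error up to ~0.4%):
print(to_bfloat16(1e30))
```

This trade-off is exactly what the study observed: bfloat16's wide exponent range keeps Arnoldi iterations from overflowing, while its short 7-bit fraction limits accuracy for clustered or near-zero eigenvalues.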

As high-performance computing shifts toward mixed-precision and energy-aware strategies, the study underscores the importance of arithmetic choices. The findings suggest that while bfloat16 remains a pragmatic option in hardware-supported environments, tapered-precision formats like posit and especially takum may offer the most promise for future scientific computing.

Why it matters

The push toward exascale systems and beyond necessitates a balance among speed, energy efficiency, and accuracy. Low-precision formats are attractive because they can reduce memory bandwidth and power consumption, provided that they maintain numerical stability in demanding algorithms. By testing emerging formats on a foundational eigensolver with real-world datasets, this study provides critical evidence that new number systems could reshape the landscape of high-performance computing.

James Quinlan, Chair of the Department of Computer Science at USM, co-authored the study, demonstrating the department’s active role in advancing research on numerical methods and next-generation computing technologies.

Read more about the presentation here: SC25 Conference Program – Numerical Performance of the Implicitly Restarted Arnoldi Method in OFP8, Bfloat16, Posit, and Takum Arithmetics.