Jack Dongarra on High-Performance Computing and Responsibly Reckless Algorithms
202502061354
Status: #idea
Tags: SCI
- Jack’s LINPACK benchmark list evolved into the Top500
- Capture the asymptotic throughput rate and report that as the benchmark figure
- Attack of the killer micros
- Microprocessors scaled better than large vector processors
- Super-computers used
- Dennard scaling ended ~2007
- Cloud vendors
- building their own chips
- AWS Graviton
- Google TPU
- building their own interconnects, accelerators
- Environment for HPC in scientific computing
- Communication is very expensive compared to floating-point ops
- Floating-point formats now range from 64 bits down to 4 bits
- Nvidia TF32, Google BF16, etc.
- Nvidia FP8 (2 versions: E4M3 and E5M2)
- Forward prop requires more precision on the fraction (E4M3)
- Back prop requires more range (E5M2)
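
The precision-versus-range trade-off between these formats follows directly from how the bits are split between exponent and fraction. A minimal sketch, assuming the commonly published bit layouts; the class and property names are hypothetical, and the maximum is computed with an IEEE-style rule that does not hold exactly for every format (see the comment on E4M3):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    name: str
    exponent_bits: int
    fraction_bits: int

    @property
    def total_bits(self) -> int:
        # One sign bit plus exponent and fraction (TF32 uses 19 bits
        # of information but is stored in a 32-bit container).
        return 1 + self.exponent_bits + self.fraction_bits

    @property
    def epsilon(self) -> float:
        # Gap between 1.0 and the next representable value: precision.
        return 2.0 ** -self.fraction_bits

    @property
    def ieee_style_max(self) -> float:
        # Largest finite value ASSUMING the top exponent code is reserved
        # for inf/NaN, as in IEEE 754. FP8 E4M3 as specified by Nvidia
        # reclaims most of that code and reaches 448 instead.
        bias = 2 ** (self.exponent_bits - 1) - 1
        return (2.0 - self.epsilon) * 2.0 ** bias


FORMATS = [
    FloatFormat("FP64", 11, 52),
    FloatFormat("FP32", 8, 23),
    FloatFormat("TF32", 8, 10),
    FloatFormat("BF16", 8, 7),
    FloatFormat("FP16", 5, 10),
    FloatFormat("FP8 E4M3", 4, 3),
    FloatFormat("FP8 E5M2", 5, 2),
]

for fmt in FORMATS:
    print(f"{fmt.name:9s} bits={fmt.total_bits:2d} "
          f"eps={fmt.epsilon:.3e} max~{fmt.ieee_style_max:.3e}")
```

E4M3 has the smaller epsilon of the two FP8 variants (finer fraction), while E5M2 has the larger exponent field (wider range), which matches the forward/backward split in the bullets above.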
Performance & Benchmarking Evaluation Tools
- LINPACK is not a very relevant benchmark
- Performing FLOPs is not the hard part
- Real world applications no longer solve a lot of dense matrix problems
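
For context, the number LINPACK reports is a rate: the operation count of a dense solve divided by wall-clock time. A minimal sketch, assuming the leading-order LU cost of 2/3 n^3 flops with lower-order terms dropped; the function name and the timing in the usage line are hypothetical:

```python
def linpack_gflops(n: int, seconds: float) -> float:
    """Approximate Gflop/s for solving a dense n-by-n system in `seconds`."""
    if n <= 0 or seconds <= 0:
        raise ValueError("n and seconds must be positive")
    # Leading-order cost of LU factorization; O(n^2) terms are ignored.
    flops = (2.0 / 3.0) * n ** 3
    return flops / seconds / 1e9


# Hypothetical run: a 10,000 x 10,000 system solved in 5 seconds.
print(f"{linpack_gflops(10_000, 5.0):.1f} Gflop/s")
```

Because the count assumes a dense solve, the rate says little about workloads dominated by sparse or communication-bound kernels.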
‘Responsibly Reckless’ Algorithms
- Try a fast but unstable algorithm that might (rarely) fail
- Check the result for instability
- If needed, recompute with a stable algorithm
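
The three steps above can be sketched on a small linear solve. This is an illustration of the pattern only, not an example from the talk: it assumes Gaussian elimination without pivoting as the "fast, unstable" method, a residual check as the instability test, and partial pivoting as the stable fallback. All function names and the tolerance are hypothetical.

```python
import math


def solve_gauss(A, b, pivot):
    """Solve A x = b by Gaussian elimination; partial pivoting is optional."""
    n = len(A)
    # Work on an augmented copy so the caller's data is untouched.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for k in range(n):
        if pivot:
            # Stable variant: bring the largest entry of column k to the diagonal.
            p = max(range(k, n), key=lambda i: abs(M[i][k]))
            M[k], M[p] = M[p], M[k]
        if M[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot")
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def relative_residual(A, b, x):
    """Max-norm of (A x - b), scaled by the max-norm of b."""
    r = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    scale = max(abs(bi) for bi in b) or 1.0
    return max(abs(ri) for ri in r) / scale


def responsibly_reckless_solve(A, b, tol=1e-8):
    """Try the fast path, check it, and fall back to the stable path if needed."""
    try:
        x = solve_gauss(A, b, pivot=False)           # 1. fast, may fail
        ok = all(math.isfinite(v) for v in x)
        if ok and relative_residual(A, b, x) <= tol:  # 2. check
            return x, "fast"
    except ZeroDivisionError:
        pass
    return solve_gauss(A, b, pivot=True), "stable"    # 3. recompute


# A tiny pivot makes the unpivoted solve lose the answer; the check catches it.
x, path = responsibly_reckless_solve([[1e-20, 1.0], [1.0, 1.0]], [1.0, 2.0])
print(path, x)
```

The check has to be much cheaper than the stable algorithm for the pattern to pay off; here the residual costs O(n^2) against O(n^3) for either solve.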
Questions
- AMD MI300A has both CPU cores and GPU cores, which are separate from the Epyc CPUs on the compute node.
- What’s the difference between the CPU cores on the MI300A and the Epyc?
- How does the MI300A compare to the GH200?