Tachyum Submits Bid to Build 20 Exaflops Supercomputer
Tachyum on Tuesday said that it had submitted a bid to the Department of Energy to build a 20 exaflops supercomputer in 2025. The machine would be based on the company's next-generation Prodigy processors featuring a proprietary microarchitecture that can be used for different types of workloads.
The U.S. DoE wants a 20 exaflops supercomputer that consumes between 20MW and 60MW of power, to be delivered by 2025. The system is set to be installed at Oak Ridge National Laboratory (ORNL) and will complement the lab's Frontier system that went online earlier this year. Tachyum has not disclosed which hardware it proposed to the DoE. It says only that it has its 128-core Prodigy processor today and a higher-performing Prodigy 2 processor on its roadmap, so it is reasonable to assume that it expects to have the latter on hand by 2025 and to use it for the upcoming system.
Tachyum's Prodigy is a universal homogeneous processor packing up to 128 proprietary 64-bit VLIW cores, each with two 1024-bit vector units and one 4096-bit matrix unit. Tachyum expected its flagship Prodigy T16128-AIX processor to offer up to 90 FP64 teraflops for HPC as well as up to 12 'AI petaflops' for AI inference and training (presumably when running INT8 or FP8 workloads). Prodigy consumes up to 950W and uses liquid cooling.
That was all before Tachyum sued Cadence, its intellectual property provider, for lower-than-expected performance of its Prodigy processor. We have no idea what the current performance expectations are for the chip.
In theory, at the originally quoted 90 FP64 teraflops per chip, Tachyum could power a 1 exaflops system using over 11,000 of its Prodigy processors, though power consumption of such a machine would be gargantuan. Presumably, Prodigy 2 has a better chance to meet the needs of a next-generation exascale system than the original Prodigy.
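The arithmetic behind that estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation: it assumes the peak 90 FP64 teraflops and 950W per-chip figures quoted above, counts processors alone (no memory, interconnect, storage, or cooling), and treats peak throughput as if it were sustained. The function names are illustrative, not anything Tachyum or the DoE has published.

```python
import math

# Per-chip figures quoted in the article for the Prodigy T16128-AIX.
# Both are vendor peak numbers, not measured results.
PRODIGY_FP64_TFLOPS = 90
PRODIGY_MAX_WATTS = 950


def chips_needed(target_exaflops, tflops_per_chip=PRODIGY_FP64_TFLOPS):
    """Processors required to reach a peak FP64 target (1 exaflops = 1,000,000 teraflops)."""
    return math.ceil(target_exaflops * 1_000_000 / tflops_per_chip)


def processor_power_mw(chips, watts_per_chip=PRODIGY_MAX_WATTS):
    """Power drawn by the processors alone, in megawatts."""
    return chips * watts_per_chip / 1_000_000


if __name__ == "__main__":
    for target in (1, 20):
        n = chips_needed(target)
        print(f"{target} exaflops: {n:,} chips, {processor_power_mw(n):.1f} MW (processors only)")
```

Under these assumptions, 1 exaflops works out to 11,112 chips drawing about 10.6MW, and 20 exaflops to 222,223 chips drawing about 211MW for the processors alone, well above the 20MW–60MW envelope the DoE has in mind.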
There is currently one exaflops-class supercomputer in the U.S., the 1.1 exaflops Frontier system at Oak Ridge National Laboratory (ORNL), which is based on AMD's 64-core EPYC CPUs and Instinct MI250X compute GPUs. Two more exascale systems are being built in the U.S.: the 2 exaflops Aurora machine powered by Intel's 4th Generation Xeon Scalable processors and Xe-HPC compute GPUs (aka Ponte Vecchio), and the ">2 exaflops" El Capitan supercomputer based on AMD's Zen 4 architecture EPYC CPUs and Instinct MI300 GPUs.
One of the interesting things about the DoE's supercomputing plans is that it is exploring upgrading its high-performance compute capabilities every 12–24 months rather than every 4–5 years. As a result, the DoE may become more willing to adopt exotic architectures like Tachyum's Prodigy than it is today.
"We also wish to explore the development of an approach that moves away from monolithic acquisitions toward a model for enabling more rapid upgrade cycles of deployed systems, to enable faster innovation on hardware and software," a DoE document reads. "One possible strategy would include increased reuse of existing infrastructure so that the upgrades are modular. A goal would be to reimagine systems architecture and an efficient acquisition process that allows continuous injection of technological advances to a facility (e.g., every 12–24 months rather than every 4–5 years). Understanding the tradeoffs of these approaches is one goal of this RFI, and we invite responses to include perceived benefits and/or disadvantages of this modular upgrade approach."
One advantage that Tachyum claims for Prodigy over traditional CPUs and GPUs is that it is tailored for both AI and HPC workloads, so the same processors can run AI workloads when their HPC capabilities are not in use, and vice versa. The DoE may or may not adopt Tachyum for any of its upcoming supercomputers, but the company hopes to be awarded a contract.
Source: https://www.tomshardware.com