Vijay Korthikanti
Research topics: Throughput (business), Memory footprint, CUDA, Schedule, Parallel computing
Papers: 3 · Citations: 22 · KQI: 0
Papers (3)
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model.
2022 · CoRR
Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zheng, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, Bryan Catanzaro
Citations (0)
Reducing Activation Recomputation in Large Transformer Models.
2022 · CoRR
Vijay Korthikanti, Jared Casper, Sangkug Lym, Lawrence McAfee, Michael Andersch, Mohammad Shoeybi, Bryan Catanzaro
Citations (0)
Efficient Large-Scale Language Model Training on GPU Clusters
2021 · arXiv: Computation and Language
Deepak Narayanan, Mohammad Shoeybi, Jared Casper, Patrick LeGresley, Md. Mostofa Ali Patwary, Vijay Korthikanti, Dmitri Vainbrand, Prethvi Kashinkunti, Julie Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei Zaharia
Citations (22)