Optimizing Communication for Mixture-of-Experts Training with Hybrid Expert Parallel | NVIDIA Technical Blog
…Routing and information-processing support is also added to enable complete expert-parallel (EP) communication. The design goals and core optimization directions of Hybrid-EP include leveraging the latest communication technologies on the NVIDIA…
