tag: sequence-parallel.md

Tag: sequence-parallel

1 post
GPU Parallelism Strategies Explained: TP / PP / EP / SP

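When training or running inference on large models, a single GPU's memory and compute fall far short. This post systematically walks through four mainstream parallelism strategies: Tensor Parallelism, Pipeline Parallelism, Expert Parallelism, and Sequence Parallelism. The key mechanisms are accompanied by interactive animations.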

...