fix: (GLM-5-2-FP8) do not use buffer out of cudagraph when unnecessary #1391

Open
JiaoliangYu wants to merge 3 commits into ROCm:main from JiaoliangYu:fix/sparse-mla-convert-tok-bounds