How many FLOPs does a transformer model with 10 layers, d_model of 1024, GQA with 16 heads, and an MLP expansion factor of 4
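
A rough way to approach an estimate like this is to count the matrix multiplications in each layer, at 2 FLOPs per multiply-add. The sketch below does that for the forward pass of a single token. Several things here are assumptions, not given in the question: the function name `forward_flops_per_token`, the number of key/value heads (`n_kv_heads=4`), the context length (`context_len=2048`), a plain two-matrix MLP without gating, and the choice to ignore embeddings, the output head, softmax, and normalization.

```python
def forward_flops_per_token(n_layers, d_model, n_heads, n_kv_heads,
                            mlp_ratio, context_len):
    """Rough forward-pass matmul FLOPs for one token (1 multiply-add = 2 FLOPs)."""
    head_dim = d_model // n_heads
    kv_dim = n_kv_heads * head_dim  # GQA: K and V use fewer heads than Q

    q_proj = 2 * d_model * d_model
    kv_proj = 2 * 2 * d_model * kv_dim  # K and V projections together
    out_proj = 2 * d_model * d_model
    # Scores (Q @ K^T) plus the weighted sum over V, against context_len positions.
    attn = 2 * 2 * context_len * d_model
    # Plain two-matrix MLP: d_model -> mlp_ratio * d_model -> d_model.
    mlp = 2 * 2 * d_model * (mlp_ratio * d_model)

    return n_layers * (q_proj + kv_proj + out_proj + attn + mlp)


# n_kv_heads=4 and context_len=2048 are assumed values, not given in the question.
total = forward_flops_per_token(10, 1024, 16, 4, 4, 2048)
print(f"{total:,} FLOPs per token")
```

Under these assumptions GQA only shrinks the K and V projections; the attention-score term is unchanged because the number of query heads stays at 16. A training-step estimate is commonly taken as roughly three times the forward cost, but that is a rule of thumb and not part of the question.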
