Pipeline friction refers to any obstacle that slows or disrupts the journey of a model from training to production inference. Unlike bugs that produce clear error messages, friction often manifests as subtle inefficiencies: a model that consumes twice the expected GPU memory, for example, or an inference server that drops requests under load, or a deployment that works on one GPU architecture but fails on another. The most frequent sources of pipeline friction can be grouped into four categories: Model export issues: These arise when converting from training frameworks like PyTorch or TensorFl