Paper page - Position: LLM Inference Should Be Evaluated as Energy-to-Token Production
…Xiang Liu

Abstract

LLM inference should be evaluated as energy-to-token production under constraints of compute, power, cooling, and operational efficiency, requiring new metrics beyond traditional accuracy and latency measures. AI…
