IndustryBench: Probing the Industrial Knowledge Boundaries of LLMs
…Across 17 models in Chinese and an 8-model intersection over four languages, we find: (i) the best system reaches only 2.083 on the 0–3 rubric, leaving substantial headroom; (ii…