Paper page - LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training
Reinforcement learning (RL) post-training has been shown to improve reasoning in large language models (LLMs). However, there has been little exploration of the problem…
