Search

Showing top 3 results for "GPT-5.5 rollout"

… Thanks again for the clarification; the narrative hypothesis testing → rollout correction → attention sink as root cause is much clearer, and the work is clearly valuable for agentic RL on GPT-OSS. …

Jan 27, 2026

Paper page - REPOT: Recoverable Program-of-Thought via Checkpoint Repair

… No fine-tuning, no rollout-time search. Results on PuzzleZoo-775: Average about +3 to +11 pp over vanilla Program-of-Thought across four closed-model configurations (gpt-5.4-mini ± reasoning, gemini-3.5-flash, claude-sonnet-4.6), peaking at 96.9% vs 86.3% on gpt-5.4-mini-medium. …

May 29, 2026

Paper page - Breaking the Bubble: Asynchronous Pipeline Parallel Training with Bounded Weight Inconsistency

… In GPT-style language-model pretraining, PACI matches the stability and final perplexity of synchronous 1F1B-flush, retains the same peak memory footprint, achieves fully utilized pipeline throughput, and improves training time-to-accuracy by up to 1.69x over the fastest flush baseline. …

Jun 11, 2026

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Paper page - REPOT: Recoverable Program-of-Thought via Checkpoint Repair

Paper page - Breaking the Bubble: Asynchronous Pipeline Parallel Training with Bounded Weight Inconsistency