RT DF
A1 Peifer, Dylan James.
T1 Reinforcement Learning in Buchberger's Algorithm