How to solve reinforcement learning problems