Recent advances in biomarker tests that physicians use for the early detection of cancer have the potential to improve patient survival by catching cancer at an early stage. We describe a Q-learning method for computing near-optimal prostate cancer screening strategies that trade off the number of screening biopsies against the number of metastatic cases per 1,000 men. Using Monte Carlo simulation, we compare the policies obtained by Q-learning with those recommended in the medical literature.
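
To make the idea concrete, the following is a minimal sketch of tabular Q-learning on a toy screening problem, followed by a Monte Carlo count of biopsies and metastatic cases per 1,000 simulated men. It is not the model used in this work: the three biomarker levels, the transition and metastasis probabilities, the cost weights, the discount factor, and the function names (`step`, `q_learning`, `greedy_policy`, `evaluate`) are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy model: every number below is an illustrative assumption,
# not a value taken from the paper.
N_LEVELS = 3                     # observed biomarker bucket: 0 low, 1 medium, 2 high
ACTIONS = (0, 1)                 # 0 = wait one period, 1 = refer for biopsy
BIOPSY_COST = 1.0                # penalty per biopsy
METASTASIS_COST = 20.0           # penalty per metastatic case
P_RISE = 0.3                     # chance the bucket rises one level while waiting
P_METASTASIS = (0.0, 0.2, 0.5)   # per-period metastasis risk while waiting, by level
HORIZON = 10                     # screening periods per simulated man


def step(level, action, rng):
    """Simulate one period; return (reward, next_level, done)."""
    if action == 1:
        # A biopsy ends the episode in this toy model (cancer treated or ruled out).
        return -BIOPSY_COST, level, True
    if rng.random() < P_METASTASIS[level]:
        return -METASTASIS_COST, level, True
    if level < N_LEVELS - 1 and rng.random() < P_RISE:
        level += 1
    return 0.0, level, False


def q_learning(episodes=20000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_LEVELS)]
    for _ in range(episodes):
        level = 0
        for _ in range(HORIZON):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[level][a])
            reward, nxt, done = step(level, action, rng)
            # Standard Q-learning target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[nxt])
            q[level][action] += alpha * (target - q[level][action])
            level = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action for each biomarker level."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(len(q))]


def evaluate(policy, n_men=1000, seed=1):
    """Monte Carlo estimate of (biopsies, metastatic cases) per n_men men."""
    rng = random.Random(seed)
    biopsies = metastases = 0
    for _ in range(n_men):
        level = 0
        for _ in range(HORIZON):
            action = policy[level]
            _, level, done = step(level, action, rng)
            if action == 1:
                biopsies += 1
            elif done:
                metastases += 1  # waiting only ends an episode via metastasis
            if done:
                break
    return biopsies, metastases
```

Calling `evaluate(greedy_policy(q_learning()))` gives the two outcome counts for the learned policy, which can then be set beside the counts for a fixed rule such as `evaluate([0, 1, 1])` (biopsy at medium or high levels). Changing the ratio of `METASTASIS_COST` to `BIOPSY_COST` shifts the learned policy along the trade-off between the two outcomes.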