Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models