sklearn.datasets

Datasets from sklearn.datasets that I have worked with so far:

| Dataset Name | Loaders | Description | Example/Usage |
|---|---|---|---|
| 20 newsgroups text dataset | fetch_20newsgroups (returns raw text) | Comprises around 18,000 newsgroup posts on 20 topics (such as ‘alt.atheism’, ‘comp.graphics’, …), split into two subsets: ~60% for training (or development) and ~40% for testing (or performance evaluation). The split between the train and test sets is based on whether a message was posted before or after a specific date. | Google-5-Day-Gen-AI-Intensive-Course |
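
A minimal sketch of loading the two subsets with fetch_20newsgroups, restricted to two of the topics named above. The helper names `load_split` and `posts_per_topic` are hypothetical, and stripping headers, footers, and quotes via `remove` is an assumption about how you might want the raw text, not something these notes prescribe. The first call downloads the data, so it needs network access.

```python
from collections import Counter


def load_split(subset, categories=None):
    """Fetch one subset ('train' or 'test') of 20 newsgroups as raw text."""
    # Imported here so nothing is downloaded until this function is called.
    from sklearn.datasets import fetch_20newsgroups

    return fetch_20newsgroups(
        subset=subset,
        categories=categories,
        # Assumption: drop metadata so only the message body remains.
        remove=("headers", "footers", "quotes"),
    )


def posts_per_topic(targets, target_names):
    """Count posts per topic name, given integer labels and the name list."""
    return dict(Counter(target_names[t] for t in targets))


if __name__ == "__main__":
    topics = ["alt.atheism", "comp.graphics"]
    train = load_split("train", topics)
    test = load_split("test", topics)

    # .data is a list of raw text posts; .target holds integer topic labels.
    print(len(train.data), len(test.data))
    print(posts_per_topic(train.target, train.target_names))
```

Because the loader returns raw text, the posts still need to be vectorized (for example with a bag-of-words or TF-IDF step) before they can be fed to a model.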