sklearn.datasets
Datasets from `sklearn.datasets` that I have worked with so far:
Dataset Name | Loaders | Description | Example / Usage |
---|---|---|---|
20 newsgroups text dataset | `fetch_20newsgroups` (returns raw text) | Comprises around 18,000 newsgroup posts on 20 topics (such as ‘alt.atheism’, ‘comp.graphics’, …), split into two subsets: ~60% for training (or development) and ~40% for testing (or performance evaluation). The split between the train and test sets is based on whether a message was posted before or after a specific date. | Google-5-Day-Gen-AI-Intensive-Course |
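
To make the 20 newsgroups row concrete, here is a minimal sketch of loading both subsets with `fetch_20newsgroups`. The helper names `load_newsgroups` and `posts_per_topic`, the `CATEGORIES` list, and the choice to strip headers, footers and quotes are my own assumptions for illustration, not something taken from the course notebook. The first call downloads the data and caches it locally.

```python
from collections import Counter

from sklearn.datasets import fetch_20newsgroups

# Two of the 20 topic names (assumption: any subset of topics can be used).
CATEGORIES = ["alt.atheism", "comp.graphics"]


def load_newsgroups(categories=None):
    """Return the (train, test) subsets as scikit-learn Bunch objects.

    Hypothetical helper: passing categories=None loads all 20 topics.
    """
    # Removing headers, footers and quotes is optional; it keeps only the
    # body text of each post.
    common = dict(categories=categories, remove=("headers", "footers", "quotes"))
    train = fetch_20newsgroups(subset="train", **common)
    test = fetch_20newsgroups(subset="test", **common)
    return train, test


def posts_per_topic(bunch):
    """Count posts per topic name in a loaded subset.

    Hypothetical helper: `bunch.target` holds integer labels that index
    into `bunch.target_names`.
    """
    counts = Counter(bunch.target_names[i] for i in bunch.target)
    return dict(counts)
```

Calling `train, test = load_newsgroups(CATEGORIES)` and then `posts_per_topic(train)` gives the number of training posts per topic, and `train.data[0]` is the raw text of the first post.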