Reddit — Comments
120M+ records
from $0.0012/record
Bulk export of public Reddit comments with full text, author, parent post reference, and upvote scores. One of the richest datasets for LLM fine-tuning and discourse analysis.
Field dictionary
| Field | Type | Description |
|---|---|---|
| id | string | Unique comment ID |
| body | string | Comment body text |
| parentPostId | string | ID of the post the comment belongs to |
| parentId | string | ID of the parent comment or post |
| authorId | string | ID of the commenting user |
| authorUsername | string | Username of the commenter |
| postSubredditName | string | Subreddit the comment is in |
| postSubredditId | string | ID of the subreddit |
| score | integer | Net score (upvotes minus downvotes) |
| upvotes | integer | Number of upvotes |
| downvotes | integer | Number of downvotes |
| controversiality | integer | Controversiality flag/score |
| depth | integer | Nesting depth in the comment tree |
| isSubmitter | boolean | Whether the commenter is the post author (OP) |
| stickied | boolean | Whether the comment is pinned |
| collapsed | boolean | Whether the comment is collapsed by default |
| edited | boolean | Whether the comment was edited |
| distinguished | string | Mod/admin distinction label |
| createdAt | datetime | Comment creation timestamp |
| createdAtDate | datetime | Comment creation date without the time component (e.g. 2026-06-18) |
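
The field dictionary above maps naturally onto a typed record. The following is a minimal sketch, not an official client: the `Comment` class, its snake_case attribute names, and the `parse_comment` helper are hypothetical, and only a subset of the documented fields is carried over. It assumes each record arrives as a dict keyed by the field names listed above, with timestamps formatted as in the sample preview.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass(frozen=True)
class Comment:
    """Hypothetical typed view over a subset of the documented fields."""
    id: str
    body: str
    parent_post_id: str
    parent_id: str
    author_username: str
    subreddit: str
    score: int
    upvotes: int
    downvotes: int
    depth: int
    is_submitter: bool
    distinguished: Optional[str]
    created_at: datetime
    created_date: date


def parse_comment(raw: dict) -> Comment:
    """Convert one raw record (keyed by the documented field names)."""
    return Comment(
        id=raw["id"],
        body=raw["body"],
        parent_post_id=raw["parentPostId"],
        parent_id=raw["parentId"],
        author_username=raw["authorUsername"],
        subreddit=raw["postSubredditName"],
        score=raw["score"],
        upvotes=raw["upvotes"],
        downvotes=raw["downvotes"],
        depth=raw["depth"],
        is_submitter=raw["isSubmitter"],
        distinguished=raw["distinguished"],
        # Assumption: timestamps end in "Z" (UTC), as in the sample preview.
        # Replacing it with "+00:00" keeps fromisoformat working before Python 3.11.
        created_at=datetime.fromisoformat(raw["createdAt"].replace("Z", "+00:00")),
        created_date=date.fromisoformat(raw["createdAtDate"]),
    )


# Values copied from the sample data preview on this page.
record = {
    "id": "t1_xyz789",
    "body": "This is great work. How are you handling rate limits on the keyword search endpoint?",
    "parentPostId": "t3_1abcd23",
    "parentId": "t3_1abcd23",
    "authorUsername": "curious_dev",
    "postSubredditName": "dataisbeautiful",
    "score": 184,
    "upvotes": 191,
    "downvotes": 7,
    "depth": 1,
    "isSubmitter": False,
    "distinguished": None,
    "createdAt": "2026-06-18T15:42:09.000Z",
    "createdAtDate": "2026-06-18",
}
comment = parse_comment(record)
```

A quick sanity check on parsed records is that `score` equals `upvotes` minus `downvotes`, as the field descriptions state.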
Sample data preview
[
  {
    "id": "t1_xyz789",
    "body": "This is great work. How are you handling rate limits on the keyword search endpoint?",
    "parentPostId": "t3_1abcd23",
    "parentId": "t3_1abcd23",
    "authorId": "t2_d4e5f6",
    "authorUsername": "curious_dev",
    "postSubredditName": "dataisbeautiful",
    "postSubredditId": "t5_2qh6e",
    "score": 184,
    "upvotes": 191,
    "downvotes": 7,
    "controversiality": 0,
    "depth": 1,
    "isSubmitter": false,
    "stickied": false,
    "collapsed": false,
    "edited": false,
    "distinguished": null,
    "createdAt": "2026-06-18T15:42:09.000Z",
    "createdAtDate": "2026-06-18"
  }
]
Volume & pricing
$250 minimum order. Free 100-record sample included with every dataset.
| Volume | Records | Per record | Total (one-time) |
|---|---|---|---|
| 1M | 1,000,000 | $0.0012 | $1,200 |
| 10M | 10,000,000 | $0.0010 | $10,000 |
| 100M | 100,000,000 | $0.0005 | $50,000 |
Records are a representative sample of the full dataset, selected across the available range. Need specific records? Build a custom dataset →
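
To make the tier arithmetic concrete, here is a rough estimator built from the table above. It is only a sketch under stated assumptions: the page does not say how volumes between or below the listed tiers are priced, so the code assumes the rate of the largest tier not exceeding the requested volume, assumes the 1M rate for smaller orders, and treats the $250 minimum as a price floor. The `estimate_total` function is hypothetical; confirm actual pricing with the vendor.

```python
from decimal import Decimal

# Tiers copied from the pricing table: (minimum records, USD per record),
# ordered from the largest volume down.
TIERS = [
    (100_000_000, Decimal("0.0005")),
    (10_000_000, Decimal("0.0010")),
    (1_000_000, Decimal("0.0012")),
]
MINIMUM_ORDER = Decimal("250")  # "$250 minimum order"


def estimate_total(records: int) -> Decimal:
    """Estimate a one-time order total in USD.

    ASSUMPTIONS (not stated on the page): in-between volumes use the rate of
    the largest tier they reach, orders under 1M records use the 1M rate,
    and the minimum order acts as a floor on the total.
    """
    rate = Decimal("0.0012")
    for threshold, tier_rate in TIERS:
        if records >= threshold:
            rate = tier_rate
            break
    return max(records * rate, MINIMUM_ORDER)


totals = {n: estimate_total(n) for n in (1_000_000, 10_000_000, 100_000_000)}
```

`Decimal` is used instead of floats so the products match the table's totals exactly.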
Formats & delivery
File formats
CSV, JSON
Delivery options
Download, Amazon S3, Google Cloud Storage
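
Once a JSON export has been delivered, the `parentId` and `parentPostId` fields are enough to reconstruct reply threads. The sketch below assumes the JSON file is a single array of records shaped like the sample preview; the actual file layout is not specified on this page. The `index_replies` helper and the second record (`t1_reply01`) are hypothetical and exist only to illustrate a reply.

```python
import json
from collections import defaultdict

# Assumption: the export is one JSON array, as in the sample preview.
# The first record reuses IDs from the preview; the second is invented.
export_text = """
[
  {"id": "t1_xyz789", "parentPostId": "t3_1abcd23", "parentId": "t3_1abcd23"},
  {"id": "t1_reply01", "parentPostId": "t3_1abcd23", "parentId": "t1_xyz789"}
]
"""


def index_replies(records):
    """Map each parent ID (post or comment) to the IDs of its direct replies."""
    children = defaultdict(list)
    for rec in records:
        children[rec["parentId"]].append(rec["id"])
    return dict(children)


records = json.loads(export_text)
replies = index_replies(records)

# A comment whose parentId equals its parentPostId replies directly to the post.
top_level = [r["id"] for r in records if r["parentId"] == r["parentPostId"]]
```

For exports too large to hold in memory, the same indexing logic would need a streaming parser or a database, which is outside the scope of this sketch.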
Ready to get started?
Download a free 100-record sample or customize your exact dataset requirements.
