
Reddit — Comments

120M+ records

from $0.0012/record

Bulk export of public Reddit comments with full text, author, parent post reference, and upvote scores. One of the richest datasets for LLM fine-tuning and discourse analysis.

Customize this dataset

Field dictionary

Field               Type      Description
id                  string    Unique comment ID
body                string    Comment body text
parentPostId        string    ID of the post the comment belongs to
parentId            string    ID of the parent comment or post
authorId            string    ID of the commenting user
authorUsername      string    Username of the commenter
postSubredditName   string    Subreddit the comment is in
postSubredditId     string    ID of the subreddit
score               integer   Net score (upvotes minus downvotes)
upvotes             integer   Number of upvotes
downvotes           integer   Number of downvotes
controversiality    integer   Controversiality flag/score
depth               integer   Nesting depth in the comment tree
isSubmitter         boolean   Whether the commenter is the post author (OP)
stickied            boolean   Whether the comment is pinned
collapsed           boolean   Whether the comment is collapsed by default
edited              boolean   Whether the comment was edited
distinguished       string    Mod/admin distinction label
createdAt           datetime  Comment creation timestamp
createdAtDate       datetime  Comment creation date
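
For typed pipelines, the field dictionary above can be mirrored as a Python `TypedDict`. This is a minimal sketch, not an official schema: the names `RedditComment` and `missing_fields` are hypothetical, the datetime fields are kept as strings because that is how they appear in the JSON sample, and `distinguished` is marked optional on the assumption that it is null for ordinary comments.

```python
from typing import Optional, Set, TypedDict


class RedditComment(TypedDict):
    """One record, following the field dictionary (hypothetical type name)."""
    id: str
    body: str
    parentPostId: str
    parentId: str
    authorId: str
    authorUsername: str
    postSubredditName: str
    postSubredditId: str
    score: int
    upvotes: int
    downvotes: int
    controversiality: int
    depth: int
    isSubmitter: bool
    stickied: bool
    collapsed: bool
    edited: bool
    distinguished: Optional[str]  # assumed null when no mod/admin label applies
    createdAt: str      # ISO 8601 timestamp string in the JSON sample
    createdAtDate: str  # date string in the JSON sample


def missing_fields(record: dict) -> Set[str]:
    """Return the documented field names that are absent from a record."""
    return set(RedditComment.__annotations__) - set(record)
```

Calling `missing_fields` on each row of the free sample is a quick way to confirm that a delivery matches the documented fields before loading it at scale.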

Sample data preview

[
  {
    "id": "t1_xyz789",
    "body": "This is great work. How are you handling rate limits on the keyword search endpoint?",
    "parentPostId": "t3_1abcd23",
    "parentId": "t3_1abcd23",
    "authorId": "t2_d4e5f6",
    "authorUsername": "curious_dev",
    "postSubredditName": "dataisbeautiful",
    "postSubredditId": "t5_2qh6e",
    "score": 184,
    "upvotes": 191,
    "downvotes": 7,
    "controversiality": 0,
    "depth": 1,
    "isSubmitter": false,
    "stickied": false,
    "collapsed": false,
    "edited": false,
    "distinguished": null,
    "createdAt": "2026-06-18T15:42:09.000Z",
    "createdAtDate": "2026-06-18"
  }
]
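
A short Python sketch of how a record like the one above might be parsed and sanity-checked. The helper names are hypothetical. The checks rely only on the field dictionary: `score` is described as upvotes minus downvotes, and a comment whose `parentId` equals its `parentPostId` replies directly to the post. The sample body text is shortened here for brevity.

```python
import json
from datetime import datetime

# The sample record from the preview, trimmed to the fields used below.
SAMPLE = """
[
  {
    "id": "t1_xyz789",
    "body": "This is great work.",
    "parentPostId": "t3_1abcd23",
    "parentId": "t3_1abcd23",
    "score": 184,
    "upvotes": 191,
    "downvotes": 7,
    "createdAt": "2026-06-18T15:42:09.000Z",
    "createdAtDate": "2026-06-18"
  }
]
"""


def parse_created_at(value: str) -> datetime:
    """Parse the ISO 8601 timestamp; 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_top_level(record: dict) -> bool:
    """True when the comment's parent is the post itself."""
    return record["parentId"] == record["parentPostId"]


def score_is_consistent(record: dict) -> bool:
    """True when score equals upvotes minus downvotes."""
    return record["score"] == record["upvotes"] - record["downvotes"]


records = json.loads(SAMPLE)
first = records[0]
created = parse_created_at(first["createdAt"])
```

For the sample record, `is_top_level(first)` and `score_is_consistent(first)` both hold, and `created.date().isoformat()` matches `createdAtDate`.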

Volume & pricing

$250 minimum order. Free 100-record sample included with every dataset.

Volume  Records      Per record  Total (one-time)
1M      1,000,000    $0.0012     $1,200
10M     10,000,000   $0.0010     $10,000
100M    100,000,000  $0.0005     $50,000
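
The totals in the pricing table are the record count multiplied by the per-record price. The sketch below reproduces that arithmetic in Python with `Decimal` to avoid floating-point rounding. `TIERS` and `tier_total` are hypothetical names, and only the three published tiers are covered; the page does not say how volumes between tiers are priced, so no interpolation is attempted.

```python
from decimal import Decimal

# Published tiers: (record count, price per record in USD).
TIERS = [
    (1_000_000, Decimal("0.0012")),
    (10_000_000, Decimal("0.0010")),
    (100_000_000, Decimal("0.0005")),
]

# Minimum order stated on the page, in USD.
MINIMUM_ORDER = Decimal("250")


def tier_total(records: int, per_record: Decimal) -> Decimal:
    """One-time total for a tier: records times per-record price."""
    return records * per_record


totals = {records: tier_total(records, price) for records, price in TIERS}

for records, total in totals.items():
    print(f"{records:>11,} records: ${total:,.2f}")
```

Each published tier total is well above the $250 minimum order, so the minimum only matters for orders smaller than the 1M tier.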

Records are a representative sample of the full dataset, selected across the available range. Need specific records? Build a custom dataset →

Formats & delivery

File formats

CSV, JSON

Delivery options

Download, Amazon S3, Google Cloud Storage

Ready to get started?

Download a free 100-record sample or customize your exact dataset requirements.
