r-wallstreetbets-comments

Reddit comments scraped in real time from /r/wallstreetbets. Some comments may be missing.


Setup

If you've already installed the SDK, you can skip these steps. First, install the library:

pip install --upgrade beneath

Then authenticate your environment by running:

beneath auth

Read the entire table into memory

This snippet loads the entire table into a Pandas DataFrame, which is useful for analysis in notebooks or scripts:

import beneath
df = await beneath.load_full("examples/reddit/r-wallstreetbets-comments")

The function accepts several optional arguments. The most common are to_dataframe=False to get the records as a regular Python list, filter="..." to filter by key fields, and max_bytes=... to raise the cap on how much data is loaded (the cap exists to prevent runaway costs). For more details, see the API reference.
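With to_dataframe=False the records arrive as a regular Python list, so they can be processed with the standard library alone. The sketch below counts comments per calendar day. It assumes each record is a dict-like object whose created_on key field holds a datetime; the comments_per_day helper and the sample records are hypothetical and only illustrate the shape of the data.

```python
from collections import Counter
from datetime import datetime


def comments_per_day(records):
    """Count records per calendar day, keyed by ISO date string."""
    counts = Counter()
    for record in records:
        # Assumption: "created_on" is a datetime (it is one of the table's key fields)
        counts[record["created_on"].date().isoformat()] += 1
    return dict(counts)


# Hypothetical sample standing in for the list returned by
# beneath.load_full(..., to_dataframe=False)
sample = [
    {"created_on": datetime(2021, 1, 27, 14, 5), "id": "a1"},
    {"created_on": datetime(2021, 1, 27, 18, 30), "id": "a2"},
    {"created_on": datetime(2021, 1, 28, 9, 0), "id": "b1"},
]
print(comments_per_day(sample))
```

If the real records use a different type for created_on (for example a string), only the line that derives the day needs to change.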

Replay the table's history and subscribe to changes

This snippet replays the table's historical records one by one and then stays subscribed to new records (with at-least-once delivery), which is useful for alerting and data enrichment:

import beneath
async def callback(record):
    # Called once for every replayed or newly arrived record
    print(record)
await beneath.consume("examples/reddit/r-wallstreetbets-comments", callback)

The function accepts several optional arguments. The most common are replay_only=True to stop the script once the replay has completed, changes_only=True to only subscribe to changes, and subscription_path="ORGANIZATION/PROJECT/subscription:NAME" to persist the consumer's progress.
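Because delivery is at-least-once, the callback may occasionally receive the same record more than once. A minimal sketch of a deduplicating callback follows. It assumes records are dict-like and that the key fields created_on and id identify a comment uniquely; make_deduplicating_callback and handle are hypothetical names, and the sample records are made up for the demonstration.

```python
import asyncio


def make_deduplicating_callback(handle):
    """Wrap `handle` so that each (created_on, id) key is processed only once."""
    seen = set()

    async def callback(record):
        key = (record["created_on"], record["id"])
        if key in seen:
            return  # duplicate delivery, already handled
        await handle(record)
        # Mark the key only after handle() succeeded, so a failed record
        # is processed again if it is redelivered
        seen.add(key)

    return callback


# Demonstration with hypothetical records. With the SDK, `callback` would be
# passed to beneath.consume(...) in place of a plain print callback.
processed = []


async def handle(record):
    processed.append(record["id"])


async def demo():
    callback = make_deduplicating_callback(handle)
    for record in [
        {"created_on": "2021-01-27T14:05:00", "id": "a1"},
        {"created_on": "2021-01-27T14:05:00", "id": "a1"},  # redelivered
        {"created_on": "2021-01-27T14:06:00", "id": "a2"},
    ]:
        await callback(record)


asyncio.run(demo())
print(processed)
```

The in-memory set grows without bound and is lost on restart, so a long-running consumer would likely need a bounded or persistent store instead.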

Look up records by key

Use the snippet below to look up records by key.

import beneath
client = beneath.Client()
table = await client.find_table("examples/reddit/r-wallstreetbets-comments")
cursor = await table.query_index(filter={"created_on": ..., "id": ...})
record = await cursor.read_one()
# records = await cursor.read_next()  # for range or prefix filters that return multiple records

You can also pass filters that match multiple records based on a key range or key prefix. See the filter docs for syntax guidelines.

Analyze with SQL

This snippet runs a warehouse (OLAP) query on the table and returns the result, which is useful for ad-hoc joins, aggregations, and visualizations:

import beneath
df = await beneath.query_warehouse("SELECT count(*) FROM `examples/reddit/r-wallstreetbets-comments`")

See the warehouse queries documentation for a guide to the SQL query syntax.
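The snippets on this page use top-level await, which works in notebooks and other environments that already run an event loop. In a plain Python script, the call has to be wrapped in a coroutine and started with asyncio.run. The sketch below shows one way to do that; count_sql and main are hypothetical helper names, and beneath is imported inside main only so that the query-building helper can be used without the SDK installed.

```python
import asyncio

TABLE = "examples/reddit/r-wallstreetbets-comments"


def count_sql(table_path: str) -> str:
    """Build the row-count query, wrapping the table path in backticks."""
    return f"SELECT count(*) FROM `{table_path}`"


async def main():
    # Imported here so count_sql() can be used without the SDK installed
    import beneath

    df = await beneath.query_warehouse(count_sql(TABLE))
    print(df)


# In a script, start the coroutine explicitly. Uncomment to run against Beneath
# (requires the SDK to be installed and `beneath auth` to have been run):
# asyncio.run(main())
```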

Reference

Consult the Beneath Python client API reference for details on all classes, methods and arguments.

© Beneath Systems