It’s not always easy to distinguish between existentialism and a bad mood.

  • 3 Posts
  • 188 Comments
Joined 3 years ago
Cake day: July 2nd, 2023

  • That was a good read.

    Cory Doctorow wrote:

    It’s not “unethical” to scrape the web in order to create and analyze data-sets. That’s just “a search engine”

    Equating what LLMs do, and what goes into LLM web scraping, with “a search engine” is messed up. The scraping article he links is mostly about how badly copyright works, and how analyzing trade-secret-walled data can benefit both consumers and science while occasionally being bad for citizen privacy. You’ll recognize that as mostly irrelevant to the concerns people tend to have: LLM training-data scrapers DDoSing the fuck out of everything, and all the rest of the stuff tante does a good job of explaining.

    Cory also provides this anecdote:

    As a group of human-rights defending forensic statisticians, HRDAG has always relied on cutting edge mathematics in its analysis. With its Colombia project, HRDAG used a large language model to assign probabilities for responsibility for each killing documented in the databases it analyzed.

    That is, HRDAG was able to rigorously and legibly say, “This killing has an X% probability of having been carried out by a right-wing militia, a Y% probability of having been carried out by the FARC, and a Z% probability of being unrelated to the civil war.”

    The use of large language models – produced from vast corpuses of scraped data – to produce accurate, thorough and comprehensible accounts of the hidden crimes that accompany war and conflict is still in its infancy. But already, these techniques are changing the way we hold criminals to account and bring justice to their victims.

    Scraping to make large language models is good, actually.

    what the actual shit

    edit: I mean, he tried transformer-powered voice-to-text and liked it, and now he’s all in on the “LLMs are a rigorous and accurate tool, actually” bandwagon?

    Also, the web scraping article is from 2023, but CD linked it in the recent Pluralistic post, so I assume his views haven’t changed.

  • The common clay of the new west:

    Transcription

    Twitter post from @BenjaminDEKR

    “OpenClaw is interesting, but will also drain your wallet if you aren’t careful. Last night around midnight I loaded my Anthropic API account with $20, then went to bed. When I woke up, my Anthropic balance was $0. Opus was checking “is it daytime yet?” every 30 minutes, paying $0.75 each time to conclude “no, it’s still night.” Doing literally nothing, OpenClaw spent the entire balance. How? The “Heartbeat” cron job, even though literally the only thing I had going was one silly reminder (“remind me tomorrow to get milk”)”

    Continuation of the Twitter post

    “1. Sent ~120,000 tokens of context to Opus 4.5
    2. Opus read HEARTBEAT.md, thought about reminders
    3. Replied “HEARTBEAT_OK”
    4. Cost: ~$0.75 per heartbeat (cache writes)

    The damage:

    • Overnight = ~25+ heartbeats
    • 25 × $0.75 = ~$18.75 just from heartbeats alone
    • Plus regular conversation = ~$20 total

    The absurdity: Opus was essentially checking “is it daytime yet?” every 30 minutes, paying $0.75 each time to conclude “no, it’s still night.”

    The problem is:

    1. Heartbeat uses Opus (most expensive model) for a trivial check
    2. Sends the entire conversation context (~120k tokens) each time
    3. Runs every 30 minutes regardless of whether anything needs checking

    That’s $750 a month if this runs, to occasionally remind me stuff? Yeah, no. Not great.”
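
    The math checks out. A quick back-of-the-envelope sketch using only the tweet’s own figures (the ~$0.75 cache-write cost per heartbeat and the 30-minute interval are quoted; everything else is arithmetic):

    ```python
    # Heartbeat cost math, using only the figures quoted in the tweet.
    COST_PER_HEARTBEAT = 0.75  # ~120k tokens of cache writes to Opus 4.5
    INTERVAL_MINUTES = 30

    heartbeats_per_day = 24 * 60 // INTERVAL_MINUTES      # 48
    overnight_cost = 25 * COST_PER_HEARTBEAT              # $18.75, as in the tweet
    daily_cost = heartbeats_per_day * COST_PER_HEARTBEAT  # $36.00
    monthly_cost = daily_cost * 30                        # $1080.00

    print(f"overnight: ${overnight_cost:.2f}")
    print(f"per day:   ${daily_cost:.2f}")
    print(f"per month: ${monthly_cost:.2f}")
    ```

    Run around the clock, that’s 48 heartbeats a day, or ~$1080/month at the quoted rate, so if anything the tweet’s $750 figure is optimistic.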

  • I’m planning on using this data to catalog “in the wild” instances of agents resisting shutdown, attempting to acquire resources, and avoiding oversight.

    He’ll probably do this by running an agent that uses a chatbot with the Playwright MCP to occasionally scrape the site, then feed that to a second agent that filters the posts for suspect behavior, then to another agent that summarizes them into a report, then to yet another agent that decides whether the report is worth him reading and messages him through his socials. Maybe another agent with DB access to log the flagged posts at some point. Something like the sketch below.
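
    A tongue-in-cheek, runnable sketch of that pipeline. Every function here is a stub standing in for yet another chatbot call; all the names are invented for illustration, and none of this is a real API:

    ```python
    # Hypothetical agent pipeline, fully stubbed out. None of these are real
    # APIs; each function stands in for one more round-trip to a chatbot.

    def scraper_agent(url: str) -> list[str]:
        # a chatbot driving the Playwright MCP to scrape the site
        return [f"post fetched from {url}"]

    def filter_agent(posts: list[str]) -> list[str]:
        # a second chatbot flags posts showing "suspect behavior"
        return [p for p in posts if "post" in p]

    def summarizer_agent(flagged: list[str]) -> str:
        # a third chatbot writes the report
        return f"report on {len(flagged)} suspicious post(s)"

    def triage_agent(report: str) -> bool:
        # a fourth chatbot decides whether the report is worth his time
        return len(report) > 0

    def notifier_agent(report: str) -> None:
        # a fifth chatbot messages him through his socials
        print(f"DM: {report}")

    if __name__ == "__main__":
        flagged = filter_agent(scraper_agent("https://the.site"))
        report = summarizer_agent(flagged)
        if triage_agent(report):
            notifier_agent(report)
        # maybe another agent with DB access logs `flagged`, at some point
    ```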

    All this will be worth it to no one except the bot vendors.