How to stop spending money on US LLMs.

Kkk2237pl@szmer.info · 5 days ago

How to stop spending money on US LLMs.

Akh@lemmy.world · 5 days ago

I just think for myself

Im_old@lemmy.world · 5 days ago

Mistral is based in France

dudesss@lemmy.ca · 5 days ago

What unique advantage and service does Mistral offer?

Warl0k3@lemmy.world · 5 days ago

France isn’t in the US.

dudesss@lemmy.ca · 5 days ago

Which is great. I support France and Mistral. Is there anything else?

Warl0k3@lemmy.world · 5 days ago

Does there need to be…?

lurch (he/him)@sh.itjust.works · 5 days ago

being in france

dudesss@lemmy.ca · 4 days ago

Nice, I use Mistral

k0e3@lemmy.ca · 5 days ago

I had been using it for almost a year but it’s really dumb compared to the big three US LLMs. I had to unsubscribe since “it’s not US” alone didn’t justify the fees.

lime!@feddit.nu · 5 days ago

mistral?

[object Object]@lemmy.ca · 5 days ago

So self hosting is still not great.

The big problem is that you can get lots of memory but slow prompt processing, which limits your usable context window, or you can get a semi-fast GPU with little memory, where you’re capped on model size.

Sometimes I run pi agent in a container with Gemma 4 or Qwen 3.6, but even on Strix Halo, after 60k tokens the quadratic slowdown is brutal.

We aren’t there yet for complex agentic workflows locally, and it’s primarily a hardware issue.

Though innovations in performance are being shipped regularly, they’re incremental.
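To put a rough number on the "quadratic slowdown" mentioned above, here is a minimal sketch. It assumes the cost of processing a prompt is dominated by self-attention, which scales with the square of the context length; real runtimes also have linear terms and optimizations, so treat the ratios as an upper-end intuition rather than a benchmark. The function name and the 8k baseline are made up for illustration.

```python
def relative_attention_cost(context_tokens: int, baseline_tokens: int) -> float:
    """Ratio of self-attention work at context_tokens versus baseline_tokens.

    Assumes cost grows with the square of the context length (n^2) and
    ignores everything else (MLP layers, memory bandwidth, caching).
    """
    return (context_tokens / baseline_tokens) ** 2


# Compare a few context sizes against a hypothetical 8k-token baseline.
for n in (8_000, 16_000, 32_000, 60_000, 120_000):
    ratio = relative_attention_cost(n, 8_000)
    print(f"{n:>7} tokens: {ratio:6.1f}x the attention work of 8k")
```

Under that assumption, doubling the context quadruples the attention work, which is why long agentic sessions feel fine early on and then fall off a cliff.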

Warl0k3@lemmy.world · 5 days ago

There are a few of the Chinese open models that are okay for coding, but in terms of functionality they’re extremely basic. You can make them work, but if people are used to the big corpo models it’s going to be hard to get them to switch to what is basically a chatbot, and the open-source tools to give them much-needed QoL functionality are pretty rough right now.

For sure worth looking into self hosting but it’s going to take quite a bit of convincing to get people to shift over, I fear.

gravitas_deficiency@sh.itjust.works · 5 days ago

There’s a ton of content out there about locally hosting LLMs and ML models in general, and a number of newer novel techniques and approaches to successfully running models that are rather a lot bigger than your VRAM. I’d start by searching around for that stuff.

MystValkyrie@lemmy.blahaj.zone · 4 days ago

Is 128 GB of RAM per unit enough for your organization’s use case? You could convince them to buy a Framework Desktop and then install an offline LLM on it (Ollama with Mistral, perhaps). Then you don’t have to rely on American companies or contribute to the environmental impact of data centers, and after the startup cost, it’s free from then on.

Best of all, they can just be normal work computers when the bubble bursts.

I wish I could just say, “Convince your company not to use AI,” but I’m sure your higher-ups aren’t taking no for an answer.
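On the "is 128 GB enough" question above, a back-of-the-envelope check is possible. This is a rough sketch, assuming memory is dominated by the model weights (parameter count times bits per weight) plus some headroom for context and runtime overhead. The 20% headroom and the example model sizes are illustrative assumptions, not measurements of any specific model.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, ram_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an assumed overhead fraction fit in ram_gb."""
    needed = weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= ram_gb


# Hypothetical model sizes, checked against a 128 GB machine.
for params, bits in ((7, 16), (70, 4), (70, 16), (120, 4)):
    gb = weight_memory_gb(params, bits)
    verdict = "fits" if fits(params, bits, 128) else "does not fit"
    print(f"{params}B at {bits}-bit: ~{gb:.0f} GB of weights, {verdict} in 128 GB")
```

Fitting in memory only answers half the question. How fast the machine can process long prompts is a separate constraint, as noted elsewhere in the thread.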

lukecyca@lemmy.ca · 5 days ago

Pi.dev with Qwen3.6 running on a modest 6GB GPU is actually working pretty well for me, at least for smallish, well-scoped agentic code tasks.

Sims@lemmy.ml · 5 days ago

Something like AI Horde as the foundation?

Noxy@pawb.social · 4 days ago

self hosting an orphan crusher doesn’t sound like a meaningful improvement