Hello! Basically, I need to process a very large (4000 lines) file, and free AI chatbots like ChatGPT can't handle it. I'd like to split it into smaller parts and process each part separately. However, I'm having a very hard time finding a chatbot with a free API. The only one I found is HuggingChat, but even waiting 1 second between requests, it starts giving rate limit errors after a few.
Any suggestions? Thanks in advance!
EDIT: I also tried to run gpt4all on my laptop (with integrated graphics) and it took like 2-5 minutes to answer a simple “hello” prompt, so it’s not really feasible :(
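To be clear, the splitting itself isn’t the problem. This is roughly what I have in mind, in Python; `ask_chatbot` is just a placeholder for whatever free API I end up using, and the chunk size and delay are guesses:

```python
import time
from pathlib import Path

CHUNK_LINES = 200      # lines per request; adjust to the model's context window
DELAY_SECONDS = 10     # pause between requests to try to stay under rate limits

def ask_chatbot(prompt: str) -> str:
    """Placeholder: call whatever chatbot/API ends up working here."""
    raise NotImplementedError

lines = Path("big_file.txt").read_text(encoding="utf-8").splitlines()
chunks = [lines[i:i + CHUNK_LINES] for i in range(0, len(lines), CHUNK_LINES)]

results = []
for n, chunk in enumerate(chunks, start=1):
    prompt = "Process the following part of a larger file:\n\n" + "\n".join(chunk)
    results.append(ask_chatbot(prompt))
    print(f"chunk {n}/{len(chunks)} done")
    time.sleep(DELAY_SECONDS)

Path("output.txt").write_text("\n\n".join(results), encoding="utf-8")
```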
Like the other responses, I recommend a local model, for a number of reasons including privacy and cost.
Ollama is a front end that lets you run several kinds of models on Windows and Linux. Most will run without a GPU, but the performance will be bad. If your only compute device is a laptop without a GPU, you’re out of luck running things locally with any speed. That said, if you need to process a large file and have time to just let the laptop cook, you can probably still get what you need overnight or over a weekend.
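For the “let the laptop cook” approach, Ollama exposes a local HTTP API on port 11434 that’s easy to loop over from a script. A minimal sketch, assuming you’ve already pulled a model (the `llama3` name here is just an example):

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3"  # example model name; assumes it was pulled beforehand with `ollama pull`

def ask_local(prompt: str) -> str:
    """Send one prompt to the local Ollama server and return the full response text."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=3600,  # CPU-only generation can be very slow, so allow a long timeout
    )
    resp.raise_for_status()
    return resp.json()["response"]

# Example: summarize one chunk of the big file
print(ask_local("Summarize this part of a larger file:\n\n" + open("chunk_01.txt").read()))
```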
If you really need something faster soon, you can probably buy a cheap ($500-800) off-the-shelf gaming PC from a local electronics store like Best Buy, Micro Center, or Walmart and get more ‘bang for your buck’ over the longer term running a model locally, assuming this isn’t a one-off need. Aim for >=16GB RAM on the PC itself and >=10GB VRAM on the GPU for real-time responses. I have a 10GB RTX 3080 and have success running 8B models on my computer. I’m able to run a 70B model, but it’s a slideshow. The ‘B’ here is billions of parameters; on top of the weights you also need memory for context (history). Depending on what your 4k lines really means (book pages/printed text? code?), a 7-10B model can probably keep it all ‘loaded in memory’ and respond to questions about the file without forgetting parts of it. (Rough sizing math below.)
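For a rough sense of what fits on a given card, a back-of-envelope estimate is parameters times bytes per parameter, plus some allowance for context. The numbers below are assumptions for a 4-bit quantized model, not exact figures:

```python
# Back-of-envelope VRAM estimate (assumptions, not exact figures)
params_billion = 8          # an "8B" model
bytes_per_param = 0.5       # ~0.5 bytes/parameter at 4-bit quantization; ~2.0 for fp16
context_overhead_gb = 1.5   # assumed rough allowance for KV cache / context history

vram_gb = params_billion * bytes_per_param + context_overhead_gb
print(f"~{vram_gb:.1f} GB VRAM")  # ~5.5 GB: fits a 10GB card; a 70B model (~35+ GB) does not
```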
From a privacy perspective, I also HIGHLY recommend not using the various online front ends. There’s no guarantee that any info you upload to them stays private, and generally their privacy policies have a line like ‘we collect information about your interactions with us including but not limited to user generated content, such as text input and images…’, effectively meaning anything you send them is theirs to keep. If your 4k-line file is in any way business related, you shouldn’t send it to a service you don’t operate.
Additionally, as much as I enjoy playing with these tools, I’m an AI skeptic. Ensure you review the response and can sanity check it – AI/LLMs are not actually intelligent and will make shit up.
Ollama. It’s free and doesn’t require internet to use (only during installation).
Gemini Flash’s free API tier is pretty reasonable if you can chunk the file.
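Seconding this. With the google-generativeai Python package the chunk loop is short. A rough sketch, assuming an API key in a `GEMINI_API_KEY` environment variable and the `gemini-1.5-flash` model name (use whatever Flash variant is current):

```python
import os
import time
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumes the key is in this env var
model = genai.GenerativeModel("gemini-1.5-flash")       # assumed model name; check the current Flash model

with open("big_file.txt", encoding="utf-8") as f:
    lines = f.read().splitlines()

chunk_size = 200  # lines per request; tune to the file and prompt
for i in range(0, len(lines), chunk_size):
    chunk = "\n".join(lines[i:i + chunk_size])
    reply = model.generate_content("Process this part of a larger file:\n\n" + chunk)
    print(reply.text)
    time.sleep(5)  # crude spacing to stay under the free tier's per-minute request limit
```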
What model size did you run on your laptop? I have an Intel NUC with an i7 and I run various models on CPU (it doesn’t have a dedicated GPU); while I can’t run anything larger than ~14B or so, models up to around ~7B aren’t too slow. If I try to run a 32B, I get a similar experience to you. I tend not to go below 4B because that’s when it starts being dumb and not following instructions well, so it just depends on how complex your task is.
OpenRouter has free models, but you are severely rate-limited until you purchase $10 in credits.
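For what it’s worth, OpenRouter speaks the OpenAI-compatible API, so the standard openai Python client works with just a different base URL. A minimal sketch; the model id here is only an example of their ‘:free’ variants, so check their current list:

```python
import os
from openai import OpenAI

# OpenRouter is OpenAI-compatible: same client, different base URL and key
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumes the key is in this env var
)

resp = client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct:free",  # example free model id; may change
    messages=[{"role": "user", "content": "Process this part of a larger file:\n\n" + open("chunk_01.txt").read()}],
)
print(resp.choices[0].message.content)
```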