• Yuri@lemmygrad.ml
    2 days ago

    Technically, the capability is really simple; it could take them less than a day to implement. But I think the reason he gave the “1 year” timeline is that it would greatly increase compute requirements: tools take more input tokens, and they also multiply the number of requests, because the flow becomes prompt -> tool (start timer) -> response -> prompt -> tool (end timer) -> response, whereas it’s only two requests for the model to hallucinate something. They’re already struggling to keep up with compute without adding tools into the mix.

    • CriticalResist8@lemmygrad.ml
      2 days ago

      You’re right, I was imagining GPT generating a timer with Python (since they pioneered having the AI generate and run code in the web interface), but making tools the LLM can call is probably much simpler in terms of compute.

      Which brings me to the next step: OAI seems to be doing so badly recently that I imagine they have much more pressing matters to attend to before adding a timer for the probably <5% of users who need one.

      They also have to reckon with the fact that people are using GPT in these kinds of ways. Like, yeah, you could just open your own timer app on your phone, but what people want is live chatting while on their run. We can speculate as to why, and whether that’s good or not, but regardless it’s a strain on OAI’s servers that doesn’t make them any money. Frankly, I’ve said this for a while now, but I don’t see OAI surviving much longer, especially when Claude is the best of the best for coding and agentic use, which is where the providers are headed, and GPT has been lazily lagging behind.

      I deleted my ChatGPT account some time ago; it’s all DeepSeek for me from here on out lol

      edit: instead of buying 40% of the world’s wafers and driving up memory prices, they should invest some of that taxpayer money into rebuilding their model from the ground up with new innovations, but what do I know, I’m not a sociopathic tech CEO

      • Yuri@lemmygrad.ml
        2 days ago

        > but what people want is live chatting while on their run

        What’s interesting is that you can run a small model like Qwen3.5-9B (it basically runs on any GPU with >=8GB of VRAM) that is trained on agentic tool use, and it would get this right 99% of the time without the massive compute costs. Even modern phones can run LLMs that could do this.

        The future for this kind of thing is local, not a 1T-parameter model running in the cloud and polluting the environment for no reason.

      • Yuri@lemmygrad.ml
        2 days ago

        But there is no root cause: LLMs have no sense of time. You have to inject that yourself, by giving the model access to code execution or something, or it’ll never be able to do anything about it. Another solution would be to inject the current time into the system prompt and train the model to put the current time into context when a user asks for a timer; then, when the user asks again, it can take the diff between the time it first put into context and the time it now gets from the system prompt. But that’s a bit more complicated, and Sam himself said in that interview that they want to use tools for this.
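
        The time-injection idea above can be sketched like this, assuming a made-up build_system_prompt helper and hard-coded timestamps standing in for what the model would actually see:

```python
from datetime import datetime, timedelta

# Sketch of the "inject the current time into the system prompt"
# approach; build_system_prompt is a hypothetical helper, not a real API.
def build_system_prompt(now: datetime) -> str:
    return f"Current time: {now.isoformat(timespec='seconds')}"

# Turn 1: user asks for a timer. The model copies the injected time
# into its reply, so the start time lands in the conversation context.
t0 = datetime(2025, 1, 1, 12, 0, 0)
context = [build_system_prompt(t0), "assistant: timer started at 12:00:00"]

# Turn 2: user asks how long it's been. The model diffs the time it
# previously put into context against the freshly injected time.
t1 = t0 + timedelta(minutes=25)
elapsed = t1 - t0
reply = f"about {int(elapsed.total_seconds() // 60)} minutes"
```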