Hexadecimal

Track_Shovel@slrpnk.net · 18 hours ago

Hexadecimal

GissaMittJobb@lemmy.ml · 16 hours ago

Is this real? Because of how LLMs tokenize their input, this can actually be a pretty tricky task for them to accomplish. This is also the reason why it’s hard for them to count the number of ‘R’s in the word ‘Strawberry’.

kautau@lemmy.world · 10 hours ago

It’s probably DeepSeek R1, which is a “reasoning” model: before answering, it writes out a long chain of intermediate “thinking” text and works toward the result step by step, trying to imitate the way humans think. That being said, models are getting “agentic”, meaning they have the ability to run software tools against what you send them. While that’s obviously being super hyped up by all the tech bro accelerationists, it is likely where LLMs and the like are headed, for better or for worse.

GissaMittJobb@lemmy.ml · 9 hours ago

Still, this does not quite address the issue of tokenization making it difficult for most models to accurately distinguish between the hexadecimals here.

Having the model write code to solve a problem and then asking it to execute that code is an established technique to circumvent this issue, but all of the model interfaces I know of with this capability are very explicit about when they are making use of this tool.
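For illustration, the kind of code a model might write for this is tiny. This is only a sketch: the function name `decode_hex` and the sample input are made up for the example, and it assumes the hex encodes UTF-8 text.

```python
def decode_hex(hex_text: str) -> str:
    """Decode a string of hex byte pairs into text (hypothetical helper)."""
    # bytes.fromhex ignores ASCII whitespace between byte pairs
    raw = bytes.fromhex(hex_text)
    # Assumption: the payload is UTF-8; undecodable bytes become U+FFFD
    return raw.decode("utf-8", errors="replace")


message = decode_hex("48 65 6c 6c 6f")
print(message)  # prints "Hello"
```

Once the interpreter does the decoding, how the model's tokenizer split the hex string no longer matters.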

morrowind@lemmy.ml · 8 hours ago

Not really a concern. It’s basically translation, which language models excel at. It just needs a mapping from hex pairs to bytes.
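To show what I mean by a mapping, here is a rough sketch of it written out as a plain lookup table. The names `HEX_TO_BYTE` and `translate` are just for this example, and this is how ordinary code would do it, not a claim about what happens inside a model.

```python
# One entry per possible byte: "00" -> 0, "01" -> 1, ..., "ff" -> 255
HEX_TO_BYTE = {f"{value:02x}": value for value in range(256)}


def translate(hex_text: str) -> bytes:
    """Translate hex pairs to bytes via table lookup (hypothetical helper)."""
    # Drop whitespace and normalise case so "6C" and "6c" hit the same entry
    digits = "".join(hex_text.split()).lower()
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    # A pair that is not valid hex raises KeyError here
    return bytes(HEX_TO_BYTE[pair] for pair in pairs)
```

The whole “language” has only 256 words, which is why it looks like an easy translation task.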

GissaMittJobb@lemmy.ml · 6 hours ago

It is a concern.

Check out https://tiktokenizer.vercel.app/?model=deepseek-ai%2FDeepSeek-R1 and try entering some freeform hexadecimal data - you’ll notice that it does not cleanly segment the hexadecimal numbers into individual tokens.
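If you want to see the effect without the website, here is a toy sketch. The vocabulary below is entirely made up, and greedy longest-match is a simplification (real tokenizers such as BPE work from learned merges), so this is not DeepSeek’s actual tokenizer. It only shows how token boundaries can land in the middle of a byte.

```python
# Hypothetical vocabulary: every single hex digit plus a few 3-digit chunks
TOY_VOCAB = set("0123456789abcdef") | {"486", "56c", "6c6"}


def toy_tokenize(text: str, vocab=TOY_VOCAB) -> list[str]:
    """Greedy longest-match tokenization over a toy vocabulary."""
    tokens = []
    pos = 0
    max_len = max(len(entry) for entry in vocab)
    while pos < len(text):
        # Try the longest candidate first, then shrink
        for size in range(min(max_len, len(text) - pos), 0, -1):
            piece = text[pos:pos + size]
            if piece in vocab:
                tokens.append(piece)
                pos += size
                break
        else:
            raise ValueError(f"no token matches at position {pos}")
    return tokens


def byte_aligned(tokens: list[str]) -> bool:
    """True if every token covers whole bytes (even start, even length)."""
    offset = 0
    for token in tokens:
        if offset % 2 or len(token) % 2:
            return False
        offset += len(token)
    return True


tokens = toy_tokenize("48656c6c6f")  # the hex for "Hello"
print(tokens)                # ['486', '56c', '6c6', 'f']
print(byte_aligned(tokens))  # False
```

With this vocabulary the byte `65` is split across two tokens, which is the kind of misalignment I’m talking about.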

morrowind@lemmy.ml · 6 hours ago

I’m well aware, but you don’t necessarily need to see each character to translate to bytes.

GissaMittJobb@lemmy.ml · 6 hours ago

It’s not out of the question that we get emergent behaviour where the model can connect non-optimally mapped tokens and still translate them correctly, yeah.