I've just created c/Ollama!

catty@lemmy.world · edit-2 2 days ago

I've just created c/Ollama!

WhirlpoolBrewer@lemmings.world · 1 day ago

brucethemoose@lemmy.world · edit-2 1 day ago

8GB?

You might be able to run Qwen3 4B: https://huggingface.co/mlx-community/Qwen3-4B-4bit-DWQ/tree/main

But honestly you don’t have enough RAM to spare, and even a small model might bog things down. I’d run Open Web UI or LM Studio with a free LLM API, like Gemini Flash, or pay a few bucks for something off openrouter. Or maybe Cerebras API.

…Unfortunely, LLMs are very RAM intensive, and >4GB (more realistically like 2GB) is not going to be a good experience :(

WhirlpoolBrewer@lemmings.world · 1 day ago

Good to know. I’d hate to buy a new machine strictly for running an LLM. Could be an excuse to pickup something like a Framework 16, but realistically, I don’t see myself doing that. I think you might be right about using something like Open Web UI or LM Studio.

brucethemoose@lemmy.world · edit-2 1 day ago

Yeah, just paying for LLM APIs is dirt cheap, and they (supposedly) don’t scrape data. Again I’d recommend Openrouter and Cerebras! And you get your pick of models to try from them.

Even a framework 16 is not good for LLMs TBH. The Framework desktop is (as it uses a special AMD chip), but it’s very expensive. Honestly the whole hardware market is so screwed up, hence most ‘local LLM enthusiasts’ buy a used RTX 3090 and stick them in desktops or servers, as no one wants to produce something affordable apparently :/

~> psudojo@witsEnd <~@ioc.exchange · 1 day ago

@brucethemoose @WhirlpoolBrewer

*1650 and it works like a charm 🤌🏾

brucethemoose@lemmy.world · edit-2 1 day ago

1650

You mean GPU? Yeah, it’s good, I was strictly talking about purchasing a laptop for LLM usage, as most are less than ideal for the money. Laptop vram pools are relatively small and SO-DIMMS are usually very slow.

Things will get much better once the “Max” AMD SKUs proliferate.