I’ve just rediscovered ollama, and it’s come a long way: it has reduced the very difficult task of locally hosting your own LLM (and getting it running on a GPU) to simply installing a deb! It also works on Windows and Mac, so it can help everyone.
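For example, once the service is up, talking to a model takes just a few lines. Here’s a minimal sketch using the official `ollama` Python client (the model name is just an example; pull whatever fits your hardware):

```python
# Minimal sketch using the official ollama Python client (pip install ollama).
# Assumes the ollama service is running locally and that "llama3.2" (an
# example model) has been pulled first with: ollama pull llama3.2
import ollama

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```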

I’d like to see Lemmy become useful for specific technical niches, instead of everyone having to guess at the best existing community, which is subjective and makes information difficult to find. So I created [email protected] for everyone to discuss ollama, ask questions, and help each other out!

So please join, subscribe, and feel free to post: ask questions, share tips and projects, and help out where you can!

Thanks!

    • brucethemoose@lemmy.world · 1 day ago

      8GB?

      You might be able to run Qwen3 4B: https://huggingface.co/mlx-community/Qwen3-4B-4bit-DWQ/tree/main

      But honestly you don’t have enough RAM to spare, and even a small model might bog things down. I’d run Open WebUI or LM Studio with a free LLM API, like Gemini Flash, or pay a few bucks for something off OpenRouter. Or maybe the Cerebras API.

      …Unfortunately, LLMs are very RAM intensive, and with less than 4GB to spare (more realistically like 2GB) you’re not going to have a good experience :(
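      As a back-of-the-envelope sketch of why (assuming a 4B model at 4-bit quantization, like the Qwen3 link above):

      ```python
      # Rough RAM estimate for a 4B-parameter model at 4-bit quantization.
      # Real usage is higher: KV cache, runtime overhead, and the OS all
      # compete for the same 8GB.
      params = 4e9            # ~4 billion parameters
      bytes_per_param = 0.5   # 4 bits = half a byte per weight
      weights_gb = params * bytes_per_param / 1e9
      print(f"Weights alone: ~{weights_gb:.1f} GB")  # ~2.0 GB before overhead
      ```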

      • WhirlpoolBrewer@lemmings.world · 1 day ago

        Good to know. I’d hate to buy a new machine strictly for running an LLM. It could be an excuse to pick up something like a Framework 16, but realistically, I don’t see myself doing that. I think you might be right about using something like Open WebUI or LM Studio.

        • brucethemoose@lemmy.world · 1 day ago

          Yeah, just paying for LLM APIs is dirt cheap, and they (supposedly) don’t scrape your data. Again, I’d recommend OpenRouter and Cerebras! And you get your pick of models to try from them.
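          For instance, OpenRouter exposes an OpenAI-compatible endpoint, so a minimal sketch with the standard `openai` client looks like this (the model name is just an example; browse openrouter.ai for the current list):

          ```python
          # Minimal sketch: querying OpenRouter via its OpenAI-compatible API
          # (pip install openai). The model name is an example; substitute any
          # model from openrouter.ai, and use your own API key.
          from openai import OpenAI

          client = OpenAI(
              base_url="https://openrouter.ai/api/v1",
              api_key="YOUR_OPENROUTER_API_KEY",
          )
          completion = client.chat.completions.create(
              model="meta-llama/llama-3.3-70b-instruct",
              messages=[{"role": "user", "content": "Hello!"}],
          )
          print(completion.choices[0].message.content)
          ```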

          Even a Framework 16 is not good for LLMs, TBH. The Framework Desktop is (as it uses a special AMD chip), but it’s very expensive. Honestly the whole hardware market is so screwed up that most ‘local LLM enthusiasts’ buy a used RTX 3090 and stick it in a desktop or server, as no one seems willing to produce something affordable :/