Google released Gemma 4 12B today. I'm a huge fan of the Gemma model family: the models have improved with each iteration and consistently perform on par with larger ones. It didn't run at first because it needs more VRAM than my laptop has, but there's a workaround. Here are short instructions on how to run … Continue reading Running large LLMs on small hardware: Gemma 4 12B on a VRAM-constrained Radeon laptop
Tag: ollama
Docker, Ollama, Ubuntu & Radeon GPU
Just a quickie: this is the command I'm using on my Acer Nitro laptop to run Ollama in Docker with GPU acceleration:
group_id_video=$(getent group video | cut -d: -f3)
group_id_render=$(getent group render | cut -d: -f3)
docker run -d \
  --privileged \
  --device /dev/kfd \
  --device /dev/dri \
  --volume ollama:/root/.ollama \
  --volume "/some/path/ollama:/images" \
  --group-add …
Continue reading Docker, Ollama, Ubuntu & Radeon GPU