I set out to deploy LLM-assisted coding while minimizing the use of closed-source software. I ended up accomplishing this with the following stack: Fedora Silverblue as the base system, Ollama running in a Distrobox container, and VSCodium with the Continue extension.
It was surprisingly simple, yet apparently undocumented. The conventional route would have been the Ollama Docker image, but NVIDIA’s documentation would have had me install an entire array of unnecessary software to get my GPU (an NVIDIA GeForce RTX 3050) working with it, and installing Ollama directly on the host wasn’t a good option since I’m running Fedora Silverblue. Here’s how I got it working instead:
- Create a new Distrobox running a well-known distro (I used Ubuntu 24.04). Ensure that NVIDIA support is turned on prior to creating the box; DistroShelf is a great frontend for this purpose. (See the command-line sketch after this list.)
- Install Ollama in the Distrobox, run `ollama serve`, and keep it running in the background.
- On the base system, install VSCodium (ideally via Flatpak), then install the Continue extension.
- Configure Continue to run entirely locally, ensure that it reports a successful connection to Ollama, and then run the model-download commands it suggests inside the Distrobox (running them on the base system, which does not have Ollama, will obviously fail). A sample configuration follows below.
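For reference, here is a minimal sketch of the Distrobox and Ollama steps as shell commands. The box name (`llm-box`) and the model tag are illustrative choices, not necessarily what I used; the install one-liner is the standard one from ollama.com:

```sh
# Create an Ubuntu box with the host's NVIDIA driver integrated
# (the --nvidia flag is what DistroShelf's NVIDIA toggle corresponds to)
distrobox create --name llm-box --image ubuntu:24.04 --nvidia
distrobox enter llm-box

# Inside the box: install Ollama and confirm the GPU is visible
curl -fsSL https://ollama.com/install.sh | sh
nvidia-smi   # should list the RTX 3050

# Start the server and leave it running in the background
ollama serve &

# Pull a coding model (an example; use whatever Continue recommends)
ollama pull qwen2.5-coder:7b
```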
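And here is a sketch of what a fully local Continue setup can look like in its `config.json`, using the Ollama provider; the model names are just examples:

```json
{
  "models": [
    {
      "title": "Qwen2.5 Coder (local)",
      "provider": "ollama",
      "model": "qwen2.5-coder:7b"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen2.5 Coder (autocomplete)",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b"
  }
}
```

Because Distrobox containers share the host’s network namespace, Continue on the base system reaches the server at Ollama’s default address (http://localhost:11434) with no extra configuration.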
That was that. No need for Cursor, a subscription to one of the big LLM providers, or any similar expense. The heaviest model I’m using weighs in at 4 gigabytes. The only real limitation is having a GPU sufficient for the job (I doubt integrated graphics would cut it).