Ollama
Ollama is a tool that allows users to run large language models (LLMs) directly on their own computers, making powerful AI technology accessible without relying on cloud services. It provides a user-friendly way to manage, deploy, and integrate LLMs, offering greater control, privacy, and customization compared to traditional cloud-based solutions.
Ollama was funded by Jared Friedman out of Y Combinator (YC). Founders Jeffrey Morgan and Michael Chiang wanted an easier way to run LLMs than having to do it in the cloud. In fact, they were previously founders of a startup project named Kitematic which was the early UI for Docker. Acquired by Docker, it was the precursor to Docker Desktop.
Using it
Although you can download or install it from the repo on GitHub https://github.com/ollama/ollama, you can also run it as a docker image ollama/ollama[1]
Because I have a GeForce RTX 4060 NVidia GPU, I had to install the NVidia Container Toolkit, and configure Docker to use the NVidia driver
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \ | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \ | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \ | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list sudo apt-get update
Problems
The docs advise to
(sudo) docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
which clearly runs as root, however my Docker (Desktop) is running as non-root user, so although I previously fetched the image through Docker Desktop, the CLI command couldn't find it and downloaded another copy. And spit out additional errors:
Unable to find image 'ollama/ollama:latest' locally latest: Pulling from ollama/ollama 13b7e930469f: Pull complete 97ca0261c313: Pull complete 2ace2f9dde9e: Pull complete 41ea4d361810: Pull complete Digest: sha256:50ab2378567a62b811a2967759dd91f254864c3495cbe50576bd8a85bc6edd56 Status: Downloaded newer image for ollama/ollama:latest 40be284dab1709b74fa68d513f75c10239d7234a21d65aac1e80cbd743515498 docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: nvml error: driver/library version mismatch: unknown
The important part seems to be 'Auto-detected mode as legacy'
Running the image from Docker Desktop, with setting optons for ports and volumes, and copying the 'run' command spits out:
docker run --hostname=3f50cd4183bd --mac-address=02:42:ac:11:00:02 --env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --env=LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 --env=NVIDIA_DRIVER_CAPABILITIES=compute,utility --env=NVIDIA_VISIBLE_DEVICES=all --env=OLLAMA_HOST=0.0.0.0:11434 --volume=ollama:/root/.ollama --network=bridge -p 11434:11434 --restart=no --label='org.opencontainers.image.ref.name=ubuntu' --label='org.opencontainers.image.version=20.04' --runtime=runc -d ollama/ollama:latest
http://localhost:11434/ just reveals 'Ollama is running' in 'hello world' style.