Ollama

Ollama is a tool that allows users to run large language models (LLMs) directly on their own computers, making powerful AI technology accessible without relying on cloud services. It provides a user-friendly way to manage, deploy, and integrate LLMs, offering greater control, privacy, and customization compared to traditional cloud-based solutions.

Ollama was funded by Jared Friedman out of Y Combinator (YC). Founders Jeffrey Morgan and Michael Chiang wanted an easier way to run LLMs than having to do it in the cloud. In fact, they were previously founders of a startup project named Kitematic which was the early UI for Docker. Acquired by Docker, it was the precursor to Docker Desktop.

Using it

Although you can download or install it from the repo on GitHub https://github.com/ollama/ollama, you can also run it as a docker image ollama/ollama^[1]

Because I have a GeForce RTX 4060 NVidia GPU, I had to install the NVidia Container Toolkit, and configure Docker to use the NVidia driver

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
    | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update

Problems

The docs advise to

(sudo) docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

which clearly runs as root, however my Docker (Desktop) is running as non-root user, so although I previously fetched the image through Docker Desktop, the CLI command couldn't find it and downloaded another copy. And spit out additional errors:

Unable to find image 'ollama/ollama:latest' locally
latest: Pulling from ollama/ollama
13b7e930469f: Pull complete 
97ca0261c313: Pull complete 
2ace2f9dde9e: Pull complete 
41ea4d361810: Pull complete 
Digest: sha256:50ab2378567a62b811a2967759dd91f254864c3495cbe50576bd8a85bc6edd56
Status: Downloaded newer image for ollama/ollama:latest
40be284dab1709b74fa68d513f75c10239d7234a21d65aac1e80cbd743515498
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver/library version mismatch: unknown

The important part seems to be 'Auto-detected mode as legacy'

Running the image from Docker Desktop, with setting optons for ports and volumes, and copying the 'run' command spits out:

docker run --hostname=3f50cd4183bd --mac-address=02:42:ac:11:00:02 --env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --env=LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 --env=NVIDIA_DRIVER_CAPABILITIES=compute,utility --env=NVIDIA_VISIBLE_DEVICES=all --env=OLLAMA_HOST=0.0.0.0:11434 --volume=ollama:/root/.ollama --network=bridge -p 11434:11434 --restart=no --label='org.opencontainers.image.ref.name=ubuntu' --label='org.opencontainers.image.version=20.04' --runtime=runc -d ollama/ollama:latest

http://localhost:11434/ just reveals 'Ollama is running' in 'hello world' style.

References

↑ https://hub.docker.com/r/ollama/ollama

[1] ttps://hub.docker.com/r/ollama/ollama

[1]