Nvidia on Ubuntu: Difference between revisions
Revision as of 08:47, 29 June 2025
Because I wanted to run a local Artificial Intelligence platform called Ollama, I wanted to ensure that my GPU was fully utilized in the system since GPUs are the particular type of hardware best suited for these Vector calculations[1]. And, I have a 'decent' GPU - Nvidia GeForce RTX 4060 (the best you could get in 2024). In trying to install the latest Nvidia driver, I set off on a week-long journey of learning, frustration and perseverance discovering the inner workings of Ubuntu 24.04, Xorg, the Linux kernel and kernel modules, DRM, Secure Boot, initramfs and more.
I still do not have the Nvidia driver loaded - even after 40+ reboots and attempts. Instead I'm using the Nouveau driver but at least I have a working system and I believe now that I've finally figured out what needs to be done to disable Nouveau and install Nvidia - a project that I am approaching with greater scrutiny now. I'm documenting the things that I encounter in this journey.
Why not just continue to use Nouveau, a project of the freedesktop community? I mean "if it ain't broke, don't fix it" - right? In principle, I'd very much like to use nouveau. I'm not even sure that any alternative is "better" in any way - especially since I am not a gamer[2]. My use case is to get the best performance from local LLMs. As I become familiar with the methods to switch video drivers reliably, I intend to run benchmarks and explore the benefits of one configuration vs another.
Opposite
To go the opposite route, purging all proprietary drivers and installing the open source Nouveau driver, you pretty much just run
sudo apt install xserver-xorg-video-nouveau
You might find seemingly good content/tutorials by linuxconfig, but I'm intentionally not linking to their site or YouTube videos because their content is plausible but sketchy. In other words, it is Artificial Intelligence slop.[3]
About this System
In your desktop environment, you can access 'System Settings' -> 'About this System' (KInfoCenter) to display basic info about your Software and Hardware environment including the 'graphics processor'. Mine says NV197 - which is the codename given to the card by the Nouveau project[4]. You can click on 'Show More Information' which reveals a multi-tab dialog for OpenCL, OpenGL, Vulkan, Window Manager and X-Server with extensive Graphics info.
Or, you can also get details from a variety of CLI commands like glxinfo, lspci etc.
glxinfo | grep -E "OpenGL version|OpenGL renderer"
OpenGL renderer string: NV197
OpenGL version string: 4.3 (Compatibility Profile) Mesa 24.2.8-1ubuntu1~24.04.1
lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation AD107 [GeForce RTX 4060] (rev a1)
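The commands above identify the card, but not which kernel driver is actually bound to it. As a rough sketch (the function name gpu_kernel_driver is my own invention, and it assumes the usual tab-indented layout of `lspci -k` output), the following filters that output down to the "Kernel driver in use" line for NVIDIA VGA controllers:

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -k` style text on stdin and print the
# kernel driver bound to each NVIDIA VGA controller.
gpu_kernel_driver() {
    awk '
        # Device header lines start in column 0; detail lines are indented.
        /^[^ \t]/ { in_gpu = ($0 ~ /VGA compatible controller/ && $0 ~ /NVIDIA/) }
        in_gpu && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            print
        }
    '
}

# Only query the hardware when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -k | gpu_kernel_driver
fi
```

On a system in the state described here, I would expect this to print nouveau; after a successful switch it should report the NVIDIA module instead.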
GUI is stuck

In theory, the GUI (see image) allows you to effortlessly choose and switch between multiple drivers. There are even YouTube videos that show how it's supposed to work[5]. In practice, at least on my system, installing a second driver is not possible from a Graphical desktop environment.
If you go into the Graphical User Interface (GUI) for "System Settings" and then "Hardware -> Driver Manager" it would appear to allow you to simply choose to use a proprietary driver from Nvidia. But all the choices are disabled (grayed out). Every attempt I made to "install an Nvidia driver" - whether through a graphical package manager like Synaptic, or the CLI using apt or dpkg, would just result in a "non-working" graphics system where I didn't even have dual monitor support.
X.org
Here is the Xorg.log file from my first boot into a clean system having no Nvidia drivers.
Nouveau
The XServer is loading the nouveau driver package xserver-xorg-video-nouveau.
From the package description:
This driver for the X.Org X server (see xserver-xorg for a further description) provides support for NVIDIA Riva, TNT, GeForce, and Quadro cards.
This package provides 2D support including EXA acceleration, Xv and RandR. 3D functionality is provided by the libgl1-mesa-dri package.
This package is built from the FreeDesktop.org xf86-video-nouveau driver.
Inspection
dpkg can show us what packages are installed with 'nouveau' in the name.
dpkg -l | grep -i nouveau
ii libdrm-nouveau2:amd64 2.4.122-1~ubuntu0.24.04.1 amd64 Userspace interface to nouveau-specific kernel DRM services -- runtime
ii xserver-xorg-video-nouveau 1:1.0.17-2build1 amd64 X.Org X server -- Nouveau display driver
And, lsmod can show us what kernel modules are loaded with 'nouveau' in the name.
lsmod | grep nouveau
nouveau 3096576 68
drm_gpuvm 45056 2 xe,nouveau
drm_exec 12288 3 drm_gpuvm,xe,nouveau
gpu_sched 61440 2 xe,nouveau
drm_ttm_helper 12288 2 xe,nouveau
ttm 110592 4 drm_ttm_helper,xe,i915,nouveau
drm_display_helper 237568 3 xe,i915,nouveau
mxm_wmi 12288 1 nouveau
i2c_algo_bit 16384 3 xe,i915,nouveau
video 77824 3 xe,i915,nouveau
wmi 28672 4 video,wmi_bmof,mxm_wmi,nouveau
modinfo tells us details about the kernel module, including the dependencies.
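For example, `modinfo -F depends MODULE` prints just the dependency field as a comma-separated list. A small sketch (split_depends is a hypothetical helper name, not part of any package) that turns that list into one module per line:

```shell
#!/bin/sh
# Hypothetical helper: split the comma-separated value printed by
# `modinfo -F depends MODULE` into one module name per line.
split_depends() {
    tr ',' '\n' | sed '/^$/d'
}

# Only run against the real module database when modinfo is available.
if command -v modinfo >/dev/null 2>&1; then
    modinfo -F depends nouveau 2>/dev/null | split_depends
fi
```

Other fields can be selected the same way, for instance `modinfo -F filename nouveau` to see where the module file lives.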
What files does the nouveau driver install?
dpkg -L xserver-xorg-video-nouveau
/.
/usr
/usr/lib
/usr/lib/xorg
/usr/lib/xorg/modules
/usr/lib/xorg/modules/drivers
/usr/lib/xorg/modules/drivers/nouveau_drv.so
/usr/share
/usr/share/bug
/usr/share/bug/xserver-xorg-video-nouveau
/usr/share/doc
/usr/share/doc/xserver-xorg-video-nouveau
/usr/share/doc/xserver-xorg-video-nouveau/README.Debian
/usr/share/doc/xserver-xorg-video-nouveau/changelog.Debian.gz
/usr/share/doc/xserver-xorg-video-nouveau/copyright
/usr/share/man
/usr/share/man/man4
/usr/share/man/man4/nouveau.4.gz
/usr/share/bug/xserver-xorg-video-nouveau/script
NVidia
Documentation for installing NVidia drivers is at https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/
The installation guide for the v570 of the driver (46 chapters) is at https://download.nvidia.com/XFree86/Linux-x86_64/570.153.02/README/
I've read the whole thing.
Replacing Nouveau
See the NVIDIA driver 570.153.02 README, common problems #nouveau, where it basically says to:
- denylist it
- modify your initramfs
- modify Xorg to not load nouveau
We explore these in more detail below.
Over at StackExchange, a user asked how to switch graphics driver from nouveau to nvidia and succeeded in part by modifying the boot parameters in grub to deny nouveau. Note that the boot parameters were used only during the process to stop using one driver and install the other driver. It is not a configuration that would allow you to have two different boot menu entries in GRUB in order to use two graphics modes.
Over in the Manjaro Linux forums, a user asked a similar question: How do I switch between Nvidia and Nouveau drivers on boot? They tried using the kernel boot parameters
modprobe.blacklist=nvidia systemd.setenv=GPUMOD=nouveau rd.driver.blacklist=nvidia nouveau.modeset=1 nvidia.modeset=0
but ultimately had to install the OS twice, on different disk partitions, in order to boot one system or the other depending on which graphics driver they needed.
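To see whether parameters like these actually reached the running kernel, you can inspect /proc/cmdline. The following is a minimal sketch (cmdline_blacklist is a name I made up) that lists the modules denied through modprobe.blacklist= or rd.driver.blacklist=. As far as I know the latter is a dracut option, so it may have no effect on Ubuntu's initramfs-tools; the sketch only reports it.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel command line on stdin and print the
# modules named in modprobe.blacklist= or rd.driver.blacklist= parameters.
cmdline_blacklist() {
    tr ' ' '\n' |
        sed -n -e 's/^modprobe\.blacklist=//p' \
               -e 's/^rd\.driver\.blacklist=//p' |
        tr ',' '\n' |
        sort -u
}

# Inspect the parameters the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    cmdline_blacklist < /proc/cmdline
fi
```

Empty output means no module was denied on the kernel command line for the current boot.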
Denylist
I tried denylisting the nouveau driver and preventing it from doing modesetting by creating disable-nouveau.conf. However, I was unsuccessful in installing Nvidia drivers even with that in place and while performing operations from a recovery console.
I've looked at the initramfs and don't see where it is loading nouveau, although I do see where the temporary disable-nouveau.conf file I created is read in.
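For reference, here is a sketch of what such a file can contain, using the two directives suggested in the NVIDIA README. The helper name write_nouveau_denylist is hypothetical, and the script deliberately writes to a scratch file rather than to /etc so it can be run without root as a dry run.

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe configuration that denylists nouveau.
# $1 is the file to create.
write_nouveau_denylist() {
    cat > "$1" <<'EOF'
# Keep the nouveau kernel module from being loaded automatically
blacklist nouveau
# Turn off kernel modesetting in nouveau if it is loaded anyway
options nouveau modeset=0
EOF
}

# Dry run: generate the file in a scratch location and show it.
scratch=$(mktemp)
write_nouveau_denylist "$scratch"
cat "$scratch"
rm -f "$scratch"

# To apply it for real (assumption: standard Ubuntu paths), something like:
#   sudo cp FILE /etc/modprobe.d/disable-nouveau.conf
#   sudo update-initramfs -u
```

Regenerating the initramfs matters because modprobe configuration is copied into the image, which is consistent with seeing disable-nouveau.conf being read in there.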
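Modify Initial Ram Disk
The initial RAM disk is a CPIO archive. Although it is often described as gzipped, file reports a plain cpio header here, most likely because the image starts with an uncompressed early segment (such as CPU microcode) followed by the compressed main archive.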
file /boot/initrd.img-6.8.0-62-generic
/boot/initrd.img-6.8.0-62-generic: ASCII cpio archive (SVR4 with no CRC)
But how do you examine it? With lsinitramfs.
Quite a lot of files are listed, so pipe the output through a pager like less.
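Instead of paging through the whole listing, you can filter it for a specific kernel module. This is a sketch under the assumption that module files end in .ko, optionally followed by a compression suffix such as .zst; initrd_modules is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read an lsinitramfs listing on stdin and print the
# entries that are the kernel module file for the module named in $1.
initrd_modules() {
    grep -E "/$1"'\.ko(\.(zst|xz|gz))?$'
}

# Check the image that belongs to the running kernel.
img="/boot/initrd.img-$(uname -r)"
if command -v lsinitramfs >/dev/null 2>&1 && [ -r "$img" ]; then
    lsinitramfs "$img" | initrd_modules nouveau ||
        echo "nouveau module not found in $img"
fi
```

A plain `lsinitramfs "$img" | grep modprobe.d` is a quick way to confirm that a denylist file made it into the image.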
X.org.conf
I've looked at Xorg but I'm not sure how / if it is responsible for requiring nouveau - but I can clearly see that the package is installed.
dpkg -l | grep -E "xorg|xserver"
ii python3-xkit 0.5.0ubuntu6 all library for the manipulation of xorg.conf files (Python 3)
ii x11-xserver-utils 7.7+10build2 amd64 X server utilities
ii xorg 1:7.7+23ubuntu3 amd64 X.Org X Window System
ii xorg-docs-core 1:1.7.1-1.2 all Core documentation for the X.org X Window System
ii xorg-sgml-doctools 1:1.11-1.1 all Common tools for building X.Org SGML documentation
ii xserver-common 2:21.1.12-1ubuntu1.4 all common files used by various X servers
ii xserver-xephyr 2:21.1.12-1ubuntu1.4 amd64 nested X server
ii xserver-xorg 1:7.7+23ubuntu3 amd64 X.Org X server
ii xserver-xorg-core 2:21.1.12-1ubuntu1.4 amd64 Xorg X server - core server
ii xserver-xorg-input-all 1:7.7+23ubuntu3 amd64 X.Org X server -- input driver metapackage
ii xserver-xorg-input-libinput 1.4.0-1ubuntu24.04.1 amd64 X.Org X server -- libinput input driver
ii xserver-xorg-input-wacom 1:1.2.0-1ubuntu2 amd64 X.Org X server -- Wacom input driver
ii xserver-xorg-legacy 2:21.1.12-1ubuntu1.4 amd64 setuid root Xorg server wrapper
ii xserver-xorg-video-all 1:7.7+23ubuntu3 amd64 X.Org X server -- output driver metapackage
ii xserver-xorg-video-amdgpu 23.0.0-1build1 amd64 X.Org X server -- AMDGPU display driver
ii xserver-xorg-video-ati 1:22.0.0-1build1 amd64 X.Org X server -- AMD/ATI display driver wrapper
ii xserver-xorg-video-fbdev 1:0.5.0-2build2 amd64 X.Org X server -- fbdev display driver
ii xserver-xorg-video-intel 2:2.99.917+git20210115-1build1 amd64 X.Org X server -- Intel i8xx, i9xx display driver
ii xserver-xorg-video-nouveau 1:1.0.17-2build1 amd64 X.Org X server -- Nouveau display driver
ii xserver-xorg-video-qxl 0.1.6-1build1 amd64 X.Org X server -- QXL display driver
ii xserver-xorg-video-radeon 1:22.0.0-1build1 amd64 X.Org X server -- AMD/ATI Radeon display driver
ii xserver-xorg-video-vesa 1:2.6.0-1 amd64 X.Org X server -- VESA display driver
ii xserver-xorg-video-vmware 1:13.4.0-1build1 amd64 X.Org X server -- VMware display driver
Module Signing
On systems with Secure Boot enabled (mine), you most likely need to sign the module. See Signing NVIDIA Kernel Module. However, I didn't get an explicit message that signing was a problem; and I did see that the installation process signs the module with a generated key. I assume that the MOK process hooks into the trust system somehow.
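To take some of the guesswork out of this, you can check the Secure Boot state and who signed a module. The sketch below assumes that `mokutil --sb-state` prints a line beginning with "SecureBoot enabled" when Secure Boot is on. The helper name secure_boot_enabled is hypothetical, and nvidia as the module name is an assumption about what the installer builds.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the `mokutil --sb-state` text on stdin
# reports that Secure Boot is enabled.
secure_boot_enabled() {
    grep -q '^SecureBoot enabled'
}

if command -v mokutil >/dev/null 2>&1; then
    if mokutil --sb-state 2>/dev/null | secure_boot_enabled; then
        echo "Secure Boot is enabled; kernel modules need a trusted signature"
        # Print the signer of the module, if it is installed and signed.
        modinfo -F signer nvidia 2>/dev/null || true
    else
        echo "Secure Boot is disabled or its state could not be read"
    fi
fi
```

`mokutil --list-enrolled` shows the Machine Owner Keys the firmware currently trusts, which is where a key generated by the installer would need to appear.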
Tools and Troubleshooting
Ubuntu wants you to use the 'ubuntu-drivers' tool[6].
NVIDIA seems to have just settled on a new mechanism[7] rather than downloading the (former?) .run installers:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt update
apt install nvidia-open
NVIDIA distributes a script called nvidia-bug-report.sh that you can and should run[8] to collect detailed information about any problems.
Interesting Notes
Usually, when you have 'sudo' or root privileges you can do more. One exception is the X-Server. Root access to the server may be restricted. In that case,
glxinfo
will give
Error: unable to open display
A regular user will have no problem running glxinfo.
Different Desktop Environments
As a regular user, my DE is KDE Plasma (using Kubuntu) rather than the GNOME default of Ubuntu.
See Also
References
- ↑ https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#the-benefits-of-using-gpus
- ↑ I'm not opposed in any way, I just don't have the time to add another hobby. This is a clarifying statement for my use-case, and therefore, requirements.
- ↑ linuxconfig.org/how-to-install-nvidia-drivers-on-ubuntu-24-04 describes how to get Nvidia drivers working on Ubuntu 24, but curiously refers to deprecated commands like 'ubuntu-drivers autoinstall'. They have a whole section where they talk about downloading 'drivers' from Nvidia, but the '.run' files are not drivers at all; they are installers. And they instruct you to use 'telinit' to switch runlevels when the concept of SysV runlevels is obsolete (and thus so is the command). Granted, the telinit command will be transparently translated into systemd unit activation requests, but an old, deprecated "howto" is a "how NOT to".
- ↑ https://nouveau.freedesktop.org/CodeNames.html
- ↑ I do not want to link to AI-generated content that creates a worse Internet, but this is what I'm referring to: www.youtube.com/watch?v=pmGfi1ldBqc The bot doesn't respond to questions, so is my experience different because I have Secure Boot enabled? Is my experience different because something changed in the OS between Ubuntu 20 and Ubuntu 24?
- ↑ https://documentation.ubuntu.com/server/how-to/graphics/install-nvidia-drivers/
- ↑ https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/
- ↑ https://forums.developer.nvidia.com/t/if-you-have-a-problem-please-read-this-first/27131