Upgrade to Fedora 38 including Nvidia, etc

Upgrading to Fedora 38 from Fedora 36

(while being careful about cuda versions)

Standard Fedora 36 → 38 upgrade steps

dnf install dnf-plugin-system-upgrade --best

# Large download (to latest Fedora 36) :
dnf upgrade --refresh
# Takes several minutes, depending on whether you update regularly
shutdown -r now

dnf system-upgrade download --releasever=38
# Takes 30mins?
dnf system-upgrade reboot
# Takes 30mins?


# Collect useful stats:
uname -r;
rpm -q --queryformat '%{name}\t: %{version}\n' xorg-x11-drv-nvidia;
rpm -q --queryformat '%{name}\t\t\t: %{version}\n' cuda;
rpm -q --queryformat 'cudnn\t\t\t: %{version}\n' libcudnn8;
rpm -q --queryformat '%{name}\t: %{version}\n' google-chrome-stable;

nvidia-smi
# If this gives nice output : Then we are done
# NB: It might say '12.2' as the cuda version at the top : That is the newest cuda the driver supports
#     The installed rpm is (likely 11.8) : Which is the toolkit version that actually gets used
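
The '12.2' versus '11.8' mismatch noted above is expected, since the two numbers come from different places (driver versus installed toolkit rpm). The following is a minimal Python sketch of that comparison, not part of the original notes: the helper names `driver_cuda_version` and `toolkit_is_supported` are hypothetical, and the sample header line only imitates the style of the `nvidia-smi` banner with illustrative numbers.

```python
import re

# Sample banner line in the style nvidia-smi prints; the numbers are illustrative only.
SAMPLE_HEADER = "| NVIDIA-SMI 535.54.03    Driver Version: 535.54.03    CUDA Version: 12.2     |"


def driver_cuda_version(smi_text):
    """Return the 'CUDA Version' found in nvidia-smi output as (major, minor), or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_is_supported(toolkit_version, smi_text):
    """True if the toolkit version (e.g. '11.8.0' from rpm) is not newer than the driver's figure."""
    driver = driver_cuda_version(smi_text)
    if driver is None:
        return False
    major, minor = (int(part) for part in toolkit_version.split(".")[:2])
    return (major, minor) <= driver


print(driver_cuda_version(SAMPLE_HEADER))
print(toolkit_is_supported("11.8.0", SAMPLE_HEADER))
```

In practice the text would come from running `nvidia-smi` and the toolkit version from the `rpm -q ... cuda` line; both are passed in as plain strings here so the sketch runs on any machine.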

If there's no cuda library installed...

dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora35/x86_64/cuda-fedora35.repo
dnf install cuda # We get 11.8

# cuda should now show up in the stats lines:
rpm -q --queryformat '%{name}\t: %{version}\n' xorg-x11-drv-nvidia;
rpm -q --queryformat '%{name}\t\t\t: %{version}\n' cuda;
rpm -q --queryformat 'cudnn\t\t\t: %{version}\n' libcudnn8;

nvidia-smi
# If this gives nice output : Then we are done
# NB: It might say '12.2' as the cuda version at the top (the newest cuda the driver supports, not the installed rpm)
# If not :
lsmod | grep nv
# Probably empty (except for i2c_nvidia_gpu)

journalctl -b | grep nvidia
# Assuming the journal kernel line mentions stuff like : modprobe.blacklist=nouveau  ...
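
The `lsmod | grep nv` check above can be made a little stricter, since `i2c_nvidia_gpu` matches the grep without the driver itself being loaded. A minimal Python sketch, assuming the usual three-column `lsmod` layout; the function name `loaded_nvidia_modules` and the sample listings (including the sizes) are made up for illustration:

```python
def loaded_nvidia_modules(lsmod_text):
    """Return module names from lsmod output that belong to the nvidia driver.

    Only names starting with 'nvidia' count, so i2c_nvidia_gpu and nouveau are ignored.
    """
    names = []
    for line in lsmod_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("nvidia"):
            names.append(fields[0])
    return names


# Illustrative listings only (sizes and use counts are invented).
BROKEN = """Module                  Size  Used by
i2c_nvidia_gpu         16384  0
nouveau              2936832  1
"""

WORKING = """Module                  Size  Used by
nvidia_drm             94208  4
nvidia_modeset       1560576  6 nvidia_drm
nvidia              56807424  346 nvidia_modeset
i2c_nvidia_gpu         16384  0
"""

print(loaded_nvidia_modules(BROKEN))
print(loaded_nvidia_modules(WORKING))
```

An empty list corresponds to the "probably empty" case described above, where the fixes that follow are worth trying.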

Apparently, akmods may produce something that doesn't quite work until the module dependency list is regenerated:

depmod -a
shutdown -r now
nvidia-smi
# If this gives nice output : Then we are done

Rebuild the kernel module...

Apparently, a normal update doesn't always give akmods enough time to complete, so let's do it manually here.

NV_KMOD=$(rpm -qa | grep kmod | grep "$(uname -r)")
echo $NV_KMOD

dnf remove $NV_KMOD
akmods --force --kernels $(uname -r)
# Takes a couple of minutes
shutdown -r now
nvidia-smi
# If this gives nice output : Then we are done

If the journal still mentions starting 'nvidia-fallback.service' ...

Perhaps the service is falling back to nouveau before the nvidia module loads properly:

systemctl disable nvidia-fallback.service
systemctl mask nvidia-fallback.service
shutdown -r now

nvidia-smi
# If this gives nice output : Then we are done

All done!

Squid configuration (notes for faster upgrade on LAN)

ps fax | grep squid
   4309 pts/0    S+     0:00              \_ grep --color=auto squid
   1648 ?        Ss     0:00 /usr/sbin/squid --foreground -f /etc/squid/squid.conf
   1655 ?        S      0:00  \_ (squid-1) --kid squid-1 --foreground -f /etc/squid/squid.conf
   3934 ?        S      0:00      \_ (logfile-daemon) /var/log/squid/access.log
   3941 ?        S      0:00      \_ /usr/bin/python /etc/squid/store_id_program.py

Contents of /etc/squid/store_id_program.py :

#!/usr/bin/python

import re
from six.moves import input
import sys

def main():
    # Match the rpm filename component of a URL (raw string, so '\.' is not an invalid escape)
    rpm_re = re.compile(r'[^/]+\.rpm')
    distros = [
        'centos',
        'redhat',
        'fedora',
        'opensuse',
        'suse',
    ]

    while True:
        try:
            line = input('')
        except EOFError:
            # Squid closed our stdin (shutdown or reconfigure) : exit quietly
            break
        parts = line.split(' ')
        url = parts[0]
        distro = next(
            (
                d for d in distros
                if d in url.lower()
            ),
            'unknown'
        )
        search_res = rpm_re.search(url)
        print(
            "OK store-id=%s" % (
                'distro:%s:%s' % (distro, search_res.group())
                if search_res else url
            )
        )
        sys.stdout.flush()

if __name__ == '__main__':
    main()
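
To see why this helper speeds up upgrades on a LAN: every URL ending in the same rpm filename is given the same store-id, so a package fetched from one mirror can be served from the cache when another machine asks a different mirror for it. The sketch below restates the mapping from the script above as a pure function so it can be tried without Squid. The name `store_id_for` is hypothetical, and the mirror hostnames and package filename are made up for illustration.

```python
import re

RPM_RE = re.compile(r'[^/]+\.rpm')
DISTROS = ['centos', 'redhat', 'fedora', 'opensuse', 'suse']


def store_id_for(url):
    """Return the store-id the helper above would report for one URL."""
    distro = next((d for d in DISTROS if d in url.lower()), 'unknown')
    found = RPM_RE.search(url)
    # rpm URLs collapse to distro + filename; everything else keeps its own URL
    return 'distro:%s:%s' % (distro, found.group()) if found else url


# Two invented mirrors serving the same (illustrative) package file.
URL_A = "http://mirror-a.example.org/fedora/linux/updates/38/Everything/x86_64/Packages/b/bash-5.2.15-3.fc38.x86_64.rpm"
URL_B = "http://mirror-b.example.net/pub/fedora/updates/38/x86_64/b/bash-5.2.15-3.fc38.x86_64.rpm"
URL_META = "http://mirror-a.example.org/fedora/linux/updates/38/Everything/x86_64/repodata/repomd.xml"

print(store_id_for(URL_A))
print(store_id_for(URL_B))
print(store_id_for(URL_META))
```

Both rpm URLs produce the same store-id, while the repository metadata URL is passed through unchanged and so stays cached per mirror.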