Upgrading to Debian Trixie

I had been running Bookworm for quite a while. It has now been stable for more than a year. In other words, Trixie has now been in testing for more than a year, and will most likely become the new stable in less than a year.

In the past, I encountered a few surprises with unstable releases and recent testing releases. So, I avoid upgrading too soon. But it felt pretty safe at this point. Besides, I wanted to run a more recent version of the kernel to have fun with modern perf events.

So I basically followed the usual process:

# sed s/bookworm/trixie/g -i /etc/apt/sources.list
# apt update
# apt upgrade
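For reference, a slightly more defensive version of the same steps might look like the sketch below. The helper name switch_release is made up for illustration. The two-stage upgrade in the comments follows the sequence suggested in the Debian release notes, not what I actually ran.

```shell
# Sketch: rewrite the release name in a sources file, keeping a backup.
# switch_release is a hypothetical helper, not part of apt.
switch_release() {
    file=$1
    old=$2
    new=$3
    # GNU sed: -i.bak edits in place and saves the original as FILE.bak.
    sed -i.bak "s/$old/$new/g" "$file"
}

# Possible usage as root (not run here):
#   switch_release /etc/apt/sources.list bookworm trixie
#   apt update
#   apt-get --download-only dist-upgrade   # fetch everything first
#   apt upgrade --without-new-pkgs         # minimal upgrade
#   apt full-upgrade                       # then the rest
```

Keeping the .bak file around makes it easy to go back to the old sources if apt update reports something unexpected.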

Everything went smoothly… until it didn’t.

Broken apt

The first issue is that apt had broken itself by removing many packages before installing their replacements. In particular, it had removed libgnutls, which made apt unable to fetch new packages, including libgnutls.

Edit: apt-get downloads all packages before installing them, so this is not supposed to happen. But, judging from the logs, the installation of libgio-2.0-dev failed, which stopped the installation process. When I tried resuming, the libgnutls packages were gone, so apt could not check the package index. I might have been able to complete the installation from the apt-get cache.

I resorted to using another computer to manually download the .deb file from the Debian packages website… as well as the dependencies… and the transitive dependencies. I then transferred them to the broken system with a USB flash drive and installed them with dpkg -i *.deb. This enabled me to resume the upgrade.
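The dpkg step works the same way whether the .deb files come from a USB drive or from the apt cache mentioned in the edit above. Here is a minimal sketch, assuming the cache lives in the default /var/cache/apt/archives. The names install_debs and DPKG are hypothetical and only there to make the snippet easy to try without touching the system.

```shell
# Sketch: install every .deb found in a directory (default: the apt cache).
# install_debs and the DPKG override are illustrative, not part of dpkg.
install_debs() {
    dir=${1:-/var/cache/apt/archives}
    # Expand the glob into the positional parameters of this function.
    set -- "$dir"/*.deb
    # If nothing matched, "$1" does not name an existing file.
    if [ ! -e "$1" ]; then
        echo "no .deb files in $dir" >&2
        return 1
    fi
    ${DPKG:-dpkg} --install "$@"
}

# Possible usage as root (not run here; /media/usb is a made-up mount point):
#   install_debs /media/usb
#   install_debs                 # try the apt cache instead
#   dpkg --configure -a          # finish configuring unpacked packages
#   apt-get install -f           # let apt repair remaining dependencies
```

I did not try the cache route, so I cannot say whether it would have been enough on its own.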

Broken X

Once I rebooted (actually before fixing the previous issue), lightdm became unable to start. It turns out that apt had also removed the Nvidia driver.

Thinking I would be clever, I installed nouveau from the official repositories. I hoped that that would get me a working graphical session and I would be able to download the Nvidia driver from their website. Maybe I could have used Lynx, but I use it rarely enough that I wanted to avoid that if I could.

Of course, it made things worse. Not only did I not have a graphical session, but my TTY jumped back to TTY 7 (where the graphical session usually starts) every time I tried to switch to another one. Effectively, it meant that I could not do anything from the system itself.

I flashed Debian on my USB flash drive (dd if=debian-12.7.0-amd64-netinst.iso of=/dev/sdb status=progress) and used the rescue mode to remove nouveau. I tried to also install the Nvidia driver from there, but it did not like being run on a different kernel (6.1 on the rescue drive) than the target (6.10 on my NVMe/SSD).

I rebooted, installed the Nvidia driver, and rebooted again. Finally, I had a graphical session again!
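A side note on the flashing step: dd will happily overwrite whatever device it is given. The sketch below checks the removable flag that the kernel exposes in sysfs before writing. The names flash_iso, SYSFS and DD are hypothetical, and the check is only a heuristic, since some USB devices are not reported as removable.

```shell
# Sketch: refuse to flash unless the kernel marks the target as removable.
# flash_iso, SYSFS and DD are illustrative names, not existing tools.
flash_iso() {
    iso=$1
    dev=$2
    name=${dev##*/}                            # /dev/sdb -> sdb
    flag=${SYSFS:-/sys}/block/$name/removable  # contains 1 for removable media
    if [ "$(cat "$flag" 2>/dev/null)" != 1 ]; then
        echo "refusing: $dev is not marked removable" >&2
        return 1
    fi
    # conv=fsync flushes the data to the device before dd exits.
    ${DD:-dd} if="$iso" of="$dev" bs=4M conv=fsync status=progress
}

# Possible usage as root (not run here):
#   flash_iso debian-12.7.0-amd64-netinst.iso /dev/sdb
```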

Broken Dropbear

I use Dropbear, a small SSH server, to unlock my LUKS partition remotely. Combined with wake-on-LAN, this enables me to start my computer fully remotely while keeping the disk encrypted [1].

And, now, it was very broken.

First, it was complaining about a missing cat command. I think reinstalling the dropbear-initramfs package fixed this. But I still could not connect to it at all (not even at the TCP level).

After investigating the logs (sudo less -R /var/log/boot.log in particular), I noticed many errors such as enp8s0: SIOCGIFINDEX: No such device and no devices to configure. Although searching for the first did not bring me any useful results, the second led me to someone with the same issue. For some reason, the network driver was not included in my initramfs. In short:

# lspci -v | grep -A8 Ethernet
08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
	Subsystem: Gigabyte Technology Co., Ltd Device e000
	Flags: bus master, fast devsel, latency 0, IRQ 38, IOMMU group 18
	I/O ports at c000 [size=256]
	Memory at f6800000 (64-bit, non-prefetchable) [size=64K]
	Memory at f6810000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: r8169
	Kernel modules: r8169
# echo r8169 >>/etc/initramfs-tools/modules
# update-initramfs -u
# reboot
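Running the echo line twice would add the module twice, which is harmless but untidy. Below is a small sketch of an idempotent variant. The helper name add_initramfs_module is hypothetical, and the second argument defaults to the file used in the commands above.

```shell
# Sketch: add a kernel module to the initramfs module list only once.
# add_initramfs_module is an illustrative helper, not part of initramfs-tools.
add_initramfs_module() {
    module=$1
    list=${2:-/etc/initramfs-tools/modules}
    # -x matches whole lines, -F disables regex, -q suppresses output.
    grep -qxF -- "$module" "$list" 2>/dev/null ||
        printf '%s\n' "$module" >>"$list"
}

# Possible usage as root (not run here):
#   add_initramfs_module r8169
#   update-initramfs -u
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | grep r8169   # check it is in
```

The lsinitramfs check at the end should show whether the driver made it into the image, so I would run it before rebooting.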

Conclusion

I have no idea how this happened, but it did.

On the one hand, this should not have happened. And this reinforces the idea that Linux is still not ready for most people. On the other hand, it was nice being able to fix these problems myself, instead of having to reinstall from scratch or rely on the goodwill of some company.

  1. It increases the attack surface, in that someone who gains physical access to my computer while I am away could access it when I start it remotely. For this to work, my adversary would effectively have to be a state actor. In that case, I’m going to die and there’s nothing that I can do about it.
