Proxmox on Intel NUC

This is written from my experience with two 8th-generation Intel NUCs, but the cause and solution most likely apply to any NUC with a similar configuration (integrated Intel graphics).

Models:

  • BOXNUC8i5BEH2

  • BOXNUC8i7BEH2

Both are equipped with NVMe disks (Intel and Kingston), regular 2.5" SSDs (Samsung and Intel), and 32 GB of RAM.

The issue

Using these two NUCs with Proxmox VE version 8 (it didn't matter which release), I had problems with them suddenly crashing, with symptoms like:

  • Stopped responding to ping

  • Powering off (after a while; maybe some sort of timeout, so this may not have been a separate issue)

  • Plugging in a monitor gave "no signal"

This could happen after anywhere from 15 minutes to 6 hours, and the logs were pretty much useless: neither dmesg nor journalctl revealed any error before the crash.

Troubleshooting

Standard troubleshooting

  • Checking logs

  • Reinstalling Proxmox

  • Going back to older versions of Proxmox (7.4)

  • Memtest to rule out memory issues

  • Disabling Bluetooth, the Wi-Fi NIC, and other peripherals that I don't need

  • Having a monitor plugged in seemed to keep it more stable (I found threads where using an HDMI dummy plug might solve the problem)

One troubleshooting step was to install Windows on it again and see whether the same issue persisted, which it did not appear to. The hardware should be fine, and given that everything had been running ESXi without problems for years, a sudden hardware issue felt unlikely (though still a possibility).

After googling around and looking at pretty much every thread about Intel NUCs and Proxmox or Debian 12 (since Proxmox VE 8 is based on Debian 12), I eventually found https://forum.proxmox.com/threads/proxmox-random-reboots-on-hp-elitedesk-800g4-fixed-with-proxmox-install-on-top-of-debian-12-now-issues-with-hardware-transcoding-in-plex.132187/

On page two mcdy wrote

Pointing to a file relating to the Intel HD Graphics card power management.

The Arch Linux wiki states:

"i915.enable_dc=0" disables GPU power management. 
This does solve random hangs on certain Intel systems, notably Goldmount and Kaby Lake Refresh chips. 
Using this parameter does result in higher power use and shorter battery life on laptops/notebooks.

Found here: https://wiki.archlinux.org/title/intel_graphics

Workaround/Solution!?

Rename the .bin firmware file so that it can no longer be found under its original name:

mv /lib/firmware/i915/kbl_dmc_ver1_04.bin /lib/firmware/i915/kbl_dmc_ver1_04.bak
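The same rename can be wrapped in a small guard so that running it twice, or on a machine where the file has a different name, does not produce a confusing error. This is only a sketch: the function name disable_firmware is made up for illustration, and it assumes the file ends in .bin as in the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): rename <name>.bin to <name>.bak.
disable_firmware() {
    fw=$1
    bak="${fw%.bin}.bak"
    if [ -e "$fw" ]; then
        # File is present under its original name: rename it.
        mv -- "$fw" "$bak" && echo "renamed $fw -> $bak"
    elif [ -e "$bak" ]; then
        # A previous run already renamed it.
        echo "already renamed: $bak"
    else
        echo "not found: $fw" >&2
        return 1
    fi
}

# Example call (as root on the Proxmox host):
# disable_firmware /lib/firmware/i915/kbl_dmc_ver1_04.bin
```
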

From what I have read, the file comes back with the next Proxmox update, so another way is to set the kernel parameter in the GRUB configuration (which I assume persists across updates):

  1. Edit /etc/default/grub

  2. Find this line (probably the default): GRUB_CMDLINE_LINUX_DEFAULT="quiet"

  3. Change it to: GRUB_CMDLINE_LINUX_DEFAULT="quiet i915.enable_dc=0"

  4. Update GRUB by running update-grub

  5. Reboot, then run cat /proc/cmdline and check that the setting is applied
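The steps above can also be scripted. The following is a minimal sketch, assuming GNU sed (as shipped with Debian) and a GRUB_CMDLINE_LINUX_DEFAULT line in double quotes. The function names add_kernel_param and has_kernel_param are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given file, unless it is already there.
add_kernel_param() {
    file=$1
    param=$2
    # Skip the edit if the parameter is already on the line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Keep a backup, then insert the parameter before the closing quote.
    cp "$file" "$file.bak"
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$file"
}

# Hypothetical helper: succeed if the parameter appears as a whole word on the
# kernel command line (defaults to the running kernel's /proc/cmdline).
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

# Example (as root):
# add_kernel_param /etc/default/grub i915.enable_dc=0 && update-grub
# After the reboot:
# has_kernel_param i915.enable_dc=0 && echo "parameter is active"
```

The edit is idempotent, so running it again after a later update does not add the parameter a second time.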
