All is well until I fire up Firefox and start to experience huge lag spikes where the whole UI seizes up for 20-30 seconds at a time. I can CTRL-ALT-Fn to another console and text mode is fine but X/KDE is locked solid.
Looking at top shows that the VMs, Firefox, kwin and khugepaged have pegged their respective CPU cores (or 4 cores in the case of the VMs) with little or no disk, swap or RAM activity. Killing Firefox drops everything back to normal, so I start to look online for reports of weird interactions between Firefox (or Flash/Java within Firefox) and openSUSE or VMware. Nothing.
Typing khugepaged into Google, however, was a bit of a revelation. Lots of reports of CPU stalls, 100% utilisation etc. with high core/RAM counts. I wouldn't have called 8-cores/16GB high in this day and age - at work I use 48-core/256GB VM hosts and they're getting to the end of their support lifetime already. However, my previous 4-core/12GB box did not have this problem.
To cut a long story short, it appears to be a problem with khugepaged attempting to defrag RAM to make space for the huge pages. For now I have just disabled defragging with:
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
echo never > /sys/kernel/mm/transparent_hugepage/defrag
...why do they take completely different parameters when they have the same name? Logic, please.
...it seems that later kernel versions have this fixed, so I may have been bitten by my "lag an OS version" principle above. Ho hum.
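
For anyone wanting to check what their own box is currently set to before changing anything, here is a rough sketch. The khugepaged/defrag file holds a plain 0 or 1, while the top-level defrag file lists every choice and marks the active one with square brackets (for example "always madvise [never]"). The helper name thp_active is my own invention for this sketch, not part of any standard tool, and I am assuming the usual sysfs location under /sys/kernel/mm/transparent_hugepage.

```shell
#!/bin/sh
# Sketch: report the active transparent hugepage settings.
# thp_active is a hypothetical helper, not part of any standard tool.

# Assumed sysfs location of the transparent hugepage knobs.
THP_DIR=/sys/kernel/mm/transparent_hugepage

# Print the active choice from a multi-choice line such as
# "always madvise [never]". Plain values such as "0" are printed unchanged.
thp_active() {
    case "$1" in
        *\[*\]*) printf '%s\n' "$1" | sed 's/.*\[\([^]]*\)].*/\1/' ;;
        *)       printf '%s\n' "$1" ;;
    esac
}

# Show each setting that exists and is readable on this kernel.
for f in "$THP_DIR/enabled" "$THP_DIR/defrag" "$THP_DIR/khugepaged/defrag"; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(thp_active "$(cat "$f")")"
    fi
done
```

Reading these files does not need root, but writing to them does, and values written with echo do not survive a reboot.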