pve-kernel-qoup

Author	SHA1	Message	Date
Friedrich Weber	29cb6fcbb7	cherry-pick scheduler fix to avoid temporary VM freezes on NUMA hosts Users have been reporting [1] that VMs occasionally become unresponsive with high CPU usage for some time (varying between ~1 and more than 60 seconds). After that time, the guests come back and continue running. Windows VMs seem most affected (not responding to pings during the hang, RDP sessions time out), but we also got reports about Linux VMs (reporting soft lockups). The issue was not present on host kernel 5.15 and was first reported with kernel 6.2. Users reported that the issue becomes easier to trigger the more memory is assigned to the guests. Setting mitigations=off was reported to alleviate (but not eliminate) the issue. For most users the issue seems to disappear after (also) disabling KSM [2], but some users reported freezes even with KSM disabled [3]. It turned out the reports concerned NUMA hosts only, and that the freezes correlated with runs of the NUMA balancer [4]. Users reported that disabling the NUMA balancer resolves the issue (even with KSM enabled). We put together a Linux VM reproducer, ran a git-bisect on the kernel to find the commit introducing the issue and asked upstream for help [5]. As it turned out, an upstream bugreport was recently opened [6] and a preliminary fix to the KVM TDP MMU was proposed [7]. With that patch [7] on top of kernel 6.7, the reproducer does not trigger freezes anymore. As of now, the patch (or its v2 [8]) is not yet merged in the mainline kernel, and backporting it may be difficult due to dependencies on other KVM changes [9]. However, the bugreport [6] also prompted an upstream developer to propose a patch to the kernel scheduler logic that decides whether a contended spinlock/rwlock should be dropped [10]. Without the patch, PREEMPT_DYNAMIC kernels (such as ours) would always drop contended locks. With the patch, the kernel only drops contended locks if the kernel is currently set to preempt=full. 
As noted in the commit message [10], this can (counter-intuitively) improve KVM performance. Our kernel defaults to preempt=voluntary (according to /sys/kernel/debug/sched/preempt), so with the patch it does not drop contended locks anymore, and the reproducer does not trigger freezes anymore. Hence, backport [10] to our kernel. [1] https://forum.proxmox.com/threads/130727/ [2] https://forum.proxmox.com/threads/130727/page-4#post-575886 [3] https://forum.proxmox.com/threads/130727/page-8#post-617587 [4] https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#numa-balancing [5] https://lore.kernel.org/kvm/832697b9-3652-422d-a019-8c0574a188ac@proxmox.com/ [6] https://bugzilla.kernel.org/show_bug.cgi?id=218259 [7] https://lore.kernel.org/all/20230825020733.2849862-1-seanjc@google.com/ [8] https://lore.kernel.org/all/20240110012045.505046-1-seanjc@google.com/ [9] https://lore.kernel.org/kvm/Zaa654hwFKba_7pf@google.com/ [10] https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com/ Signed-off-by: Friedrich Weber <f.weber@proxmox.com>	2024-02-14 11:10:25 +01:00
Thomas Lamprecht	5dde66b4fe	update kernel and patches for Ubuntu-6.5.0-20.20 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2024-02-14 11:08:30 +01:00
Fabian Grünbichler	0ec9138fc0	fix #5158 : cherry-pick ext4 fix for high-CPU flush Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>	2024-01-30 13:26:35 +01:00
Fabian Grünbichler	53226238d9	fix #5077 : cherry-pick revert for aacraid resets reported both in our bug tracker and upstream to fix the affected hardware. Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>	2024-01-30 13:24:16 +01:00
Fiona Ebner	cc99d7fd2f	cherry-pick fix for RCU stall issue after VM live migration caused by a lapic timer interrupt getting lost. Already queued for 6.5.13: https://lore.kernel.org/stable/20231124172031.920738810@linuxfoundation.org/ Reported in the community forum: https://forum.proxmox.com/threads/136992/ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>	2023-11-27 18:58:23 +01:00
Fiona Ebner	dd086d18e3	backport UBSAN fixes for amdgpu to silence array-index-out-of-bounds warnings for dynamically-sized arrays. All commits applied cleanly and just replace array[1] with array[]. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>	2023-11-14 16:15:22 +01:00
Thomas Lamprecht	4a4ddffc89	cherry-pick 6.5.11 stable release Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-11-12 16:45:41 +01:00
Thomas Lamprecht	b0ac1e9734	Revert "UBUNTU: SAUCE: ceph: make sure all the files successfully put before unmounting" Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-11-07 09:37:12 +01:00
Thomas Lamprecht	8f06837c7c	revert "memfd: improve userspace warnings for missing exec-related flags" This is generating far too much noise in the logs, so keep it at once per boot until we (and other user space tools) have adapted to the kernel wanting user space to choose memfd execution behavior very explicitly. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-11-06 10:21:03 +01:00
Thomas Lamprecht	fbb25a860c	update submodule to Ubuntu-6.5.0-9.9 from ubuntu mantic sources Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-10-28 14:28:11 +02:00
Thomas Lamprecht	6d825fcff3	backport constraining guest-supported xfeatures only at KVM_GET_XSAVE{2} This improves compatibility for guests w.r.t. live-migration, or live snapshot rollback, to hosts with less (FPU) xfeatures supported, as long as the set of features that was actually exposed to the guest is still supported. This improves on the ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0") bug fix. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-10-21 15:16:56 +02:00
Thomas Lamprecht	9a2449d7c2	normalize patches Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-10-21 15:15:35 +02:00
Stefan Sterz	3202de9857	backport exposing FLUSHBYASID when running nested VMs on AMD CPUs this exposes the FLUSHBYASID CPU flag to nested VMs when running on an AMD CPU. also reverts a made-up check that would advertise FLUSHBYASID as not supported. this enables certain modern hypervisors such as VMware ESXi 7 and Workstation 17 to run nested VMs properly again. Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>	2023-10-20 09:42:01 +02:00
Thomas Lamprecht	04f267a5c7	backport fix for AMD erratum #1485 on Zen4-based CPUs Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-10-11 17:03:45 +02:00
Thomas Lamprecht	2db681b5f1	rebase patches on top of Ubuntu-6.2.0-36.36 (generated with debian/scripts/import-upstream-tag) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-10-03 07:05:13 +02:00
Stoiko Ivanov	4696b978f7	cherry-pick fix for new amd64 ucode The latest amd64-microcode package in sid [0] (which probably will eventually make it to bookworm-security) has a change that requires the added patch to work properly. The changelog-entry refers to stable k.o branches only - but a quick look through the linux-firmware.git log identifies: `f2eb058afc57348cde66852272d6bf11da1eef8f` as relevant commit, which refers (as NOTE in the patch) to: a32b0f0db3f3 ("x86/microcode/AMD: Load late on both threads too") which applies cleanly (although I cherry-picked the patch from the 6.1.y stable branch to have the original commit in the commit message). quickly tested compiling and booting the result in a VM (however w/o a fitting CPU (Epyc Genoa or Bergamo) it should cause a change) reported in our Enterprise Support as potential culprit for one thread from 128 being reported as offline in `lscpu` [0] https://metadata.ftp-master.debian.org/changelogs//non-free-firmware/a/amd64-microcode/amd64-microcode_3.20230808.1.1_changelog Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>	2023-09-26 11:37:58 +02:00
Thomas Lamprecht	d772676031	fix thunderbolt ring-interrupt not being masked on suspend Originally for v6.4-rc7 and now it also got already into some stable trees, but not yet into a (released) ubuntu tag – so backport it already. Link: https://forum.proxmox.com/threads/133104/post-590457 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-09-20 06:36:01 +02:00
Thomas Lamprecht	9ba0dde971	cherry-pick fix for setting X86_FEATURE_OSXSAVE feature Avoids regressions where some code, e.g., ZFS, falsely thinks it cannot use some CPU features like AVX1. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-09-19 09:27:13 +02:00
Thomas Lamprecht	8ff596f2d3	rebase patches on top of Ubuntu-6.2.0-34.34 (generated with debian/scripts/import-upstream-tag) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-09-18 15:19:28 +02:00
Thomas Lamprecht	ddba52024f	backport thunderbolt-net fixes A user of ours reported an issue with p2p thunderbolt-net w.r.t. IPv6 and failure to reestablish the connection after a reboot of a peer node, in the forum [0] and then relayed it upstream, so let's cherry-pick those two patches to our 6.2. Especially the IPv6 one seems straightforward, and the other one makes it actually spec-conformant and should only improve things. [0]: https://forum.proxmox.com/threads/133104/ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-09-18 10:40:31 +02:00
Fabian Grünbichler	1acfcad2f3	fix #4707 : add override parameter for RMRR relaxation Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>	2023-09-06 08:53:13 +02:00
Fiona Ebner	6810c247a1	cherry-pick fix for KVM vCPU page fault loop The mailing list thread [0] (found by Friedrich, many thanks!) leading up to this patch sounds very familiar to issues users reported in the community forum [1] and enterprise support channel, where a VM would be stuck for no discernable reason with all vCPU threads spinning. [0]: https://lore.kernel.org/all/f023d927-52aa-7e08-2ee5-59a2fbc65953@gameservers.com/T/#u [1]: https://forum.proxmox.com/threads/127459/ Suggested-by: Friedrich Weber <f.weber@proxmox.com> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>	2023-09-04 15:15:42 +02:00
Thomas Lamprecht	77b18ac62e	rebase patches on top of Ubuntu-6.2.0-32.32 (generated with debian/scripts/import-upstream-tag) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-08-31 11:04:14 +02:00
Fiona Ebner	762b8cebe9	cherry-pick fix to suppress faulty segfault logging While there is no actual issue, users are still nervous about the faulty logging [0]. It might take a while until the fix comes in via upstream, so just pick it up manually. [0]: https://forum.proxmox.com/threads/130628/post-583864 Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>	2023-08-25 15:31:30 +02:00
Fiona Ebner	8b9dc02180	add patch for igc tx timeout issue There were several reports about issues related to igc and tx timeout and while the issue couldn't be reproduced locally, the hope is that this fix Friedrich found will resolve the issue for the users. The kernel versions in the reports would match with when 9b275176270e ("igc: Add ndo_tx_timeout support"), i.e. the one fixed by this commit, landed. [0]: https://forum.proxmox.com/threads/130935/ [1]: https://forum.proxmox.com/threads/130415/#post-580064 [2]: https://forum.proxmox.com/threads/132138/ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>	2023-08-16 10:01:05 +02:00
Stoiko Ivanov	9dd7462461	add fixes for downfall by cherry-picking the relevant commits from launchpad/lunar [0]. (relevant commits are based on k.o/stable commits for this) minimally tested by booting my (ryzen) machine with this kernel and skimming through dmesg after boot. [0] git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/lunar Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>	2023-08-16 09:56:23 +02:00
Thomas Lamprecht	08e179ff5c	backport Zenbleed stop-gap fix CVE-2023-20593 the actual fix is the microcode update, but this is a stop-gap (with a performance penalty) setting a chicken bit on affected CPUs that do not have the new enough microcode loaded, disabling some features. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-07-25 16:56:06 +02:00
Thomas Lamprecht	069e83e462	fix #4770: backport "nvme: don't reject probe due to duplicate IDs" we got quite some reports for this (e.g., Bugzilla or [0]), well, in non-enterprise setups, as those cheap NVMe's just don't bother holding up basic principles... [0]: https://forum.proxmox.com/threads/128738/#post-567249 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-07-15 18:45:20 +02:00
Thomas Lamprecht	c22aa75368	fix #4833 : backport fix for recovering potential NX huge pages Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-07-15 18:41:35 +02:00
Thomas Lamprecht	1559d22f35	kvm: xsave set: mask-out PKRU bit in xfeatures if vCPU has no support Fixes live-migrations & snapshot-rollback of VMs with a restricted CPU type (e.g., qemu64) from our 5.15 based kernel (default Proxmox VE 7.4) to the 6.2 (and future newer) kernel of Proxmox VE 8.0. Previous to (upstream kernel) commit ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0") the PKRU bit of the host could leak into the state from the guest, which caused trouble when migrating between hosts with different CPUs, i.e., where the source supported it but the target did not, causing a general protection fault when the guest tried to use a PKRU-related instruction after the migration. But the fix, while welcome, caused a temporary out-of-sync state when migrating such a VM from a kernel without the fix to a kernel with the fix, as it threw off KVM when the CPUID of the guest and most of the state doesn't report XSAVE and thus any xfeatures, but PKRU and the related state is set as enabled, causing the vCPU to spin at 100% without any progress forever. The fix could be at two sites, either in QEMU or in the kernel, I chose the kernel as we have all the info there for a targeted heuristic so that we don't have to adapt QEMU and qemu-server, the latter even on both sides. Still, a short summary of the possible fixes and short drawbacks: * on QEMU-side either - clear the PKRU state in the migration saved state would be rather complicated to implement as the vCPU is initialised way before we have the saved xfeature state available to check what we'd need to do, plus the user-space only gets a memory blob from ioctl KVM_GET_XSAVE2 that it passes to KVM_SET_XSAVE ioctl, there are no ABI guarantees, and while the struct seems stable for 5.15 to 6.5-rc1, that doesn't have to be the case for future kernels, so off the table. - enforce that the CPUID reports PKU support even if it normally wouldn't.
While this works (tested by hard-coding it as POC) it is a) not really nice and b) needs some interaction from qemu-server to enable this flag as otherwise we have no good info to decide when it's OK to do this, which means we need to adapt both PVE 7 and 8's qemu-server and also pve-qemu, workable but not optimal * on Kernel/KVM-side we can hook into the set XSAVE ioctl specific to the KVM subsystem, which already reduces the chance of regression for all other places. There we have access to the union/struct definitions of the saved state and thus can safely cast to that. We also got access to the vCPU's CPUID capabilities, meaning we can check if the XCR0 (first XSAVE Control Register) reports that it supports the PKRU feature, and if it does NOT but the saved xfeatures register from XSAVE DOES report it, we can safely assume that this combination is due to a migration from an older, leaky kernel – and clear the bit in the xfeature register before restoring it to the guest vCPU KVM state, avoiding the confusing situation that made the vCPU spin at 100%. This should be safe to do, as the guest vCPU CPUID never reported support for the PKRU feature, and it's also a relatively niche and newish feature. If it gains us something we can drop this patch in the future Proxmox VE 9 major release, but we should ensure that VMs that were started before PVE 8 cannot be directly live-migrated to the release that includes that change; so we should rather only drop it if the maintenance burden is high. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-07-14 19:47:11 +02:00
Thomas Lamprecht	289e2dddd9	update to Proxmox-6.2.16-2 based on Ubuntu-6.2.0-25.25 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-06-17 07:56:37 +02:00
Thomas Lamprecht	85f85b6fba	backport "net/sched: flower: fix possible OOB write in fl_set_geneve_opt()" Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-06-13 16:14:24 +02:00
Thomas Lamprecht	7e4bc8ae81	backport re-adding mdev_set_iommu_device() kABI Should fix compat with SRIOV based Nvidia vGPU until they switch over to using the vfio-pci-core framework instead of MDEV. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-06-13 16:14:24 +02:00
Thomas Lamprecht	2de39b1616	update submodule to Proxmox-6.2.16-1 and refresh patches Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-05-20 19:25:13 +02:00
Thomas Lamprecht	435ecf6664	update patches for Ubuntu-6.2.0-23.23 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-05-20 19:25:13 +02:00
Thomas Lamprecht	91266dcbe2	backport "netfilter: nf_tables: deactivate anonymous set from preparation phase" Link: https://ubuntu.com/security/CVE-2023-32233 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-05-10 11:13:20 +02:00
Thomas Lamprecht	40592ac627	update to Proxmox-6.2.9-1 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-04-03 11:53:01 +02:00
Thomas Lamprecht	2c4688ec2e	replace revert of RDMA link-speed reporting patch with fix Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-03-17 14:58:46 +01:00
Thomas Lamprecht	af0b394907	update to Ubuntu-6.2.0-17.17 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-03-14 18:07:40 +01:00
Thomas Lamprecht	24d804a086	update and drop applied patches for 6.2 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-03-08 12:29:55 +01:00
Fiona Ebner	3d016e115f	add patch to fix issue with large IO requests Several people reported IO-related issues since kernel 6.1.6 [0]. Things got better with 6.1.10, but apparently the issues are not fully resolved (e.g. [1]). I ran into an issue with PBS backup of a VM with passed-through disks (error with 6.1.6, hang with 6.1.10+) and found that the issue did not occur anymore with v6.3-rc1. Bisecting what fixed the issue led to the commit in this patch. The hope is that it fixes some other issues too. The commit has a CC-stable tag for 5.15+, but telling from the absence of user reports, it was much less likely to trigger before 6.1.x (it's not clear what x is, because of the other issue in 6.1.6). The commit says it depends on 613b14884b85 ("block: handle bio_split_to_limits() NULL return") which is already present as a3f1c82e0413 ("block: handle bio_split_to_limits() NULL return") in the Ubuntu tree. [0]: https://forum.proxmox.com/threads/119483/post-530365 [1]: https://forum.proxmox.com/threads/119483/post-537991 Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>	2023-03-07 19:38:11 +01:00
Thomas Lamprecht	fc2b61b134	update submodule and patches to 6.1.14 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-02-27 18:09:00 +01:00
Thomas Lamprecht	9fde3ef1c6	wireless: Add Debian wireless-regdb certificates so that plain Debian crda + wireless-regdb can work, alternatively we could disable CRDA and bake in the regdb directly in the kernel, using the CFG80211_INTERNAL_REGDB KConfig. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-02-10 12:48:20 +01:00
Thomas Lamprecht	7c0483e8cd	update to Proxmox-6.1.10-1 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-02-07 14:09:31 +01:00
Fabian Grünbichler	826eb0ff89	build: re-enable BTF but allow discarding BTF information when loading modules, so that upgrades which are otherwise ABI compatible still work. this allows using BTF information when matching and available, while degrading gracefully if the currently running kernel is not identical to the one that module was built for. in case of a mismatch, the kernel will log a warning when loading the module, for example: Jan 30 13:57:58 test kernel: BPF: type_id=184 bits_offset=4096 Jan 30 13:57:58 test kernel: BPF: Jan 30 13:57:58 test kernel: BPF: Invalid name Jan 30 13:57:58 test kernel: BPF: Jan 30 13:57:58 test kernel: failed to validate module [bonding] BTF: -22 Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>	2023-01-31 17:44:18 +01:00
Thomas Lamprecht	2162f4c4e7	backport fix for CPU stalls with hugepage in use Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-01-31 10:21:37 +01:00
Thomas Lamprecht	5ddf42542e	rebase patches on top of Ubuntu-6.1.0-14.14 (generated with debian/scripts/import-upstream-tag) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-01-28 15:01:34 +01:00
Thomas Lamprecht	3ba39b6c0a	revert fortify patch that breaks our gcc 10.2 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-01-10 08:53:57 +01:00
Thomas Lamprecht	4d1db3083c	backport some fixes-fixes from v6.1.4 found with git log --decorate v5.16^..v6.1.4 -- Makefile kernel/ secuirty drivers/ fs \ block mm net virt/ ipc init arch/x86/ | ~/gitdm/stablefixes \ --fixed-after v6.1.2 --regressed-before v6.1.2 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-01-07 14:52:24 +01:00
Thomas Lamprecht	a0a93ff7fe	revert two stable patches that have reports about regressions we never released them yet (only introduced after 6.1.0), but there are upstream reports about regressions for them at: https://lore.kernel.org/netdev/CAK8fFZ5pzMaw3U1KXgC_OK4shKGsN=HDcR62cfPOuL0umXE1Ww@mail.gmail.com/ https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/ So do a preventive revert. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2023-01-07 13:52:36 +01:00
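
The kernel-side heuristic described in commit 1559d22f35 above (clear the PKRU bit from the saved xfeatures when the guest's XCR0 never advertised it) can be sketched roughly as follows. This is a minimal illustration in Python, not the actual KVM patch: the function and variable names are hypothetical, and the only hardware fact assumed is that PKRU is state component 9 in the XSAVE feature bitmap.

```python
# Illustrative sketch only; names are hypothetical and this is not the kernel code.

XFEATURE_PKRU_BIT = 9                       # PKRU state component index in the XSAVE bitmap
XFEATURE_MASK_PKRU = 1 << XFEATURE_PKRU_BIT


def sanitize_saved_xfeatures(saved_xfeatures: int, guest_supported_xcr0: int) -> int:
    """Return the xfeatures bitmap that should be restored into the vCPU.

    If the saved state claims PKRU but the guest's XCR0 never supported it,
    assume the bit leaked in from an older host kernel and clear it.
    """
    saved_has_pkru = bool(saved_xfeatures & XFEATURE_MASK_PKRU)
    guest_supports_pkru = bool(guest_supported_xcr0 & XFEATURE_MASK_PKRU)

    if saved_has_pkru and not guest_supports_pkru:
        # Leaked bit from a pre-fix kernel: drop it before restoring the state.
        return saved_xfeatures & ~XFEATURE_MASK_PKRU

    # Either the guest legitimately supports PKRU or the bit was never set.
    return saved_xfeatures


if __name__ == "__main__":
    x87_sse_avx = 0b111                     # bits 0..2: x87, SSE, AVX
    leaky_state = x87_sse_avx | XFEATURE_MASK_PKRU
    print(hex(sanitize_saved_xfeatures(leaky_state, x87_sse_avx)))
```

The check is deliberately narrow: only the PKRU bit is touched, and only when the guest could never have enabled it, which mirrors the commit's argument that the combination can only stem from a migration off an older, leaky kernel.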


235 Commits