The original fix disabled the xsaves feature for zen1/2. The issue has
since been fixed in the cpus microcode and this patch keeps the feature enabled
if the microcode version is recent enough to contain the fix.
The patch had to be altered slightly to apply cleanly on 6.5, but no
changes content-wise.
Signed-off-by: Folke Gleumes <f.gleumes@proxmox.com>
copying files within a cifs-share currently result in the following
trace:
```
[ 495.388739] BUG: unable to handle page fault for address: fffffffffffffffe
[ 495.388744] #PF: supervisor read access in kernel mode
[ 495.388746] #PF: error_code(0x0000) - not-present page
[ 495.388747] PGD 172c3f067 P4D 172c3f067 PUD 172c41067 PMD 0
[ 495.388752] Oops: 0000 [#2] PREEMPT SMP NOPTI
[ 495.388754] CPU: 1 PID: 3894 Comm: cp Tainted: G D 6.5.0-32-generic #32-Ubuntu [ 495.388756] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 4.2023.08-4 02/15/2024
[ 495.388758] RIP: 0010:cifs_flush_folio+0x41/0xf0 [cifs]
...
```
a quick check identified proxmox-kernel-6.5.13-2 as the first affected
version, and `2dc07a11e269bfbe5589e99b60cdbae0118be979` as likely
source of the issue. The commit adapts the changes from
`7b2404a886f8b91250c31855d287e632123e1746` to work with the code in
kernel 6.1.
This is not needed as the relevant changes were made in 6.4 and
are already part of the 6.5 tree -
`66dabbb65d673aef40dd17bf62c042be8f6d4a4b`
reverting the commit fixes copying files within a samba share.
Tested/reproduced with:
* a VM with the kernel as cifs-client
* one very crude samba-share allowing guest-write access on a Debian
bookworm host
* as well as a share using cifscreds + multiuser (`mount.cifs(8)`)
* mounting the share, copying any file from one directory to another
on the same share (with `cp` and Thunar and Nautilus).
Reported to Ubuntu upstream at [1].
[0] https://lore.kernel.org/linux-mm/ZZhrpNJ3zxMR8wcU@eldamar.lan/
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2055002
Reported-by: Daniela Häsler <daniela@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
This reverts commit 29cb6fcbb7, user
feedback was showing any positive impact of this patch, and upstream
still hasn't a fix for older stable releases (but for 6.8), so for now
rather revert this and wait for either a better (well, actual) fix or
updating to 6.8 or newer.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Users have been reporting [1] that VMs occasionally become
unresponsive with high CPU usage for some time (varying between ~1 and
more than 60 seconds). After that time, the guests come back and
continue running. Windows VMs seem most affected (not responding to
pings during the hang, RDP sessions time out), but we also got reports
about Linux VMs (reporting soft lockups). The issue was not present on
host kernel 5.15 and was first reported with kernel 6.2. Users
reported that the issue becomes easier to trigger the more memory is
assigned to the guests. Setting mitigations=off was reported to
alleviate (but not eliminate) the issue. For most users the issue
seems to disappear after (also) disabling KSM [2], but some users
reported freezes even with KSM disabled [3].
It turned out the reports concerned NUMA hosts only, and that the
freezes correlated with runs of the NUMA balancer [4]. Users reported
that disabling the NUMA balancer resolves the issue (even with KSM
enabled).
We put together a Linux VM reproducer, ran a git-bisect on the kernel
to find the commit introducing the issue and asked upstream for help
[5]. As it turned out, an upstream bugreport was recently opened [6]
and a preliminary fix to the KVM TDP MMU was proposed [7]. With that
patch [7] on top of kernel 6.7, the reproducer does not trigger
freezes anymore. As of now, the patch (or its v2 [8]) is not yet
merged in the mainline kernel, and backporting it may be difficult due
to dependencies on other KVM changes [9].
However, the bugreport [6] also prompted an upstream developer to
propose a patch to the kernel scheduler logic that decides whether a
contended spinlock/rwlock should be dropped [10]. Without the patch,
PREEMPT_DYNAMIC kernels (such as ours) would always drop contended
locks. With the patch, the kernel only drops contended locks if the
kernel is currently set to preempt=full. As noted in the commit
message [10], this can (counter-intuitively) improve KVM performance.
Our kernel defaults to preempt=voluntary (according to
/sys/kernel/debug/sched/preempt), so with the patch it does not drop
contended locks anymore, and the reproducer does not trigger freezes
anymore. Hence, backport [10] to our kernel.
[1] https://forum.proxmox.com/threads/130727/
[2] https://forum.proxmox.com/threads/130727/page-4#post-575886
[3] https://forum.proxmox.com/threads/130727/page-8#post-617587
[4] https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#numa-balancing
[5] https://lore.kernel.org/kvm/832697b9-3652-422d-a019-8c0574a188ac@proxmox.com/
[6] https://bugzilla.kernel.org/show_bug.cgi?id=218259
[7] https://lore.kernel.org/all/20230825020733.2849862-1-seanjc@google.com/
[8] https://lore.kernel.org/all/20240110012045.505046-1-seanjc@google.com/
[9] https://lore.kernel.org/kvm/Zaa654hwFKba_7pf@google.com/
[10] https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com/
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
There shouldn't be any changes for us w.r.t. data integrity and the
recent uncovered dnode dirtiness, as we backported those patches
already.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
since e2d55709398e ("vfio: Fold vfio_virqfd.ko into vfio.ko") this
config isn't a tristate anymore but a bool, so adapt to that.
Luckily the kconfig script did the right thing and set (or at least
kept) this to yes anyway
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
the signed template together with the binary package(s) containing the unsigned
files form the input to our secure boot signing service.
the signed template consists of
- files.json (specifying which files are signed how and by which key)
- packaging template used to build the signed package(s)
the signing service
- extracts and checks the signed-template binary package
- extracts the unsigned package(s)
- signs the needed files
- packs up the signatures + the template contained in the signed-template
package into the signed source package
the signed source package can then be built in the regular fashion (in case of
the kernel packages, it will copy the kernel image, modules and some helper
files from the unsigned package, attach the signature created by the signing
service, and re-pack the result as signed-kernel package).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>