It's in master-next of current ubuntu noble kernel git tree and a null
check cannot really hurt.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The reporter has an Adaptec 5805 controller (using the aacraid
driver), which reports a byteswapped page length for VPD page 0. It
reports "02 00" as page length instead of "00 02".
This stopped working with kernel 6.8.4 due to commit b5fc07a5fb56
("scsi: core: Consult supported VPD page list prior to fetching page")
To address that issue limit the page search scope to the size of our
VPD buffer to guard against devices returning a larger page count than
requested.
Reported-by: Peter Schneider <pschneider1968@googlemail.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The patch from commit e5731f4 ("backport fix for managing block flush
queue list") caused some fallout when used with LVM on root, as that
uses some rather odd (but previously working fine) PREFLUSH
| POSTFLUSH format that was now causing the list to be used without
being initialized, resulting in freezes.
Link: https://lore.kernel.org/all/20240608143115.972486-1-chengming.zhou@linux.dev/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reported in the community forum [0] and easy to reproduce by doing
e.g.
> while true; do mount -t nfs 192.168.20.148:/rpool/data /mnt/test; done
from another node for a share that does not exist or for which the
client has no permissions.
[0]: https://forum.proxmox.com/threads/146649
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
The original fix disabled the xsaves feature for zen1/2. The issue has
since been fixed in the cpus microcode and this patch keeps the feature enabled
if the microcode version is recent enough to contain the fix.
Signed-off-by: Folke Gleumes <f.gleumes@proxmox.com>
With apparmor 4, when recvmsg() calls are checked by the apparmor LSM
they will always return EINVAL.
This causes very weird issues when apparmor profiles are in use, and a
lot of networking issues in containers (which are always using
apparmor).
When coming from sys_recvmsg, msg->msg_namelen is explicitly set to
zero early on. (see ____sys_recvmsg in net/socket.c)
We still end up in 'map_addr' where the assumption is that addr !=
NULL means addrlen has a valid size.
This is likely not a final fix, it was suggested by jjohansen on irc
to get things going until this is resolved properly.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This reverts commit 29cb6fcbb7, user
feedback was showing any positive impact of this patch, and upstream
still hasn't a fix for older stable releases (but for 6.8), so for now
rather revert this and wait for either a better (well, actual) fix or
updating to 6.8 or newer.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Users have been reporting [1] that VMs occasionally become
unresponsive with high CPU usage for some time (varying between ~1 and
more than 60 seconds). After that time, the guests come back and
continue running. Windows VMs seem most affected (not responding to
pings during the hang, RDP sessions time out), but we also got reports
about Linux VMs (reporting soft lockups). The issue was not present on
host kernel 5.15 and was first reported with kernel 6.2. Users
reported that the issue becomes easier to trigger the more memory is
assigned to the guests. Setting mitigations=off was reported to
alleviate (but not eliminate) the issue. For most users the issue
seems to disappear after (also) disabling KSM [2], but some users
reported freezes even with KSM disabled [3].
It turned out the reports concerned NUMA hosts only, and that the
freezes correlated with runs of the NUMA balancer [4]. Users reported
that disabling the NUMA balancer resolves the issue (even with KSM
enabled).
We put together a Linux VM reproducer, ran a git-bisect on the kernel
to find the commit introducing the issue and asked upstream for help
[5]. As it turned out, an upstream bugreport was recently opened [6]
and a preliminary fix to the KVM TDP MMU was proposed [7]. With that
patch [7] on top of kernel 6.7, the reproducer does not trigger
freezes anymore. As of now, the patch (or its v2 [8]) is not yet
merged in the mainline kernel, and backporting it may be difficult due
to dependencies on other KVM changes [9].
However, the bugreport [6] also prompted an upstream developer to
propose a patch to the kernel scheduler logic that decides whether a
contended spinlock/rwlock should be dropped [10]. Without the patch,
PREEMPT_DYNAMIC kernels (such as ours) would always drop contended
locks. With the patch, the kernel only drops contended locks if the
kernel is currently set to preempt=full. As noted in the commit
message [10], this can (counter-intuitively) improve KVM performance.
Our kernel defaults to preempt=voluntary (according to
/sys/kernel/debug/sched/preempt), so with the patch it does not drop
contended locks anymore, and the reproducer does not trigger freezes
anymore. Hence, backport [10] to our kernel.
[1] https://forum.proxmox.com/threads/130727/
[2] https://forum.proxmox.com/threads/130727/page-4#post-575886
[3] https://forum.proxmox.com/threads/130727/page-8#post-617587
[4] https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#numa-balancing
[5] https://lore.kernel.org/kvm/832697b9-3652-422d-a019-8c0574a188ac@proxmox.com/
[6] https://bugzilla.kernel.org/show_bug.cgi?id=218259
[7] https://lore.kernel.org/all/20230825020733.2849862-1-seanjc@google.com/
[8] https://lore.kernel.org/all/20240110012045.505046-1-seanjc@google.com/
[9] https://lore.kernel.org/kvm/Zaa654hwFKba_7pf@google.com/
[10] https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com/
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
There shouldn't be any changes for us w.r.t. data integrity and the
recent uncovered dnode dirtiness, as we backported those patches
already.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
since e2d55709398e ("vfio: Fold vfio_virqfd.ko into vfio.ko") this
config isn't a tristate anymore but a bool, so adapt to that.
Luckily the kconfig script did the right thing and set (or at least
kept) this to yes anyway
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
the signed template together with the binary package(s) containing the unsigned
files form the input to our secure boot signing service.
the signed template consists of
- files.json (specifying which files are signed how and by which key)
- packaging template used to build the signed package(s)
the signing service
- extracts and checks the signed-template binary package
- extracts the unsigned package(s)
- signs the needed files
- packs up the signatures + the template contained in the signed-template
package into the signed source package
the signed source package can then be built in the regular fashion (in case of
the kernel packages, it will copy the kernel image, modules and some helper
files from the unsigned package, attach the signature created by the signing
service, and re-pack the result as signed-kernel package).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
it's really not just ZFS and AMDGPU modules, but way more and
generating scary looking messages for these "issues" is just noise
that drown real issues. Disable this for now, maybe in another few
years.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
to silence array-index-out-of-bounds warnings for dynamically-sized
arrays. All commits applied cleanly and just replace array[1] with
array[].
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
makes it easier to cherry-pick newer stable release tags, that
sometimes contain new config values one must pick from.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
it's mostly noise for users, and quiet some interpret this as real
problem and report it to us.
Ideally we'd either educate them, or take time ourself, to report this
upstream and see if the situation can be improved overall, but
currently that's not feasible. We should check this out a few releases
down, if the lower hanging fruits got fixed and noise got lower we
could enable it again to catch the more rare cases.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We have a slightly better fix where only a few targeted ZFS module
parts are added to the UBSAN ignore-list, so the rest of the kernel
still gets exposure.
Link: https://github.com/openzfs/zfs/pull/15510
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This is generating far too much noise in the logs, so keep it at once
per boot until we (and other user space tools) adapted to the kernel
wanting user space to chose memfd execution behavior very explicitly.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Following ZFS commit ad9e76765 ("linux: module: weld all but spl.ko
into zfs.ko") we only have two modules to care about.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This improves compatibility for guests w.r.t. live-migration, or live
snapshot rollback, to hosts with less (FPU) xfeatures supported, as
long as the set of features that was actually exposed to the guest is
still supported.
This improves on the ad856280ddea ("x86/kvm/fpu: Limit guest
user_xfeatures to supported bits of XCR0") bug fix.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this exposes the FLUSHBYASID CPU flag to nested VMs when running on an
AMD CPU. also reverts a made up check that would advertise
FLUSHBYASID as not supported. this enable certain modern hypervisors
such as VMWare ESXi 7 and Workstation 17 to run nested VMs properly
again.
Signed-off-by: Stefan Sterz <s.sterz@proxmox.com>
merge both versions, I saw the fix for AMD slightly to late and
previous build wasn't made public already anyway
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
From the upstream commit [0] that this update pulls in:
> Intel SPR erratum SPR4 says that if you trip into a vmexit while
> doing FPU save/restore, your AMX register state might misbehave...
> and by misbehave, I mean save all zeroes incorrectly, leading to
> explosions if you restore it.
>
> Since we're not using AMX for anything, the simple way to avoid
> this is to just not save/restore those when we do anything, since
> we're killing preemption of any sort across our save/restores.
>
> If we ever decide to use AMX, it's not clear that we have any
> way to mitigate this, on Linux...but I am not an expert.
[0]: c65aaa8387
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The latest amd64-microcode package in sid [0] (which probably will
eventually make it to bookworm-security) has a change that requires
the added patch to work properly.
The changelog-entry refers to stable k.o branches only - but a quick
look through the linux-firmware.git log identifies:
`f2eb058afc57348cde66852272d6bf11da1eef8f` as relevant commit, which
refers (as NOTE in the patch) to:
a32b0f0db3f3 ("x86/microcode/AMD: Load late on both threads too")
which applies cleanly (although I cherry-picked the patch from the
6.1.y stable branch to have the original commit in the commit
message).
quickly tested compiling and booting the result in a VM (however w/o
a fitting CPU (Epyc Genoa or Bergamo) it should cause a change)
reported in our Enterprise Support as potential culprit for one
thread from 128 being reported as offline in `lscpu`
[0] https://metadata.ftp-master.debian.org/changelogs//non-free-firmware/a/amd64-microcode/amd64-microcode_3.20230808.1.1_changelog
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Originally for v6.4-rc7 and now it also got already into some stable
trees, but not yet into a (released) ubuntu tag – so backport it
already.
Link: https://forum.proxmox.com/threads/133104/post-590457
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Avoids regressions where some code falsely think they cannot use some
CPU features like AVX1, e.g., ZFS.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
A user of ours reported an issue with p2p thunderbolt-net w.r.t. IPv6
and failure to reestablish the connection after a reboot of a peer
node, in the forum [0] and the relayed it upstream, so lets
cherry-pick those two patches to our 6.2. Especially the IPv6 one
seems straight forward, and the other one makes it actually spec
conform and should only improve things.
[0]: https://forum.proxmox.com/threads/133104/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The mailing list thread [0] (found by Friedrich, many thanks!) leading
up to this patch sounds very familiar to issues users reported in the
community forum [1] and enterprise support channel, where a VM would
be stuck for no discernable reason with all vCPU threads spinning.
[0]: https://lore.kernel.org/all/f023d927-52aa-7e08-2ee5-59a2fbc65953@gameservers.com/T/#u
[1]: https://forum.proxmox.com/threads/127459/
Suggested-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
While there is no actual issue, users are still nervous about the
faulty logging [0]. It might take a while until the fix comes in via
upstream, so just pick it up manually.
[0]: https://forum.proxmox.com/threads/130628/post-583864
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
when not having installed an intel-microcode version containing the
mitigation, this options disables AVX instructions, which breaks quite
a lot of software (e.g. firefox, electron apps)
Reported-by: Stefan Hanreich <s.hanreich@proxmox.com>
Tested-by: Stefan Hanreich <s.hanreich@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
There were several reports about issues related to igc and tx timeout
and while the issue couldn't be reproduced locally, the hope is that
this fix Friedrich found will resolve the issue for the users. The
kernel versions in the reports would match with when 9b275176270e
("igc: Add ndo_tx_timeout support"), i.e. the one fixed by this
commit, landed.
[0]: https://forum.proxmox.com/threads/130935/
[1]: https://forum.proxmox.com/threads/130415/#post-580064
[2]: https://forum.proxmox.com/threads/132138/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
by cherry-picking the relevant commits from launchpad/lunar [0].
(relevant commits are based on k.o/stable commits for this)
minimally tested by booting my (ryzen) machine with this kernel and
skimming through dmesg after boot.
[0] git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/lunar
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
and drop PKGREL variable from Makefile, since every package release is a kernel ABI bump now.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this is required for secure boot support.
at build time, an ephemeral key pair will be generated and all built modules
will be signed with it. the private key is discarded, and the public key
embedded in the kernel image for signature validation at module load time.
this change means that every kernel release must be considered an ABI change
from now on, else the signatures of on-disk modules and the signing key
embedded in the running kernel image might not match.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
long overdue, and avoids the issue of the meta packages version going down
after being folded in from the pve-kernel-meta repository.
the ABI needs to be bumped for every published kernel package now that modules
are signed, else the booted kernel image containing the public part of the
ephemeral signing key, and the on-disk (potentially upgraded in-place) signed
module files can disagree, and module loading would fail.
not changed (yet): git repository name, pve-firmware
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the actual fix is the microcode update, but this is a stop-gap (with
a performance penalty) setting a chicken bit on affected CPUs that do
not have the new enough microcode loaded, disabling some features.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we got quite some reports for this (e.g., Bugzilla or [0]), well in
non-enterprise setups as those cheap NVMe's just don't bother holding
up basic principles...
[0]: https://forum.proxmox.com/threads/128738/#post-567249
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fixes live-migrations & snapshot-rollback of VMs with a restricted
CPU type (e.g., qemu64) from our 5.15 based kernel (default Proxmox
VE 7.4) to the 6.2 (and future newer) of Proxmox VE 8.0.
Previous to (upstream kernel) commit ad856280ddea ("x86/kvm/fpu: Limit
guest user_xfeatures to supported bits of XCR0") the PKRU bit of the
host could leak into the state from the guest, which caused trouble
when migrating between hosts with different CPUs, i.e., where the
source supported it but the target did not, causing a general
protection fault when the guest tried to use a pkru related
instruction after the migration.
But the fix, while welcome, caused a temporary out-of-sync state when
migrating such a VM from a kernel without the fix to a kernel with
the fix, as it threw of KVM when the CPUID of the guest and most of
the state doesn't report XSAVE and thus any xfeatures, but PKRU and
the related state is set as enabled, causing the vCPU to spin at 100%
without any progress forever.
The fix could be at two sites, either in QEMU or in the kernel, I
choose the kernel as we have all the info there for a targeted
heuristic so that we don't have to adapt QEMU and qemu-server, the
latter even on both sides.
Still, a short summary of the possible fixes and short drawbacks:
* on QEMU-side either
- clear the PKRU state in the migration saved state would be rather
complicated to implement as the vCPU is initialised way before we
have the saved xfeature state available to check what we'd need
to do, plus the user-space only gets a memory blob from ioctl
KVM_GET_XSAVE2 that it passes to KVM_SET_XSAVE ioctl, there are
no ABI guarantees, and while the struct seem stable for 5.15 to
6.5-rc1, that doesn't has to be for future kernels, so off the
table.
- enforce that the CPUID reports PKU support even if it normally
wouldn't. While this works (tested by hard-coding it as POC) it
is a) not really nice and b) needs some interaction from
qemu-server to enable this flag as otherwise we have no good info
to decide when it's OK to do this, which means we need to adapt
both PVE 7 and 8's qemu-server and also pve-qemu, workable but
not optimal
* on Kernel/KVM-side we can hook into the set XSAVE ioctl specific to
the KVM subsystem, which already reduces chance of regression for
all other places. There we have access to the union/struct
definitions of the saved state and thus can savely cast to that.
We also got access to the vCPU's CPUID capabilities, meaning we can
check if the XCR0 (first XSAVE Control Register) reports
that it support the PKRU feature, and if it does *NOT* but the
saved xfeatures register from XSAVE *DOES* report it, we can safely
assume that this combination is due to an migration from an older,
leaky kernel – and clear the bit in the xfeature register before
restoring it to the guest vCPU KVM state, avoiding the confusing
situation that made the vCPU spin at 100%.
This should be safe to do, as the guest vCPU CPUID never reported
support for the PKRU feature, and it's also a relatively niche and
newish feature.
If it gains us something we can drop this patch a bit in the future
Proxmox VE 9 major release, but we should ensure that VMs that where
started before PVE 8 cannot be directly live-migrated to the release
that includes that change; so we should rather only drop it if the
maintenance burden is high.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Should fix compat with SRIOV based Nvidia vGPU until they switch over
to using the vfio-pci-core framework instead of MDEV.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
no fun to build the kernel with just a single job at the same time,
which happens e.g., in an sbuild environment.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
lintian rightfully errors out on this one, makes no sense to depend
on an implementation detail of the perl packaging ecosystem.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Update the ZFS submodule so that it includes a commit with compat fix
[0] for kernel 6.2.8, which otherwise regressed build through the
484c2be84b49 ("block: count 'ios' and 'sectors' when io is done for
bio-based device") commit, which was backported to stable-6.2 from
the v6.3-rc3 "release".
[0]: 59f1875639
Link: https://github.com/openzfs/zfs/issues/14658
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Several people reported IO-related issues since kernel 6.1.6 [0].
Things got better with 6.1.10, but apparently the issues are not fully
resolved (e.g. [1]).
I ran into an issue with PBS backup of a VM with passed-through disks
(error with 6.1.6, hang with 6.1.10+) and found that the issue did not
occur anymore with v6.3-rc1. Bisecting what fixed the issue led to the
commit in this patch. The hope is that it fixes some other issues too.
The commit has a CC-stable tag for 5.15+, but telling from the absence
of user reports, it was much less likely to trigger before 6.1.x (it's
not clear what x is, because of the other issue in 6.1.6). The commit
says it depends on 613b14884b85 ("block: handle bio_split_to_limits()
NULL return") which is already present as a3f1c82e0413 ("block:
handle bio_split_to_limits() NULL return") in the Ubuntu tree.
[0]: https://forum.proxmox.com/threads/119483/post-530365
[1]: https://forum.proxmox.com/threads/119483/post-537991
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
so that plain Debian crda + wireless-regdb can work, alternatively we
could disable CRDA and bake in the regdb directly in the kernel,
using the CFG80211_INTERNAL_REGDB KConfig.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
but allow discarding BTF information when loading modules, so that upgrades
which are otherwise ABI compatible still work. this allows using BTF
information when matching and available, while degrading gracefully if the
currently running kernel is not identical to the one that module was built for.
in case of a mismatch, the kernel will log a warning when loading the module,
for example:
Jan 30 13:57:58 test kernel: BPF: type_id=184 bits_offset=4096
Jan 30 13:57:58 test kernel: BPF:
Jan 30 13:57:58 test kernel: BPF: Invalid name
Jan 30 13:57:58 test kernel: BPF:
Jan 30 13:57:58 test kernel: failed to validate module [bonding] BTF: -22
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
This is provdied by both initramfs-tools and dracut.
Required to be able to use dracut in place of
initramfs-tools.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
same info but shorter, avoiding cut-off on `uname -a` output due to
the relatively newly changed and reported "SMP PREEMPT_DYNAMIC" mode.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this was actually intended for the stable 5.15 branch, already
included in 5.19.
This reverts commit 198fde3a16.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The following issue reported on the community forum [0] is likely
fixed by this.
In my case, loading a VM snapshot that originally was taken on an
Intel CPU on my AMD-based host often caused problems in other VMs. In
particular, it often led to CPU stalls, and sometimes clock jumps far
into the future. With this backport applied, everything seems to run
smoothly even after loading the "bad" snapshot 10 times.
The backport from upstream commit 11d39e8cc43e ("KVM: SVM: fix tsc
scaling cache logic consisted of dropping the parts for nested TSC
scaling, which is not yet present in our kernel, renaming the constant
for the default ratio, and some context changes.
[0] https://forum.proxmox.com/threads/112756/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
hio driver got removed by ubuntu already in jammy, but then they
forgot to remove this instance too, failing the clean build target,
my patch got accepted but was forgotten when doing the same in
kinetic, so here we go again
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Useless since 5.5 and will fail build with 5.16+, see upstream linux
commit 7ecaf069da52 and 4fbce819337a for some details
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we got lots of reports with QNAP NFS being broken, and the commit
this cherry picked one fixes got backported to 5.15 by canonical, so
its def. worth a try.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we cannot make this a built-in easily due to kconfig dependency
resolution.
We'll handle the availability in initrd with a initramfs modules.d
snippet shipped by the meta package,
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
pve-headers-$(uname -r) is equivalent to
linux-headers-$(uname -r)-amd64
pve-kernel-$(uname -r) is equivalent to
linux-image-$(uname -r)-amd64
By adding a provides this should help users running
`apt install linux-headers-$(uname -r)-amd64` which is commonly
suggested in install instructions for third-party kernel-drivers on
plain debian.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
used for compressing the kernel image, build fails if not installed.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Acked-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
There were quite a few reports in the community forum about Windows
VMs with SATA disks not working after upgrading to kernel 5.13.
Issue was reproducible during the installation of Win2019 (suggested
by Thomas), and it's already fixed in 5.15. Bisecting led to
io-wq: split bounded and unbounded work into separate lists
as the commit fixing the issue.
Indeed, the commit states
Fixes: ecc53c48c13d ("io-wq: check max_worker limits if a worker transitions bound state")
which is present as a backport in ubuntu-impish:
f9eb79f840052285408ae9082dc4419dc1397954
The first backport
io-wq: fix queue stalling race
also sounds nice to have and additionally served as a preparation for
the second one to apply more cleanly.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
The patch is not needed anymore, because the fix is already in
ubuntu-impish (commit d0b69849e40b2c3582f1cd6573f8e0d3a033d078).
Unfortunately, the patch still applied (in the wrong place), making it
hard to notice.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Debian did so since 5.10~rc7-1~exp1 and ubuntu only disabled it due
some concerns about "high" memory usage on many-core systems[0], high
is to be seen relative here as its 26 MiB on 208 cores[1] and only
matters for ubuntu as due to their snaps they may have a lot of
active squashfs mounts.
Proxmox projects do not use snaps, or other things that uses squashfs
instances a tall besides the installer. While some users may use a
few it is unlikely to cause much problems (a few 100 MiB should not
be a big problem on a server with hundreds of online cores.
Any how, to speed up decompression in our installer and use a similar
setting as Debian, the distro we're most similar too, enable this
Kconfig knob.
[0]: https://bugs.launchpad.net/snappy/+bug/1636847
[1]: https://bugs.launchpad.net/snappy/+bug/1636847/comments/21
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
breaking some NVME setups. these should be picked up by one of the next
Ubuntu kernel releases, since both the breaking change and the fix are
authored by Canonical devs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
from ubuntu-hirsute upstream/master-next
cherry-pick only the 2 patches, because master-next is 970 commits
ahead of our current master.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
which could be triggered in some corner cases with (but most likely
not limited to) LVM-backed QEMU guests using io_uring.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
sometimes the build would fail with
cp: cannot stat 'ubuntu-hirsute/.tmp_1987275': No such file or directory
make[1]: *** [debian/rules:181: .headers_prepare_mark] Error 1
make[1]: Leaving directory '/home/fgruenbichler/pve-kernel/build'
dpkg-buildpackage: error: fakeroot debian/rules binary subprocess returned exit status 2
make: *** [Makefile:58: pve-kernel-5.11.21-1-pve_5.11.21-1_amd64.deb] Error 2
if copying was slow enough.
so let's do the copying first, then do the rest in parallel without
needing to worry about side-effects.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
needed for it to be a proper replacement for linux-libc-dev when
resolving dependencies, such as for liburing-dev
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
and put them into a new -dbgsym package for usage with
crash/kdump-tools/...
fixes#3465, and now allows to do the following (after installing
and configuring kdump-tools to collect kernel crash dumps) when the
system crashes:
$ apt install pve-kernel-5.11.21-1-dbgsym
$ crash /usr/lib/debug/boot/vmlinux-5.11.21-1-pve /var/crash/202106151236/dump.202106151236
crash 7.2.9
Copyright (C) 2002-2020 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
WARNING: kernel relocated [812MB]: patching 136336 gdb minimal_symbol values
KERNEL: /usr/lib/debug/boot/vmlinux-5.11.21-1-pve
DUMPFILE: /var/crash/202106151236/dump.202106151236 [PARTIAL DUMP]
CPUS: 4
DATE: Tue Jun 15 12:36:38 CEST 2021
UPTIME: 00:06:21
LOAD AVERAGE: 0.04, 0.11, 0.08
TASKS: 272
NODENAME: test
RELEASE: 5.11.21-1-pve
VERSION: #1 SMP PVE 5.11.21-1 (Tue, 01 Jun 2021 16:38:57 +0200)
MACHINE: x86_64 (3696 Mhz)
MEMORY: 8 GB
PANIC: "Kernel panic - not syncing: sysrq triggered crash"
PID: 3167
COMMAND: "bash"
TASK: ffff9220c8f5be00 [THREAD_INFO: ffff9220c8f5be00]
CPU: 3
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 3167 TASK: ffff9220c8f5be00 CPU: 3 COMMAND: "bash"
#0 [ffffa24ec0bfbc80] machine_kexec at ffffffffb3c751f3
#1 [ffffa24ec0bfbce0] __crash_kexec at ffffffffb3d61092
#2 [ffffa24ec0bfbdb0] panic at ffffffffb47b769d
#3 [ffffa24ec0bfbe30] sysrq_handle_crash at ffffffffb434da4a
#4 [ffffa24ec0bfbe40] __handle_sysrq.cold at ffffffffb47e2cdc
#5 [ffffa24ec0bfbe78] write_sysrq_trigger at ffffffffb434e3f8
#6 [ffffa24ec0bfbe90] proc_reg_write at ffffffffb3fc09ea
#7 [ffffa24ec0bfbeb0] vfs_write at ffffffffb3f143b6
#8 [ffffa24ec0bfbee8] ksys_write at ffffffffb3f16b97
#9 [ffffa24ec0bfbf28] __x64_sys_write at ffffffffb3f16c2a
#10 [ffffa24ec0bfbf38] do_syscall_64 at ffffffffb480e868
#11 [ffffa24ec0bfbf50] entry_SYSCALL_64_after_hwframe at ffffffffb4a0008c
RIP: 00007f367f7baf33 RSP: 00007ffe6175dc98 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f367f7baf33
RDX: 0000000000000002 RSI: 0000560510e640b0 RDI: 0000000000000001
RBP: 0000560510e640b0 R8: 000000000000000a R9: 0000000000000001
R10: 0000560510e5f800 R11: 0000000000000246 R12: 0000000000000002
R13: 00007f367f88b6a0 R14: 0000000000000002 R15: 00007f367f88b8a0
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
as well as lots of other fun things (see 'help' after opening a crash dump).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
The prepare/compile/install targets feel a bit mixed, so it's not
100% clear where this should happen.
But as the `.headers_compile_mark` already triggers various kernel
build targets with a correct kconfig setup, it is a good fit to add
the modules_prepare step (which is recommended to use when preparing
a out-of-three (OOT) module build environment like dkms expects)
there. As we can only copy (= install) the `scripts` directory
afterwards it follows that it needs to be moved afterwards. Moving
installing the `include` directory there is not really necessary but
it feels like a better place than the _prepare_ target and safes a
extra line, so move that over too.
In terms of actual changes to the built header package we get
additionally the, now generated, module.lds file.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Those files are generated by the `if_changed` macro from
scripts/Kbuild.include and are not really useful or interesting for
being shipped in the header packages and other distros (checked
Debian and Ubuntu) do not seem to ship those at all..
So, lets prune them to reduce shipped files dramatically, without
losing, well, anything.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we do not use module signing currently.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 77470417db)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(generated with debian/scripts/import-upstream-tag)
+ manually dropped the now hopefully superfluous.
0006-Revert-scsi-lpfc-Fix-broken-Credit-Recovery-after-dr.patch
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
since it prevents boot with our current way of building ZFS modules in
case a system is booted with secureboot enabled.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
This was long overdue, allows to access the full feature set of our
kernel for some tools using the Linux API directly.
Packaging mostly taken from Debian[0]
[0]: https://salsa.debian.org/kernel-team/linux/-/blob/debian/4.19.118-2/debian/rules.real#L367
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-By: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Since Ubuntu Eoan the kernel compression was changed from GZIP to
LZ4, due to slightly faster load times vs. a 25% size increase
trade-off (e.g. 5.0 had ~ 8, this one has ~ 12 MB; *but* the initrd
stays roughly the same size, and that one is 5 times bigger anyway)
If we want to keep that is in the stars, but for now correctly
document the build-dependency to LZ4.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The PC speaker (beeper) can only be managed by one module, and there
are two which could do so. The very basic INPUT_PCSPKR, and the more
advanced SND_PCSP which allows it to be used as primitive ALSA
soundcard, which for Proxmox Server projects, and all modern
workstations is not much of use.
As they both were aliased to the "pcspkr" module name, and used the
same internal driver name (being a replacment of the other), one
would get the following error message when both are loaded:
"Error: Driver 'pcspkr' is already registered, aborting..."
in the kernel log. This happens as by default both are tried to get
loaded. We do not want the more complex ALSA one, so disable that.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Allows to mount VFAT devices even if the currently running kernel was
removed before any VFAT, or other FS using the default Native
Language Support module was mounted during the current uptime.
This then could break updating the ESP partitions, which are mounted
with VFAT in a postrm triggered step - so at a time where the current
/lib/modules/... was already removed, and so the NLS could not get
loaded.
While there are a lot of different NLS, our kernel config has:
> CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
So compile that module as built-in.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
debian-control-has-dbgsym-package (in section for proxmox-kernel-*-pve-dbgsym) Package [debian/control:*]
license-problem-gfdl-invariants invariant part is: with the :ref:`invariant sections <fdl-invariant>` being list their titles, with the :ref:`front-cover texts <fdl-cover-texts>` being list, and with the :ref:`back-cover texts <fdl-cover-texts>` being list [ubuntu-kernel/Documentation/userspace-api/media/fdl-appendix.rst]
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.