Revert "add patch to work around stuck guest IO with iothread and VirtIO block/SCSI"

This reverts commit 6b7c1815e1.

The attempted fix has been reported to cause high CPU usage after
backup [0]. The problem is not difficult to reproduce, and the cause is
iothreads getting stuck in a loop. Downgrading to
pve-qemu-kvm=8.1.2-4 helps, which was also verified by Christian,
thanks! The issue the fix was supposed to address is much rarer, so
revert for now while upstream is still working on a proper fix.

[0]: https://forum.proxmox.com/threads/138140/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

parent c6eb05a799
commit 2a49e667ba

@@ -1,66 +0,0 @@
-From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
-From: Fiona Ebner <f.ebner@proxmox.com>
-Date: Tue, 5 Dec 2023 14:05:49 +0100
-Subject: [PATCH] virtio blk/scsi: work around iothread polling getting stuck
- with drain
-
-When using iothread, after commits
-1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
-766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
-it can happen that polling gets stuck when draining. This would cause
-IO in the guest to get completely stuck.
-
-A workaround for users is stopping and resuming the vCPUs because that
-would also stop and resume the dataplanes which would kick the host
-notifiers.
-
-This can happen with block jobs like backup and drive mirror as well
-as with hotplug [2].
-
-Reports in the community forum that might be about this issue[0][1]
-and there is also one in the enterprise support channel.
-
-As a workaround in the code, just re-enable notifications and kick the
-virt queue after draining. Draining is already costly and rare, so no
-need to worry about a performance penalty here. This was taken from
-the following comment of a QEMU developer [3] (in my debugging,
-I had already found re-enabling notification to work around the issue,
-but also kicking the queue is more complete).
-
-[0]: https://forum.proxmox.com/threads/137286/
-[1]: https://forum.proxmox.com/threads/137536/
-[2]: https://issues.redhat.com/browse/RHEL-3934
-[3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096
-
-Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
----
- hw/block/virtio-blk.c | 2 ++
- hw/scsi/virtio-scsi.c | 2 ++
- 2 files changed, 4 insertions(+)
-
-diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
-index 39e7f23fab..22502047d5 100644
---- a/hw/block/virtio-blk.c
-+++ b/hw/block/virtio-blk.c
-@@ -1537,6 +1537,8 @@ static void virtio_blk_drained_end(void *opaque)
-     for (uint16_t i = 0; i < s->conf.num_queues; i++) {
-         VirtQueue *vq = virtio_get_queue(vdev, i);
-         virtio_queue_aio_attach_host_notifier(vq, ctx);
-+        virtio_queue_set_notification(vq, 1);
-+        virtio_queue_notify(vdev, i);
-     }
- }
- 
-diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
-index 45b95ea070..a7bddbf899 100644
---- a/hw/scsi/virtio-scsi.c
-+++ b/hw/scsi/virtio-scsi.c
-@@ -1166,6 +1166,8 @@ static void virtio_scsi_drained_end(SCSIBus *bus)
-     for (uint32_t i = 0; i < total_queues; i++) {
-         VirtQueue *vq = virtio_get_queue(vdev, i);
-         virtio_queue_aio_attach_host_notifier(vq, s->ctx);
-+        virtio_queue_set_notification(vq, 1);
-+        virtio_queue_notify(vdev, i);
-     }
- }
- 
debian/patches/series
@@ -60,4 +60,3 @@ pve/0042-Revert-block-rbd-implement-bdrv_co_block_status.patch
 pve/0043-alloc-track-fix-deadlock-during-drop.patch
 pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
 pve/0045-savevm-async-don-t-hold-BQL-during-setup.patch
-pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch