Revert "add patch to work around stuck guest IO with iothread and VirtIO block/SCSI"

This reverts commit 6b7c1815e1.

The attempted fix has been reported to cause high CPU usage after
backup [0]. Not difficult to reproduce and it's iothreads getting
stuck in a loop. Downgrading to pve-qemu-kvm=8.1.2-4 helps which was
also verified by Christian, thanks! The issue this was supposed to fix
is much rarer, so revert for now, while upstream is still working on a
proper fix.

[0]: https://forum.proxmox.com/threads/138140/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
This commit is contained in:
Fiona Ebner 2023-12-15 14:10:35 +01:00
parent c6eb05a799
commit 2a49e667ba
2 changed files with 0 additions and 67 deletions

View File

@ -1,66 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Tue, 5 Dec 2023 14:05:49 +0100
Subject: [PATCH] virtio blk/scsi: work around iothread polling getting stuck
with drain
When using iothread, after commits
1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
it can happen that polling gets stuck when draining. This would cause
IO in the guest to get completely stuck.
A workaround for users is stopping and resuming the vCPUs because that
would also stop and resume the dataplanes which would kick the host
notifiers.
This can happen with block jobs like backup and drive mirror as well
as with hotplug [2].
Reports in the community forum that might be about this issue[0][1]
and there is also one in the enterprise support channel.
As a workaround in the code, just re-enable notifications and kick the
virt queue after draining. Draining is already costly and rare, so no
need to worry about a performance penalty here. This was taken from
the following comment of a QEMU developer [3] (in my debugging,
I had already found re-enabling notification to work around the issue,
but also kicking the queue is more complete).
[0]: https://forum.proxmox.com/threads/137286/
[1]: https://forum.proxmox.com/threads/137536/
[2]: https://issues.redhat.com/browse/RHEL-3934
[3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/block/virtio-blk.c | 2 ++
hw/scsi/virtio-scsi.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 39e7f23fab..22502047d5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1537,6 +1537,8 @@ static void virtio_blk_drained_end(void *opaque)
for (uint16_t i = 0; i < s->conf.num_queues; i++) {
VirtQueue *vq = virtio_get_queue(vdev, i);
virtio_queue_aio_attach_host_notifier(vq, ctx);
+ virtio_queue_set_notification(vq, 1);
+ virtio_queue_notify(vdev, i);
}
}
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 45b95ea070..a7bddbf899 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -1166,6 +1166,8 @@ static void virtio_scsi_drained_end(SCSIBus *bus)
for (uint32_t i = 0; i < total_queues; i++) {
VirtQueue *vq = virtio_get_queue(vdev, i);
virtio_queue_aio_attach_host_notifier(vq, s->ctx);
+ virtio_queue_set_notification(vq, 1);
+ virtio_queue_notify(vdev, i);
}
}

View File

@ -60,4 +60,3 @@ pve/0042-Revert-block-rbd-implement-bdrv_co_block_status.patch
pve/0043-alloc-track-fix-deadlock-during-drop.patch
pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
pve/0045-savevm-async-don-t-hold-BQL-during-setup.patch
pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch