pve-qemu-qoup/debian/patches/extra/0001-monitor-qmp-fix-race-on-CHR_EVENT_CLOSED-without-OOB.patch
Stefan Reiter 8dca018b68 update and rebase to QEMU v6.0.0
Mostly minor changes, bigger ones summarized:
* QEMU's internal backup code now uses a new async system, which allows
  parallel requests - the default max_workers setting is 64, but I chose
  a lower value, since 64 put enough stress on QEMU that the guest became
  practically unusable during the backup, while 16 still shows a nice,
  measurable performance improvement (see the sketch after this list).
  Only small code changes were needed on our side.
* 'malformed' QAPI parameters/functions are now a build error (e.g.
  using '_' instead of '-'); I chose to simply whitelist our calls in
  the name of backwards compatibility.
* the monitor OOB race fix now uses the upstream variant, cherry-picked
  from origin/master since it is not part of the 6.0 release
* the last patch fixes a bug in snapshot rollback related to the new
  yank system
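
A rough sketch of the knob involved (a hypothetical fragment, assuming the
BackupPerf struct from QEMU 6.0's new async backup code; only the value 16
is the choice described above, the surrounding job setup is omitted):

    /* Cap the number of parallel backup workers; upstream's default is 64. */
    BackupPerf perf = {
        .max_workers = 16,
    };
    /* perf is then handed to backup_job_create() together with the rest of
     * the job parameters when the backup job is set up. */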

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-28 11:29:44 +02:00

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Mon, 22 Mar 2021 16:40:24 +0100
Subject: [PATCH] monitor/qmp: fix race on CHR_EVENT_CLOSED without OOB

The QMP dispatcher coroutine holds the qmp_queue_lock over a yield
point, where it expects to be rescheduled from the main context. If a
CHR_EVENT_CLOSED event is received just then, it can race and block the
main thread on the mutex in monitor_qmp_cleanup_queue_and_resume.

monitor_resume does not need to be called from main context, so we can
call it immediately after popping a request from the queue, which allows
us to drop the qmp_queue_lock mutex before yielding.

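In outline, the reordered dispatcher loop then does the following (a
paraphrase of the diff below for readability, not the literal code; helper
and flag names are the ones used in monitor/qmp.c, and the
qmp_dispatcher_co_busy handshake is left out):

    req_obj = monitor_qmp_requests_pop_any_with_lock();
    /* we now hold req_obj->mon->qmp_queue_lock */
    if (need_resume) {
        monitor_resume(&mon->common);    /* no longer deferred to main ctx */
    }
    qemu_mutex_unlock(&mon->qmp_queue_lock);
    /* lock dropped: the main thread can now handle CHR_EVENT_CLOSED and
     * take qmp_queue_lock without waiting for this coroutine */
    aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
    qemu_coroutine_yield();
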
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Message-Id: <20210322154024.15011-1-s.reiter@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 monitor/qmp.c | 40 ++++++++++++++++++++++------------------
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/monitor/qmp.c b/monitor/qmp.c
index 2b0308f933..092c527b6f 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -257,24 +257,6 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
         trace_monitor_qmp_in_band_dequeue(req_obj,
                                           req_obj->mon->qmp_requests->length);
 
-        if (qatomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
-            /*
-             * Someone rescheduled us (probably because a new requests
-             * came in), but we didn't actually yield. Do that now,
-             * only to be immediately reentered and removed from the
-             * list of scheduled coroutines.
-             */
-            qemu_coroutine_yield();
-        }
-
-        /*
-         * Move the coroutine from iohandler_ctx to qemu_aio_context for
-         * executing the command handler so that it can make progress if it
-         * involves an AIO_WAIT_WHILE().
-         */
-        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
-        qemu_coroutine_yield();
-
         /*
          * @req_obj has a request, we hold req_obj->mon->qmp_queue_lock
          */
@@ -298,8 +280,30 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
             monitor_resume(&mon->common);
         }
 
+        /*
+         * Drop the queue mutex now, before yielding, otherwise we might
+         * deadlock if the main thread tries to lock it.
+         */
         qemu_mutex_unlock(&mon->qmp_queue_lock);
 
+        if (qatomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
+            /*
+             * Someone rescheduled us (probably because a new requests
+             * came in), but we didn't actually yield. Do that now,
+             * only to be immediately reentered and removed from the
+             * list of scheduled coroutines.
+             */
+            qemu_coroutine_yield();
+        }
+
+        /*
+         * Move the coroutine from iohandler_ctx to qemu_aio_context for
+         * executing the command handler so that it can make progress if it
+         * involves an AIO_WAIT_WHILE().
+         */
+        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
+        qemu_coroutine_yield();
+
         /* Process request */
         if (req_obj->req) {
             if (trace_event_get_state(TRACE_MONITOR_QMP_CMD_IN_BAND)) {