cherry-pick fix for data corruption
cherry-picked from 2.2.0-staging, fixing https://github.com/openzfs/zfs/issues/15526 Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
This commit is contained in:
parent
e295f30e6a
commit
3db00caad9
97
debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
vendored
Normal file
97
debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
vendored
Normal file
@ -0,0 +1,97 @@
|
|||||||
|
From 9b9b09f452a469458451c221debfbab944e7f081 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Rob N <robn@despairlabs.com>
|
||||||
|
Date: Wed, 29 Nov 2023 04:15:48 +1100
|
||||||
|
Subject: [PATCH] dnode_is_dirty: check dnode and its data for dirtiness
|
||||||
|
MIME-Version: 1.0
|
||||||
|
Content-Type: text/plain; charset=UTF-8
|
||||||
|
Content-Transfer-Encoding: 8bit
|
||||||
|
|
||||||
|
Over its history this the dirty dnode test has been changed between
|
||||||
|
checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
|
||||||
|
`dn_dirty_record`.
|
||||||
|
|
||||||
|
de198f2d9 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
|
||||||
|
2531ce372 Revert "Report holes when there are only metadata changes"
|
||||||
|
ec4f9b8f3 Report holes when there are only metadata changes
|
||||||
|
454365bba Fix dirty check in dmu_offset_next()
|
||||||
|
66aca2473 SEEK_HOLE should not block on txg_wait_synced()
|
||||||
|
|
||||||
|
Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9
|
||||||
|
|
||||||
|
It turns out both are actually required.
|
||||||
|
|
||||||
|
In the case of appending data to a newly created file, the dnode proper
|
||||||
|
is dirtied (at least to change the blocksize) and dirty records are
|
||||||
|
added. Thus, a single logical operation is represented by separate
|
||||||
|
dirty indicators, and must not be separated.
|
||||||
|
|
||||||
|
The incorrect dirty check becomes a problem when the first block of a
|
||||||
|
file is being appended to while another process is calling lseek to skip
|
||||||
|
holes. There is a small window where the dnode part is undirtied while
|
||||||
|
there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
|
||||||
|
would not know that the file is dirty, and would go to
|
||||||
|
`dnode_next_offset()`. Since the object has no data blocks yet, it
|
||||||
|
returns `ESRCH`, indicating no data found, which results in `ENXIO`
|
||||||
|
being returned to `lseek()`'s caller.
|
||||||
|
|
||||||
|
Since coreutils 9.2, `cp` performs sparse copies by default, that is, it
|
||||||
|
uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to
|
||||||
|
replicate the holes in the target. When it hits the bug, its initial
|
||||||
|
search for data fails, and it goes on to call `fallocate()` to create a
|
||||||
|
hole over the entire destination file.
|
||||||
|
|
||||||
|
This has come up more recently as users upgrade their systems, getting
|
||||||
|
OpenZFS 2.2 as well as a newer coreutils. However, this problem has been
|
||||||
|
reproduced against 2.1, as well as on FreeBSD 13 and 14.
|
||||||
|
|
||||||
|
This change simply updates the dirty check to check both types of dirty.
|
||||||
|
If there's anything dirty at all, we immediately go to the "wait for
|
||||||
|
sync" stage, It doesn't really matter after that; both changes are on
|
||||||
|
disk, so the dirty fields should be correct.
|
||||||
|
|
||||||
|
Sponsored-by: Klara, Inc.
|
||||||
|
Sponsored-by: Wasabi Technology, Inc.
|
||||||
|
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
|
||||||
|
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
|
||||||
|
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
|
||||||
|
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
|
||||||
|
Closes #15571
|
||||||
|
Closes #15526
|
||||||
|
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
|
||||||
|
---
|
||||||
|
module/zfs/dnode.c | 12 ++++++++++--
|
||||||
|
1 file changed, 10 insertions(+), 2 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/module/zfs/dnode.c b/module/zfs/dnode.c
|
||||||
|
index 7cf03264d..ad9988366 100644
|
||||||
|
--- a/module/zfs/dnode.c
|
||||||
|
+++ b/module/zfs/dnode.c
|
||||||
|
@@ -1764,7 +1764,14 @@ dnode_try_claim(objset_t *os, uint64_t object, int slots)
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
- * Checks if the dnode contains any uncommitted dirty records.
|
||||||
|
+ * Checks if the dnode itself is dirty, or is carrying any uncommitted records.
|
||||||
|
+ * It is important to check both conditions, as some operations (eg appending
|
||||||
|
+ * to a file) can dirty both as a single logical unit, but they are not synced
|
||||||
|
+ * out atomically, so checking one and not the other can result in an object
|
||||||
|
+ * appearing to be clean mid-way through a commit.
|
||||||
|
+ *
|
||||||
|
+ * Do not change this lightly! If you get it wrong, dmu_offset_next() can
|
||||||
|
+ * detect a hole where there is really data, leading to silent corruption.
|
||||||
|
*/
|
||||||
|
boolean_t
|
||||||
|
dnode_is_dirty(dnode_t *dn)
|
||||||
|
@@ -1772,7 +1779,8 @@ dnode_is_dirty(dnode_t *dn)
|
||||||
|
mutex_enter(&dn->dn_mtx);
|
||||||
|
|
||||||
|
for (int i = 0; i < TXG_SIZE; i++) {
|
||||||
|
- if (multilist_link_active(&dn->dn_dirty_link[i])) {
|
||||||
|
+ if (multilist_link_active(&dn->dn_dirty_link[i]) ||
|
||||||
|
+ !list_is_empty(&dn->dn_dirty_records[i])) {
|
||||||
|
mutex_exit(&dn->dn_mtx);
|
||||||
|
return (B_TRUE);
|
||||||
|
}
|
||||||
|
--
|
||||||
|
2.39.2
|
||||||
|
|
1
debian/patches/series
vendored
1
debian/patches/series
vendored
@ -15,3 +15,4 @@
|
|||||||
0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch
|
0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch
|
||||||
0016-Add-a-tunable-to-disable-BRT-support.patch
|
0016-Add-a-tunable-to-disable-BRT-support.patch
|
||||||
0017-zfs-2.2.1-Disable-block-cloning-by-default.patch
|
0017-zfs-2.2.1-Disable-block-cloning-by-default.patch
|
||||||
|
0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
|
||||||
|
Loading…
Reference in New Issue
Block a user