mirror of
https://git.proxmox.com/git/mirror_zfs.git
synced 2026-05-22 02:27:36 +03:00
Illumos 4390 - I/O errors can corrupt space map when deleting fs/vol
4390 i/o errors when deleting filesystem/zvol can lead to space map corruption
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>

References:
    https://www.illumos.org/issues/4390
    https://github.com/illumos/illumos-gate/commit/7fd05ac

Porting notes:

Previous stack-reduction efforts in traverse_visitb() caused a fair
number of un-mergable pieces of code. This patch should reduce its
stack footprint a bit more.

The new local bptree_entry_phys_t in bptree_add() is dynamically-allocated
using kmem_zalloc() for the purpose of stack reduction.

The new global zfs_free_leak_on_eio has been defined as an integer
rather than a boolean_t as was the case with the related zfs_recover
global. Also, zfs_free_leak_on_eio's definition has been inserted into
zfs_debug.c for consistency with the existing definition of zfs_recover.
Illumos placed it in spa_misc.c.

Ported by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2545
Committed by: Brian Behlendorf
Parent: 9b67f60560
Commit: fbeddd60b7
@@ -696,6 +696,43 @@ Set additional debugging flags
 Default value: \fB1\fR.
 .RE
 
+.sp
+.ne 2
+.na
+\fBzfs_free_leak_on_eio\fR (int)
+.ad
+.RS 12n
+If destroy encounters an EIO while reading metadata (e.g. indirect
+blocks), space referenced by the missing metadata can not be freed.
+Normally this causes the background destroy to become "stalled", as
+it is unable to make forward progress. While in this stalled state,
+all remaining space to free from the error-encountering filesystem is
+"temporarily leaked". Set this flag to cause it to ignore the EIO,
+permanently leak the space from indirect blocks that can not be read,
+and continue to free everything else that it can.
+
+The default, "stalling" behavior is useful if the storage partially
+fails (i.e. some but not all i/os fail), and then later recovers. In
+this case, we will be able to continue pool operations while it is
+partially failed, and when it recovers, we can continue to free the
+space, with no leaks. However, note that this case is actually
+fairly rare.
+
+Typically pools either (a) fail completely (but perhaps temporarily,
+e.g. a top-level vdev going offline), or (b) have localized,
+permanent errors (e.g. disk returns the wrong data due to bit flip or
+firmware bug). In case (a), this setting does not matter because the
+pool will be suspended and the sync thread will not be able to make
+forward progress regardless. In case (b), because the error is
+permanent, the best we can do is leak the minimum amount of space,
+which is what setting this flag will do. Therefore, it is reasonable
+for this flag to normally be set, but we chose the more conservative
+approach of not setting it, so that there is no possibility of
+leaking space in the "partial temporary" failure case.
+.sp
+Default value: \fB0\fR.
+.RE
+
 .sp
 .ne 2
 .na
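The difference between the default "stalling" behavior and the behavior with zfs_free_leak_on_eio set can be illustrated with a small toy model. This is only a sketch and not the ZFS implementation: the names toy_indirect, toy_result and toy_destroy are hypothetical, and a flat array stands in for the real block tree traversal. It assumes each entry represents one indirect block together with the space it references, and that an unreadable entry corresponds to an EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one indirect block and the space it references. */
struct toy_indirect {
	int readable;		/* 0 simulates an EIO when reading this block */
	unsigned long space;	/* bytes referenced through this block */
};

/* Hypothetical accounting of what a destroy pass achieved. */
struct toy_result {
	unsigned long freed;	/* bytes actually freed */
	unsigned long leaked;	/* bytes permanently leaked (flag set) */
	unsigned long pending;	/* bytes left unfreed because of a stall */
};

/*
 * Toy model of a background destroy.  Returns 0 when every entry was
 * processed, or EIO when the destroy stalled on an unreadable entry.
 */
static int
toy_destroy(const struct toy_indirect *blk, size_t n, int leak_on_eio,
    struct toy_result *res)
{
	size_t i;

	assert(res != NULL);
	res->freed = res->leaked = res->pending = 0;

	for (i = 0; i < n; i++) {
		if (blk[i].readable) {
			res->freed += blk[i].space;
			continue;
		}
		if (leak_on_eio) {
			/* Ignore the error, leak this space, keep going. */
			res->leaked += blk[i].space;
			continue;
		}
		/* Default: stall here; the rest stays "temporarily leaked". */
		for (; i < n; i++)
			res->pending += blk[i].space;
		return (EIO);
	}
	return (0);
}
```

In this model, given three entries of 100, 50 and 200 bytes where only the middle one is unreadable, the default setting returns EIO after freeing 100 bytes and leaves 250 bytes pending, while the leak-on-EIO setting returns 0 after freeing 300 bytes and leaking 50.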