pool may become suspended during device expansion

When expanding a device zfs needs to rescan the partition table to
get the correct size. This can only happen when we're in the kernel
and requires the device to be closed. As part of the rescan, udev is
notified and the device links are removed and recreated. This leave a
window where the vdev code may try to reopen the device before udev
has recreated the link. If that happens, then the pool may end up in
a suspended state.

To correct this, we leverage the BLKPG_RESIZE_PARTITION ioctl which
allows the partition information to be modified even while it's in use.
This ioctl also does not remove the device link associated with the zfs
data partition so it eliminates the race condition that can occur in
the kernel.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #10897
This commit is contained in:
George Wilson
2020-09-17 22:03:10 -05:00
committed by Brian Behlendorf
parent 66ccc9b75f
commit dacb4f6a61
3 changed files with 150 additions and 36 deletions
-3
View File
@@ -72,9 +72,6 @@ zpool_relabel_disk(libzfs_handle_t *hdl, const char *path, const char *msg)
* It's possible that we might encounter an error if the device
* does not have any unallocated space left. If so, we simply
* ignore that error and continue on.
*
* Also, we don't call efi_rescan() - that would just return EBUSY.
* The module will do it for us in vdev_disk_open().
*/
error = efi_use_whole_disk(fd);