Commit Graph

35 Commits

Author SHA1 Message Date
George Wilson
dacb4f6a61 pool may become suspended during device expansion
When expanding a device zfs needs to rescan the partition table to
get the correct size. This can only happen when we're in the kernel
and requires the device to be closed. As part of the rescan, udev is
notified and the device links are removed and recreated. This leave a
window where the vdev code may try to reopen the device before udev
has recreated the link. If that happens, then the pool may end up in
a suspended state.

To correct this, we leverage the BLKPG_RESIZE_PARTITION ioctl which
allows the partition information to be modified even while it's in use.
This ioctl also does not remove the device link associated with the zfs
data partition so it eliminates the race condition that can occur in
the kernel.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #10897
2020-09-18 12:38:30 -07:00
Brian Behlendorf
fe20400db5 cppcheck: (error) Memory leak: vtoc
Resolve the reported memory leak by using a dedicated local vptr
variable to store the pointer reported by calloc().  Only assign the
passed **vtoc function argument on success, in all other cases vptr
is freed.

[lib/libefi/rdwr_efi.c:403]: (error) Memory leak: vtoc
[lib/libefi/rdwr_efi.c:422]: (error) Memory leak: vtoc
[lib/libefi/rdwr_efi.c:440]: (error) Memory leak: vtoc
[lib/libefi/rdwr_efi.c:454]: (error) Memory leak: vtoc
[lib/libefi/rdwr_efi.c:470]: (error) Memory leak: vtoc

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2019-12-18 17:25:04 -08:00
Romain Dolbeau
4254e40729 Preliminary support for RV64G
This adds basic support for RISC-V, specifically RV64G.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Romain Dolbeau <romain.dolbeau@european-processor-initiative.eu>
Closes #9540
2019-11-06 10:56:09 -08:00
Andrea Gelmini
7859537768 Fix typos in lib/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9237
2019-09-02 17:53:27 -07:00
Prakash Surya
475ebd763a Fix device expansion when VM is powered off
When running on an ESXi based VM, I've found that "zpool online -e" will
not expand the zpool, if the disk was expanded in ESXi while the VM was
powered off.

For example, take the following scenario:

 1. VM running on top of VMware ESXi
 2. ZFS pool created with a given device "sda" of size 8GB
 3. VM powered off
 4. Device "sda" size expanded to 16GB
 5. VM powered on
 6. "zpool online -e" used on device "sda"

In this situation, after (2) the zpool will be roughly 8GB in size.
After (6), the expectation is the zpool's size will expand to roughly
16GB in size; i.e. expand to the new size of the "sda" device.
Unfortunately, I've seen that after (6), the zpool size does not change.

What's happening is after (5), the EFI label of the "sda" device will be
such that fields "efi_last_u_lba", "efi_last_lba", and "efi_altern_lba"
all reflect the new size of the disk; i.e. "33554398", "33554431", and
"33554431" respectively.

Thus, the check that we perform in "efi_use_whole_disk":

    if ((efi_label->efi_altern_lba == 1) || (efi_label->efi_altern_lba
        >= efi_label->efi_last_lba)) {

This will return true, and then we return from the function without
having expanded the size of the zpool/device.

In contrast, if we remove steps (3) and (5) in the sequence above, i.e.
the device is expanded while the VM is powered on, things change. In
that case, the fields "efi_last_u_lba" and "efi_altern_lba" do not
change (i.e. they still reflect the old 8GB device size), but the
"efi_last_lba" field does change (i.e. it now reflects the new 16GB
device size). Thus, when we evaluate the same conditional in
"efi_use_whole_disk", it'll return false, so the zpool is expanded.

Taking all of this into account, this PR updates "efi_use_whole_disk" to
properly expand the zpool when the underlying disk is expanded while the
VM is powered off.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #9111
2019-08-13 21:18:53 -06:00
Brian Behlendorf
7e0594a3da
Remove libefi __linux__ wrappers
The ZoL version of libefi has been modified for Linux in several
places outside the existing __linux__ wrappers.  Remove them to
make the code easier to read and so as not to mislead anyone that
these are the sole modifications for Linux.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7625
2018-06-14 09:43:32 -07:00
Brian Behlendorf
232dd8b956
Fix efi_get_info() zvol detection
Partition detection for zvol devices was not working correctly
resulting inconsistent partitioning behavior when layering pools
on top of zvols.  This isn't a supported configuration but we'd
still like it to work properly.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7624
2018-06-13 10:20:58 -07:00
Sara Hartse
74d42600d8 zpool reopen should detect expanded devices
Update bdev_capacity to have wholedisk vdevs query the
size of the underlying block device (correcting for the size
of the efi parition and partition alignment) and therefore detect
expanded space.

Correct vdev_get_stats_ex so that the expandsize is aligned
to metaslab size and new space is only reported if it is large
enough for a new metaslab.

Reviewed by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Wren Kennedy <jwk404@gmail.com>
Signed-off-by: sara hartse <sara.hartse@delphix.com>
External-issue: LX-165
Closes #7546 
Issue #7582
2018-05-31 10:36:37 -07:00
Brian Behlendorf
93ce2b4ca5 Update build system and packaging
Minimal changes required to integrate the SPL sources in to the
ZFS repository build infrastructure and packaging.

Build system and packaging:
  * Renamed SPL_* autoconf m4 macros to ZFS_*.
  * Removed redundant SPL_* autoconf m4 macros.
  * Updated the RPM spec files to remove SPL package dependency.
  * The zfs package obsoletes the spl package, and the zfs-kmod
    package obsoletes the spl-kmod package.
  * The zfs-kmod-devel* packages were updated to add compatibility
    symlinks under /usr/src/spl-x.y.z until all dependent packages
    can be updated.  They will be removed in a future release.
  * Updated copy-builtin script for in-kernel builds.
  * Updated DKMS package to include the spl.ko.
  * Updated stale AUTHORS file to include all contributors.
  * Updated stale COPYRIGHT and included the SPL as an exception.
  * Renamed README.markdown to README.md
  * Renamed OPENSOLARIS.LICENSE to LICENSE.
  * Renamed DISCLAIMER to NOTICE.

Required code changes:
  * Removed redundant HAVE_SPL macro.
  * Removed _BOOT from nvpairs since it doesn't apply for Linux.
  * Initial header cleanup (removal of empty headers, refactoring).
  * Remove SPL repository clone/build from zimport.sh.
  * Use of DEFINE_RATELIMIT_STATE and DEFINE_SPINLOCK removed due
    to build issues when forcing C99 compilation.
  * Replaced legacy ACCESS_ONCE with READ_ONCE.
  * Include needed headers for `current` and `EXPORT_SYMBOL`.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
TEST_ZIMPORT_SKIP="yes"
Closes #7556
2018-05-29 16:00:33 -07:00
Tomohiro Kusumi
8111eb4abc Fix calloc(3) arguments order
calloc(3) takes `nelem` (or `nmemb` in glibc) first, and then size of
elements.  No difference expected for having these in reverse order,
however should follow the standard.

http://pubs.opengroup.org/onlinepubs/009695399/functions/calloc.html

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #7405
2018-04-12 10:50:39 -07:00
timor
c66e54e9dc Add support for nvme disk detection
This treats /dev/nvme.. devices the same way as /dev/sd... devices.  The
motivation behind this is that whole disk detection did not work on nvme
SSDs without that, because it DKC_UNKNOWN was returned for such devices.

Perhaps there should be a separate DKC_ type for this, but I don't know
enough about the code to know the implications of that.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: timor <timor.dd@googlemail.com>
Closes #7304
2018-03-21 08:35:20 -07:00
Gvozden Neskovic
a64f903b06 Fixes for issues found with cppcheck tool
The patch fixes small number of errors/false positives reported by `cppcheck`,
static analysis tool for C/C++.

cppcheck 1.72

$ cppcheck . --force --quiet
[cmd/zfs/zfs_main.c:4444]: (error) Possible null pointer dereference: who_perm
[cmd/zfs/zfs_main.c:4445]: (error) Possible null pointer dereference: who_perm
[cmd/zfs/zfs_main.c:4446]: (error) Possible null pointer dereference: who_perm
[cmd/zpool/zpool_iter.c:317]: (error) Uninitialized variable: nvroot
[cmd/zpool/zpool_vdev.c:1526]: (error) Memory leak: child
[lib/libefi/rdwr_efi.c:1118]: (error) Memory leak: efi_label
[lib/libuutil/uu_misc.c:207]: (error) va_list 'args' was opened but not closed by va_end().
[lib/libzfs/libzfs_import.c:1554]: (error) Dangerous usage of 'diskname' (strncpy doesn't always null-terminate it).
[lib/libzfs/libzfs_sendrecv.c:3279]: (error) Dereferencing 'cp' after it is deallocated / released
[tests/zfs-tests/cmd/file_write/file_write.c:154]: (error) Possible null pointer dereference: operation
[tests/zfs-tests/cmd/randfree_file/randfree_file.c:90]: (error) Memory leak: buf
[cmd/zinject/zinject.c:1068]: (error) Uninitialized variable: dataset
[module/icp/io/sha2_mod.c:698]: (error) Uninitialized variable: blocks_per_int64

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1392
2016-07-27 13:31:22 -07:00
YunQiang Su
2493dca54e Add isa_defs for MIPS
GCC for MIPS only defines _LP64 when 64bit,
while no _ILP32 defined when 32bit.

Signed-off-by: YunQiang Su <syq@debian.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4712
2016-05-31 09:05:40 -07:00
Brian Behlendorf
b8faf0cba8 Disable efi_debug in --enable-debug builds
Disable the additional EFI debugging in all builds.  Some users
run debug builds in production and the extra log messages can
cause confusion.  Beyond that the log messages are rarely useful.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #4523
2016-04-25 11:13:26 -07:00
Dimitri John Ledkov
a1bc34c0a7 Add support for s390[x].
Signed-off-by: Dimitri John Ledkov <xnox@ubuntu.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4425
2016-03-17 09:58:01 -07:00
ilovezfs
25df831b81 Fix cstyle issue from 7a02327
Continuations should be indented four spaces.

Signed-off-by: ilovezfs <ilovezfs@icloud.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4062
2015-12-04 14:37:41 -08:00
ilovezfs
917b8c5cec Ext4's typical GPT partition type not recognized
Adding additional entries to the efi conversion array will help prevent
the overwriting of the GPTs of disks with in-use file systems in more
cases. Most notably, this adds partition type 8300 "Linux filesystem"
(0FC63DAF-8483-4772-8E79-3D69D8477DE4), which is often used for ext4 and
btrfs, among others.

This commit itself does nothing to address the underlying problematic
behavior that check_slice() isn't called on partitions of an
unrecognized type, even when they contain a currently mounted file
system.

The additional entries were derived from these two resources:
https://en.wikipedia.org/wiki/GUID_Partition_Table
http://sourceforge.net/p/gptfdisk/code/ci/master/tree/parttypes.cc

Signed-off-by: ilovezfs <ilovezfs@icloud.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4016
2015-12-04 09:27:00 -08:00
Yuri Pankov
fc80384923 Illumos 934 - FreeBSD's GPT not recognized
Reviewed by: Alexander Eremin <alexander.r.eremin@gmail.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Reviewed by: Andrew Stormont <Andrew.Stormont@nexenta.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Approved by: Gordon Ross <gwr@nexenta.com>

References:
  https://www.illumos.org/issues/934
  https://github.com/illumos/illumos-gate/commit/e21ea67

Ported-by: ilovezfs <ilovezfs@icloud.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4016
2015-12-04 09:19:40 -08:00
Richard Yao
9f5ba90f9f Fix zvol detection
The zpool create subcomand should not return an error on debug builds of
the userland tools when given zvols.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3595
2015-08-19 13:32:57 -07:00
Richard Yao
541da9935d Fix Xen Virtual Block Device detection
We fail to make partitions on xvd (Xen Virtual Block) devices. This also
causes debug builds of zpool create to return an error when given xen
virtual block devices. These devices should be given the same treatment
as vd (KVM Virtual Block) devices, so we adjust the relevant code paths.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3576
2015-07-10 12:21:14 -07:00
Richard Yao
f9f431cd28 Use (void) memcpy(), not (void *) memcpy()
This was caught by Clang.  Clearly the intent of this code was
to explicitly ignore the return value.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3054
2015-01-30 09:43:04 -08:00
Andrew Hamilton
d09a99f96b 2493 change efi_rescan() to wait longer
Change efi_rescan() to loop 10 times instead of 5 on EBUSY and
to sleep at the end of each loop. This helps with some instances
where the kernel does not reload the partition table fast enough
for ZFS to detect.

Signed-off-by: Andrew Hamilton <ahamilto@tjhsst.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2493
2014-08-26 16:12:37 -07:00
Richard Yao
787c455ed7 Improve partition detection on lesser used devices
The format strings in efi_get_info() are intended to extract both the
main device and partition number. However, this is only done correctly
for hd, sd and vd devices. The format strings for ram, dm-, md and loop
devices misparse the input. This causes the partition device to be
incorrectly labelled as the main device with the partition being
labelled 0.

Reported-by: ilovezfs <ilovezfs@icloud.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2175
2014-04-08 14:45:12 -07:00
Brian Behlendorf
d7ec8d4fd9 Define the needed ISA types for Sparc
Add the minimum required ISA types to support the Sparc
architecture.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: marku89 <mar42@kola.li>
Issue #1700
2014-01-09 15:49:59 -08:00
Michael Kjorling
d1d7e2689d cstyle: Resolve C style issues
The vast majority of these changes are in Linux specific code.
They are the result of not having an automated style checker to
validate the code when it was originally written.  Others were
caused when the common code was slightly adjusted for Linux.

This patch contains no functional changes.  It only refreshes
the code to conform to style guide.

Everyone submitting patches for inclusion upstream should now
run 'make checkstyle' and resolve any warning prior to opening
a pull request.  The automated builders have been updated to
fail a build if when 'make checkstyle' detects an issue.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1821
2013-12-18 16:46:35 -08:00
Etienne Dechamps
b5a28807cd Move partition scanning from userspace to module.
Currently, zpool online -e (dynamic vdev expansion) doesn't work on
whole disks because we're invoking ioctl(BLKRRPART) from userspace
while ZFS still has a partition open on the disk, which results in
EBUSY.

This patch moves the BLKRRPART invocation from the zpool utility to the
module. Specifically, this is done just before opening the device in
vdev_disk_open() which is called inside vdev_reopen(). This requires
jumping through some hoops to get to the disk device from the partition
device, and to make sure we can still open the partition after the
BLKRRPART call.

Note that this new code path is triggered on dynamic vdev expansion
only; other actions, like creating a new pool, are unchanged and still
call BLKRRPART from userspace.

This change also depends on API changes which are available in 2.6.37
and latter kernels.  The build system has been updated to detect this,
but there is no compatibility mode for older kernels.  This means that
online expansion will NOT be available in older kernels.  However, it
will still be possible to expand the vdev offline.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #808
2012-07-17 09:17:31 -07:00
Brian Behlendorf
7535251dcf Add PowerPC to supported VTOCs
This code was was inherited from Solaris which was careful to define
the expected VTOC for various supported architectures.  While this
check may have made sense there it's something we should be able to
safely drop under Linux.

However, I'm not quite ready to do that yet.  So for the moment I'm
just doing the very safe thing of adding PowerPC as a supported type.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-07-12 11:52:34 -07:00
Etienne Dechamps
cee43a7477 Fix efi_use_whole_disk() when efi_nparts == 128.
Commit e5dc681a changed EFI_NUMPAR from 9 to 128. This means that the
on-disk EFI label has efi_nparts = 128 instead of 9. The index of the
reserved partition, however, is still 8. This breaks
efi_use_whole_disk(), which uses efi_nparts-1 as the index of the
reserved partition.

This commit fixes efi_use_whole_disk() when the index of the reserved
partition is not efi_nparts-1. It rewrites the algorithm and makes it
more robust by using the order of the partitions instead of their
numbering. It assumes that the last non-empty partition is the reserved
partition, and that the non-empty partition before that is the data
partition.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #808
2012-07-12 08:59:22 -07:00
Jorgen Lundman
c421831192 Define the needed ISA types for ARM
Add the minimum required ISA types to support the ARM architecture.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-05-03 11:18:54 -07:00
Richard Laager
2932b6a800 Treat /dev/vd* as whole disks
Correctly detect /dev/vd devices as whole disks and attempt to
create an EFI partition table.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-01-11 16:44:54 -08:00
Zachary Bedell
7a0232735d Make libefi-created GPT compatible with gptfdisk
GPT's created by libefi set the HeaderSize attribute in the GPT
header to 512 -- size of the GPT header INCLUDING the 420 padding
bytes at the end.  Most other tools set the size to 92 -- size of
the actual header itself excluding the padding.  Most tools check
the recorded HeaderSize when verifying CRC, but gptfdisk hardcodes
92 and thus reports CRC verification problems for full-disk vdevs
created IE with `zpool create pool sdc`.

This commit changes libefi's behavior for GPT creation and also
fixes several edge cases where libefi's behavior was similar
(though in an incompatible manner) to gptfdisk.  Libefi assumed
HeaderSize was always 512 even if the GPT recorded a different
value.  Sanity checks of the GPT headersize read from disk were
added before applying checksum calculation -- this will prevent
segfault in cases of bogus on-disk values.

Zpools created with the resuling libefi are verified as correct
both by parted and gptfdisk.  Also pool have been tested to
import correctly on ZFS on Linux as well as Solaris Express 11
livecd.

Signed-off-by: Zachary Bedell <zac@thebedells.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #344
2011-09-26 09:44:43 -07:00
Brian Behlendorf
b3b4f547f9 Remove useless libefi warnings
These two warnings in libefi serve no real purpose.  When running
without DEBUG they are already supressed, and even when DEBUG is
enabled all they indicate is the device doesn't already have an
EFI label.  For a Linux machine this is probably the common case.
2011-02-10 09:27:22 -08:00
Brian Behlendorf
d603ed6c27 Add linux user disk support
This topic branch contains all the changes needed to integrate the user
side zfs tools with Linux style devices.  Primarily this includes fixing
up the Solaris libefi library to be Linux friendly, and integrating with
the libblkid library which is provided by e2fsprogs.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2010-08-31 13:42:00 -07:00
Brian Behlendorf
572e285762 Update to onnv_147
This is the last official OpenSolaris tag before the public
development tree was closed.
2010-08-26 14:24:34 -07:00
Brian Behlendorf
5c36312909 Script update-zfs.sh updated to include libefi library 2009-10-09 15:37:29 -07:00