18 Commits

Author SHA1 Message Date
Thomas Lamprecht f5c9e3a9a8 bump version to 2.2.4-pve1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-06-04 11:12:08 +02:00
Stoiko Ivanov 4847272363 update arc_summary/arcstat patch with newly introduced values
ZFS 2.2.4 added new kstats for speculative prefetch in:
026fe796465e3da7b27d06ef5338634ee6dd30d8

Adapt our patch introduced with ZFS 2.1 (for the then-added MFU/MRU
stats) to also deal with the newly introduced values not being present
(because an old kernel module does not offer them).

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Reviewed-by: Max Carrara <m.carrara@proxmox.com>
Tested-by: Max Carrara <m.carrara@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-05-21 16:03:10 +02:00
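
The guarded access this commit describes can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the `old_module_stats` dict and the `zfetch_total` helper are hypothetical stand-ins for the kstat dict the tools parse; only the key names (`hits`, `misses`, `future`, `stride`, `past`) are taken from the diff further down.

```python
# Hypothetical kstat snapshot from an older kernel module that does not
# yet export the zfetch 'future', 'stride' and 'past' counters.
old_module_stats = {"hits": "120", "misses": "30"}


def zfetch_total(zfetch_stats):
    """Sum prefetcher accesses, treating counters the module lacks as 0."""
    return (int(zfetch_stats["hits"])
            + int(zfetch_stats.get("future", 0))   # may be missing
            + int(zfetch_stats.get("stride", 0))   # may be missing
            + int(zfetch_stats.get("past", 0))     # may be missing
            + int(zfetch_stats["misses"]))


# A plain zfetch_stats['future'] lookup would raise KeyError here.
print(zfetch_total(old_module_stats))
```

The trade-off is the one the patch description states: a `0` fallback may show inaccurate numbers for the new counters, but the tool keeps working until the matching kernel module is loaded.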
Stoiko Ivanov 76119aa32b update zfs submodule to 2.2.4 and refresh patches
mostly - drop all patches we had queued up to get kernel 6.8
supported.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Tested-by: Max Carrara <m.carrara@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-05-21 16:03:00 +02:00
Thomas Lamprecht 3968b96ed4 bump version to 2.2.3-pve2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-04-08 17:44:01 +02:00
Fabian Grünbichler 32ce077088 fix #4835: order zfs-import@ before -cache/-scan
this should fix failures of the template instances because either of
the two other import services picked up the pool in question first.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Tested-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2024-04-08 17:38:01 +02:00
Thomas Lamprecht 68be554e71 backport 2.2.4 staging for better 6.8 support
Use the current ZFS 2.2.4 staging tree [0] with commit deb7a8423 ("Fix
corruption caused by mmap flushing problems") on top.

Additionally, include an open, but ack'd, pull request [1] that avoids
a potential general protection fault due to touching a vbio after it
was handed off to the kernel.

[0]: https://github.com/openzfs/zfs/commits/zfs-2.2.4-staging/
[1]: https://github.com/openzfs/zfs/pull/16049

Both should mostly touch the module code.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-04-03 09:56:31 +02:00
Thomas Lamprecht 6c9ff9b992 bump version to 2.2.3-pve1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-03-11 13:46:05 +01:00
Stoiko Ivanov b48cfd2b15 fix #5288: cherry-pick fix for udev-partition links > 16
If a zvol has more than 15 partitions, the minor device number
exhausts the slot count reserved for partitions next to the zvol
itself. As a result, the minor number cannot be used to determine the
partition number for the higher partitions, and doing so results in
wrongly named symlinks being generated by udev.

Since the partition number is encoded in the block device name anyway,
let's just extract it from there instead.

For upstream issue and PR discussion see:
https://github.com/openzfs/zfs/pull/15970
https://github.com/openzfs/zfs/issues/15904

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-03-11 13:44:37 +01:00
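
The idea of reading the partition number from the device name can be sketched as below. This is an illustration under stated assumptions, not the upstream fix: it assumes zvol block devices are named `zd<N>` and their partitions `zd<N>p<M>`, and the helper `partition_from_name` is a hypothetical name introduced only for this example.

```python
import re

# Assumed naming scheme (illustration only): "zd<N>" for the zvol itself,
# "zd<N>p<M>" for partition M of that zvol.
_ZVOL_NAME = re.compile(r"^zd(\d+)(?:p(\d+))?$")


def partition_from_name(dev_name):
    """Return the partition number encoded in dev_name, or None for the whole zvol."""
    match = _ZVOL_NAME.match(dev_name)
    if match is None:
        raise ValueError(f"not a zvol device name: {dev_name!r}")
    part = match.group(2)
    return int(part) if part is not None else None


for name in ("zd0", "zd0p1", "zd16p17"):
    print(name, partition_from_name(name))
```

Because the name carries the partition number directly, the result does not depend on how many minor numbers are reserved per zvol.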
Stoiko Ivanov a5e0251015 update zfs submodule to 2.2.3 and refresh patches
mostly support for newer kernel-versions, and fixes for the BRT bugs
discovered with 2.2.0 (BRT remains disabled by default).

The update contains a fix for CVE-2020-24370 in lua (which is present
in ZFS for channel-programs, which we do not use) - see:
https://github.com/openzfs/zfs/pull/15847 for more details.

One patch from Stefan Lendl was backported and is now in the ZFS 2.2
branch.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2024-03-11 13:41:25 +01:00
Thomas Lamprecht 838cd1d173 bump version to 2.2.2-pve2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-02-19 17:00:17 +01:00
Stefan Lendl 5f4f0445f4 Fix #5101: exports with sharenfs remain after zfs mount -a
When running `zfs mount -a`, prevent the exported datasets (with sharenfs)
from being truncated (unexported).
Adds tests to verify that shares persist after `zfs mount -a`.

Signed-off-by: Stefan Lendl <s.lendl@proxmox.com>
2024-02-02 19:17:28 +01:00
Thomas Lamprecht 81d11761c3 bump version to 2.2.2-pve1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-12-04 16:50:30 +01:00
Stoiko Ivanov 3bda92bd20 d/zfsutils-linux.install: add zfs_prepare_disk and manpage
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2023-12-04 16:48:29 +01:00
Stoiko Ivanov f67eb9538f update zfs submodule to 2.2.2 and refresh patches
the removed patches were cherry-picks, which are included in 2.2.2

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2023-12-04 16:48:29 +01:00
Fabian Grünbichler 00036e5a6e bump version to 2.2.0-pve4
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2023-11-29 09:22:05 +01:00
Fabian Grünbichler 3db00caad9 cherry-pick fix for data corruption
cherry-picked from 2.2.0-staging, fixing
https://github.com/openzfs/zfs/issues/15526

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2023-11-29 09:18:39 +01:00
Thomas Lamprecht e295f30e6a bump version to 2.2.0-pve3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-11-17 17:33:02 +01:00
Stoiko Ivanov 96c807af63 pick bug-fixes staged for 2.2.1
ZFS 2.2.1 is currently being prepared, but the 3 patches added here
seem quite relevant, as the issues they fix might cause data loss/panics
on setups which run `zpool upgrade`.
See upstream's discussion for 2.2.1:
https://github.com/openzfs/zfs/pull/15498/
and the most critical issue:
https://github.com/openzfs/zfs/pull/15529
finally:
https://github.com/openzfs/zfs/commit/459c99ff2339a4a514abcf2255f9b3e5324ef09e
should not hurt either

the change to the UBSAN patch (0013) is unrelated, cosmetic only and
happened by running export-patchqueue.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2023-11-17 17:30:26 +01:00
14 changed files with 511 additions and 425 deletions
@@ -1,3 +1,57 @@
zfs-linux (2.2.4-pve1) bookworm; urgency=medium
* update to new ZFS upstream 2.2.4 release
-- Proxmox Support Team <support@proxmox.com> Tue, 04 Jun 2024 11:11:48 +0200
zfs-linux (2.2.3-pve2) bookworm; urgency=medium
* fix #4835: order zfs-import@ before -cache/-scan
* backport (module) patches from the 2.2.4 staging tree for better Linux 6.8
support
-- Proxmox Support Team <support@proxmox.com> Mon, 08 Apr 2024 17:43:35 +0200
zfs-linux (2.2.3-pve1) bookworm; urgency=medium
* update to new ZFS upstream 2.2.3 release
* fix #5288: correctly handle zvols with more than 15 partitions in udev
-- Proxmox Support Team <support@proxmox.com> Mon, 11 Mar 2024 13:42:50 +0100
zfs-linux (2.2.2-pve2) bookworm; urgency=medium
* fix #5101: ensure datasets that have sharenfs enabled are not unexported
after a `zfs mount -a` call.
-- Proxmox Support Team <support@proxmox.com> Mon, 19 Feb 2024 16:56:37 +0100
zfs-linux (2.2.2-pve1) bookworm; urgency=medium
* update to new ZFS upstream 2.2.2 release, as we have all important fixes
for recent discovered data integrity issues backported to previous
versions, there should be no visible change in that regard.
-- Proxmox Support Team <support@proxmox.com> Mon, 04 Dec 2023 16:50:25 +0100
zfs-linux (2.2.0-pve4) bookworm; urgency=medium
* pick bug-fix staged for 2.2.2:
- fix (rare) corruption caused by dirty dnode being treated as clean
-- Proxmox Support Team <support@proxmox.com> Wed, 29 Nov 2023 09:21:26 +0100
zfs-linux (2.2.0-pve3) bookworm; urgency=medium
* pick bug-fixes staged for 2.2.1:
- add a tunable to disable BRT support and disable it by default
- fix block cloning between unencrypted and encrypted datasets
- disable block cloning by default
-- Proxmox Support Team <support@proxmox.com> Fri, 17 Nov 2023 17:32:58 +0100
zfs-linux (2.2.0-pve2) bookworm; urgency=medium
* avoid error from zfs-mount when /etc/exports.d does not exist (yet)
@@ -13,7 +13,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/cmd/zed/zed.d/zed.rc b/cmd/zed/zed.d/zed.rc
index 78dc1afc7..41d5539ea 100644
index bc269b155..e6d4b1703 100644
--- a/cmd/zed/zed.d/zed.rc
+++ b/cmd/zed/zed.d/zed.rc
@@ -41,7 +41,7 @@ ZED_EMAIL_ADDR="root"
@@ -10,13 +10,16 @@ by scanning /dev/disk/by-id, irrespective of the existence and content of
the instance name is used unescaped (see systemd.unit(5)), since zpool names
can contain characters which will be escaped by systemd.
Its instances are ordered before the other two "big" import services to avoid
races and spurious (cosmetic!) service failures.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
etc/Makefile.am | 1 +
etc/systemd/system/50-zfs.preset | 1 +
etc/systemd/system/zfs-import@.service.in | 16 ++++++++++++++++
3 files changed, 18 insertions(+)
etc/systemd/system/zfs-import@.service.in | 18 ++++++++++++++++++
3 files changed, 20 insertions(+)
create mode 100644 etc/systemd/system/zfs-import@.service.in
diff --git a/etc/Makefile.am b/etc/Makefile.am
@@ -45,10 +48,10 @@ index e4056a92c..030611419 100644
enable zfs-share.service
diff --git a/etc/systemd/system/zfs-import@.service.in b/etc/systemd/system/zfs-import@.service.in
new file mode 100644
index 000000000..9b4ee9371
index 000000000..5bd19fb79
--- /dev/null
+++ b/etc/systemd/system/zfs-import@.service.in
@@ -0,0 +1,16 @@
@@ -0,0 +1,18 @@
+[Unit]
+Description=Import ZFS pool %i
+Documentation=man:zpool(8)
@@ -57,6 +60,8 @@ index 000000000..9b4ee9371
+After=cryptsetup.target
+After=multipathd.target
+Before=zfs-import.target
+Before=zfs-import-scan.service
+Before=zfs-import-cache.service
+
+[Service]
+Type=oneshot
@@ -15,7 +15,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
rename man/{man1/arcstat.1 => man8/arcstat.8} (99%)
diff --git a/man/Makefile.am b/man/Makefile.am
index 36c1aede1..94fd96e58 100644
index 43bb014dd..a9293468a 100644
--- a/man/Makefile.am
+++ b/man/Makefile.am
@@ -2,7 +2,6 @@ dist_noinst_man_MANS = \
@@ -0,0 +1,438 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
Date: Wed, 10 Nov 2021 09:29:47 +0100
Subject: [PATCH] arc stat/summary: guard access to freshly introduced stats
l2arc MFU/MRU stats and zfetch past, future and stride stats were
introduced in 2.1 and 2.2.4 respectively:
commit 085321621e79a75bea41c2b6511da6ebfbf2ba0a added printing MFU
and MRU stats for 2.1 user space tools, but those keys are not
available in the 2.0 module. That means it may break the arcstat and
arc_summary tools after upgrade to 2.1 (user space), before a reboot
to the new 2.1 ZFS kernel-module happened, due to python raising a
KeyError on the dict access then.
Move those two keys to a .get accessor with `0` as fallback, as it
should be better to show some possible wrong data for new stat-keys
than throwing an exception.
also move l2_mfu_asize l2_mru_asize l2_prefetch_asize
l2_bufc_data_asize l2_bufc_metadata_asize to .get accessor
(these are only present with a cache device in the pool)
guard access to iohits and uncached state introduced in
792a6ee462efc15a7614f27e13f0f8aaa9414a08
guard access to zfetch past future stride stats introduced in
026fe796465e3da7b27d06ef5338634ee6dd30d8
These are present in the current kernel, but lead to an exception, if
running the new user-space with an old kernel module.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
cmd/arc_summary | 132 ++++++++++++++++++++++++------------------------
cmd/arcstat.in | 48 +++++++++---------
2 files changed, 90 insertions(+), 90 deletions(-)
diff --git a/cmd/arc_summary b/cmd/arc_summary
index 100fb1987..30f5d23e9 100755
--- a/cmd/arc_summary
+++ b/cmd/arc_summary
@@ -551,21 +551,21 @@ def section_arc(kstats_dict):
arc_target_size = arc_stats['c']
arc_max = arc_stats['c_max']
arc_min = arc_stats['c_min']
- meta = arc_stats['meta']
- pd = arc_stats['pd']
- pm = arc_stats['pm']
- anon_data = arc_stats['anon_data']
- anon_metadata = arc_stats['anon_metadata']
- mfu_data = arc_stats['mfu_data']
- mfu_metadata = arc_stats['mfu_metadata']
- mru_data = arc_stats['mru_data']
- mru_metadata = arc_stats['mru_metadata']
- mfug_data = arc_stats['mfu_ghost_data']
- mfug_metadata = arc_stats['mfu_ghost_metadata']
- mrug_data = arc_stats['mru_ghost_data']
- mrug_metadata = arc_stats['mru_ghost_metadata']
- unc_data = arc_stats['uncached_data']
- unc_metadata = arc_stats['uncached_metadata']
+ meta = arc_stats.get('meta', 0)
+ pd = arc_stats.get('pd', 0)
+ pm = arc_stats.get('pm', 0)
+ anon_data = arc_stats.get('anon_data', 0)
+ anon_metadata = arc_stats.get('anon_metadata', 0)
+ mfu_data = arc_stats.get('mfu_data', 0)
+ mfu_metadata = arc_stats.get('mfu_metadata', 0)
+ mru_data = arc_stats.get('mru_data', 0)
+ mru_metadata = arc_stats.get('mru_metadata', 0)
+ mfug_data = arc_stats.get('mfu_ghost_data', 0)
+ mfug_metadata = arc_stats.get('mfu_ghost_metadata', 0)
+ mrug_data = arc_stats.get('mru_ghost_data', 0)
+ mrug_metadata = arc_stats.get('mru_ghost_metadata', 0)
+ unc_data = arc_stats.get('uncached_data', 0)
+ unc_metadata = arc_stats.get('uncached_metadata', 0)
bonus_size = arc_stats['bonus_size']
dnode_limit = arc_stats['arc_dnode_limit']
dnode_size = arc_stats['dnode_size']
@@ -655,13 +655,13 @@ def section_arc(kstats_dict):
prt_i1('L2 cached evictions:', f_bytes(arc_stats['evict_l2_cached']))
prt_i1('L2 eligible evictions:', f_bytes(arc_stats['evict_l2_eligible']))
prt_i2('L2 eligible MFU evictions:',
- f_perc(arc_stats['evict_l2_eligible_mfu'],
+ f_perc(arc_stats.get('evict_l2_eligible_mfu', 0), # 2.0 module compat
arc_stats['evict_l2_eligible']),
- f_bytes(arc_stats['evict_l2_eligible_mfu']))
+ f_bytes(arc_stats.get('evict_l2_eligible_mfu', 0)))
prt_i2('L2 eligible MRU evictions:',
- f_perc(arc_stats['evict_l2_eligible_mru'],
+ f_perc(arc_stats.get('evict_l2_eligible_mru', 0), # 2.0 module compat
arc_stats['evict_l2_eligible']),
- f_bytes(arc_stats['evict_l2_eligible_mru']))
+ f_bytes(arc_stats.get('evict_l2_eligible_mru', 0)))
prt_i1('L2 ineligible evictions:',
f_bytes(arc_stats['evict_l2_ineligible']))
print()
@@ -672,106 +672,106 @@ def section_archits(kstats_dict):
"""
arc_stats = isolate_section('arcstats', kstats_dict)
- all_accesses = int(arc_stats['hits'])+int(arc_stats['iohits'])+\
+ all_accesses = int(arc_stats['hits'])+int(arc_stats.get('iohits', 0))+\
int(arc_stats['misses'])
prt_1('ARC total accesses:', f_hits(all_accesses))
ta_todo = (('Total hits:', arc_stats['hits']),
- ('Total I/O hits:', arc_stats['iohits']),
+ ('Total I/O hits:', arc_stats.get('iohits', 0)),
('Total misses:', arc_stats['misses']))
for title, value in ta_todo:
prt_i2(title, f_perc(value, all_accesses), f_hits(value))
print()
dd_total = int(arc_stats['demand_data_hits']) +\
- int(arc_stats['demand_data_iohits']) +\
+ int(arc_stats.get('demand_data_iohits', 0)) +\
int(arc_stats['demand_data_misses'])
prt_2('ARC demand data accesses:', f_perc(dd_total, all_accesses),
f_hits(dd_total))
dd_todo = (('Demand data hits:', arc_stats['demand_data_hits']),
- ('Demand data I/O hits:', arc_stats['demand_data_iohits']),
+ ('Demand data I/O hits:', arc_stats.get('demand_data_iohits', 0)),
('Demand data misses:', arc_stats['demand_data_misses']))
for title, value in dd_todo:
prt_i2(title, f_perc(value, dd_total), f_hits(value))
print()
dm_total = int(arc_stats['demand_metadata_hits']) +\
- int(arc_stats['demand_metadata_iohits']) +\
+ int(arc_stats.get('demand_metadata_iohits', 0)) +\
int(arc_stats['demand_metadata_misses'])
prt_2('ARC demand metadata accesses:', f_perc(dm_total, all_accesses),
f_hits(dm_total))
dm_todo = (('Demand metadata hits:', arc_stats['demand_metadata_hits']),
('Demand metadata I/O hits:',
- arc_stats['demand_metadata_iohits']),
+ arc_stats.get('demand_metadata_iohits', 0)),
('Demand metadata misses:', arc_stats['demand_metadata_misses']))
for title, value in dm_todo:
prt_i2(title, f_perc(value, dm_total), f_hits(value))
print()
pd_total = int(arc_stats['prefetch_data_hits']) +\
- int(arc_stats['prefetch_data_iohits']) +\
+ int(arc_stats.get('prefetch_data_iohits', 0)) +\
int(arc_stats['prefetch_data_misses'])
prt_2('ARC prefetch data accesses:', f_perc(pd_total, all_accesses),
f_hits(pd_total))
pd_todo = (('Prefetch data hits:', arc_stats['prefetch_data_hits']),
- ('Prefetch data I/O hits:', arc_stats['prefetch_data_iohits']),
+ ('Prefetch data I/O hits:', arc_stats.get('prefetch_data_iohits', 0)),
('Prefetch data misses:', arc_stats['prefetch_data_misses']))
for title, value in pd_todo:
prt_i2(title, f_perc(value, pd_total), f_hits(value))
print()
pm_total = int(arc_stats['prefetch_metadata_hits']) +\
- int(arc_stats['prefetch_metadata_iohits']) +\
+ int(arc_stats.get('prefetch_metadata_iohits', 0)) +\
int(arc_stats['prefetch_metadata_misses'])
prt_2('ARC prefetch metadata accesses:', f_perc(pm_total, all_accesses),
f_hits(pm_total))
pm_todo = (('Prefetch metadata hits:',
arc_stats['prefetch_metadata_hits']),
('Prefetch metadata I/O hits:',
- arc_stats['prefetch_metadata_iohits']),
+ arc_stats.get('prefetch_metadata_iohits', 0)),
('Prefetch metadata misses:',
arc_stats['prefetch_metadata_misses']))
for title, value in pm_todo:
prt_i2(title, f_perc(value, pm_total), f_hits(value))
print()
- all_prefetches = int(arc_stats['predictive_prefetch'])+\
- int(arc_stats['prescient_prefetch'])
+ all_prefetches = int(arc_stats.get('predictive_prefetch', 0))+\
+ int(arc_stats.get('prescient_prefetch', 0))
prt_2('ARC predictive prefetches:',
- f_perc(arc_stats['predictive_prefetch'], all_prefetches),
- f_hits(arc_stats['predictive_prefetch']))
+ f_perc(arc_stats.get('predictive_prefetch', 0), all_prefetches),
+ f_hits(arc_stats.get('predictive_prefetch', 0)))
prt_i2('Demand hits after predictive:',
f_perc(arc_stats['demand_hit_predictive_prefetch'],
- arc_stats['predictive_prefetch']),
+ arc_stats.get('predictive_prefetch', 0)),
f_hits(arc_stats['demand_hit_predictive_prefetch']))
prt_i2('Demand I/O hits after predictive:',
- f_perc(arc_stats['demand_iohit_predictive_prefetch'],
- arc_stats['predictive_prefetch']),
- f_hits(arc_stats['demand_iohit_predictive_prefetch']))
- never = int(arc_stats['predictive_prefetch']) -\
+ f_perc(arc_stats.get('demand_iohit_predictive_prefetch', 0),
+ arc_stats.get('predictive_prefetch', 0)),
+ f_hits(arc_stats.get('demand_iohit_predictive_prefetch', 0)))
+ never = int(arc_stats.get('predictive_prefetch', 0)) -\
int(arc_stats['demand_hit_predictive_prefetch']) -\
- int(arc_stats['demand_iohit_predictive_prefetch'])
+ int(arc_stats.get('demand_iohit_predictive_prefetch', 0))
prt_i2('Never demanded after predictive:',
- f_perc(never, arc_stats['predictive_prefetch']),
+ f_perc(never, arc_stats.get('predictive_prefetch', 0)),
f_hits(never))
print()
prt_2('ARC prescient prefetches:',
- f_perc(arc_stats['prescient_prefetch'], all_prefetches),
- f_hits(arc_stats['prescient_prefetch']))
+ f_perc(arc_stats.get('prescient_prefetch', 0), all_prefetches),
+ f_hits(arc_stats.get('prescient_prefetch', 0)))
prt_i2('Demand hits after prescient:',
f_perc(arc_stats['demand_hit_prescient_prefetch'],
- arc_stats['prescient_prefetch']),
+ arc_stats.get('prescient_prefetch', 0)),
f_hits(arc_stats['demand_hit_prescient_prefetch']))
prt_i2('Demand I/O hits after prescient:',
- f_perc(arc_stats['demand_iohit_prescient_prefetch'],
- arc_stats['prescient_prefetch']),
- f_hits(arc_stats['demand_iohit_prescient_prefetch']))
- never = int(arc_stats['prescient_prefetch'])-\
+ f_perc(arc_stats.get('demand_iohit_prescient_prefetch', 0),
+ arc_stats.get('prescient_prefetch', 0)),
+ f_hits(arc_stats.get('demand_iohit_prescient_prefetch', 0)))
+ never = int(arc_stats.get('prescient_prefetch', 0))-\
int(arc_stats['demand_hit_prescient_prefetch'])-\
- int(arc_stats['demand_iohit_prescient_prefetch'])
+ int(arc_stats.get('demand_iohit_prescient_prefetch', 0))
prt_i2('Never demanded after prescient:',
- f_perc(never, arc_stats['prescient_prefetch']),
+ f_perc(never, arc_stats.get('prescient_prefetch', 0)),
f_hits(never))
print()
@@ -782,7 +782,7 @@ def section_archits(kstats_dict):
arc_stats['mfu_ghost_hits']),
('Most recently used (MRU) ghost:',
arc_stats['mru_ghost_hits']),
- ('Uncached:', arc_stats['uncached_hits']))
+ ('Uncached:', arc_stats.get('uncached_hits', 0)))
for title, value in cl_todo:
prt_i2(title, f_perc(value, all_accesses), f_hits(value))
print()
@@ -794,26 +794,26 @@ def section_dmu(kstats_dict):
zfetch_stats = isolate_section('zfetchstats', kstats_dict)
zfetch_access_total = int(zfetch_stats['hits']) +\
- int(zfetch_stats['future']) + int(zfetch_stats['stride']) +\
- int(zfetch_stats['past']) + int(zfetch_stats['misses'])
+ int(zfetch_stats.get('future', 0)) + int(zfetch_stats.get('stride', 0)) +\
+ int(zfetch_stats.get('past', 0)) + int(zfetch_stats['misses'])
prt_1('DMU predictive prefetcher calls:', f_hits(zfetch_access_total))
prt_i2('Stream hits:',
f_perc(zfetch_stats['hits'], zfetch_access_total),
f_hits(zfetch_stats['hits']))
- future = int(zfetch_stats['future']) + int(zfetch_stats['stride'])
+ future = int(zfetch_stats.get('future', 0)) + int(zfetch_stats.get('stride', 0))
prt_i2('Hits ahead of stream:', f_perc(future, zfetch_access_total),
f_hits(future))
prt_i2('Hits behind stream:',
- f_perc(zfetch_stats['past'], zfetch_access_total),
- f_hits(zfetch_stats['past']))
+ f_perc(zfetch_stats.get('past', 0), zfetch_access_total),
+ f_hits(zfetch_stats.get('past', 0)))
prt_i2('Stream misses:',
f_perc(zfetch_stats['misses'], zfetch_access_total),
f_hits(zfetch_stats['misses']))
prt_i2('Streams limit reached:',
f_perc(zfetch_stats['max_streams'], zfetch_stats['misses']),
f_hits(zfetch_stats['max_streams']))
- prt_i1('Stream strides:', f_hits(zfetch_stats['stride']))
+ prt_i1('Stream strides:', f_hits(zfetch_stats.get('stride', 0)))
prt_i1('Prefetches issued', f_hits(zfetch_stats['io_issued']))
print()
@@ -860,20 +860,20 @@ def section_l2arc(kstats_dict):
f_perc(arc_stats['l2_hdr_size'], arc_stats['l2_size']),
f_bytes(arc_stats['l2_hdr_size']))
prt_i2('MFU allocated size:',
- f_perc(arc_stats['l2_mfu_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_mfu_asize']))
+ f_perc(arc_stats.get('l2_mfu_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_mfu_asize', 0))) # 2.0 module compat
prt_i2('MRU allocated size:',
- f_perc(arc_stats['l2_mru_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_mru_asize']))
+ f_perc(arc_stats.get('l2_mru_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_mru_asize', 0))) # 2.0 module compat
prt_i2('Prefetch allocated size:',
- f_perc(arc_stats['l2_prefetch_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_prefetch_asize']))
+ f_perc(arc_stats.get('l2_prefetch_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_prefetch_asize',0))) # 2.0 module compat
prt_i2('Data (buffer content) allocated size:',
- f_perc(arc_stats['l2_bufc_data_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_bufc_data_asize']))
+ f_perc(arc_stats.get('l2_bufc_data_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_bufc_data_asize', 0))) # 2.0 module compat
prt_i2('Metadata (buffer content) allocated size:',
- f_perc(arc_stats['l2_bufc_metadata_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_bufc_metadata_asize']))
+ f_perc(arc_stats.get('l2_bufc_metadata_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_bufc_metadata_asize', 0))) # 2.0 module compat
print()
prt_1('L2ARC breakdown:', f_hits(l2_access_total))
diff --git a/cmd/arcstat.in b/cmd/arcstat.in
index c4f10a1d6..bf47ec90e 100755
--- a/cmd/arcstat.in
+++ b/cmd/arcstat.in
@@ -510,7 +510,7 @@ def calculate():
v = dict()
v["time"] = time.strftime("%H:%M:%S", time.localtime())
v["hits"] = d["hits"] // sint
- v["iohs"] = d["iohits"] // sint
+ v["iohs"] = d.get("iohits", 0) // sint
v["miss"] = d["misses"] // sint
v["read"] = v["hits"] + v["iohs"] + v["miss"]
v["hit%"] = 100 * v["hits"] // v["read"] if v["read"] > 0 else 0
@@ -518,7 +518,7 @@ def calculate():
v["miss%"] = 100 - v["hit%"] - v["ioh%"] if v["read"] > 0 else 0
v["dhit"] = (d["demand_data_hits"] + d["demand_metadata_hits"]) // sint
- v["dioh"] = (d["demand_data_iohits"] + d["demand_metadata_iohits"]) // sint
+ v["dioh"] = (d.get("demand_data_iohits", 0) + d.get("demand_metadata_iohits", 0)) // sint
v["dmis"] = (d["demand_data_misses"] + d["demand_metadata_misses"]) // sint
v["dread"] = v["dhit"] + v["dioh"] + v["dmis"]
@@ -527,7 +527,7 @@ def calculate():
v["dm%"] = 100 - v["dh%"] - v["di%"] if v["dread"] > 0 else 0
v["ddhit"] = d["demand_data_hits"] // sint
- v["ddioh"] = d["demand_data_iohits"] // sint
+ v["ddioh"] = d.get("demand_data_iohits", 0) // sint
v["ddmis"] = d["demand_data_misses"] // sint
v["ddread"] = v["ddhit"] + v["ddioh"] + v["ddmis"]
@@ -536,7 +536,7 @@ def calculate():
v["ddm%"] = 100 - v["ddh%"] - v["ddi%"] if v["ddread"] > 0 else 0
v["dmhit"] = d["demand_metadata_hits"] // sint
- v["dmioh"] = d["demand_metadata_iohits"] // sint
+ v["dmioh"] = d.get("demand_metadata_iohits", 0) // sint
v["dmmis"] = d["demand_metadata_misses"] // sint
v["dmread"] = v["dmhit"] + v["dmioh"] + v["dmmis"]
@@ -545,8 +545,8 @@ def calculate():
v["dmm%"] = 100 - v["dmh%"] - v["dmi%"] if v["dmread"] > 0 else 0
v["phit"] = (d["prefetch_data_hits"] + d["prefetch_metadata_hits"]) // sint
- v["pioh"] = (d["prefetch_data_iohits"] +
- d["prefetch_metadata_iohits"]) // sint
+ v["pioh"] = (d.get("prefetch_data_iohits", 0) +
+ d.get("prefetch_metadata_iohits", 0)) // sint
v["pmis"] = (d["prefetch_data_misses"] +
d["prefetch_metadata_misses"]) // sint
@@ -556,7 +556,7 @@ def calculate():
v["pm%"] = 100 - v["ph%"] - v["pi%"] if v["pread"] > 0 else 0
v["pdhit"] = d["prefetch_data_hits"] // sint
- v["pdioh"] = d["prefetch_data_iohits"] // sint
+ v["pdioh"] = d.get("prefetch_data_iohits", 0) // sint
v["pdmis"] = d["prefetch_data_misses"] // sint
v["pdread"] = v["pdhit"] + v["pdioh"] + v["pdmis"]
@@ -565,7 +565,7 @@ def calculate():
v["pdm%"] = 100 - v["pdh%"] - v["pdi%"] if v["pdread"] > 0 else 0
v["pmhit"] = d["prefetch_metadata_hits"] // sint
- v["pmioh"] = d["prefetch_metadata_iohits"] // sint
+ v["pmioh"] = d.get("prefetch_metadata_iohits", 0) // sint
v["pmmis"] = d["prefetch_metadata_misses"] // sint
v["pmread"] = v["pmhit"] + v["pmioh"] + v["pmmis"]
@@ -575,8 +575,8 @@ def calculate():
v["mhit"] = (d["prefetch_metadata_hits"] +
d["demand_metadata_hits"]) // sint
- v["mioh"] = (d["prefetch_metadata_iohits"] +
- d["demand_metadata_iohits"]) // sint
+ v["mioh"] = (d.get("prefetch_metadata_iohits", 0) +
+ d.get("demand_metadata_iohits", 0)) // sint
v["mmis"] = (d["prefetch_metadata_misses"] +
d["demand_metadata_misses"]) // sint
@@ -592,24 +592,24 @@ def calculate():
v["mru"] = d["mru_hits"] // sint
v["mrug"] = d["mru_ghost_hits"] // sint
v["mfug"] = d["mfu_ghost_hits"] // sint
- v["unc"] = d["uncached_hits"] // sint
+ v["unc"] = d.get("uncached_hits", 0) // sint
v["eskip"] = d["evict_skip"] // sint
v["el2skip"] = d["evict_l2_skip"] // sint
v["el2cach"] = d["evict_l2_cached"] // sint
v["el2el"] = d["evict_l2_eligible"] // sint
- v["el2mfu"] = d["evict_l2_eligible_mfu"] // sint
- v["el2mru"] = d["evict_l2_eligible_mru"] // sint
+ v["el2mfu"] = d.get("evict_l2_eligible_mfu", 0) // sint
+ v["el2mru"] = d.get("evict_l2_eligible_mru", 0) // sint
v["el2inel"] = d["evict_l2_ineligible"] // sint
v["mtxmis"] = d["mutex_miss"] // sint
- v["ztotal"] = (d["zfetch_hits"] + d["zfetch_future"] + d["zfetch_stride"] +
- d["zfetch_past"] + d["zfetch_misses"]) // sint
+ v["ztotal"] = (d["zfetch_hits"] + d.get("zfetch_future", 0) + d.get("zfetch_stride", 0) +
+ d.get("zfetch_past", 0) + d["zfetch_misses"]) // sint
v["zhits"] = d["zfetch_hits"] // sint
- v["zahead"] = (d["zfetch_future"] + d["zfetch_stride"]) // sint
- v["zpast"] = d["zfetch_past"] // sint
+ v["zahead"] = (d.get("zfetch_future", 0) + d.get("zfetch_stride", 0)) // sint
+ v["zpast"] = d.get("zfetch_past", 0) // sint
v["zmisses"] = d["zfetch_misses"] // sint
v["zmax"] = d["zfetch_max_streams"] // sint
- v["zfuture"] = d["zfetch_future"] // sint
- v["zstride"] = d["zfetch_stride"] // sint
+ v["zfuture"] = d.get("zfetch_future", 0) // sint
+ v["zstride"] = d.get("zfetch_stride", 0) // sint
v["zissued"] = d["zfetch_io_issued"] // sint
v["zactive"] = d["zfetch_io_active"] // sint
@@ -624,11 +624,11 @@ def calculate():
v["l2size"] = cur["l2_size"]
v["l2bytes"] = d["l2_read_bytes"] // sint
- v["l2pref"] = cur["l2_prefetch_asize"]
- v["l2mfu"] = cur["l2_mfu_asize"]
- v["l2mru"] = cur["l2_mru_asize"]
- v["l2data"] = cur["l2_bufc_data_asize"]
- v["l2meta"] = cur["l2_bufc_metadata_asize"]
+ v["l2pref"] = cur.get("l2_prefetch_asize", 0)
+ v["l2mfu"] = cur.get("l2_mfu_asize", 0)
+ v["l2mru"] = cur.get("l2_mru_asize", 0)
+ v["l2data"] = cur.get("l2_bufc_data_asize", 0)
+ v["l2meta"] = cur.get("l2_bufc_metadata_asize", 0)
v["l2pref%"] = 100 * v["l2pref"] // v["l2asize"]
v["l2mfu%"] = 100 * v["l2mfu"] // v["l2asize"]
v["l2mru%"] = 100 * v["l2mru"] // v["l2asize"]
@@ -1,113 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
Date: Wed, 10 Nov 2021 09:29:47 +0100
Subject: [PATCH] arc stat/summary: guard access to l2arc MFU/MRU stats
commit 085321621e79a75bea41c2b6511da6ebfbf2ba0a added printing MFU
and MRU stats for 2.1 user space tools, but those keys are not
available in the 2.0 module. That means it may break the arcstat and
arc_summary tools after upgrade to 2.1 (user space), before a reboot
to the new 2.1 ZFS kernel-module happened, due to python raising a
KeyError on the dict access then.
Move those two keys to a .get accessor with `0` as fallback, as it
should be better to show some possible wrong data for new stat-keys
than throwing an exception.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
also move l2_mfu_asize l2_mru_asize l2_prefetch_asize
l2_bufc_data_asize l2_bufc_metadata_asize to .get accessor
(these are only present with a cache device in the pool)
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
cmd/arc_summary | 28 ++++++++++++++--------------
cmd/arcstat.in | 14 +++++++-------
2 files changed, 21 insertions(+), 21 deletions(-)
diff --git a/cmd/arc_summary b/cmd/arc_summary
index 426e02070..9de198150 100755
--- a/cmd/arc_summary
+++ b/cmd/arc_summary
@@ -655,13 +655,13 @@ def section_arc(kstats_dict):
prt_i1('L2 cached evictions:', f_bytes(arc_stats['evict_l2_cached']))
prt_i1('L2 eligible evictions:', f_bytes(arc_stats['evict_l2_eligible']))
prt_i2('L2 eligible MFU evictions:',
- f_perc(arc_stats['evict_l2_eligible_mfu'],
+ f_perc(arc_stats.get('evict_l2_eligible_mfu', 0), # 2.0 module compat
arc_stats['evict_l2_eligible']),
- f_bytes(arc_stats['evict_l2_eligible_mfu']))
+ f_bytes(arc_stats.get('evict_l2_eligible_mfu', 0)))
prt_i2('L2 eligible MRU evictions:',
- f_perc(arc_stats['evict_l2_eligible_mru'],
+ f_perc(arc_stats.get('evict_l2_eligible_mru', 0), # 2.0 module compat
arc_stats['evict_l2_eligible']),
- f_bytes(arc_stats['evict_l2_eligible_mru']))
+ f_bytes(arc_stats.get('evict_l2_eligible_mru', 0)))
prt_i1('L2 ineligible evictions:',
f_bytes(arc_stats['evict_l2_ineligible']))
print()
@@ -851,20 +851,20 @@ def section_l2arc(kstats_dict):
f_perc(arc_stats['l2_hdr_size'], arc_stats['l2_size']),
f_bytes(arc_stats['l2_hdr_size']))
prt_i2('MFU allocated size:',
- f_perc(arc_stats['l2_mfu_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_mfu_asize']))
+ f_perc(arc_stats.get('l2_mfu_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_mfu_asize', 0))) # 2.0 module compat
prt_i2('MRU allocated size:',
- f_perc(arc_stats['l2_mru_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_mru_asize']))
+ f_perc(arc_stats.get('l2_mru_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_mru_asize', 0))) # 2.0 module compat
prt_i2('Prefetch allocated size:',
- f_perc(arc_stats['l2_prefetch_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_prefetch_asize']))
+ f_perc(arc_stats.get('l2_prefetch_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_prefetch_asize',0))) # 2.0 module compat
prt_i2('Data (buffer content) allocated size:',
- f_perc(arc_stats['l2_bufc_data_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_bufc_data_asize']))
+ f_perc(arc_stats.get('l2_bufc_data_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_bufc_data_asize', 0))) # 2.0 module compat
prt_i2('Metadata (buffer content) allocated size:',
- f_perc(arc_stats['l2_bufc_metadata_asize'], arc_stats['l2_asize']),
- f_bytes(arc_stats['l2_bufc_metadata_asize']))
+ f_perc(arc_stats.get('l2_bufc_metadata_asize', 0), arc_stats['l2_asize']),
+ f_bytes(arc_stats.get('l2_bufc_metadata_asize', 0))) # 2.0 module compat
print()
prt_1('L2ARC breakdown:', f_hits(l2_access_total))
diff --git a/cmd/arcstat.in b/cmd/arcstat.in
index 8df1c62f7..833348d0e 100755
--- a/cmd/arcstat.in
+++ b/cmd/arcstat.in
@@ -565,8 +565,8 @@ def calculate():
v["el2skip"] = d["evict_l2_skip"] // sint
v["el2cach"] = d["evict_l2_cached"] // sint
v["el2el"] = d["evict_l2_eligible"] // sint
- v["el2mfu"] = d["evict_l2_eligible_mfu"] // sint
- v["el2mru"] = d["evict_l2_eligible_mru"] // sint
+ v["el2mfu"] = d.get("evict_l2_eligible_mfu", 0) // sint
+ v["el2mru"] = d.get("evict_l2_eligible_mru", 0) // sint
v["el2inel"] = d["evict_l2_ineligible"] // sint
v["mtxmis"] = d["mutex_miss"] // sint
@@ -581,11 +581,11 @@ def calculate():
v["l2size"] = cur["l2_size"]
v["l2bytes"] = d["l2_read_bytes"] // sint
- v["l2pref"] = cur["l2_prefetch_asize"]
- v["l2mfu"] = cur["l2_mfu_asize"]
- v["l2mru"] = cur["l2_mru_asize"]
- v["l2data"] = cur["l2_bufc_data_asize"]
- v["l2meta"] = cur["l2_bufc_metadata_asize"]
+ v["l2pref"] = cur.get("l2_prefetch_asize", 0)
+ v["l2mfu"] = cur.get("l2_mfu_asize", 0)
+ v["l2mru"] = cur.get("l2_mru_asize", 0)
+ v["l2data"] = cur.get("l2_bufc_data_asize", 0)
+ v["l2meta"] = cur.get("l2_bufc_metadata_asize", 0)
v["l2pref%"] = 100 * v["l2pref"] // v["l2asize"]
v["l2mfu%"] = 100 * v["l2mfu"] // v["l2asize"]
v["l2mru%"] = 100 * v["l2mru"] // v["l2asize"]
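Note on the pattern used in the two hunks above: each lookup of a counter that an older kernel module may not export is switched from `d[key]` to `d.get(key, 0)`, so a missing kstat reads as zero instead of raising `KeyError`. The following is a minimal sketch of that idea in isolation, not code from the patch. The helper name `guarded_percent` and the sample dictionaries are hypothetical, and the zero-total guard is an addition of this sketch.

```python
def guarded_percent(stats, key, total_key):
    """Return (value, percent of total) for a counter that may be absent.

    `stats` stands in for the parsed kstat dictionary. `key` is a counter
    that an older module might not export; `total_key` is assumed to be
    always present.
    """
    value = stats.get(key, 0)      # absent counter is treated as 0
    total = stats[total_key]       # still a hard lookup, as in the patch
    # Guard against a zero total (assumption of this sketch only).
    percent = 100 * value // total if total else 0
    return value, percent


# Hypothetical kstat snapshots: the first lacks the MFU counter.
old_module = {"l2_asize": 4096}
new_module = {"l2_asize": 4096, "l2_mfu_asize": 1024}

print(guarded_percent(old_module, "l2_mfu_asize", "l2_asize"))
print(guarded_percent(new_module, "l2_mfu_asize", "l2_asize"))
```

With the plain `stats[key]` lookup, the first call would fail on the older snapshot; with `.get()` both calls succeed and the missing counter is reported as zero.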
@@ -1,99 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tony Hutter <hutter2@llnl.gov>
Date: Mon, 23 Oct 2023 14:45:06 -0700
Subject: [PATCH] zvol: Remove broken blk-mq optimization
This fix removes a dubious optimization in zfs_uiomove_bvec_rq()
that saved the iterator contents of a rq_for_each_segment(). This
optimization allowed restoring the "saved state" from a previous
rq_for_each_segment() call on the same uio so that you wouldn't
need to iterate through each bvec on every zfs_uiomove_bvec_rq() call.
However, if the kernel is manipulating the requests/bios/bvecs under
the covers between zfs_uiomove_bvec_rq() calls, then it could result
in corruption from using the "saved state". This optimization
results in an unbootable system after installing an OS on a zvol
with blk-mq enabled.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15351
(cherry picked from commit 7c9b6fed16ed5034fd1cdfdaedfad93dc97b1557)
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
include/os/linux/spl/sys/uio.h | 8 --------
module/os/linux/zfs/zfs_uio.c | 29 -----------------------------
2 files changed, 37 deletions(-)
diff --git a/include/os/linux/spl/sys/uio.h b/include/os/linux/spl/sys/uio.h
index cce097e16..a4b600004 100644
--- a/include/os/linux/spl/sys/uio.h
+++ b/include/os/linux/spl/sys/uio.h
@@ -73,13 +73,6 @@ typedef struct zfs_uio {
size_t uio_skip;
struct request *rq;
-
- /*
- * Used for saving rq_for_each_segment() state between calls
- * to zfs_uiomove_bvec_rq().
- */
- struct req_iterator iter;
- struct bio_vec bv;
} zfs_uio_t;
@@ -138,7 +131,6 @@ zfs_uio_bvec_init(zfs_uio_t *uio, struct bio *bio, struct request *rq)
} else {
uio->uio_bvec = NULL;
uio->uio_iovcnt = 0;
- memset(&uio->iter, 0, sizeof (uio->iter));
}
uio->uio_loffset = io_offset(bio, rq);
diff --git a/module/os/linux/zfs/zfs_uio.c b/module/os/linux/zfs/zfs_uio.c
index 3efd4ab15..c2ed67c43 100644
--- a/module/os/linux/zfs/zfs_uio.c
+++ b/module/os/linux/zfs/zfs_uio.c
@@ -204,22 +204,6 @@ zfs_uiomove_bvec_rq(void *p, size_t n, zfs_uio_rw_t rw, zfs_uio_t *uio)
this_seg_start = orig_loffset;
rq_for_each_segment(bv, rq, iter) {
- if (uio->iter.bio) {
- /*
- * If uio->iter.bio is present, then we know we've saved
- * uio->iter from a previous call to this function, and
- * we can skip ahead in this rq_for_each_segment() loop
- * to where we last left off. That way, we don't need
- * to iterate over tons of segments we've already
- * processed - we can just restore the "saved state".
- */
- iter = uio->iter;
- bv = uio->bv;
- this_seg_start = uio->uio_loffset;
- memset(&uio->iter, 0, sizeof (uio->iter));
- continue;
- }
-
/*
* Lookup what the logical offset of the last byte of this
* segment is.
@@ -260,19 +244,6 @@ zfs_uiomove_bvec_rq(void *p, size_t n, zfs_uio_rw_t rw, zfs_uio_t *uio)
copied = 1; /* We copied some data */
}
- if (n == 0) {
- /*
- * All done copying. Save our 'iter' value to the uio.
- * This allows us to "save our state" and skip ahead in
- * the rq_for_each_segment() loop the next time we call
- * call zfs_uiomove_bvec_rq() on this uio (which we
- * will be doing for any remaining data in the uio).
- */
- uio->iter = iter; /* make a copy of the struct data */
- uio->bv = bv;
- return (0);
- }
-
this_seg_start = this_seg_end + 1;
}
@@ -1,123 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tony Hutter <hutter2@llnl.gov>
Date: Mon, 23 Oct 2023 14:39:59 -0700
Subject: [PATCH] Revert "zvol: Temporally disable blk-mq"
This reverts commit aefb6a2bd6c24597cde655e9ce69edd0a4c34357.
aefb6a2bd temporally disabled blk-mq until we could fix a fix for
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #15439
(cherry picked from commit 05c4710e8958832afc2868102c9535a4f18115be)
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
man/man4/zfs.4 | 57 ++++++++++++++++++++++++++++
module/os/linux/zfs/zvol_os.c | 12 ++++++
tests/zfs-tests/include/tunables.cfg | 2 +-
3 files changed, 70 insertions(+), 1 deletion(-)
diff --git a/man/man4/zfs.4 b/man/man4/zfs.4
index 71a3e67ee..cfadd79d8 100644
--- a/man/man4/zfs.4
+++ b/man/man4/zfs.4
@@ -2317,6 +2317,63 @@ If
.Sy zvol_threads
to the number of CPUs present or 32 (whichever is greater).
.
+.It Sy zvol_blk_mq_threads Ns = Ns Sy 0 Pq uint
+The number of threads per zvol to use for queuing IO requests.
+This parameter will only appear if your kernel supports
+.Li blk-mq
+and is only read and assigned to a zvol at zvol load time.
+If
+.Sy 0
+(the default) then internally set
+.Sy zvol_blk_mq_threads
+to the number of CPUs present.
+.
+.It Sy zvol_use_blk_mq Ns = Ns Sy 0 Ns | Ns 1 Pq uint
+Set to
+.Sy 1
+to use the
+.Li blk-mq
+API for zvols.
+Set to
+.Sy 0
+(the default) to use the legacy zvol APIs.
+This setting can give better or worse zvol performance depending on
+the workload.
+This parameter will only appear if your kernel supports
+.Li blk-mq
+and is only read and assigned to a zvol at zvol load time.
+.
+.It Sy zvol_blk_mq_blocks_per_thread Ns = Ns Sy 8 Pq uint
+If
+.Sy zvol_use_blk_mq
+is enabled, then process this number of
+.Sy volblocksize Ns -sized blocks per zvol thread.
+This tunable can be used to favor better performance for zvol reads (lower
+values) or writes (higher values).
+If set to
+.Sy 0 ,
+then the zvol layer will process the maximum number of blocks
+per thread that it can.
+This parameter will only appear if your kernel supports
+.Li blk-mq
+and is only applied at each zvol's load time.
+.
+.It Sy zvol_blk_mq_queue_depth Ns = Ns Sy 0 Pq uint
+The queue_depth value for the zvol
+.Li blk-mq
+interface.
+This parameter will only appear if your kernel supports
+.Li blk-mq
+and is only applied at each zvol's load time.
+If
+.Sy 0
+(the default) then use the kernel's default queue depth.
+Values are clamped to the kernel's
+.Dv BLKDEV_MIN_RQ
+and
+.Dv BLKDEV_MAX_RQ Ns / Ns Dv BLKDEV_DEFAULT_RQ
+limits.
+.
.It Sy zvol_volmode Ns = Ns Sy 1 Pq uint
Defines zvol block devices behaviour when
.Sy volmode Ns = Ns Sy default :
diff --git a/module/os/linux/zfs/zvol_os.c b/module/os/linux/zfs/zvol_os.c
index 76521c959..7a95b54bd 100644
--- a/module/os/linux/zfs/zvol_os.c
+++ b/module/os/linux/zfs/zvol_os.c
@@ -1620,6 +1620,18 @@ MODULE_PARM_DESC(zvol_prefetch_bytes, "Prefetch N bytes at zvol start+end");
module_param(zvol_volmode, uint, 0644);
MODULE_PARM_DESC(zvol_volmode, "Default volmode property value");
+#ifdef HAVE_BLK_MQ
+module_param(zvol_blk_mq_queue_depth, uint, 0644);
+MODULE_PARM_DESC(zvol_blk_mq_queue_depth, "Default blk-mq queue depth");
+
+module_param(zvol_use_blk_mq, uint, 0644);
+MODULE_PARM_DESC(zvol_use_blk_mq, "Use the blk-mq API for zvols");
+
+module_param(zvol_blk_mq_blocks_per_thread, uint, 0644);
+MODULE_PARM_DESC(zvol_blk_mq_blocks_per_thread,
+ "Process volblocksize blocks per thread");
+#endif
+
#ifndef HAVE_BLKDEV_GET_ERESTARTSYS
module_param(zvol_open_timeout_ms, uint, 0644);
MODULE_PARM_DESC(zvol_open_timeout_ms, "Timeout for ZVOL open retries");
diff --git a/tests/zfs-tests/include/tunables.cfg b/tests/zfs-tests/include/tunables.cfg
index 8010a9451..80e7bcb3b 100644
--- a/tests/zfs-tests/include/tunables.cfg
+++ b/tests/zfs-tests/include/tunables.cfg
@@ -89,7 +89,7 @@ VDEV_VALIDATE_SKIP vdev.validate_skip vdev_validate_skip
VOL_INHIBIT_DEV UNSUPPORTED zvol_inhibit_dev
VOL_MODE vol.mode zvol_volmode
VOL_RECURSIVE vol.recursive UNSUPPORTED
-VOL_USE_BLK_MQ UNSUPPORTED UNSUPPORTED
+VOL_USE_BLK_MQ UNSUPPORTED zvol_use_blk_mq
XATTR_COMPAT xattr_compat zfs_xattr_compat
ZEVENT_LEN_MAX zevent.len_max zfs_zevent_len_max
ZEVENT_RETAIN_MAX zevent.retain_max zfs_zevent_retain_max
@@ -51,10 +51,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/cmd/zpool/zpool_main.c b/cmd/zpool/zpool_main.c
-index 5507f9d3f..98970abfe 100644
+index ed0b8d7a1..f3acc49d0 100644
--- a/cmd/zpool/zpool_main.c
+++ b/cmd/zpool/zpool_main.c
-@@ -2478,7 +2478,8 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
+@@ -2663,7 +2663,8 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
if (vs->vs_scan_removing != 0) {
(void) printf(gettext(" (removing)"));
@@ -1,75 +0,0 @@
From 28be24aefc13b11e4c96e172cf2685994e03150d Mon Sep 17 00:00:00 2001
From: Tony Hutter <hutter2@llnl.gov>
Date: Thu, 9 Nov 2023 16:43:35 -0800
Subject: [PATCH] Workaround UBSAN errors for variable arrays
This gets around UBSAN errors when using arrays at the end of
structs. It converts some zero-length arrays to variable length
arrays and disables UBSAN checking on certain modules.
It is based off of the patch from #15460.
Addresses: #15145
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Co-authored-by: Tony Hutter <hutter2@llnl.gov>
Co-authored-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
include/os/linux/spl/sys/kmem_cache.h | 2 +-
include/sys/vdev_raidz_impl.h | 4 ++--
module/Kbuild.in | 4 ++++
3 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/include/os/linux/spl/sys/kmem_cache.h b/include/os/linux/spl/sys/kmem_cache.h
index 20eeadc46..82d50b603 100644
--- a/include/os/linux/spl/sys/kmem_cache.h
+++ b/include/os/linux/spl/sys/kmem_cache.h
@@ -108,7 +108,7 @@ typedef struct spl_kmem_magazine {
uint32_t skm_refill; /* Batch refill size */
struct spl_kmem_cache *skm_cache; /* Owned by cache */
unsigned int skm_cpu; /* Owned by cpu */
- void *skm_objs[0]; /* Object pointers */
+ void *skm_objs[]; /* Object pointers */
} spl_kmem_magazine_t;
typedef struct spl_kmem_obj {
diff --git a/include/sys/vdev_raidz_impl.h b/include/sys/vdev_raidz_impl.h
index c1037fa12..73c26dff1 100644
--- a/include/sys/vdev_raidz_impl.h
+++ b/include/sys/vdev_raidz_impl.h
@@ -130,7 +130,7 @@ typedef struct raidz_row {
uint64_t rr_offset; /* Logical offset for *_io_verify() */
uint64_t rr_size; /* Physical size for *_io_verify() */
#endif
- raidz_col_t rr_col[0]; /* Flexible array of I/O columns */
+ raidz_col_t rr_col[]; /* Flexible array of I/O columns */
} raidz_row_t;
typedef struct raidz_map {
@@ -139,7 +139,7 @@ typedef struct raidz_map {
int rm_nskip; /* RAIDZ sectors skipped for padding */
int rm_skipstart; /* Column index of padding start */
const raidz_impl_ops_t *rm_ops; /* RAIDZ math operations */
- raidz_row_t *rm_row[0]; /* flexible array of rows */
+ raidz_row_t *rm_row[]; /* flexible array of rows */
} raidz_map_t;
diff --git a/module/Kbuild.in b/module/Kbuild.in
index c13217159..b9c284a24 100644
--- a/module/Kbuild.in
+++ b/module/Kbuild.in
@@ -488,6 +488,10 @@ zfs-$(CONFIG_ARM64) += $(addprefix zfs/,$(ZFS_OBJS_ARM64))
zfs-$(CONFIG_PPC) += $(addprefix zfs/,$(ZFS_OBJS_PPC_PPC64))
zfs-$(CONFIG_PPC64) += $(addprefix zfs/,$(ZFS_OBJS_PPC_PPC64))
+UBSAN_SANITIZE_zap_leaf.o := n
+UBSAN_SANITIZE_zap_micro.o := n
+UBSAN_SANITIZE_sa.o := n
+
# Suppress incorrect warnings from versions of objtool which are not
# aware of x86 EVEX prefix instructions used for AVX512.
OBJECT_FILES_NON_STANDARD_vdev_raidz_math_avx512bw.o := y
--
2.39.2
+3 -6
@@ -6,9 +6,6 @@
0006-dont-symlink-zed-scripts.patch
0007-Add-systemd-unit-for-importing-specific-pools.patch
0008-Patch-move-manpage-arcstat-1-to-arcstat-8.patch
-0009-arc-stat-summary-guard-access-to-l2arc-MFU-MRU-stats.patch
-0010-zvol-Remove-broken-blk-mq-optimization.patch
-0011-Revert-zvol-Temporally-disable-blk-mq.patch
-0012-Fix-nfs_truncate_shares-without-etc-exports.d.patch
-0013-Workaround-UBSAN-errors-for-variable-arrays.patch
-0014-zpool-status-tighten-bounds-for-noalloc-stat-availab.patch
+0009-arc-stat-summary-guard-access-to-freshly-introduced-.patch
+0010-Fix-nfs_truncate_shares-without-etc-exports.d.patch
+0011-zpool-status-tighten-bounds-for-noalloc-stat-availab.patch
+2
@@ -35,6 +35,7 @@ sbin/zstreamdump
usr/bin/zvol_wait
usr/bin/zilstat
usr/lib/modules-load.d/ lib/
usr/lib/zfs-linux/zfs_prepare_disk
usr/lib/zfs-linux/zpool.d/
usr/lib/zfs-linux/zpool_influxdb
usr/sbin/arc_summary
@@ -72,6 +73,7 @@ usr/share/man/man8/zfs-list.8
usr/share/man/man8/zfs-load-key.8
usr/share/man/man8/zfs-mount-generator.8
usr/share/man/man8/zfs-mount.8
usr/share/man/man8/zfs_prepare_disk.8
usr/share/man/man8/zfs-program.8
usr/share/man/man8/zfs-project.8
usr/share/man/man8/zfs-projectspace.8