Compare commits

..

153 Commits

Author SHA1 Message Date
Thomas Lamprecht c0bcb10a6b bump version to 4.13.16-51
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-07-05 12:21:37 +02:00
Thomas Lamprecht e5f68877f7 update ABI file for 4.13.16-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-07-05 12:21:17 +02:00
Thomas Lamprecht c32a5136c9 rebase patches on top of Ubuntu-4.13.0-45.50
(generated with debian/scripts/import-upstream-tag)

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-07-05 09:00:14 +02:00
Thomas Lamprecht 25d1229b93 update sources to Ubuntu-4.13.0-45.50
(generated with debian/scripts/import-upstream-tag)

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-07-05 09:00:13 +02:00
Thomas Lamprecht 9920decb89 bsys: export-patchqueue: do not print signature
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit a6ee60dcae)
2018-07-05 08:59:54 +02:00
Thomas Lamprecht 94ee5be79f bump version to 4.13.16-50 2018-06-28 10:49:34 +02:00
Thomas Lamprecht 241d0d30b7 add KVM L1 guest escape - CVE-2018-12904 patch
see: http://www.openwall.com/lists/oss-security/2018/06/27/7
2018-06-27 18:01:12 +02:00
Thomas Lamprecht 758134b5b8 d/control: automatically replace linux tools maj.min version
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-05-30 08:21:24 +02:00
Stoiko Ivanov ea0c28fbd3 d/rules: don't remove perf.1 manpage
the one in linux-base refers to the versioned one

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2018-05-28 15:50:12 +02:00
Stoiko Ivanov d19532a45f d/rules: add version to perf man pages
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2018-05-28 15:50:12 +02:00
Stoiko Ivanov 836592184c refactor variable names and remove hardcoded major.minor version
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2018-05-28 12:30:01 +02:00
Thomas Lamprecht eae1bbd4fd abi-generate: abi parameter only needed if we parse deb
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-05-25 15:09:52 +02:00
Thomas Lamprecht a4ed09777b buildsys: abi-generate: add usage output
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-05-25 13:20:12 +02:00
Thomas Lamprecht aaf2ebd5eb follow up: update ABI tracking file
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-05-25 13:06:25 +02:00
Thomas Lamprecht 96c98e6390 bump version to 4.13-49 2018-05-24 13:36:25 +02:00
Thomas Lamprecht 1f1e183368 rebase patches on top of Ubuntu-4.13.0-43.48 2018-05-24 13:36:25 +02:00
Thomas Lamprecht cee23dde01 update sources to Ubuntu-4.13.0-43.48 2018-05-24 13:36:25 +02:00
Thomas Lamprecht 2a24ef2cb2 update ZFS to 0.7.9-pve1 2018-05-24 13:36:25 +02:00
Thomas Lamprecht cd95e3e71c buildsys: also cleanup *.{deb,changes,buildinfo} files 2018-05-23 11:42:10 +02:00
Thomas Lamprecht 7d510c8811 bump version to 4.13-48 2018-05-04 13:29:41 +02:00
Thomas Lamprecht 1411e00339 update ZFS to 0.7.8-pve1 2018-05-04 13:29:41 +02:00
Thomas Lamprecht 4cae8e323f rebase patches on top of Ubuntu-4.13.0-40.45
(generated with debian/scripts/import-upstream-tag)

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-05-04 10:56:24 +02:00
Thomas Lamprecht cf0ec7e756 update sources to Ubuntu-4.13.0-40.45
(generated with debian/scripts/import-upstream-tag)

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2018-05-04 10:56:24 +02:00
Wolfgang Bumiller c9e0edd481 fix #1737: merge: net: fix deadlock while clearing neighbor proxy table
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199289
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2018-04-25 14:28:04 +02:00
Fabian Grünbichler 62948cb62e d/rules: check for accidental perf linkage
with libraries that are not GPL-2-only compatible, fix previously typoed
variable, and add build-dep on libiberty-dev for CPLUS demangling.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-25 14:05:31 +02:00
Fabian Grünbichler 3e1dbb9dbf d/rules: install perf man pages
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-25 14:05:21 +02:00
Fabian Grünbichler 6e677e700e d/rules: don't strip headers package
we don't want to debug the contained helper binaries ;)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-25 14:02:55 +02:00
Fabian Grünbichler cfb5b5480b d/rules: reformat header collection
for better readability and to reduce future churn

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-25 14:02:51 +02:00
Thomas Lamprecht 8ef2bff31c d/control: add some missing build dependencies
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

and wrap-and-sort them

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-24 12:00:58 +02:00
Fabian Grünbichler a5d0221028 debian/scripts: add import-upstream-tag
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-24 08:57:53 +02:00
Fabian Grünbichler eec53b4f3f debian/scripts: add patchqueue scripts
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-24 08:57:53 +02:00
Fabian Grünbichler 0e37410676 update ABI file for 4.13.16-2-pve
(generated with debian/scripts/abi-generate)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-09 11:36:52 +02:00
Fabian Grünbichler 7996fc9e77 bump version to 4.13-47 2018-04-09 11:36:51 +02:00
Fabian Grünbichler 58e2f3e62f update ZFS to 0.7.7-pve2 2018-04-09 11:36:51 +02:00
Fabian Grünbichler 556283304a unbreak build after Ubuntu retpoline extract changes 2018-04-09 09:59:10 +02:00
Fabian Grünbichler d2da08752a rebase patches on top of Ubuntu-4.13.0-39.44
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-09 09:59:08 +02:00
Fabian Grünbichler 733df4e15e update sources to Ubuntu-4.13.0-39.44
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-09 09:59:04 +02:00
Fabian Grünbichler e91714fb70 bump version to 4.13-46
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-04 14:49:39 +02:00
Fabian Grünbichler 4abb1088a3 update SPL/ZFS to 0.7.7
and manually set the executable build on this new helper script

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-04-04 14:49:39 +02:00
Fabian Grünbichler 1f6c4a8c5c bump version to 4.13.16-45 2018-03-28 15:47:17 +02:00
Fabian Grünbichler e866cbac69 fix #1633: potential deadlock with shmem
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-28 15:25:35 +02:00
Fabian Grünbichler fd0ede4990 bump version to 4.13.16-44 2018-03-28 10:37:10 +02:00
Fabian Grünbichler 5a7ad156fa fix #1633: potential deadlock with THPs
see https://marc.info/?l=linux-mm&m=151683828707588

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-26 14:47:23 +02:00
Fabian Grünbichler 8d06c0d3d4 update ABI file for 4.13.16-1-pve
(generated with debian/scripts/abi-generate)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-22 14:44:55 +01:00
Fabian Grünbichler 8f8697361c build: add abiupdate target
to automatically extract and commit the ABI data from a built
pve-headers binary package.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-21 14:45:36 +01:00
Fabian Grünbichler 66530989ce bump version to 4.13-43, bump ABI to 4.13.16-1-pve
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-21 14:45:36 +01:00
Fabian Grünbichler b1d53c94b4 scripts/abi-check: don't fail after ABI bump
this allows automatically running abi-check in non-fatal mode if an ABI
bump has just been done.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-21 14:45:36 +01:00
Fabian Grünbichler 2132f0716b d/scripts/abi-generate: add new helper script
and use it in d/rules to generate the checked ABI file.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-21 14:45:36 +01:00
Fabian Grünbichler 613597611d build: rename ABI file
to track previous ABI to automatically skip ABI checks on ABI bumps

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-21 14:45:35 +01:00
Fabian Grünbichler eac000ce29 rebase patches
and drop those applied upstream

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-21 14:45:35 +01:00
Fabian Grünbichler 8cef3abbbf update sources to Ubuntu-4.13.0-38.43
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2018-03-21 14:45:35 +01:00
Fabian Grünbichler 320c823e91 bump version to 4.13.13-42 2018-03-09 11:57:49 +01:00
Fabian Grünbichler 3323a8b78c add cherry-picks for OCFS2 bug
see https://forum.proxmox.com/threads/ocfs2-kernel-bug.39163/
2018-03-09 11:57:49 +01:00
Fabian Grünbichler 863ccb9670 add cherry-pick for NFS in network namespaces 2018-03-09 11:57:49 +01:00
Fabian Grünbichler 23df286b67 update source to Ubuntu-4.13.0-36.40 2018-03-09 11:57:49 +01:00
Fabian Grünbichler 44403fcc69 update README 2018-03-09 11:57:24 +01:00
Fabian Grünbichler 12aaf1a2f7 build: cleanup directory handling 2018-03-09 11:56:22 +01:00
Fabian Grünbichler 66aed5b89f build: remove exported variables
in favor of generated rules.d snippet. this allows calling
dpkg-buildpackage in the build directory manually without setting up the
environment to match.
2018-03-09 11:56:22 +01:00
Fabian Grünbichler f3acafc70e build: add pmg to upload target 2018-03-09 09:19:58 +01:00
Fabian Grünbichler e96d2ab3a1 build: move build and packaging to debian/
the top-level Makefile now only prepares the build directory by copying
and patching sources and generating the real files from debian/*.in

the actual build and packaging happens in debian/rules
2018-03-09 09:19:58 +01:00
Fabian Grünbichler 812cbfc144 debian/compat: set to 10 2018-03-09 09:19:58 +01:00
Fabian Grünbichler b9b3ee7810 debian/copyright: whitespace cleanup 2018-03-09 09:19:58 +01:00
Fabian Grünbichler f3baf3769b d/control: add source section, cleanup
remove variables that are set by dpkg-buildpackage automatically, and
wrap-and-sort the whole thing
2018-03-09 09:19:58 +01:00
Fabian Grünbichler 2d62d8a400 build: move/merge files
the control files were merged as appropriate, the rest are plain
renames.
2018-03-09 09:19:58 +01:00
Fabian Grünbichler 5de7886d45 build: remove leftover patch
has been applied upstream since 4.7 and thus dropped from our queue for
quite some time.
2018-03-09 09:19:58 +01:00
Fabian Grünbichler c7ac633ff7 build: remove proxmox-ve files
moved to separate repositories (proxmox-ve and pve-kernel-meta)
2018-03-09 09:19:43 +01:00
Fabian Grünbichler 89102957f9 bump version to 4.13-41 2018-02-21 10:08:20 +01:00
Fabian Grünbichler fecdaa4ec9 update ZFS/SPL to 0.7.6 2018-02-21 10:06:09 +01:00
Fabian Grünbichler 38c79e8118 fix refcnt leaks with net namespaces
see https://github.com/lxc/lxc/issues/2141 and
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/
2018-02-21 09:18:49 +01:00
Fabian Grünbichler 8a8c16e218 bump version to 4.13-40 2018-02-16 09:58:12 +01:00
Fabian Grünbichler a4b1a797a0 warn when non-RETPOLINED module gets loaded 2018-02-16 09:58:12 +01:00
Fabian Grünbichler eb7f659548 bump version to 4.13-39, bump ABI to 4.13.13-6-pve 2018-02-16 09:58:12 +01:00
Fabian Grünbichler 7f4d14b06f buildsys: check for indirect/RETPOLINE gcc support
copied from arch/x86/Makefile
2018-02-16 09:58:12 +01:00
Fabian Grünbichler ef812b062d cherry-pick sched-wait bug fix
(included in 4.15 and queued for 4.14)
2018-02-14 12:14:12 +01:00
Fabian Grünbichler d320e5b2c3 cherry-pick scsi lpfc HBA bug fix
see https://forum.proxmox.com/threads/proxmox-5-1-lpfc-hba-emulex-lpe12000-error.39179/
2018-02-13 12:41:35 +01:00
Fabian Grünbichler 3adc532101 rebase patches 2018-02-13 12:41:35 +01:00
Fabian Grünbichler 9e25396c90 update sources to Ubuntu-4.13.0-35.39 2018-02-13 12:41:35 +01:00
Fabian Grünbichler 1da60899e3 add EDAC cherry-picks 2018-01-29 15:00:40 +01:00
Fabian Grünbichler a70918fbbc restructure patches
rebase on Ubuntu-4.13.0-32.35

the effective kernel tree which gets compiled after patches have been
applied is functionally identical (modulo parts for architectures which
we don't care about and Ubuntu build files)
2018-01-29 14:22:56 +01:00
Fabian Grünbichler 8d1dbe7c68 update sources to Ubuntu-4.13.0-32.35
note: all relevant changes were previously already cherry-picked
2018-01-29 14:19:13 +01:00
Fabian Grünbichler 57ff4c945b bump version to 4.13-38 2018-01-26 10:48:16 +01:00
Fabian Grünbichler 81f370d513 fix syscall retpoline 2018-01-26 10:46:25 +01:00
Fabian Grünbichler c1178a874f bump version to 4.13-37 2018-01-19 12:45:45 +01:00
Fabian Grünbichler a0f7ab8a6a fix #1622: i40e memory leak
cherry-pick from upstream 4.14
2018-01-19 12:43:16 +01:00
Fabian Grünbichler 7f0445469f update ZFS to 0.7.4 + ARC hit rate cherry-pick
from 0.7.6 queue
2018-01-19 12:28:37 +01:00
Fabian Grünbichler f90505f3a2 add tc fixes 2018-01-19 12:27:49 +01:00
Fabian Grünbichler d7db7042bc update ABI file 2018-01-15 14:05:27 +01:00
Fabian Grünbichler 5c85c0455e bump version to 4.13-36, bump ABI to 4.13.13-5-pve 2018-01-15 14:05:27 +01:00
Fabian Grünbichler 035dbe6708 KPTI/Spectre: add more fixes
* initial IBRS/IBPB/SPEC_CTRL support
* regression fixes for KPTI
* additional hardening against Spectre

based on Ubuntu-4.13.0-29.32 and mainline 4.14
2018-01-15 12:34:50 +01:00
Fabian Grünbichler 59d5af6732 build: reformat existing patches
drop numbers and commit hashes from patch metadata to reduce future
patch churn
2018-01-15 12:26:15 +01:00
Fabian Grünbichler 9c34463e8c bump version to 4.13-35, bump ABI to 4.13.13-4-pve 2018-01-08 11:51:24 +01:00
Fabian Grünbichler 633c5ed17f revert buggy SCSI error handler commit
this causes kernel OOPS and upstream is unresponsive about it.

see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1726519
2018-01-08 11:51:24 +01:00
Fabian Grünbichler 76ec7e5931 update Spectre KVM PoC fix for AMD 2018-01-08 10:58:23 +01:00
Fabian Grünbichler 04f3b8beca KPTI: disable on AMD
and allow loading of microcode on recent AMD systems in preparation of
further Spectre fixes
2018-01-08 10:25:31 +01:00
Fabian Grünbichler e4cdf2a53e KPTI: add follow-up fixes 2018-01-08 10:25:09 +01:00
Fabian Grünbichler f9557b2c8d update ABI file 2018-01-07 13:21:02 +01:00
Fabian Grünbichler 597fd67073 bump version to 4.13-34, bump ABI to 4.13.13-3-pve 2018-01-07 13:21:02 +01:00
Fabian Grünbichler 6ecf746bac enable KPTI 2018-01-07 13:18:22 +01:00
Fabian Grünbichler e414beae5f default to FRAME_POINTER unwinder again
the new default was changed in 4.14 and was cherry-picked together with
KPTI, but the ORC_UNWINDER seems to break ZFS
2018-01-07 13:18:22 +01:00
Fabian Grünbichler b378f209dd add objtool build fix 2018-01-07 13:18:22 +01:00
Fabian Grünbichler 7c7389df50 add Spectre PoC fix
picked from https://patchwork.kernel.org/patch/10147679/
2018-01-06 15:15:39 +01:00
Fabian Grünbichler 321d628a98 add KPTI and related patches
picked from Ubuntu-4.13.0-23.26
2018-01-06 15:15:39 +01:00
Fabian Grünbichler 19894df472 reorder patches
numbering got messed up in the previous upload
2018-01-06 15:15:39 +01:00
Fabian Grünbichler 4536c7a7ed bump version to 4.13-33 2018-01-02 10:04:21 +01:00
Fabian Grünbichler 9e94988ca1 fix #1537: cherry-pick AMD NPT / IOMMU fix 2018-01-02 10:01:56 +01:00
Fabian Grünbichler aac9d58a8e update to Ubuntu-4.13.0-22.25 2018-01-02 09:55:05 +01:00
Fabian Grünbichler f783f68d2c bump version to 4.13-32, bump ABI to 4.13.13-2-pve 2017-12-21 10:22:24 +01:00
Fabian Grünbichler f51d44850e update to Ubuntu-4.13.0-21.24 2017-12-21 09:01:40 +01:00
Fabian Grünbichler bfd0cd3fe0 bump version to 4.13-31, bump ABI to 4.13.13-1-pve 2017-12-11 11:24:58 +01:00
Fabian Grünbichler 8d5d9374b9 update sources to Ubuntu-4.13.0-19.22 2017-12-11 09:54:00 +01:00
Fabian Grünbichler cba3f72b57 bump version to 4.13-30 2017-12-05 13:07:11 +01:00
Fabian Grünbichler 6eb123031d revert igb to 5.3.5.10
because 5.3.5.12 broke JUMBO_FRAMES (again)
2017-12-05 13:05:16 +01:00
Fabian Grünbichler 6749ef5ad2 bump version to 4.13-29, bump ABI to 4.13.8-3-pve 2017-12-04 09:36:58 +01:00
Fabian Grünbichler b42b4a1b96 cherry-pick KVM fix for old CPUs 2017-12-04 09:36:58 +01:00
Fabian Grünbichler 905722fbce cherry-pick / backport IB fixes
see https://forum.proxmox.com/threads/pve-5-1-and-infiniband-issues.37575/
2017-12-04 09:36:19 +01:00
Fabian Grünbichler ddad99c986 cherry-pick vhost perf regression and mem-leak fix 2017-12-04 09:27:58 +01:00
Fabian Grünbichler 9a9f6e04a7 cherry-pick final KVM BSOD fix 2017-12-04 09:27:58 +01:00
Fabian Grünbichler 8345558924 bump version to 4.13-28, bump ABI to 4.13.8-2-pve 2017-11-29 10:23:18 +01:00
Fabian Grünbichler 777ee9fe67 revert mmu changes causing bluescreens 2017-11-29 09:48:40 +01:00
Fabian Grünbichler 350f641023 bump version to 4.13-27, bump ABI to 4.13.8-1-pve 2017-11-22 09:47:25 +01:00
Fabian Grünbichler b409d5b86d add ABI data for 4.13.8-1-pve 2017-11-22 09:47:25 +01:00
Fabian Grünbichler 25c35b26a1 update intel drivers to latest upstream releases 2017-11-22 09:47:25 +01:00
Fabian Grünbichler d060c84f4d drop patches applied upstream 2017-11-17 11:59:22 +01:00
Fabian Grünbichler 0bd7eac29d update sources to Ubuntu-4.13.0-17.20 2017-11-17 11:39:27 +01:00
Fabian Grünbichler 2a26cde588 bump version to 4.13-26 2017-11-06 11:24:17 +01:00
Fabian Grünbichler 5caef38b1c update ZFS/SPL to 0.7.3 2017-11-06 11:23:31 +01:00
Fabian Grünbichler 3572537ff8 bump version to 5.1-25 2017-10-23 09:39:36 +02:00
Fabian Grünbichler da64a9b95a bump version to 4.13-25, bump ABI to 4.13.4-1-pve 2017-10-13 11:33:03 +02:00
Fabian Grünbichler 93216bf67a update abi-previous for ABI bump 2017-10-13 11:33:03 +02:00
Fabian Grünbichler 0e3176e76f fix CVE-2017-12188: nested KVM stack overflow 2017-10-13 11:33:03 +02:00
Fabian Grünbichler 2e38f6f987 update ZFS/SPL to 0.7.2
and switch submodule to simplify patch handling
2017-10-13 11:33:03 +02:00
Fabian Grünbichler dc8dc362e5 update sources to Ubuntu-4.13.0-16.19 2017-10-13 08:53:24 +02:00
Fabian Grünbichler a6dd515e43 build: rename submodules target to submodule 2017-10-13 08:41:42 +02:00
Fabian Grünbichler 2a1d389df6 bump version to 4.13-2, bump ABI to 4.13.3-1 2017-09-27 14:32:08 +02:00
Fabian Grünbichler 13cc2f9603 update abi-previous for ABI bump 2017-09-27 14:32:03 +02:00
Fabian Grünbichler 9c55c348b6 update kernel source to Ubuntu-4.13.0-12.13 2017-09-27 14:00:03 +02:00
Fabian Grünbichler 262ff4236b bump version to 4.13.1-1
kernel and header only, no meta packages
2017-09-27 10:08:57 +02:00
Fabian Grünbichler d84d9cdc47 ZFS/SPL: add 4.13 compat patches 2017-09-27 10:06:33 +02:00
Fabian Grünbichler e03fa66fce add cpuset v2 in v1 cherry-picks 2017-09-27 10:06:33 +02:00
Fabian Grünbichler 2853601c5c update fwlist and abi for 4.13.1-1-pve 2017-09-27 10:06:33 +02:00
Fabian Grünbichler a8ee21761c ixgbe: add 4.13 compat patch 2017-09-26 10:46:35 +02:00
Fabian Grünbichler 628004c405 igb: add 4.12 compat patch 2017-09-26 10:46:35 +02:00
Fabian Grünbichler 8021de509c intel: drop patches which are no longer needed 2017-09-26 10:46:35 +02:00
Fabian Grünbichler 85507ee2c5 update igb to 5.3.5.10 2017-09-26 10:46:35 +02:00
Fabian Grünbichler f3bad6d2b0 update ixgbe to 5.2.3 2017-09-26 10:46:35 +02:00
Fabian Grünbichler b46edee600 update e1000e to 3.3.5.10 2017-09-26 10:46:35 +02:00
Fabian Grünbichler 2f7beffd96 build: move intel NIC patches 2017-09-26 10:46:35 +02:00
Fabian Grünbichler b9e76370ab build: rebase and refactor kernel patches 2017-09-26 10:46:35 +02:00
Fabian Grünbichler 754ba827c1 update ACS override patch for 4.12+
using 330e834488d035e490fdda0b3242118411727bed from
https://aur.archlinux.org/linux-vfio.git
2017-09-26 10:46:35 +02:00
Fabian Grünbichler 6c7fba28d9 drop cpuset patch
to be replaced with backport of cgroup v2 functionality
2017-09-26 10:46:35 +02:00
Fabian Grünbichler a350540ee9 drop patches applied upstream 2017-09-26 10:46:35 +02:00
Fabian Grünbichler 0194915336 build: update for 4.13/artful 2017-09-26 10:38:27 +02:00
Fabian Grünbichler 133c60a505 update submodule for ubuntu artful's 4.13 kernel 2017-09-26 10:38:24 +02:00
85 changed files with 25867 additions and 23508 deletions
+6 -9
View File
@@ -1,9 +1,6 @@
[submodule "submodules/ubuntu-zesty"]
path = submodules/ubuntu-zesty
url = ../mirror_ubuntu-zesty-kernel
[submodule "submodules/zfs-module"]
path = submodules/zfs-module
url = ../mirror_zfs-debian
[submodule "submodules/spl-module"]
path = submodules/spl-module
url = ../mirror_spl-debian
[submodule "submodules/ubuntu-artful"]
path = submodules/ubuntu-artful
url = ../mirror_ubuntu-artful-kernel
[submodule "submodules/zfsonlinux"]
path = submodules/zfsonlinux
url = ../zfsonlinux
@@ -1,51 +0,0 @@
From b776ff7db868804129b9f364825fd4e949a493ee Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fabian=20Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>
Date: Tue, 19 Sep 2017 09:36:43 +0200
Subject: [PATCH] Revert "net: reduce skb_warn_bad_offload() noise"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This reverts commit b2504a5dbef3305ef41988ad270b0e8ec289331c.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
net/core/dev.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 73d5644fa834..7c8959936169 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2702,12 +2702,11 @@ static inline bool skb_needs_check(struct sk_buff *skb, bool tx_path)
struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
netdev_features_t features, bool tx_path)
{
- struct sk_buff *segs;
-
if (unlikely(skb_needs_check(skb, tx_path))) {
int err;
- /* We're going to init ->check field in TCP or UDP header */
+ skb_warn_bad_offload(skb);
+
err = skb_cow_head(skb, 0);
if (err < 0)
return ERR_PTR(err);
@@ -2735,12 +2734,7 @@ struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
skb_reset_mac_header(skb);
skb_reset_mac_len(skb);
- segs = skb_mac_gso_segment(skb, features);
-
- if (unlikely(skb_needs_check(skb, tx_path)))
- skb_warn_bad_offload(skb);
-
- return segs;
+ return skb_mac_gso_segment(skb, features);
}
EXPORT_SYMBOL(__skb_gso_segment);
--
2.11.0
@@ -1,54 +0,0 @@
From 6fa9fc0ce1032710ce017c444b0c66eaf9e77782 Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon, 22 May 2017 00:17:30 +0200
Subject: [PATCH linux] netfilter: nft_set_rbtree: handle re-addition element
after deletion
The existing code selects no next branch to be inspected when
re-inserting an inactive element into the rb-tree, looping endlessly.
This patch restricts the check for active elements to the EEXIST case
only.
Fixes: e701001e7cbe ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates")
Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_set_rbtree.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index f06f55e..51ff879 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -118,17 +118,17 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
else if (d > 0)
p = &parent->rb_right;
else {
- if (nft_set_elem_active(&rbe->ext, genmask)) {
- if (nft_rbtree_interval_end(rbe) &&
- !nft_rbtree_interval_end(new))
- p = &parent->rb_left;
- else if (!nft_rbtree_interval_end(rbe) &&
- nft_rbtree_interval_end(new))
- p = &parent->rb_right;
- else {
- *ext = &rbe->ext;
- return -EEXIST;
- }
+ if (nft_rbtree_interval_end(rbe) &&
+ !nft_rbtree_interval_end(new)) {
+ p = &parent->rb_left;
+ } else if (!nft_rbtree_interval_end(rbe) &&
+ nft_rbtree_interval_end(new)) {
+ p = &parent->rb_right;
+ } else if (nft_set_elem_active(&rbe->ext, genmask)) {
+ *ext = &rbe->ext;
+ return -EEXIST;
+ } else {
+ p = &parent->rb_left;
}
}
}
--
2.1.4
@@ -1,43 +0,0 @@
From d747a7a51b00984127a88113cdbbc26f91e9d815 Mon Sep 17 00:00:00 2001
From: WANG Cong <xiyou.wangcong@gmail.com>
Date: Sat, 24 Jun 2017 23:50:30 -0700
Subject: [PATCH] tcp: reset sk_rx_dst in tcp_disconnect()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We have to reset the sk->sk_rx_dst when we disconnect a TCP
connection, because otherwise when we re-connect it this
dst reference is simply overridden in tcp_finish_connect().
This fixes a dst leak which leads to a loopback dev refcnt
leak. It is a long-standing bug, Kevin reported a very similar
(if not same) bug before. Thanks to Andrei for providing such
a reliable reproducer which greatly narrows down the problem.
Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.")
Reported-by: Andrei Vagin <avagin@gmail.com>
Reported-by: Kevin Xu <kaiwen.xu@hulu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
net/ipv4/tcp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b5ea036ca781..40aca7803cf2 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2330,6 +2330,8 @@ int tcp_disconnect(struct sock *sk, int flags)
tcp_init_send_head(sk);
memset(&tp->rx_opt, 0, sizeof(tp->rx_opt));
__sk_dst_reset(sk);
+ dst_release(sk->sk_rx_dst);
+ sk->sk_rx_dst = NULL;
tcp_saved_syn_free(tp);
/* Clean up fastopen related fields */
--
2.11.0
+130 -295
View File
@@ -1,14 +1,15 @@
RELEASE=5.0
RELEASE=5.1
# also update proxmox-ve/changelog if you change KERNEL_VER or KREL
KERNEL_VER=4.10.17
PKGREL=25
# also include firmware of previous version into
# the fw package: fwlist-2.6.32-PREV-pve
KREL=5
# also update pve-kernel-meta.git if either of these change
KERNEL_MAJ=4
KERNEL_MIN=13
KERNEL_PATCHLEVEL=16
KREL=4
KERNEL_SRC=ubuntu-zesty
KERNEL_SRC_SUBMODULE=submodules/ubuntu-zesty
PKGREL=51
KERNEL_MAJMIN=$(KERNEL_MAJ).$(KERNEL_MIN)
KERNEL_VER=$(KERNEL_MAJMIN).$(KERNEL_PATCHLEVEL)
EXTRAVERSION=-${KREL}-pve
KVNAME=${KERNEL_VER}${EXTRAVERSION}
@@ -16,7 +17,6 @@ PACKAGE=pve-kernel-${KVNAME}
HDRPACKAGE=pve-headers-${KVNAME}
ARCH=$(shell dpkg-architecture -qDEB_BUILD_ARCH)
NPROCS=$(shell nproc)
# amd64/x86_64/x86 share the arch subdirectory in the kernel, 'x86' so we need
# a mapping
@@ -26,343 +26,178 @@ KERNEL_ARCH=${ARCH}
endif
GITVERSION:=$(shell git rev-parse HEAD)
CHANGELOG_DATE:=$(shell dpkg-parsechangelog -SDate -lchangelog.Debian)
export SOURCE_DATE_EPOCH ?= $(shell dpkg-parsechangelog -STimestamp -lchangelog.Debian)
SKIPABI=0
TOP=$(shell pwd)
ifeq ($(CC), cc)
GCC=gcc
else
GCC=$(CC)
endif
BUILD_DIR=build
KERNEL_SRC=ubuntu-artful
KERNEL_SRC_SUBMODULE=submodules/$(KERNEL_SRC)
KERNEL_CFG_ORG=config-${KERNEL_VER}.org
E1000EDIR=e1000e-3.3.5.3
E1000EDIR=e1000e-3.3.6
E1000ESRC=${E1000EDIR}.tar.gz
IGBDIR=igb-5.3.5.4
IGBDIR=igb-5.3.5.10
IGBSRC=${IGBDIR}.tar.gz
IXGBEDIR=ixgbe-5.0.4
IXGBEDIR=ixgbe-5.3.3
IXGBESRC=${IXGBEDIR}.tar.gz
ZFSONLINUX_SUBMODULE=submodules/zfsonlinux
SPLDIR=pkg-spl
SPLSRC=submodules/spl-module
SPLSRC=${ZFSONLINUX_SUBMODULE}/spl-debian
ZFSDIR=pkg-zfs
ZFSSRC=submodules/zfs-module
ZFS_KO=zfs.ko
ZFS_KO_REST=zavl.ko znvpair.ko zunicode.ko zcommon.ko zpios.ko icp.ko
ZFS_MODULES=$(ZFS_KO) $(ZFS_KO_REST)
SPL_KO=spl.ko
SPL_KO_REST=splat.ko
SPL_MODULES=$(SPL_KO) $(SPL_KO_REST)
ZFSSRC=${ZFSONLINUX_SUBMODULE}/zfs-debian
MODULES=modules
MODULE_DIRS=${E1000EDIR} ${IGBDIR} ${IXGBEDIR} ${SPLDIR} ${ZFSDIR}
# exported to debian/rules via debian/rules.d/dirs.mk
DIRS=KERNEL_SRC E1000EDIR IGBDIR IXGBEDIR SPLDIR ZFSDIR MODULES
DST_DEB=${PACKAGE}_${KERNEL_VER}-${PKGREL}_${ARCH}.deb
HDR_DEB=${HDRPACKAGE}_${KERNEL_VER}-${PKGREL}_${ARCH}.deb
PVEPKG=proxmox-ve
PVE_DEB=${PVEPKG}_${RELEASE}-${PKGREL}_all.deb
VIRTUALHDRPACKAGE=pve-headers
VIRTUAL_HDR_DEB=${VIRTUALHDRPACKAGE}_${RELEASE}-${PKGREL}_all.deb
LINUX_TOOLS_DEB=linux-tools-$(KERNEL_MAJMIN)_${KERNEL_VER}-${PKGREL}_${ARCH}.deb
LINUX_TOOLS_PKG=linux-tools-4.10
LINUX_TOOLS_DEB=${LINUX_TOOLS_PKG}_${KERNEL_VER}-${PKGREL}_${ARCH}.deb
DEBS=${DST_DEB} ${HDR_DEB} ${PVE_DEB} ${VIRTUAL_HDR_DEB} ${LINUX_TOOLS_DEB}
DEBS=${DST_DEB} ${HDR_DEB} ${LINUX_TOOLS_DEB}
all: check_gcc deb
deb: ${DEBS}
pve: $(PVE_DEB)
${PVE_DEB}: proxmox-ve/control proxmox-ve/postinst ${PVE_RELEASE_KEYS}
rm -rf proxmox-ve/data
mkdir -p proxmox-ve/data/DEBIAN
mkdir -p proxmox-ve/data/usr/share/doc/${PVEPKG}/
mkdir -p proxmox-ve/data/etc/apt/trusted.gpg.d
install -m 0644 proxmox-ve/proxmox-release-5.x.pubkey proxmox-ve/data/etc/apt/trusted.gpg.d/proxmox-ve-release-5.x.gpg
sed -e 's/@KVNAME@/${KVNAME}/' -e 's/@KERNEL_VER@/${KERNEL_VER}/' -e 's/@RELEASE@/${RELEASE}/' -e 's/@PKGREL@/${PKGREL}/' <proxmox-ve/control >proxmox-ve/data/DEBIAN/control
sed -e 's/@KVNAME@/${KVNAME}/' <proxmox-ve/postinst >proxmox-ve/data/DEBIAN/postinst
chmod 0755 proxmox-ve/data/DEBIAN/postinst
install -m 0755 proxmox-ve/postrm proxmox-ve/data/DEBIAN/postrm
echo "git clone git://git.proxmox.com/git/pve-kernel.git\\ngit checkout ${GITVERSION}" > proxmox-ve/data/usr/share/doc/${PVEPKG}/SOURCE
install -m 0644 proxmox-ve/copyright proxmox-ve/data/usr/share/doc/${PVEPKG}
install -m 0644 proxmox-ve/changelog.Debian proxmox-ve/data/usr/share/doc/${PVEPKG}
gzip -n --best proxmox-ve/data/usr/share/doc/${PVEPKG}/changelog.Debian
dpkg-deb --build proxmox-ve/data ${PVE_DEB}
check_gcc:
$(GCC) --version|grep "6\.3" || false
@$(GCC) -Werror -mindirect-branch=thunk-extern -mindirect-branch-register -c -x c /dev/null -o check_gcc.o \
|| ( rm -f check_gcc.o; \
echo "Please install gcc-6 packages with indirect thunk / RETPOLINE support"; \
false)
@rm -f check_gcc.o
pve-headers: $(VIRTUAL_HDR_DEB)
${VIRTUAL_HDR_DEB}: proxmox-ve/pve-headers.control
rm -rf pve-headers/data
mkdir -p pve-headers/data/DEBIAN
mkdir -p pve-headers/data/usr/share/doc/${VIRTUALHDRPACKAGE}/
sed -e 's/@KVNAME@/${KVNAME}/' -e 's/@KERNEL_VER@/${KERNEL_VER}/' -e 's/@RELEASE@/${RELEASE}/' -e 's/@PKGREL@/${PKGREL}/' <proxmox-ve/pve-headers.control >pve-headers/data/DEBIAN/control
echo "git clone git://git.proxmox.com/git/pve-kernel.git\\ngit checkout ${GITVERSION}" > pve-headers/data/usr/share/doc/${VIRTUALHDRPACKAGE}/SOURCE
install -m 0644 proxmox-ve/copyright pve-headers/data/usr/share/doc/${VIRTUALHDRPACKAGE}
install -m 0644 proxmox-ve/changelog.Debian pve-headers/data/usr/share/doc/${VIRTUALHDRPACKAGE}
gzip -n --best pve-headers/data/usr/share/doc/${VIRTUALHDRPACKAGE}/changelog.Debian
dpkg-deb --build pve-headers/data ${VIRTUAL_HDR_DEB}
check_gcc:
ifeq ($(CC), cc)
gcc --version|grep "6\.3" || false
else
$(CC) --version|grep "6\.3" || false
endif
${DST_DEB}: data control.in prerm.in postinst.in postrm.in copyright changelog.Debian | fwcheck abicheck
mkdir -p data/DEBIAN
sed -e 's/@KERNEL_VER@/${KERNEL_VER}/' -e 's/@KVNAME@/${KVNAME}/' -e 's/@PKGREL@/${PKGREL}/' -e 's/@ARCH@/${ARCH}/' <control.in >data/DEBIAN/control
sed -e 's/@@KVNAME@@/${KVNAME}/g' <prerm.in >data/DEBIAN/prerm
chmod 0755 data/DEBIAN/prerm
sed -e 's/@@KVNAME@@/${KVNAME}/g' <postinst.in >data/DEBIAN/postinst
chmod 0755 data/DEBIAN/postinst
sed -e 's/@@KVNAME@@/${KVNAME}/g' <postrm.in >data/DEBIAN/postrm
chmod 0755 data/DEBIAN/postrm
install -D -m 644 copyright data/usr/share/doc/${PACKAGE}/copyright
install -D -m 644 changelog.Debian data/usr/share/doc/${PACKAGE}/changelog.Debian
echo "git clone git://git.proxmox.com/git/pve-kernel.git\\ngit checkout ${GITVERSION}" > data/usr/share/doc/${PACKAGE}/SOURCE
gzip -n -f --best data/usr/share/doc/${PACKAGE}/changelog.Debian
rm -f data/lib/modules/${KVNAME}/source
rm -f data/lib/modules/${KVNAME}/build
dpkg-deb --build data ${DST_DEB}
${LINUX_TOOLS_DEB} ${HDR_DEB}: ${DST_DEB}
${DST_DEB}: ${BUILD_DIR}.prepared
cd ${BUILD_DIR}; dpkg-buildpackage --jobs=auto -b -uc -us
lintian ${DST_DEB}
LINUX_TOOLS_DH_LIST=strip installchangelogs installdocs compress shlibdeps gencontrol md5sums builddeb
${LINUX_TOOLS_DEB}: .compile_mark control.tools changelog.Debian copyright
rm -rf linux-tools ${LINUX_TOOLS_DEB}
mkdir -p linux-tools/debian
cp control.tools linux-tools/debian/control
echo 9 > linux-tools/debian/compat
cp changelog.Debian linux-tools/debian/changelog
cp copyright linux-tools/debian
mkdir -p linux-tools/debian/linux-tools-4.10/usr/bin
install -m 0755 ${KERNEL_SRC}/tools/perf/perf linux-tools/debian/linux-tools-4.10/usr/bin/perf_4.10
cd linux-tools; for i in ${LINUX_TOOLS_DH_LIST}; do dh_$$i; done
#lintian ${HDR_DEB}
lintian ${LINUX_TOOLS_DEB}
fwlist-${KVNAME}: data
./find-firmware.pl data/lib/modules/${KVNAME} >fwlist.tmp
mv fwlist.tmp $@
.PHONY: fwcheck
fwcheck: fwlist-${KVNAME} fwlist-previous
@echo "checking fwlist for changes since last built firmware package.."
@echo "if this check fails, add fwlist-${KVNAME} to the pve-firmware repository and upload a new firmware package together with the ${KVNAME} kernel"
sort fwlist-previous | uniq > fwlist-previous.sorted
sort fwlist-${KVNAME} | uniq > fwlist-${KVNAME}.sorted
diff -up -N fwlist-previous.sorted fwlist-${KVNAME}.sorted > fwlist.diff
rm fwlist.diff fwlist-previous.sorted fwlist-${KVNAME}.sorted
@echo "done, no need to rebuild pve-firmware"
abi-${KVNAME}: .compile_mark
sed -e 's/^\(.\+\)[[:space:]]\+\(.\+\)[[:space:]]\(.\+\)$$/\3 \2 \1/' ${KERNEL_SRC}/Module.symvers | sort > abi-${KVNAME}
.PHONY: abicheck
abicheck: abi-${KVNAME} abi-previous abi-blacklist
./abi-check abi-${KVNAME} abi-previous ${SKIPABI}
data: .compile_mark igb.ko ixgbe.ko e1000e.ko ${SPL_MODULES} ${ZFS_MODULES}
rm -rf data tmp; mkdir -p tmp/lib/modules/${KVNAME}
mkdir tmp/boot
install -m 644 ${KERNEL_SRC}/.config tmp/boot/config-${KVNAME}
install -m 644 ${KERNEL_SRC}/System.map tmp/boot/System.map-${KVNAME}
install -m 644 ${KERNEL_SRC}/arch/${KERNEL_ARCH}/boot/bzImage tmp/boot/vmlinuz-${KVNAME}
cd ${KERNEL_SRC}; make INSTALL_MOD_PATH=../tmp/ modules_install
## install latest ibg driver
install -m 644 igb.ko tmp/lib/modules/${KVNAME}/kernel/drivers/net/ethernet/intel/igb/
# install latest ixgbe driver
install -m 644 ixgbe.ko tmp/lib/modules/${KVNAME}/kernel/drivers/net/ethernet/intel/ixgbe/
# install latest e1000e driver
install -m 644 e1000e.ko tmp/lib/modules/${KVNAME}/kernel/drivers/net/ethernet/intel/e1000e/
# install zfs drivers
install -d -m 0755 tmp/lib/modules/${KVNAME}/zfs
install -m 644 ${SPL_MODULES} ${ZFS_MODULES} tmp/lib/modules/${KVNAME}/zfs
# remove firmware
rm -rf tmp/lib/firmware
# strip debug info
find tmp/lib/modules -name \*.ko -print | while read f ; do strip --strip-debug "$$f"; done
# finalize
/sbin/depmod -b tmp/ ${KVNAME}
# Autogenerate blacklist for watchdog devices (see README)
install -m 0755 -d tmp/lib/modprobe.d
ls tmp/lib/modules/${KVNAME}/kernel/drivers/watchdog/ > watchdog-blacklist.tmp
echo ipmi_watchdog.ko >> watchdog-blacklist.tmp
cat watchdog-blacklist.tmp|sed -e 's/^/blacklist /' -e 's/.ko$$//'|sort -u > tmp/lib/modprobe.d/blacklist_${PACKAGE}.conf
mv tmp data
PVE_CONFIG_OPTS= \
-m INTEL_MEI_WDT \
-d CONFIG_SND_PCM_OSS \
-e CONFIG_TRANSPARENT_HUGEPAGE_MADVISE \
-d CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS \
-m CONFIG_CEPH_FS \
-m CONFIG_BLK_DEV_NBD \
-m CONFIG_BLK_DEV_RBD \
-m CONFIG_BCACHE \
-m CONFIG_JFS_FS \
-m CONFIG_HFS_FS \
-m CONFIG_HFSPLUS_FS \
-e CONFIG_BRIDGE \
-e CONFIG_BRIDGE_NETFILTER \
-e CONFIG_BLK_DEV_SD \
-e CONFIG_BLK_DEV_SR \
-e CONFIG_BLK_DEV_DM \
-e CONFIG_BLK_DEV_NVME \
-d CONFIG_INPUT_EVBUG \
-d CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND \
-e CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE \
-d CONFIG_MODULE_SIG \
-d CONFIG_MEMCG_DISABLED \
-e CONFIG_MEMCG_SWAP_ENABLED \
-e CONFIG_MEMCG_KMEM \
-d CONFIG_DEFAULT_CFQ \
-e CONFIG_DEFAULT_DEADLINE \
-e CONFIG_MODVERSIONS \
-d CONFIG_DEFAULT_SECURITY_DAC \
-e CONFIG_DEFAULT_SECURITY_APPARMOR \
--set-str CONFIG_DEFAULT_SECURITY apparmor
.compile_mark: ${KERNEL_SRC}/README ${KERNEL_CFG_ORG}
[ ! -e /lib/modules/${KVNAME}/build ] || (echo "please remove /lib/modules/${KVNAME}/build" && false)
cp ${KERNEL_CFG_ORG} ${KERNEL_SRC}/.config
cd ${KERNEL_SRC}; ./scripts/config ${PVE_CONFIG_OPTS}
cd ${KERNEL_SRC}; make oldconfig
cd ${KERNEL_SRC}; make KBUILD_BUILD_VERSION_TIMESTAMP="PVE ${KERNEL_VER}-${PKGREL} (${CHANGELOG_DATE})" -j ${NPROCS}
make -C ${KERNEL_SRC}/tools/perf prefix=/usr HAVE_CPLUS_DEMANGLE=1 NO_LIBPYTHON=1 NO_LIBPERL=1 NO_LIBCRYPTO=1 PYTHON=python2.7
make -C ${KERNEL_SRC}/tools/perf man
${BUILD_DIR}.prepared: $(addsuffix .prepared,${KERNEL_SRC} ${MODULES} debian)
cp -a fwlist-previous ${BUILD_DIR}/
cp -a abi-prev-* ${BUILD_DIR}/
cp -a abi-blacklist ${BUILD_DIR}/
touch $@
${KERNEL_CFG_ORG}: ${KERNEL_SRC}/README
${KERNEL_SRC}/README: ${KERNEL_SRC_SUBMODULE} | submodule
rm -rf ${KERNEL_SRC}
cp -a ${KERNEL_SRC_SUBMODULE} ${KERNEL_SRC}
cat ${KERNEL_SRC}/debian.master/config/config.common.ubuntu ${KERNEL_SRC}/debian.master/config/${ARCH}/config.common.${ARCH} ${KERNEL_SRC}/debian.master/config/${ARCH}/config.flavour.generic > ${KERNEL_CFG_ORG}
cd ${KERNEL_SRC}; patch -p1 < ../uname-version-timestamp.patch
cd ${KERNEL_SRC}; patch -p1 <../bridge-patch.diff
#cd ${KERNEL_SRC}; patch -p1 <../bridge-forward-ipv6-neighbor-solicitation.patch
#cd ${KERNEL_SRC}; patch -p1 <../add-empty-ndo_poll_controller-to-veth.patch
cd ${KERNEL_SRC}; patch -p1 <../override_for_missing_acs_capabilities.patch
#cd ${KERNEL_SRC}; patch -p1 <../vhost-net-extend-device-allocation-to-vmalloc.patch
cd ${KERNEL_SRC}; patch -p1 < ../kvm-dynamic-halt-polling-disable-default.patch
cd ${KERNEL_SRC}; patch -p1 < ../cgroup-cpuset-add-cpuset.remap_cpus.patch
cd ${KERNEL_SRC}; patch -p1 < ../0001-netfilter-nft_set_rbtree-handle-re-addition-element-.patch # DoS from within (unpriv) containers
cd ${KERNEL_SRC}; patch -p1 < ../0001-tcp-reset-sk_rx_dst-in-tcp_disconnect.patch
cd ${KERNEL_SRC}; patch -p1 < ../0001-Revert-net-reduce-skb_warn_bad_offload-noise.patch
sed -i ${KERNEL_SRC}/Makefile -e 's/^EXTRAVERSION.*$$/EXTRAVERSION=${EXTRAVERSION}/'
debian.prepared: debian
rm -rf ${BUILD_DIR}/debian
mkdir -p ${BUILD_DIR}
cp -a debian ${BUILD_DIR}/debian
echo "git clone git://git.proxmox.com/git/pve-kernel.git\\ngit checkout ${GITVERSION}" > ${BUILD_DIR}/debian/SOURCE
@$(foreach dir, ${DIRS},echo "${dir}=${${dir}}" >> ${BUILD_DIR}/debian/rules.d/env.mk;)
echo "KVNAME=${KVNAME}" >> ${BUILD_DIR}/debian/rules.d/env.mk
echo "KERNEL_MAJMIN=${KERNEL_MAJMIN}" >> ${BUILD_DIR}/debian/rules.d/env.mk
cd ${BUILD_DIR}; debian/rules debian/control
touch $@
e1000e.ko e1000e: .compile_mark ${E1000ESRC}
rm -rf ${E1000EDIR}
tar xf ${E1000ESRC}
[ ! -e /lib/modules/${KVNAME}/build ] || (echo "please remove /lib/modules/${KVNAME}/build" && false)
cd ${E1000EDIR}; patch -p1 < ../intel-module-gcc6-compat.patch
cd ${E1000EDIR}; patch -p1 < ../e1000e_4.10_compat.patch
cd ${E1000EDIR}; patch -p1 < ../e1000e_4.10_max-mtu.patch
cd ${E1000EDIR}/src; make BUILD_KERNEL=${KVNAME} KSRC=${TOP}/${KERNEL_SRC}
cp ${E1000EDIR}/src/e1000e.ko e1000e.ko
${KERNEL_SRC}.prepared: ${KERNEL_SRC_SUBMODULE} | submodule
rm -rf ${BUILD_DIR}/${KERNEL_SRC} $@
mkdir -p ${BUILD_DIR}
cp -a ${KERNEL_SRC_SUBMODULE} ${BUILD_DIR}/${KERNEL_SRC}
# TODO: split for archs, track and diff in our repository?
cat ${BUILD_DIR}/${KERNEL_SRC}/debian.master/config/config.common.ubuntu ${BUILD_DIR}/${KERNEL_SRC}/debian.master/config/${ARCH}/config.common.${ARCH} ${BUILD_DIR}/${KERNEL_SRC}/debian.master/config/${ARCH}/config.flavour.generic > ${KERNEL_CFG_ORG}
cp ${KERNEL_CFG_ORG} ${BUILD_DIR}/${KERNEL_SRC}/.config
sed -i ${BUILD_DIR}/${KERNEL_SRC}/Makefile -e 's/^EXTRAVERSION.*$$/EXTRAVERSION=${EXTRAVERSION}/'
rm -rf ${BUILD_DIR}/${KERNEL_SRC}/debian ${BUILD_DIR}/${KERNEL_SRC}/debian.master
cd ${BUILD_DIR}/${KERNEL_SRC}; for patch in ../../patches/kernel/*.patch; do patch -p1 < $${patch}; done
touch $@
igb.ko igb: .compile_mark ${IGBSRC}
rm -rf ${IGBDIR}
tar xf ${IGBSRC}
[ ! -e /lib/modules/${KVNAME}/build ] || (echo "please remove /lib/modules/${KVNAME}/build" && false)
cd ${IGBDIR}; patch -p1 < ../intel-module-gcc6-compat.patch
cd ${IGBDIR}; patch -p1 < ../igb_4.9_compat.patch
cd ${IGBDIR}; patch -p1 < ../igb_4.10_compat.patch
cd ${IGBDIR}; patch -p1 < ../igb_4.10_max-mtu.patch
cd ${IGBDIR}/src; make BUILD_KERNEL=${KVNAME} KSRC=${TOP}/${KERNEL_SRC}
cp ${IGBDIR}/src/igb.ko igb.ko
${MODULES}.prepared: $(addsuffix .prepared,${MODULE_DIRS})
touch $@
ixgbe.ko ixgbe: .compile_mark ${IXGBESRC}
rm -rf ${IXGBEDIR}
tar xf ${IXGBESRC}
[ ! -e /lib/modules/${KVNAME}/build ] || (echo "please remove /lib/modules/${KVNAME}/build" && false)
cd ${IXGBEDIR}; patch -p1 < ../ixgbe_4.10_compat.patch
cd ${IXGBEDIR}; patch -p1 < ../ixgbe_4.10_max-mtu.patch
cd ${IXGBEDIR}/src; make CFLAGS_EXTRA="-DIXGBE_NO_LRO" BUILD_KERNEL=${KVNAME} KSRC=${TOP}/${KERNEL_SRC}
cp ${IXGBEDIR}/src/ixgbe.ko ixgbe.ko
${E1000EDIR}.prepared: ${E1000ESRC}
rm -rf ${BUILD_DIR}/${MODULES}/${E1000EDIR} $@
mkdir -p ${BUILD_DIR}/${MODULES}/${E1000EDIR}
tar --strip-components=1 -C ${BUILD_DIR}/${MODULES}/${E1000EDIR} -xf ${E1000ESRC}
cd ${BUILD_DIR}/${MODULES}/${E1000EDIR}; patch -p1 < ../../../patches/intel/intel-module-gcc6-compat.patch
cd ${BUILD_DIR}/${MODULES}/${E1000EDIR}; patch -p1 < ../../../patches/intel/e1000e/e1000e_4.10_max-mtu.patch
touch $@
$(SPL_KO_REST): $(SPL_KO)
$(SPL_KO): .compile_mark ${SPLSRC}
rm -rf ${SPLDIR}
rsync -ra ${SPLSRC}/ ${SPLDIR}
[ ! -e /lib/modules/${KVNAME}/build ] || (echo "please remove /lib/modules/${KVNAME}/build" && false)
cd ${SPLDIR}; ./autogen.sh
cd ${SPLDIR}; ./configure --with-config=kernel --with-linux=${TOP}/${KERNEL_SRC} --with-linux-obj=${TOP}/${KERNEL_SRC}
cd ${SPLDIR}; make
cp ${SPLDIR}/module/spl/spl.ko spl.ko
cp ${SPLDIR}/module/splat/splat.ko splat.ko
${IGBDIR}.prepared: ${IGBSRC}
rm -rf ${BUILD_DIR}/${MODULES}/${IGBDIR} $@
mkdir -p ${BUILD_DIR}/${MODULES}/${IGBDIR}
tar --strip-components=1 -C ${BUILD_DIR}/${MODULES}/${IGBDIR} -xf ${IGBSRC}
cd ${BUILD_DIR}/${MODULES}/${IGBDIR}; patch -p1 < ../../../patches/intel/igb/igb_4.10_max-mtu.patch
cd ${BUILD_DIR}/${MODULES}/${IGBDIR}; patch -p1 < ../../../patches/intel/igb/igb_4.12_compat.patch
touch $@
$(ZFS_KO_REST): $(ZFS_KO)
$(ZFS_KO): .compile_mark ${ZFSSRC}
rm -rf ${ZFSDIR}
rsync -ra ${ZFSSRC}/ ${ZFSDIR}
[ ! -e /lib/modules/${KVNAME}/build ] || (echo "please remove /lib/modules/${KVNAME}/build" && false)
cd ${ZFSDIR}; ./autogen.sh
cd ${ZFSDIR}; ./configure --with-spl=${TOP}/${SPLDIR} --with-spl-obj=${TOP}/${SPLDIR} --with-config=kernel --with-linux=${TOP}/${KERNEL_SRC} --with-linux-obj=${TOP}/${KERNEL_SRC}
cd ${ZFSDIR}; make
cp ${ZFSDIR}/module/zfs/zfs.ko zfs.ko
cp ${ZFSDIR}/module/avl/zavl.ko zavl.ko
cp ${ZFSDIR}/module/nvpair/znvpair.ko znvpair.ko
cp ${ZFSDIR}/module/unicode/zunicode.ko zunicode.ko
cp ${ZFSDIR}/module/zcommon/zcommon.ko zcommon.ko
cp ${ZFSDIR}/module/zpios/zpios.ko zpios.ko
cp ${ZFSDIR}/module/icp/icp.ko icp.ko
${IXGBEDIR}.prepared: ${IXGBESRC}
rm -rf ${BUILD_DIR}/${MODULES}/${IXGBEDIR} $@
mkdir -p ${BUILD_DIR}/${MODULES}/${IXGBEDIR}
tar --strip-components=1 -C ${BUILD_DIR}/${MODULES}/${IXGBEDIR} -xf ${IXGBESRC}
touch $@
headers_tmp := $(CURDIR)/tmp-headers
headers_dir := $(headers_tmp)/usr/src/linux-headers-${KVNAME}
$(SPLDIR).prepared: ${SPLSRC}
rm -rf ${BUILD_DIR}/${MODULES}/${SPLDIR} $@
mkdir -p ${BUILD_DIR}/${MODULES}/${SPLDIR}
cp -a ${SPLSRC}/* ${BUILD_DIR}/${MODULES}/${SPLDIR}
cd ${BUILD_DIR}/${MODULES}/${SPLDIR}; for patch in ../../../${SPLSRC}/../spl-patches/*.patch; do patch -p1 < $${patch}; done
touch $@
hdr: $(HDR_DEB)
${HDR_DEB}: .compile_mark headers-control.in headers-postinst.in
rm -rf $(headers_tmp)
install -d $(headers_tmp)/DEBIAN $(headers_dir)/include/
sed -e 's/@KERNEL_VER@/${KERNEL_VER}/' -e 's/@KVNAME@/${KVNAME}/' -e 's/@PKGREL@/${PKGREL}/' -e 's/@ARCH@/${ARCH}/' <headers-control.in >$(headers_tmp)/DEBIAN/control
sed -e 's/@@KVNAME@@/${KVNAME}/g' <headers-postinst.in >$(headers_tmp)/DEBIAN/postinst
chmod 0755 $(headers_tmp)/DEBIAN/postinst
install -D -m 644 copyright $(headers_tmp)/usr/share/doc/${HDRPACKAGE}/copyright
install -D -m 644 changelog.Debian $(headers_tmp)/usr/share/doc/${HDRPACKAGE}/changelog.Debian
echo "git clone git://git.proxmox.com/git/pve-kernel.git\\ngit checkout ${GITVERSION}" > $(headers_tmp)/usr/share/doc/${HDRPACKAGE}/SOURCE
gzip -n -f --best $(headers_tmp)/usr/share/doc/${HDRPACKAGE}/changelog.Debian
install -m 0644 ${KERNEL_SRC}/.config $(headers_dir)
install -m 0644 ${KERNEL_SRC}/Module.symvers $(headers_dir)
cd ${KERNEL_SRC}; find . -path './debian/*' -prune -o -path './include/*' -prune -o -path './Documentation' -prune \
-o -path './scripts' -prune -o -type f \
\( -name 'Makefile*' -o -name 'Kconfig*' -o -name 'Kbuild*' -o \
-name '*.sh' -o -name '*.pl' \) \
-print | cpio -pd --preserve-modification-time $(headers_dir)
cd ${KERNEL_SRC}; cp -a include scripts $(headers_dir)
cd ${KERNEL_SRC}; (find arch/${KERNEL_ARCH} -name include -type d -print | \
xargs -n1 -i: find : -type f) | \
cpio -pd --preserve-modification-time $(headers_dir)
mkdir -p ${headers_tmp}/lib/modules/${KVNAME}
ln -sf /usr/src/linux-headers-${KVNAME} ${headers_tmp}/lib/modules/${KVNAME}/build
dpkg-deb --build $(headers_tmp) ${HDR_DEB}
#lintian ${HDR_DEB}
$(ZFSDIR).prepared: ${ZFSSRC}
rm -rf ${BUILD_DIR}/${MODULES}/${ZFSDIR} $@
mkdir -p ${BUILD_DIR}/${MODULES}/${ZFSDIR}
cp -a ${ZFSSRC}/* ${BUILD_DIR}/${MODULES}/${ZFSDIR}
cd ${BUILD_DIR}/${MODULES}/${ZFSDIR}; for patch in ../../../${ZFSSRC}/../zfs-patches/*.patch; do patch -p1 < $${patch}; done
# temporarily since patch does not know about permissions, remove after 0.7.7 was merged properly
chmod +x ${BUILD_DIR}/${MODULES}/${ZFSDIR}/scripts/enum-extract.pl
touch $@
.PHONY: upload
upload: ${DEBS}
tar cf - ${DEBS}|ssh repoman@repo.proxmox.com -- upload --product pve --dist stretch --arch ${ARCH}
tar cf - ${DEBS}|ssh repoman@repo.proxmox.com -- upload --product pve,pmg --dist stretch --arch ${ARCH}
.PHONY: distclean
distclean: clean
rm -rf linux-firmware.git dvb-firmware.git ${KERNEL_SRC}.org
git submodule deinit --all
# upgrade to current master
.PHONY: update_modules
update_modules: submodule
git submodule foreach 'git pull --ff-only origin master'
cd ${ZFSSRC}; git pull --ff-only origin master
cd ${SPLSRC}; git pull --ff-only origin master
# make sure submodules were initialized
.PHONY: submodule
submodule:
test -f "${KERNEL_SRC_SUBMODULE}/README" || git submodule update --init
test -f "${ZFSSRC}/debian/changelog" || git submodule update --init
test -f "${SPLSRC}/debian/changelog" || git submodule update --init
test -f "${KERNEL_SRC_SUBMODULE}/README" || git submodule update --init ${KERNEL_SRC_SUBMODULE}
test -f "${ZFSONLINUX_SUBMODULE}/Makefile" || git submodule update --init ${ZFSONLINUX_SUBMODULE}
(test -f "${ZFSSRC}/debian/changelog" && test -f "${SPLZRC}/debian/changelog") || (cd ${ZFSONLINUX_SUBMODULE}; git submodule update --init)
# call after ABI bump with header deb in working directory
.PHONY: abiupdate
abiupdate: abi-prev-${KVNAME}
abi-prev-${KVNAME}: abi-tmp-${KVNAME}
ifneq ($(strip $(shell git status --untracked-files=no --porcelain -z)),)
@echo "working directory unclean, aborting!"
@false
else
git rm "abi-prev-*"
mv $< $@
git add $@
git commit -s -m "update ABI file for ${KVNAME}" -m "(generated with debian/scripts/abi-generate)"
@echo "update abi-prev-${KVNAME} committed!"
endif
abi-tmp-${KVNAME}:
@ test -e ${HDR_DEB} || (echo "need ${HDR_DEB} to extract ABI data!" && false)
debian/scripts/abi-generate ${HDR_DEB} $@ ${KVNAME} 1
.PHONY: clean
clean:
rm -rf *~ .compile_mark watchdog-blacklist.tmp ${KERNEL_CFG_ORG} ${KERNEL_SRC} ${KERNEL_SRC}.tmp ${KERNEL_CFG_ORG} ${KERNEL_SRC}.org orig tmp data proxmox-ve/data *.deb ${headers_tmp} fwdata fwlist.tmp *.ko abi-${KVNAME} fwlist-${KVNAME} ${ZFSDIR} ${SPLDIR} ${SPL_MODULES} ${ZFS_MODULES} hpsa.ko ${HPSADIR} ${DRBDDIR} drbd-9.0 ${IGBDIR} igb.ko ${IXGBEDIR} ixgbe.ko ${E1000EDIR} e1000e.ko linux-tools ${LINUX_TOOLS_DEB}
rm -rf *~ build *.prepared ${KERNEL_CFG_ORG}
rm -f *.deb *.changes *.buildinfo
+45 -50
View File
@@ -3,7 +3,7 @@ KERNEL SOURCE:
We currently use the Ubuntu kernel sources, available from:
http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/
http://kernel.ubuntu.com/git/ubuntu/ubuntu-artful.git/
Ubuntu will maintain those kernels till:
@@ -17,12 +17,7 @@ Additional/Updated Modules:
- include latest ixgbe driver from intel/sourceforge
- include latest igb driver from intel/sourceforge
# Note: hpsa does not compile with kernel 3.19.8
#- include latest HPSA driver (HP Smart Array)
#
# * http://sourceforge.net/projects/cciss/
- include latest igb driver from intel/sourceforge
- include native OpenZFS filesystem kernel modules for Linux
@@ -30,43 +25,35 @@ Additional/Updated Modules:
For licensing questions, see: http://open-zfs.org/wiki/Talk:FAQ
- include latest DRBD 9 driver, see http://drbd.linbit.com/home/what-is-drbd/
RELATED PACKAGES:
=================
proxmox-ve
----------
top level meta package, depends on current default kernel series meta package.
git clone git://git.proxmox.com/git/proxmox-ve.git
pve-kernel-meta
---------------
depends on latest kernel and header package within a certain kernel series,
e.g., pve-kernel-4.13 / pve-headers-4.13
git clone git://git.proxmox.com/git/pve-kernel-meta.git
pve-firmware
------------
contains the firmware for all released PVE kernels.
git clone git://git.proxmox.com/git/pve-firmware.git
FIRMWARE:
=========
We create our own firmware package, which includes the firmware for
all proxmox-ve kernels. So far this include
pve-kernel-2.6.18
pve-kernel-2.6.24
pve-kernel-2.6.32
pve-kernel-3.10.0
pve-kernel-3.19.0
We use 'find-firmware.pl' to extract lists of required firmeware
files. The script 'assemble-firmware.pl' is used to read those lists
and copy the files from various source directory into a target
directory.
We do not include firmeware for some wireless HW when there is a
separate debian package for that, for example:
zd1211-firmware
atmel-firmware
bluez-firmware
PATCHES:
--------
bridge-patch.diff: Avoid bridge problems with changing MAC
see also: http://forum.openvz.org/index.php?t=msg&th=5291
Behaviour after 2.6.27 has changed slighly - after setting mac address
of bridge device, then address won't change. So we could omit
that patch, requiring to set hwaddress in /etc/network/interfaces.
NOTES:
======
Watchdog blacklist
------------------
@@ -80,9 +67,15 @@ Additional information
----------------------
We use the default configuration provided by Ubuntu, and apply
the following modification:
the following modifications:
see Makefile (PVE_CONFIG_OPTS)
see debian/rules (PVE_CONFIG_OPTS)
- enable INTEL_MEI_WDT=m (to allow disabling via patch)
- disable CONFIG_SND_PCM_OSS (enabled by default in Ubuntu, not needed)
- switch CONFIG_TRANSPARENT_HUGEPAGE to MADVISE from ALWAYS
- enable CONFIG_CEPH_FS=m (request from user)
@@ -106,8 +99,8 @@ see Makefile (PVE_CONFIG_OPTS)
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
- disable module signatures (CONFIG_MODULE_SIG)
- enable IBM JFS file system
- enable IBM JFS file system
This is disabled in RHEL kernel for no real reason, so we enable
it as requested by users (bug #64)
@@ -127,7 +120,7 @@ see Makefile (PVE_CONFIG_OPTS)
- enable CONFIG_DEFAULT_SECURITY_APPARMOR
We need this for lxc
- set CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
because if not set, it can give some dynamic memory or cpu frequencies
@@ -145,8 +138,10 @@ see Makefile (PVE_CONFIG_OPTS)
Module evbug is not blacklisted on debian, so we simply disable it
to avoid key-event logs (which is a big security problem)
Testing final kernel with kvm
-----------------------------
- enable CONFIG_MODVERSIONS (needed for ABI tracking)
kvm -kernel data/boot/vmlinuz-3.19.8-1-pve -initrd initrd.img-3.19.8-1-pve -append "vga=791 video=vesafb:ywrap,mtrr" /dev/zero
- switch default UNWINDER to FRAME_POINTER
the recently introduced ORC_UNWINDER is not 100% stable yet, especially in combination with ZFS
- enable CONFIG_PAGE_TABLE_ISOLATION (Meltdown mitigation)
+22208
View File
File diff suppressed because it is too large Load Diff
-21375
View File
File diff suppressed because it is too large Load Diff
-14
View File
@@ -1,14 +0,0 @@
--- linux-2.6-3.10.0/net/bridge/br_stp_if.c.orig 2013-11-26 22:20:20.000000000 +0100
+++ linux-2.6-3.10.0/net/bridge/br_stp_if.c 2013-12-17 08:42:10.004428223 +0100
@@ -228,10 +228,7 @@
return false;
list_for_each_entry(p, &br->port_list, list) {
- if (addr == br_mac_zero ||
- memcmp(p->dev->dev_addr, addr, ETH_ALEN) < 0)
- addr = p->dev->dev_addr;
-
+ addr = p->dev->dev_addr;
}
if (ether_addr_equal(br->bridge_id.addr, addr))
-137
View File
@@ -1,137 +0,0 @@
commit 8974189222159154c55f24ddad33e3613960521a
Author: Peter Zijlstra <peterz@infradead.org>
Date: Thu Jun 16 10:50:40 2016 +0200
sched/fair: Fix cfs_rq avg tracking underflow
As per commit:
b7fa30c9cc48 ("sched/fair: Fix post_init_entity_util_avg() serialization")
> the code generated from update_cfs_rq_load_avg():
>
> if (atomic_long_read(&cfs_rq->removed_load_avg)) {
> s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
> sa->load_avg = max_t(long, sa->load_avg - r, 0);
> sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
> removed_load = 1;
> }
>
> turns into:
>
> ffffffff81087064: 49 8b 85 98 00 00 00 mov 0x98(%r13),%rax
> ffffffff8108706b: 48 85 c0 test %rax,%rax
> ffffffff8108706e: 74 40 je ffffffff810870b0 <update_blocked_averages+0xc0>
> ffffffff81087070: 4c 89 f8 mov %r15,%rax
> ffffffff81087073: 49 87 85 98 00 00 00 xchg %rax,0x98(%r13)
> ffffffff8108707a: 49 29 45 70 sub %rax,0x70(%r13)
> ffffffff8108707e: 4c 89 f9 mov %r15,%rcx
> ffffffff81087081: bb 01 00 00 00 mov $0x1,%ebx
> ffffffff81087086: 49 83 7d 70 00 cmpq $0x0,0x70(%r13)
> ffffffff8108708b: 49 0f 49 4d 70 cmovns 0x70(%r13),%rcx
>
> Which you'll note ends up with sa->load_avg -= r in memory at
> ffffffff8108707a.
So I _should_ have looked at other unserialized users of ->load_avg,
but alas. Luckily nikbor reported a similar /0 from task_h_load() which
instantly triggered recollection of this here problem.
Aside from the intermediate value hitting memory and causing problems,
there's another problem: the underflow detection relies on the signed
bit. This reduces the effective width of the variables, IOW its
effectively the same as having these variables be of signed type.
This patch changes to a different means of unsigned underflow
detection to not rely on the signed bit. This allows the variables to
use the 'full' unsigned range. And it does so with explicit LOAD -
STORE to ensure any intermediate value will never be visible in
memory, allowing these unserialized loads.
Note: GCC generates crap code for this, might warrant a look later.
Note2: I say 'full' above, if we end up at U*_MAX we'll still explode;
maybe we should do clamping on add too.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yuyang Du <yuyang.du@intel.com>
Cc: bsegall@google.com
Cc: kernel@kyup.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: steve.muckle@linaro.org
Fixes: 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")
Link: http://lkml.kernel.org/r/20160617091948.GJ30927@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/sched/fair.c | 33 +++++++++++++++++++++++++--------
1 file changed, 25 insertions(+), 8 deletions(-)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2682,6 +2682,23 @@ static inline void update_tg_load_avg(st
static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
+/*
+ * Unsigned subtract and clamp on underflow.
+ *
+ * Explicitly do a load-store to ensure the intermediate value never hits
+ * memory. This allows lockless observations without ever seeing the negative
+ * values.
+ */
+#define sub_positive(_ptr, _val) do { \
+ typeof(_ptr) ptr = (_ptr); \
+ typeof(*ptr) val = (_val); \
+ typeof(*ptr) res, var = READ_ONCE(*ptr); \
+ res = var - val; \
+ if (res > var) \
+ res = 0; \
+ WRITE_ONCE(*ptr, res); \
+} while (0)
+
/* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */
static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
{
@@ -2690,15 +2707,15 @@ static inline int update_cfs_rq_load_avg
if (atomic_long_read(&cfs_rq->removed_load_avg)) {
s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
- sa->load_avg = max_t(long, sa->load_avg - r, 0);
- sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
+ sub_positive(&sa->load_avg, r);
+ sub_positive(&sa->load_sum, r * LOAD_AVG_MAX);
removed = 1;
}
if (atomic_long_read(&cfs_rq->removed_util_avg)) {
long r = atomic_long_xchg(&cfs_rq->removed_util_avg, 0);
- sa->util_avg = max_t(long, sa->util_avg - r, 0);
- sa->util_sum = max_t(s32, sa->util_sum - r * LOAD_AVG_MAX, 0);
+ sub_positive(&sa->util_avg, r);
+ sub_positive(&sa->util_sum, r * LOAD_AVG_MAX);
}
decayed = __update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
@@ -2764,10 +2781,10 @@ static void detach_entity_load_avg(struc
&se->avg, se->on_rq * scale_load_down(se->load.weight),
cfs_rq->curr == se, NULL);
- cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg - se->avg.load_avg, 0);
- cfs_rq->avg.load_sum = max_t(s64, cfs_rq->avg.load_sum - se->avg.load_sum, 0);
- cfs_rq->avg.util_avg = max_t(long, cfs_rq->avg.util_avg - se->avg.util_avg, 0);
- cfs_rq->avg.util_sum = max_t(s32, cfs_rq->avg.util_sum - se->avg.util_sum, 0);
+ sub_positive(&cfs_rq->avg.load_avg, se->avg.load_avg);
+ sub_positive(&cfs_rq->avg.load_sum, se->avg.load_sum);
+ sub_positive(&cfs_rq->avg.util_avg, se->avg.util_avg);
+ sub_positive(&cfs_rq->avg.util_sum, se->avg.util_sum);
}
/* Add the load generated by se into cfs_rq's load average */
-178
View File
@@ -1,178 +0,0 @@
From 40d641241a5399afc93d4eb75d8794f72fe3c0fb Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Wed, 21 Dec 2016 15:37:20 +0100
Subject: [PATCH] cgroup, cpuset: add cpuset.remap_cpus
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Changes a cpuset, recursively remapping all its descendants
to the new range.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
include/linux/cpumask.h | 17 ++++++++++++++++
kernel/cpuset.c | 54 +++++++++++++++++++++++++++++++++++++++----------
2 files changed, 60 insertions(+), 11 deletions(-)
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 59915ea..f5487c8 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -514,6 +514,23 @@ static inline void cpumask_copy(struct cpumask *dstp,
}
/**
+ * cpumask_remap - *dstp = map(old, new)(*srcp)
+ * @dstp: the result
+ * @srcp: the input cpumask
+ * @oldp: the old mask
+ * @newp: the new mask
+ */
+static inline void cpumask_remap(struct cpumask *dstp,
+ const struct cpumask *srcp,
+ const struct cpumask *oldp,
+ const struct cpumask *newp)
+{
+ bitmap_remap(cpumask_bits(dstp), cpumask_bits(srcp),
+ cpumask_bits(oldp), cpumask_bits(newp),
+ nr_cpumask_bits);
+}
+
+/**
* cpumask_any - pick a "random" cpu from *srcp
* @srcp: the input cpumask
*
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 7e78cfe..ff5ff3a 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -462,7 +462,8 @@ static void free_trial_cpuset(struct cpuset *trial)
* Return 0 if valid, -errno if not.
*/
-static int validate_change(struct cpuset *cur, struct cpuset *trial)
+static int validate_change(struct cpuset *cur, struct cpuset *trial,
+ int remap)
{
struct cgroup_subsys_state *css;
struct cpuset *c, *par;
@@ -470,11 +471,13 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
rcu_read_lock();
- /* Each of our child cpusets must be a subset of us */
- ret = -EBUSY;
- cpuset_for_each_child(c, css, cur)
- if (!is_cpuset_subset(c, trial))
- goto out;
+ if (!remap) {
+ /* Each of our child cpusets must be a subset of us */
+ ret = -EBUSY;
+ cpuset_for_each_child(c, css, cur)
+ if (!is_cpuset_subset(c, trial))
+ goto out;
+ }
/* Remaining checks don't apply to root cpuset */
ret = 0;
@@ -937,11 +940,15 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
* @cs: the cpuset to consider
* @trialcs: trial cpuset
* @buf: buffer of cpu numbers written to this cpuset
+ * @remap: recursively remap all child nodes
*/
static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
- const char *buf)
+ const char *buf, int remap)
{
int retval;
+ struct cpuset *cp;
+ struct cgroup_subsys_state *pos_css;
+ struct cpumask tempmask;
/* top_cpuset.cpus_allowed tracks cpu_online_mask; it's read-only */
if (cs == &top_cpuset)
@@ -969,11 +976,25 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
if (cpumask_equal(cs->cpus_allowed, trialcs->cpus_allowed))
return 0;
- retval = validate_change(cs, trialcs);
+ retval = validate_change(cs, trialcs, remap);
if (retval < 0)
return retval;
spin_lock_irq(&callback_lock);
+ if (remap) {
+ rcu_read_lock();
+ cpuset_for_each_descendant_pre(cp, pos_css, cs) {
+ /* skip empty subtrees */
+ if (cpumask_empty(cp->cpus_allowed)) {
+ pos_css = css_rightmost_descendant(pos_css);
+ continue;
+ }
+ cpumask_copy(&tempmask, cp->cpus_allowed);
+ cpumask_remap(cp->cpus_allowed, &tempmask,
+ cs->cpus_allowed, trialcs->cpus_allowed);
+ }
+ rcu_read_unlock();
+ }
cpumask_copy(cs->cpus_allowed, trialcs->cpus_allowed);
spin_unlock_irq(&callback_lock);
@@ -1250,7 +1271,7 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
retval = 0; /* Too easy - nothing to do */
goto done;
}
- retval = validate_change(cs, trialcs);
+ retval = validate_change(cs, trialcs, 0);
if (retval < 0)
goto done;
@@ -1337,7 +1358,7 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
else
clear_bit(bit, &trialcs->flags);
- err = validate_change(cs, trialcs);
+ err = validate_change(cs, trialcs, 0);
if (err < 0)
goto out;
@@ -1596,6 +1617,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
typedef enum {
FILE_MEMORY_MIGRATE,
FILE_CPULIST,
+ FILE_REMAP_CPULIST,
FILE_MEMLIST,
FILE_EFFECTIVE_CPULIST,
FILE_EFFECTIVE_MEMLIST,
@@ -1728,7 +1750,10 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
switch (of_cft(of)->private) {
case FILE_CPULIST:
- retval = update_cpumask(cs, trialcs, buf);
+ retval = update_cpumask(cs, trialcs, buf, 0);
+ break;
+ case FILE_REMAP_CPULIST:
+ retval = update_cpumask(cs, trialcs, buf, 1);
break;
case FILE_MEMLIST:
retval = update_nodemask(cs, trialcs, buf);
@@ -1845,6 +1870,13 @@ static struct cftype files[] = {
},
{
+ .name = "remap_cpus",
+ .write = cpuset_write_resmask,
+ .max_write_len = (100U + 6 * NR_CPUS),
+ .private = FILE_REMAP_CPULIST,
+ },
+
+ {
.name = "mems",
.seq_show = cpuset_common_seq_show,
.write = cpuset_write_resmask,
--
2.1.4
-11
View File
@@ -1,11 +0,0 @@
Package: pve-kernel-@KVNAME@
Version: @KERNEL_VER@-@PKGREL@
Section: admin
Priority: optional
Architecture: @ARCH@
Provides: linux-image, linux-image-2.6
Suggests: pve-firmware
Depends: grub-pc | grub-efi-amd64 | grub-efi-ia32 | grub-efi-arm64, initramfs-tools, busybox
Maintainer: Proxmox Support Team <support@proxmox.com>
Description: The Proxmox PVE Kernel Image
This package contains the linux kernel and initial ramdisk used for booting
-12
View File
@@ -1,12 +0,0 @@
Source: pve-kernel
Maintainer: Proxmox Support Team <support@proxmox.com>
Package: linux-tools-4.10
Architecture: any
Section: devel
Priority: optional
Depends: ${misc:Depends}, ${shlibs:Depends}, linux-base
Description: Linux kernel version specific tools for version 4.10
This package provides the architecture dependent parts for kernel
version locked tools (such as perf and x86_energy_perf_policy)
+256 -8
View File
@@ -1,18 +1,266 @@
pve-kernel (4.10.17-25) unstable; urgency=medium
pve-kernel (4.13.16-51) unstable; urgency=medium
* update ZFS/SPL to 0.7.3
* update sources to Ubuntu-4.13.0-45.50
* bump ABI to 4.10.17-5-pve
* bump ABI to 4.13.16-4-pve
-- Proxmox Support Team <support@proxmox.com> Mon, 6 Nov 2017 12:37:43 +0100
-- Proxmox Support Team <support@proxmox.com> Thu, 05 Jul 2018 10:25:38 +0200
pve-kernel (4.10.17-24) unstable; urgency=medium
pve-kernel (4.13.16-50) unstable; urgency=medium
* update to Ubuntu-4.10.0-37.41
* fix KVM L1 guest escape when nested virtualization is used - CVE-
2018-12904
* bump ABI to 4.10.17-4-pve
-- Proxmox Support Team <support@proxmox.com> Wed, 27 Jun 2018 18:02:52 +0200
-- Proxmox Support Team <support@proxmox.com> Tue, 10 Oct 2017 14:12:14 +0200
pve-kernel (4.13.16-49) unstable; urgency=medium
* update to Ubuntu-4.13.0-43.48
* update ZFS to 0.7.9-pve1
-- Proxmox Support Team <support@proxmox.com> Tue, 22 May 2018 15:10:48 +0200
pve-kernel (4.13.16-48) unstable; urgency=medium
* update ZFS to 0.7.8-pve1
* update to Ubuntu-4.13.0-40.45
-- Proxmox Support Team <support@proxmox.com> Fri, 04 May 2018 11:00:32 +0200
pve-kernel (4.13.16-47) unstable; urgency=medium
* update to Ubuntu-4.13.0-39.44
* bump ABI to 4.13.16-2-pve
* update ZFS to 0.7.7-pve2
-- Proxmox Support Team <support@proxmox.com> Mon, 9 Apr 2018 09:58:12 +0200
pve-kernel (4.13.16-46) unstable; urgency=medium
* update ZFS/SPL to 0.7.7
-- Proxmox Support Team <support@proxmox.com> Wed, 4 Apr 2018 10:30:30 +0200
pve-kernel (4.13.16-45) unstable; urgency=medium
* cherry-pick fix for shmem related deadlock
-- Proxmox Support Team <support@proxmox.com> Wed, 28 Mar 2018 15:47:11 +0200
pve-kernel (4.13.16-44) unstable; urgency=medium
* cherry-pick fix for THP related deadlock
-- Proxmox Support Team <support@proxmox.com> Wed, 28 Mar 2018 10:36:55 +0200
pve-kernel (4.13.16-43) unstable; urgency=medium
* update to Ubuntu-4.13.0-38.43
* bump ABI to 4.13.16-1-pve
-- Proxmox Support Team <support@proxmox.com> Fri, 16 Mar 2018 19:41:43 +0100
pve-kernel (4.13.13-42) unstable; urgency=medium
* split out pve-kernel-meta and pve-kernel
* switch to dpkg-buildpackage
* update README
* update source to Ubuntu-4.13.0-36.40
* add cherry-pick for NFS in network namespaces
* add cherry-picks for OCFS2 bug
-- Proxmox Support Team <support@proxmox.com> Fri, 9 Mar 2018 11:55:18 +0100
pve-kernel (4.13.13-41) unstable; urgency=medium
* fix refcnt leaks with net namespaces
* update ZFS/SPL to 0.7.6
-- Proxmox Support Team <support@proxmox.com> Wed, 21 Feb 2018 10:07:54 +0100
pve-kernel (4.13.13-40) unstable; urgency=medium
* warn when loading non-RETPOLINE modules
-- Proxmox Support Team <support@proxmox.com> Fri, 16 Feb 2018 09:51:20 +0100
pve-kernel (4.13.13-39) unstable; urgency=medium
* update to Ubuntu-4.13.0-35.39
* bump Abi to 4.13.13-6.pve
* fix EDAC issues
* fix scsi lpfc HBA issue
* cherry-pick sched/wait bug fix
* enable full RETPOLINE support
-- Proxmox Support Team <support@proxmox.com> Wed, 14 Feb 2018 12:14:44 +0100
pve-kernel (4.13.13-38) unstable; urgency=medium
* fix syscall retpoline
-- Proxmox Support Team <support@proxmox.com> Fri, 26 Jan 2018 10:47:09 +0100
pve-kernel (4.13.13-37) unstable; urgency=medium
* update ZFS to 0.7.4 + ARC hit rate cherry-pick
* add tc fixes
* fix 1622: i40e memory leak
-- Proxmox Support Team <support@proxmox.com> Fri, 19 Jan 2018 12:29:37 +0100
pve-kernel (4.13.13-36) unstable; urgency=medium
* cherry-pick (partial) SPECTRE fixes for CPUs supporting IBRS/IBPB
* follow-up fixes for KPTI
-- Proxmox Support Team <support@proxmox.com> Mon, 15 Jan 2018 12:36:49 +0100
pve-kernel (4.13.13-35) unstable; urgency=medium
* KPTI: disable on AMD
* KPTI: add follow-up fixes
* update Spectre KVM PoC fix for AMD
-- Proxmox Support Team <support@proxmox.com> Mon, 8 Jan 2018 10:26:58 +0100
pve-kernel (4.13.13-34) unstable; urgency=medium
* cherry-pick / backport of KPTI / Meltdown fixes
(from Ubuntu-4.13.0-23.25)
* add Google Spectre PoC fix for KVM
* fix objtool build regression
-- Proxmox Support Team <support@proxmox.com> Sun, 7 Jan 2018 13:19:58 +0100
pve-kernel (4.13.13-33) unstable; urgency=medium
* update to Ubuntu-4.13.0-22.25
* fix #1537: cherry-pick AMD NPT / IOMMU fix
-- Proxmox Support Team <support@proxmox.com> Tue, 2 Jan 2018 10:03:58 +0100
pve-kernel (4.13.13-32) unstable; urgency=medium
* update to Ubuntu-4.13.0-21.24
* bump ABI to 4.13.13-2-pve
-- Proxmox Support Team <support@proxmox.com> Thu, 21 Dec 2017 09:02:14 +0100
pve-kernel (4.13.13-31) unstable; urgency=medium
* update to Ubuntu-4.13.0-19.22
* bump ABI to 4.13.13-1-pve
-- Proxmox Support Team <support@proxmox.com> Mon, 11 Dec 2017 10:00:13 +0100
pve-kernel (4.13.8-30) unstable; urgency=medium
* revert igb to 5.3.5.10 (5.3.5.12 broke JUMBO_FRAMES)
-- Proxmox Support Team <support@proxmox.com> Tue, 5 Dec 2017 13:06:48 +0100
pve-kernel (4.13.8-29) unstable; urgency=medium
* cherry-pick KVM fix for old CPUs
* cherry-pick / backport IB fixes
* cherry-pick vhost perf regression and mem-leak fix
* cherry-pick final Windows BSOD fix
* bump version to 4.13-29, bump ABI to 4.13.8-3-pve
-- Proxmox Support Team <support@proxmox.com> Mon, 4 Dec 2017 09:15:03 +0100
pve-kernel (4.13.8-28) unstable; urgency=medium
* revert MMU changeset which caused bluescreens in Windows VMs
* bump ABI to 4.13.8-2-pve
-- Proxmox Support Team <support@proxmox.com> Wed, 29 Nov 2017 09:49:35 +0100
pve-kernel (4.13.8-27) unstable; urgency=medium
* update to Ubuntu-4.13.0-17.20
* bump ABI to 4.13.8-1-pve
* update e1000e to 3.3.6
* update igb to 5.3.5.12
* update ixgbe to 5.3.3
-- Proxmox Support Team <support@proxmox.com> Fri, 17 Nov 2017 11:41:10 +0100
pve-kernel (4.13.4-26) unstable; urgency=medium
* update to ZFS/SPL 0.7.3
-- Proxmox Support Team <support@proxmox.com> Mon, 6 Nov 2017 11:23:55 +0100
pve-kernel (4.13.4-25) unstable; urgency=medium
* update to Ubuntu-4.13.0-16.19
* update to ZFS/SPL 0.7.2
* fix CVE-2017-12188: nested KVM stack overflow
* bump ABI to 4.13.4-1-pve
-- Proxmox Support Team <support@proxmox.com> Fri, 13 Oct 2017 08:59:53 +0200
pve-kernel (4.13.3-2) unstable; urgency=medium
* update to Ubuntu-4.13.0-12.13
* bump ABI to 4.13.3-1-pve
-- Proxmox Support Team <support@proxmox.com> Wed, 27 Sep 2017 14:01:40 +0200
pve-kernel (4.13.1-1) unstable; urgency=medium
* update to Ubuntu-4.13.0-11.12
* bump ABI to 4.13.1-1-pve
* update ACS override patch for 4.12+
* drop cpuset remap patch, add cpuset cherry-picks instead
* update e1000e/igb/ixgbe to current upstream versions
-- Proxmox Support Team <support@proxmox.com> Wed, 27 Sep 2017 10:08:41 +0200
pve-kernel (4.10.17-23) unstable; urgency=medium
+1
View File
@@ -0,0 +1 @@
10
+52
View File
@@ -0,0 +1,52 @@
Source: pve-kernel
Section: devel
Priority: optional
Maintainer: Proxmox Support Team <support@proxmox.com>
Build-Depends: asciidoc,
bc,
bison,
flex,
gcc-6 (>= 6.3.0-18+deb9u1),
libiberty-dev,
libssl-dev,
lintian,
sed,
tar,
xmlto,
Build-Conflicts: pve-headers-@KVNAME@,
Vcs-Git: git://git.proxmox.com/git/pve-kernel
Vcs-Browser: https://git.proxmox.com/?p=pve-kernel.git
Package: linux-tools-@KVMAJMIN@
Architecture: any
Section: devel
Priority: optional
Depends: linux-base,
${misc:Depends},
${shlibs:Depends},
Description: Linux kernel version specific tools for version @KVMAJMIN@
This package provides the architecture dependent parts for kernel
version locked tools (such as perf and x86_energy_perf_policy)
Package: pve-headers-@KVNAME@
Section: devel
Priority: optional
Architecture: any
Provides: linux-headers,
linux-headers-2.6,
Depends: coreutils | fileutils (>= 4.0),
Description: The Proxmox PVE Kernel Headers
This package contains the linux kernel headers
Package: pve-kernel-@KVNAME@
Section: admin
Priority: optional
Architecture: any
Provides: linux-image,
linux-image-2.6,
Suggests: pve-firmware,
Depends: busybox,
grub-pc | grub-efi-amd64 | grub-efi-ia32 | grub-efi-arm64,
initramfs-tools,
Description: The Proxmox PVE Kernel Image
This package contains the linux kernel and initial ramdisk used for booting
+1 -2
View File
@@ -26,6 +26,5 @@ The complete text of the GNU General Public License can be found in
`/usr/share/common-licenses/GPL-2'.
ZFS module is licensed under the Common Development and Distribution
ZFS module is licensed under the Common Development and Distribution
License (CDDL).
View File
View File
View File
Vendored Executable
+230
View File
@@ -0,0 +1,230 @@
#!/usr/bin/make -f
# -*- makefile -*-
# Uncomment this to turn on verbose mode.
#export DH_VERBOSE=1
# TODO: check for headers not being installed
BUILD_DIR=$(shell pwd)
include /usr/share/dpkg/default.mk
include debian/rules.d/env.mk
include debian/rules.d/${DEB_BUILD_ARCH}.mk
CHANGELOG_DATE:=$(shell dpkg-parsechangelog -SDate)
PVE_KERNEL_PKG=pve-kernel-${KVNAME}
PVE_HEADER_PKG=pve-headers-${KVNAME}
LINUX_TOOLS_PKG=linux-tools-${KERNEL_MAJMIN}
# TODO: split for archs, move to files?
PVE_CONFIG_OPTS= \
-m INTEL_MEI_WDT \
-d CONFIG_SND_PCM_OSS \
-e CONFIG_TRANSPARENT_HUGEPAGE_MADVISE \
-d CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS \
-m CONFIG_CEPH_FS \
-m CONFIG_BLK_DEV_NBD \
-m CONFIG_BLK_DEV_RBD \
-m CONFIG_BCACHE \
-m CONFIG_JFS_FS \
-m CONFIG_HFS_FS \
-m CONFIG_HFSPLUS_FS \
-e CONFIG_BRIDGE \
-e CONFIG_BRIDGE_NETFILTER \
-e CONFIG_BLK_DEV_SD \
-e CONFIG_BLK_DEV_SR \
-e CONFIG_BLK_DEV_DM \
-e CONFIG_BLK_DEV_NVME \
-d CONFIG_INPUT_EVBUG \
-d CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND \
-e CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE \
-d CONFIG_MODULE_SIG \
-d CONFIG_MEMCG_DISABLED \
-e CONFIG_MEMCG_SWAP_ENABLED \
-e CONFIG_MEMCG_KMEM \
-d CONFIG_DEFAULT_CFQ \
-e CONFIG_DEFAULT_DEADLINE \
-e CONFIG_MODVERSIONS \
-d CONFIG_DEFAULT_SECURITY_DAC \
-e CONFIG_DEFAULT_SECURITY_APPARMOR \
--set-str CONFIG_DEFAULT_SECURITY apparmor \
-d CONFIG_UNWINDER_ORC \
-d CONFIG_UNWINDER_GUESS \
-e CONFIG_UNWINDER_FRAME_POINTER \
-e CONFIG_PAGE_TABLE_ISOLATION
debian/control: $(wildcard debian/*.in)
sed -e 's/@@KVNAME@@/${KVNAME}/g' < debian/pve-kernel.prerm.in > debian/${PVE_KERNEL_PKG}.prerm
sed -e 's/@@KVNAME@@/${KVNAME}/g' < debian/pve-kernel.postrm.in > debian/${PVE_KERNEL_PKG}.postrm
sed -e 's/@@KVNAME@@/${KVNAME}/g' < debian/pve-kernel.postinst.in > debian/${PVE_KERNEL_PKG}.postinst
sed -e 's/@@KVNAME@@/${KVNAME}/g' < debian/pve-headers.postinst.in > debian/${PVE_HEADER_PKG}.postinst
chmod +x debian/${PVE_KERNEL_PKG}.prerm
chmod +x debian/${PVE_KERNEL_PKG}.postrm
chmod +x debian/${PVE_KERNEL_PKG}.postinst
chmod +x debian/${PVE_HEADER_PKG}.postinst
sed -e 's/@KVNAME@/${KVNAME}/g' -e 's/@KVMAJMIN@/${KERNEL_MAJMIN}/g' < debian/control.in > debian/control
build: .compile_mark .tools_compile_mark .modules_compile_mark
install: .install_mark .tools_install_mark .headers_install_mark
dh_installdocs -A debian/copyright debian/SOURCE
dh_installchangelogs
dh_installman
dh_strip_nondeterminism
dh_compress
dh_fixperms
binary: install
debian/rules fwcheck abicheck
dh_strip -N${PVE_HEADER_PKG}
dh_makeshlibs
dh_shlibdeps
dh_installdeb
dh_gencontrol
dh_md5sums
dh_builddeb
.compile_mark: ${KERNEL_SRC}/.config
cd ${KERNEL_SRC}; scripts/config ${PVE_CONFIG_OPTS}
${MAKE} -C ${KERNEL_SRC} oldconfig
${MAKE} -C ${KERNEL_SRC} KBUILD_BUILD_VERSION_TIMESTAMP="PVE ${DEB_VERSION} (${CHANGELOG_DATE})"
touch $@
.install_mark: .compile_mark .modules_compile_mark
rm -rf debian/${PVE_KERNEL_PKG}
mkdir -p debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}
mkdir debian/${PVE_KERNEL_PKG}/boot
install -m 644 ${KERNEL_SRC}/.config debian/${PVE_KERNEL_PKG}/boot/config-${KVNAME}
install -m 644 ${KERNEL_SRC}/System.map debian/${PVE_KERNEL_PKG}/boot/System.map-${KVNAME}
install -m 644 ${KERNEL_SRC}/${KERNEL_IMAGE_PATH} debian/${PVE_KERNEL_PKG}/boot/${KERNEL_INSTALL_FILE}-${KVNAME}
${MAKE} -C ${KERNEL_SRC} INSTALL_MOD_PATH=${BUILD_DIR}/debian/${PVE_KERNEL_PKG}/ modules_install
## install latest ibg driver
install -m 644 ${MODULES}/igb.ko debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}/kernel/drivers/net/ethernet/intel/igb/
# install latest ixgbe driver
install -m 644 ${MODULES}/ixgbe.ko debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}/kernel/drivers/net/ethernet/intel/ixgbe/
# install latest e1000e driver
install -m 644 ${MODULES}/e1000e.ko debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}/kernel/drivers/net/ethernet/intel/e1000e/
# install zfs drivers
install -d -m 0755 debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}/zfs
install -m 644 $(addprefix ${MODULES}/,spl.ko splat.ko zfs.ko zavl.ko znvpair.ko zunicode.ko zcommon.ko zpios.ko icp.ko) debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}/zfs
# remove firmware
rm -rf debian/${PVE_KERNEL_PKG}/lib/firmware
# strip debug info
find debian/${PVE_KERNEL_PKG}/lib/modules -name \*.ko -print | while read f ; do strip --strip-debug "$$f"; done
# finalize
/sbin/depmod -b debian/${PVE_KERNEL_PKG}/ ${KVNAME}
# Autogenerate blacklist for watchdog devices (see README)
install -m 0755 -d debian/${PVE_KERNEL_PKG}/lib/modprobe.d
ls debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}/kernel/drivers/watchdog/ > watchdog-blacklist.tmp
echo ipmi_watchdog.ko >> watchdog-blacklist.tmp
cat watchdog-blacklist.tmp|sed -e 's/^/blacklist /' -e 's/.ko$$//'|sort -u > debian/${PVE_KERNEL_PKG}/lib/modprobe.d/blacklist_${PVE_KERNEL_PKG}.conf
rm -f debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}/source
rm -f debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME}/build
touch $@
.tools_compile_mark: .compile_mark
${MAKE} -C ${KERNEL_SRC}/tools/perf prefix=/usr HAVE_NO_LIBBFD=1 HAVE_CPLUS_DEMANGLE_SUPPORT=1 NO_LIBPYTHON=1 NO_LIBPERL=1 NO_LIBCRYPTO=1 PYTHON=python2.7
echo "checking GPL-2 only perf binary for library linkage with incompatible licenses.."
! ldd ${KERNEL_SRC}/tools/perf/perf | grep -q -E '\blibbfd'
! ldd ${KERNEL_SRC}/tools/perf/perf | grep -q -E '\blibcrypto'
${MAKE} -C ${KERNEL_SRC}/tools/perf man
touch $@
.tools_install_mark: .tools_compile_mark
rm -rf debian/${LINUX_TOOLS_PKG}
mkdir -p debian/${LINUX_TOOLS_PKG}/usr/bin
mkdir -p debian/${LINUX_TOOLS_PKG}/usr/share/man/man1
install -m 755 ${BUILD_DIR}/${KERNEL_SRC}/tools/perf/perf debian/${LINUX_TOOLS_PKG}/usr/bin/perf_$(KERNEL_MAJMIN)
for i in ${BUILD_DIR}/${KERNEL_SRC}/tools/perf/Documentation/*.1; do \
fname="$${i##*/}"; manname="$${fname%.1}"; \
install -m644 "$$i" "debian/${LINUX_TOOLS_PKG}/usr/share/man/man1/$${manname}_$(KERNEL_MAJMIN).1"; \
done
touch $@
.headers_install_mark: .compile_mark .modules_compile_mark
rm -rf debian/${PVE_HEADER_PKG}
mkdir -p debian/${PVE_HEADER_PKG}/usr/src/linux-headers-${KVNAME}
install -m 0644 ${KERNEL_SRC}/.config debian/${PVE_HEADER_PKG}/usr/src/linux-headers-${KVNAME}
install -m 0644 ${KERNEL_SRC}/Module.symvers debian/${PVE_HEADER_PKG}/usr/src/linux-headers-${KVNAME}
cd ${KERNEL_SRC}; find . -path './debian/*' -prune \
-o -path './include/*' -prune \
-o -path './Documentation' -prune \
-o -path './scripts' -prune \
-o -type f \
\( \
-name 'Makefile*' \
-o -name 'Kconfig*' \
-o -name 'Kbuild*' \
-o -name '*.sh' \
-o -name '*.pl' \
\) \
-print | cpio -pd --preserve-modification-time ${BUILD_DIR}/debian/${PVE_HEADER_PKG}/usr/src/linux-headers-${KVNAME}
cd ${KERNEL_SRC}; cp -a include scripts ${BUILD_DIR}/debian/${PVE_HEADER_PKG}/usr/src/linux-headers-${KVNAME}
cd ${KERNEL_SRC}; \
( \
find arch/${KERNEL_HEADER_ARCH} -name include -type d -print | \
xargs -n1 -i: find : -type f \
) | \
cpio -pd --preserve-modification-time ${BUILD_DIR}/debian/${PVE_HEADER_PKG}/usr/src/linux-headers-${KVNAME}
mkdir -p debian/${PVE_HEADER_PKG}/lib/modules/${KVNAME}
ln -sf /usr/src/linux-headers-${KVNAME} debian/${PVE_HEADER_PKG}/lib/modules/${KVNAME}/build
touch $@
.modules_compile_mark: $(addprefix ${MODULES}/,igb.ko ixgbe.ko e1000e.ko spl.ko zfs.ko)
touch $@
${MODULES}/spl.ko: .compile_mark
cd ${MODULES}/${SPLDIR}; ./autogen.sh
cd ${MODULES}/${SPLDIR}; ./configure --with-config=kernel --with-linux=${BUILD_DIR}/${KERNEL_SRC} --with-linux-obj=${BUILD_DIR}/${KERNEL_SRC}
${MAKE} -C ${MODULES}/${SPLDIR}
cp ${MODULES}/${SPLDIR}/module/splat/splat.ko ${MODULES}/
cp ${MODULES}/${SPLDIR}/module/spl/spl.ko ${MODULES}/
${MODULES}/zfs.ko: .compile_mark ${MODULES}/spl.ko
cd ${MODULES}/${ZFSDIR}; ./autogen.sh
cd ${MODULES}/${ZFSDIR}; ./configure --with-spl=${BUILD_DIR}/${MODULES}/${SPLDIR} --with-spl-obj=${BUILD_DIR}/${MODULES}/${SPLDIR} --with-config=kernel --with-linux=${BUILD_DIR}/${KERNEL_SRC} --with-linux-obj=${BUILD_DIR}/${KERNEL_SRC}
${MAKE} -C ${MODULES}/${ZFSDIR}
cp ${MODULES}/${ZFSDIR}/module/avl/zavl.ko ${MODULES}/
cp ${MODULES}/${ZFSDIR}/module/nvpair/znvpair.ko ${MODULES}/
cp ${MODULES}/${ZFSDIR}/module/unicode/zunicode.ko ${MODULES}/
cp ${MODULES}/${ZFSDIR}/module/zcommon/zcommon.ko ${MODULES}/
cp ${MODULES}/${ZFSDIR}/module/zpios/zpios.ko ${MODULES}/
cp ${MODULES}/${ZFSDIR}/module/icp/icp.ko ${MODULES}/
cp ${MODULES}/${ZFSDIR}/module/zfs/zfs.ko ${MODULES}/
${MODULES}/igb.ko: .compile_mark
${MAKE} -C ${MODULES}/${IGBDIR}/src BUILD_KERNEL=${KVNAME} KSRC=${BUILD_DIR}/${KERNEL_SRC}
cp ${MODULES}/${IGBDIR}/src/igb.ko ${MODULES}/
${MODULES}/ixgbe.ko: .compile_mark
${MAKE} -C ${MODULES}/${IXGBEDIR}/src CFLAGS_EXTRA="-DIXGBE_NO_LRO" BUILD_KERNEL=${KVNAME} KSRC=${BUILD_DIR}/${KERNEL_SRC}
cp ${MODULES}/${IXGBEDIR}/src/ixgbe.ko ${MODULES}/
${MODULES}/e1000e.ko: .compile_mark
${MAKE} -C ${MODULES}/${E1000EDIR}/src BUILD_KERNEL=${KVNAME} KSRC=${BUILD_DIR}/${KERNEL_SRC}
cp ${MODULES}/${E1000EDIR}/src/e1000e.ko ${MODULES}/
fwlist-${KVNAME}: .compile_mark .modules_compile_mark
debian/scripts/find-firmware.pl debian/${PVE_KERNEL_PKG}/lib/modules/${KVNAME} >fwlist.tmp
mv fwlist.tmp $@
.PHONY: fwcheck
fwcheck: fwlist-${KVNAME} fwlist-previous
@echo "checking fwlist for changes since last built firmware package.."
@echo "if this check fails, add fwlist-${KVNAME} to the pve-firmware repository and upload a new firmware package together with the ${KVNAME} kernel"
sort fwlist-previous | uniq > fwlist-previous.sorted
sort fwlist-${KVNAME} | uniq > fwlist-${KVNAME}.sorted
diff -up -N fwlist-previous.sorted fwlist-${KVNAME}.sorted > fwlist.diff
rm fwlist.diff fwlist-previous.sorted fwlist-${KVNAME}.sorted
@echo "done, no need to rebuild pve-firmware"
abi-${KVNAME}: .compile_mark
debian/scripts/abi-generate debian/${PVE_HEADER_PKG}/usr/src/linux-headers-${KVNAME}/Module.symvers abi-${KVNAME} ${KVNAME}
.PHONY: abicheck
abicheck: debian/scripts/abi-check abi-${KVNAME} abi-prev-* abi-blacklist
debian/scripts/abi-check abi-${KVNAME} abi-prev-* ${SKIPABI}
.PHONY: clean
+5
View File
@@ -0,0 +1,5 @@
KERNEL_BUILD_ARCH = x86
KERNEL_HEADER_ARCH = $(KERNEL_BUILD_ARCH)
KERNEL_BUILD_IMAGE = bzImage
KERNEL_IMAGE_PATH = arch/$(KERNEL_BUILD_ARCH)/boot/${KERNEL_BUILD_IMAGE}
KERNEL_INSTALL_FILE = vmlinuz
+14 -8
View File
@@ -4,8 +4,14 @@ my $abinew = shift;
my $abiold = shift;
my $skipabi = shift;
# to catch multiple abi-prev-* files being passed in
die "invalid value for skipabi parameter\n"
if (defined($skipabi) && $skipabi !~ /^[01]$/);
$abinew =~ /abi-(.*)/;
my $abinum = $1;
my $abistr = $1;
$abiold =~ /abi-prev-(.*)/;
my $prev_abistr = $1;
my $fail_exit = 1;
my $EE = "EE:";
@@ -23,12 +29,12 @@ if ($skipabi) {
$EE = "WW:";
}
#if ($prev_abinum != $abinum) {
# print "II: Different ABI's, running in no-fail mode\n";
# $fail_exit = 0;
# $EE = "WW:";
#}
#
if ($prev_abistr ne $abistr) {
print "II: Different ABI's, running in no-fail mode\n";
$fail_exit = 0;
$EE = "WW:";
}
if (not -f "$abinew" or not -f "$abiold") {
print "EE: Previous or current ABI file missing!\n";
print " $abinew\n" if not -f "$abinew";
@@ -83,7 +89,7 @@ sub is_ignored($$) {
}
# Read new syms first
print " Reading new symbols ($abinum)...";
print " Reading new symbols ($abistr)...";
$count = 0;
open(NEW, "< $abinew") or
die "Could not open $abinew";
Vendored Executable
+42
View File
@@ -0,0 +1,42 @@
#!/usr/bin/perl -w
use PVE::Tools;
use IO::File;
sub usage {
die "USAGE: $0 INFILE OUTFILE [ABI INFILE-IS-DEB]\n";
}
my $input_file = shift // usage();
my $output_file = shift // usage();
my $abi = shift;
my $extract_deb = shift;
die "input file '$input_file' does not exist\n" if ! -e $input_file;
my $modules_symver_fh;
if ($extract_deb) {
usage() if !defined($abi);
my $cmd = [];
push @$cmd, ['dpkg', '--fsys-tarfile', $input_file];
push @$cmd, ['tar', '-xOf', '-', "./usr/src/linux-headers-${abi}/Module.symvers"];
$modules_symver_fh = IO::File->new_tmpfile();
PVE::Tools::run_command($cmd, output => '>&'.fileno($modules_symver_fh));
seek($modules_symver_fh, 0, 0);
} else {
open($modules_symver_fh, '<', $input_file) or die "can't open '$input_file' - $!\n";
}
my $lines = [];
while(my $line = <$modules_symver_fh>) {
if ($line =~ /^(.+)\s+(.+)\s+(.+)$/) {
push @$lines, "$3 $2 $1";
} else {
warn "malformed symvers line: '$line'\n";
}
}
close($modules_symver_fh);
PVE::Tools::file_set_contents($output_file, join("\n", sort @$lines));
+34
View File
@@ -0,0 +1,34 @@
#!/bin/bash
set -e
top=$(pwd)
if [ "$#" -ne 3 ]; then
echo "USAGE: $0 repo patchdir ref"
echo "\t exports patches from 'repo' to 'patchdir' based on 'ref'"
exit 1
fi
# parameters
kernel_submodule=$1
kernel_patchdir=$2
base_ref=$3
cd "${kernel_submodule}"
echo "clearing old exported patchqueue"
rm -f "${top}/${kernel_patchdir}"/*.patch
echo "exporting patchqueue using 'git format-patch [...] ${base_ref}.."
git format-patch \
--quiet \
--no-numbered \
--no-cover-letter \
--zero-commit \
--no-signature \
--output-dir \
"${top}/${kernel_patchdir}" \
"${base_ref}.."
git checkout ${base_ref}
cd "${top}"
+1 -1
View File
@@ -8,7 +8,7 @@ die "no directory to scan" if !$dir;
die "no such directory" if ! -d $dir;
die "strange directory name" if $dir !~ m|^(.*/)?(4.10.\d+\-\d+\-pve)(/+)?$|;
die "strange directory name" if $dir !~ m|^(.*/)?(4.13.\d+\-\d+\-pve)(/+)?$|;
my $apiver = $2;
+29
View File
@@ -0,0 +1,29 @@
#!/bin/bash
set -e
top=$(pwd)
if [[ "$#" -lt 2 || "$#" -gt 3 ]]; then
echo "USAGE: $0 repo patchdir [branch]"
echo "\t imports patches from 'patchdir' into patchqueue branch 'branch' in 'repo'"
exit 1
fi
# parameters
kernel_submodule=$1
kernel_patchdir=$2
if [[ -z "$3" ]]; then
pq_branch='pq'
else
pq_branch=$3
fi
cd "${kernel_submodule}"
echo "creating patchqeueue branch '${pq_branch}'"
git checkout -b "${pq_branch}"
echo "importing patches from '${kernel_patchdir}'"
git am "${top}/${kernel_patchdir}"/*.patch
cd "${top}"
+119
View File
@@ -0,0 +1,119 @@
#!/bin/bash
set -e
top=$(pwd)
# parameters
kernel_submodule=
kernel_patchdir=
new_tag=
rebase=
# generated based on new_tag
pq_branch=
# previously checked out in submodule
old_ref=
function cleanup_pq_branch {
if [[ -n $pq_branch ]]; then
echo "cleaning up PQ branch '$pq_branch'"
cd "${top}/${kernel_submodule}"
git checkout --quiet $old_ref
git reset --hard
git branch -D "$pq_branch"
fi
}
function error_exit {
echo "$1"
set +e
cleanup_pq_branch
cd "${top}"
exit 1
}
if [[ "$#" -lt 3 || "$#" -gt 4 ]]; then
echo "USAGE: $0 submodule patchdir tag [rebase]"
echo "\t fetches and checks out 'tag' in 'submodule'"
echo "\t if 'rebase' is given, imports, rebases and exports patchqueue from 'patchdir' as well"
exit 1
fi
kernel_submodule=$1
if [ ! -d "${kernel_submodule}" ]; then
error_exit "'${kernel_submodule}' must be a directory!"
fi
kernel_patchdir=$2
if [ ! -d "${kernel_patchdir}" ]; then
error_exit "'${kernel_patchdir}' must be a directory!"
fi
new_tag=$3
rebase=$4
if [[ -n $(git status --untracked-files=no --porcelain) ]]; then
error_exit "working directory unclean, aborting"
fi
cd "${kernel_submodule}"
## check for tag and fetch if needed
echo "checking for tag '${new_tag}'"
if [[ -z $(git tag -l "${new_tag}") ]]; then
echo "tag not found, fetching and retrying"
git fetch --tags
fi
if [[ -z $(git tag -l "${new_tag}") ]]; then
error_exit "tag not found, aborting"
fi
echo "tag found"
cd "${top}"
if [[ -n "$rebase" ]]; then
echo ""
echo "automatic patchqueue rebase enabled"
cd "${kernel_submodule}"
## preparing patch queue branch
old_ref=$(git rev-parse HEAD)
pq_branch="auto_pq/${new_tag}"
cd "${top}"
echo "previous HEAD: ${old_ref}"
echo ""
"${top}/debian/scripts/import-patchqueue" "${kernel_submodule}" "${kernel_patchdir}" "${pq_branch}" || error_exit "failed to import patchqueue"
cd "${kernel_submodule}"
## rebase patches
echo ""
echo "rebasing patchqueue on top of '${new_tag}'"
git rebase "${new_tag}"
cd "${top}"
## regenerate exported patch queue
echo ""
"${top}/debian/scripts/export-patchqueue" "${kernel_submodule}" "${kernel_patchdir}" "${new_tag}" || error_exit "failed to export patchqueue"
cleanup_pq_branch
cd "${top}"
pq_branch=
fi
cd "${kernel_submodule}"
echo ""
echo "checking out '${new_tag}' in submodule"
git checkout --quiet "${new_tag}"
cd "${top}"
echo ""
echo "committing results"
git commit --verbose -s -m "update sources to ${new_tag}" -m "(generated with debian/scripts/import-upstream-tag)" "${kernel_submodule}"
if [[ -n "$rebase" ]]; then
git add "${kernel_patchdir}"
git commit --verbose -s -m "rebase patches on top of ${new_tag}" -m "(generated with debian/scripts/import-upstream-tag)" "${kernel_patchdir}"
fi
Binary file not shown.
Binary file not shown.
-81
View File
@@ -1,81 +0,0 @@
src/{netdev.c.orig => netdev.c} | 18 +++++++++---------
src/{ptp.c.orig => ptp.c} | 4 ++--
2 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/src/netdev.c.orig b/src/netdev.c
index 73b0f9a..480265b 100644
--- a/src/netdev.c.orig
+++ b/src/netdev.c
@@ -4833,24 +4833,24 @@ void e1000e_reinit_locked(struct e1000_adapter *adapter)
/**
* e1000e_sanitize_systim - sanitize raw cycle counter reads
* @hw: pointer to the HW structure
- * @systim: cycle_t value read, sanitized and returned
+ * @systim: u64 value read, sanitized and returned
*
* Errata for 82574/82583 possible bad bits read from SYSTIMH/L:
* check to see that the time is incrementing at a reasonable
* rate and is a multiple of incvalue.
**/
-static cycle_t e1000e_sanitize_systim(struct e1000_hw *hw, cycle_t systim)
+static u64 e1000e_sanitize_systim(struct e1000_hw *hw, u64 systim)
{
u64 time_delta, rem, temp;
- cycle_t systim_next;
+ u64 systim_next;
u32 incvalue;
int i;
incvalue = er32(TIMINCA) & E1000_TIMINCA_INCVALUE_MASK;
for (i = 0; i < E1000_MAX_82574_SYSTIM_REREADS; i++) {
/* latch SYSTIMH on read of SYSTIML */
- systim_next = (cycle_t)er32(SYSTIML);
- systim_next |= (cycle_t)er32(SYSTIMH) << 32;
+ systim_next = (u64)er32(SYSTIML);
+ systim_next |= (u64)er32(SYSTIMH) << 32;
time_delta = systim_next - systim;
temp = time_delta;
@@ -4872,13 +4872,13 @@ static cycle_t e1000e_sanitize_systim(struct e1000_hw *hw, cycle_t systim)
* e1000e_cyclecounter_read - read raw cycle counter (used by time counter)
* @cc: cyclecounter structure
**/
-static cycle_t e1000e_cyclecounter_read(const struct cyclecounter *cc)
+static u64 e1000e_cyclecounter_read(const struct cyclecounter *cc)
{
struct e1000_adapter *adapter = container_of(cc, struct e1000_adapter,
cc);
struct e1000_hw *hw = &adapter->hw;
u32 systimel, systimeh;
- cycle_t systim;
+ u64 systim;
/* SYSTIMH latching upon SYSTIML read does not work well.
* This means that if SYSTIML overflows after we read it but before
* we read SYSTIMH, the value of SYSTIMH has been incremented and we
@@ -4899,8 +4899,8 @@ static cycle_t e1000e_cyclecounter_read(const struct cyclecounter *cc)
systimel = systimel_2;
}
}
- systim = (cycle_t)systimel;
- systim |= (cycle_t)systimeh << 32;
+ systim = (u64)systimel;
+ systim |= (u64)systimeh << 32;
if (adapter->flags2 & FLAG2_CHECK_SYSTIM_OVERFLOW)
systim = e1000e_sanitize_systim(hw, systim);
diff --git a/src/ptp.c.orig b/src/ptp.c
index 00c419f..228adce 100644
--- a/src/ptp.c.orig
+++ b/src/ptp.c
@@ -136,8 +136,8 @@ static int e1000e_phc_get_syncdevicetime(ktime_t * device,
unsigned long flags;
int i;
u32 tsync_ctrl;
- cycle_t dev_cycles;
- cycle_t sys_cycles;
+ u64 dev_cycles;
+ u64 sys_cycles;
tsync_ctrl = er32(TSYNCTXCTL);
tsync_ctrl |= E1000_TSYNCTXCTL_START_SYNC |
+765 -605
View File
File diff suppressed because it is too large Load Diff
-10
View File
@@ -1,10 +0,0 @@
Package: pve-headers-@KVNAME@
Version: @KERNEL_VER@-@PKGREL@
Section: devel
Priority: optional
Architecture: @ARCH@
Provides: linux-headers, linux-headers-2.6
Depends: coreutils | fileutils (>= 4.0)
Maintainer: Proxmox Support Team <support@proxmox.com>
Description: The Proxmox PVE Kernel Headers
This package contains the linux kernel headers
Binary file not shown.
Binary file not shown.
-25
View File
@@ -1,25 +0,0 @@
src/{igb_ptp.c.orig => igb_ptp.c} | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/igb_ptp.c.orig b/src/igb_ptp.c
index 744fa65..f334ac7 100644
--- a/src/igb_ptp.c.orig
+++ b/src/igb_ptp.c
@@ -93,7 +93,7 @@
* SYSTIM read access for the 82576
*/
-static cycle_t igb_ptp_read_82576(const struct cyclecounter *cc)
+static u64 igb_ptp_read_82576(const struct cyclecounter *cc)
{
struct igb_adapter *igb = container_of(cc, struct igb_adapter, cc);
struct e1000_hw *hw = &igb->hw;
@@ -113,7 +113,7 @@ static cycle_t igb_ptp_read_82576(const struct cyclecounter *cc)
* SYSTIM read access for the 82580
*/
-static cycle_t igb_ptp_read_82580(const struct cyclecounter *cc)
+static u64 igb_ptp_read_82580(const struct cyclecounter *cc)
{
struct igb_adapter *igb = container_of(cc, struct igb_adapter, cc);
struct e1000_hw *hw = &igb->hw;
-95
View File
@@ -1,95 +0,0 @@
From 6445198f802d993c73f4b246353b2ceb2dfafc32 Mon Sep 17 00:00:00 2001
From: Ferruh Yigit <ferruh.yigit@intel.com>
Date: Mon, 17 Oct 2016 11:23:14 +0100
Subject: kni: fix build with kernel 4.9
compile error:
CC [M] .../lib/librte_eal/linuxapp/kni/igb_main.o
.../lib/librte_eal/linuxapp/kni/igb_main.c:2317:21:
error: initialization from incompatible pointer type
[-Werror=incompatible-pointer-types]
.ndo_set_vf_vlan = igb_ndo_set_vf_vlan,
^~~~~~~~~~~~~~~~~~~
Linux kernel 4.9 updates API for ndo_set_vf_vlan:
Linux: 79aab093a0b5 ("net: Update API for VF vlan protocol 802.1ad support")
Use new API for Linux kernels >= 4.9
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
src/igb_main.c | 19 +++++++++++++++++++
src/kcompat.h | 4 ++++
2 files changed, 23 insertions(+)
diff --git a/src/igb_main.c b/src/igb_main.c
index 23e2d64..f4dca5a 100644
--- a/src/igb_main.c
+++ b/src/igb_main.c
@@ -195,7 +195,11 @@ static void igb_process_mdd_event(struct igb_adapter *);
#ifdef IFLA_VF_MAX
static int igb_ndo_set_vf_mac( struct net_device *netdev, int vf, u8 *mac);
static int igb_ndo_set_vf_vlan(struct net_device *netdev,
+#ifdef HAVE_VF_VLAN_PROTO
+ int vf, u16 vlan, u8 qos, __be16 vlan_proto);
+#else
int vf, u16 vlan, u8 qos);
+#endif
#ifdef HAVE_VF_SPOOFCHK_CONFIGURE
static int igb_ndo_set_vf_spoofchk(struct net_device *netdev, int vf,
bool setting);
@@ -6412,7 +6416,11 @@ static void igb_set_vmvir(struct igb_adapter *adapter, u32 vid, u32 vf)
}
static int igb_ndo_set_vf_vlan(struct net_device *netdev,
+#ifdef HAVE_VF_VLAN_PROTO
+ int vf, u16 vlan, u8 qos, __be16 vlan_proto)
+#else
int vf, u16 vlan, u8 qos)
+#endif
{
int err = 0;
struct igb_adapter *adapter = netdev_priv(netdev);
@@ -6420,6 +6428,12 @@ static int igb_ndo_set_vf_vlan(struct net_device *netdev,
/* VLAN IDs accepted range 0-4094 */
if ((vf >= adapter->vfs_allocated_count) || (vlan > VLAN_VID_MASK-1) || (qos > 7))
return -EINVAL;
+
+#ifdef HAVE_VF_VLAN_PROTO
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+#endif
+
if (vlan || qos) {
err = igb_vlvf_set(adapter, vlan, !!vlan, vf);
if (err)
@@ -6580,7 +6594,12 @@ static inline void igb_vf_reset(struct igb_adapter *adapter, u32 vf)
if (adapter->vf_data[vf].pf_vlan)
igb_ndo_set_vf_vlan(adapter->netdev, vf,
adapter->vf_data[vf].pf_vlan,
+#ifdef HAVE_VF_VLAN_PROTO
+ adapter->vf_data[vf].pf_qos,
+ htons(ETH_P_8021Q));
+#else
adapter->vf_data[vf].pf_qos);
+#endif
else
igb_clear_vf_vfta(adapter, vf);
#endif
diff --git a/src/kcompat.h b/src/kcompat.h
index 69e0e7a..84826b2 100644
--- a/src/kcompat.h
+++ b/src/kcompat.h
@@ -3929,4 +3929,8 @@ skb_set_hash(struct sk_buff *skb, __u32 hash, __always_unused int type)
#define vlan_tx_tag_present skb_vlan_tag_present
#endif
+#if ( LINUX_VERSION_CODE >= KERNEL_VERSION(4,9,0) )
+#define HAVE_VF_VLAN_PROTO
+#endif /* >= 4.9.0 */
+
#endif /* _KCOMPAT_H_ */
--
cgit v1.0
Binary file not shown.
Binary file not shown.
-25
View File
@@ -1,25 +0,0 @@
src/{ixgbe_ptp.c.orig => ixgbe_ptp.c} | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/ixgbe_ptp.c.orig b/src/ixgbe_ptp.c
index fb832f0..b868c68 100644
--- a/src/ixgbe_ptp.c.orig
+++ b/src/ixgbe_ptp.c
@@ -244,7 +244,7 @@ static void ixgbe_ptp_setup_sdp_X540(struct ixgbe_adapter *adapter)
* result of SYSTIME is 32bits of "billions of cycles" and 32 bits of
* "cycles", rather than seconds and nanoseconds.
*/
-static cycle_t ixgbe_ptp_read_X550(const struct cyclecounter *hw_cc) {
+static u64 ixgbe_ptp_read_X550(const struct cyclecounter *hw_cc) {
struct ixgbe_adapter *adapter =
container_of(hw_cc, struct ixgbe_adapter, hw_cc);
struct ixgbe_hw *hw = &adapter->hw;
@@ -280,7 +280,7 @@ static cycle_t ixgbe_ptp_read_X550(const struct cyclecounter *hw_cc) {
* cyclecounter structure used to construct a ns counter from the
* arbitrary fixed point registers
*/
-static cycle_t ixgbe_ptp_read_82599(const struct cyclecounter *hw_cc)
+static u64 ixgbe_ptp_read_82599(const struct cyclecounter *hw_cc)
{
struct ixgbe_adapter *adapter =
container_of(hw_cc, struct ixgbe_adapter, hw_cc);
-37
View File
@@ -1,37 +0,0 @@
diff --git a/src/ixgbe_main.c b/src/ixgbe_main.c
index 83c6250..fe226cd 100644
--- a/src/ixgbe_main.c
+++ b/src/ixgbe_main.c
@@ -6379,11 +6379,6 @@ static void ixgbe_free_all_rx_resources(struct ixgbe_adapter *adapter)
static int ixgbe_change_mtu(struct net_device *netdev, int new_mtu)
{
struct ixgbe_adapter *adapter = netdev_priv(netdev);
- int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
-
- /* MTU < 68 is an error and causes problems on some kernels */
- if ((new_mtu < 68) || (max_frame > IXGBE_MAX_JUMBO_FRAME_SIZE))
- return -EINVAL;
/*
* For 82599EB we cannot allow legacy VFs to enable their receive
@@ -6392,7 +6387,7 @@ static int ixgbe_change_mtu(struct net_device *netdev, int new_mtu)
*/
if ((adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) &&
(adapter->hw.mac.type == ixgbe_mac_82599EB) &&
- (max_frame > (ETH_FRAME_LEN + ETH_FCS_LEN)))
+ (new_mtu > ETH_DATA_LEN))
e_warn(probe, "Setting MTU > 1500 will disable legacy VFs\n");
e_info(probe, "changing MTU from %d to %d\n", netdev->mtu, new_mtu);
@@ -10134,6 +10129,11 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
#ifdef IFF_SUPP_NOFCS
netdev->priv_flags |= IFF_SUPP_NOFCS;
#endif
+
+ /* MTU range: 68 - 9710 */
+ netdev->min_mtu = ETH_MIN_MTU;
+ netdev->max_mtu = IXGBE_MAX_JUMBO_FRAME_SIZE - (ETH_HLEN + ETH_FCS_LEN);
+
#if IS_ENABLED(CONFIG_DCB)
if (adapter->flags & IXGBE_FLAG_DCB_CAPABLE)
netdev->dcbnl_ops = &dcbnl_ops;
@@ -1,12 +0,0 @@
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
--- a/virt/kvm/kvm_main.c 2016-05-12 10:39:37.540387127 +0200
+++ b/virt/kvm/kvm_main.c 2016-05-04 10:43:38.063996221 +0200
@@ -75,7 +75,7 @@ static unsigned int halt_poll_ns = KVM_H
EXPORT_SYMBOL_GPL(halt_poll_ns);
/* Default doubles per-vcpu halt_poll_ns. */
-unsigned int halt_poll_ns_grow = 2;
+unsigned int halt_poll_ns_grow = 0;
module_param(halt_poll_ns_grow, uint, S_IRUGO | S_IWUSR);
EXPORT_SYMBOL_GPL(halt_poll_ns_grow);
+17
View File
@@ -0,0 +1,17 @@
diff --git a/src/igb_main.c.orig b/src/igb_main.c
index 3ee1ec7..c8adf04 100644
--- a/src/igb_main.c.orig
+++ b/src/igb_main.c
@@ -1047,8 +1047,10 @@ static void igb_set_interrupt_capability(struct igb_adapter *adapter, bool msix)
for (i = 0; i < numvecs; i++)
adapter->msix_entries[i].entry = i;
- err = pci_enable_msix(pdev,
- adapter->msix_entries, numvecs);
+ err = pci_enable_msix_range(pdev,
+ adapter->msix_entries,
+ numvecs,
+ numvecs);
if (err == 0)
break;
}
@@ -1,7 +1,10 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Ben Hutchings <ben@decadent.org.uk>
Subject: Make mkcompile_h accept an alternate timestamp string
Date: Tue, 12 May 2015 19:29:22 +0100
Forwarded: not-needed
Subject: [PATCH] Make mkcompile_h accept an alternate timestamp string
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We want to include the Debian version in the utsname::version string
instead of a full timestamp string. However, we still need to provide
@@ -11,6 +14,13 @@ kernel image reproducible.
Make mkcompile_h use $KBUILD_BUILD_VERSION_TIMESTAMP in preference to
$KBUILD_BUILD_TIMESTAMP.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
scripts/mkcompile_h | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/scripts/mkcompile_h b/scripts/mkcompile_h
index fd8fdb91581d..1e35ac9fc810 100755
--- a/scripts/mkcompile_h
+++ b/scripts/mkcompile_h
@@ -37,10 +37,14 @@ else
@@ -0,0 +1,35 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fabian=20Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>
Date: Thu, 14 Sep 2017 11:02:18 +0200
Subject: [PATCH] bridge: keep MAC of first assigned port
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
original commit message:
Default bridge changes MAC dynamically using smallest MAC of all
connected ports (for no real reason). To avoid problems with ARP
we simply use the MAC of the first connected port.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
net/bridge/br_stp_if.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 89110319ef0f..5e73fff65f47 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -259,10 +259,7 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br)
return false;
list_for_each_entry(p, &br->port_list, list) {
- if (addr == br_mac_zero ||
- memcmp(p->dev->dev_addr, addr, ETH_ALEN) < 0)
- addr = p->dev->dev_addr;
-
+ addr = p->dev->dev_addr;
}
if (ether_addr_equal(br->bridge_id.addr, addr))
@@ -1,13 +1,15 @@
From 866f4c5de45ae13aa590de0d40819a0c38f3f682 Mon Sep 17 00:00:00 2001
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Mark Weiman <mark.weiman@markzz.com>
Date: Sun, 23 Oct 2016 12:57:37 -0400
Subject: [PATCH] pci: Enable overrides for missing ACS capabilities (4.8+)
Date: Sat, 29 Jul 2017 09:15:32 -0400
Subject: [PATCH] pci: Enable overrides for missing ACS capabilities (4.12+)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This an updated version of Alex Williamson's patch from:
https://lkml.org/lkml/2013/5/30/513
Original commit message follows:
---
PCIe ACS (Access Control Services) is the PCIe 2.0+ feature that
allows us to control whether transactions are allowed to be redirected
in various subnodes of a PCIe topology. For instance, if two
@@ -45,17 +47,17 @@ specific devices which enforce isolation but not provide an ACS
capability. Please contact me to have your devices added and save
your customers the hassle of this boot option.
Signed-off-by: Mark Weiman <mark.weiman@markzz.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
Documentation/admin-guide/kernel-parameters.txt | 9 ++++
drivers/pci/quirks.c | 101 ++++++++++++++++++++++++++++++++++++
2 files changed, 110 insertions(+)
Documentation/admin-guide/kernel-parameters.txt | 9 +++
drivers/pci/quirks.c | 102 ++++++++++++++++++++++++
2 files changed, 111 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a4f4d69..d68cfab 100644
index ce24cb1e8f46..0cc1d4200c24 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2928,6 +2928,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
@@ -2938,6 +2938,15 @@
nomsi [MSI] If the PCI_MSI kernel config parameter is
enabled, this kernel boot option can be used to
disable the use of MSI interrupts system-wide.
@@ -72,10 +74,10 @@ index a4f4d69..d68cfab 100644
Safety option to keep boot IRQs enabled. This
should never be necessary.
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 44e0ff3..32016cb 100644
index 9dcd5ed5a05b..8882b8d38d7d 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3487,6 +3487,106 @@ static int __init pci_apply_final_quirks(void)
@@ -3694,6 +3694,107 @@ static int __init pci_apply_final_quirks(void)
fs_initcall_sync(pci_apply_final_quirks);
@@ -179,10 +181,11 @@ index 44e0ff3..32016cb 100644
+
+ return -ENOTTY;
+}
+
/*
* Followings are device-specific reset methods which can be used to
* Following are device-specific reset methods which can be used to
* reset a single function if other methods (e.g. FLR, PM D0->D3) are
@@ -4160,6 +4260,7 @@ static const struct pci_dev_acs_enabled {
@@ -4536,6 +4637,7 @@ static const struct pci_dev_acs_enabled {
{ 0x10df, 0x720, pci_quirk_mf_endpoint_acs }, /* Emulex Skyhawk-R */
/* Cavium ThunderX */
{ PCI_VENDOR_ID_CAVIUM, PCI_ANY_ID, pci_quirk_cavium_acs },
@@ -190,6 +193,3 @@ index 44e0ff3..32016cb 100644
{ 0 }
};
--
2.10.1
@@ -0,0 +1,26 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fabian=20Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>
Date: Thu, 14 Sep 2017 11:09:58 +0200
Subject: [PATCH] kvm: disable default dynamic halt polling growth
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
virt/kvm/kvm_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3b3e54742263..d0085c9d6297 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -77,7 +77,7 @@ module_param(halt_poll_ns, uint, 0644);
EXPORT_SYMBOL_GPL(halt_poll_ns);
/* Default doubles per-vcpu halt_poll_ns. */
-unsigned int halt_poll_ns_grow = 2;
+unsigned int halt_poll_ns_grow = 0;
module_param(halt_poll_ns_grow, uint, 0644);
EXPORT_SYMBOL_GPL(halt_poll_ns_grow);
@@ -0,0 +1,63 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Waiman Long <longman@redhat.com>
Date: Thu, 17 Aug 2017 15:33:09 -0400
Subject: [PATCH] cgroup: Add mount flag to enable cpuset to use v2 behavior in
v1 cgroup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
A new mount option "cpuset_v2_mode" is added to the v1 cgroupfs
filesystem to enable cpuset controller to use v2 behavior in a v1
cgroup. This mount option applies only to cpuset controller and have
no effect on other controllers.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry-picked from e1cba4b85daa71b710384d451ff6238d5e4d1ff6)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
include/linux/cgroup-defs.h | 5 +++++
kernel/cgroup/cgroup-v1.c | 6 ++++++
2 files changed, 11 insertions(+)
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 09f4c7df1478..c344e77707a5 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -74,6 +74,11 @@ enum {
* aren't writeable from inside the namespace.
*/
CGRP_ROOT_NS_DELEGATE = (1 << 3),
+
+ /*
+ * Enable cpuset controller in v1 cgroup to use v2 behavior.
+ */
+ CGRP_ROOT_CPUSET_V2_MODE = (1 << 4),
};
/* cftype->flags */
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 7bf4b1533f34..ce7426b875f5 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -846,6 +846,8 @@ static int cgroup1_show_options(struct seq_file *seq, struct kernfs_root *kf_roo
seq_puts(seq, ",noprefix");
if (root->flags & CGRP_ROOT_XATTR)
seq_puts(seq, ",xattr");
+ if (root->flags & CGRP_ROOT_CPUSET_V2_MODE)
+ seq_puts(seq, ",cpuset_v2_mode");
spin_lock(&release_agent_path_lock);
if (strlen(root->release_agent_path))
@@ -900,6 +902,10 @@ static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
opts->cpuset_clone_children = true;
continue;
}
+ if (!strcmp(token, "cpuset_v2_mode")) {
+ opts->flags |= CGRP_ROOT_CPUSET_V2_MODE;
+ continue;
+ }
if (!strcmp(token, "xattr")) {
opts->flags |= CGRP_ROOT_XATTR;
continue;
@@ -0,0 +1,138 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Waiman Long <longman@redhat.com>
Date: Thu, 17 Aug 2017 15:33:10 -0400
Subject: [PATCH] cpuset: Allow v2 behavior in v1 cgroup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cpuset v2 has some useful behaviors that are not present in v1 because
of backward compatibility concern. One of that is the restoration of
the original cpu and memory node mask after a hot removal and addition
event sequence.
This patch makes the cpuset controller to check the
CGRP_ROOT_CPUSET_V2_MODE flag and use the v2 behavior if it is set.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
(cherry-picked from b8d1b8ee93df8ffbabbeadd65d39853cfad6d698)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
kernel/cgroup/cpuset.c | 33 ++++++++++++++++++++-------------
1 file changed, 20 insertions(+), 13 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e8cb34193433..f76c4bf3d46a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -300,6 +300,16 @@ static DECLARE_WORK(cpuset_hotplug_work, cpuset_hotplug_workfn);
static DECLARE_WAIT_QUEUE_HEAD(cpuset_attach_wq);
/*
+ * Cgroup v2 behavior is used when on default hierarchy or the
+ * cgroup_v2_mode flag is set.
+ */
+static inline bool is_in_v2_mode(void)
+{
+ return cgroup_subsys_on_dfl(cpuset_cgrp_subsys) ||
+ (cpuset_cgrp_subsys.root->flags & CGRP_ROOT_CPUSET_V2_MODE);
+}
+
+/*
* This is ugly, but preserves the userspace API for existing cpuset
* users. If someone tries to mount the "cpuset" filesystem, we
* silently switch it to mount "cgroup" instead
@@ -489,8 +499,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
/* On legacy hiearchy, we must be a subset of our parent cpuset. */
ret = -EACCES;
- if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
- !is_cpuset_subset(trial, par))
+ if (!is_in_v2_mode() && !is_cpuset_subset(trial, par))
goto out;
/*
@@ -896,8 +905,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
* If it becomes empty, inherit the effective mask of the
* parent, which is guaranteed to have some CPUs.
*/
- if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
- cpumask_empty(new_cpus))
+ if (is_in_v2_mode() && cpumask_empty(new_cpus))
cpumask_copy(new_cpus, parent->effective_cpus);
/* Skip the whole subtree if the cpumask remains the same. */
@@ -914,7 +922,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
cpumask_copy(cp->effective_cpus, new_cpus);
spin_unlock_irq(&callback_lock);
- WARN_ON(!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
+ WARN_ON(!is_in_v2_mode() &&
!cpumask_equal(cp->cpus_allowed, cp->effective_cpus));
update_tasks_cpumask(cp);
@@ -1150,8 +1158,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
* If it becomes empty, inherit the effective mask of the
* parent, which is guaranteed to have some MEMs.
*/
- if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
- nodes_empty(*new_mems))
+ if (is_in_v2_mode() && nodes_empty(*new_mems))
*new_mems = parent->effective_mems;
/* Skip the whole subtree if the nodemask remains the same. */
@@ -1168,7 +1175,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
cp->effective_mems = *new_mems;
spin_unlock_irq(&callback_lock);
- WARN_ON(!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
+ WARN_ON(!is_in_v2_mode() &&
!nodes_equal(cp->mems_allowed, cp->effective_mems));
update_tasks_nodemask(cp);
@@ -1460,7 +1467,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
/* allow moving tasks into an empty cpuset if on default hierarchy */
ret = -ENOSPC;
- if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
+ if (!is_in_v2_mode() &&
(cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)))
goto out_unlock;
@@ -1979,7 +1986,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
cpuset_inc();
spin_lock_irq(&callback_lock);
- if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) {
+ if (is_in_v2_mode()) {
cpumask_copy(cs->effective_cpus, parent->effective_cpus);
cs->effective_mems = parent->effective_mems;
}
@@ -2056,7 +2063,7 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css)
mutex_lock(&cpuset_mutex);
spin_lock_irq(&callback_lock);
- if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) {
+ if (is_in_v2_mode()) {
cpumask_copy(top_cpuset.cpus_allowed, cpu_possible_mask);
top_cpuset.mems_allowed = node_possible_map;
} else {
@@ -2250,7 +2257,7 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs)
cpus_updated = !cpumask_equal(&new_cpus, cs->effective_cpus);
mems_updated = !nodes_equal(new_mems, cs->effective_mems);
- if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys))
+ if (is_in_v2_mode())
hotplug_update_tasks(cs, &new_cpus, &new_mems,
cpus_updated, mems_updated);
else
@@ -2288,7 +2295,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
static cpumask_t new_cpus;
static nodemask_t new_mems;
bool cpus_updated, mems_updated;
- bool on_dfl = cgroup_subsys_on_dfl(cpuset_cgrp_subsys);
+ bool on_dfl = is_in_v2_mode();
mutex_lock(&cpuset_mutex);
@@ -0,0 +1,90 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Parav Pandit <parav@mellanox.com>
Date: Fri, 5 Jan 2018 23:51:12 +0100
Subject: [PATCH] IB/core: Avoid crash on pkey enforcement failed in received
MADs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
commit 89548bcafec7ecfeea58c553f0834b5d575a66eb upstream.
Below kernel crash is observed when Pkey security enforcement fails on
received MADs. This issue is reported in [1].
ib_free_recv_mad() accesses the rmpp_list, whose initialization is
needed before accessing it.
When security enformcent fails on received MADs, MAD processing avoided
due to security checks failed.
OpenSM[3770]: SM port is down
kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
kernel: IP: ib_free_recv_mad+0x44/0xa0 [ib_core]
kernel: PGD 0
kernel: P4D 0
kernel:
kernel: Oops: 0002 [#1] SMP
kernel: CPU: 0 PID: 2833 Comm: kworker/0:1H Tainted: P IO 4.13.4-1-pve #1
kernel: Hardware name: Dell XS23-TY3 /9CMP63, BIOS 1.71 09/17/2013
kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
kernel: task: ffffa069c6541600 task.stack: ffffb9a729054000
kernel: RIP: 0010:ib_free_recv_mad+0x44/0xa0 [ib_core]
kernel: RSP: 0018:ffffb9a729057d38 EFLAGS: 00010286
kernel: RAX: ffffa069cb138a48 RBX: ffffa069cb138a10 RCX: 0000000000000000
kernel: RDX: ffffb9a729057d38 RSI: 0000000000000000 RDI: ffffa069cb138a20
kernel: RBP: ffffb9a729057d60 R08: ffffa072d2d49800 R09: ffffa069cb138ae0
kernel: R10: ffffa069cb138ae0 R11: ffffa072b3994e00 R12: ffffb9a729057d38
kernel: R13: ffffa069d1c90000 R14: 0000000000000000 R15: ffffa069d1c90880
kernel: FS: 0000000000000000(0000) GS:ffffa069dba00000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000000008 CR3: 00000011f51f2000 CR4: 00000000000006f0
kernel: Call Trace:
kernel: ib_mad_recv_done+0x5cc/0xb50 [ib_core]
kernel: __ib_process_cq+0x5c/0xb0 [ib_core]
kernel: ib_cq_poll_work+0x20/0x60 [ib_core]
kernel: process_one_work+0x1e9/0x410
kernel: worker_thread+0x4b/0x410
kernel: kthread+0x109/0x140
kernel: ? process_one_work+0x410/0x410
kernel: ? kthread_create_on_node+0x70/0x70
kernel: ? SyS_exit_group+0x14/0x20
kernel: ret_from_fork+0x25/0x30
kernel: RIP: ib_free_recv_mad+0x44/0xa0 [ib_core] RSP: ffffb9a729057d38
kernel: CR2: 0000000000000008
[1] : https://www.spinics.net/lists/linux-rdma/msg56190.html
Fixes: 47a2b338fe63 ("IB/core: Enforce security on management datagrams")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reported-by: Chris Blake <chrisrblake93@gmail.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/infiniband/core/mad.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index f8f53bb90837..cb91245e9163 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1974,14 +1974,15 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
unsigned long flags;
int ret;
+ INIT_LIST_HEAD(&mad_recv_wc->rmpp_list);
ret = ib_mad_enforce_security(mad_agent_priv,
mad_recv_wc->wc->pkey_index);
if (ret) {
ib_free_recv_mad(mad_recv_wc);
deref_mad_agent(mad_agent_priv);
+ return;
}
- INIT_LIST_HEAD(&mad_recv_wc->rmpp_list);
list_add(&mad_recv_wc->recv_buf.list, &mad_recv_wc->rmpp_list);
if (ib_mad_kernel_rmpp_agent(&mad_agent_priv->agent)) {
mad_recv_wc = ib_process_rmpp_recv_wc(mad_agent_priv,
@@ -0,0 +1,44 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Daniel Jurgens <danielj@mellanox.com>
Date: Mon, 20 Nov 2017 16:47:45 -0600
Subject: [PATCH] IB/core: Don't enforce PKey security on SMI MADs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Per the infiniband spec an SMI MAD can have any PKey. Checking the pkey
on SMI MADs is not necessary, and it seems that some older adapters
using the mthca driver don't follow the convention of using the default
PKey, resulting in false denials, or errors querying the PKey cache.
SMI MAD security is still enforced, only agents allowed to manage the
subnet are able to receive or send SMI MADs.
Reported-by: Chris Blake <chrisrblake93@gmail.com>
Fixes: 47a2b338fe63("IB/core: Enforce security on management datagrams")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/infiniband/core/security.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
index 70ad19c4c73e..8f9fd3b757db 100644
--- a/drivers/infiniband/core/security.c
+++ b/drivers/infiniband/core/security.c
@@ -692,8 +692,11 @@ int ib_mad_enforce_security(struct ib_mad_agent_private *map, u16 pkey_index)
{
int ret;
- if (map->agent.qp->qp_type == IB_QPT_SMI && !map->agent.smp_allowed)
- return -EACCES;
+ if (map->agent.qp->qp_type == IB_QPT_SMI) {
+ if (!map->agent.smp_allowed)
+ return -EACCES;
+ return 0;
+ }
ret = ib_security_pkey_access(map->agent.device,
map->agent.port_num,
@@ -0,0 +1,53 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu, 26 Oct 2017 09:13:27 +0200
Subject: [PATCH] KVM: SVM: obey guest PAT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
For many years some users of assigned devices have reported worse
performance on AMD processors with NPT than on AMD without NPT,
Intel or bare metal.
The reason turned out to be that SVM is discarding the guest PAT
setting and uses the default (PA0=PA4=WB, PA1=PA5=WT, PA2=PA6=UC-,
PA3=UC). The guest might be using a different setting, and
especially might want write combining but isn't getting it
(instead getting slow UC or UC- accesses).
Thanks a lot to geoff@hostfission.com for noticing the relation
to the g_pat setting. The patch has been tested also by a bunch
of people on VFIO users forums.
Fixes: 709ddebf81cb40e3c36c6109a7892e8b93a09464
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196409
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit 15038e14724799b8c205beb5f20f9e54896013c3)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/svm.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 068084c8e540..da10db3de636 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3666,6 +3666,13 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
u32 ecx = msr->index;
u64 data = msr->data;
switch (ecx) {
+ case MSR_IA32_CR_PAT:
+ if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
+ return 1;
+ vcpu->arch.pat = data;
+ svm->vmcb->save.g_pat = data;
+ mark_dirty(svm->vmcb, VMCB_NPT);
+ break;
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr);
break;
@@ -0,0 +1,65 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Felix Wilhelm <fwilhelm@google.com>
Date: Mon, 11 Jun 2018 09:43:44 +0200
Subject: [PATCH] kvm: nVMX: Enforce cpl=0 for VMX instructions
VMX instructions executed inside a L1 VM will always trigger a VM exit
even when executed with cpl 3. This means we must perform the
privilege check in software.
Fixes: 70f3aac964ae("kvm: nVMX: Remove superfluous VMX instruction fault checks")
Cc: stable@vger.kernel.org
Signed-off-by: Felix Wilhelm <fwilhelm@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
arch/x86/kvm/vmx.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 54980817194a..b2d75b59b6e5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7180,6 +7180,12 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
return 1;
}
+ /* CPL=0 must be checked manually. */
+ if (vmx_get_cpl(vcpu)) {
+ kvm_queue_exception(vcpu, UD_VECTOR);
+ return 1;
+ }
+
if (vmx->nested.vmxon) {
nested_vmx_failValid(vcpu, VMXERR_VMXON_IN_VMX_ROOT_OPERATION);
return kvm_skip_emulated_instruction(vcpu);
@@ -7239,6 +7245,11 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
*/
static int nested_vmx_check_permission(struct kvm_vcpu *vcpu)
{
+ if (vmx_get_cpl(vcpu)) {
+ kvm_queue_exception(vcpu, UD_VECTOR);
+ return 0;
+ }
+
if (!to_vmx(vcpu)->nested.vmxon) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 0;
@@ -7577,7 +7588,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
if (get_vmx_mem_address(vcpu, exit_qualification,
vmx_instruction_info, true, &gva))
return 1;
- /* _system ok, as hardware has verified cpl=0 */
+ /* _system ok, nested_vmx_check_permission has verified cpl=0 */
kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, gva,
&field_value, (is_long_mode(vcpu) ? 8 : 4), NULL);
}
@@ -7720,7 +7731,7 @@ static int handle_vmptrst(struct kvm_vcpu *vcpu)
if (get_vmx_mem_address(vcpu, exit_qualification,
vmx_instruction_info, true, &vmcs_gva))
return 1;
- /* ok to use *_system, as hardware has verified cpl=0 */
+ /* *_system ok, nested_vmx_check_permission has verified cpl=0 */
if (kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, vmcs_gva,
(void *)&to_vmx(vcpu)->nested.current_vmptr,
sizeof(u64), &e)) {
@@ -0,0 +1,30 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Fri, 19 Jan 2018 11:12:37 +0100
Subject: [PATCH] net: sched: em_nbyte: don't add the data offset twice
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
'ptr' is shifted by the offset and then validated,
the memcmp should not add it a second time.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
net/sched/em_nbyte.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/em_nbyte.c b/net/sched/em_nbyte.c
index df3110d69585..07c10bac06a0 100644
--- a/net/sched/em_nbyte.c
+++ b/net/sched/em_nbyte.c
@@ -51,7 +51,7 @@ static int em_nbyte_match(struct sk_buff *skb, struct tcf_ematch *em,
if (!tcf_valid_offset(skb, ptr, nbyte->hdr.len))
return 0;
- return !memcmp(ptr + nbyte->hdr.off, nbyte->pattern, nbyte->hdr.len);
+ return !memcmp(ptr, nbyte->pattern, nbyte->hdr.len);
}
static struct tcf_ematch_ops em_nbyte_ops = {
@@ -0,0 +1,31 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Fri, 19 Jan 2018 11:12:38 +0100
Subject: [PATCH] net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as
skb->data points to the network header.
Use skb_mac_header instead.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
include/net/pkt_cls.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 537d0a0ad4c4..4450961b1554 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -395,7 +395,7 @@ static inline unsigned char * tcf_get_base_ptr(struct sk_buff *skb, int layer)
{
switch (layer) {
case TCF_LAYER_LINK:
- return skb->data;
+ return skb_mac_header(skb);
case TCF_LAYER_NETWORK:
return skb_network_header(skb);
case TCF_LAYER_TRANSPORT:
@@ -0,0 +1,46 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Honig <ahonig@google.com>
Date: Wed, 10 Jan 2018 10:12:03 -0800
Subject: [PATCH] KVM: x86: Add memory barrier on vmcs field lookup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
commit 75f139aaf896d6fdeec2e468ddfa4b2fe469bf40 upstream.
This adds a memory barrier when performing a lookup into
the vmcs_field_to_offset_table. This is related to
CVE-2017-5753.
Signed-off-by: Andrew Honig <ahonig@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/vmx.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b2d75b59b6e5..a393186d14b1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -883,8 +883,16 @@ static inline short vmcs_field_to_offset(unsigned long field)
{
BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);
- if (field >= ARRAY_SIZE(vmcs_field_to_offset_table) ||
- vmcs_field_to_offset_table[field] == 0)
+ if (field >= ARRAY_SIZE(vmcs_field_to_offset_table))
+ return -ENOENT;
+
+ /*
+ * FIXME: Mitigation for CVE-2017-5753. To be replaced with a
+ * generic mechanism.
+ */
+ asm("lfence");
+
+ if (vmcs_field_to_offset_table[field] == 0)
return -ENOENT;
return vmcs_field_to_offset_table[field];
@@ -0,0 +1,34 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Gustavo A. R. Silva" <garsilva@embeddedor.com>
Date: Mon, 16 Oct 2017 12:40:29 -0500
Subject: [PATCH] EDAC, sb_edac: Fix missing break in switch
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Add missing break statement in order to prevent the code from falling
through.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20171016174029.GA19757@embeddedor.com
Signed-off-by: Borislav Petkov <bp@suse.de>
(cherry picked from commit a8e9b186f153a44690ad0363a56716e7077ad28c)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/edac/sb_edac.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 5c3e707ff3fc..59af590b660c 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -2454,6 +2454,7 @@ static int ibridge_mci_bind_devs(struct mem_ctl_info *mci,
case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA:
case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA:
pvt->pci_ta = pdev;
+ break;
case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_RAS:
case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS:
pvt->pci_ras = pdev;
@@ -0,0 +1,49 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Omar Sandoval <osandov@fb.com>
Date: Tue, 5 Dec 2017 23:15:31 -0800
Subject: [PATCH] sched/wait: Fix add_wait_queue() behavioral change
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The following cleanup commit:
50816c48997a ("sched/wait: Standardize internal naming of wait-queue entries")
... unintentionally changed the behavior of add_wait_queue() from
inserting the wait entry at the head of the wait queue to the tail
of the wait queue.
Beyond a negative performance impact this change in behavior
theoretically also breaks wait queues which mix exclusive and
non-exclusive waiters, as non-exclusive waiters will not be
woken up if they are queued behind enough exclusive waiters.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@fb.com
Fixes: ("sched/wait: Standardize internal naming of wait-queue entries")
Link: http://lkml.kernel.org/r/a16c8ccffd39bd08fdaa45a5192294c784b803a7.1512544324.git.osandov@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit c6b9d9a33029014446bd9ed84c1688f6d3d4eab9)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
kernel/sched/wait.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index d6afed6d0752..c09ebe92a40a 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -27,7 +27,7 @@ void add_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq
wq_entry->flags &= ~WQ_FLAG_EXCLUSIVE;
spin_lock_irqsave(&wq_head->lock, flags);
- __add_wait_queue_entry_tail(wq_head, wq_entry);
+ __add_wait_queue(wq_head, wq_entry);
spin_unlock_irqrestore(&wq_head->lock, flags);
}
EXPORT_SYMBOL(add_wait_queue);
@@ -0,0 +1,161 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andi Kleen <ak@linux.intel.com>
Date: Thu, 25 Jan 2018 15:50:28 -0800
Subject: [PATCH] module/retpoline: Warn about missing retpoline in module
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
There's a risk that a kernel which has full retpoline mitigations becomes
vulnerable when a module gets loaded that hasn't been compiled with the
right compiler or the right option.
To enable detection of that mismatch at module load time, add a module info
string "retpoline" at build time when the module was compiled with
retpoline support. This only covers compiled C source, but assembler source
or prebuilt object files are not checked.
If a retpoline enabled kernel detects a non retpoline protected module at
load time, print a warning and report it in the sysfs vulnerability file.
[ tglx: Massaged changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: jeyu@kernel.org
Cc: arjan@linux.intel.com
Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org
(backported from commit caf7501a1b4ec964190f31f9c3f163de252273b8)
Conflicts:
arch/x86/kernel/cpu/bugs.c
context changes
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/cpu/bugs.c | 18 +++++++++++++++++-
include/linux/module.h | 9 +++++++++
kernel/module.c | 11 +++++++++++
scripts/mod/modpost.c | 9 +++++++++
4 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 7e5db5aa37f3..b5bcdf7e94d7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -11,6 +11,7 @@
#include <linux/utsname.h>
#include <linux/cpu.h>
#include <linux/smp.h>
+#include <linux/module.h>
#include <linux/nospec.h>
#include <linux/prctl.h>
@@ -131,6 +132,19 @@ static const char *spectre_v2_strings[] = {
static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
SPECTRE_V2_NONE;
+static bool spectre_v2_bad_module;
+
+#ifdef RETPOLINE
+bool retpoline_module_ok(bool has_retpoline)
+{
+ if (spectre_v2_enabled == SPECTRE_V2_NONE || has_retpoline)
+ return true;
+
+ pr_err("System may be vunerable to spectre v2\n");
+ spectre_v2_bad_module = true;
+ return false;
+}
+#endif
void
x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
@@ -627,7 +641,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
return sprintf(buf, "Mitigation: OSB (observable speculation barrier, Intel v6)\n");
case X86_BUG_SPECTRE_V2:
- return sprintf(buf, "%s%s\n", spectre_v2_strings[spectre_v2_enabled], ibpb_inuse ? ", IBPB (Intel v4)" : "");
+ return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+ ibpb_inuse ? ",IBPB (Intel v4)" : "",
+ spectre_v2_bad_module ? " - vulnerable module loaded" : "");
case X86_BUG_SPEC_STORE_BYPASS:
return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
diff --git a/include/linux/module.h b/include/linux/module.h
index e7bdd549e527..c4fdf7661f82 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -794,6 +794,15 @@ static inline void module_bug_finalize(const Elf_Ehdr *hdr,
static inline void module_bug_cleanup(struct module *mod) {}
#endif /* CONFIG_GENERIC_BUG */
+#ifdef RETPOLINE
+extern bool retpoline_module_ok(bool has_retpoline);
+#else
+static inline bool retpoline_module_ok(bool has_retpoline)
+{
+ return true;
+}
+#endif
+
#ifdef CONFIG_MODULE_SIG
static inline bool module_sig_ok(struct module *module)
{
diff --git a/kernel/module.c b/kernel/module.c
index 41b97a191a72..1c3fd6f767b4 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2855,6 +2855,15 @@ static int check_modinfo_livepatch(struct module *mod, struct load_info *info)
}
#endif /* CONFIG_LIVEPATCH */
+static void check_modinfo_retpoline(struct module *mod, struct load_info *info)
+{
+ if (retpoline_module_ok(get_modinfo(info, "retpoline")))
+ return;
+
+ pr_warn("%s: loading module not compiled with retpoline compiler.\n",
+ mod->name);
+}
+
/* Sets info->hdr and info->len. */
static int copy_module_from_user(const void __user *umod, unsigned long len,
struct load_info *info)
@@ -3021,6 +3030,8 @@ static int check_modinfo(struct module *mod, struct load_info *info, int flags)
add_taint_module(mod, TAINT_OOT_MODULE, LOCKDEP_STILL_OK);
}
+ check_modinfo_retpoline(mod, info);
+
if (get_modinfo(info, "staging")) {
add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK);
pr_warn("%s: module is from the staging directory, the quality "
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 48397feb08fb..cc91f81ac33e 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -2147,6 +2147,14 @@ static void add_intree_flag(struct buffer *b, int is_intree)
buf_printf(b, "\nMODULE_INFO(intree, \"Y\");\n");
}
+/* Cannot check for assembler */
+static void add_retpoline(struct buffer *b)
+{
+ buf_printf(b, "\n#ifdef RETPOLINE\n");
+ buf_printf(b, "MODULE_INFO(retpoline, \"Y\");\n");
+ buf_printf(b, "#endif\n");
+}
+
static void add_staging_flag(struct buffer *b, const char *name)
{
static const char *staging_dir = "drivers/staging";
@@ -2492,6 +2500,7 @@ int main(int argc, char **argv)
add_header(&buf, mod);
add_intree_flag(&buf, !external_module);
+ add_retpoline(&buf);
add_staging_flag(&buf, mod->name);
err |= add_versions(&buf, mod);
add_depends(&buf, mod, modules);
@@ -0,0 +1,124 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dan Streetman <ddstreet@ieee.org>
Date: Thu, 18 Jan 2018 16:14:26 -0500
Subject: [PATCH] net: tcp: close sock if net namespace is exiting
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.
For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open. However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open. In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence. The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:
unregister_netdevice: waiting for lo to become free. Usage count = 1
These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.
After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <ddstreet@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
include/net/net_namespace.h | 10 ++++++++++
net/ipv4/tcp.c | 3 +++
net/ipv4/tcp_timer.c | 15 +++++++++++++++
3 files changed, 28 insertions(+)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 1c401bd4c2e0..a5d023fa78db 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -221,6 +221,11 @@ int net_eq(const struct net *net1, const struct net *net2)
return net1 == net2;
}
+static inline int check_net(const struct net *net)
+{
+ return atomic_read(&net->count) != 0;
+}
+
void net_drop_ns(void *);
#else
@@ -245,6 +250,11 @@ int net_eq(const struct net *net1, const struct net *net2)
return 1;
}
+static inline int check_net(const struct net *net)
+{
+ return 1;
+}
+
#define net_drop_ns NULL
#endif
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index a3e91b552edc..fd2a086da910 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2258,6 +2258,9 @@ void tcp_close(struct sock *sk, long timeout)
tcp_send_active_reset(sk, GFP_ATOMIC);
__NET_INC_STATS(sock_net(sk),
LINUX_MIB_TCPABORTONMEMORY);
+ } else if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_set_state(sk, TCP_CLOSE);
}
}
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index e906014890b6..ec1e5de41653 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -50,11 +50,19 @@ static void tcp_write_err(struct sock *sk)
* to prevent DoS attacks. It is called when a retransmission timeout
* or zero probe timeout occurs on orphaned socket.
*
+ * Also close if our net namespace is exiting; in that case there is no
+ * hope of ever communicating again since all netns interfaces are already
+ * down (or about to be down), and we need to release our dst references,
+ * which have been moved to the netns loopback interface, so the namespace
+ * can finish exiting. This condition is only possible if we are a kernel
+ * socket, as those do not hold references to the namespace.
+ *
* Criteria is still not confirmed experimentally and may change.
* We kill the socket, if:
* 1. If number of orphaned sockets exceeds an administratively configured
* limit.
* 2. If we have strong memory pressure.
+ * 3. If our net namespace is exiting.
*/
static int tcp_out_of_resources(struct sock *sk, bool do_reset)
{
@@ -83,6 +91,13 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset)
__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY);
return 1;
}
+
+ if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_done(sk);
+ return 1;
+ }
+
return 0;
}
@@ -0,0 +1,86 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tommi Rantala <tommi.t.rantala@nokia.com>
Date: Mon, 5 Feb 2018 21:48:14 +0200
Subject: [PATCH] sctp: fix dst refcnt leak in sctp_v4_get_dst
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Fix dst reference count leak in sctp_v4_get_dst() introduced in commit
410f03831 ("sctp: add routing output fallback"):
When walking the address_list, successive ip_route_output_key() calls
may return the same rt->dst with the reference incremented on each call.
The code would not decrement the dst refcount when the dst pointer was
identical from the previous iteration, causing the dst refcnt leak.
Testcase:
ip netns add TEST
ip netns exec TEST ip link set lo up
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link set dev dummy0 netns TEST
ip link set dev dummy1 netns TEST
ip link set dev dummy2 netns TEST
ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0
ip netns exec TEST ip link set dummy0 up
ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1
ip netns exec TEST ip link set dummy1 up
ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2
ip netns exec TEST ip link set dummy2 up
ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3
ip netns del TEST
In 4.4 and 4.9 kernels this results to:
[ 354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1
[ 405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1
...
Fixes: 410f03831 ("sctp: add routing output fallback")
Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses")
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
net/sctp/protocol.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 989a900383b5..e1a3ae4f3cab 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -514,22 +514,20 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
if (IS_ERR(rt))
continue;
- if (!dst)
- dst = &rt->dst;
-
/* Ensure the src address belongs to the output
* interface.
*/
odev = __ip_dev_find(sock_net(sk), laddr->a.v4.sin_addr.s_addr,
false);
if (!odev || odev->ifindex != fl4->flowi4_oif) {
- if (&rt->dst != dst)
+ if (!dst)
+ dst = &rt->dst;
+ else
dst_release(&rt->dst);
continue;
}
- if (dst != &rt->dst)
- dst_release(dst);
+ dst_release(dst);
dst = &rt->dst;
break;
}
@@ -0,0 +1,57 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexey Kodanev <alexey.kodanev@oracle.com>
Date: Mon, 5 Feb 2018 15:10:35 +0300
Subject: [PATCH] sctp: fix dst refcnt leak in sctp_v6_get_dst()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When going through the bind address list in sctp_v6_get_dst() and
the previously found address is better ('matchlen > bmatchlen'),
the code continues to the next iteration without releasing currently
held destination.
Fix it by releasing 'bdst' before continue to the next iteration, and
instead of introducing one more '!IS_ERR(bdst)' check for dst_release(),
move the already existed one right after ip6_dst_lookup_flow(), i.e. we
shouldn't proceed further if we get an error for the route lookup.
Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
net/sctp/ipv6.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index edb462b0b73b..e626d72868fe 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -326,8 +326,10 @@ static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final);
bdst = ip6_dst_lookup_flow(sk, fl6, final_p);
- if (!IS_ERR(bdst) &&
- ipv6_chk_addr(dev_net(bdst->dev),
+ if (IS_ERR(bdst))
+ continue;
+
+ if (ipv6_chk_addr(dev_net(bdst->dev),
&laddr->a.v6.sin6_addr, bdst->dev, 1)) {
if (!IS_ERR_OR_NULL(dst))
dst_release(dst);
@@ -336,8 +338,10 @@ static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
}
bmatchlen = sctp_v6_addr_match_len(daddr, &laddr->a);
- if (matchlen > bmatchlen)
+ if (matchlen > bmatchlen) {
+ dst_release(bdst);
continue;
+ }
if (!IS_ERR_OR_NULL(dst))
dst_release(dst);
@@ -0,0 +1,43 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Vasily Averin <vvs@virtuozzo.com>
Date: Thu, 2 Nov 2017 13:03:42 +0300
Subject: [PATCH] lockd: lost rollback of set_grace_period() in
lockd_down_net()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect,
it removes lockd_manager and disarm grace_period_end for init_net only.
If nfsd was started from another net namespace lockd_up_net() calls
set_grace_period() that adds lockd_manager into per-netns list
and queues grace_period_end delayed work.
These action should be reverted in lockd_down_net().
Otherwise it can lead to double list_add on after restart nfsd in netns,
and to use-after-free if non-disarmed delayed work will be executed after netns destroy.
Fixes: efda760fe95e ("lockd: fix lockd shutdown race")
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit 3a2b19d1ee5633f76ae8a88da7bc039a5d1732aa)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
fs/lockd/svc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 726b6cecf430..fa8f6effcf00 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -274,6 +274,8 @@ static void lockd_down_net(struct svc_serv *serv, struct net *net)
if (ln->nlmsvc_users) {
if (--ln->nlmsvc_users == 0) {
nlm_shutdown_hosts_net(net);
+ cancel_delayed_work_sync(&ln->grace_period_end);
+ locks_end_grace(&ln->lockd_manager);
svc_shutdown_net(serv, net);
dprintk("lockd_down_net: per-net data destroyed; net=%p\n", net);
}
@@ -0,0 +1,58 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Changwei Ge <ge.changwei@h3c.com>
Date: Wed, 31 Jan 2018 16:15:02 -0800
Subject: [PATCH] ocfs2: make metadata estimation accurate and clear
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Current code assume that ::w_unwritten_list always has only one item on.
This is not right and hard to get understood. So improve how to count
unwritten item.
Link: http://lkml.kernel.org/r/1515479070-32653-1-git-send-email-ge.changwei@h3c.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reported-by: John Lightsey <john@nixnuts.net>
Tested-by: John Lightsey <john@nixnuts.net>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 63de8bd9328bf2a778fc277503da163ae3defa3c)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
fs/ocfs2/aops.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 88a31e9340a0..77ec9b495027 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -784,6 +784,7 @@ struct ocfs2_write_ctxt {
struct ocfs2_cached_dealloc_ctxt w_dealloc;
struct list_head w_unwritten_list;
+ unsigned int w_unwritten_count;
};
void ocfs2_unlock_and_free_pages(struct page **pages, int num_pages)
@@ -1373,6 +1374,7 @@ static int ocfs2_unwritten_check(struct inode *inode,
desc->c_clear_unwritten = 0;
list_add_tail(&new->ue_ip_node, &oi->ip_unwritten_list);
list_add_tail(&new->ue_node, &wc->w_unwritten_list);
+ wc->w_unwritten_count++;
new = NULL;
unlock:
spin_unlock(&oi->ip_lock);
@@ -2246,7 +2248,7 @@ static int ocfs2_dio_get_block(struct inode *inode, sector_t iblock,
ue->ue_phys = desc->c_phys;
list_splice_tail_init(&wc->w_unwritten_list, &dwc->dw_zero_list);
- dwc->dw_zero_count++;
+ dwc->dw_zero_count += wc->w_unwritten_count;
}
ret = ocfs2_write_end_nolock(inode->i_mapping, pos, len, len, wc);
@@ -0,0 +1,367 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Changwei Ge <ge.changwei@h3c.com>
Date: Wed, 31 Jan 2018 16:15:06 -0800
Subject: [PATCH] ocfs2: try to reuse extent block in dealloc without
meta_alloc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
A crash issue was reported by John Lightsey with a call trace as follows:
ocfs2_split_extent+0x1ad3/0x1b40 [ocfs2]
ocfs2_change_extent_flag+0x33a/0x470 [ocfs2]
ocfs2_mark_extent_written+0x172/0x220 [ocfs2]
ocfs2_dio_end_io+0x62d/0x910 [ocfs2]
dio_complete+0x19a/0x1a0
do_blockdev_direct_IO+0x19dd/0x1eb0
__blockdev_direct_IO+0x43/0x50
ocfs2_direct_IO+0x8f/0xa0 [ocfs2]
generic_file_direct_write+0xb2/0x170
__generic_file_write_iter+0xc3/0x1b0
ocfs2_file_write_iter+0x4bb/0xca0 [ocfs2]
__vfs_write+0xae/0xf0
vfs_write+0xb8/0x1b0
SyS_write+0x4f/0xb0
system_call_fastpath+0x16/0x75
The BUG code told that extent tree wants to grow but no metadata was
reserved ahead of time. From my investigation into this issue, the root
cause it that although enough metadata is not reserved, there should be
enough for following use. Rightmost extent is merged into its left one
due to a certain times of marking extent written. Because during
marking extent written, we got many physically continuous extents. At
last, an empty extent showed up and the rightmost path is removed from
extent tree.
Add a new mechanism to reuse extent block cached in dealloc which were
just unlinked from extent tree to solve this crash issue.
Criteria is that during marking extents *written*, if extent rotation
and merging results in unlinking extent with growing extent tree later
without any metadata reserved ahead of time, try to reuse those extents
in dealloc in which deleted extents are cached.
Also, this patch addresses the issue John reported that ::dw_zero_count
is not calculated properly.
After applying this patch, the issue John reported was gone. Thanks for
the reproducer provided by John. And this patch has passed
ocfs2-test(29 cases) suite running by New H3C Group.
[ge.changwei@h3c.com: fix static checker warnning]
Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373F29196AE@H3CMLB12-EX.srv.huawei-3com.com
[akpm@linux-foundation.org: brelse(NULL) is legal]
Link: http://lkml.kernel.org/r/1515479070-32653-2-git-send-email-ge.changwei@h3c.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reported-by: John Lightsey <john@nixnuts.net>
Tested-by: John Lightsey <john@nixnuts.net>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 71a36944042b7d9dd71f6a5d1c5ea1c2353b5d42)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
fs/ocfs2/alloc.c | 206 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
fs/ocfs2/alloc.h | 1 +
fs/ocfs2/aops.c | 6 ++
3 files changed, 203 insertions(+), 10 deletions(-)
diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 386aecce881d..9b5e7d8ba710 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -165,6 +165,13 @@ static int ocfs2_dinode_insert_check(struct ocfs2_extent_tree *et,
struct ocfs2_extent_rec *rec);
static int ocfs2_dinode_sanity_check(struct ocfs2_extent_tree *et);
static void ocfs2_dinode_fill_root_el(struct ocfs2_extent_tree *et);
+
+static int ocfs2_reuse_blk_from_dealloc(handle_t *handle,
+ struct ocfs2_extent_tree *et,
+ struct buffer_head **new_eb_bh,
+ int blk_wanted, int *blk_given);
+static int ocfs2_is_dealloc_empty(struct ocfs2_extent_tree *et);
+
static const struct ocfs2_extent_tree_operations ocfs2_dinode_et_ops = {
.eo_set_last_eb_blk = ocfs2_dinode_set_last_eb_blk,
.eo_get_last_eb_blk = ocfs2_dinode_get_last_eb_blk,
@@ -448,6 +455,7 @@ static void __ocfs2_init_extent_tree(struct ocfs2_extent_tree *et,
if (!obj)
obj = (void *)bh->b_data;
et->et_object = obj;
+ et->et_dealloc = NULL;
et->et_ops->eo_fill_root_el(et);
if (!et->et_ops->eo_fill_max_leaf_clusters)
@@ -1159,7 +1167,7 @@ static int ocfs2_add_branch(handle_t *handle,
struct buffer_head **last_eb_bh,
struct ocfs2_alloc_context *meta_ac)
{
- int status, new_blocks, i;
+ int status, new_blocks, i, block_given = 0;
u64 next_blkno, new_last_eb_blk;
struct buffer_head *bh;
struct buffer_head **new_eb_bhs = NULL;
@@ -1214,11 +1222,31 @@ static int ocfs2_add_branch(handle_t *handle,
goto bail;
}
- status = ocfs2_create_new_meta_bhs(handle, et, new_blocks,
- meta_ac, new_eb_bhs);
- if (status < 0) {
- mlog_errno(status);
- goto bail;
+ /* Firstyly, try to reuse dealloc since we have already estimated how
+ * many extent blocks we may use.
+ */
+ if (!ocfs2_is_dealloc_empty(et)) {
+ status = ocfs2_reuse_blk_from_dealloc(handle, et,
+ new_eb_bhs, new_blocks,
+ &block_given);
+ if (status < 0) {
+ mlog_errno(status);
+ goto bail;
+ }
+ }
+
+ BUG_ON(block_given > new_blocks);
+
+ if (block_given < new_blocks) {
+ BUG_ON(!meta_ac);
+ status = ocfs2_create_new_meta_bhs(handle, et,
+ new_blocks - block_given,
+ meta_ac,
+ &new_eb_bhs[block_given]);
+ if (status < 0) {
+ mlog_errno(status);
+ goto bail;
+ }
}
/* Note: new_eb_bhs[new_blocks - 1] is the guy which will be
@@ -1341,15 +1369,25 @@ static int ocfs2_shift_tree_depth(handle_t *handle,
struct ocfs2_alloc_context *meta_ac,
struct buffer_head **ret_new_eb_bh)
{
- int status, i;
+ int status, i, block_given = 0;
u32 new_clusters;
struct buffer_head *new_eb_bh = NULL;
struct ocfs2_extent_block *eb;
struct ocfs2_extent_list *root_el;
struct ocfs2_extent_list *eb_el;
- status = ocfs2_create_new_meta_bhs(handle, et, 1, meta_ac,
- &new_eb_bh);
+ if (!ocfs2_is_dealloc_empty(et)) {
+ status = ocfs2_reuse_blk_from_dealloc(handle, et,
+ &new_eb_bh, 1,
+ &block_given);
+ } else if (meta_ac) {
+ status = ocfs2_create_new_meta_bhs(handle, et, 1, meta_ac,
+ &new_eb_bh);
+
+ } else {
+ BUG();
+ }
+
if (status < 0) {
mlog_errno(status);
goto bail;
@@ -1512,7 +1550,7 @@ static int ocfs2_grow_tree(handle_t *handle, struct ocfs2_extent_tree *et,
int depth = le16_to_cpu(el->l_tree_depth);
struct buffer_head *bh = NULL;
- BUG_ON(meta_ac == NULL);
+ BUG_ON(meta_ac == NULL && ocfs2_is_dealloc_empty(et));
shift = ocfs2_find_branch_target(et, &bh);
if (shift < 0) {
@@ -6593,6 +6631,154 @@ ocfs2_find_per_slot_free_list(int type,
return fl;
}
+static struct ocfs2_per_slot_free_list *
+ocfs2_find_preferred_free_list(int type,
+ int preferred_slot,
+ int *real_slot,
+ struct ocfs2_cached_dealloc_ctxt *ctxt)
+{
+ struct ocfs2_per_slot_free_list *fl = ctxt->c_first_suballocator;
+
+ while (fl) {
+ if (fl->f_inode_type == type && fl->f_slot == preferred_slot) {
+ *real_slot = fl->f_slot;
+ return fl;
+ }
+
+ fl = fl->f_next_suballocator;
+ }
+
+ /* If we can't find any free list matching preferred slot, just use
+ * the first one.
+ */
+ fl = ctxt->c_first_suballocator;
+ *real_slot = fl->f_slot;
+
+ return fl;
+}
+
+/* Return Value 1 indicates empty */
+static int ocfs2_is_dealloc_empty(struct ocfs2_extent_tree *et)
+{
+ struct ocfs2_per_slot_free_list *fl = NULL;
+
+ if (!et->et_dealloc)
+ return 1;
+
+ fl = et->et_dealloc->c_first_suballocator;
+ if (!fl)
+ return 1;
+
+ if (!fl->f_first)
+ return 1;
+
+ return 0;
+}
+
+/* If extent was deleted from tree due to extent rotation and merging, and
+ * no metadata is reserved ahead of time. Try to reuse some extents
+ * just deleted. This is only used to reuse extent blocks.
+ * It is supposed to find enough extent blocks in dealloc if our estimation
+ * on metadata is accurate.
+ */
+static int ocfs2_reuse_blk_from_dealloc(handle_t *handle,
+ struct ocfs2_extent_tree *et,
+ struct buffer_head **new_eb_bh,
+ int blk_wanted, int *blk_given)
+{
+ int i, status = 0, real_slot;
+ struct ocfs2_cached_dealloc_ctxt *dealloc;
+ struct ocfs2_per_slot_free_list *fl;
+ struct ocfs2_cached_block_free *bf;
+ struct ocfs2_extent_block *eb;
+ struct ocfs2_super *osb =
+ OCFS2_SB(ocfs2_metadata_cache_get_super(et->et_ci));
+
+ *blk_given = 0;
+
+ /* If extent tree doesn't have a dealloc, this is not faulty. Just
+ * tell upper caller dealloc can't provide any block and it should
+ * ask for alloc to claim more space.
+ */
+ dealloc = et->et_dealloc;
+ if (!dealloc)
+ goto bail;
+
+ for (i = 0; i < blk_wanted; i++) {
+ /* Prefer to use local slot */
+ fl = ocfs2_find_preferred_free_list(EXTENT_ALLOC_SYSTEM_INODE,
+ osb->slot_num, &real_slot,
+ dealloc);
+ /* If no more block can be reused, we should claim more
+ * from alloc. Just return here normally.
+ */
+ if (!fl) {
+ status = 0;
+ break;
+ }
+
+ bf = fl->f_first;
+ fl->f_first = bf->free_next;
+
+ new_eb_bh[i] = sb_getblk(osb->sb, bf->free_blk);
+ if (new_eb_bh[i] == NULL) {
+ status = -ENOMEM;
+ mlog_errno(status);
+ goto bail;
+ }
+
+ mlog(0, "Reusing block(%llu) from "
+ "dealloc(local slot:%d, real slot:%d)\n",
+ bf->free_blk, osb->slot_num, real_slot);
+
+ ocfs2_set_new_buffer_uptodate(et->et_ci, new_eb_bh[i]);
+
+ status = ocfs2_journal_access_eb(handle, et->et_ci,
+ new_eb_bh[i],
+ OCFS2_JOURNAL_ACCESS_CREATE);
+ if (status < 0) {
+ mlog_errno(status);
+ goto bail;
+ }
+
+ memset(new_eb_bh[i]->b_data, 0, osb->sb->s_blocksize);
+ eb = (struct ocfs2_extent_block *) new_eb_bh[i]->b_data;
+
+ /* We can't guarantee that buffer head is still cached, so
+ * polutlate the extent block again.
+ */
+ strcpy(eb->h_signature, OCFS2_EXTENT_BLOCK_SIGNATURE);
+ eb->h_blkno = cpu_to_le64(bf->free_blk);
+ eb->h_fs_generation = cpu_to_le32(osb->fs_generation);
+ eb->h_suballoc_slot = cpu_to_le16(real_slot);
+ eb->h_suballoc_loc = cpu_to_le64(bf->free_bg);
+ eb->h_suballoc_bit = cpu_to_le16(bf->free_bit);
+ eb->h_list.l_count =
+ cpu_to_le16(ocfs2_extent_recs_per_eb(osb->sb));
+
+ /* We'll also be dirtied by the caller, so
+ * this isn't absolutely necessary.
+ */
+ ocfs2_journal_dirty(handle, new_eb_bh[i]);
+
+ if (!fl->f_first) {
+ dealloc->c_first_suballocator = fl->f_next_suballocator;
+ kfree(fl);
+ }
+ kfree(bf);
+ }
+
+ *blk_given = i;
+
+bail:
+ if (unlikely(status < 0)) {
+ for (i = 0; i < blk_wanted; i++)
+ brelse(new_eb_bh[i]);
+ }
+
+ return status;
+}
+
int ocfs2_cache_block_dealloc(struct ocfs2_cached_dealloc_ctxt *ctxt,
int type, int slot, u64 suballoc,
u64 blkno, unsigned int bit)
diff --git a/fs/ocfs2/alloc.h b/fs/ocfs2/alloc.h
index 4a5152ec88a3..571692171dd1 100644
--- a/fs/ocfs2/alloc.h
+++ b/fs/ocfs2/alloc.h
@@ -61,6 +61,7 @@ struct ocfs2_extent_tree {
ocfs2_journal_access_func et_root_journal_access;
void *et_object;
unsigned int et_max_leaf_clusters;
+ struct ocfs2_cached_dealloc_ctxt *et_dealloc;
};
/*
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 77ec9b495027..2ff02dda97d8 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2322,6 +2322,12 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
ocfs2_init_dinode_extent_tree(&et, INODE_CACHE(inode), di_bh);
+ /* Attach dealloc with extent tree in case that we may reuse extents
+ * which are already unlinked from current extent tree due to extent
+ * rotation and merging.
+ */
+ et.et_dealloc = &dealloc;
+
ret = ocfs2_lock_allocators(inode, &et, 0, dwc->dw_zero_count*2,
&data_ac, &meta_ac);
if (ret) {
@@ -0,0 +1,100 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Fri, 23 Mar 2018 09:19:21 +0100
Subject: [PATCH] mm/shmem: do not wait for lock_page() in
shmem_unused_huge_shrink()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
shmem_unused_huge_shrink() gets called from reclaim path. Waiting for
page lock may lead to deadlock there.
There was a bug report that may be attributed to this:
http://lkml.kernel.org/r/alpine.LRH.2.11.1801242349220.30642@mail.ewheeler.net
Replace lock_page() with trylock_page() and skip the page if we failed to
lock it. We will get to the page on the next scan.
We can test for the PageTransHuge() outside the page lock as we only need
protection against splitting the page under us. Holding pin oni the page
is enough for this.
Link: http://lkml.kernel.org/r/20180316210830.43738-1-kirill.shutemov@linux.intel.com
Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Eric Wheeler <linux-mm@lists.ewheeler.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <>
(cherry-picked from https://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git/commit/?h=since-4.15&id=73eccc61c701ee7b4223aea2079542a712feeea7)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
mm/shmem.c | 31 ++++++++++++++++++++-----------
1 file changed, 20 insertions(+), 11 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 859e4c224b80..2aae929eb90b 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -483,36 +483,45 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
info = list_entry(pos, struct shmem_inode_info, shrinklist);
inode = &info->vfs_inode;
- if (nr_to_split && split >= nr_to_split) {
- iput(inode);
- continue;
- }
+ if (nr_to_split && split >= nr_to_split)
+ goto leave;
- page = find_lock_page(inode->i_mapping,
+ page = find_get_page(inode->i_mapping,
(inode->i_size & HPAGE_PMD_MASK) >> PAGE_SHIFT);
if (!page)
goto drop;
+ /* No huge page at the end of the file: nothing to split */
if (!PageTransHuge(page)) {
- unlock_page(page);
put_page(page);
goto drop;
}
+ /*
+ * Leave the inode on the list if we failed to lock
+ * the page at this time.
+ *
+ * Waiting for the lock may lead to deadlock in the
+ * reclaim path.
+ */
+ if (!trylock_page(page)) {
+ put_page(page);
+ goto leave;
+ }
+
ret = split_huge_page(page);
unlock_page(page);
put_page(page);
- if (ret) {
- /* split failed: leave it on the list */
- iput(inode);
- continue;
- }
+ /* If split failed leave the inode on the list */
+ if (ret)
+ goto leave;
split++;
drop:
list_del_init(&info->shrinklist);
removed++;
+leave:
iput(inode);
}
@@ -0,0 +1,43 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Thu, 15 Mar 2018 18:07:47 +0300
Subject: [PATCH] mm/thp: Do not wait for lock_page() in deferred_split_scan()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
deferred_split_scan() gets called from reclaim path. Waiting for page
lock may lead to deadlock there.
Replace lock_page() with trylock_page() and skip the page if we failed
to lock it. We will get to the page on the next scan.
Fixes: 9a982250f773 ("thp: introduce deferred_split_huge_page()")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
(cherry-picked from https://patchwork.kernel.org/patch/10284703/)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
mm/huge_memory.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8b887db33383..5c4093e0be8d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2621,11 +2621,13 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
list_for_each_safe(pos, next, &list) {
page = list_entry((void *)pos, struct page, mapping);
- lock_page(page);
+ if (!trylock_page(page))
+ goto next;
/* split_huge_page() removes page from list on success */
if (!split_huge_page(page))
split++;
unlock_page(page);
+next:
put_page(page);
}
@@ -0,0 +1,46 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fabian=20Gr=C3=BCnbichler?= <f.gruenbichler@proxmox.com>
Date: Mon, 9 Apr 2018 09:33:25 +0200
Subject: [PATCH] Revert Ubuntu RETPOLINE checks in kernel Makefile
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
these break builds outside of Ubuntu's packaging.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
scripts/Makefile.build | 8 --------
1 file changed, 8 deletions(-)
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index d74c3f9f1fa8..436005392047 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -282,18 +282,11 @@ objtool_dep = $(objtool_obj) \
$(wildcard include/config/orc/unwinder.h \
include/config/stack/validation.h)
-ifdef CONFIG_RETPOLINE
-cmd_ubuntu_retpoline = $(CONFIG_SHELL) $(srctree)/scripts/ubuntu-retpoline-extract-one $(@) $(<) "$(filter -m16 %code16gcc.h,$(a_flags))";
-else
-cmd_ubuntu_retpoline =
-endif
-
define rule_cc_o_c
$(call echo-cmd,checksrc) $(cmd_checksrc) \
$(call cmd_and_fixdep,cc_o_c) \
$(cmd_modversions_c) \
$(call echo-cmd,objtool) $(cmd_objtool) \
- $(call echo-cmd,ubuntu-retpoline) $(cmd_ubuntu_retpoline) \
$(call echo-cmd,record_mcount) $(cmd_record_mcount)
endef
@@ -301,7 +294,6 @@ define rule_as_o_S
$(call cmd_and_fixdep,as_o_S) \
$(cmd_modversions_S) \
$(call echo-cmd,objtool) $(cmd_objtool)
- $(call echo-cmd,ubuntu-retpoline) $(cmd_ubuntu_retpoline)
endef
# List module undefined symbols (or empty line if not enabled)
@@ -0,0 +1,92 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Mon, 9 Apr 2018 14:56:29 +0200
Subject: [PATCH] net: fix deadlock while clearing neighbor proxy table
When coming from ndisc_netdev_event() in net/ipv6/ndisc.c,
neigh_ifdown() is called with &nd_tbl, locking this while
clearing the proxy neighbor entries when eg. deleting an
interface. Calling the table's pndisc_destructor() with the
lock still held, however, can cause a deadlock: When a
multicast listener is available an IGMP packet of type
ICMPV6_MGM_REDUCTION may be sent out. When reaching
ip6_finish_output2(), if no neighbor entry for the target
address is found, __neigh_create() is called with &nd_tbl,
which it'll want to lock.
Move the elements into their own list, then unlock the table
and perform the destruction.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199289
Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
net/core/neighbour.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index d0713627deb6..3b495739bf65 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -55,7 +55,8 @@ static void neigh_timer_handler(unsigned long arg);
static void __neigh_notify(struct neighbour *n, int type, int flags,
u32 pid);
static void neigh_update_notify(struct neighbour *neigh, u32 nlmsg_pid);
-static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev);
+static int pneigh_ifdown_and_unlock(struct neigh_table *tbl,
+ struct net_device *dev);
#ifdef CONFIG_PROC_FS
static const struct file_operations neigh_stat_seq_fops;
@@ -291,8 +292,7 @@ int neigh_ifdown(struct neigh_table *tbl, struct net_device *dev)
{
write_lock_bh(&tbl->lock);
neigh_flush_dev(tbl, dev);
- pneigh_ifdown(tbl, dev);
- write_unlock_bh(&tbl->lock);
+ pneigh_ifdown_and_unlock(tbl, dev);
del_timer_sync(&tbl->proxy_timer);
pneigh_queue_purge(&tbl->proxy_queue);
@@ -681,9 +681,10 @@ int pneigh_delete(struct neigh_table *tbl, struct net *net, const void *pkey,
return -ENOENT;
}
-static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev)
+static int pneigh_ifdown_and_unlock(struct neigh_table *tbl,
+ struct net_device *dev)
{
- struct pneigh_entry *n, **np;
+ struct pneigh_entry *n, **np, *freelist = NULL;
u32 h;
for (h = 0; h <= PNEIGH_HASHMASK; h++) {
@@ -691,16 +692,23 @@ static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev)
while ((n = *np) != NULL) {
if (!dev || n->dev == dev) {
*np = n->next;
- if (tbl->pdestructor)
- tbl->pdestructor(n);
- if (n->dev)
- dev_put(n->dev);
- kfree(n);
+ n->next = freelist;
+ freelist = n;
continue;
}
np = &n->next;
}
}
+ write_unlock_bh(&tbl->lock);
+ while ((n = freelist)) {
+ freelist = n->next;
+ n->next = NULL;
+ if (tbl->pdestructor)
+ tbl->pdestructor(n);
+ if (n->dev)
+ dev_put(n->dev);
+ kfree(n);
+ }
return -ENOENT;
}
-200
View File
@@ -1,200 +0,0 @@
proxmox-ve (5.0-24) unstable; urgency=medium
* depend on newest 4.10.17-4-pve kernel
-- Proxmox Support Team <support@proxmox.com> Tue, 10 Oct 2017 14:39:55 +0200
proxmox-ve (5.0-21) unstable; urgency=medium
* depend on newest 4.10.17-3-pve kernel
-- Proxmox Support Team <support@proxmox.com> Thu, 31 Aug 2017 14:57:17 +0200
proxmox-ve (5.0-16) unstable; urgency=medium
* depend on newest 4.10.17-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Tue, 11 Jul 2017 11:11:35 +0200
proxmox-ve (5.0-10) unstable; urgency=medium
* depend on newest 4.10.15-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Wed, 7 Jun 2017 10:36:33 +0200
proxmox-ve (5.0-8) unstable; urgency=medium
* depend on newest 4.10.11-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Thu, 18 May 2017 09:14:18 +0200
proxmox-ve (5.0-6) unstable; urgency=medium
* depend on newest 4.10.8-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Thu, 13 Apr 2017 11:24:12 +0200
proxmox-ve (5.0-5) unstable; urgency=medium
* depend on newest 4.10.5-1-pve kernel
* remove old 4.x release key
-- Proxmox Support Team <support@proxmox.com> Tue, 28 Mar 2017 10:14:00 +0200
proxmox-ve (5.0-3) unstable; urgency=medium
* depend on newest 4.10.3-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Fri, 24 Mar 2017 13:44:18 +0100
proxmox-ve (5.0-2) unstable; urgency=medium
* depend on newest 4.10.1-2-pve kernel
-- Proxmox Support Team <support@proxmox.com> Fri, 10 Mar 2017 10:20:14 +0100
proxmox-ve (5.0-1) unstable; urgency=medium
* Proxmox VE package for Debian Stretch
-- Proxmox Support Team <support@proxmox.com> Fri, 3 Mar 2017 15:56:11 +0100
proxmox-ve (4.4-83) unstable; urgency=medium
* depend on newest 4.4.44-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Wed, 1 Mar 2017 09:22:26 +0100
proxmox-ve (4.4-82) unstable; urgency=medium
* install PVE release keys in a Debian Stretch compatible way
-- Proxmox Support Team <support@proxmox.com> Thu, 23 Feb 2017 15:14:06 +0100
proxmox-ve (4.4-80) unstable; urgency=medium
* depend on newest 4.4.40-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Wed, 8 Feb 2017 13:10:57 +0100
proxmox-ve (4.4-76) unstable; urgency=medium
* update version for 4.4 release
-- Proxmox Support Team <support@proxmox.com> Wed, 8 Feb 2017 10:38:33 +0100
proxmox-ve (4.3-74) unstable; urgency=medium
* depend on newest 4.4.35-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Mon, 5 Dec 2016 10:20:03 +0100
proxmox-ve (4.3-73) unstable; urgency=medium
* depend on newest 4.4.30-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Wed, 30 Nov 2016 09:43:29 +0100
proxmox-ve (4.3-72) unstable; urgency=medium
* depend on newest 4.4.24-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Mon, 14 Nov 2016 12:17:16 +0100
proxmox-ve (4.3-67) unstable; urgency=medium
* depend on newest 4.4.21-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Mon, 10 Oct 2016 13:58:20 +0200
proxmox-ve (4.3-66) unstable; urgency=medium
* depend on newest 4.4.19-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Wed, 14 Sep 2016 13:23:43 +0200
proxmox-ve (4.2-63) unstable; urgency=medium
* use /etc/apt/trusted.gpg.d/ mechanism to install trusted apt keys
-- Proxmox Support Team <support@proxmox.com> Mon, 29 Aug 2016 12:08:43 +0200
proxmox-ve (4.2-61) unstable; urgency=medium
* depend on newest 4.4.16-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Wed, 24 Aug 2016 15:11:08 +0200
proxmox-ve (4.2-58) unstable; urgency=medium
* depend on newest 4.4.13-2-pve kernel
-- Proxmox Support Team <support@proxmox.com> Fri, 15 Jul 2016 06:03:16 +0200
proxmox-ve (4.2-55) unstable; urgency=medium
* depend on newest 4.4.13-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Sat, 25 Jun 2016 11:52:14 +0200
proxmox-ve (4.2-54) unstable; urgency=medium
* depend on newest 4.4.10-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Fri, 10 Jun 2016 20:56:08 +0200
proxmox-ve (4.0-49) unstable; urgency=medium
* depend on newest 4.4.8-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Mon, 09 May 2016 10:29:57 +0200
proxmox-ve (4.0-43) unstable; urgency=medium
* depend on newest 4.4.6-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Mon, 11 Apr 2016 09:35:06 +0200
proxmox-ve (4.0-32) unstable; urgency=medium
* depend on newest 4.2.8-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Wed, 03 Feb 2016 16:15:41 +0100
proxmox-ve (4.0-31) unstable; urgency=medium
* setup kernel links for installation CD (rescue boot)
-- Proxmox Support Team <support@proxmox.com> Sun, 10 Jan 2016 10:10:37 +0100
proxmox-ve (4.0-29) unstable; urgency=medium
* depend on newest 4.2.6-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Thu, 31 Dec 2015 12:46:00 +0100
proxmox-ve (4.0-8) unstable; urgency=medium
* depend on newest 4.2.3-2-pve kernel
-- Proxmox Support Team <support@proxmox.com> Tue, 03 Nov 2015 10:40:01 +0100
proxmox-ve (4.0-7) unstable; urgency=medium
* depend on newest 4.2.3-1-pve kernel
-- Proxmox Support Team <support@proxmox.com> Sun, 18 Oct 2015 10:58:21 +0200
proxmox-ve (4.0-6) unstable; urgency=medium
* depend on newest 4.1.3 kernel
-- Proxmox Support Team <support@proxmox.com> Thu, 30 Jul 2015 09:17:30 +0200
proxmox-ve (4.0-1) unstable; urgency=medium
* Proxmox VE package for Debian Jessie
-- Proxmox Support Team <support@proxmox.com> Sat, 28 Feb 2015 17:25:14 +0100
-16
View File
@@ -1,16 +0,0 @@
Package: proxmox-ve
Version: @RELEASE@-@PKGREL@
Architecture: all
Section: admin
Priority: optional
Provides: proxmox-virtual-environment
Conflicts: pve-kernel, proxmox-virtual-environment, proxmox-ve-3.10.0
Replaces: pve-kernel, proxmox-virtual-environment, proxmox-ve-3.10.0
Depends: libc6 (>= 2.7-18), pve-kernel-@KVNAME@, pve-firmware, pve-manager, qemu-server, pve-qemu-kvm, openssh-client, openssh-server, apt, vncterm, spiceterm
Maintainer: Proxmox Support Team <support@proxmox.com>
Description: The Proxmox Virtual Environment
The Proxmox Virtual Environment is an easy to use Open Source
virtualization platform for running Virtual Appliances and Virtual
Machines. This is a meta package which will install everything
needed. This package also depends on the latest available Proxmox
kernel from the @KERNEL_VER@ series.
-21
View File
@@ -1,21 +0,0 @@
Copyright (C) 2016 Proxmox Server Solutions GmbH
This software is written by Proxmox Server Solutions GmbH <support@proxmox.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 dated June, 1991.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
MA 02110-1301 USA
The complete text of the GNU General
Public License can be found in `/usr/share/common-licenses/GPL-2'.
-81
View File
@@ -1,81 +0,0 @@
#! /bin/sh
# Abort if any command returns an error value
set -e
# This script is called as the last step of the installation of the
# package. All the package's files are in place, dpkg has already
# done its automatic conffile handling, and all the packages we depend
# of are already fully installed and configured.
# The following idempotent stuff doesn't generally need protecting
# against being run in the abort-* cases.
case "$1" in
configure)
# Configure this package. If the package must prompt the user for
# information, do it here.
# cleanup - remove Proxmox Release Key key from /etc/apt/trusted.gpg
/usr/bin/apt-key --keyring /etc/apt/trusted.gpg del 9887F95A >/dev/null 2>&1 || /bin/true
# cleanup - remove old stretch-incompatible variant of installing release key
rm -f /etc/apt/trusted.gpg.d/proxmox-ve.gpg /etc/apt/trusted.gpg.d/proxmox-ve.gpg~
# setup kernel links for installation CD (rescue boot)
mkdir -p /boot/pve
ln -sf /boot/vmlinuz-@KVNAME@ /boot/pve/vmlinuz
ln -sf /boot/initrd.img-@KVNAME@ /boot/pve/initrd.img
# There are three sub-cases:
if test "${2+set}" != set; then
# We're being installed by an ancient dpkg which doesn't remember
# which version was most recently configured, or even whether
# there is a most recently configured version.
:
elif test -z "$2" -o "$2" = "<unknown>"; then
# The package has not ever been configured on this system, or was
# purged since it was last configured.
:
else
# Version $2 is the most recently configured version of this
# package.
:
fi ;;
abort-upgrade)
# Back out of an attempt to upgrade this package FROM THIS VERSION
# to version $2. Undo the effects of "prerm upgrade $2".
:
;;
abort-remove)
if test "$2" != in-favour; then
echo "$0: undocumented call to \`postinst $*'" 1>&2
exit 0
fi
# Back out of an attempt to remove this package, which was due to
# a conflict with package $3 (version $4). Undo the effects of
# "prerm remove in-favour $3 $4".
:
;;
abort-deconfigure)
if test "$2" != in-favour -o "$5" != removing; then
echo "$0: undocumented call to \`postinst $*'" 1>&2
exit 0
fi
# Back out of an attempt to deconfigure this package, which was
# due to package $6 (version $7) which we depend on being removed
# to make way for package $3 (version $4). Undo the effects of
# "prerm deconfigure in-favour $3 $4 removing $6 $7".
:
;;
*) echo "$0: didn't understand being called with \`$1'" 1>&2
exit 0;;
esac
exit 0
-19
View File
@@ -1,19 +0,0 @@
#! /bin/sh
# Abort if any command returns an error value
set -e
case "$1" in
purge|remove|upgrade|failed-upgrade|abort-install|abort-upgrade|disappear)
# remove kernel symlinks
rm -f /boot/pve/vmlinuz
rm -f /boot/pve/initrd.img
rmdir --ignore-fail-on-non-empty /boot/pve/ || true
;;
*)
echo "postrm called with unknown argument \`$1'" >&2
exit 1
;;
esac
Binary file not shown.
-11
View File
@@ -1,11 +0,0 @@
Package: pve-headers
Version: @RELEASE@-@PKGREL@
Architecture: all
Section: admin
Priority: optional
Depends: pve-headers-@KVNAME@
Maintainer: Proxmox Support Team <support@proxmox.com>
Description: Latest Proxmox VE Kernel Headers
This is a virtual package which will install the kernel headers
for the latest available proxmox kernel from the @KERNEL_VER@
series.