ZTS: Use QEMU for tests on Linux and FreeBSD
This commit adds functional tests for these systems:
- AlmaLinux 8, AlmaLinux 9, ArchLinux
- CentOS Stream 9, Fedora 39, Fedora 40
- Debian 11, Debian 12
- FreeBSD 13, FreeBSD 14, FreeBSD 15
- Ubuntu 20.04, Ubuntu 22.04, Ubuntu 24.04
- enabled by default:
- AlmaLinux 8, AlmaLinux 9
- Debian 11, Debian 12
- Fedora 39, Fedora 40
- FreeBSD 13, FreeBSD 14
Workflow for each operating system:
- install qemu on the github runner
- download current cloud image of operating system
- start and init that image via cloud-init
- install dependencies and poweroff system
- start system and build openzfs and then poweroff again
- clone build system and start 2 instances of it
- run the functional tests, which complete in around 3h
- when tests are done, prepare the logfiles
- show detailed results for each system
- in the end, generate the job summary
Real-world benefits from this PR:
1. The github runner scripts are in the zfs repo itself. That means
you can just open a PR against zfs, like "Add Fedora 41 tester", and
see the results directly in the PR. ZFS admins no longer need to
manually log in to the buildbot server to update the buildbot config
with new versions of Fedora/AlmaLinux.
2. Github runners allow you to run the entire test suite against your
private branch before submitting a formal PR to openzfs. Just open a
PR against your private zfs repo, and the exact same
Fedora/Alma/FreeBSD runners will fire up and run ZTS. This can be
useful if you want to iterate on a ZTS change before submitting a
formal PR.
3. Buildbot is incredibly cumbersome. Our buildbot config files alone
are ~1500 lines (not including any build/setup scripts)!
It's a huge pain to set up.
4. We're running the super ancient buildbot 0.8.12. It's so ancient
it requires python2. We actually have to build python2 from source
for almalinux9 just to get it to run. Upgrading to a more modern
buildbot is a huge undertaking, and the UI on the newer versions is
worse.
5. Buildbot uses EC2 instances. EC2 is a pain because:
* It costs money
* They throttle IOPS and CPU usage, leading to mysterious,
  hard-to-diagnose failures and timeouts in ZTS.
* EC2 is high maintenance. We have to set up security groups, SSH
  keys, networking, users, etc., in AWS and it's a pain. We also
  have to periodically go in and kill zombie EC2 instances that
  buildbot is unable to kill off.
6. Buildbot doesn't always handle failures well. One of the things we
saw in the past was the FreeBSD builders would often die, and each
builder death would take up a "slot" in buildbot. So we would
periodically have to restart buildbot via a cron job to get the slots
back.
7. This PR divides up the ZTS test list into two parts, launches two
VMs, and on each VM runs half the test suite. The test results are
then merged and shown in the summary page. So we're basically
parallelizing ZTS on the same github runner. This leads to lower
overall ZTS runtimes (2.5-3 hours vs 4+ hours on buildbot), and one
unified set of results per runner, which is nice.
8. Since the tests are running on a VM, we have much more control over
what happens. We can capture the serial console output even if the
test completely brings down the VM. In the future, we could also
restart the test on the VM where it left off, so that if a single test
panics the VM, we can just restart it and run the remaining ZTS tests
(this functionality is not yet implemented though, just an idea).
9. Using the runners, users can manually kill or restart a test run
via the GitHub UI. That really isn't possible with buildbot unless
you're an admin.
10. Anecdotally, the tests seem to be more stable and consistent under
the QEMU runners.
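The split described in item 7 can be sketched roughly as follows. This is only an illustration under assumptions, not the code used by the runner scripts: the helper name `split_tests` and the placeholder test names are hypothetical, and the real workflow may divide the list differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the actual runner scripts):
# read a test list on stdin and print every TOTAL-th line, starting
# at line PART (1-based). With TOTAL=2, PART=1 and PART=2 give two
# disjoint halves that together cover the whole list.
split_tests() {
  local part="$1" total="$2"
  awk -v p="$part" -v t="$total" '(NR - 1) % t == p - 1'
}

# Placeholder names, not real ZTS test paths.
tests=$(printf '%s\n' test_a test_b test_c test_d test_e)

echo "VM 1:"
printf '%s\n' "$tests" | split_tests 1 2
echo "VM 2:"
printf '%s\n' "$tests" | split_tests 2 2
```

With the five placeholder tests, VM 1 would receive test_a, test_c and test_e, and VM 2 would receive test_b and test_d. Interleaving rather than cutting the list in the middle is one possible choice; it tends to spread long-running test groups across both VMs.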
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16537
2024-06-17 17:52:58 +03:00
#!/usr/bin/env bash

######################################################################
# 3) install dependencies for compiling and loading
######################################################################

set -eu

function archlinux() {
  echo "##[group]Running pacman -Syu"
  sudo btrfs filesystem resize max /
  sudo pacman -Syu --noconfirm
  echo "##[endgroup]"

  echo "##[group]Install Development Tools"
  sudo pacman -Sy --noconfirm base-devel bc cpio dhclient dkms fakeroot \
    fio gdb inetutils jq less linux linux-headers lsscsi nfs-utils parted \
    pax perf python-packaging python-setuptools qemu-guest-agent ksh samba \
    sysstat rng-tools rsync wget xxhash
  echo "##[endgroup]"
}

function debian() {
  export DEBIAN_FRONTEND="noninteractive"

  echo "##[group]Running apt-get update+upgrade"
  sudo apt-get update -y
  sudo apt-get upgrade -y
  echo "##[endgroup]"

  echo "##[group]Install Development Tools"
  sudo apt-get install -y \
    acl alien attr autoconf bc cpio curl dbench dh-python dkms fakeroot \
    fio gdb gdebi git ksh lcov isc-dhcp-client jq libacl1-dev libaio-dev \
    libattr1-dev libblkid-dev libcurl4-openssl-dev libdevmapper-dev libelf-dev \
    libffi-dev libmount-dev libpam0g-dev libselinux-dev libssl-dev libtool \
    libtool-bin libudev-dev libunwind-dev linux-headers-$(uname -r) \
    lsscsi nfs-kernel-server pamtester parted python3 python3-all-dev \
    python3-cffi python3-dev python3-distlib python3-packaging \
    python3-setuptools python3-sphinx qemu-guest-agent rng-tools rpm2cpio \
    rsync samba sysstat uuid-dev watchdog wget xfslibs-dev xxhash zlib1g-dev
  echo "##[endgroup]"
}

function freebsd() {
  export ASSUME_ALWAYS_YES="YES"

  echo "##[group]Install Development Tools"
  sudo pkg install -y autoconf automake autotools base64 checkbashisms fio \
    gdb gettext gettext-runtime git gmake gsed jq ksh93 lcov libtool lscpu \
    pkgconf python python3 pamtester qemu-guest-agent rsync xxhash
  sudo pkg install -xy \
    '^samba4[[:digit:]]+$' \
    '^py3[[:digit:]]+-cffi$' \
    '^py3[[:digit:]]+-sysctl$' \
    '^py3[[:digit:]]+-packaging$'
  echo "##[endgroup]"
}

# common packages for: almalinux, centos, redhat
function rhel() {
  echo "##[group]Running dnf update"
  echo "max_parallel_downloads=10" | sudo -E tee -a /etc/dnf/dnf.conf
  sudo dnf clean all
  sudo dnf update -y --setopt=fastestmirror=1 --refresh
  echo "##[endgroup]"

  echo "##[group]Install Development Tools"
  sudo dnf group install -y "Development Tools"
  sudo dnf install -y \
    acl attr bc bzip2 curl dbench dkms elfutils-libelf-devel fio gdb git \
    jq kernel-rpm-macros ksh libacl-devel libaio-devel libargon2-devel \
    libattr-devel libblkid-devel libcurl-devel libffi-devel ncompress \
    libselinux-devel libtirpc-devel libtool libudev-devel libuuid-devel \
    lsscsi mdadm nfs-utils openssl-devel pam-devel pamtester parted perf \
    python3 python3-cffi python3-devel python3-packaging kernel-devel \
    python3-setuptools qemu-guest-agent rng-tools rpcgen rpm-build rsync \
    samba sysstat systemd watchdog wget xfsprogs-devel xxhash zlib-devel
  echo "##[endgroup]"
}

function tumbleweed() {
  echo "##[group]Running zypper is TODO!"
  sleep 23456
  echo "##[endgroup]"
}

# Install dependencies
case "$1" in
  almalinux8)
    echo "##[group]Enable epel and powertools repositories"
    sudo dnf config-manager -y --set-enabled powertools
    sudo dnf install -y epel-release
    echo "##[endgroup]"
    rhel
    echo "##[group]Install kernel-abi-whitelists"
    sudo dnf install -y kernel-abi-whitelists
    echo "##[endgroup]"
    ;;
  almalinux9|centos-stream9)
    echo "##[group]Enable epel and crb repositories"
    sudo dnf config-manager -y --set-enabled crb
    sudo dnf install -y epel-release
    echo "##[endgroup]"
    rhel
    echo "##[group]Install kernel-abi-stablelists"
    sudo dnf install -y kernel-abi-stablelists
    echo "##[endgroup]"
    ;;
  archlinux)
    archlinux
    ;;
  debian*)
    debian
    echo "##[group]Install Debian specific"
    sudo apt-get install -yq linux-perf dh-sequence-dkms
    echo "##[endgroup]"
    ;;
  fedora*)
    rhel
    ;;
  freebsd*)
    freebsd
    ;;
  tumbleweed)
    tumbleweed
    ;;
  ubuntu*)
    debian
    echo "##[group]Install Ubuntu specific"
    sudo apt-get install -yq linux-tools-common libtirpc-dev \
      linux-modules-extra-$(uname -r)
    if [ "$1" != "ubuntu20" ]; then
      sudo apt-get install -yq dh-sequence-dkms
    fi
    echo "##[endgroup]"
    echo "##[group]Delete Ubuntu OpenZFS modules"
    for i in $(find /lib/modules -name zfs -type d); do sudo rm -rvf $i; done
    echo "##[endgroup]"
    ;;
esac

# This script is used for checkstyle + zloop deps also.
# Install only the needed packages and exit - when used this way.
test -z "${ONLY_DEPS:-}" || exit 0

# Start services
echo "##[group]Enable services"
case "$1" in
  freebsd*)
    # add virtio things
    echo 'virtio_load="YES"' | sudo -E tee -a /boot/loader.conf
    for i in balloon blk console random scsi; do
      echo "virtio_${i}_load=\"YES\"" | sudo -E tee -a /boot/loader.conf
    done
    echo "fdescfs /dev/fd fdescfs rw 0 0" | sudo -E tee -a /etc/fstab
    sudo -E mount /dev/fd
    sudo -E touch /etc/zfs/exports
    sudo -E sysrc mountd_flags="/etc/zfs/exports"
    echo '[global]' | sudo -E tee /usr/local/etc/smb4.conf >/dev/null
    sudo -E service nfsd enable
    sudo -E service qemu-guest-agent enable
    sudo -E service samba_server enable
    ;;
  debian*|ubuntu*)
    sudo -E systemctl enable nfs-kernel-server
    sudo -E systemctl enable qemu-guest-agent
    sudo -E systemctl enable smbd
    ;;
  *)
    # All other linux distros
    sudo -E systemctl enable nfs-server
    sudo -E systemctl enable qemu-guest-agent
    sudo -E systemctl enable smb
    ;;
esac
echo "##[endgroup]"

# Setup Kernel cmdline
CMDLINE="console=tty0 console=ttyS0,115200n8"
CMDLINE="$CMDLINE selinux=0"
CMDLINE="$CMDLINE random.trust_cpu=on"
CMDLINE="$CMDLINE no_timer_check"
case "$1" in
  almalinux*|centos*|fedora*)
    GRUB_CFG="/boot/grub2/grub.cfg"
    GRUB_MKCONFIG="grub2-mkconfig"
    CMDLINE="$CMDLINE biosdevname=0 net.ifnames=0"
    echo 'GRUB_SERIAL_COMMAND="serial --speed=115200"' \
      | sudo tee -a /etc/default/grub >/dev/null
    ;;
  ubuntu24)
    GRUB_CFG="/boot/grub/grub.cfg"
    GRUB_MKCONFIG="grub-mkconfig"
    echo 'GRUB_DISABLE_OS_PROBER="false"' \
      | sudo tee -a /etc/default/grub >/dev/null
    ;;
  *)
    GRUB_CFG="/boot/grub/grub.cfg"
    GRUB_MKCONFIG="grub-mkconfig"
    ;;
esac

case "$1" in
  archlinux|freebsd*)
    true
    ;;
  *)
    echo "##[group]Edit kernel cmdline"
    sudo sed -i -e '/^GRUB_CMDLINE_LINUX/d' /etc/default/grub || true
    echo "GRUB_CMDLINE_LINUX=\"$CMDLINE\"" \
      | sudo tee -a /etc/default/grub >/dev/null
    sudo $GRUB_MKCONFIG -o $GRUB_CFG
    echo "##[endgroup]"
    ;;
esac

# reset cloud-init configuration and poweroff
sudo cloud-init clean --logs
sleep 2 && sudo poweroff &
exit 0