zygo: btrfs: call find_free_dev_extent with the right num_bytes
The caller of find_free_dev_extent sets num_bytes to the product of
stripe_len and the number of devices. This is a unit conformability
error, because num_bytes is measured in physical bytes in the device
address space, while the product of stripe_len and a value in any other
unit is not.
The result is that the dev_extent allocator searches for chunk-sized
dev_extents (up to 10 GiB) to satisfy allocations, but only allocates
the first 1 GiB of the space it finds on each device. Devices are
therefore ranked by hole sizes far larger than anything the allocator
will use, which leads to some unfortunate behavior.
For example, a device with large contiguous holes will dominate
allocation over devices with smaller contiguous holes, even when the
smaller holes are large enough to form a chunk. Consider a filesystem
using the raid1 profile with these free spaces in the dev_extent maps:
Device 1: 1000x 1 GiB holes
Device 2: 1x 1000 GiB hole
Device 3: 10x 1.01 GiB holes
The first 10 block groups will be allocated in pairs of dev_extents from
devices 2 and 3, because the allocator selects the devices with the
largest holes even if those holes are larger than 1 GiB:
Device 1: 1000x 1 GiB holes
Device 2: 1x 990 GiB hole
Device 3: 10x 0.01 GiB holes
As the filesystem fills up, this results in a 9.9 GiB shortfall.
990 chunks are created, pairing up all the 1 GiB holes on device 1 with
parts of the 990 GiB hole on device 2:
Device 1: 10x 1 GiB holes
Device 2: 0x 0 GiB hole (full)
Device 3: 10x 0.01 GiB holes
Then the allocator fills up the largest holes it can still find:
Device 1: 9x 1 GiB holes + 1x 0.9 GiB hole
Device 2: 0x 0 GiB hole (full)
Device 3: 0x 0.00 GiB holes (full)
Now the filesystem is out of space 9.9 GiB earlier than it should be.
Ideally, find_free_dev_extent would treat all holes of at least 1 GiB
as equal, so that allocation first fills devices 1 and 2 until they
have the same amount of free space as device 3:
Device 1: 10x 1 GiB holes
Device 2: 1x 10 GiB hole
Device 3: 10x 1.01 GiB holes
The remaining space is then allocated evenly across all three devices
until they are full:
Device 1: 0x 1 GiB holes (full)
Device 2: 0x 0 GiB hole (full)
Device 3: 0x 1.01 GiB holes (full)
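The two outcomes above can be checked with a small userspace model.
This is only a sketch, not the kernel code: it assumes that
find_free_dev_extent returns the first hole at least as large as the
search size (else the largest hole), that the reported size is capped
at the search size, that devices are ranked by that size and then by
total free space, and that each dev_extent is at most 1 GiB. The names
allocate_raid1, STRIPE and DEVICES are made up for illustration. Sizes
are integers in units of 0.01 GiB.

```python
STRIPE = 100  # 1 GiB, in units of 0.01 GiB


def find_free_dev_extent(holes, num_bytes):
    """Model: first hole >= num_bytes, otherwise the largest hole."""
    best = 0
    for i, size in enumerate(holes):
        if size > holes[best]:
            best = i
        if size >= num_bytes:
            break
    return best, holes[best]


def allocate_raid1(devices, search_size):
    """Allocate 2-copy chunks until fewer than two devices have space.

    Returns (total chunk capacity, remaining holes per device).
    """
    devices = [list(holes) for holes in devices]
    total = 0
    while True:
        candidates = []
        for dev, holes in enumerate(devices):
            idx, size = find_free_dev_extent(holes, search_size)
            max_avail = min(size, search_size)  # assumed cap
            if max_avail > 0:
                candidates.append((max_avail, sum(holes), dev, idx))
        if len(candidates) < 2:
            return total, devices
        # Rank by largest usable hole, then by total free space.
        candidates.sort(key=lambda c: (c[0], c[1]), reverse=True)
        pair = candidates[:2]
        stripe = min(STRIPE, pair[0][0], pair[1][0])
        for _, _, dev, idx in pair:
            devices[dev][idx] -= stripe
        total += stripe


# The three devices from the example above.
DEVICES = [[100] * 1000, [100000], [101] * 10]

capacity_bug, left_bug = allocate_raid1(DEVICES, 10 * STRIPE)
capacity_fix, left_fix = allocate_raid1(DEVICES, STRIPE)

for name, capacity, left in (("10 GiB search", capacity_bug, left_bug),
                             ("1 GiB search", capacity_fix, left_fix)):
    free = ", ".join("%.2f" % (sum(h) / 100) for h in left)
    print("%s: %.2f GiB allocated, free per device: %s GiB"
          % (name, capacity / 100, free))
```

Under these assumptions the 10 GiB search size leaves 9.9 GiB stranded
on device 1, while the 1 GiB search size leaves only the 0.1 GiB of
slack on device 3. The shapes of the leftover holes in the model may
differ from the listings above; only the totals are meant to match.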
It looks like this bug was originally introduced in commit
73c5de005153 ("btrfs: quasi-round-robin for chunk allocation"), when
the parameter was changed from:
max available space on first device with space
to the product of:
max available space on every device with space *
number of devices with that amount of space
The problematic behavior didn't emerge until later, after changes adding
zoned support and the space_info max chunk size sysfs parameter. These
changes affect the calculations of the alloc_chunk_ctl members, which
find_free_dev_extent then recombines in surprising ways.
To fix this, pass only the max stripe length to find_free_dev_extent,
without multiplying by any other value.
Fixes: 73c5de005153 ("btrfs: quasi-round-robin for chunk allocation")
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>