# git rev-parse -q --verify 63355b9884b3d1677de6bd1517cd2b8a9bf53978^{commit}
63355b9884b3d1677de6bd1517cd2b8a9bf53978
already have revision, skipping fetch
# git checkout -q -f -B kisskb 63355b9884b3d1677de6bd1517cd2b8a9bf53978
# git clean -qxdf
# < git log -1
# commit 63355b9884b3d1677de6bd1517cd2b8a9bf53978
# Author: Linus Torvalds <torvalds@linux-foundation.org>
# Date:   Tue Mar 7 12:16:18 2023 -0800
# 
#     cpumask: be more careful with 'cpumask_setall()'
#     
#     Commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
#     optimizations") changed cpumask_setall() to use "bitmap_set()" instead
#     of "bitmap_fill()", because bitmap_fill() would explicitly set all the
#     bits of a constant sized small bitmap, and that's exactly what we don't
#     want: we want to only set bits up to 'nr_cpu_ids', which is what
#     "bitmap_set()" does.
#     
#     However, Yury correctly points out that while "bitmap_set()" does indeed
#     only set bits up to the required bitmap size, it doesn't _clear_ bits
#     above that size, so the upper bits would still not have well-defined
#     values.
#     
#     Now, none of this should really matter, since any bits set past
#     'nr_cpu_ids' should always be ignored in the first place.  Yes, the bit
#     scanning functions might return them as a result, but since users should
#     always consider the ">= nr_cpu_ids" condition to mean "no more bits",
#     that shouldn't have any actual effect (see previous commit 8ca09d5fa354
#     "cpumask: fix incorrect cpumask scanning result checks").
#     
#     But let's just do it right, the way the code was _intended_ to work.  We
#     have had enough lazy code that works but bites us in the *rse later
#     (again, see previous commit) that there's no reason to not just do this
#     properly.
#     
#     It turns out that "bitmap_fill()" gets this all right for the complex
#     case, and really only fails for the inlined optimized case that just
#     fills the whole word.  And while we could just fix bitmap_fill() to use
#     the proper last word mask, there are two issues with that:
#     
#      - the cpumask case wants to do the _optimization_ based on "NR_CPUS is
#        a small constant", but then wants to do the actual bit _fill_ based
#        on "nr_cpu_ids" that isn't necessarily that same constant
#     
#      - we have lots of non-cpumask users of bitmap_fill(), and while they
#        hopefully don't care, and probably would want the proper semantics
#        anyway ("only set bits up to the limit"), I do not want the cpumask
#        changes to impact other parts
#     
#     So this ends up just doing the single-word optimization by hand in the
#     cpumask code.  If our cpumask is fundamentally limited to a single word,
#     just do the proper "fill in that word" exactly.  And if it's the more
#     complex multi-word case, then the generic bitmap_fill() will DTRT.
#     
#     This is all an example of how our bitmap function optimizations really
#     are somewhat broken.  They conflate the "this is the size of the bitmap"
#     optimizations with the actual bit(s) we want to set.
#     
#     In many cases we really want to have the two be separate things:
#     sometimes we base our optimizations on the size of the whole bitmap ("I
#     know this whole bitmap fits in a single word, so I'll just use
#     single-word accesses"), and sometimes we base them on the bit we are
#     looking at ("this is just acting on bits that are in the first word, so
#     I'll use single-word accesses").
#     
#     Notice how the end result of the two optimizations is the same, but the
#     ways we get to it are quite different.
#     
#     And all our cpumask optimization games are really about that fundamental
#     distinction, and we'd often really want to pass in both the "this is the
#     bit I'm working on" (which _can_ be a small constant but might be
#     variable), and "I know it's in this range even if it's variable" (based
#     on CONFIG_NR_CPUS).
#     
#     So this cpumask_setall() implementation just makes that explicit.  It
#     checks the "I statically know the size is small" using the known static
#     size of the cpumask (which is what that 'small_cpumask_bits' is all
#     about), but then sets the actual bits using the exact number of cpus we
#     have (i.e. 'nr_cpumask_bits').
#     
#     Of course, in a perfect world, the compiler would have done all the
#     range analysis (possibly with help from us just telling it that
#     "this value is always in this range"), and would do all of this for us.
#     But that is not the world we live in.
#     
#     While we dream of that perfect world, this does that manual logic to
#     make it all work out.  And this was a very long explanation for a small
#     code change that shouldn't even matter.
#     
#     Reported-by: Yury Norov <yury.norov@gmail.com>
#     Link: https://lore.kernel.org/lkml/ZAV9nGG9e1%2FrV+L%2F@yury-laptop/
#     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
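The single-word approach described in the commit message above can be sketched in user-space C. This is a minimal illustration, not the kernel's code: `MAX_CPUS`, `last_word_mask()`, `generic_fill()` and `mask_setall()` are hypothetical names standing in for the compile-time CPU limit, the last-word mask, the generic multi-word fill and `cpumask_setall()`. The optimization is chosen from the static size, while the bits actually set come from the runtime count.

```c
#include <limits.h>

#define BITS_PER_WORD ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical compile-time upper bound, standing in for NR_CPUS. */
#define MAX_CPUS 8

/* Mask with the low 'nbits' bits set; nbits must be in 1..BITS_PER_WORD. */
static unsigned long last_word_mask(unsigned int nbits)
{
	return ~0UL >> (BITS_PER_WORD - nbits);
}

/* Hypothetical generic multi-word fill: set exactly the low 'nbits' bits. */
static void generic_fill(unsigned long *map, unsigned int nbits)
{
	unsigned int full = nbits / BITS_PER_WORD;
	unsigned int rem = nbits % BITS_PER_WORD;
	unsigned int i;

	for (i = 0; i < full; i++)
		map[i] = ~0UL;
	if (rem)
		map[full] = last_word_mask(rem);
}

/*
 * Sketch of the idea: decide on the fast path from the *static* size
 * (MAX_CPUS), but fill using the *runtime* count (nr_ids). Assigning the
 * whole word also clears any stale bits above nr_ids.
 */
static void mask_setall(unsigned long *mask, unsigned int nr_ids)
{
	if (MAX_CPUS <= BITS_PER_WORD) {
		mask[0] = last_word_mask(nr_ids);
		return;
	}
	generic_fill(mask, nr_ids);
}
```

Calling `mask_setall(m, 3)` on a word that previously held all ones leaves only the low three bits set, which is the "clear the bits above the limit" behaviour the commit message asks for.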
# < /opt/cross/kisskb/korg/gcc-11.3.0-nolibc/hppa-linux/bin/hppa-linux-gcc --version
# < /opt/cross/kisskb/korg/gcc-11.3.0-nolibc/hppa-linux/bin/hppa-linux-ld --version
# < git log --format=%s --max-count=1 63355b9884b3d1677de6bd1517cd2b8a9bf53978
# < make -s -j 24 ARCH=parisc O=/kisskb/build/linus_parisc-defconfig_parisc-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.3.0-nolibc/hppa-linux/bin/hppa-linux-  defconfig
# < make -s -j 24 ARCH=parisc O=/kisskb/build/linus_parisc-defconfig_parisc-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.3.0-nolibc/hppa-linux/bin/hppa-linux-  help
# make -s -j 24 ARCH=parisc O=/kisskb/build/linus_parisc-defconfig_parisc-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.3.0-nolibc/hppa-linux/bin/hppa-linux-  olddefconfig
# make -s -j 24 ARCH=parisc O=/kisskb/build/linus_parisc-defconfig_parisc-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.3.0-nolibc/hppa-linux/bin/hppa-linux-  
Completed OK
# rm -rf /kisskb/build/linus_parisc-defconfig_parisc-gcc11
# Build took: 0:02:15.053730