# git rev-parse -q --verify 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6^{commit}
# git fetch -q -n -f git://fs.ozlabs.ibm.com/kernel/linus master
warning: The last gc run reported the following. Please correct the root cause
and remove .git/gc.log.
Automatic cleanup will not be performed until the file is removed.

warning: There are too many unreachable loose objects; run 'git prune' to remove them.

# git rev-parse -q --verify 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6^{commit}
80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6
# git checkout -q -f -B kisskb 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6
# git clean -qxdf
# < git log -1
# commit 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6
# Author: Linus Torvalds
# Date: Thu Feb 17 08:57:47 2022 -0800
#
# mm: don't try to NUMA-migrate COW pages that have other uses
#
# Oded Gabbay reports that enabling NUMA balancing causes corruption with
# his Gaudi accelerator test load:
#
# "All the details are in the bug, but the bottom line is that somehow,
# this patch causes corruption when the numa balancing feature is
# enabled AND we don't use process affinity AND we use GUP to pin pages
# so our accelerator can DMA to/from system memory.
#
# Either disabling numa balancing, using process affinity to bind to
# specific numa-node or reverting this patch causes the bug to
# disappear"
#
# and Oded bisected the issue to commit 09854ba94c6a ("mm: do_wp_page()
# simplification").
#
# Now, the NUMA balancing shouldn't actually be changing the writability
# of a page, and as such shouldn't matter for COW. But it appears it
# does. Suspicious.
#
# However, regardless of that, the condition for enabling NUMA faults in
# change_pte_range() is nonsensical. It uses "page_mapcount(page)" to
# decide if a COW page should be NUMA-protected or not, and that makes
# absolutely no sense.
#
# The number of mappings a page has is irrelevant: not only does GUP get a
# reference to a page as in Oded's case, but the other mappings migth be
# paged out and the only reference to them would be in the page count.
#
# Since we should never try to NUMA-balance a page that we can't move
# anyway due to other references, just fix the code to use 'page_count()'.
# Oded confirms that that fixes his issue.
#
# Now, this does imply that something in NUMA balancing ends up changing
# page protections (other than the obvious one of making the page
# inaccessible to get the NUMA faulting information). Otherwise the COW
# simplification wouldn't matter - since doing the GUP on the page would
# make sure it's writable.
#
# The cause of that permission change would be good to figure out too,
# since it clearly results in spurious COW events - but fixing the
# nonsensical test that just happened to work before is obviously the
# CorrectThing(tm) to do regardless.
#
# Fixes: 09854ba94c6a ("mm: do_wp_page() simplification")
# Link: https://bugzilla.kernel.org/show_bug.cgi?id=215616
# Link: https://lore.kernel.org/all/CAFCwf10eNmwq2wD71xjUhqkvv5+_pJMR1nPug2RqNDcFT4H86Q@mail.gmail.com/
# Reported-and-tested-by: Oded Gabbay
# Cc: David Hildenbrand
# Cc: Peter Xu
# Signed-off-by: Linus Torvalds
# < /opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-gcc --version
# < /opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-ld --version
# < git log --format=%s --max-count=1 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6
# < make -s -j 32 ARCH=x86 O=/kisskb/build/linus_x86-allmodconfig_x86_64-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux- allmodconfig
# Added to kconfig CONFIG_BUILD_DOCSRC=n
# Added to kconfig CONFIG_MODULE_SIG=n
# Added to kconfig CONFIG_SAMPLES=n
# < make -s -j 32 ARCH=x86 O=/kisskb/build/linus_x86-allmodconfig_x86_64-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux- help
# make -s -j 32 ARCH=x86 O=/kisskb/build/linus_x86-allmodconfig_x86_64-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux- olddefconfig
# make -s -j 32 ARCH=x86 O=/kisskb/build/linus_x86-allmodconfig_x86_64-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-
/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-objdump: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-objdump: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-objdump: mm/kfence/kfence_test.o: File format not recognized
/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-ld: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-ld: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-ld: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux-ld: mm/kfence/kfence_test.o: unable to initialize decompress status for section .debug_info
mm/kfence/kfence_test.o: file not recognized: File format not recognized
make[3]: *** [/kisskb/src/scripts/Makefile.modfinal:60: mm/kfence/kfence_test.ko] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [/kisskb/src/scripts/Makefile.modpost:140: __modpost] Error 2
make[1]: *** [/kisskb/src/Makefile:1746: modules] Error 2
make: *** [Makefile:219: __sub-make] Error 2
Command 'make -s -j 32 ARCH=x86 O=/kisskb/build/linus_x86-allmodconfig_x86_64-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/x86_64-linux/bin/x86_64-linux- ' returned non-zero exit status 2
# rm -rf /kisskb/build/linus_x86-allmodconfig_x86_64-gcc8
# Build took: 0:17:34.241257