# git rev-parse -q --verify 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6^{commit}
80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6
already have revision, skipping fetch
# git checkout -q -f -B kisskb 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6
# git clean -qxdf
# < git log -1
# commit 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6
# Author: Linus Torvalds
# Date: Thu Feb 17 08:57:47 2022 -0800
#
#     mm: don't try to NUMA-migrate COW pages that have other uses
#
#     Oded Gabbay reports that enabling NUMA balancing causes corruption with
#     his Gaudi accelerator test load:
#
#     "All the details are in the bug, but the bottom line is that somehow,
#     this patch causes corruption when the numa balancing feature is
#     enabled AND we don't use process affinity AND we use GUP to pin pages
#     so our accelerator can DMA to/from system memory.
#
#     Either disabling numa balancing, using process affinity to bind to
#     specific numa-node or reverting this patch causes the bug to
#     disappear"
#
#     and Oded bisected the issue to commit 09854ba94c6a ("mm: do_wp_page()
#     simplification").
#
#     Now, the NUMA balancing shouldn't actually be changing the writability
#     of a page, and as such shouldn't matter for COW. But it appears it
#     does. Suspicious.
#
#     However, regardless of that, the condition for enabling NUMA faults in
#     change_pte_range() is nonsensical. It uses "page_mapcount(page)" to
#     decide if a COW page should be NUMA-protected or not, and that makes
#     absolutely no sense.
#
#     The number of mappings a page has is irrelevant: not only does GUP get a
#     reference to a page as in Oded's case, but the other mappings migth be
#     paged out and the only reference to them would be in the page count.
#
#     Since we should never try to NUMA-balance a page that we can't move
#     anyway due to other references, just fix the code to use 'page_count()'.
#     Oded confirms that that fixes his issue.
#
#     Now, this does imply that something in NUMA balancing ends up changing
#     page protections (other than the obvious one of making the page
#     inaccessible to get the NUMA faulting information). Otherwise the COW
#     simplification wouldn't matter - since doing the GUP on the page would
#     make sure it's writable.
#
#     The cause of that permission change would be good to figure out too,
#     since it clearly results in spurious COW events - but fixing the
#     nonsensical test that just happened to work before is obviously the
#     CorrectThing(tm) to do regardless.
#
#     Fixes: 09854ba94c6a ("mm: do_wp_page() simplification")
#     Link: https://bugzilla.kernel.org/show_bug.cgi?id=215616
#     Link: https://lore.kernel.org/all/CAFCwf10eNmwq2wD71xjUhqkvv5+_pJMR1nPug2RqNDcFT4H86Q@mail.gmail.com/
#     Reported-and-tested-by: Oded Gabbay
#     Cc: David Hildenbrand
#     Cc: Peter Xu
#     Signed-off-by: Linus Torvalds
# < /opt/cross/kisskb/korg/gcc-11.1.0-nolibc/x86_64-linux/bin/x86_64-linux-gcc --version
# < /opt/cross/kisskb/korg/gcc-11.1.0-nolibc/x86_64-linux/bin/x86_64-linux-ld --version
# < git log --format=%s --max-count=1 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6
# < make -s -j 32 ARCH=x86_64 O=/kisskb/build/linus-rand_x86_64-randconfig_x86_64-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.1.0-nolibc/x86_64-linux/bin/x86_64-linux- randconfig
# Added to kconfig CONFIG_STANDALONE=y
# Added to kconfig CONFIG_PREVENT_FIRMWARE_BUILD=y
# Added to kconfig CONFIG_CC_STACKPROTECTOR_STRONG=n
# Added to kconfig CONFIG_GCC_PLUGINS=n
# Added to kconfig CONFIG_GCC_PLUGIN_CYC_COMPLEXITY=n
# Added to kconfig CONFIG_GCC_PLUGIN_SANCOV=n
# Added to kconfig CONFIG_GCC_PLUGIN_LATENT_ENTROPY=n
# Added to kconfig CONFIG_BPF_PRELOAD=n
# Added to kconfig
# < make -s -j 32 ARCH=x86_64 O=/kisskb/build/linus-rand_x86_64-randconfig_x86_64-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.1.0-nolibc/x86_64-linux/bin/x86_64-linux- help
# make -s -j 32 ARCH=x86_64 O=/kisskb/build/linus-rand_x86_64-randconfig_x86_64-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.1.0-nolibc/x86_64-linux/bin/x86_64-linux- olddefconfig
.config:4604:warning: override: reassigning to symbol STANDALONE
.config:4607:warning: override: reassigning to symbol GCC_PLUGINS
# make -s -j 32 ARCH=x86_64 O=/kisskb/build/linus-rand_x86_64-randconfig_x86_64-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.1.0-nolibc/x86_64-linux/bin/x86_64-linux-
lib/strncpy_from_user.o: warning: objtool: strncpy_from_user()+0x3c9: call to do_strncpy_from_user() with UACCESS enabled
lib/strnlen_user.o: warning: objtool: strnlen_user()+0x386: call to do_strnlen_user() with UACCESS enabled
WARNING: modpost: vmlinux.o(.text.unlikely+0x23cd): Section mismatch in reference from the function svm_hv_hardware_setup() to the (unknown reference) .init.data:(unknown)
The function svm_hv_hardware_setup() references
the (unknown reference) __initdata (unknown).
This is often because svm_hv_hardware_setup lacks a __initdata
annotation or the annotation of (unknown) is wrong.

WARNING: modpost: vmlinux.o(.text.unlikely+0x23e0): Section mismatch in reference from the function svm_hv_hardware_setup() to the (unknown reference) .init.data:(unknown)
The function svm_hv_hardware_setup() references
the (unknown reference) __initdata (unknown).
This is often because svm_hv_hardware_setup lacks a __initdata
annotation or the annotation of (unknown) is wrong.

WARNING: modpost: vmlinux.o(.text.unlikely+0x23e7): Section mismatch in reference from the function svm_hv_hardware_setup() to the (unknown reference) .init.data:(unknown)
The function svm_hv_hardware_setup() references
the (unknown reference) __initdata (unknown).
This is often because svm_hv_hardware_setup lacks a __initdata
annotation or the annotation of (unknown) is wrong.

WARNING: modpost: vmlinux.o(.text.unlikely+0x23f7): Section mismatch in reference from the function svm_hv_hardware_setup() to the (unknown reference) .init.data:(unknown)
The function svm_hv_hardware_setup() references
the (unknown reference) __initdata (unknown).
This is often because svm_hv_hardware_setup lacks a __initdata
annotation or the annotation of (unknown) is wrong.

WARNING: modpost: vmlinux.o(.text.unlikely+0x25af): Section mismatch in reference from the function svm_hv_hardware_setup() to the (unknown reference) .init.data:(unknown)
The function svm_hv_hardware_setup() references
the (unknown reference) __initdata (unknown).
This is often because svm_hv_hardware_setup lacks a __initdata
annotation or the annotation of (unknown) is wrong.

WARNING: modpost: vmlinux.o(.text.unlikely+0x25bb): Section mismatch in reference from the function svm_hv_hardware_setup() to the (unknown reference) .init.data:(unknown)
The function svm_hv_hardware_setup() references
the (unknown reference) __initdata (unknown).
This is often because svm_hv_hardware_setup lacks a __initdata
annotation or the annotation of (unknown) is wrong.

ERROR: modpost: Section mismatches detected.
Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them.
make[2]: *** [/kisskb/src/scripts/Makefile.modpost:59: vmlinux.symvers] Error 1
make[2]: *** Deleting file 'vmlinux.symvers'
make[1]: *** [/kisskb/src/Makefile:1155: vmlinux] Error 2
make: *** [Makefile:219: __sub-make] Error 2
Command 'make -s -j 32 ARCH=x86_64 O=/kisskb/build/linus-rand_x86_64-randconfig_x86_64-gcc11 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-11.1.0-nolibc/x86_64-linux/bin/x86_64-linux- ' returned non-zero exit status 2
# rm -rf /kisskb/build/linus-rand_x86_64-randconfig_x86_64-gcc11
# Build took: 0:03:21.749929