# git rev-parse -q --verify 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6^{commit} 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6 already have revision, skipping fetch # git checkout -q -f -B kisskb 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6 # git clean -qxdf # < git log -1 # commit 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6 # Author: Linus Torvalds # Date: Thu Feb 17 08:57:47 2022 -0800 # # mm: don't try to NUMA-migrate COW pages that have other uses # # Oded Gabbay reports that enabling NUMA balancing causes corruption with # his Gaudi accelerator test load: # # "All the details are in the bug, but the bottom line is that somehow, # this patch causes corruption when the numa balancing feature is # enabled AND we don't use process affinity AND we use GUP to pin pages # so our accelerator can DMA to/from system memory. # # Either disabling numa balancing, using process affinity to bind to # specific numa-node or reverting this patch causes the bug to # disappear" # # and Oded bisected the issue to commit 09854ba94c6a ("mm: do_wp_page() # simplification"). # # Now, the NUMA balancing shouldn't actually be changing the writability # of a page, and as such shouldn't matter for COW. But it appears it # does. Suspicious. # # However, regardless of that, the condition for enabling NUMA faults in # change_pte_range() is nonsensical. It uses "page_mapcount(page)" to # decide if a COW page should be NUMA-protected or not, and that makes # absolutely no sense. # # The number of mappings a page has is irrelevant: not only does GUP get a # reference to a page as in Oded's case, but the other mappings migth be # paged out and the only reference to them would be in the page count. # # Since we should never try to NUMA-balance a page that we can't move # anyway due to other references, just fix the code to use 'page_count()'. # Oded confirms that that fixes his issue. # # Now, this does imply that something in NUMA balancing ends up changing # page protections (other than the obvious one of making the page # inaccessible to get the NUMA faulting information). Otherwise the COW # simplification wouldn't matter - since doing the GUP on the page would # make sure it's writable. # # The cause of that permission change would be good to figure out too, # since it clearly results in spurious COW events - but fixing the # nonsensical test that just happened to work before is obviously the # CorrectThing(tm) to do regardless. # # Fixes: 09854ba94c6a ("mm: do_wp_page() simplification") # Link: https://bugzilla.kernel.org/show_bug.cgi?id=215616 # Link: https://lore.kernel.org/all/CAFCwf10eNmwq2wD71xjUhqkvv5+_pJMR1nPug2RqNDcFT4H86Q@mail.gmail.com/ # Reported-and-tested-by: Oded Gabbay # Cc: David Hildenbrand # Cc: Peter Xu # Signed-off-by: Linus Torvalds # < /opt/cross/kisskb/korg/gcc-8.1.0-nolibc/mips-linux/bin/mips-linux-gcc --version # < /opt/cross/kisskb/korg/gcc-8.1.0-nolibc/mips-linux/bin/mips-linux-ld --version # < git log --format=%s --max-count=1 80d47f5de5e311cbc0d01ebb6ee684e8f4c196c6 # < make -s -j 32 ARCH=mips O=/kisskb/build/linus_mips-allmodconfig_mips-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/mips-linux/bin/mips-linux- allmodconfig # Added to kconfig CONFIG_BUILD_DOCSRC=n # Added to kconfig CONFIG_MODULE_SIG=n # Added to kconfig CONFIG_SAMPLES=n # Added to kconfig CONFIG_MIPS_CPS_NS16550_BASE=0x1b0003f8 # Added to kconfig CONFIG_MIPS_CPS_NS16550_SHIFT=0 # < make -s -j 32 ARCH=mips O=/kisskb/build/linus_mips-allmodconfig_mips-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/mips-linux/bin/mips-linux- help # make -s -j 32 ARCH=mips O=/kisskb/build/linus_mips-allmodconfig_mips-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/mips-linux/bin/mips-linux- olddefconfig .config:13501:warning: override: reassigning to symbol MIPS_CPS_NS16550_SHIFT # make -s -j 32 ARCH=mips O=/kisskb/build/linus_mips-allmodconfig_mips-gcc8 CROSS_COMPILE=/opt/cross/kisskb/korg/gcc-8.1.0-nolibc/mips-linux/bin/mips-linux- /kisskb/src/arch/mips/boot/dts/img/boston.dts:128.19-178.5: Warning (pci_device_reg): /pci@14000000/pci2_root@0,0,0: PCI unit address format error, expected "0,0" /kisskb/src/arch/mips/boot/dts/ingenic/jz4780.dtsi:513.33-515.6: Warning (unit_address_format): /nemc@13410000/efuse@d0/eth-mac-addr@0x22: unit name should not have leading "0x" Completed OK # rm -rf /kisskb/build/linus_mips-allmodconfig_mips-gcc8 # Build took: 0:32:21.438065