# git rev-parse -q --verify 0a53820353d1741964963ede60231c816692da4c^{commit}
0a53820353d1741964963ede60231c816692da4c
already have revision, skipping fetch
# git checkout -q -f -B kisskb 0a53820353d1741964963ede60231c816692da4c
# git clean -qxdf
# < git log -1
# commit 0a53820353d1741964963ede60231c816692da4c
# Author: Srikar Dronamraju
# Date: Mon Aug 13 12:55:50 2018 +0530
#
# powerpc/topology: Get topology for shared processors at boot
#
# On a shared lpar, Phyp will not update the cpu associativity at boot
# time. Just after the boot system does recognize itself as a shared lpar and
# trigger a request for correct cpu associativity. But by then the scheduler
# would have already created/destroyed its sched domains.
#
# This causes:
# - Broken load balance across Nodes causing islands of cores.
# - Performance degradation esp if the system is lightly loaded
# - dmesg to wrongly report all CPUs to be in Node 0.
# - Messages in dmesg saying borken topology.
# - With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity
#   node sched domain"), can cause rcu stalls at boot up.
#
# The sched_domains_numa_masks table which is used to generate cpumasks
# is only created at boot time just before creating sched domains and
# never updated. Hence, its better to get the topology correct before
# the sched domains are created.
#
# For example on 64 core Power 8 shared lpar, dmesg reports:
#
#   Brought up 512 CPUs
#   Node 0 CPUs: 0-511
#   Node 1 CPUs:
#   Node 2 CPUs:
#   Node 3 CPUs:
#   Node 4 CPUs:
#   Node 5 CPUs:
#   Node 6 CPUs:
#   Node 7 CPUs:
#   Node 8 CPUs:
#   Node 9 CPUs:
#   Node 10 CPUs:
#   Node 11 CPUs:
#   ...
#   BUG: arch topology borken
#        the DIE domain not a subset of the NUMA domain
#   BUG: arch topology borken
#        the DIE domain not a subset of the NUMA domain
#   ...
#
# numactl/lscpu output will still be correct with cores spreading across
# all nodes.
#
#   Socket(s):           64
#   NUMA node(s):        12
#   Model:               2.0 (pvr 004d 0200)
#   Model name:          POWER8 (architected), altivec supported
#   Hypervisor vendor:   pHyp
#   Virtualization type: para
#   L1d cache:           64K
#   L1i cache:           32K
#   NUMA node0 CPU(s):   0-7,32-39,64-71,96-103,176-183,272-279,368-375,464-471
#   NUMA node1 CPU(s):   8-15,40-47,72-79,104-111,184-191,280-287,376-383,472-479
#   NUMA node2 CPU(s):   16-23,48-55,80-87,112-119,192-199,288-295,384-391,480-487
#   NUMA node3 CPU(s):   24-31,56-63,88-95,120-127,200-207,296-303,392-399,488-495
#   NUMA node4 CPU(s):   208-215,304-311,400-407,496-503
#   NUMA node5 CPU(s):   168-175,264-271,360-367,456-463
#   NUMA node6 CPU(s):   128-135,224-231,320-327,416-423
#   NUMA node7 CPU(s):   136-143,232-239,328-335,424-431
#   NUMA node8 CPU(s):   216-223,312-319,408-415,504-511
#   NUMA node9 CPU(s):   144-151,240-247,336-343,432-439
#   NUMA node10 CPU(s):  152-159,248-255,344-351,440-447
#   NUMA node11 CPU(s):  160-167,256-263,352-359,448-455
#
# Currently on this LPAR, the scheduler detects 2 levels of NUMA and
# created NUMA sched domains for all CPUs, but it finds a single DIE
# domain consisting of all CPUs. Hence it deletes all NUMA sched
# domains.
#
# To address this, detect the shared processor and update topology soon
# after CPUs are setup so that correct topology is updated just before
# scheduler creates sched domain.
#
# With the fix, dmesg reports
#
#   numa: Node 0 CPUs: 0-7 32-39 64-71 96-103 176-183 272-279 368-375 464-471
#   numa: Node 1 CPUs: 8-15 40-47 72-79 104-111 184-191 280-287 376-383 472-479
#   numa: Node 2 CPUs: 16-23 48-55 80-87 112-119 192-199 288-295 384-391 480-487
#   numa: Node 3 CPUs: 24-31 56-63 88-95 120-127 200-207 296-303 392-399 488-495
#   numa: Node 4 CPUs: 208-215 304-311 400-407 496-503
#   numa: Node 5 CPUs: 168-175 264-271 360-367 456-463
#   numa: Node 6 CPUs: 128-135 224-231 320-327 416-423
#   numa: Node 7 CPUs: 136-143 232-239 328-335 424-431
#   numa: Node 8 CPUs: 216-223 312-319 408-415 504-511
#   numa: Node 9 CPUs: 144-151 240-247 336-343 432-439
#   numa: Node 10 CPUs: 152-159 248-255 344-351 440-447
#   numa: Node 11 CPUs: 160-167 256-263 352-359 448-455
#
# and lscpu would also report:
#
#   Socket(s):           64
#   NUMA node(s):        12
#   Model:               2.0 (pvr 004d 0200)
#   Model name:          POWER8 (architected), altivec supported
#   Hypervisor vendor:   pHyp
#   Virtualization type: para
#   L1d cache:           64K
#   L1i cache:           32K
#   NUMA node0 CPU(s):   0-7,32-39,64-71,96-103,176-183,272-279,368-375,464-471
#   NUMA node1 CPU(s):   8-15,40-47,72-79,104-111,184-191,280-287,376-383,472-479
#   NUMA node2 CPU(s):   16-23,48-55,80-87,112-119,192-199,288-295,384-391,480-487
#   NUMA node3 CPU(s):   24-31,56-63,88-95,120-127,200-207,296-303,392-399,488-495
#   NUMA node4 CPU(s):   208-215,304-311,400-407,496-503
#   NUMA node5 CPU(s):   168-175,264-271,360-367,456-463
#   NUMA node6 CPU(s):   128-135,224-231,320-327,416-423
#   NUMA node7 CPU(s):   136-143,232-239,328-335,424-431
#   NUMA node8 CPU(s):   216-223,312-319,408-415,504-511
#   NUMA node9 CPU(s):   144-151,240-247,336-343,432-439
#   NUMA node10 CPU(s):  152-159,248-255,344-351,440-447
#   NUMA node11 CPU(s):  160-167,256-263,352-359,448-455
#
# Reported-by: Manjunatha H R
# Signed-off-by: Srikar Dronamraju
# [mpe: Trim / format change log]
# Signed-off-by: Michael Ellerman
# < /opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux-gcc --version
# < git log --format=%s --max-count=1 0a53820353d1741964963ede60231c816692da4c
# < make -s -j 80 ARCH=mips O=/kisskb/build/powerpc-next_mips-defconfig_mipsel CROSS_COMPILE=/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux- defconfig
# make -s -j 80 ARCH=mips O=/kisskb/build/powerpc-next_mips-defconfig_mipsel CROSS_COMPILE=/opt/cross/kisskb/br-mipsel-o32-full-2016.08-613-ge98b4dd/bin/mipsel-linux-
FIT description: Linux 4.18.0-rc3-g0a53820
Created:         Mon Aug 20 21:50:51 2018
 Image 0 (kernel@0)
  Description:  Linux 4.18.0-rc3-g0a53820
  Created:      Mon Aug 20 21:50:51 2018
  Type:         Kernel Image
  Compression:  gzip compressed
  Data Size:    4499347 Bytes = 4393.89 kB = 4.29 MB
  Architecture: MIPS
  OS:           Linux
  Load Address: 0x80100000
  Entry Point:  0x80877f10
  Hash algo:    sha1
  Hash value:   2a8b94099d6f620a7bc33bf86914eebfc8ae6225
 Image 1 (fdt@boston)
  Description:  img,boston Device Tree
  Created:      Mon Aug 20 21:50:51 2018
  Type:         Flat Device Tree
  Compression:  uncompressed
  Data Size:    3668 Bytes = 3.58 kB = 0.00 MB
  Architecture: MIPS
  Hash algo:    sha1
  Hash value:   569c37cc891ce1e1f3a193cb41cc691a5d2debb5
 Image 2 (fdt@ni169445)
  Description:  NI 169445 device tree
  Created:      Mon Aug 20 21:50:51 2018
  Type:         Flat Device Tree
  Compression:  uncompressed
  Data Size:    1871 Bytes = 1.83 kB = 0.00 MB
  Architecture: MIPS
  Hash algo:    sha1
  Hash value:   51b89b31605ee62038c8468c429af091dfc75ec7
 Image 3 (fdt@xilfpga)
  Description:  MIPSfpga (xilfpga) Device Tree
  Created:      Mon Aug 20 21:50:51 2018
  Type:         Flat Device Tree
  Compression:  uncompressed
  Data Size:    2708 Bytes = 2.64 kB = 0.00 MB
  Architecture: MIPS
  Hash algo:    sha1
  Hash value:   509ce58e44c561d54539e64e9d4b47054e696fc6
 Default Configuration: 'conf@default'
 Configuration 0 (conf@default)
  Description:  Generic Linux kernel
  Kernel:       kernel@0
 Configuration 1 (conf@boston)
  Description:  Boston Linux kernel
  Kernel:       kernel@0
  FDT:          fdt@boston
 Configuration 2 (conf@ni169445)
  Description:  NI 169445 Linux Kernel
  Kernel:       kernel@0
  FDT:          fdt@ni169445
 Configuration 3 (conf@xilfpga)
  Description:  MIPSfpga Linux kernel
  Kernel:       kernel@0
  FDT:          fdt@xilfpga
Completed OK
# rm -rf /kisskb/build/powerpc-next_mips-defconfig_mipsel
# Build took: 0:01:56.536161