Networking troubleshooting grid

Revision as of 17:48, 9 April 2020 by Registered User (Add ethernet flood entry)

Typical issues related to the Networking feature are listed below, each with a proposed solution or debugging method.

If your issue is not listed, also check the articles in the Networking or Troubleshooting grids categories.


Each entry below gives the symptom (kernel log) followed by its resolution.
swapper/0: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC)
[...]
[43154.580149] stm32-dwmac 5800a000.ethernet: fail to alloc skb entry 430
[...]
[<c0232144>] (warn_alloc) from [<c02332f4>] (__alloc_pages_nodemask+0x1018/0x144c)
[<c02332f4>] (__alloc_pages_nodemask) from [<c02803e4>] (new_slab+0x4f0/0x5c0)
[...]
[<c0282e94>] (kmem_cache_alloc) from [<c0844cac>] (__build_skb+0x28/0x98)
[<c0844cac>] (__build_skb) from [<c0844e6c>] (__netdev_alloc_skb+0xe8/0x138)
[<c0844e6c>] (__netdev_alloc_skb) from [<c06ad700>] (stmmac_rx+0x698/0xb90)

When DDR memory is heavily used, this issue can occur because the memory reserve used for atomic allocations is under contention. The solution is to increase "vm.min_free_kbytes", but take care not to increase this reserve too much, since an oversized reserve can itself lead to OOM (out-of-memory) conditions. The recommendation is min_free_kbytes = 5% of DDR size / <nr_cpus>.
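The recommendation above can be turned into a number with a short script. This is a minimal sketch, not an official tool: `recommended_min_free_kbytes` is a hypothetical helper, `MemTotal` from /proc/meminfo is assumed to be a close enough approximation of the DDR size, and `nproc` is assumed to reflect <nr_cpus>.

```shell
#!/bin/sh
# Sketch: compute vm.min_free_kbytes from "5% of DDR size / nr_cpus".
# recommended_min_free_kbytes is a hypothetical helper for illustration.
# Arguments: memory size in kB, number of CPUs. Integer arithmetic only.
recommended_min_free_kbytes() {
    echo $(( $1 * 5 / 100 / $2 ))
}

if [ -r /proc/meminfo ]; then
    # Assumption: MemTotal (kB) approximates the DDR size.
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    nr_cpus=$(nproc)
    echo "current vm.min_free_kbytes:     $(cat /proc/sys/vm/min_free_kbytes)"
    echo "recommended vm.min_free_kbytes: $(recommended_min_free_kbytes "$mem_kb" "$nr_cpus")"
fi

# To apply the value until the next reboot (as root):
#   sysctl -w vm.min_free_kbytes=<value>
```

For example, with 1 GiB of memory (1048576 kB) and 2 CPUs, the helper returns 26214. The script only prints the value, so the current setting can be compared with it before anything is changed.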

[21727.053570] WARNING: CPU: 0 PID: 9 at net/sched/sch_generic.c:447 dev_watchdog+0x300/0x304
[21727.061869] NETDEV WATCHDOG: eth0 (stm32-dwmac): transmit queue 0 timed out
[...]
[21727.220936] stm32-dwmac 5800a000.ethernet eth0: Reset adapter.
[21727.264380] stm32-dwmac 5800a000.ethernet eth0: PHY [stmmac-0:00] driver [Generic PHY]
[21727.287491] dwmac4: Master AXI performs any burst length
[21727.291421] stm32-dwmac 5800a000.ethernet eth0: No Safety Features support found
[21727.298779] stm32-dwmac 5800a000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[21727.311812] stm32-dwmac 5800a000.ethernet eth0: registered PTP clock
[21727.316730] stm32-dwmac 5800a000.ethernet eth0: configuring for phy/rgmii-id link mode
[21732.507133] stm32-dwmac 5800a000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[21732.529889] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

In case of a DoS attack (for example, a SYN flood), this issue can occur because the CPUs spend most of their time in the Ethernet RX IRQ handler: after a while, a NETDEV WATCHDOG timeout can occur, which resets the Ethernet interface.

To work around this, assign the eth0 IRQ to CPU1 only with this command:
echo 2 > /proc/irq/49/smp_affinity
where 49 is the eth0 IRQ number. In that case, CPU0 is still able to schedule, which avoids the watchdog.
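The IRQ number and the affinity mask are board- and CPU-specific, so it may help to derive them rather than hard-code them. The sketch below assumes the Ethernet IRQ appears in /proc/interrupts under the interface name (eth0); `cpu_mask` and `irq_of` are hypothetical helpers, and the script prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: build the smp_affinity command for an interface and a CPU.
# cpu_mask and irq_of are hypothetical helpers for illustration.

# Hexadecimal affinity mask with only bit <cpu> set (CPU0 -> 1, CPU1 -> 2).
cpu_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

# IRQ numbers whose /proc/interrupts line mentions the interface name.
# Assumption: the driver registers its IRQ under that name.
irq_of() {
    [ -r /proc/interrupts ] || return 0
    awk -v ifname="$1" '$0 ~ ifname { sub(/:$/, "", $1); print $1 }' /proc/interrupts
}

for irq in $(irq_of eth0); do
    # Printed rather than executed: writing smp_affinity requires root.
    echo "echo $(cpu_mask 1) > /proc/irq/$irq/smp_affinity"
done
```

The mask is a bit field with one bit per CPU, which is why CPU1 alone corresponds to the value 2 used above. The current setting can be read back with cat /proc/irq/<irq>/smp_affinity.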