Date:      Sun, 14 Dec 2025 03:34:40 +0000
From:      bugzilla-noreply@freebsd.org
To:        net@FreeBSD.org
Subject:   [Bug 289017] [lagg] A time-of-check to time-of-use (TOCTOU) race exists in the Link Aggregation (LAGG) network subsystem
Message-ID:  <bug-289017-7501-oIpv7jd7Hu@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-289017-7501@https.bugs.freebsd.org/bugzilla/>
References:  <bug-289017-7501@https.bugs.freebsd.org/bugzilla/>


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=289017

--- Comment #2 from Qiu-ji Chen <chenqiuji666@gmail.com> ---
(In reply to Zhenlei Huang from comment #1)

Thank you for the response.

I have successfully verified this TOCTOU issue dynamically, moving beyond the
initial static analysis.

To reproduce the race condition reliably, I set up a QEMU environment with a
custom FreeBSD 14.3 kernel. I injected a busy-wait (DELAY) into
lagg_transmit_ethernet, between the protocol check (sc_proto != NONE) and the
call through the function pointer (lagg_proto_start). I then wrote a
multi-threaded PoC in which one thread repeatedly toggles the lagg protocol via
SIOCSLAGG while multiple victim threads flood the interface with packets.

This setup triggered a kernel panic (Fatal trap 12: page fault with
instruction pointer 0x0), which shows that the protocol can be switched to NONE
after the check passes but before the function pointer is used.
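
As a rough illustration of the mechanism, the check-then-use sequence can be
modelled in plain userspace C. This is only a sketch, not the actual if_lagg.c
code: struct softc, proto_table, transmit_racy, transmit_snapshot and
switch_to_none are hypothetical names, and the window callback stands in for
the injected DELAY plus the concurrent SIOCSLAGG. It assumes the NONE entry has
no start handler, which is consistent with the instruction pointer of 0x0 in
the panic. The racy variant returns an error code where the kernel would jump
to address 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lagg protocol numbers. */
enum proto { PROTO_NONE = 0, PROTO_RR = 1 };

typedef int (*start_fn)(int pkt);

/* Toy "start" handler: returns the packet id plus one. */
static int rr_start(int pkt) { return pkt + 1; }

/* Handler table indexed by protocol; the NONE slot has no handler. */
static const start_fn proto_table[] = { NULL, rr_start };

struct softc { volatile enum proto proto; };

/* Callback that models whatever happens inside the race window. */
typedef void (*window_fn)(struct softc *);

enum { ERR_NO_PROTO = -1, ERR_NULL_HANDLER = -2 };

/* Models the other thread switching the protocol to NONE. */
void switch_to_none(struct softc *sc) { sc->proto = PROTO_NONE; }

/* Racy pattern: the protocol is read once for the check, again for the call. */
int transmit_racy(struct softc *sc, int pkt, window_fn window)
{
    if (sc->proto == PROTO_NONE)            /* time of check */
        return ERR_NO_PROTO;
    if (window != NULL)
        window(sc);                         /* widened race window */
    start_fn fn = proto_table[sc->proto];   /* time of use: second read */
    if (fn == NULL)
        return ERR_NULL_HANDLER;            /* the kernel would call NULL here */
    return fn(pkt);
}

/* One possible mitigation in this model: read the protocol exactly once. */
int transmit_snapshot(struct softc *sc, int pkt, window_fn window)
{
    enum proto p = sc->proto;               /* single read */
    if (p == PROTO_NONE)
        return ERR_NO_PROTO;
    if (window != NULL)
        window(sc);                         /* a later switch cannot affect p */
    start_fn fn = proto_table[p];
    assert(fn != NULL);                     /* p != NONE implies a handler */
    return fn(pkt);
}
```

With switch_to_none passed as the window, transmit_racy ends up with a NULL
handler while transmit_snapshot still dispatches to the handler it checked. In
the kernel, a single read alone may not be sufficient, since per-protocol state
can also be torn down during a switch; the snapshot only shows why the double
read is the root of this particular fault.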

Panic Log:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xfffffe0068ff2948
frame pointer           = 0x28:0xfffffe0068ff2970
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, IOPL = 0
current process         = 873 (poc)
rdi: fffff8000360a200 rsi: fffff800043c8700 rdx: 0000000000000000
rcx: 000000000000005a  r8: fffffe000937c060  r9: fffff800043c8760
rax: 0000000000000000 rbx: fffff80003663000 rbp: fffffe0068ff2970
r10: 00000000000000a0 r11: fffff800046f2740 r12: 000000000000000e
r13: 0000000000000008 r14: fffffe0068ff2ac0 r15: fffff80003663000
trap number             = 12
panic: page fault
cpuid = 1
time = 1765682403
KDB: stack backtrace:
#0 0xffffffff80ba8f1d at kdb_backtrace+0x5d
#1 0xffffffff80b5aa11 at vpanic+0x161
#2 0xffffffff80b5a8a3 at panic+0x43
#3 0xffffffff8104dbfa at trap_pfault+0x3da
#4 0xffffffff81023dd8 at calltrap+0x8
#5 0xffffffff80c85a50 at ether_output+0x6b0
#6 0xffffffff80d21998 at ip_output+0x13a8
#7 0xffffffff80d52c40 at udp_send+0xb60
#8 0xffffffff80c0145c at sosend_dgram+0x31c
#9 0xffffffff80c0242f at sousrsend+0x5f
#10 0xffffffff80c0aec0 at kern_sendit+0x1c0
#11 0xffffffff80c0b1f2 at sendit+0x1b2
#12 0xffffffff80c0b02d at sys_sendto+0x4d
#13 0xffffffff8104e547 at amd64_syscall+0x117
#14 0xffffffff810246eb at fast_syscall_common+0xf8
Uptime: 1m13s
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.


I verified that the same vulnerable logic is still present in 15.0. I plan to
attempt a reproduction without the artificial delay, relying on concurrency
alone, to further demonstrate the impact. Even so, I believe the current result
with the widened window demonstrates both that the bug exists and how it is
triggered.

I suggest prioritizing a fix for this race condition.

-- 
You are receiving this mail because:
You are the assignee for the bug.
