Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 12 Oct 2019 11:46:46 -0700
From:      "Kristof Provost" <kp@FreeBSD.org>
To:        "Gleb Smirnoff" <glebius@FreeBSD.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r353292 - in head/sys: contrib/ipfilter/netinet dev/firewire dev/iicbus dev/usb/net kern net netgraph netinet netinet6 netipsec netpfil/ipfw netpfil/pf ofed/drivers/infiniband/ulp/ipoib
Message-ID:  <17A70566-1922-44BE-BFDB-ED713B6E54F6@FreeBSD.org>
In-Reply-To: <8EC71856-75DA-4231-AB73-22AB6621E9F5@FreeBSD.org>
References:  <201910072240.x97Me60x065650@repo.freebsd.org> <8EC71856-75DA-4231-AB73-22AB6621E9F5@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 11 Oct 2019, at 16:29, Kristof Provost wrote:
> On 7 Oct 2019, at 15:40, Gleb Smirnoff wrote:
>> Author: glebius
>> Date: Mon Oct  7 22:40:05 2019
>> New Revision: 353292
>> URL: https://svnweb.freebsd.org/changeset/base/353292
>>
>> Log:
>>   Widen NET_EPOCH coverage.
>>
>>   When epoch(9) was introduced to network stack, it was basically
>>   dropped in place of existing locking, which was mutexes and
>>   rwlocks. For the sake of performance mutex covered areas were
>>   as small as possible, so became epoch covered areas.
>>
>>   However, epoch doesn't introduce any contention, it just delays
>>   memory reclaim. So, there is no point to minimise epoch covered
>>   areas in sense of performance. Meanwhile entering/exiting epoch
>>   also has non-zero CPU usage, so doing this less often is a win.
>>
>>   Not the least is also code maintainability. In the new paradigm
>>   we can assume that at any stage of processing a packet, we are
>>   inside network epoch. This makes coding both input and output
>>   path way easier.
>>
>>   On output path we already enter epoch quite early - in the
>>   ip_output(), in the ip6_output().
>>
>>   This patch does the same for the input path. All ISR processing,
>>   network related callouts, other ways of packet injection to the
>>   network stack shall be performed in net_epoch. Any leaf function
>>   that walks network configuration now asserts epoch.
>>
>>   Tricky part is configuration code paths - ioctls, sysctls. They
>>   also call into leaf functions, so some need to be changed.
>>
>>   This patch would introduce more epoch recursions (see EPOCH_TRACE)
>>   than we had before. They will be cleaned up separately, as several
>>   of them aren't trivial. Note, that unlike a lock recursion the
>>   epoch recursion is safe and just wastes a bit of resources.
>>
>>   Reviewed by:	gallatin, hselasky, cy, adrian, kristof
>>   Differential Revision:	https://reviews.freebsd.org/D19111
>>
> This appears to have broken vlan.
>
> To reproduce do:
>
> sudo ifconfig epair create
> sudo ifconfig vlan create vlandev epair0a vlan 42
>
> I see the following panic:
>
> 	panic: Assertion in_epoch(net_epoch_preempt) failed at 
> /usr/src/sys/net/if_vlan.c:1353
> 	cpuid = 5
> 	time = 1570828155
> 	KDB: stack backtrace:
> 	db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
> 0xfffffe0060d894c0
> 	vpanic() at vpanic+0x17e/frame 0xfffffe0060d89520
> 	panic() at panic+0x43/frame 0xfffffe0060d89580
> 	vlan_config() at vlan_config+0x599/frame 0xfffffe0060d895c0
> 	vlan_clone_create() at vlan_clone_create+0x29b/frame 
> 0xfffffe0060d89630
> 	if_clone_createif() at if_clone_createif+0x4a/frame 
> 0xfffffe0060d89680
> 	ifioctl() at ifioctl+0x6f4/frame 0xfffffe0060d89750
> 	kern_ioctl() at kern_ioctl+0x295/frame 0xfffffe0060d897b0
> 	sys_ioctl() at sys_ioctl+0x15c/frame 0xfffffe0060d89880
> 	amd64_syscall() at amd64_syscall+0x2b5/frame 0xfffffe0060d899b0
> 	fast_syscall_common() at fast_syscall_common+0x101/frame 
> 0xfffffe0060d899b0
> 	--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80047dc2a, rsp = 
> 0x7fffffffe1a8, rbp = 0x7fffffffe1b0 ---
>
> I guess that we need to enter net_epoch in vlan_clone_create(). Or 
> perhaps that’s something that’s generically needed and it should 
> be done on if_clone_createif().
>
Alternatively, this test case seems to trigger a number of different 
issues (I’ve tried entering the epoch and removing the assertions and 
keep running into different issues):

	diff --git a/tests/sys/net/Makefile b/tests/sys/net/Makefile
	index 03e0631f625..bf8fb174e47 100644
	--- a/tests/sys/net/Makefile
	+++ b/tests/sys/net/Makefile
	@@ -8,6 +8,7 @@ BINDIR=         ${TESTSDIR}
	 ATF_TESTS_SH+= if_lagg_test
	 ATF_TESTS_SH+= if_clone_test
	 ATF_TESTS_SH+= if_tun_test
	+ATF_TESTS_SH+= if_vlan

	 # The tests are written to be run in parallel, but doing so leads to 
random
	 # panics.  I think it's because the kernel's list of interfaces isn't 
properly
	diff --git a/tests/sys/net/if_vlan.sh b/tests/sys/net/if_vlan.sh
	new file mode 100644
	index 00000000000..020ec59830f
	--- /dev/null
	+++ b/tests/sys/net/if_vlan.sh
	@@ -0,0 +1,38 @@
	+# $FreeBSD$
	+
	+. $(atf_get_srcdir)/../common/vnet.subr
	+
	+atf_test_case "basic" "cleanup"
	+basic_head()
	+{
	+       atf_set descr 'Basic VLAN test'
	+       atf_set require.user root
	+}
	+
	+basic_body()
	+{
	+       vnet_init
	+
	+       epair_vlan=$(vnet_mkepair)
	+
	+       vnet_mkjail alcatraz ${epair_vlan}a
	+       vnet_mkjail singsing ${epair_vlan}b
	+
	+       vlan0=$(jexec alcatraz ifconfig vlan create vlandev 
${epair_vlan}a vlan 42)
	+       jexec alcatraz ifconfig ${vlan0} 10.0.0.1/24 up
	+
	+       vlan1=$(jexec singsing ifconfig vlan create vlandev 
${epair_vlan}b vlan 42)
	+       jexec singsing ifconfig ${vlan1} 10.0.0.2/24 up
	+
	+       atf_check -s exit:0 jexec singsing ping -c 1 10.0.0.1
	+}
	+
	+basic_cleanup()
	+{
	+       vnet_cleanup
	+}
	+
	+atf_init_test_cases()
	+{
	+       atf_add_test_case "basic"
	+}

Best regards,
Kristof




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?17A70566-1922-44BE-BFDB-ED713B6E54F6>