From: Robert Watson
Date: Wed, 28 Jun 2006 10:21:52 +0100 (BST)
To: Fabian Keil
Cc: Peter Thoenen, freebsd-stable@freebsd.org
Subject: Re: FreeBSD 6.1 Tor issues (Once More, with Feeling)
Message-ID: <20060628101729.J50845@fledge.watson.org>
In-Reply-To: <20060627175853.765a590e@localhost>

On Tue, 27 Jun 2006, Fabian Keil wrote:

> There was a "request" for Tor related problem reports a while ago, I
> couldn't find the message again, but I believe it was posted here.

I'm very interested in tracking down this problem, but have had a lot of
trouble getting reliable reports of problems -- i.e., ones where I could get
any debugging information. I had a similar conversation along these lines
yesterday with Roger (Tor author) here at the WEIS conference.

If this is easily reproducible, I would like you to do the following:

- Compile in options DDB, options KDB, options BREAK_TO_DEBUGGER, options
  WITNESS, options WITNESS_SKIPSPIN, options INVARIANTS, options
  INVARIANT_SUPPORT.

- Make sure to have a kernel with debugging symbols.

- Turn on core dumps.

The above debugging options will have a significant performance impact, and
may or may not affect the probability of the race or deadlock being exercised.
The first questions are:

- Are there any warnings on the console from WITNESS or other debugging
  options? If so, please copy/paste them into an e-mail for me.

- Does a panic occur? If so, the output of the following commands would be
  very useful:

    show pcpu
    show allpcpu
    ps
    show locks
    show alllocks
    show lockedvnods
    trace

  Then walk the list of all processes listed in 'show alllocks', and run
  trace on each pid.

- Does the hang occur? If so, use a serial break to get into DDB, and run
  the commands listed above.

In both of the last two cases, attempt to get a core dump.

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> Last week I installed:
> FreeBSD tor.fabiankeil.de 6.1-RELEASE-p2 FreeBSD
> 6.1-RELEASE-p2 #0: Fri Jun 23 20:06:57 CEST 2006
> fk@fabiankeil.de:/usr/obj/usr/src/sys/BIGSLEEP i386.
>
> At the moment it is only acting as Tor node
>
> tor-devel (maintainer CC'd) is running jailed in a Geli image,
> ntpd, named, cron and sshd are running in the host system
> and that's about it. No mail or web server and nearly no traffic
> besides the one caused by Tor.
>
> I started Tor Friday night and had to reset the box three times
> since then. The server just suddenly stops responding, the logs
> stop as well, therefore I assume it either panics or hangs.
>
> I only have remote access, a serial console is available,
> but it becomes unresponsive as well. I didn't configure DDB yet,
> so maybe that is to be expected?
>
> cron creates some stats every five minutes, a few minutes
> before a hang this morning the load was:
>
> last pid:  7996;  load averages:  0.40,  0.37,  0.36   up 0+18:38:25  05:55:02
> 83 processes:  2 running, 66 sleeping, 15 waiting
> CPU states: 21.3% user,  0.0% nice, 17.8% system, 20.2% interrupt, 40.7% idle
> Mem: 100M Active, 157M Inact, 102M Wired, 12K Cache, 60M Buf, 134M Free
> Swap: 1024M Total, 1024M Free
>
>   PID USERNAME THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
>    11 root       1 171   52     0K     8K RUN    857:30 53.61% idle
>    12 root       1 -44 -163     0K     8K WAIT    45:22  6.54% swi1: net
>    23 root       1 -68 -187     0K     8K WAIT    14:48  2.83% irq12: fxp0 fxp1
>  7973 root       1  96    0  2264K  1544K RUN      0:00  0.51% top
>    13 root       1 -32 -151     0K     8K WAIT     5:49  0.10% swi4: clock sio
>    33 root       1 171   52     0K     8K pgzero   0:02  0.10% pagezero
>     3 root       1  -8    0     0K     8K -        0:16  0.05% g_up
>  1586 _tor      14  20    0    99M 97912K kserel 188:36  0.00% tor
>    15 root       1 -16    0     0K     8K -        1:01  0.00% yarrow
>  1443 root       1  -8    0     0K     8K geli:w   0:49  0.00% g_eli[0] md0
>     4 root       1  -8    0     0K     8K -        0:21  0.00% g_down
>    35 root       1  20    0     0K     8K syncer   0:17  0.00% syncer
>  1439 root       1  -8    0     0K     8K mdwait   0:13  0.00% md0
>    24 root       1 -64 -183     0K     8K WAIT     0:08  0.00% irq14: ata0
>     2 root       1  -8    0     0K     8K -        0:07  0.00% g_event
>    42 root       1 -16    0     0K     8K -        0:06  0.00% schedcpu
>   453 root       1  96    0  2920K  1752K select   0:05  0.00% ntpd
>   256 _pflogd    1 -58    0  1548K  1216K bpf      0:05  0.00% pflog
>
> pfctl -si:
> Status: Enabled for 0 days 18:37:52           Debug: Urgent
>
> Hostid: 0x1ec3da6b
>
> Interface Stats for fxp0            IPv4          IPv6
>   Bytes In                   25077859159             0
>   Bytes Out                  27498863362             0
>   Packets In
>     Passed                      36192760             0
>     Blocked                        32213             0
>   Packets Out
>     Passed                      36871432             0
>     Blocked                          265             0
>
> State Table                        Total          Rate
>   current entries                   5290
>   searches                      73567507      1096.8/s
>   inserts                         600068         8.9/s
>   removals                        594778         8.9/s
> Counters
>   match                           752600        11.2/s
>   bad-offset                           0         0.0/s
>   fragment                           102         0.0/s
>   short                                0         0.0/s
>   normalize                            2         0.0/s
>   memory                              68         0.0/s
>   bad-timestamp                        0         0.0/s
>   congestion                           0         0.0/s
>   ip-option                            0         0.0/s
>   proto-cksum                          0         0.0/s
>   state-mismatch                   12655         0.2/s
>   state-insert                         0         0.0/s
>   state-limit                          0         0.0/s
>   src-limit                            2         0.0/s
>   synproxy
>
> Today's traffic graph:
>
> (The hang around 14:00 happened while I was logged in doing a buildworld)
>
> At the moment I'm building RELENG_6 with DDB to see if it changes anything
> and if I can get a core dump, but so far the problem seems to be
> similar to: http://www.freebsd.org/cgi/query-pr.cgi?pr=95180 (closed)
> and .
>
> Is anyone on this list running a Tor node on FreeBSD 6.1-RELEASE
> or later with similar or higher load?
>
> Fabian
> --
> http://www.fabiankeil.de/
>
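
For reference, the debugging setup requested in the reply might look roughly
like the fragment below. This is only a sketch: the makeoptions line (one
common way to get a kernel with debugging symbols) and the rc.conf settings
are assumptions that the message does not spell out, and the config path is
inferred from the BIGSLEEP kernel name in the quoted uname output.

```
# Kernel configuration additions
# (assumed location: /usr/src/sys/i386/conf/BIGSLEEP)
makeoptions     DEBUG=-g            # assumption: build with debugging symbols
options         KDB
options         DDB
options         BREAK_TO_DEBUGGER
options         WITNESS
options         WITNESS_SKIPSPIN
options         INVARIANTS
options         INVARIANT_SUPPORT

# /etc/rc.conf
# assumption: "AUTO" picks the swap device; if the release in use does not
# accept it, name the swap device explicitly instead
dumpdev="AUTO"
dumpdir="/var/crash"
```

After rebuilding and rebooting, WITNESS warnings should appear on the console,
and a serial break should drop into DDB, where the commands listed in the
reply can be run.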