From: "Robert N. M. Watson"
To: John Baldwin
Cc: Ian FREISLICH, Glen Barber, freebsd-current@freebsd.org
Subject: Re: panic: in_pcblookup_local (?)
Date: Wed, 1 May 2013 18:45:53 +0100
In-Reply-To: <201305011156.03974.jhb@freebsd.org>
References: <201304301653.13845.jhb@freebsd.org> <20130430211908.GB1621@glenbarber.us> <201305011156.03974.jhb@freebsd.org>

On 1 May 2013, at 16:56, John Baldwin wrote:

> It looks like the ipi_hash_lock is locked (and udp_connect() locks it), so I
> think the offending code is somewhere else. Also, I can't find anything that
> removes an inp without holding the correct pcbinfo lock. Only thing I can think
> of is if the pcbinfo pointer for an inp could change, so we could maybe
> lock the wrong one while removing it?
>
> Hmmmmmm, you know.
> In in_pcbremlists() and in_pcbdrop(), we read inp_phd
> without holding the hash lock. I think that probably doesn't actually break
> anything, but this feels like a locking issue of some sort.

I'll need to catch up on this thread later, but a few questions:

Do we know if the application in question is multithreaded, and if so, might it be attempting concurrent operations on this socket?

The corrupted pointer is worrying ... but interesting, and suggests something else is going on here -- stack corruption earlier in the system call, perhaps?

In general, to modify our various hash lists you must lock both the inpcb and the list. It's therefore sufficient to hold either lock to read, so reading inp_phd should be OK with the inpcb lock held, even without the hash lock held.

Do we have a dump of *inp, and if so, can we confirm that the inpcb is still properly referenced? If there is an associated socket, could we likewise get a dump of *inp->inp_socket to check that things are properly referenced there?

Robert
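
For anyone following along, here is a rough userland sketch of the rule stated above: a field that is only ever modified with both the per-object lock and the list lock held can be read safely with just one of them. Everything in it is hypothetical. struct pcb, struct pcb_list, the owner field (standing in for inp_phd) and all the function names are invented for illustration, pthread mutexes stand in for the kernel locks, and the lock order (object first, then list) is simply a choice made for the sketch, not a statement about in_pcb.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the FreeBSD structures. */
struct pcb;

struct pcb_list {
    pthread_mutex_t list_lock;  /* plays the role of the hash lock */
    struct pcb *head;
};

struct pcb {
    pthread_mutex_t pcb_lock;   /* plays the role of the inpcb lock */
    struct pcb_list *owner;     /* like inp_phd: written only with BOTH locks held */
    struct pcb *next;           /* linkage: neighbours rewrite it, so list_lock only */
};

static void pcb_list_init(struct pcb_list *l)
{
    pthread_mutex_init(&l->list_lock, NULL);
    l->head = NULL;
}

static void pcb_init(struct pcb *p)
{
    pthread_mutex_init(&p->pcb_lock, NULL);
    p->owner = NULL;
    p->next = NULL;
}

/* Writer: takes both locks (sketch's order: object, then list). */
static void pcb_insert(struct pcb_list *l, struct pcb *p)
{
    pthread_mutex_lock(&p->pcb_lock);
    pthread_mutex_lock(&l->list_lock);
    assert(p->owner == NULL);   /* must not already be on a list */
    p->next = l->head;
    l->head = p;
    p->owner = l;
    pthread_mutex_unlock(&l->list_lock);
    pthread_mutex_unlock(&p->pcb_lock);
}

/* Writer: takes both locks before unlinking and clearing owner. */
static void pcb_remove(struct pcb_list *l, struct pcb *p)
{
    struct pcb **pp;

    pthread_mutex_lock(&p->pcb_lock);
    pthread_mutex_lock(&l->list_lock);
    if (p->owner == l) {
        for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
            if (*pp == p) {
                *pp = p->next;
                break;
            }
        }
        p->next = NULL;
        p->owner = NULL;
    }
    pthread_mutex_unlock(&l->list_lock);
    pthread_mutex_unlock(&p->pcb_lock);
}

/* Reader: the object lock alone is enough to read owner. */
static struct pcb_list *pcb_owner(struct pcb *p)
{
    struct pcb_list *l;

    pthread_mutex_lock(&p->pcb_lock);
    l = p->owner;
    pthread_mutex_unlock(&p->pcb_lock);
    return l;
}

/* Reader: the list lock alone is enough to walk the linkage. */
static int pcb_list_contains(struct pcb_list *l, struct pcb *p)
{
    struct pcb *it;
    int found = 0;

    pthread_mutex_lock(&l->list_lock);
    for (it = l->head; it != NULL; it = it->next)
        if (it == p)
            found = 1;
    pthread_mutex_unlock(&l->list_lock);
    return found;
}

/* Single-threaded walk-through; returns 0 if every check holds. */
static int run_demo(void)
{
    struct pcb_list l;
    struct pcb a, b;

    pcb_list_init(&l);
    pcb_init(&a);
    pcb_init(&b);
    pcb_insert(&l, &a);
    pcb_insert(&l, &b);
    if (pcb_owner(&a) != &l) return 1;
    if (!pcb_list_contains(&l, &b)) return 2;
    pcb_remove(&l, &a);
    if (pcb_owner(&a) != NULL) return 3;
    if (pcb_list_contains(&l, &a)) return 4;
    if (!pcb_list_contains(&l, &b)) return 5;
    return 0;
}
```

Calling run_demo() from a main() only exercises the paths; it does not demonstrate freedom from races. The point is in the comments: a writer cannot change owner without holding the same lock that any reader holds, whichever of the two the reader picked. The next pointer is different, because removing a neighbour rewrites it without taking this object's lock, so it is covered by the list lock only.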