From owner-freebsd-current@freebsd.org Mon Apr 17 15:29:46 2017
Message-ID: <1492442976.96207.12.camel@freebsd.org>
Subject: Re: r316958: booting a server takes >10 minutes!
From: Ian Lepore
To: Maxim Sobolev, Ben Woods, Peter Wemm
Cc: Larry Rosenman, Kurt Jaeger, FreeBSD CURRENT
Date: Mon, 17 Apr 2017 09:29:36 -0600
References: <20170415135314.6e628657@thor.intern.walstatt.dynvpn.de>
 <2321.1492272025@critter.freebsd.dk>
 <20170415160916.GY1326@albert.catwhisker.org>
 <4292.1492274488@critter.freebsd.dk>
 <20170415184331.GB1326@albert.catwhisker.org>
 <3E0D0513-0337-40E1-8173-11D845C0EFF4@lerctr.org>
 <20170415191329.GA74780@home.opsec.eu>
List-Id: Discussions about the use of FreeBSD-current

On Sun, 2017-04-16 at 21:53 -0700, Maxim Sobolev wrote:
> Well, all this suggests to me that there must be some issue with the client
> syslog code in the libc, so that if syslog daemon hangs or has some
> internal issue that would basically render system mostly unusable. I think
> that might be an interesting project for somebody who has some spare time
> on hands to take syslogd as of (r317033 - 1) and see what can be done to
> improve resilience of the system against such a failure mode.
>
> -Max
>

On the sending side, the libc code tries very hard to deliver messages
to the unprivileged /var/run/log socket; if the datagram send fails due
to buffer space (i.e., due to syslogd not keeping up on the read side),
it loops endlessly, sleeping for 1us and then trying again until it
succeeds.

On the other hand, for /var/run/logpriv apparently the theory is that
hanging a process with enough privs to use that connection would be bad.
So it retries just once for errors that are not related to buffer space,
and doesn't retry at all if the error was buffer space (which is a case
of the code not quite matching the nearby comments), then gives up on
syslogd and writes the message directly to the console before returning.

So yeah, there may be some room for improvement in that logic. :)  I
think it could eventually give up in the non-priv case and maybe try an
extra time or two in the privileged case.

When we ran into this at $work years ago we just wrote our own
work-alike function to use instead of syslog(3); it retries any kind of
failure no more than 3 times, with a millisecond sleep between each
try.  (Losing logging is bad, but losing the functionality of our app
that's trying to do the logging is even worse.)

-- Ian

> On Sun, Apr 16, 2017 at 5:50 PM, Ben Woods wrote:
>
> > On 16 April 2017 at 03:24, Larry Rosenman wrote:
> >
> > > Current SVN seems to have fixed it (via sobomax@ syslogd commit).
> >
> > I experienced this issue too, and can confirm that it existing on
> > r316952, but is resolve on r317033.
> >
> > It was extremely strange. The symptoms I was experiencing were:
> > - lightdm display manager would fail to start
> > - slim display manager would start, but then fail to login to xfce
> > - "service hald restart" and "service dbus restart" would fail
> > - "pkg upgrade hal" would fail
> >
> > Regards,
> > Ben
> >
> > --
> > From: Benjamin Woods
> > woodsb02@gmail.com
> > _______________________________________________
> > freebsd-current@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
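
For illustration only, the bounded-retry idea from the reply above (at most
three attempts, with a millisecond sleep between them) could be sketched
roughly as follows. This is not the libc code and not the work-alike function
mentioned in the message; it is a minimal Python sketch, and the name
send_with_bounded_retry, its parameters, and the choice to treat every OSError
the same way are assumptions made for the example.

```python
import socket
import time


def send_with_bounded_retry(sock, payload, max_tries=3, delay_s=0.001):
    """Try to send one datagram, giving up after max_tries attempts.

    Hypothetical helper: any OSError (full buffer, dead peer, bad
    descriptor) counts as a failed attempt. Returns True if the send
    succeeded and False if every attempt failed, so the caller never
    blocks indefinitely on a stalled log daemon.
    """
    for attempt in range(max_tries):
        try:
            # MSG_DONTWAIT makes a full socket buffer raise an error
            # instead of blocking the calling process.
            sock.send(payload, socket.MSG_DONTWAIT)
            return True
        except OSError:
            # Sleep only if another attempt is still to come.
            if attempt + 1 < max_tries:
                time.sleep(delay_s)
    return False
```

A caller would connect a datagram socket to the log socket once, call the
helper for each message, and fall back to something cheap (for example
writing to stderr) when it returns False, accepting a lost log line rather
than a hung application.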