From owner-freebsd-current@freebsd.org Thu Nov 15 22:10:26 2018 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 02A80110BD38 for ; Thu, 15 Nov 2018 22:10:26 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-it1-x143.google.com (mail-it1-x143.google.com [IPv6:2607:f8b0:4864:20::143]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 40C598248B for ; Thu, 15 Nov 2018 22:10:25 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-it1-x143.google.com with SMTP id x124so4060027itd.1 for ; Thu, 15 Nov 2018 14:10:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=66EdJhhtIjRZICL7Mf2/Fr3RSyAD5cl+MnsyVPJmOms=; b=JegnkfqzB4dvHzFkO8aeviCbPm33jfl1guWlWq6UQp82YzWbAkz4V/YThBiYShmhX9 gASz3NNTlMV9vZxfMqRkBRsK2x+np7v8XRIWqAh+pBeDqsjQREtDTGHIBDFUfBwO1/v0 nQD2PgQ57YvzstNd0nPbB7sc2qt0wi4vbg0zQnxlFex+hfNDGxvfRn/1jpYhOUHGpHb3 O5DbOI14sPWSsioeEgw74mDrLvgp6ltUky242phndpqstSP03sCPjHbtum4oFz636pl7 t7BcNtayyU2e+yyfQH2dOZ3oVNucmsp+v//arqyDJXFQ46BWvWEN9e/LAJM3hawGkEhP GSKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to:user-agent; bh=66EdJhhtIjRZICL7Mf2/Fr3RSyAD5cl+MnsyVPJmOms=; b=N6B50TD16f4D59EXZukdhX9aP76Cdy1BEZKk4F1mYvaBz9uH1q/q/8Y0aO9eoC+Cb4 SSQOZL+LtRHj33DnUHxrDvhQGdwSAzjKQeCISbuIhArDJvWKs//bE0Uw8y80bX3rJiOk Sh2yk7gdVrcfRULbDfQa8rM/9TGTRpTqwV5Dmdi3Xz6X8mkVsYB6F3A7KVPk+xf/nHcI MjNskRngIWcNJqIZQfINbATpCX+u62q+BNdBYJvtSiJfkAwmLW1ddIa71lm3tx74wa8i mL4V9eIYaNO+r8kQ9SN7hpHlI5WGUZodu5PDN2zDMPJLNESWhHCm9pkyX7QBDBubrSCm 1tmA== X-Gm-Message-State: AGRZ1gLHeBajT46P3xa3DCu5PajPzm4E/8mnxat+p8hnqGkkdh1D+RDe KUhJUj4UusDfQLHL2LE5PsN5F2td X-Google-Smtp-Source: AJdET5dLVyfqH3rN8iybcrpExEfkv9GYkhVaV8sECnl0IopiZ1Gmr9MpmpU+xUfi+Co5Vy60BNohWA== X-Received: by 2002:a05:660c:984:: with SMTP id z4mr7594715itj.77.1542319824511; Thu, 15 Nov 2018 14:10:24 -0800 (PST) Received: from raichu (toroon0560w-lp130-07-64-229-95-98.dsl.bell.ca. [64.229.95.98]) by smtp.gmail.com with ESMTPSA id z11-v6sm11813153itz.26.2018.11.15.14.10.23 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 15 Nov 2018 14:10:23 -0800 (PST) Sender: Mark Johnston Date: Thu, 15 Nov 2018 17:10:19 -0500 From: Mark Johnston To: Sylvain GALLIANO Cc: freebsd-current@freebsd.org Subject: Re: Panic on kern_event.c Message-ID: <20181115221019.GA2514@raichu> References: <20181107043503.GB30861@raichu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) X-Rspamd-Queue-Id: 40C598248B X-Spamd-Result: default: False [-3.03 / 40.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2607:f8b0:4000::/36]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[gmail.com:+]; RCPT_COUNT_TWO(0.00)[2]; MX_GOOD(-0.01)[cached: alt3.gmail-smtp-in.l.google.com]; FORGED_SENDER(0.30)[markj@freebsd.org,markjdb@gmail.com]; RCVD_TLS_LAST(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; FROM_NEQ_ENVFROM(0.00)[markj@freebsd.org,markjdb@gmail.com]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-0.997,0]; R_DKIM_ALLOW(-0.20)[gmail.com]; FROM_HAS_DN(0.00)[]; NEURAL_HAM_SHORT(-0.75)[-0.750,0]; NEURAL_HAM_LONG(-1.00)[-0.999,0]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-current@freebsd.org]; DMARC_NA(0.00)[freebsd.org]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[3.4.1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.0.4.6.8.4.0.b.8.f.7.0.6.2.list.dnswl.org : 127.0.5.0]; IP_SCORE(-0.58)[ip: (1.69), ipnet: 2607:f8b0::/32(-2.67), asn: 15169(-1.81), country: US(-0.10)]; MID_RHS_NOT_FQDN(0.50)[] X-Rspamd-Server: mx1.freebsd.org X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Nov 2018 22:10:26 -0000 On Thu, Nov 08, 2018 at 05:05:03PM +0100, Sylvain GALLIANO wrote: > Hi, > > I replaced > << printf("XXX knote %p already in tailq status:%x kq_count:%d [%p %p] > %u\n",kn,kn->kn_status,kq->kq_count,kn->kn_tqe.tqe_next,kn->kn_tqe.tqe_prev,__LINE__); > by > >> panic("XXX knote %p already in tailq status:%x kq_count:%d [%p %p] > %u\n",kn,kn->kn_status,kq->kq_count,kn->kn_tqe.tqe_next,kn->kn_tqe.tqe_prev,__LINE__); > > Here is the stack during panic: > panic: XXX knote 0xfffff801e1c6ddc0 already in tailq status:1 kq_count:2 > [0 0xfffff8000957a978] 2671 > > cpuid = 0 > time = 1541688832 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame > 0xfffffe0412258fd0 > kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe04122590a0 > vpanic() at vpanic+0x277/frame 0xfffffe0412259170 > doadump() at doadump/frame 0xfffffe04122591d0 > knote_enqueue() at knote_enqueue+0xf2/frame 0xfffffe0412259210 > kqueue_register() at kqueue_register+0xaed/frame 0xfffffe0412259340 > kqueue_kevent() at kqueue_kevent+0x13c/frame 0xfffffe04122595b0 > kern_kevent_fp() at kern_kevent_fp+0x66/frame 0xfffffe0412259610 > kern_kevent() at kern_kevent+0x17f/frame 0xfffffe0412259700 > kern_kevent_generic() at kern_kevent_generic+0xfe/frame 0xfffffe0412259780 > sys_kevent() at sys_kevent+0xaa/frame 0xfffffe0412259810 > syscallenter() at syscallenter+0x4e3/frame 0xfffffe04122598f0 > amd64_syscall() at amd64_syscall+0x1b/frame 0xfffffe04122599b0 > fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe04122599b0 > --- syscall (560, FreeBSD ELF64, sys_kevent), rip = 0x406e3bfa, rsp = > 0x7fffdf7e9db8, rbp = 0x7fffdf7e9e00 --- > KDB: enter: panic > > > you can get kernel.debug + vmcore at: > https://drive.google.com/drive/folders/1MbqJQm12-KOYDbb4-9uNRTnAdsNqLaIP?usp=sharing Could you please give the following patch a try? If possible, could you also ktrace one of the active syslog-ng processes for some time, perhaps 15 seconds, and share the kdump? I have been trying to reproduce the problem without any luck. diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c index d9c670e29d60..9ef7c53361bf 100644 --- a/sys/kern/kern_event.c +++ b/sys/kern/kern_event.c @@ -224,6 +224,7 @@ SYSCTL_UINT(_kern, OID_AUTO, kq_calloutmax, CTLFLAG_RW, #define KQ_LOCK(kq) do { \ mtx_lock(&(kq)->kq_lock); \ } while (0) +#define KQ_LOCKPTR(kq) (&(kq)->kq_lock) #define KQ_FLUX_WAKEUP(kq) do { \ if (((kq)->kq_state & KQ_FLUXWAIT) == KQ_FLUXWAIT) { \ (kq)->kq_state &= ~KQ_FLUXWAIT; \ @@ -687,8 +688,11 @@ filt_timerexpire(void *knx) struct kq_timer_cb_data *kc; kn = knx; + KQ_OWNED(kn->kn_kq); kn->kn_data++; - KNOTE_ACTIVATE(kn, 0); /* XXX - handle locking */ + + if (!kn_in_flux(kn) || (kn->kn_status & KN_SCAN) != 0) + KNOTE_ACTIVATE(kn, 1); if ((kn->kn_flags & EV_ONESHOT) != 0) return; @@ -753,7 +757,7 @@ filt_timerattach(struct knote *kn) kn->kn_flags |= EV_CLEAR; /* automatically set */ kn->kn_status &= ~KN_DETACHED; /* knlist_add clears it */ kn->kn_ptr.p_v = kc = malloc(sizeof(*kc), M_KQUEUE, M_WAITOK); - callout_init(&kc->c, 1); + callout_init_mtx(&kc->c, KQ_LOCKPTR(kn->kn_kq), 0); filt_timerstart(kn, to); return (0); @@ -772,8 +776,10 @@ filt_timerstart(struct knote *kn, sbintime_t to) kc->next = to + sbinuptime(); kc->to = to; } + KQ_LOCK(kn->kn_kq); callout_reset_sbt_on(&kc->c, kc->next, 0, filt_timerexpire, kn, PCPU_GET(cpuid), C_ABSOLUTE); + KQ_UNLOCK(kn->kn_kq); } static void @@ -826,7 +832,6 @@ filt_timertouch(struct knote *kn, struct kevent *kev, u_long type) KQ_LOCK(kq); if (kn->kn_status & KN_QUEUED) knote_dequeue(kn); - kn->kn_status &= ~KN_ACTIVE; kn->kn_data = 0; KQ_UNLOCK(kq); @@ -1843,6 +1848,7 @@ kqueue_scan(struct kqueue *kq, int maxevents, struct kevent_copyops *k_ops, } TAILQ_INSERT_TAIL(&kq->kq_head, marker, kn_tqe); + marker->kn_status |= KN_QUEUED; influx = 0; while (count) { KQ_OWNED(kq); @@ -1861,8 +1867,10 @@ kqueue_scan(struct kqueue *kq, int maxevents, struct kevent_copyops *k_ops, } TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); + KASSERT((kn->kn_status & KN_QUEUED) != 0, + ("knote %p not queued", kn)); + kn->kn_status &= ~KN_QUEUED; if ((kn->kn_status & KN_DISABLED) == KN_DISABLED) { - kn->kn_status &= ~KN_QUEUED; kq->kq_count--; continue; } @@ -1876,7 +1884,6 @@ kqueue_scan(struct kqueue *kq, int maxevents, struct kevent_copyops *k_ops, ("knote %p is unexpectedly in flux", kn)); if ((kn->kn_flags & EV_DROP) == EV_DROP) { - kn->kn_status &= ~KN_QUEUED; kn_enter_flux(kn); kq->kq_count--; KQ_UNLOCK(kq); @@ -1888,7 +1895,6 @@ kqueue_scan(struct kqueue *kq, int maxevents, struct kevent_copyops *k_ops, KQ_LOCK(kq); continue; } else if ((kn->kn_flags & EV_ONESHOT) == EV_ONESHOT) { - kn->kn_status &= ~KN_QUEUED; kn_enter_flux(kn); kq->kq_count--; KQ_UNLOCK(kq); @@ -1910,8 +1916,8 @@ kqueue_scan(struct kqueue *kq, int maxevents, struct kevent_copyops *k_ops, if (kn->kn_fop->f_event(kn, 0) == 0) { KQ_LOCK(kq); KQ_GLOBAL_UNLOCK(&kq_global, haskqglobal); - kn->kn_status &= ~(KN_QUEUED | KN_ACTIVE | - KN_SCAN); + /* XXXMJ races with EVFILT_TIMER activation. */ + kn->kn_status &= ~(KN_ACTIVE | KN_SCAN); kn_leave_flux(kn); kq->kq_count--; kn_list_unlock(knl); @@ -1937,11 +1943,13 @@ kqueue_scan(struct kqueue *kq, int maxevents, struct kevent_copyops *k_ops, } if (kn->kn_flags & EV_DISPATCH) kn->kn_status |= KN_DISABLED; - kn->kn_status &= ~(KN_QUEUED | KN_ACTIVE); + kn->kn_status &= ~KN_ACTIVE; kq->kq_count--; - } else + } else if ((kn->kn_status & KN_QUEUED) == 0) { + kn->kn_status |= KN_QUEUED; TAILQ_INSERT_TAIL(&kq->kq_head, kn, kn_tqe); - + } + kn->kn_status &= ~KN_SCAN; kn_leave_flux(kn); kn_list_unlock(knl);