From: Jurij Kovačič
Date: Sat, 5 Jan 2019 11:01:28 +0100
Subject: Re: Kernel panic on 11.2-RELEASE-p7
To: freebsd-stable@freebsd.org

Dear list,

About a week ago, we had a kernel panic on FreeBSD 11.2-RELEASE-p7 with the GENERIC kernel and a ZFS root. As the kernel was not compiled with debug support enabled, the resulting "vmcore" files were of little use.
Consequently, I recompiled the kernel with debug support:

--- GENERIC     2018-12-29 08:03:04.786846000 +0100
+++ DEBUG       2018-12-29 08:23:36.522966000 +0100
@@ -19,11 +19,16 @@
 # $FreeBSD: releng/11.2/sys/amd64/conf/GENERIC 333417 2018-05-09 16:14:12Z sbruno $
 
 cpu            HAMMER
-ident          GENERIC
+ident          DEBUG
 
 makeoptions    DEBUG=-g                # Build kernel with gdb(1) debug symbols
 makeoptions    WITH_CTF=1              # Run ctfconvert(1) for DTrace support
 
+# kernel debugging
+options        KDB
+options        KDB_UNATTENDED
+options        KDB_TRACE
+
 options        SCHED_ULE               # ULE scheduler
 options        PREEMPTION              # Enable kernel thread preemption
 options        INET                    # InterNETworking

and installed it. After running for about a week, the server crashed again last night. Unfortunately, there are no "vmcore" files in "/var/crash" this time.

The server has 12GB of RAM installed:

# sysctl hw.physmem
hw.physmem: 12843053056

and uses 2 swap partitions (2G each):

# swapinfo -h
Device          1K-blocks     Used    Avail Capacity
/dev/ada0p2       2097152     642M     1.4G    31%
/dev/ada1p2       2097152     638M     1.4G    31%
Total             4194304     1.3G     2.7G    31%

The dump device is set in /etc/rc.conf:

# grep dump /etc/rc.conf
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"

There seems to be enough space left in "/var/crash":

# zfs list | grep crash
zroot/var/crash   857M  17.2G   857M  /var/crash

and, as I said earlier, the system DID create "vmcore" files when crashing with the GENERIC kernel.

Is it possible that the swap partition(s) are too small for the memory dump, now that the kernel is compiled with debug support? Or is some additional configuration needed to make the system save vmcore files?

Please advise.

Kind regards,
Jurij

On Tue, Dec 25, 2018 at 7:57 AM Jurij Kovačič wrote:

> Dear list,
>
> I hope I am posting this to the correct list - if not, I apologize (and
> please advise where to post this instead).
>
> Today I experienced a kernel panic on a (physical) server, running FreeBSD
> 11.2-RELEASE-p7 with GENERIC kernel, ZFS root:
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer     = 0x20:0xffffffff82299013
> stack pointer           = 0x28:0xfffffe0352893ad0
> frame pointer           = 0x28:0xfffffe0352893b10
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 9 (dbuf_evict_thread)
> trap number             = 9
> panic: general protection fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff80b3d577 at kdb_backtrace+0x67
> #1 0xffffffff80af6b17 at vpanic+0x177
> #2 0xffffffff80af6993 at panic+0x43
> #3 0xffffffff80f77fdf at trap_fatal+0x35f
> #4 0xffffffff80f7759e at trap+0x5e
> #5 0xffffffff80f5808c at calltrap+0x8
> #6 0xffffffff8229c049 at dbuf_evict_one+0xe9
> #7 0xffffffff82297a15 at dbuf_evict_thread+0x1a5
> #8 0xffffffff80aba093 at fork_exit+0x83
> #9 0xffffffff80f58fae at fork_trampoline+0xe
>
> I have used "crashinfo" utility to generate the text file which is
> available at this URL: http://www.ocpea.com/dump/core.txt
>
> At the time of the crash, the server was probably under more intensive I/O
> load (scheduled backup with rsync).
>
> This is a production server, so naturally, all advice is deeply
> appreciated. :)
>
> Kind regards,
> Jurij
>
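
For reference, the sizes quoted in the reply can be compared directly. The following is a rough sketch only: it uses the hw.physmem and swapinfo figures from the message, and the helper name full_dump_fits is made up for illustration. It checks only whether a full memory dump would fit on a single swap partition. FreeBSD on amd64 normally writes a minidump (kernel pages only), whose size on this machine is not known from the thread, so a "does not fit" result here does not by itself explain the missing vmcore files.

```python
# Figures copied from the sysctl and swapinfo output in the message.
PHYSMEM_BYTES = 12843053056            # hw.physmem
SWAP_KIB = {                           # "1K-blocks" column of swapinfo
    "/dev/ada0p2": 2097152,
    "/dev/ada1p2": 2097152,
}

GIB = 1024 ** 3


def full_dump_fits(physmem_bytes: int, device_kib: int) -> bool:
    """Hypothetical helper: True if a full (non-mini) dump of all
    physical memory would fit on one device of the given size."""
    return device_kib * 1024 >= physmem_bytes


physmem_gib = PHYSMEM_BYTES / GIB
print(f"physical memory: {physmem_gib:.2f} GiB")

# A crash dump goes to one dump device, so each partition is checked
# on its own rather than against the combined swap total.
for dev, kib in SWAP_KIB.items():
    dev_gib = kib * 1024 / GIB
    fits = full_dump_fits(PHYSMEM_BYTES, kib)
    print(f"{dev}: {dev_gib:.1f} GiB, full dump fits: {fits}")
```

The partitions are compared individually because the combined swap total is not what matters for a dump. Whether the minidump itself outgrew 2 GiB (for example because of kernel memory held by ZFS) would have to be checked on the machine itself.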