Date: Sun, 01 Nov 2020 10:04:40 +0000
From: bugzilla-noreply@freebsd.org
To: virtualization@FreeBSD.org
Subject: [Bug 250770] AWS EC2 system freezes up possibly associated with NFS (EFS)
Message-ID: <bug-250770-27103-2WLx5yuavY@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-250770-27103@https.bugs.freebsd.org/bugzilla/>
References: <bug-250770-27103@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250770

--- Comment #1 from Gunther Schadow <raj@gusw.net> ---
Note, this is not like the ENA kernel panic nor the other AWS EC2 freeze on t3
bug (apparently fixed in 12.1-RELENG) nor does the presence or absence of the
"intsmb0: Could not allocate I/O space" error during boot (open since 2018)
make any difference. So this is not a duplicate of those bugs.

sysctl -a |...
kern.features.nfsd: 1
kern.features.nfscl: 1
vfs.nfs.downdelayinitial: 12
vfs.nfs.downdelayinterval: 30
vfs.nfs.defect: 0
vfs.nfs.iodmax: 20
vfs.nfs.iodmin: 0
vfs.nfs.iodmaxidle: 120
vfs.nfs.use_buf_pager: 1
vfs.nfs.fileid_maxwarnings: 10
vfs.nfs.diskless_rootpath:
vfs.nfs.diskless_valid: 0
vfs.nfs.nfs_ip_paranoia: 1
vfs.nfs.nfs_directio_allow_mmap: 1
vfs.nfs.nfs_keep_dirty_on_error: 0
vfs.nfs.nfs_directio_enable: 0
vfs.nfs.clean_pages_on_close: 1
vfs.nfs.commit_on_close: 0
vfs.nfs.prime_access_cache: 0
vfs.nfs.access_cache_timeout: 60
vfs.nfs.dssameconn: 0
vfs.nfs.ignore_eexist: 0
vfs.nfs.pnfsiothreads: -1
vfs.nfs.userhashsize: 100
vfs.nfs.debuglevel: 0
vfs.nfs.callback_addr:
vfs.nfs.realign_count: 0
vfs.nfs.realign_test: 0
vfs.nfs.pnfsmirror: 1
vfs.nfs.enable_uidtostring: 0
vfs.nfs.dsretries: 2
vfs.nfs.skip_wcc_data_onerr: 1
vfs.nfs.nfs3_jukebox_delay: 10
vfs.nfs.reconnects: 0
vfs.nfs.bufpackets: 4

searching for ways to do some debug logging, should I enable this?

debug.fail_point.status_nfscl_force_fileid_warning: off
debug.fail_point.nfscl_force_fileid_warning: off

How might I log NFS warnings? Current setup with out of the box unchanged
/etc/syslog.conf

# $FreeBSD: releng/12.2/usr.sbin/syslogd/syslog.conf 338146 2018-08-21 17:01:47Z brd $
#
#       Spaces ARE valid field separators in this file. However,
#       other *nix-like systems still insist on using tabs as field
#       separators. If you are sharing this file between systems, you
#       may want to use only tabs as field separators here.
#       Consult the syslog.conf(5) manpage.
*.err;kern.warning;auth.notice;mail.crit                /dev/console
*.notice;authpriv.none;kern.debug;lpr.info;mail.crit;news.err   /var/log/messages
security.*                                      /var/log/security
auth.info;authpriv.info                         /var/log/auth.log
mail.info                                       /var/log/maillog
cron.*                                          /var/log/cron
!-devd
*.=debug                                        /var/log/debug.log
*.emerg                                         *
# uncomment this to log all writes to /dev/console to /var/log/console.log
# touch /var/log/console.log and chmod it to mode 600 before it will work
#console.info                                   /var/log/console.log
# uncomment this to enable logging of all log messages to /var/log/all.log
# touch /var/log/all.log and chmod it to mode 600 before it will work
#*.*                                            /var/log/all.log
# uncomment this to enable logging to a remote loghost named loghost
#*.*                                            @loghost
# uncomment these if you're running inn
# news.crit                                     /var/log/news/news.crit
# news.err                                      /var/log/news/news.err
# news.notice                                   /var/log/news/news.notice
# Uncomment this if you wish to see messages produced by devd
# !devd
# *.>=notice                                    /var/log/devd.log
!*
include                                         /etc/syslog.d
include                                         /usr/local/etc/syslog.d

in /var/log/messages you see the reboot without any issues reported:

Nov 1 04:00:13 freebsd ec2[777]: #############################################################
...
Nov 1 05:49:13 freebsd su[1602]: ec2-user to root on /dev/pts/5
Nov 1 08:56:09 freebsd syslogd: exiting on signal 15
Nov 1 08:59:01 freebsd syslogd: kernel boot file is /boot/kernel/kernel
Nov 1 08:59:01 freebsd kernel: ---<<BOOT>>---

not sure why syslogd is exiting on signal 15, that might have to do with the
lock-up, but more likely has to do with my force-stop action on the AWS
console.
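To put numbers on how close the "exiting on signal 15" line sits to the next boot, here is a minimal sketch that computes the two gaps from the /var/log/messages excerpt above. The function names are made up for illustration, and since syslog timestamps carry no year, 2020 is assumed from the date of this mail.

```python
from datetime import datetime

# Lines copied from the /var/log/messages excerpt above.
LOG_LINES = [
    "Nov 1 05:49:13 freebsd su[1602]: ec2-user to root on /dev/pts/5",
    "Nov 1 08:56:09 freebsd syslogd: exiting on signal 15",
    "Nov 1 08:59:01 freebsd syslogd: kernel boot file is /boot/kernel/kernel",
]


def parse_stamp(line, year=2020):
    """Parse the leading syslog timestamp; the year is an assumption."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}",
                             "%Y %b %d %H:%M:%S")


def gaps_around_shutdown(lines, year=2020):
    """Return (silence before the signal-15 line, delay until the next line)."""
    stamps = [parse_stamp(line, year) for line in lines]
    idx = next(i for i, line in enumerate(lines)
               if "exiting on signal 15" in line)
    return stamps[idx] - stamps[idx - 1], stamps[idx + 1] - stamps[idx]


silent, to_boot = gaps_around_shutdown(LOG_LINES)
print("silent before shutdown:", silent)
print("shutdown to next boot: ", to_boot)
```

A long silence followed by a short shutdown-to-boot delay would be consistent with the force-stop explanation, though it does not by itself rule out other causes.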
Here is the earlier lock-up sequence:

Oct 31 23:58:12 freebsd su[888]: ec2-user to root on /dev/pts/2
Oct 31 23:58:23 freebsd su[891]: ec2-user to root on /dev/pts/1
Oct 31 23:58:43 freebsd fsck[861]: /dev/gpt/varfs: 550 files, 1650 used, 125234 free (106 frags, 15641 blocks, 0.1% fragmentation)
Nov 1 03:48:09 freebsd syslogd: exiting on signal 15
Nov 1 03:50:35 freebsd syslogd: kernel boot file is /boot/kernel/kernel
Nov 1 03:50:35 freebsd kernel: ---<<BOOT>>---

about 4 hours like I estimated, and the "syslogd: exiting on signal 15" is
just too close to the next boot that I don't think it is a herald of the
lock-up, but rather it happens when I do the force-stop on the AWS EC2
management dashboard. And this would prove that we are not losing log messages
due to the lock-up. If the system had anything to say, it would have said it!

-- 
You are receiving this mail because:
You are the assignee for the bug.