Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 16 Jul 2006 23:23:38 +0900
From:      Norikatsu Shigemura <nork@FreeBSD.org>
To:        freebsd-current@FreeBSD.org
Subject:   many thread applications are unstable on 7-current
Message-ID:  <20060716232338.2357f50a.nork@FreeBSD.org>

next in thread | raw e-mail | index | archive | help
	I think that this is FYI, maybe.  Sorry, I don't know why it causes.
	I could not trace problems.

	The problems are happened for about one ~ two month, but I report
	following environment with SYMVER_ENABLE=YES.  I don't know what
	happened on 7-current without SYMBER_ENABLE.
FreeBSD nadesico.ninth-nine.com 7.0-CURRENT FreeBSD 7.0-CURRENT #87: Sun Jul 16 11:48:32 JST 2006     nork@nadesico.ninth-nine.com:/usr/obj/usr/src/sys/NADESICO  i386


	The applications which I troubled are following:
	1. cvsync (net/cvsync)
	2. openoffice.org (editors/openoffice.org-2.0)
	3. firefox (www/firefox)
	4. scim-launcher (textproc/scim)
	   It offten crash, and automatically recovery.  And it is too hard
	   to trace:-(.  So I didn't trace it.

	----------------------------------------------------------
	1. cvsync (net/cvsync)

# ktrace -i /usr/local/bin/cvsync -v -c FreeBSD.conf
	:
Updating (collection freebsd-cvsroot-ports/rcs)
/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R: Unknown error: 0
Updater(RCS): UPDATE: /home/ncvs/CVSROOT-ports/commitlogs/ports
Updater: RCS Error
Socket Error: recv: 2 residue 2
Receiver Error: recv
Mux(RECV) Error: not running: 0
FileScan: RCS Error
Mux(SEND) Error: not running: 0
DirScan: RCS Error
Failed

	I got following kdump's result:
(snip)
  4667 cvsync   CALL  utimes(0x831071c,0xbf5faeb0)
  4667 cvsync   NAMI  "/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R"
  4667 cvsync   RET   utimes 0
  4667 cvsync   CALL  rename(0x831071c,0x8310218)
  4667 cvsync   NAMI  "/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R"
  4667 cvsync   NAMI  "/home/ncvs/CVSROOT-ports/commitlogs/ports"
  4667 cvsync   RET   rename 0
  4667 cvsync   CALL  unlink(0x831071c)
  4667 cvsync   NAMI  "/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R"
  4667 cvsync   RET   unlink -1 errno 2 No such file or directory
  4667 cvsync   CALL  write(0x2,0xbf5fa690,0x45)
  4667 cvsync   GIO   fd 2 wrote 69 bytes
       "/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R: Unknown error: 0
       "
  4667 cvsync   RET   write 69/0x45
(snip)

	It looks like errno cross-talking per threads.


	----------------------------------------------------------
	2. openoffice.org (editors/openoffice.org-2.0)

$ ktrace -i /usr/local/bin/openoffice.org
  (Open new document)
X IO Error

	I got following kdump's result:
(snip)
  4720 soffice.bin CALL  socket(PF_LOCAL|PF_UNIX,SOCK_STREAM,0)
  4720 soffice.bin RET   socket 36/0x24
(snip)
  4720 soffice.bin CALL  connect(0x24,0xbfbfceb8,0x13)
  4720 soffice.bin NAMI  "/tmp/.X11-unix/X0"
  4720 soffice.bin RET   connect 0
(snip)
  4720 soffice.bin CALL  fcntl(0x24,F_SETFD,FD_CLOEXEC)
  4720 soffice.bin RET   fcntl 0
(snip)
  4720 soffice.bin CALL  write(0x24,0x9014000,0x8)
  4720 soffice.bin GIO   fd 36 wrote 8 bytes
       0x0000 1700 0200 2401 0000                      |....$...|

  4720 soffice.bin RET   write 8
  4720 soffice.bin CALL  read(0x24,0xbf4f9aac,0x20)
  4720 soffice.bin RET   read -1 errno 35 Resource temporarily unavailable
  4720 soffice.bin CALL  write(0x2,0xbf4f9404,0xb)
  4720 soffice.bin GIO   fd 2 wrote 11 bytes
       "X IO Error
       "
  4720 soffice.bin RET   write 11/0xb
  4720 soffice.bin CALL  exit(0)

	Sorry, I don't have any coments.


	----------------------------------------------------------
	3. firefox (www/firefox)
$ ktrace -i /usr/X11R6/bin/firefox http://www.google.co.jp/

	I got following kdump's result:
(snip)
  4862 firefox-bin CALL  socket(PF_INET,SOCK_STREAM,0)
  4862 firefox-bin RET   socket 34/0x22
  4862 firefox-bin CALL  fcntl(0x22,F_GETFL,0xbf8fdb38)
  4862 firefox-bin RET   fcntl 2
  4862 firefox-bin CALL  fcntl(0x22,F_SETFL,O_RDWR|O_NONBLOCK)
  4862 firefox-bin RET   fcntl 0
  4862 firefox-bin CALL  connect(0x22,0xbf8fdc50,0x10)
  4862 firefox-bin RET   connect -1 errno 36 Operation now in progress
  4862 firefox-bin CALL  write(0x9,0x68284d2d,0x1)
  4862 firefox-bin GIO   fd 9 wrote 1 byte
       "8"
  4862 firefox-bin RET   write 1
  4862 firefox-bin CALL  close(0x22)
  4862 firefox-bin RET   close 0
(snip)
  4862 firefox-bin CALL  socket(PF_INET,SOCK_STREAM,0)
  4862 firefox-bin RET   socket 35/0x23
  4862 firefox-bin CALL  fcntl(0x23,F_GETFL,0xbf8fdb38)
  4862 firefox-bin RET   fcntl 2
  4862 firefox-bin CALL  fcntl(0x23,F_SETFL,O_RDWR|O_NONBLOCK)
  4862 firefox-bin RET   fcntl 0
  4862 firefox-bin CALL  connect(0x23,0xbf8fdc50,0x10)
  4862 firefox-bin RET   connect -1 errno 36 Operation now in progress
  4862 firefox-bin CALL  write(0x9,0x68284d2d,0x1)
  4862 firefox-bin GIO   fd 9 wrote 1 byte
       "8"
  4862 firefox-bin RET   write 1
  4862 firefox-bin CALL  close(0x23)
  4862 firefox-bin RET   close 0
(snip)

	I got following result of 'tcpdump -pn not port ssh'.

23:10:32.340676 IP 219.127.74.122.54603 > 66.249.89.104.80: S 3969568294:3969568294(0) win 65535 <mss 1414,nop,wscale 1,nop,nop,timestamp 29620081 0,sackOK,eol>
23:10:32.358419 IP 66.249.89.104.80 > 219.127.74.122.54603: S 2563066654:2563066654(0) ack 3969568295 win 8190 <mss 1402>
23:10:32.358486 IP 219.127.74.122.54603 > 66.249.89.104.80: R 3969568295:3969568295(0) win 0
23:10:32.823420 IP 219.127.74.122.54748 > 63.245.209.21.80: S 3885439161:3885439161(0) win 65535 <mss 1414,nop,wscale 1,nop,nop,timestamp 29620564 0,sackOK,eol>
23:10:32.970835 IP 63.245.209.21.80 > 219.127.74.122.54748: S 2028454746:2028454746(0) ack 3885439162 win 8190 <mss 1402>
23:10:32.970930 IP 219.127.74.122.54748 > 63.245.209.21.80: R 3885439162:3885439162(0) win 0

	According to tcpdump, this is two times connect(2) failure are
	matched.

219.127.74.122:54603 -> 66.249.89.104:80	SYN
219.127.74.122:54603 <- 66.249.89.104:80	SYN+ACK
219.127.74.122:54603 -> 66.249.89.104:80	RSET

219.127.74.122.54748 -> 63.245.209.21.80	SYN
219.127.74.122.54748 <- 63.245.209.21.80	SYN+ACK
219.127.74.122.54748 -> 63.245.209.21.80	RSET

	I know why connect(2) is falure, but I don't know why close(2).



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060716232338.2357f50a.nork>