Date:      Tue, 7 Feb 2012 17:46:49 +0000
From:      Anton Shterenlikht <mexas@bristol.ac.uk>
To:        freebsd-ia64@freebsd.org
Cc:        xcllnt@mac.com
Subject:   Re: fatal kernel trap
Message-ID:  <20120207174649.GA89244@mech-cluster241.men.bris.ac.uk>
In-Reply-To: <20120207111557.GA82299@mech-cluster241.men.bris.ac.uk>
References:  <20120206142239.GA71689@mech-cluster241.men.bris.ac.uk> <20120206144444.GA71830@mech-cluster241.men.bris.ac.uk> <20120207094713.GA81250@mech-cluster241.men.bris.ac.uk> <20120207102305.GA81545@mech-cluster241.men.bris.ac.uk> <20120207111557.GA82299@mech-cluster241.men.bris.ac.uk>

On Tue, Feb 07, 2012 at 11:15:57AM +0000, Anton Shterenlikht wrote:
> On Tue, Feb 07, 2012 at 10:23:05AM +0000, Anton Shterenlikht wrote:
> > On Tue, Feb 07, 2012 at 09:47:13AM +0000, Anton Shterenlikht wrote:
> > > On Mon, Feb 06, 2012 at 02:44:44PM +0000, Anton Shterenlikht wrote:
> > > > > fatal kernel trap (cpu 1):
> > > > > 
> > > > >     trap vector = 0x14 (Page Not Present)
> > > > >     cr.iip      = 0x9ffc0000008cb960
> > > > >     cr.ipsr     = 0x1010080a6018 (ac,mfl,ic,i,dt,dfh,rt,cpl=0,it,ri=0,bn)
> > > > >     cr.isr      = 0x400000000 (code=0,vector=0,r,ei=0)
> > > > >     cr.ifa      = 0x168
> > > > >     curthread   = 0xe000000011a9f9e0
> > > > >         pid = 760, comm = dig
> > > > > 
> > > > > [ thread pid 760 tid 100073 ]
> > > > > Stopped at      cpu_set_upcall+0x190:   [M0]    ld8 r14=[r14] ;;
> > > > > db> 
> > > > > db> show proc 760
> > > > > Process 760 (dig) at 0xe000000011a9a8e0:
> > > > >  state: NORMAL
> > > > >  uid: 0  gids: 0
> > > > >  parent: pid 759 at 0xe000000011b64000
> > > > >  ABI: FreeBSD ELF64
> > > > >  arguments: dig
> > > > >  threads: 1
> > > > > 100073                   Run     CPU 1                       dig
> > > > > db> 
> > > > > db> thread 100073
> > > > > [ thread pid 760 tid 100073 ]
> > > > > cpu_set_upcall+0x190:   [M0]    ld8 r14=[r14] ;;
> > > > > db>
> > > > > db> bt
> > > > > Tracing pid 760 tid 100073 td 0xe000000011a9f9e0
> > > > > cpu_set_upcall(0xe000000011a9e8a0, 0xe000000011a9f9e0, 0xa0000000f87ab780, 0xa0000000f87ab550) at cpu_set_upcall+0x190
> > > > > create_thread(0xe000000011a9f9e0, 0x0, 0x1209a7090, 0x120c04800, 0x7fffffffff9fe000, 0x200000, 0x12039c200, 0x120c04800) at create_thread+0x1c0
> > > > > kern_thr_new(0xe000000011a9f9e0, 0xa0000000f872b330, 0x9ffc000000436360) at kern_thr_new+0x100
> > > > > sys_thr_new(0xe000000011a9f9e0, 0xa0000000f872b4e8, 0x9ffc0000008c6bf0, 0x48d) at sys_thr_new+0xa0
> > > > > syscall(0xe000000011a9a8e0, 0xa0000000f872b3a8, 0x120c0442c, 0xe000000011a9f9e0, 0x0, 0x0, 0x9ffc0000008c2ec0, 0x8) at syscall+0x550
> > > > > epc_syscall_return() at epc_syscall_return
> > > > > db> 
> > > 
> > > This panic now happens very often:
> > > 
> > > fatal kernel trap (cpu 0):
> > > 
> > >     trap vector = 0x14 (Page Not Present)
> > >     cr.iip      = 0x9ffc0000008cb960
> > >     cr.ipsr     = 0x1010080a6018 (ac,mfl,ic,i,dt,dfh,rt,cpl=0,it,ri=0,bn)
> > >     cr.isr      = 0x400000000 (code=0,vector=0,r,ei=0)
> > >     cr.ifa      = 0x168
> > >     curthread   = 0xe0000000114159e0
> > >         pid = 57093, comm = csup
> > > 
> > > [ thread pid 57093 tid 100058 ]
> > > Stopped at      cpu_set_upcall+0x190:   [M0]    ld8 r14=[r14] ;;
> > > db> 
> > > db> show thread
> > > Thread 100058 at 0xe0000000114159e0:
> > >  proc (pid 57093): 0xe00000001198ed50
> > >  name: csup
> > >  stack: 0xa0000000ecd4e000-0xa0000000ecd55fff
> > >  flags: 0x4  pflags: 0
> > >  state: RUNNING (CPU 0)
> > >  priority: 126
> > >  container lock: sched lock 0 (0x9ffc000000ca6a00)
> > > db> 
> > > db> show proc
> > > Process 57093 (csup) at 0xe00000001198ed50:
> > >  state: NORMAL
> > >  uid: 0  gids: 0, 5
> > >  parent: pid 57091 at 0xe000000011aac470
> > >  ABI: FreeBSD ELF64
> > >  arguments: /usr/bin/csup
> > >  threads: 1
> > > 100058                   Run     CPU 0                       csup
> > > db> 
> > > db> bt
> > > Tracing pid 57093 tid 100058 td 0xe0000000114159e0
> > > cpu_set_upcall(0xe000000015e79590, 0xe0000000114159e0, 0xa0000000f89fb780, 0xa0000000f89fb550) at cpu_set_upcall+0x190
> > > create_thread(0xe0000000114159e0, 0x0, 0x1200b3890, 0x120805800, 0x7fffffffff9fe000, 0x200000, 0x1200b6200, 0x120805800) at create_thread+0x1c0
> > > kern_thr_new(0xe0000000114159e0, 0xa0000000ecd55330, 0x9ffc000000436360) at kern_thr_new+0x100
> > > sys_thr_new(0xe0000000114159e0, 0xa0000000ecd554e8, 0x9ffc0000008c6bf0, 0x48d) at sys_thr_new+0xa0
> > > syscall(0xe00000001198ed50, 0xa0000000ecd553a8, 0x12080442c, 0xe0000000114159e0, 0x0, 0x0, 0x9ffc0000008c2ec0, 0x8) at syscall+0x550
> > > epc_syscall_return() at epc_syscall_return
> > > db> 
> > 
> > and again:
> > 
> > fatal kernel trap (cpu 0):
> > 
> >     trap vector = 0x14 (Page Not Present)
> >     cr.iip      = 0x9ffc0000008d1960
> >     cr.ipsr     = 0x1010080a6018 (ac,mfl,ic,i,dt,dfh,rt,cpl=0,it,ri=0,bn)
> >     cr.isr      = 0x400000000 (code=0,vector=0,r,ei=0)
> >     cr.ifa      = 0x168
> >     curthread   = 0xe000000011ed88a0
> >         pid = 1002, comm = csup
> > 
> > [ thread pid 1002 tid 100104 ]
> > Stopped at      cpu_set_upcall+0x190:   [M0]    ld8 r14=[r14] ;;
> > db> show thread
> > Thread 100104 at 0xe000000011ed88a0:
> >  proc (pid 1002): 0xe000000011b70470
> >  name: csup
> >  stack: 0xa0000000f881c000-0xa0000000f8823fff
> >  flags: 0x4  pflags: 0
> >  state: RUNNING (CPU 0)
> >  priority: 121
> >  container lock: sched lock 0 (0x9ffc000000cb6b80)
> > db> show proc
> > Process 1002 (csup) at 0xe000000011b70470:
> >  state: NORMAL
> >  uid: 0  gids: 0, 5
> >  parent: pid 998 at 0xe000000011fc8470
> >  ABI: FreeBSD ELF64
> >  arguments: csup
> >  threads: 1
> > 100104                   Run     CPU 0                       csup
> > db> bt
> > Tracing pid 1002 tid 100104 td 0xe000000011ed88a0
> > cpu_set_upcall(0xe000000011b8a000, 0xe000000011ed88a0, 0xa0000000f889b780, 0xa0000000f889b550) at cpu_set_upcall+0x190
> > create_thread(0xe000000011ed88a0, 0x0, 0x1200b3890, 0x120805800, 0x7fffffffff9fe000, 0x200000, 0x1200b6200, 0x120805800) at create_thread+0x1c0
> > kern_thr_new(0xe000000011ed88a0, 0xa0000000f8823330, 0x9ffc0000004363d0) at kern_thr_new+0x100
> > sys_thr_new(0xe000000011ed88a0, 0xa0000000f88234e8, 0x9ffc0000008ccbf0, 0x48d) at sys_thr_new+0xa0
> > syscall(0xe000000011b70470, 0xa0000000f88233a8, 0x12080442c, 0xe000000011ed88a0, 0x0, 0x0, 0x9ffc0000008c80a0, 0x8) at syscall+0x550
> > epc_syscall_return() at epc_syscall_return
> > db> 
> 
> Now it's ruby:
> 
> fatal kernel trap (cpu 1):
> 
>     trap vector = 0x14 (Page Not Present)
>     cr.iip      = 0x9ffc0000008d1960
>     cr.ipsr     = 0x1010080a6018 (ac,mfl,ic,i,dt,dfh,rt,cpl=0,it,ri=0,bn)
>     cr.isr      = 0x400000000 (code=0,vector=0,r,ei=0)
>     cr.ifa      = 0x168
>     curthread   = 0xe000000012122000
>         pid = 3832, comm = ruby
> 
> [ thread pid 3832 tid 100130 ]
> Stopped at      cpu_set_upcall+0x190:   [M0]    ld8 r14=[r14] ;;
> db> show thread
> Thread 100130 at 0xe000000012122000:
>  proc (pid 3832): 0xe000000012112d50
>  name: ruby
>  stack: 0xa0000000f88ec000-0xa0000000f88f3fff
>  flags: 0x10004  pflags: 0
>  state: RUNNING (CPU 1)
>  priority: 182
>  container lock: sched lock 1 (0x9ffc000000cb7800)
> db> show proc
> Process 3832 (ruby) at 0xe000000012112d50:
>  state: NORMAL
>  uid: 0  gids: 0, 5
>  parent: pid 2574 at 0xe000000011c78470
>  ABI: FreeBSD ELF64
>  arguments: /usr/local/bin/ruby
>  threads: 1
> 100130                   Run     CPU 1                       ruby
> db> bt
> Tracing pid 3832 tid 100130 td 0xe000000012122000
> cpu_set_upcall(0xe000000012120cf0, 0xe000000012122000, 0xa0000000f8913780, 0xa0000000f8913550) at cpu_set_upcall+0x190
> create_thread(0xe000000012122000, 0x0, 0x14064bac0, 0x140804800, 0x7ffffffff3bfe000, 0xc000000, 0x14008c200, 0x140804800) at create_thread+0x1c0
> kern_thr_new(0xe000000012122000, 0xa0000000f88f3330, 0x9ffc0000004363d0) at kern_thr_new+0x100
> sys_thr_new(0xe000000012122000, 0xa0000000f88f34e8, 0x9ffc0000008ccbf0, 0x48d) at sys_thr_new+0xa0
> syscall(0xe000000012112d50, 0xa0000000f88f33a8, 0x14080442c, 0xe000000012122000, 0x0, 0x0, 0x9ffc0000008c80a0, 0x8) at syscall+0x550
> epc_syscall_return() at epc_syscall_return
> db> 

Marcel, these panics make the system unusable on
r231087 and r231124. I'll try rolling back 1000 or
so revisions and testing again. Let me know if you
want any more info on this panic.
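
All four dumps quoted above report the same cr.ifa (0x168) and the
same stop location (cpu_set_upcall+0x190), with only the process
differing (dig, csup, csup, ruby). To compare further dumps quickly,
something like the following could pull out the key fields. This is
only a rough sketch: parse_trap is a made-up helper name, not part of
any FreeBSD tool, and it assumes the dump layout shown in this thread.

```python
import re

def parse_trap(text):
    """Extract key fields from one 'fatal kernel trap' dump.

    Hypothetical helper; tolerates '> ' mail quoting at any depth.
    """
    fields = {}
    for raw in text.splitlines():
        # Drop leading '> > >' quote markers, then surrounding whitespace.
        line = re.sub(r"^(>\s?)+", "", raw).strip()
        m = re.match(r"(cr\.\w+|curthread)\s*=\s*(0x[0-9a-fA-F]+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2), 16)
            continue
        m = re.match(r"pid = (\d+), comm = (\S+)", line)
        if m:
            fields["pid"] = int(m.group(1))
            fields["comm"] = m.group(2)
            continue
        m = re.match(r"Stopped at\s+(\S+):", line)
        if m:
            fields["stopped_at"] = m.group(1)
    return fields

# Shortened copy of the ruby dump from this thread.
sample = """\
> fatal kernel trap (cpu 1):
>     cr.iip      = 0x9ffc0000008d1960
>     cr.ifa      = 0x168
>         pid = 3832, comm = ruby
> Stopped at      cpu_set_upcall+0x190:   [M0]    ld8 r14=[r14] ;;
"""
info = parse_trap(sample)
print(info["comm"], hex(info["cr.ifa"]), info["stopped_at"])
# prints: ruby 0x168 cpu_set_upcall+0x190
```

Running it over each saved dump and comparing the resulting
dictionaries would show at a glance whether a new panic matches
this one.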

Thanks
Anton

-- 
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 331 5944
Fax: +44 (0)117 929 4423


