Date:      Fri, 25 Sep 2009 11:11:25 -0400
From:      David Samms <dsamms@nw-ds.com>
To:        freebsd-stable@freebsd.org
Subject:   Re: May running megarc still cause memory corruption on 7.X?
Message-ID:  <h9imiu$35b$1@ger.gmane.org>
In-Reply-To: <81d45flayz.fsf@zhuzha.ua1>
References:  <81d45flayz.fsf@zhuzha.ua1>

Mikolaj Golub wrote:
> Hi,
> 
> Previously sysutils/megarc port was marked as broken with the statement:
> running megarc may cause memory corruption/system instability.
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/128082
> 
> But recently it has been re-enabled:
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/137938
> 
> Gerrit Beine (the maintainer) said that he verified on 7.2 and it worked.
> 
> But yesterday we had a panic on 7.1-RELEASE-p5 that looks like it was caused
> by megarc, with a backtrace identical to the one reported in ports/128082.
> 
> Unread portion of the kernel message buffer:
> TPTE at 0xbfd20830  IS ZERO @ VA 4820c000
> panic: bad pte
> cpuid = 0
> Uptime: 10h19m56s
> Physical memory: 3059 MB
> Dumping 225 MB: 210 194 178 162 146 130 114 98 82 66 50 34 18 2
> 
> (kgdb) backtrace
> #0  doadump () at pcpu.h:196
> #1  0xc07910a7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
> #2  0xc0791379 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:574
> #3  0xc0aa37f6 in pmap_remove_pages (pmap=0xc69ae6e4) at /usr/src/sys/i386/i386/pmap.c:3084
> #4  0xc09cf79c in vmspace_exit (td=0xc64f68c0) at /usr/src/sys/vm/vm_map.c:404
> #5  0xc076b6ad in exit1 (td=0xc64f68c0, rv=0) at /usr/src/sys/kern/kern_exit.c:305
> #6  0xc076ca0d in sys_exit (td=Could not find the frame base for "sys_exit".
> ) at /usr/src/sys/kern/kern_exit.c:109
> #7  0xc0aa81a5 in syscall (frame=0xe8d6ed38) at /usr/src/sys/i386/i386/trap.c:1090
> #8  0xc0a8e6e0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
> #9  0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> 
> (kgdb) allpcpu 
> cpuid        = 3
> curthread    = 0xc6ae3d20: pid 48975 "sh"
> curpcb       = 0xe8ea1d90
> fpcurthread  = none
> idlethread   = 0xc633daf0: pid 11 "idle: cpu3"
> switchticks  = 37193321
> 
> cpuid        = 2
> curthread    = 0xc633d8c0: pid 12 "idle: cpu2"
> curpcb       = 0xe4f10d90
> fpcurthread  = none
> idlethread   = 0xc633d8c0: pid 12 "idle: cpu2"
> switchticks  = 37193374
> 
> cpuid        = 1
> curthread    = 0xc633d690: pid 13 "idle: cpu1"
> curpcb       = 0xe4f13d90
> fpcurthread  = none
> idlethread   = 0xc633d690: pid 13 "idle: cpu1"
> switchticks  = 37193374
> 
> cpuid        = 0
> curthread    = 0xc64f68c0: pid 48980 "sh"
> curpcb       = 0xe8d6ed90
> fpcurthread  = none
> idlethread   = 0xc633d460: pid 14 "idle: cpu0"
> switchticks  = 37193321
> 
> (kgdb) ps
>   pid  ppid  pgrp   uid   state   wmesg     wchan    cmd
> 48980 48975 48975     0  RE      CPU  0              sh
> 48978 48976 48976     0  R                           megarc
> 48976 48973 48976     0  Ss      wait     0xc826e570 sh
> 48975 48972 48975     0  Rs      CPU  3              sh
> 48973   705   705     0  S       piperd   0xc8303318 cron
> 48972   705   705     0  S       piperd   0xc674a18c cron
> 48267 18141 18141    80  S       lockf    0xc83922c0 httpd
> 48266 18141 18141    80  S       lockf    0xc7d62400 httpd
> 48265 18141 18141    80  S       select   0xc0c4ecb8 httpd
> 48264 18141 18141    80  S       lockf    0xc7ceb240 httpd
> ...
> 
> At the moment of the crash, megarc was being run by cron (48973) at the same
> time as another cron job was started (we have the following script set up to
> run at the same time:
> 
> if [ -x /usr/local/bin/vnstat ] && [ `ls -l /var/db/vnstat/ | wc -l` -ge 1 ]; then /usr/local/bin/vnstat -u; fi)
> 
> and this sh process triggered the panic on exit, when the kernel tried to
> remove its address space, due to corrupted memory.
> 
> Should I add a comment about this to ports/137938? I have cc'd Gerrit. Please
> note, we are using 7.1-RELEASE-p5, while ports/137938 says it was checked on
> 7.2. But it might be that Gerrit just did not test long enough? We had megarc
> enabled on several 7.1 hosts for some time and saw only this one panic (well,
> there was another one about a week ago, but it hardly looked related, because
> megarc was not running at the moment of the crash and the panic occurred when
> removing an entry from the namecache; I reported it to hackers@).
> 
> Below are some details from the gdb session in case someone is interested in
> looking at this more closely.
> 
> (kgdb) allchains 
> # no output
> 
> (kgdb) fr 5
> #5  0xc076b6ad in exit1 (td=0xc64f68c0, rv=0) at /usr/src/sys/kern/kern_exit.c:305
> 305             vmspace_exit(td);
> (kgdb) p *td->td_proc
> $1 = {p_list = {le_next = 0xc69a2570, le_prev = 0xc0c433f8}, p_threads = {tqh_first = 0xc64f68c0, 
>     tqh_last = 0xc64f68c8}, p_upcalls = {tqh_first = 0x0, tqh_last = 0xc6502838}, p_slock = {
>     lock_object = {lo_name = 0xc0b3b5ae "process slock", lo_type = 0xc0b3b5ae "process slock", 
>       lo_flags = 720896, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, 
>     mtx_lock = 4, mtx_recurse = 0}, p_ucred = 0xc708f700, p_fd = 0x0, p_fdtol = 0x0, 
>   p_stats = 0xc64f8000, p_limit = 0xc7c60800, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {
>         tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0xc65028b8, 
>     c_flags = 0}, p_sigacts = 0xc7d00000, p_flag = 268443648, p_state = PRS_NORMAL, p_pid = 48980, 
>   p_hash = {le_next = 0x0, le_prev = 0xc632ad50}, p_pglist = {le_next = 0x0, le_prev = 0xc709b8a0}, 
>   p_pptr = 0xc709b828, p_sibling = {le_next = 0x0, le_prev = 0xc709b8b4}, p_children = {
>     lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xc0b3b5a1 "process lock", 
>       lo_type = 0xc0b3b5a1 "process lock", lo_flags = 21168128, lo_witness_data = {lod_list = {
>           stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, p_ksi = 0xc6655cd0, 
>   p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {
>       tqh_first = 0x0, tqh_last = 0xc65028f4}, sq_proc = 0xc6502828, sq_flags = 1}, p_oppid = 0, 
>   p_vmspace = 0xc69ae658, p_swtick = 37193315, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, 
>     it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {
>       tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, 
>     ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, 
>     ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {rux_runtime = 0, rux_uticks = 0, 
>     rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_crux = {
>     rux_runtime = 20485868, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 6784, 
>     rux_tu = 6784}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, 
>   p_tracecred = 0x0, p_textvp = 0xc66dce04, p_lock = 0 '\0', p_sigiolst = {slh_first = 0x0}, 
>   p_sigparent = 20, p_sig = 0, p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', 
>   p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, 
>   p_xthread = 0x0, p_boundary_count = 0, p_pendingcnt = 0, p_itimers = 0x0, p_numupcalls = 0, 
>   p_upsleeps = 0, p_completed = 0x0, p_nextupcall = 0, p_upquantum = 0, p_magic = 3203398350, 
>   p_osrel = 701000, p_comm = "sh\000n\000er", '\0' <repeats 12 times>, p_pgrp = 0xc839c5c0, 
>   p_sysent = 0xc0c0a6e0, p_args = 0xc7c25b00, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', 
>   p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {slh_first = 0x0}, 
>     kl_lock = 0xc0766af0 <knlist_mtx_lock>, kl_unlock = 0xc07664d0 <knlist_mtx_unlock>, 
>     kl_locked = 0xc07664b0 <knlist_mtx_locked>, kl_lockarg = 0xc65028b8}, p_numthreads = 1, p_md = {
>     md_ldt = 0x0}, p_itcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, 
>         tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0x0, c_flags = 16}, 
>   p_acflag = 1, p_peers = 0x0, p_leader = 0xc6502828, p_emuldata = 0x0, p_label = 0x0, 
>   p_sched = 0xc6502ae0, p_ktr = {stqh_first = 0x0, stqh_last = 0xc6502ad0}, p_mqnotifier = {
>     lh_first = 0x0}, p_dtrace = 0x0}
> (kgdb) p *td
> $8 = {td_lock = 0xc0c4bcc0, td_proc = 0xc6502828, td_plist = {tqe_next = 0x0, tqe_prev = 0xc6502830}, 
>   td_slpq = {tqe_next = 0x0, tqe_prev = 0xc632f040}, td_lockq = {tqe_next = 0x0, 
>     tqe_prev = 0xe8ee6a6c}, td_selq = {tqh_first = 0x0, tqh_last = 0xc64f68e0}, 
>   td_sleepqueue = 0xc632f040, td_turnstile = 0xc68d6eb0, td_umtxq = 0xc64d8840, td_tid = 100094, 
>   td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {
>       tqh_first = 0x0, tqh_last = 0xc64f6918}, sq_proc = 0xc6502828, sq_flags = 1}, td_flags = 65542, 
>   td_inhibitors = 0, td_pflags = 0, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0, 
>   td_lastcpu = 0 '\0', td_oncpu = 0 '\0', td_owepreempt = 0 '\0', td_locks = -2, td_tsqueue = 0 '\0', 
>   td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, 
>   td_intr_nesting_level = 0, td_pinned = 3, td_mailbox = 0x0, td_ucred = 0xc708f700, td_standin = 0x0, 
>   td_upcall = 0x0, td_estcpu = 0, td_slptick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, 
>     ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 1768, ru_ixrss = 1512, ru_idrss = 8792, 
>     ru_isrss = 1792, ru_minflt = 51, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, 
>     ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 2, ru_nivcsw = 1}, td_runtime = 3186278, 
>   td_pticks = 13, td_sticks = 14, td_iticks = 0, td_uticks = 0, td_uuticks = 0, td_usticks = 0, 
>   td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}}, 
>   td_generation = 3, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_kflags = 0, td_xsig = 0, 
>   td_profil_addr = 0, td_profil_ticks = 0, td_name = '\0' <repeats 19 times>, td_base_pri = 134 '\206', 
>   td_priority = 134 '\206', td_pri_class = 3 '\003', td_user_pri = 144 '\220', 
>   td_base_user_pri = 144 '\220', td_pcb = 0xe8d6ed90, td_state = TDS_RUNNING, td_retval = {0, 
>     134598480}, td_slpcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, 
>         tqe_prev = 0xda3550f0}}, c_time = 34564372, c_arg = 0xc64f68c0, 
>     c_func = 0xc07c2f90 <sleepq_timeout>, c_mtx = 0x0, c_flags = 18}, td_frame = 0xe8d6ed38, 
>   td_kstack_obj = 0xc677e554, td_kstack = 3906392064, td_kstack_pages = 2, td_altkstack_obj = 0x0, 
>   td_altkstack = 0, td_altkstack_pages = 0, td_critnest = 0, td_md = {md_spinlock_count = 0, 
>     md_saved_flags = 70}, td_sched = 0xc64f6abc, td_ar = 0x0, td_syscalls = 75641, 
>   td_incruntime = 3186278, td_cpuset = 0xc6331e38, td_fpop = 0x0, td_dtrace = 0x0, td_errno = 0}
> 
> (kgdb) thr 126
> [Switching to thread 126 (Thread 100083)]#0  sched_switch (td=0xc674cd20, newtd=Variable "newtd" is not available.
> )
>     at /usr/src/sys/kern/sched_ule.c:1944
> 1944                    cpuid = PCPU_GET(cpuid);
> (kgdb) backtrace
> #0  sched_switch (td=0xc674cd20, newtd=Variable "newtd" is not available.
> ) at /usr/src/sys/kern/sched_ule.c:1944
> #1  0xc0799136 in mi_switch (flags=Variable "flags" is not available.
> ) at /usr/src/sys/kern/kern_synch.c:440
> #2  0xc07c284b in sleepq_switch (wchan=Variable "wchan" is not available.
> ) at /usr/src/sys/kern/subr_sleepqueue.c:497
> #3  0xc07c2e96 in sleepq_wait (wchan=0xc6492f28) at /usr/src/sys/kern/subr_sleepqueue.c:580
> #4  0xc07995a6 in _sleep (ident=0xc6492f28, lock=0xc647592c, priority=76, wmesg=0xc0b042bb "amrwcmd", 
>     timo=0) at /usr/src/sys/kern/kern_synch.c:226
> #5  0xc04e8ca4 in amr_wait_command (ac=0xc6492f28) at /usr/src/sys/dev/amr/amr.c:1392
> #6  0xc04e9faa in amr_ioctl (dev=0xc645e700, cmd=3224388353, addr=0xc7bb1c40 "\003", flag=1, 
>     td=0xc674cd20) at /usr/src/sys/dev/amr/amr.c:914
> #7  0xc0755d37 in giant_ioctl (dev=0xc645e700, cmd=3224388353, data=0xc7bb1c40 "\003", fflag=1, 
>     td=0xc674cd20) at /usr/src/sys/kern/kern_conf.c:408
> #8  0xc071ff47 in devfs_ioctl_f (fp=0xc707063c, com=3224388353, data=0xc7bb1c40, cred=0xc707a200, 
>     td=0xc674cd20) at /usr/src/sys/fs/devfs/devfs_vnops.c:595
> #9  0xc07c8005 in kern_ioctl (td=0xc674cd20, fd=4, com=3224388353, data=0xc7bb1c40 "\003") at file.h:268
> #10 0xc07c8164 in ioctl (td=0xc674cd20, uap=0xe8d33cfc) at /usr/src/sys/kern/sys_generic.c:570
> #11 0xc0aa81a5 in syscall (frame=0xe8d33d38) at /usr/src/sys/i386/i386/trap.c:1090
> #12 0xc0a8e6e0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
> #13 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> 
> (kgdb) fr 4
> #4  0xc07995a6 in _sleep (ident=0xc6492f28, lock=0xc647592c, priority=76, wmesg=0xc0b042bb "amrwcmd", 
>     timo=0) at /usr/src/sys/kern/kern_synch.c:226
> 226                     sleepq_wait(ident);
> (kgdb) p *lock
> $2 = {lo_name = 0xc0b04655 "AMR List Lock", lo_type = 0xc0b04655 "AMR List Lock", lo_flags = 16973824, 
>   lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}
> 
> (kgdb) fr 5
> #5  0xc04e8ca4 in amr_wait_command (ac=0xc6492f28) at /usr/src/sys/dev/amr/amr.c:1392
> 1392            error = msleep(ac,&sc->amr_list_lock, PRIBIO, "amrwcmd", 0);
> (kgdb) p ac
> $3 = (struct amr_command *) 0xc6492f28
> (kgdb) p *ac
> $4 = {ac_link = {stqe_next = 0x0}, ac_sc = 0xc6475000, ac_slot = 118 'v', ac_status = 0, ac_sg = {
>     sg32 = 0xe6993fe0, sg64 = 0xe6993fe0}, ac_sgbusaddr = 21716960, ac_sg64_lo = 0, ac_sg64_hi = 0, 
>   ac_mailbox = {mb_command = 3 '\003', mb_ident = 119 'w', mb_blkcount = 0, mb_lba = 0, 
>     mb_physaddr = 21773056, mb_drive = 0 '\0', mb_nsgelem = 0 '\0', res1 = 0 '\0', mb_busy = 0 '\0', 
>     mb_nstatus = 0 '\0', mb_status = 0 '\0', mb_completed = '\0' <repeats 45 times>, mb_poll = 0 '\0', 
>     mb_ack = 0 '\0', res2 = '\0' <repeats 15 times>}, ac_flags = 71, ac_retries = 0, ac_bio = 0x0, 
>   ac_complete = 0, ac_private = 0x0, ac_data = 0xc672ac90, ac_length = 8, ac_dmamap = 0x0, 
>   ac_dma64map = 0x0, ac_tag = 0xc647c600, ac_datamap = 0x0, ac_nsegments = 0, 
>   ac_mb_physaddr = 27487376, ac_ccb = 0xe699eb00, ac_ccb_busaddr = 21773056}
> 
> (kgdb) fr 6
> #6  0xc04e9faa in amr_ioctl (dev=0xc645e700, cmd=3224388353, addr=0xc7bb1c40 "\003", flag=1, 
>     td=0xc674cd20) at /usr/src/sys/dev/amr/amr.c:914
> 914         error = amr_wait_command(ac);
> (kgdb) p *td
> $6 = {td_lock = 0xc0c4bcc0, td_proc = 0xc69a2570, td_plist = {tqe_next = 0x0, tqe_prev = 0xc69a2578}, 
>   td_slpq = {tqe_next = 0x0, tqe_prev = 0xc632f3e0}, td_lockq = {tqe_next = 0x0, 
>     tqe_prev = 0xe8fe9a6c}, td_selq = {tqh_first = 0x0, tqh_last = 0xc674cd40}, 
>   td_sleepqueue = 0xc632f3e0, td_turnstile = 0xc7c95460, td_umtxq = 0xc66a2500, td_tid = 100083, 
>   td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {
>       tqh_first = 0x0, tqh_last = 0xc674cd78}, sq_proc = 0xc69a2570, sq_flags = 1}, td_flags = 4, 
>   td_inhibitors = 0, td_pflags = 0, td_dupfd = -1, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0, 
>   td_lastcpu = 1 '\001', td_oncpu = 255 'ÿ', td_owepreempt = 0 '\0', td_locks = 6, td_tsqueue = 0 '\0', 
>   td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, 
>   td_intr_nesting_level = 0, td_pinned = 0, td_mailbox = 0x0, td_ucred = 0xc707a200, td_standin = 0x0, 
>   td_upcall = 0x0, td_estcpu = 0, td_slptick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, 
>     ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, 
>     ru_minflt = 39, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, 
>     ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 9, ru_nivcsw = 1}, td_runtime = 6243427, td_pticks = 0, 
>   td_sticks = 0, td_iticks = 0, td_uticks = 0, td_uuticks = 0, td_usticks = 0, td_intrval = 0, 
>   td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}}, td_generation = 10, 
>   td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_kflags = 0, td_xsig = 0, td_profil_addr = 0, 
>   td_profil_ticks = 0, td_name = '\0' <repeats 19 times>, td_base_pri = 76 'L', td_priority = 76 'L', 
>   td_pri_class = 3 '\003', td_user_pri = 128 '\200', td_base_user_pri = 128 '\200', 
>   td_pcb = 0xe8d33d90, td_state = TDS_CAN_RUN, td_retval = {0, 17}, td_slpcallout = {c_links = {sle = {
>         sle_next = 0xc0c3f8d0}, tqe = {tqe_next = 0xc0c3f8d0, tqe_prev = 0xda33fce8}}, 
>     c_time = 34422419, c_arg = 0xc674cd20, c_func = 0xc07c2f90 <sleepq_timeout>, c_mtx = 0x0, 
>     c_flags = 18}, td_frame = 0xe8d33d38, td_kstack_obj = 0xc68e9c1c, td_kstack = 3906150400, 
>   td_kstack_pages = 2, td_altkstack_obj = 0x0, td_altkstack = 0, td_altkstack_pages = 0, 
>   td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 582}, td_sched = 0xc674cf1c, 
>   td_ar = 0x0, td_syscalls = 99004, td_incruntime = 6243427, td_cpuset = 0xc6331e38, 
>   td_fpop = 0xc707063c, td_dtrace = 0x0, td_errno = 0}
> (kgdb) p *td->td_proc
> $7 = {p_list = {le_next = 0xc826e570, le_prev = 0xc6502828}, p_threads = {tqh_first = 0xc674cd20, 
>     tqh_last = 0xc674cd28}, p_upcalls = {tqh_first = 0x0, tqh_last = 0xc69a2580}, p_slock = {
>     lock_object = {lo_name = 0xc0b3b5ae "process slock", lo_type = 0xc0b3b5ae "process slock", 
>       lo_flags = 720896, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, 
>     mtx_lock = 4, mtx_recurse = 0}, p_ucred = 0xc707a200, p_fd = 0xc7c52d00, p_fdtol = 0x0, 
>   p_stats = 0xc674fd00, p_limit = 0xc7c34000, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {
>         tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0xc69a2600, 
>     c_flags = 0}, p_sigacts = 0xc7c76000, p_flag = 268451840, p_state = PRS_NORMAL, p_pid = 48978, 
>   p_hash = {le_next = 0x0, le_prev = 0xc632ad48}, p_pglist = {le_next = 0x0, le_prev = 0xc826e5e8}, 
>   p_pptr = 0xc826e570, p_sibling = {le_next = 0x0, le_prev = 0xc826e5fc}, p_children = {
>     lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xc0b3b5a1 "process lock", 
>       lo_type = 0xc0b3b5a1 "process lock", lo_flags = 21168128, lo_witness_data = {lod_list = {
>           stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, p_ksi = 0xc6656a00, 
>   p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {
>       tqh_first = 0x0, tqh_last = 0xc69a263c}, sq_proc = 0xc69a2570, sq_flags = 1}, p_oppid = 0, 
>   p_vmspace = 0xc6fb7488, p_swtick = 37193314, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, 
>     it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {
>       tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, 
>     ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, 
>     ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {rux_runtime = 0, rux_uticks = 0, 
>     rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_crux = {rux_runtime = 0, 
>     rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, 
>   p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, p_tracecred = 0x0, 
>   p_textvp = 0xc6f368a0, p_lock = 0 '\0', p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0, 
>   p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0, 
>   p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0, 
>   p_pendingcnt = 0, p_itimers = 0x0, p_numupcalls = 0, p_upsleeps = 0, p_completed = 0x0, 
>   p_nextupcall = 0, p_upquantum = 0, p_magic = 3203398350, p_osrel = 502010, 
>   p_comm = "megarc", '\0' <repeats 13 times>, p_pgrp = 0xc839d140, p_sysent = 0xc0c0a6e0, 
>   p_args = 0xc7bd6a00, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0, 
>   p_klist = {kl_list = {slh_first = 0x0}, kl_lock = 0xc0766af0 <knlist_mtx_lock>, 
>     kl_unlock = 0xc07664d0 <knlist_mtx_unlock>, kl_locked = 0xc07664b0 <knlist_mtx_locked>, 
>     kl_lockarg = 0xc69a2600}, p_numthreads = 1, p_md = {md_ldt = 0x0}, p_itcallout = {c_links = {sle = {
>         sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, 
>     c_mtx = 0x0, c_flags = 16}, p_acflag = 0, p_peers = 0x0, p_leader = 0xc69a2570, p_emuldata = 0x0, 
>   p_label = 0x0, p_sched = 0xc69a2828, p_ktr = {stqh_first = 0x0, stqh_last = 0xc69a2818}, 
>   p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0x0}
> 
> I am keeping the vmcore in case any additional output is needed.
> 
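
A side note on the quoted panic message "TPTE at 0xbfd20830 IS ZERO @ VA
4820c000": the two addresses are related by simple arithmetic, which can help
when reading such a message. The sketch below is only an illustration. The
page-table map base 0xbfc00000, the 4-byte PTE size and the 4 KB page size are
assumptions for an i386 kernel without PAE; none of them is stated in the
thread, but together they reproduce the address from the panic message.

```python
# Assumed constants for an i386 kernel without PAE (not stated in the thread).
PAGE_SHIFT = 12          # 4 KB pages
PTE_SIZE = 4             # bytes per page table entry
PTMAP_BASE = 0xBFC00000  # assumed base of the recursive page table mapping


def pte_address(va: int) -> int:
    """Return the kernel virtual address of the PTE that maps `va`."""
    return PTMAP_BASE + (va >> PAGE_SHIFT) * PTE_SIZE


# User virtual address taken from the panic message.
panic_va = 0x4820C000
print(hex(pte_address(panic_va)))  # prints 0xbfd20830
```

Under these assumptions, the message says that the page table entry for user
address 0x4820c000 in the exiting sh process was unexpectedly zero.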

I run 7.2, and after seeing the note that the megarc port should work on 7.2,
I re-synced the source and installed it on a production server.

I ran megarc just once, and the server locked up within 8 hours while
under very light load.  I was not able to confirm the crash was related
to megarc, but since it was the first server crash since 7.2 came out,
I strongly suspect megarc.
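
For anyone comparing traces: the cmd value 3224388353 in the quoted amr_ioctl
frame can be split into its fields. The sketch below assumes the usual BSD
ioctl encoding from sys/ioccom.h (direction bits on top, a 13-bit length, then
group and number); the helper name decode_ioctl is made up for illustration.

```python
# Field layout assumed from the BSD ioctl encoding in sys/ioccom.h.
IOCPARM_MASK = 0x1FFF  # 13-bit parameter length
IOC_OUT = 0x40000000   # kernel copies data out to userland
IOC_IN = 0x80000000    # kernel copies data in from userland


def decode_ioctl(cmd: int) -> dict:
    """Split an ioctl command word into direction, length, group, number."""
    direction = []
    if cmd & IOC_IN:
        direction.append("in")
    if cmd & IOC_OUT:
        direction.append("out")
    return {
        "direction": "+".join(direction) or "none",
        "length": (cmd >> 16) & IOCPARM_MASK,
        "group": chr((cmd >> 8) & 0xFF),
        "number": cmd & 0xFF,
    }


# Value of `cmd` in frame #6 of the megarc thread's backtrace.
print(decode_ioctl(3224388353))
```

This gives group 'C', number 1, a 48-byte argument and both copy directions,
i.e. 0xc0304301. That would be consistent with megarc passing a user command
structure through the amr driver, although the thread itself does not say so.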



