Date: Thu, 02 Nov 2017 15:13:34 +0000
From: bugzilla-noreply@freebsd.org
To: chromium@FreeBSD.org
Subject: [Bug 212812] www/chromium: tabs "hang" 10% of the time
Message-ID: <bug-212812-28929-oTPBWwEPjF@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-212812-28929@https.bugs.freebsd.org/bugzilla/>
References: <bug-212812-28929@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212812

--- Comment #33 from Vince Mulhollon <vince.mulhollon@springcitysolutions.com> ---
The problem seems to have been getting worse over the last few months and the last few upgrades, to the point that I recently had to switch to Firefox, now that the tab rendering success rate is well below 10%.

I have tried the following from the bug report, with no change in the rate of tab rendering failure:

- Disable V8 caching in chrome://flags (a page which itself takes 10-20 attempts to render).
- Wipe .config/chromium (multiple times; the first time I did that I wiped over a gig of junk).
- Disable hardware acceleration in chrome://settings, "Use hardware acceleration when available".
- Multiple machines: I have three desktops using NFS home / LDAP etc., so it's trivial to try chromium across multiple machines, all with identical problems.

I had various older versions of the proprietary nvidia driver on my machines, and I've upgraded them multiple times since the problem began. Per nvidia-smi, as of today I'm running the latest "Driver Version: 384.59", and there is no change in tab hanging.

Because I have multiple different machines on the desk, I've experienced the same failure on six nvidia cards of various ages and models: two older (circa 2014) geforce gtx 730 connected via analog VGA at 1280x1024, one ancient (circa 2011) geforce gtx 560ti on a DVI connection at 1600x1200, and three new (circa 2017) geforce gtx 1050ti on a displayport connection at 2560x1440, 144 Hz.

I have no hardware problems: I can run CPU/GPU intensive workloads such as firefox or minecraft for hours without any crashes. The bug I'm experiencing is solely that chromium tab opening fails 90%+ of the time; there is never a graphics (or other subsystem) hang or a kernel crash. I can and do run onshape.com for hours in firefox. It is an online HTML5 professional CAD program, which would seem to torture test 3D graphics rendering and hardware acceleration in an HTML5 browser.
I have an onshape CAD drawing open in another firefox tab while I'm typing this; flawless operation, at least in firefox.

Whatever is wrong, it probably doesn't relate to graphics hardware, hardware acceleration, or graphics drivers, given that chromium hangs the same way and at the same rate of failure regardless of whatever hardware I throw at it; admittedly I don't have access to any non-nvidia graphics hardware. I've tried a couple of combinations of the above: old card with hardware acceleration disabled and caching disabled but the latest driver; config wiped and no caching but hardware acceleration enabled. Sorry, but I didn't record all my mixtures of experiments. No change in failure rate under any condition.

One of the machines has front-mounted drive trays. If I insert a boot drive tray holding an SSD with Devuan (basically a Debian distro) Linux installed, everything works with every nvidia card. Ditto Win10 and Win7. I don't have drive trays on the other two freebsd machines.

I find it fascinating that the failure rate is not 100% or 0% but roughly 90%, and that for a given version of chromium, regardless of hardware, the rate of failure in the long run is constant no matter what I try. A sparse and graphically bare intranet page of a couple of K of pure HTML and CSS with no JS fails just as often as some social networking site with 5 megs of spyware, ad links, and JS. I opened and closed a gmail tab in chromium on another machine on my desk 13 times before it worked, whereas the intranet took 8 tries, but I've seen those numbers reverse before. The standard deviation is very high; sometimes (although VERY rarely) tabs render in as few as three attempts. It's been a long time since a chromium tab rendered on the first try; maybe a year now.

All three machines are running SSDs. Two have 1/4 TB raid1 mirror arrays with zfs; the third, the drive tray machine, has a 120 GB SSD for freebsd. Two machines have 16 GB of RAM and one has 8 GB.
I'm playing with one machine as I'm typing this, trying to open a gmail window, and top reports 2925M free with 0 swap use; whatever's wrong, it's not because the machine is starved for memory or the cpu load is too high. All three motherboards are older AMD64.

I work in a secured "armed guard" type of environment, so once I got a gmail tab to work, I was able to leave the tab open and running on a physically secured machine. Once a tab initially renders (if it doesn't hang when opened, of course), performance is excellent and there are no weird crashes even after days of continuous use of the same tab. No slowdowns, no bugs, no weird rendering, no kernel crashes, no weird syslog lines, no memory leaks (none that hit within a week or two, anyway). Either a tab fails to begin to render, or it works and that one tab will continue to work for at least a week of continuous use.

I have access to an extremely large vmware cluster, so this morning I spun up a freebsd host on the dev vlan and installed chromium 61.0.3163.100_1. I can connect to the vmware image using rdesktop, and it's slow, but tabs work 100% of the time. That is interesting because it's the same OS with the same ansible-enforced package configuration and installation, yet on bare hardware the chromium render hangs, while on virtualized hardware it works perfectly (well, slowly, as you'd expect).

Neither ansible nor the OS "knows or cares" that one machine is bare metal and the other is a virtual image; from an ansible configuration standpoint, the only differences between the three bare metal installs on my desk and the virtual image on the cluster are IP addresses and hostname. Obviously the hardware is different: the cluster is intel hardware, and the cluster image's devices are virtual, not bare metal and silicon. dmesg on the identical ansible-configured virtualized image reports "vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0".
/var/log/Xorg.0.log reports that it's using the vmware x11 driver: "[ 11.742] (--) vmware(0): VMware SVGA regs at (0x1070, 0x1071)". I believe there is a way to pass GPU access through to a vmware image, and I'm obviously not doing that.

The fact that chromium renders tabs fine when an identical image is virtualized and run without the nvidia driver, over an rdesktop session, would seem to point to nvidia driver GPU acceleration as the problem. However, the nvidia driver operates perfectly with all other software on the machine, including extremely graphically intensive software, and the only symptom of any sort, ever, is that newly created tabs fail to render almost all the time, and only in chromium. That seems to point to a problem with how chromium talks to the nvidia driver WRT some kind of hardware acceleration that cannot be disabled from the options in the browser config.

It's also odd that the failure happens only about 90% of the time, and that the failure rate appears unrelated to system workload, to the complexity or size of the page to be rendered, or to the physical nvidia card model. I wonder whether, if I opened 100000 tabs using some kind of GUI automation software that might not even exist, the failure rate would converge to 15/16. Perhaps when a tab is opened, some random stack address or something has to line up precisely on a perfect 16 byte address boundary, or else the thread locks up. I wouldn't even know where to begin to look, but I do have a gut-level feeling that the failure rate, if it could be measured, is currently exactly 15/16, and that somewhere a 128 bit long "something" is being stored correctly only 1/16 of the time.

If anyone has any ideas or suggestions for experiments, please advise. I'm outta ideas, and firefox works great, and the only reason I care anymore is that my chromebook uses chrome, so it would be nice to sync my desktop and chromebook bookmarks, whatevs...
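
As a rough sketch of how the 15/16 hunch could be checked against counted attempts: if each tab open is treated as an independent try with a fixed success probability (an assumption, not something established in this report), then the number of attempts until a tab renders follows a geometric distribution. The function names below are hypothetical, and the only data used are the two counts quoted above (13 and 8 attempts).

```python
import math

def geometric_mle(attempts):
    """Maximum-likelihood estimate of the per-attempt success probability,
    given a list of 'attempts until the tab finally rendered' counts."""
    return len(attempts) / sum(attempts)

def geometric_mean_sd(p_success):
    """Mean and standard deviation of attempts-until-first-success."""
    mean = 1.0 / p_success
    sd = math.sqrt(1.0 - p_success) / p_success
    return mean, sd

def trials_to_distinguish(p0, p1, z=1.96):
    """Rough number of independent tab opens needed before the gap between
    failure rates p0 and p1 exceeds z standard errors (normal approximation,
    with the variance taken under p0)."""
    se_unit = math.sqrt(p0 * (1.0 - p0))
    return math.ceil((z * se_unit / abs(p1 - p0)) ** 2)

if __name__ == "__main__":
    observed = [13, 8]  # gmail and intranet attempt counts from this comment
    p_hat = geometric_mle(observed)
    print(f"estimated failure rate: {1.0 - p_hat:.3f}")

    mean, sd = geometric_mean_sd(1.0 / 16.0)
    print(f"at 1/16 success: mean {mean:.1f} attempts, sd {sd:.1f}")

    n = trials_to_distinguish(0.90, 15.0 / 16.0)
    print(f"tab opens needed to tell 90% from 15/16: about {n}")
```

Two samples say very little by themselves. Under this model the standard deviation is nearly as large as the mean, which would fit the wide spread in attempt counts, and a few hundred counted tab opens, rather than 100000, might already be enough to tell a 90% failure rate from 15/16.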

Have a pleasant day, thanks for your efforts thus far and in the future, and good luck with this tricky bug!

-- 
You are receiving this mail because:
You are the assignee for the bug.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-212812-28929-oTPBWwEPjF>
