Re: B3 dies without errors
Posted: 27 Jan 2011, 12:37
by DanielM
Ok. Now we should have a better chance of getting somewhere here. I've now connected my Bubba to a logging laptop via a special console port cable kindly provided by Excito. So the next crash will hopefully provide a lot more useful information about what actually happened.
So now I just have to push it hard. Starting some torrents might help...
/Daniel
Re: B3 dies without errors
Posted: 30 Jan 2011, 12:17
by DanielM
DanielM wrote:So next crash will hopefully provide a lot more useful information about what actually happened.
[247095.099004] kernel BUG at mm/slub.c:2834!
Had my first crash with console logging today. It's still quite unclear what caused it. Googling the above message gives some hits from people experiencing the same crash, and it seems all of them are running kernel 2.6.35.x. So it could be a kernel bug.
Either way, we've decided that I will disable nut and unplug my UPS, since it seems usbhid-ups was doing something at the time of the crash. I don't think it's related though, since I had the crashes before even installing nut. Tor is also going to make a kernel with more debugging enabled for me.
I'll keep you posted.
/Daniel
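A minimal sketch of how a captured console log could be scanned for lines like the one quoted above. It assumes the log is plain text with the usual `[seconds.micros]` uptime prefix; the function name `find_kernel_bugs` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import re

# Matches lines such as: [247095.099004] kernel BUG at mm/slub.c:2834!
BUG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+kernel BUG at (?P<file>[^:\s]+):(?P<line>\d+)!"
)


def find_kernel_bugs(log_text):
    """Return (uptime_seconds, source_file, line_number) for each BUG line."""
    hits = []
    for line in log_text.splitlines():
        match = BUG_RE.search(line)
        if match:
            hits.append(
                (float(match.group("ts")), match.group("file"), int(match.group("line")))
            )
    return hits


if __name__ == "__main__":
    # The sample is the exact line from the post above.
    sample = "[247095.099004] kernel BUG at mm/slub.c:2834!"
    for uptime, source_file, line_number in find_kernel_bugs(sample):
        print(f"uptime {uptime:.0f}s: BUG at {source_file}:{line_number}")
```

The extracted file and line number are what you would look up in the kernel source for the running version.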
Re: B3 dies without errors
Posted: 30 Jan 2011, 16:00
by Ubi
from
http://lxr.free-electrons.com/source/mm/slub.c?v=2.6.35
Code: Select all
2822 void kfree(const void *x)
2823 {
2824 struct page *page;
2825 void *object = (void *)x;
2826
2827 trace_kfree(_RET_IP_, x);
2828
2829 if (unlikely(ZERO_OR_NULL_PTR(x)))
2830 return;
2831
2832 page = virt_to_head_page(x);
2833 if (unlikely(!PageSlab(page))) {
2834 BUG_ON(!PageCompound(page));
2835 kmemleak_free(x);
2836 put_page(page);
2837 return;
So it's a memory interaction issue? If so, you may want to run memtest for a while. I hope there is an ARM version somewhere...?
Re: B3 dies without errors
Posted: 31 Jan 2011, 00:24
by DanielM
Ubi wrote:So it's a memory interaction issue? If so, you may want to run memtest for a while. I hope there is an ARM version somewhere...?
Well, I guess I could always test something like this
http://packages.debian.org/squeeze/stress and see how the system behaves. I'll wait for my debug kernel first though, hope it'll be more useful.
/Daniel
Re: B3 dies without errors
Posted: 07 Mar 2011, 15:58
by Asad
These are also the errors I get when it crashes. I just ran dmesg now:
Code: Select all
[ 0.913701] cpuidle: using governor menu
[ 0.917790] mv_xor_shared mv_xor_shared.0: Marvell shared XOR driver
[ 0.924131] mv_xor_shared mv_xor_shared.1: Marvell shared XOR driver
[ 0.966707] mv_xor mv_xor.0: Marvell XOR: ( xor cpy )
[ 1.006707] mv_xor mv_xor.1: Marvell XOR: ( xor fill cpy )
[ 1.046747] mv_xor mv_xor.2: Marvell XOR: ( xor cpy )
[ 1.086703] mv_xor mv_xor.3: Marvell XOR: ( xor fill cpy )
[ 1.092891] usbcore: registered new interface driver usbhid
[ 1.098462] usbhid: USB HID core driver
[ 1.102720] TCP cubic registered
[ 1.105936] NET: Registered protocol family 17
[ 1.110450] Gating clock of unused units
[ 1.110459] before: 0x00dfc3fd
[ 1.110465] after: 0x00cfc1cd
[ 1.111137] rtc-mv rtc-mv: setting system clock to 2011-03-07 10:30:11 UTC (1299493811)
[ 1.166682] usb 1-1: new high speed USB device using orion-ehci and address 2
[ 1.216693] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl F300)
[ 1.256719] ata1.00: ATA-8: WDC WD10EARS-00Y5B1, 80.00A80, max UDMA/133
[ 1.263307] ata1.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 1.306761] ata1.00: configured for UDMA/133
[ 1.311304] scsi 0:0:0:0: Direct-Access ATA WDC WD10EARS-00Y 80.0 PQ: 0 ANSI: 5
[ 1.320487] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 1.328258] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 1.334282] sd 0:0:0:0: [sda] Write Protect is off
[ 1.339094] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 1.339278] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.349034] sda:
[ 1.350928] hub 1-1:1.0: USB hub found
[ 1.355147] hub 1-1:1.0: 4 ports detected
[ 1.398956] sda1 sda2 sda3
[ 1.403246] sd 0:0:0:0: [sda] Attached SCSI disk
[ 1.637021] usb 1-1.2: new high speed USB device using orion-ehci and address 3
[ 1.676712] ata2: SATA link down (SStatus 0 SControl F300)
[ 1.682237] md: Waiting for all devices to be available before autodetect
[ 1.689011] md: If you don't use raid, use raid=noautodetect
[ 1.695249] md: Autodetecting RAID arrays.
[ 1.699351] md: Scanned 0 and added 0 devices.
[ 1.703770] md: autorun ...
[ 1.706543] md: ... autorun DONE.
[ 1.748882] scsi2 : usb-storage 1-1.2:1.0
[ 1.855479] EXT3-fs: barriers not enabled
[ 2.747872] scsi 2:0:0:0: Direct-Access WD Ext HDD 1021 2002 PQ: 0 ANSI: 4
[ 2.757034] sd 2:0:0:0: Attached scsi generic sg1 type 0
[ 2.763055] sd 2:0:0:0: [sdb] 1953519616 512-byte logical blocks: (1.00 TB/931 GiB)
[ 2.772204] sd 2:0:0:0: [sdb] Test WP failed, assume Write Enabled
[ 2.778394] sd 2:0:0:0: [sdb] Assuming drive cache: write through
[ 2.786156] sd 2:0:0:0: [sdb] Test WP failed, assume Write Enabled
[ 2.792339] sd 2:0:0:0: [sdb] Assuming drive cache: write through
[ 2.798445] sdb:
[ 2.800349] kjournald starting. Commit interval 5 seconds
[ 2.806344] EXT3-fs (sda1): using internal journal
[ 2.811176] ext3_orphan_cleanup: deleting unreferenced inode 139282
[ 2.811242] ext3_orphan_cleanup: deleting unreferenced inode 139273
[ 2.811270] ext3_orphan_cleanup: deleting unreferenced inode 139272
[ 2.811293] ext3_orphan_cleanup: deleting unreferenced inode 139269
[ 2.829229] ext3_orphan_cleanup: deleting unreferenced inode 305791
[ 2.829295] ext3_orphan_cleanup: deleting unreferenced inode 303189
[ 2.829333] ext3_orphan_cleanup: deleting unreferenced inode 139279
[ 2.829358] ext3_orphan_cleanup: deleting unreferenced inode 139277
[ 2.829381] ext3_orphan_cleanup: deleting unreferenced inode 139276
[ 2.829405] ext3_orphan_cleanup: deleting unreferenced inode 139275
[ 2.829428] ext3_orphan_cleanup: deleting unreferenced inode 139274
[ 2.829454] ext3_orphan_cleanup: deleting unreferenced inode 319614
[ 2.829482] ext3_orphan_cleanup: deleting unreferenced inode 139271
[ 2.829505] ext3_orphan_cleanup: deleting unreferenced inode 139270
[ 2.829524] EXT3-fs (sda1): 14 orphan inodes deleted
[ 2.834479] EXT3-fs (sda1): recovery complete
[ 2.886424] EXT3-fs (sda1): mounted filesystem with writeback data mode
The kernel is Linux b3 2.6.35.4 here.
However, what increased these problems dramatically for me was that my bubba3 WAN cable was going through a switch instead of being connected directly to the cable modem. I got a replacement unit, and after connecting it directly it runs fine for a while. Sometimes fine for 14 days, sometimes for 5, but it still crashes infrequently. Here is the log from today.
Re: B3 dies without errors
Posted: 07 Mar 2011, 16:02
by Ubi
This seems to be the startup dmesg with some cleanup that is expected to occur after a hard down. I don't see how anything can be deduced from this information, or how a network cable or switch should result in a complete freeze of the machine. So could you elaborate?
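To illustrate the point about post-crash cleanup, here is a rough sketch that pulls the ext3 orphan-inode lines out of a dmesg dump. It assumes the message formats shown in the log above; the function name `orphan_report` is hypothetical. Such lines indicate that the filesystem was not unmounted cleanly, not why the machine went down.

```python
import re

# Message formats taken from the dmesg output posted above.
ORPHAN_RE = re.compile(r"ext3_orphan_cleanup: deleting unreferenced inode (\d+)")
SUMMARY_RE = re.compile(r"EXT3-fs \((\w+)\): (\d+) orphan inodes deleted")


def orphan_report(dmesg_text):
    """Collect orphan inode numbers and the kernel's own summary count."""
    inodes = [int(number) for number in ORPHAN_RE.findall(dmesg_text)]
    summary = SUMMARY_RE.search(dmesg_text)
    return {
        "device": summary.group(1) if summary else None,
        "reported": int(summary.group(2)) if summary else None,
        "inodes": inodes,
    }


if __name__ == "__main__":
    sample = (
        "[ 2.829482] ext3_orphan_cleanup: deleting unreferenced inode 139271\n"
        "[ 2.829505] ext3_orphan_cleanup: deleting unreferenced inode 139270\n"
        "[ 2.829524] EXT3-fs (sda1): 2 orphan inodes deleted\n"
    )
    print(orphan_report(sample))
```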
Re: B3 dies without errors
Posted: 07 Mar 2011, 16:29
by Asad
Collisions should not normally happen on a switch, but it could be that the switch is faulty. In such a case, it will lose network connectivity. It could also be a duplex or speed mismatch, with the port on the switch set to half or full duplex while the port on the Bubba is set differently. When you say complete freeze, I take it to mean the device losing routing.
Edit: Packets with a larger MTU size can cause retransmissions resulting in collisions, and then the Bubba won't be able to reroute them to the other interface. When you connect it directly, you avoid that problem.
The transmission of a packet on a physical network segment that is larger than the segment's MTU is known as jabber. This is almost always caused by faulty devices. Many network switches have a built-in capability to detect when a device is jabbering and block it until it resumes proper operation. (Wikipedia)
The problem could have been a combination of many things. It might not have been directly caused by the cable going through the switch. I was also running the squid proxy and realized that it happened more often regardless of the cable. Even after changing the cable, running squid transparently increased the frequency of the problem, so I uninstalled it. So I am not saying any one thing IS the cause here.
Re: B3 dies without errors
Posted: 07 Mar 2011, 17:45
by Ubi
This thread was about completely non-responsive systems. I don't see how this lecture on TCP networking fits into that. Furthermore, your issue, which is hijacking this thread, seems to be a problem with your switch and not with your Bubba. Please either start a new thread or stay on-topic in this one.
Re: B3 dies without errors
Posted: 08 Mar 2011, 03:51
by Asad
Ok, sorry for that, I must have mixed up the issues.
Re: B3 dies without errors
Posted: 08 Mar 2011, 05:59
by RandomUsername
Hi,
Could anyone tell me how they configured the Apache plugin on their munin node? I've added
to /etc/apache2/mods-available/status.conf and at a suggestion I found on the web added
Code: Select all
[apache*]
env.url http://my.domain-name.com/server-status?auto
env.ports 80 443
to /etc/munin/plugin-conf.d/munin-node but I'm still getting this error
no (no apache server-status on ports 80)
when I run
Code: Select all
perl /usr/share/munin/plugins/apache_processes autoconf
Any ideas?
Thanks.
[EDIT]Never mind. Fixed it by removing a dodgy entry in /etc/hosts.
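For anyone hitting the same error, a minimal sketch of checking the status page by hand. It assumes mod_status is enabled and that the `?auto` output consists of `Key: value` lines such as `BusyWorkers` and `IdleWorkers`. The helper names are made up for illustration, and the URL in the comment is the placeholder from the post.

```python
from urllib.request import urlopen


def parse_server_status(text):
    """Turn 'Key: value' lines from server-status?auto into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(": ")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def fetch_server_status(url, timeout=5):
    """Fetch and parse the status page; raises on connection or HTTP errors."""
    with urlopen(url, timeout=timeout) as response:
        return parse_server_status(response.read().decode("ascii", "replace"))


if __name__ == "__main__":
    # Offline demo with made-up values. Against a live server, call e.g.
    # fetch_server_status("http://my.domain-name.com/server-status?auto").
    sample = "Total Accesses: 10\nBusyWorkers: 1\nIdleWorkers: 4\nScoreboard: _W___\n"
    print(parse_server_status(sample))
```

If the fetch fails or returns HTML instead of these fields, the problem is on the Apache or name-resolution side rather than in munin.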
Re: B3 dies without errors
Posted: 15 Mar 2011, 01:11
by RandomUsername
Sorry for the double post. Everything appears like it should be working with the Apache plugins but my graphs are just flat so something's obviously wrong.
The only thing I've found is here:
http://wiki.kartbuilding.net/index.php/ ... munin-node but it doesn't seem to apply to me.
Can one of you guys post your config for your Apache plugins?
Thanks.
Darren.
Re: B3 dies without errors
Posted: 15 Mar 2011, 03:04
by DanielM
RandomUsername wrote:Can one of you guys post your config for your Apache plugins?
The strange thing here is that my Munin worked out-of-the-box, I never touched anything at all. My /etc/munin/plugins/apache_accesses is a simple symlink to /usr/share/munin/plugins/apache_accesses which is completely untouched. Are there any other config files I should be looking at?
/Daniel
Re: B3 dies without errors
Posted: 15 Mar 2011, 04:15
by RandomUsername
Daniel, is there anything in /etc/munin/plugin-conf.d/munin-node relating to apache?
Thanks.
Re: B3 dies without errors
Posted: 15 Mar 2011, 05:30
by DanielM
RandomUsername wrote:Daniel, is there anything in /etc/munin/plugin-conf.d/munin-node relating to apache?
Not a word.
Code: Select all
root@b3:/etc/munin/plugin-conf.d# grep apa munin-node
root@b3:/etc/munin/plugin-conf.d#
/Daniel
Re: B3 dies without errors
Posted: 16 Mar 2011, 10:13
by RandomUsername
Well, removing any apache-related lines from munin-node didn't help.
Must be something to do with the fact that you have a B3 and I only have a lowly B2.
[EDIT]Actually, it has worked. Yay!