Failed B3 (another)
Posted: 23 Nov 2012, 20:19
I *think* my B3 died this morning but I'm not sure. Prepare for a long post.
I went to bed last night with everything working OK.
This morning, I woke up to Pingdom alerts saying my website (hosted on B3) had been down since about 06:30.
The power LED was on and I could see my SSID on my phone. My phone would briefly connect to the wifi but would disconnect again within about 5 seconds.
So I rebooted B3 but it stayed on the pink LED as if the boot up process had hung. I left it and went to work.
When I got home the LED was blue but I could not connect to B3 over SSH on any wired interface or wifi. I booted up from a rescue stick, which worked fine. The logs didn't show anything unusual and, in fact, it would appear the B3 had been running happily most of the day, albeit with no network connectivity. fsck reported no problems with the disk.
I spent several hours playing around with different files (e.g. disabling fail2ban, modifying iptables rule files), and most of the time B3 stayed on a pink LED when booting. Occasionally, and not consistently, it would get to a blue LED after a very long wait but still with no connectivity. Twice in a row, I successfully connected to it after modifying /etc/network/interfaces by commenting out the br0 interface and assigning static IP addresses to both eth0 and eth1. After uncommenting br0 and rebooting, I was back to the pink LED again. So I thought this was the fix and commented out br0 again, but the B3 has failed to boot successfully since, so I think this was just coincidence.
So, although the disk appeared OK, I thought it must be the problem. I took out the disk and put in a known working disk that came with the B3 but currently lives in a B2. The first attempt to install from USB seemed to be working initially, but it ended up not rebooting and staying on the pink LED. The second attempt to install is still running, and is also currently on the pink LED. I will leave it until the morning and then try to boot the thing (also, the USB stick will hopefully contain an install log; I'm not sure how verbose they are though).
Anything obvious I'm missing? I'm thinking maybe the disk controller has failed.