Re: Finding out what's using disk
Posted: 01 Dec 2011, 04:41
by DanielM
su_root wrote:edit: I guess I can check with this
sudo smartctl -a /dev/sda | grep Load_Cycle_Count
My Load_Cycle_Count is currently 933568. Maybe I should throw a party when it passes a million?
You can see my smart statistics graphically here
http://ideel.nl/bubba/bubba/daniel.bubba-smart_sda.html
/Daniel
Re: Finding out what's using disk
Posted: 01 Dec 2011, 05:11
by su_root
DanielM wrote:su_root wrote:edit: I guess I can check with this
sudo smartctl -a /dev/sda | grep Load_Cycle_Count
My Load_Cycle_Count is currently 933568. Maybe I should throw a party when it passes a million?
You can see my smart statistics graphically here
http://ideel.nl/bubba/bubba/daniel.bubba-smart_sda.html
/Daniel
OMFG, yes, it sounds to me like you will soon have a fresh OS install party on a new HDD. I don't know what a "safe value" is, but yours seems like it is running for the world record!
Are you by any chance running B3 on a WD green HDD?
edit: try to calculate how many cycles there are in 10 minutes
Re: Finding out what's using disk
Posted: 01 Dec 2011, 05:22
by DanielM
su_root wrote:OMFG, yes, it sounds to me like you will soon have a fresh OS install party on a new HDD. I don't know what a "safe value" is, but yours seems like it is running for the world record!
Are you by any chance running B3 on a WD green HDD?
edit: try to calculate how many cycles there are in 10 minutes
Yep. This is the WD10EARS that came with my B3. I was kinda worried and searched a lot when I first saw that steep red line, but I found official answers from WD saying that I shouldn't be worried. The logical interpretation otherwise would be that the disk would be trash by now...
Anyway, I'm planning on a new installation once B4 is released
(And of course, all important data on the B3 is on my offsite backup B1)
/Daniel
Re: Finding out what's using disk
Posted: 01 Dec 2011, 05:35
by su_root
DanielM wrote:su_root wrote:OMFG, yes, it sounds to me like you will soon have a fresh OS install party on a new HDD. I don't know what a "safe value" is, but yours seems like it is running for the world record!
Are you by any chance running B3 on a WD green HDD?
edit: try to calculate how many cycles there are in 10 minutes
Yep. This is the WD10EARS that came with my B3. I was kinda worried and searched a lot when I first saw that steep red line, but I found official answers from WD saying that I shouldn't be worried. The logical interpretation otherwise would be that the disk would be trash by now...
Anyway, I'm planning on a new installation once B4 is released
(And of course, all important data on the B3 is on my offsite backup B1)
/Daniel
I guess the WD greens can manage a bigger count (like laptop drives). For a normal HDD, a cycle count above 600K is getting critical (at least that is what I have heard).
I would still count how many cycles it does in 10 minutes. If that works out to more than 300 cycles / 24h, I would try to do something about it.
What filesystem do you use and with what parameters are you booting?
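The 10-minute check suggested above can be scaled to a 24-hour figure with a few lines of shell. This is only a rough sketch: the helper name `cycles_per_day` and the sample input (16 new cycles in 10 minutes) are made up for illustration, and the 300 cycles / 24h threshold is just the rule of thumb from this post, not an official limit.

```shell
#!/bin/sh
# cycles_per_day DELTA MINUTES
# Hypothetical helper: scale the number of new load cycles (DELTA) seen in a
# sampling window of MINUTES to a 24-hour figure, using integer arithmetic.
cycles_per_day() {
    [ "$2" -gt 0 ] || { echo "n/a"; return 1; }
    echo $(( $1 * 1440 / $2 ))   # 1440 minutes per day
}

# Illustrative input only: 16 new cycles in a 10-minute window.
rate=$(cycles_per_day 16 10)
if [ "$rate" -gt 300 ]; then
    echo "$rate cycles/24h: above the 300/24h rule of thumb"
else
    echo "$rate cycles/24h: within the 300/24h rule of thumb"
fi
```

The two readings for DELTA would be taken 10 minutes apart from the Load_Cycle_Count line of smartctl.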
Re: Finding out what's using disk
Posted: 01 Dec 2011, 05:42
by johannes
Yes, it is disputed how bad this really is, but in any case you should know that we finally got hold of WD's tool for ARM to remove this Load/Unload behaviour. So from 2.4 on, the LCC shouldn't move anymore.
Re: Finding out what's using disk
Posted: 01 Dec 2011, 06:01
by DanielM
johannes wrote:Yes, it is disputed how bad this really is, but in any case you should know that we finally got hold of WD's tool for ARM to remove this Load/Unload behaviour. So from 2.4 on, the LCC shouldn't move anymore.
You did? I searched EVERYWHERE for that darn tool. Doesn't seem like they want anyone to use it...
Does anyone know how to see the time since first startup for a disk? I bought my B3 when it was released, so I guess it's somewhere around 13 months old. And of course it's been running constantly since then. So I guess the average number of load cycles per minute must be something like 933568/13/30/24/60, which gives me around 1.66.
Where do I check my boot parameters? I have noatime in fstab, unsure if I've added that myself or if it was there from the beginning.
/Daniel
Re: Finding out what's using disk
Posted: 01 Dec 2011, 06:12
by johannes
You can see the number of power-on hours in the smart data?
Re: Finding out what's using disk
Posted: 01 Dec 2011, 06:21
by su_root
johannes wrote:Yes, it is disputed how bad this really is, but in any case you should know that we finally got hold of WD's tool for ARM to remove this Load/Unload behaviour. So from 2.4 on, the LCC shouldn't move anymore.
Ok, that is good to hear! If the 3.5" WD greens use similar stuff to 2.5" laptop drives, it could still run fine for a veeeery long time. I've seen 2.5" HDDs with 3M cycles running just fine
I've tried setting up ramlog + tmpfs today to see how the HDD behaves when the system is idle. Hopefully I'll get nice readings.
Re: Finding out what's using disk
Posted: 01 Dec 2011, 06:24
by DanielM
johannes wrote:You can see the number of power-on hours in the smart data?
Yes, I know I have seen it somewhere among the numbers. But now I can't seem to find it. Do you know exactly where?
/Daniel
edit: Found it. It's been online for 9846 hours...
edit2: ...which gives 1.58 load cycles per minute
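The arithmetic behind the two edits above can be reproduced with a small shell helper. This is only a sketch: the function name `lcc_per_minute` is made up for illustration, and it takes the two raw values as arguments instead of reading them from the disk.

```shell
#!/bin/sh
# lcc_per_minute COUNT HOURS
# Hypothetical helper: average load cycles per minute, given the raw
# Load_Cycle_Count (COUNT) and Power_On_Hours (HOURS) values.
lcc_per_minute() {
    awk -v count="$1" -v hours="$2" 'BEGIN {
        if (hours <= 0) { print "n/a"; exit }
        printf "%.2f\n", count / (hours * 60)
    }'
}

# Values quoted in this thread: 933568 cycles in 9846 power-on hours.
lcc_per_minute 933568 9846

# On a real system the inputs could be read with smartctl (needs root), e.g.:
#   count=$(smartctl -A /dev/sda | awk '$2=="Load_Cycle_Count" {print $10}')
#   hours=$(smartctl -A /dev/sda | awk '$2=="Power_On_Hours" {print $10}')
```

With the numbers from this thread it prints 1.58, matching the figure above.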
Re: Finding out what's using disk
Posted: 01 Dec 2011, 06:33
by johannes
su_root wrote:
Ok, that is good to hear! If the 3.5" WD greens use similar stuff to 2.5" laptop drives, it could still run fine for a veeeery long time. I've seen 2.5" HDDs with 3M cycles running just fine
Yes, I have yet to hear about the first crash proven to be caused by a high LCC. WD's big mistake was to state a maximum number for LCC, with the whole Linux world (or most of it) reaching that number quite fast. But in any case, to ease people's concern we'll provide the LCC fix with 2.4.
Re: Finding out what's using disk
Posted: 01 Dec 2011, 09:05
by su_root
DanielM wrote:johannes wrote:Yes, it is disputed how bad this really is, but in any case you should know that we finally got hold of WD's tool for ARM to remove this Load/Unload behaviour. So from 2.4 on, the LCC shouldn't move anymore.
You did? I searched EVERYWHERE for that darn tool. Doesn't seem like they want anyone to use it...
/Daniel
I think you could try something like
hdparm -B 254 /dev/sda
What do the rest of you think/suggest?
Re: Finding out what's using disk
Posted: 01 Dec 2011, 09:19
by johannes
We tried all that, nothing seemed to have that effect. From what we understand, the binary does something non-standard (WD proprietary) to change the head load/unload timer. We'll include it from 2.4 and make it run once and then die forever. If you want to try it, PM me and I'll send it to you (not sure I can publish it here).
Re: Finding out what's using disk
Posted: 01 Dec 2011, 10:11
by su_root
Yeah, sure. I'm at least interested in seeing what it does and how
Btw, here are some nice hacks
http://www.thinkwiki.org/wiki/Problem_w ... e_clicking
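Code:
#!/bin/bash
# Print a timestamp and the new value whenever Load_Cycle_Count changes.
# smartctl normally needs root, so run this with sudo.
lastval=0
while :
do
    # Column 10 of the attribute table is the raw value
    newval=$(smartctl -A /dev/sda | awk '$2=="Load_Cycle_Count" {print $10}')
    if [[ "$newval" != "$lastval" ]] # i.e., the load cycle count has changed
    then
        date
        echo "$newval"
    fi
    lastval=$newval
    sleep 30 # or some other interval
done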
Re: Finding out what's using disk
Posted: 08 Feb 2012, 14:13
by ingo2
johannes wrote:We tried all that, nothing seemed to have that effect. From what we understand, the binary does something non-standard (WD proprietary) to change the head load/unload timer. We'll include it from 2.4 and make it run once and then die forever.
I observe the very same here. I updated my B3 this morning from 2.3.1.1 -> 2.4 (new kernel 2.6.39.4-9), but the LCCs are still racing. My HD, as supplied with the box:
Code:
Model Family: Western Digital Caviar Green
Device Model: WDC WD5000AADS-00M2B0
Serial Number: WD-WCAV5D182002
LU WWN Device Id: 5 0014ee 2af341b18
Firmware Version: 01.00A01
The warranty check at WD says it expires on 16.06.2013, which is not much for a brand new piece.
Here are the S.M.A.R.T. values recorded over one hour of operation:
Code:
smartctl -A /dev/sda | egrep 'Power_On_Hours|Load_Cycle_Count'
9 Power_On_Hours 0x0032 100 100 000 ... - 10
193 Load_Cycle_Count 0x0032 200 200 000 ... - 1260
and 1 hour later:
Code:
9 Power_On_Hours 0x0032 100 100 000 ... - 11
193 Load_Cycle_Count 0x0032 200 200 000 ... - 1374
which means more than 100 LCC/hour (114 in that hour)! At that rate, the specified 300,000 LCCs will be reached in less than 4 months of 24/7 operation.
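To make that estimate easy to redo with other readings, here is a minimal sketch. The helper name `days_to_limit` is made up for illustration; the 300,000 limit is the specification figure mentioned above, and the rate is simply the difference between the two readings (1374 - 1260 = 114 per hour).

```shell
#!/bin/sh
# days_to_limit LIMIT CURRENT PER_HOUR
# Hypothetical helper: days of 24/7 operation until the load cycle count
# reaches LIMIT, starting from CURRENT and growing by PER_HOUR every hour.
days_to_limit() {
    awk -v limit="$1" -v cur="$2" -v rate="$3" 'BEGIN {
        if (rate <= 0) { print "never"; exit }
        printf "%.0f\n", (limit - cur) / rate / 24
    }'
}

# Readings from this post: 1374 cycles so far, 114 new cycles per hour.
days_to_limit 300000 1374 114
```

With these readings it prints 109, i.e. a bit over three and a half months.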
Could it be that the mentioned fix was not applied during system upgrade?
The proposed trick
Code:
hdparm -B /dev/sda
/dev/sda:
APM_level = not supported
obviously does not apply to that HD.
I also tried to enable "suspend" of the HD to save even more power. But I found that you can only spin it down for quite a short time (less than 1 min) with
Code:
hdparm -y /dev/sda
/dev/sda:
issuing standby command
but it is soon woken up by some process on the box; the busiest one reported by 'top' is 'mysqld'.
Are there any hints on how to prevent this regular access to the HD at short intervals? I already tried to disable almost all services, without success. It would save a reasonable amount of energy, though:
- operating: 10W
- HD idle: 8W
- HD suspend: 5.5W
Best regards,
Ingo
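One possible way to see which processes touch the disk, sketched under assumptions: older kernels (the 2.6 series included) offer /proc/sys/vm/block_dump, which logs every block read and write with the name of the responsible process. The helper below only summarizes such log lines; its name `summarize_block_dump` is made up, and the line format it expects ("name(pid): WRITE block N on dev ...") is an assumption that should be checked against the real dmesg output on the box.

```shell
#!/bin/sh
# summarize_block_dump: read kernel log lines on stdin and count READ/WRITE
# block accesses per process name. Assumes lines of the form
#   name(pid): WRITE block N on dev (M sectors)
# optionally preceded by a dmesg timestamp such as "[ 123.456]".
summarize_block_dump() {
    awk '{
        for (i = 1; i + 2 <= NF; i++) {
            if (($(i+1) == "READ" || $(i+1) == "WRITE") && $(i+2) == "block") {
                name = $i
                sub(/\([0-9]+\):$/, "", name)   # drop the "(pid):" suffix
                count[name " " $(i+1)]++
                break
            }
        }
    }
    END {
        for (k in count) print count[k], k
    }' | sort -k1,1nr -k2
}

# Intended use on the box (as root, and only if block_dump exists there):
#   echo 1 > /proc/sys/vm/block_dump; sleep 60
#   dmesg | summarize_block_dump; echo 0 > /proc/sys/vm/block_dump
```

Each output line is a count followed by the process name and access type, busiest first. Note that if syslog stores these messages on the same disk, the logging itself causes writes, so the numbers are only a rough guide.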
Re: Finding out what's using disk
Posted: 09 Feb 2012, 05:34
by ingo2
Partial success:
After learning that a power cycle is required after the firmware upgrade (a reboot is not sufficient), I see that the LCCs don't race any more. The LCC only increases by 1 when you spin down the disk (suspend). Thanks to Excito!
You can now also see the difference in the power consumption: it remains steady at some 9.8 watts. Before, it was cycling between 9.8, 8.0 and 11.5 watts depending on the disk state.
What remains now is to find out what is continuously accessing the disk and preventing the spin-down that 'hdparm' can trigger after the disk has been idle for a certain time. Spin-down really saves power: in that state the box is only drawing 5.6 watts!
Kind regards,
Ingo
EDIT: Another hint on this occasion regarding 'smartmontools':
I personally recommend using the latest version, at least >= v5.4, which supports SSDs and the latest HD models. Squeeze only comes with v5.39 in the repository. I downloaded the armel build from Wheezy and installed it manually (dpkg -i ...). That's what I did on my PC and also on the B3, and it works smoothly.