
Re: Finding out what's using disk

Posted: 19 Aug 2013, 05:09
by Gordon
Also did some digging on the dovecot-timefix. It's kind of a weird procedure, since the dovecot wiki says that running ntpd should be enough to fix the issue that this cron script is meant to fight. I need to do some additional monitoring of the hardware clock and work out the correct way to set it (it gave me UTC time since the last reboot). Knowing its drift would allow me to figure out a schedule for rewriting it with the ntp values (dovecot requires all shifts to stay within 5 seconds, or it will crash), which should make it safe to reboot.

The 1 minute idle time setting on the drive appears to be somewhat on the short side, especially when having an SSH connection to the B3. The same would probably be valid for longer timeouts as well though. I added the following content to the end of my .bashrc:

Code:

# Do not park the hard disk heads while I'm connected
while true; do touch "/tmp/$USER"; sleep 30; done &
Don't forget the ampersand at the end of that line - you will never see a prompt if you do.

The system runs nice and cool at around 37°C. A lot better than my existing B3, which runs at 51°C right now. I do have somewhat of a memory issue on that older B3 though, with ~100k of swap in use. Don't know how many hits that swap gets, but that obviously also keeps the disk from going into idle mode.

Re: Finding out what's using disk

Posted: 21 Aug 2013, 06:21
by Gordon
Looking at the mount parameters I noticed that the root partition is mounted with the noatime option, but /home is mounted with the default setting of relatime. Presumably this is because certain programs are said to fail if atime is not updated, but AFAIK I do not have any such application running.

One type of application mentioned here is email software, but dovecot explicitly states that it does not rely on atime. What I did find is that relatime does update the atime on folder access, which obviously happens often enough, even when you're only reading data (from the file cache). IMO it is safe to change the mount option to noatime, and I did.

Also, according to the samba docs you can improve performance by mounting your disks with acl and user_xattr. If you're changing mount options, you should therefore consider adding those options as well. Currently my fstab reads as follows:

Code:

/dev/sda1                  /              ext3   noatime,defaults                        0  1
/dev/mapper/bubba-storage  /home          ext3   noatime,acl,user_xattr,noauto           0  2
/dev/sda3                  none           swap   sw                                      0  0
usbfs                      /proc/bus/usb  usbfs  defaults                                0  0
/proc                      /proc          proc   defaults                                0  0
/dev/sdd1                  /var           ext4   noatime,data=writeback,noacl,barrier=0  0  1
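
A quick way to verify which filesystems actually ended up with noatime is to read the kernel's mount table rather than fstab. The sketch below is not from the posts above: the helper name atime_report is made up, and it only looks at ext3/ext4 entries in /proc/mounts-style input.

```shell
#!/bin/sh
# atime_report: hypothetical helper. Reads /proc/mounts-style lines on stdin
# and prints "<mountpoint> <atime policy>" for every ext3/ext4 mount.
# Input fields: device mountpoint fstype options dump pass
atime_report() {
    awk '$3 == "ext3" || $3 == "ext4" {
        n = split($4, opts, ",")
        policy = "default"   # no explicit atime option listed
        for (i = 1; i <= n; i++) {
            if (opts[i] == "noatime" || opts[i] == "relatime" || opts[i] == "strictatime")
                policy = opts[i]
        }
        printf "%s %s\n", $2, policy
    }'
}

# On a live system: atime_report < /proc/mounts
```

Anything still reported as relatime after a remount is a candidate for the change described above.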

Re: Finding out what's using disk

Posted: 30 Aug 2013, 03:43
by Gordon
Finally managed to run a day without the disk waking up even once. The last trouble was the DHCP client that, even though I get a 24 hour lease, updated a Samba include file and resolv.conf every 4-5 hours. Particularly strange as I had already reconfigured the DHCP client not to request the type of information that would require updating these files.

I now have both interfaces set to static addresses, so that fixes it. One remark on the tip for letting PHP itself handle clearing of old sessions: that doesn't seem to work as expected. If I don't log off from the admin pages I can now return the next day and still be logged on. My guess is that you need a much higher rate of visits to the web server, or perhaps need to reduce the gc_divisor to 10 or even less.
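
The arithmetic behind that guess: PHP starts session garbage collection on a session-starting request with probability session.gc_probability / session.gc_divisor, so the chance that it has run at least once after n such requests is 1 - (1 - p/d)^n. A rough sketch, where the function name gc_chance and the sample numbers are assumptions and not values taken from the B3:

```shell
#!/bin/sh
# gc_chance P D N: percent chance that PHP session GC has run at least once
# after N session-starting requests, with gc_probability P and gc_divisor D.
gc_chance() {
    LC_ALL=C awk -v p="$1" -v d="$2" -v n="$3" \
        'BEGIN { printf "%.1f\n", (1 - (1 - p / d) ^ n) * 100 }'
}

gc_chance 1 100 20   # a handful of admin page visits, divisor 100
gc_chance 1 10 20    # the same visits with the divisor lowered to 10
```

With only a few visits per day, a large divisor means the cleanup will rarely fire, which would fit with still being logged on the next day.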

Re: Finding out what's using disk

Posted: 30 Aug 2013, 06:56
by ingo2
Great tuning work Gordon!

Here too, Samba is the most disturbing service with regard to HD spin-down. I have now banned my Smart-TV from the network, because it spams the network and thus wakes up my B3 several times a day. Now, with only Linux machines in the network and using just NFSv4 for data exchange, all is fine.

Re: Finding out what's using disk

Posted: 30 Aug 2013, 08:33
by flexor
I like a program called nmon, which is in the Debian repos - apt-get install nmon. Not sure if it can give you information that other programs can't, but I'm an AIX person and I like to use nmon on Linux as well as AIX.

Once installed, start it using nmon, then make the window ~45 characters high and hit the keys c, d, and t. That turns on CPU, disks and top.

Re: Finding out what's using disk

Posted: 30 Aug 2013, 11:37
by Gordon
Well,

After reinstating the php5 cron job this morning, no disk activity has registered since on /dev/sda. I'm quite pleased with that. Next is to actually run some services on this B3 and see what happens then with disk usage and if the class 6 SD card doesn't become too much of a slow down factor or even make the system unresponsive.

:idea: Maybe the SD card addition is a good idea for the next generation Bubba - the B4. You could probably also run the whole system from a well sized SDHC and use the harddisk for swap and user storage only.

Re: Finding out what's using disk

Posted: 30 Aug 2013, 13:43
by ingo2
Gordon wrote: :idea: Maybe the SD card addition is a good idea for the next generation Bubba - the B4. You could probably also run the whole system from a well sized SDHC and use the harddisk for swap and user storage only.
I really doubt that an SD card is a good choice. Some time ago I attached one to my old TS-109 NAS for some directories below /var. That SD card lasted for some months, then uncorrectable file system errors stopped the NAS from working. Checking the card on my PC showed it was completely dead.

This kind of pluggable device is meant for intermittent use, not for being part of a Linux file system running 24/7. For such an application a PCI Express Mini-Card slot would be the choice. There you can plug in matching SSD cards.

@ flexor:
Thanks for the hint. 'nmon' is really a useful tool. Didn't know that one before, just nmap ;-)

Re: Finding out what's using disk

Posted: 30 Aug 2013, 15:17
by Ubi
Sorry, but that last post is not representative of normal operation. FreeNAS is designed to run from SD cards and does so very well. It does require some tweaking, which is exactly what this thread describes. I think SD (or flash which is actually IMO a little better) as a primary drive is a very good idea.

Re: Finding out what's using disk

Posted: 30 Aug 2013, 16:01
by ingo2
Ubi wrote: I think SD (or flash which is actually IMO a little better) as a primary drive is a very good idea.
Yes, if there is some kind of error detection and correction. This can be either in the hardware (like in rotational disks or SSDs) or in the filesystem itself. That's why there are dedicated filesystems for flash devices which perform the error handling. Ext3 and ext4 do not have any flash-specific features, see here: https://en.wikipedia.org/wiki/Flash_file_system

EDIT: Another solution (and that's how FreeNAS does it) is to have the root filesystem read-only and load everything into RAM. That's perfect (especially if you have ECC RAM) but limits the space for installing additional software.

Re: Finding out what's using disk

Posted: 31 Aug 2013, 05:08
by Gordon
ingo2 wrote:
Ubi wrote: I think SD (or flash which is actually IMO a little better) as a primary drive is a very good idea.
Yes, if there is some kind of error detection and correction. This can be either in the hardware (like in rotational disks or SSDs) or in the filesystem itself. That's why there are dedicated filesystems for flash devices which perform the error handling. Ext3 and ext4 do not have any flash-specific features, see here: https://en.wikipedia.org/wiki/Flash_file_system
Those flash file systems are meant for raw flash operation. SD memory cards (and CF, MMC etc.) as well as thumb drives already have that layer implemented in hardware, in the card's own controller. Obviously you still need to restrict the number of writes that the system performs to the flash drive, and preferably it should keep a substantial amount of free space (so the controller can spread the writes across cells).

There are probably some other optimizations that can be done here, but setting the noatime and writeback mount options seems like a good start. Further tweaks might involve moving /var/log over to tmpfs and reconfiguring logrotate to keep the archives in an alternate location (on a real storage system).
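
For the /var/log idea, a tmpfs entry in fstab is the usual route. The line printed below is a sketch only: the 16m size, the mode and the helper name tmpfs_fstab_line are assumptions, and logs kept this way are lost on every reboot unless something copies them off first.

```shell
#!/bin/sh
# tmpfs_fstab_line MOUNTPOINT SIZE: print an fstab entry for a RAM-backed
# filesystem. Nothing is mounted or written; the output is meant to be
# reviewed and pasted into /etc/fstab by hand.
tmpfs_fstab_line() {
    printf 'tmpfs\t%s\ttmpfs\tnoatime,mode=0755,size=%s\t0\t0\n' "$1" "$2"
}

tmpfs_fstab_line /var/log 16m
```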

Re: Finding out what's using disk

Posted: 02 Sep 2013, 11:07
by Gordon
Hi all,

I did some more digging into this flash memory warning. Although my previous statement is correct, there is one little issue that remains in flash memory operation. When changing a file on flash memory the logic inside it erases a certain block of data and rewrites that together with the changed data on a new location. The length of this block of data may vary based on manufacturer and size of the unit. To limit overhead and additional writes it is important that you align the blocksize that Linux uses with the blocksize that the flash memory unit uses. The result is better speed and less wear.

There are a lot of articles saying this and that on the subject, most of them very confusing. What it comes down to is that these "erase blocks" are either 512kB, 1MB, 2MB or 4MB in size, and the discussion is all about which sector your partition should start at, which to me sounds like a lot of fuss. Just pick 4MB (8192 sectors of 512 bytes) and make sure that the whole size of the partition is also a full multiple of that same 8192 (the end sector should be at [n*8192 - 1]).
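
The alignment rule can be checked with plain shell arithmetic before touching fdisk. This is a sketch under the assumption of 512-byte sectors, so that 8192 sectors equal 4MB; the helper names align_up and last_sector are made up for the illustration.

```shell
#!/bin/sh
# All values are in 512-byte sectors; 8192 sectors = 4 MiB.
UNIT=8192

# align_up SECTOR: round a sector number up to the next multiple of UNIT.
align_up() {
    echo $(( ($1 + UNIT - 1) / UNIT * UNIT ))
}

# last_sector START COUNT: last sector of a partition that begins at START
# and spans COUNT whole units.
last_sector() {
    echo $(( $1 + $2 * UNIT - 1 ))
}

start=$(align_up 63)               # 63 is the classic DOS-era first sector
end=$(last_sector "$start" 100)    # a partition of 100 units (400 MiB)
echo "start=$start end=$end"
```

The end value always lands on n*8192 - 1, so the next partition can start on an aligned sector as well.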

Still, using the flashbench tool that is mentioned by many of the articles I am guessing that my 2GB SD card has in fact a 1MB erase block size and I therefore ran the following format command on the new partition.

Code:

mkfs -t ext4 -b 4096 -E stride=2,stripe-width=128 /dev/sdd1
Lots of reading available on this as well - the idea is to let ext4 do full 1MB writes if possible and thus avoid multiple sequential updates to the same hardware defined block.
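
Going by the mke2fs man page, both extended options are counted in filesystem blocks, so the numbers can be checked with a bit of shell arithmetic. Assuming 4096-byte blocks and an 8 KiB flash page (the page size is an assumption; it is what stride=2 implies), a 1 MiB erase block works out to 256 blocks, while 128 blocks cover 512 KiB. The helper name blocks_per is made up.

```shell
#!/bin/sh
# blocks_per KIB BLOCK_BYTES: number of filesystem blocks of BLOCK_BYTES
# that fit in KIB kibibytes.
blocks_per() {
    echo $(( $1 * 1024 / $2 ))
}

block=4096
echo "stride=$(blocks_per 8 "$block")"            # assumed 8 KiB flash page
echo "stripe-width=$(blocks_per 1024 "$block")"   # 1 MiB erase block
```

If flashbench points at a different erase block size, only the first argument of the second call changes.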

So, a good tip after all. Thanks for pointing it out.

Re: Finding out what's using disk

Posted: 03 Sep 2013, 04:38
by flimflam
Great job guys!

I have only one comment: this is nothing for non-Linux users. I would really love to stop most of the services that are useless for my needs, but spending nights learning how to type in bash (and hearing my beloved wife's comments about my precious box) ... NO PLEASE! If I had known that, I would have chosen a different device.
Actually I am just surprised by the sounds, and that something is flashing all the time, while nothing in my LAN is contacting the B3.
.
.
.
I also own a VUPLUS SOLO STB, which cooperates perfectly with my B3. But becoming even a middling user of the Linux on that STB took me a lot of time during cold nights. And you can imagine the rumors. So never again.

I prefer to set up the whole device through a web interface (similar to SYNOLOGY) with check marks and let it work for years.

That is my personal opinion.

Re: Finding out what's using disk

Posted: 11 Sep 2013, 11:24
by Gordon
Well...

A bit off topic, but I've been able to confirm that if you put the B3 on its side the disk reports 2-3°C cooler than in the flat position. I didn't think it would make that much of a difference.
flimflam wrote: Actually i am just surprised from the sounds while nothing in my LAN is contacting B3 and all the time is something flashing.
I noticed the same thing. Having the new B3 connected to the network there was constant activity on the network interface and I checked with tcpdump. As it turned out my squeezebox devices are constantly shouting broadcasts over the net - even when in standby. Today I made it my business to bring some more silence to my network and I changed the network configuration on the older B3, which has Wifi, so that wlan now runs on a different IP range than the wired lan. Now all the automated traffic that reaches the new B3 is ARP and SMB echo requests and I got a lot less flashing on the switch as well (the squeezebox devices are all connected through Wifi).

Re: Finding out what's using disk

Posted: 18 Sep 2013, 15:28
by Gordon
The next phase

I had some issues with the SD card sometimes not mounting or suddenly showing as read only. Not really sure what happened, as it also occurred with a replacement SD card. Found a really nasty bug while battling this, because the system would not boot with a corrupted card. Luckily it turned out to be a known bug and the solution was to add the `nofail` mount option to the card's fstab entry.

During the course of fixing this I had reinstated /var to a regular directory instead of a symlink to /var/tmp and found that the disk wouldn't get to standby anymore. The monitoring method I'd used up to this point did not show me what was happening here, but I did notice the timestamp of /tmp changing continuously. Using iotop and inotify I found that it is in fact kjournald and jbd2 (the ext3 and ext4 journalling kernel processes) writing temp files on a regular basis (depending on other disk activity). I figured I could move /tmp to tmpfs, but that did not keep the modification time of /tmp itself from changing. Weird...
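
When a directory timestamp keeps changing like that, a cruder check than iotop or inotify is to ask find what was modified in the last few minutes. This is only a sketch: recent_writes is a hypothetical helper, and -mmin is an extension that GNU find provides rather than something every find has.

```shell
#!/bin/sh
# recent_writes DIR MINUTES: list files and directories under DIR (staying on
# the same filesystem) whose modification time is less than MINUTES old.
recent_writes() {
    find "$1" -xdev -mmin "-$2" 2>/dev/null
}

# Example: recent_writes /tmp 5
```

It only shows what changed, not which process did it, so it complements the tools mentioned above instead of replacing them.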

I did find another solution though that I'm currently testing and may in fact also make moving /var onto a different disk a redundant operation. It's called noflushd and it is designed to hold back disk writes for as long as memory lets it or the disk is brought back to active by some other process. Results look promising so far.

Re: Finding out what's using disk

Posted: 20 Sep 2013, 10:32
by Gordon
As it turns out running noflushd is not sufficient to make the disk actually spin down. I ran the daemon in debug mode and this is what it showed:

Code:

root@babaorum:~# noflushd -n 7 -d
No devices given - autoprobing.
/dev/sda: using spindown handler SGIO
Spinning up /dev/sda after 0 minutes.
Spinning up /dev/sda after 0 minutes.
Spinning up /dev/sda after 0 minutes.
Spinning up /dev/sda after 0 minutes.
Spinning up /dev/sda after 0 minutes.
Spinning up /dev/sda after 0 minutes.
Spinning up /dev/sda after 0 minutes.
Spinning up /dev/sda after 0 minutes.
I actually do get LCC increments now (timeout @ 5 minutes) but on the rare occasions that a spin down is done (@ 7 minutes) it gets brought back to life within seconds.

Compare the previous with the B3 where I have /var on a SD card:

Code:

root@petibonum:~# noflushd -n 5 -d
No devices given - autoprobing.
/dev/sdc: using spindown handler SGIO
/dev/sdb: using spindown handler SGIO
/dev/sda: using spindown handler SGIO
Spinning up /dev/sdb after 4 minutes.
Spinning up /dev/sdb after 5 minutes.
Spinning up /dev/sdb after 2 minutes.
Spinning up /dev/sdb after 2 minutes.
Spinning up /dev/sdb after 4 minutes.
Spinning up /dev/sdb after 4 minutes.
Spinning up /dev/sdb after 1 minutes.
Spinning up /dev/sdb after 0 minutes.
Spinning up /dev/sdb after 4 minutes.
Spinning up /dev/sdb after 0 minutes.
Spinning up /dev/sda after 75 minutes.
Spinning up /dev/sdb after 4 minutes.
Spinning up /dev/sdb after 5 minutes.
Spinning up /dev/sda after 14 minutes.
Spinning up /dev/sdb after 4 minutes.
Spinning up /dev/sdb after 0 minutes.
Spinning up /dev/sdb after 0 minutes.
Spinning up /dev/sdb after 3 minutes.
Here I have the LCC timeout set @ 1 minute and it becomes obvious that frequent writing to /var leaves practically no sleep time for the disk. Do look at /dev/sda though, which had been asleep for 75 minutes and was woken because I downloaded a file from it. That is really good, and seeing that there is (minimal) downtime on /dev/sdb as well, I expect this will also give me a better life expectancy on the SD card. IMO Excito should make this package part of their installation.
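
To compare such logs without counting lines by hand, the "Spinning up" messages can be totalled per device. A minimal sketch, assuming the debug output has been saved to a file and keeps exactly the line format shown above; spin_summary is a made-up name.

```shell
#!/bin/sh
# spin_summary: read noflushd debug output on stdin and print, per device,
# "<device> <number of spin-ups> <total minutes spent spun down>".
# Expected line format: "Spinning up /dev/sdb after 4 minutes."
spin_summary() {
    awk '/^Spinning up/ {
        dev = $3
        count[dev]++
        total[dev] += $5
    }
    END {
        for (dev in count)
            printf "%s %d %d\n", dev, count[dev], total[dev]
    }' | sort
}

# Example: spin_summary < noflushd-debug.log   (file name is hypothetical)
```

For the second log above this gives 16 spin-ups and 42 minutes for /dev/sdb, against 2 spin-ups and 89 minutes for /dev/sda.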