special characters (åäö)

d_rylndr
Posts: 48
Joined: 31 Mar 2011, 14:20

special characters (åäö)

Post by d_rylndr »

I'm having problems when accessing the B3 from one machine: characters like åäö are replaced by things like "├ñ"...
I tried using iocharset=utf-8, nls=utf-8, locale=sv_SE.utf8 etc. in fstab, but nothing works.
Exactly the same fstab entries, without any utf-8 flags, work on other machines.
I also have these issues when accessing through ftp.

What is wrong?
Ubi
Posts: 1549
Joined: 17 Jul 2007, 09:01

Re: special characters (åäö)

Post by Ubi »

Sorry, our psychic department is currently out of office. Could you please specify what you mean by "reaching"? It suggests the problem is with SSH (or www? or mail? gopher? UUCP? Appletalk?), but then later you mention fstab.
As a rule of thumb I'd say it's unwise to put special characters in files like fstab. Yes, my native language has them too, but it's asking for trouble to rely on character recoding in crucial system files.
d_rylndr
Posts: 48
Joined: 31 Mar 2011, 14:20

Re: special characters (åäö)

Post by d_rylndr »

Through Samba, that is. And it's the file names on the shared disk that are encoded wrong. (Sorry for leaving out that important information...) It also happens when using ftp or ssh.

It does work on other clients; I'm only having problems with one machine.
Cheeseboy
Posts: 789
Joined: 08 Apr 2007, 12:16

Re: special characters (åäö)

Post by Cheeseboy »

Hi,

So the server is B3, and the client is undefined.
Could you please let us know how the problem manifests itself exactly (incorrect filenames after transfers, incorrect filenames when listing (ls) through ssh/ftp/SAMBA shares/or whatever, or incorrect file content)?
Do you have a permanent mount?
What do your /etc/fstab and /etc/exports files look like on both systems?
What system is the client?
What error messages do you get in the logs on the server and client?

If you want us to help you, you will have to tell us the facts. As Ubi hinted, we cannot guess...

Cheers,

Cheeseboy

EDIT:
Sorry, just reread your previous post.
So there is an issue specific to one machine.
Can you think of any differences from the other machines?
d_rylndr
Posts: 48
Joined: 31 Mar 2011, 14:20

Re: special characters (åäö)

Post by d_rylndr »

The issue is also present when browsing the storage from the web interface (from any machine, even one where the Samba mount works), and also when ssh-ing from any machine to storage/.

It manifests itself as characters like åäö being replaced by characters like "├ñ", so a file called testar_ö.png is displayed as testar_├╢.png.

It is present both when browsing the share and when copying files from the server. Wrongly displayed files are also named wrong when downloaded locally.
Uploading files with special characters through the web interface does work! Those files are displayed correctly in the web interface, through the mounted share, and through ssh.

So the problem does not seem to be with the mount options on the clients (?). Could it be some user and rights settings on the B3?
ryz
Posts: 183
Joined: 12 Feb 2009, 06:03

Re: special characters (åäö)

Post by ryz »

I am still confused. On which machine are you fiddling with the /etc/fstab file? The B3 server or the client? Is the client a Windows or Linux machine?

What type of partition have you mounted on the B3?

Problems with Swedish characters can depend on many different things, but it all comes down to this: everyone who creates and reads the files needs to agree on which encoding to use for the file names. How to do this differs depending on how you access the files. It is best to use UTF-8 as the encoding, which is exactly what you are trying to do.

The question is which encoding the problematic file names are really in; that tells us who is using the correct encoding and who is not.

What ssh client are you using? Some clients need to be told to use UTF-8 as the encoding.
d_rylndr
Posts: 48
Joined: 31 Mar 2011, 14:20

Re: special characters (åäö)

Post by d_rylndr »

Sorry for not being clear.

I'm having problems with one machine, with the web interface, and with ssh.
I'm fiddling with fstab on the one machine that is not working.
The share on the B3 is the preinstalled storage disk (ext3?).
ryz
Posts: 183
Joined: 12 Feb 2009, 06:03

Re: special characters (åäö)

Post by ryz »

So is the machine that has problems reading files from the B3 running Windows, Linux, or something else? Which locale is it using? It sounds like the locale on the machine having the problem is wrong. You have to realize that neither the B3 nor the file system cares about locales, since a file name is just a sequence of bytes. The locale is only used when displaying the file name. If the locale used when displaying the file name is not the same as the one used when creating it, you will end up with strange characters.
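
To make the point above concrete, here is a minimal shell sketch that takes the bytes of one file name and renders them under different character sets. The helper name `show_as` is made up for this illustration, and the sketch assumes an `iconv` that knows the ISO-8859-1 and CP437 charsets. CP437 is only a guess based on the box-drawing characters reported earlier in the thread; nobody here has confirmed it.

```shell
# Hypothetical helper: render the bytes of a name as if they were text in
# the charset given as $2, re-encoded to UTF-8 for display.
show_as() {
    printf '%s' "$1" | iconv -f "$2" -t UTF-8
}

# "testar_ö.png" stored as UTF-8: ö is the two bytes 0xC3 0xB6 (octal 303 266).
name=$(printf 'testar_\303\266.png')

show_as "$name" UTF-8; echo        # reader agrees with the writer
show_as "$name" ISO-8859-1; echo   # reader assumes Latin-1
show_as "$name" CP437; echo        # reader assumes the old DOS code page
```

The bytes never change; only the interpretation does. Read as ISO-8859-1 the name should come out as testar_Ã¶.png, and read as CP437 as testar_├╢.png, which is the same garbling reported above and the reason CP437 looks like a plausible suspect.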

The problem with the web page might be that the B3 is not set up to use UTF-8 as the default locale.
To set the default locale on the B3, run the following commands as root.

Code: Select all

apt-get install locales
dpkg-reconfigure locales

And make sure that you choose a UTF-8 locale on the second screen.

I do not own a B3, only a B2, and I do not use the web interface to transfer files, so I am mostly guessing here.
d_rylndr
Posts: 48
Joined: 31 Mar 2011, 14:20

Re: special characters (åäö)

Post by d_rylndr »

I did some more testing. Files newly created on the server from the machine that is not reading properly do get the right encoding.

Most of the content on this B3 share was moved from my old server using rsync. It is this old material that shows the wrong encoding on one machine, in the web interface, and over ssh/ftp.

So either something went wrong when copying the content, or the material had some encoding other than UTF-8. Why does it look right through some clients though? (echo $LANG gives sv_SE.UTF-8 on all machines, including the B3 server.)

How can I tell which encoding is used for the files? How can I convert them all to UTF-8? Can this be done without risk?

Addition: The machine with problems is running Linux (which could be figured out from it using fstab...). Sorry, I forgot to mention it.
ryz
Posts: 183
Joined: 12 Feb 2009, 06:03

Re: special characters (åäö)

Post by ryz »

I have had some problems with file name encoding myself. A good program for changing the encoding of file names is convmv. Just install convmv and read the man page.
d_rylndr
Posts: 48
Joined: 31 Mar 2011, 14:20

Re: special characters (åäö)

Post by d_rylndr »

Thank you!

I did some testing with convmv and it tells me that the files which have the wrong characters in them already use UTF-8:

Code: Select all

user@B3:/home/storage/WORK/FOLDER$ convmv -t utf8 -r --nfc targetfolder
Your Perl version has fleas #37757 #49830 
Starting a dry run without changes...
Skipping, already UTF-8: targetfolder/b├╢rjan_skiss.mov

If I use the --nosmart flag to force conversion, the converted name would be "b������rjan_skiss.mov".

It doesn't say anything about the other files in that folder and I can't find any verbose option for convmv. Are the other files converted then? How can I know which encoding they use?
And how can I convert ├╢ to ö?
ryz
Posts: 183
Joined: 12 Feb 2009, 06:03

Re: special characters (åäö)

Post by ryz »

I think you need to specify which encoding to convert from with the -f flag; this will most likely be iso-8859-1.
d_rylndr
Posts: 48
Joined: 31 Mar 2011, 14:20

Re: special characters (åäö)

Post by d_rylndr »

That gave "07_WIP_animatics_etc/b├╢rjan_skiss.mov"...

Again, how can I tell which encoding is used by the files?
ryz
Posts: 183
Joined: 12 Feb 2009, 06:03

Re: special characters (åäö)

Post by ryz »

You really can't do that, since a file name is just a stream of bytes and it is impossible for a program to know which characters those bytes are meant to be. The reason convmv can check whether a name is UTF-8 is that not all byte sequences are valid UTF-8, whereas almost any byte sequence is valid in the old single-byte character sets.

What you could do is use xxd to find the hex values of the problematic characters and then try to Google which encoding gives the right character for those hex values. That is the encoding the file names are really in.

Code: Select all

ls <file> | xxd -g1
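
A variant of the same idea, as a sketch: `ls` adds a trailing newline, so it can be tidier to feed the name to the hex dumper directly. This version uses `od` instead of `xxd` purely on the assumption that `od` is more widely installed, and `hex_bytes` is a made-up helper name.

```shell
# Hypothetical helper: print the bytes of a string as space-separated hex.
hex_bytes() {
    # Word splitting on the unquoted substitution collapses od's padding
    # and joins multi-line dumps into one line.
    echo $(printf '%s' "$1" | od -An -tx1)
}

# The same letter "ö" in three different forms (written as octal escapes so
# the example does not depend on how this text itself is encoded):
hex_bytes "$(printf '\366')"                      # ISO-8859-1: f6
hex_bytes "$(printf '\303\266')"                  # UTF-8: c3 b6
hex_bytes "$(printf '\342\224\234\342\225\242')"  # "├╢" stored as UTF-8: e2 94 9c e2 95 a2
```

If the dump of a garbled name shows e2 94 9c e2 95 a2 where an ö should be, the garbled characters themselves are stored in the name, which would mean it was converted to UTF-8 once too often. A plain c3 b6 would be a correctly stored UTF-8 ö, and a lone f6 an ISO-8859-1 ö.
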
Might it be that the files you have problems with really are utf8, and that it is all the others that are in iso-8859-1?

What happens if you try to convert from utf8 to iso-8859-1 instead?
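
One way to try that reverse conversion on a single name without touching any files is sketched below. It is only an illustration: `undo_double_encoding` is a made-up helper, the sketch assumes a glibc-style `iconv` that fails when a character has no equivalent in the target charset, and the candidate charsets (ISO-8859-1 from this thread, CP437 as a guess based on the box-drawing characters in the garbled names) are hypotheses to test, not a diagnosis.

```shell
# Hypothetical helper: treat $1 as UTF-8 text, encode it to the charset in $2,
# and print the resulting bytes only if they are themselves valid UTF-8.
# That is what a name looks like if it was converted to UTF-8 once too often.
undo_double_encoding() {
    candidate=$(printf '%s' "$1" | iconv -f UTF-8 -t "$2" 2>/dev/null) || return 1
    # Re-reading the candidate as UTF-8 fails if the bytes are not valid UTF-8.
    printf '%s' "$candidate" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 || return 1
    printf '%s\n' "$candidate"
}

# The garbled name from the convmv dry run, written with octal escapes.
garbled=$(printf 'b\342\224\234\342\225\242rjan_skiss.mov')

for charset in ISO-8859-1 CP437; do
    if fixed=$(undo_double_encoding "$garbled" "$charset"); then
        echo "$charset: $fixed"
    else
        echo "$charset: no valid UTF-8 result"
    fi
done
```

If one of the charsets yields a sensible name, the convmv man page has a section on undoing double-encoded file names by converting from utf8 back to that charset. Check it for the exact options, and keep convmv's default dry run until the proposed names look right.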
ryz
Posts: 183
Joined: 12 Feb 2009, 06:03

Re: special characters (åäö)

Post by ryz »

The xxd output will of course be harder to read if the encoding really is UTF-8, since a single character can then be anywhere from 1 to 4 bytes long and it is hard to tell where one character ends.

So if xxd shows more bytes than there are characters in the file name, the encoding should be UTF-8. Note that one byte is written as two hex digits.
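
The byte-versus-character comparison can be sketched in shell as follows. The helper names are made up, and the character count assumes the name is valid UTF-8: it simply skips continuation bytes (0x80 to 0xBF), so it does not depend on the current locale.

```shell
# Hypothetical helpers for comparing the two counts of one file name.
byte_count() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Count UTF-8 characters by deleting continuation bytes (octal 200-277);
# every character contributes exactly one byte that is not a continuation byte.
char_count() {
    printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' '
}

name=$(printf 'b\303\266rjan_skiss.mov')   # "början_skiss.mov" as UTF-8
echo "bytes: $(byte_count "$name"), characters: $(char_count "$name")"
# prints: bytes: 17, characters: 16
```

More bytes than characters means at least one character takes more than one byte, which fits UTF-8. In a single-byte charset such as ISO-8859-1 the number of bytes always equals the number of characters.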