[OT] Re: Commit 6803 contains invalid char on commit log

Anderson Lizardo lizardo at linuxfromscratch.org
Sat Oct 1 16:47:29 PDT 2005


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander E. Patrakov wrote:
> Anderson Lizardo wrote:
>> FYI, Subversion commit messages always need to be written either
>> in Unicode or plain ASCII, so "svn log --xml" can output valid XML
>> data. So please set up your favorite editor to use one of these
>> encodings before issuing an "svn commit".
> 
> A bit wrong. SVN uses the current locale encoding when accepting the log
> message from the editor or from the command line, and then internally
> converts it to UTF-8. When dumping the log in non-XML format, it is
> converted from UTF-8 to the locale encoding. When dumping the log in XML
> format, it stays in UTF-8.

I understand, but in that case "svn log --xml" should have output "ü" as
some strange character (maybe not even a valid UTF-8 sequence) instead of
leaving it as a "plain" ISO-8859-1 byte (which is what happens), right?
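
To make the expectation concrete, here is a minimal Python sketch of the
conversion Alexander describes. The function names are my own invention
for illustration (they are not Subversion APIs), and it assumes the
committer's locale encoding is ISO-8859-1:

```python
def to_repository(message: bytes, locale_encoding: str) -> bytes:
    """Convert a log message from the locale encoding to UTF-8 (hypothetical helper)."""
    return message.decode(locale_encoding).encode("utf-8")


def from_repository(stored: bytes, locale_encoding: str) -> bytes:
    """Convert a stored UTF-8 log message back to the locale encoding (hypothetical helper)."""
    return stored.decode("utf-8").encode(locale_encoding)


# "J\u00fcrg" is "Jürg"; in ISO-8859-1 the "ü" is the single byte 0xfc.
typed = "J\u00fcrg".encode("iso-8859-1")
stored = to_repository(typed, "iso-8859-1")

print(typed.hex(" "))   # bytes as typed in the editor
print(stored.hex(" "))  # bytes as they should be stored and emitted by --xml
print(from_repository(stored, "iso-8859-1") == typed)  # round trip for non-XML output
```

If the conversion had happened, the XML output would carry the two bytes
c3 bc for "ü" rather than the single byte fc.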

This is the xxd dump snippet that contains the char in question
(produced by "svn log -r6803 --verbose --xml \
svn://svn.linuxfromscratch.org/LFS | xxd"):

00004a0: 696c 642e 2054 6861 6e6b 2079 6f75 204a  ild. Thank you J
00004b0: fc72 6720 4269 6c6c 6574 6572 0a20 4669  .rg Billeter. Fi
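
The bytes above can be checked mechanically. The following is only a
sketch in Python (the helper name is mine), using the 32 bytes copied
from the dump:

```python
# Bytes copied from the xxd dump above (offsets 0x4a0 through 0x4bf).
dump_hex = (
    "696c 642e 2054 6861 6e6b 2079 6f75 204a"
    "fc72 6720 4269 6c6c 6574 6572 0a20 4669"
)
raw = bytes.fromhex(dump_hex.replace(" ", ""))


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(raw))               # the lone 0xfc byte makes this fail
print(repr(raw.decode("iso-8859-1")))   # the same bytes read as ISO-8859-1
```

The snippet is rejected as UTF-8 but decodes as ISO-8859-1 text with
"ü" in the expected place.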

> I.e., all my attempts to add invalid characters to the log failed.
> ("Добавлена пустая строка" == "Added an empty line").
> 
> Jim: how did you do that? Which svn version?
> Anderson: Which svn version is on the server?

Our server is running svn 1.1.1 (r11581).

Note for server admins: we should upgrade this ASAP. The latest release
in the 1.1.x series is 1.1.4.

>> Notice that the XML header says the content is in utf-8, but the
>> content itself contains the "ü" char (the &uuml; HTML entity). That
>> confuses the XML::Parser module used by the script. As a workaround, I
>> had to force the script to always interpret its input as ISO-8859-1.
> 
> 
> Is it possible to undo the hack now? The hack would prevent "normal"
> non-ASCII log messages from working properly.

I'd suggest we keep the hack for now, until the real issue (Subversion
accepting invalid input in the commit message) is fixed. Otherwise we
would risk the website script breaking again...
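
A gentler variant of the hack might be to fall back to ISO-8859-1 only
when the input is not valid UTF-8, so that correctly encoded messages
keep working. This is just a sketch in Python, not the actual website
script (which uses Perl's XML::Parser); the function name and the
simplified sample XML are assumptions of mine:

```python
import xml.etree.ElementTree as ET


def parse_log_xml(data: bytes) -> ET.Element:
    """Parse "svn log --xml" output, tolerating stray ISO-8859-1 bytes."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume the bytes are ISO-8859-1 instead.
        text = data.decode("iso-8859-1")
    # Re-encode so the bytes match the utf-8 declaration in the XML header.
    return ET.fromstring(text.encode("utf-8"))


# Simplified, hypothetical samples in the shape of "svn log --xml" output.
header = b'<?xml version="1.0" encoding="utf-8"?>\n'
broken = header + b"<log><logentry><msg>Thank you J\xfcrg</msg></logentry></log>"
valid = header + (
    "<log><logentry><msg>\u0414\u043e\u0431\u0430\u0432\u043b\u0435\u043d\u0430"
    "</msg></logentry></log>"
).encode("utf-8")

for sample in (broken, valid):
    print(parse_log_xml(sample).find("logentry/msg").text)
```

The fallback is a guess about the committer's encoding, so it would
still mangle messages written in some other 8-bit encoding.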

BTW, does anyone know whether it's possible to change the commit message
of a specific revision in Subversion? I ask because even after we fix
the issue, that invalid char will still be there...

Regards,
- --
Anderson Lizardo
lizardo at linuxfromscratch.org
http://www.linuxfromscratch.org/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFDPyARkzNmn+NRHHoRAqiSAJ479YQh7h6UGPWXd6LTYuVdItv20wCfY/qy
aG2uQLdwzwzxpVtA23rM4Ec=
=nP77
-----END PGP SIGNATURE-----

	

	
		



More information about the lfs-dev mailing list