For reference, cvs2svn was run with the following options for the transition:
--username=freeciv --encoding=ISO-8859-1 --cvs-revnums
A full explanation of the options is available here.
CVS is not smart enough to understand which encoding files are in. The Freeciv source tree is quite old, and has lived on at least three different machines that I am aware of. The first two likely used an ISO-8859-1 locale, and the last used UTF-8. Unfortunately, individual committers submit commit log messages in whatever locale they use, and CVS eats them up like a champ without complaining! This means the log is a jumble of messages in different encodings (likely ASCII, ISO-8859-1, EUC-JP and UTF-8), and nothing marks which encoding each message is in.
I saw no easy solution to this problem, so I chickened out and took the easy route. I converted the commit logs as if everything was in ISO-8859-1.
The --encoding option makes cvs2svn treat the log messages in the CVS repository as ISO-8859-1.
The --cvs-revnums option stores CVS revision numbers as metadata, because they might be needed to trace old bugs and patches.
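To illustrate what this choice does to messages that were not really ISO-8859-1, here is a minimal Python sketch. It is not part of cvs2svn; the function names `decode_log` and `repair_utf8_mojibake` are hypothetical and exist only for this example. Decoding as ISO-8859-1 never fails, because every byte maps to some character, so UTF-8 messages come through as garbled but recoverable text.

```python
def decode_log(raw: bytes) -> str:
    """Decode a raw log message the way the conversion treated it:
    as ISO-8859-1. This never raises, since all 256 byte values are valid."""
    return raw.decode("iso-8859-1")


def repair_utf8_mojibake(text: str) -> str:
    """Best-effort repair (hypothetical helper): if the text is really UTF-8
    that was decoded as ISO-8859-1, recover the intended characters.
    Anything that does not round-trip cleanly is returned unchanged."""
    try:
        return text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# A message originally written in a UTF-8 locale.
utf8_bytes = "café".encode("utf-8")
garbled = decode_log(utf8_bytes)          # two junk characters replace the "é"
fixed = repair_utf8_mojibake(garbled)     # the original text again

# A message originally written in an ISO-8859-1 locale is already correct.
latin1_bytes = "café".encode("iso-8859-1")
untouched = repair_utf8_mojibake(decode_log(latin1_bytes))
```

The repair is only a heuristic: a genuine ISO-8859-1 message whose bytes happen to form valid UTF-8 would be altered, and EUC-JP messages are simply left garbled.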
It is possible to change log messages in SVN after the commit. See: