May 18, 2009

feedback about converting eigen2 to mercurial

by orzel
Categories: Admin, KDE
Tags: , ,
Comments: 6 Comments

This week-end we did the final conversion of the eigen2 source code repository. I shall describe here the few problems we had, as a feedback to the community.

Eigen original purpose was to help provide linear algebra for several KDE parts. As such, it was until now developed inside the KDE repository, which (still) uses subversion. Let it be clear that KDE did a wonderful job at hosting eigen repository, even though several accounts needed to be created for people who only contribute to eigen and not KDE as a whole. Eigen is getting big enough to live and release its own way , but eigen keeps very strong relationships with KDE.
Here is the script that was used for the conversion. The version of mercurial (and of the mercurial convert extension) was 1.2.1.

hg convert https://orzel@svn.kde.org/home/kde \
    --authors eigen.authors \
    --config convert.svn.tags=tags/eigen \
    --config convert.svn.branches=branches/eigen \
    --config convert.svn.trunk=trunk/kdesupport/eigen2\
    eigen.hg

We faced several problems. Fortunately all but one could be resolved patching python code. That would have been a lot more difficult using, for example, git. Speaking of git, we have found no easy way to make such a conversion, mainly because KDE repository is so huge and it seems git needs to parse the whole history of the whole repository just to convert this part. Eigen is quite small. It took mercurial two hours to convert it, almost all of which was due to network latency between my computer and KDE repository.

Bad default branch name

In mercurial, the default branch name is ‘default’ (same as ‘trunk’ on svn). If you browse a mercurial repository (using the wonderful tck/tk frontend or the web browser), each changeset is tagged with the branch name (and with every tag name is there is any). There is an exception to this, if the branch is ‘default’, nothing is displayed, as a convenience for clarity. Which is great. Using the script I present here, the convert extension would create a branch according to the name of the branches/ subdirectory, which is fine. Though the name for the main branch, the one called ‘trunk’ in svn and usually called ‘default’ in mercurial, was named ‘eigen2’ here, probably because the main directory was named this way.

I have found no way to prevent this using command line arguments. I have tried to play with the –filemap option, but failed. The final solution I used was to locally patch my mercurial (yes, I know, this is ugly):

--- /usr/lib/python2.6/site-packages/hgext/convert/subversion.py
+++ hgext/convert/subversion.py
@@ -777,6 +777,8 @@
                 branch = self.module.split("/")[-1]
                 if branch == 'trunk':
                     branch = ''
+                if branch == 'eigen2':
+                    branch = ''
             except IndexError:
                 branch = None

Bad branch used to save the tags file

In mercurial, tags are created by updating a file called ‘.hgtags’ which maps changesets (identified by their md5 sum) to a string. At the end of the conversion process, ‘hg convert’ will create such a file and check it in the mercurial repository. Unfortunately in our case, the file was not created on the right branch neither, so I also had to patch mercurial for this :

--- /usr/lib/python2.6/site-packages/hgext/convert/hg.py
+++ hgext/convert/hg.py
@@ -178,7 +178,8 @@                                                                                                   

          self.ui.status(_("updating tags\n"))                                                                       
          date = "%s 0" % int(time.mktime(time.gmtime()))                                                            
-         extra = {'branch': self.tagsbranch}                                                                        
+#         extra = {'branch': self.tagsbranch}                                                                       
+         extra = {'branch': 'default'}                                                                              
          ctx = context.memctx(self.repo, (tagparent, None), "update tags",                                          
                               [".hgtags"], getfilectx, "convert-repo", date,                                        
                               extra)

Spurious file from old history

Once the conversion was done, we noticed a file called src/Core, which should not be here. A diff with a svn checkout would confirm that this is the only difference. I have traced back the problem to a bug in svn which would allow a ‘svn copy’ to be done even if the destination name would already exist.

In our case, a directory called ‘Core’ was copied on top of the file ‘Core’. As such, the underlying file was never deleted, and mercurial kept it. Mercurial did quite well dealing with this actually.

See this wonderful commit

http://websvn.kde.org/?view=rev&revision=72256

And this is how the repository looked like just before this commit:

http://websvn.kde.org/branches/work/eigen2/src/?pathrev=72254

We could not manage to fix this problem, and we just removed the file after the conversion. Having this file would anyway not prevent using eigen2 in compilation.

It seems the bug is fixed in svn 1.6.2:  I can not reproduce this on a small test svn repository. A bug was filled for mercurial with all details, but as we cannot reproduce this behaviour on a test svn repository, this will be very difficult to solve. We for sure can not use KDE repository to test, it is far too huge for this.

Conclusion

We now have eigen2 repository on https://bitbucket.org/eigen/eigen2/ and so far everything seems ok. The repository is worth 7.3 Mb, which is perfectly honest. We can now grant or revoke write access very quickly and easily without disturbing KDE admins. And of course we also have all the wonderful advantages of DVCS.