Here's another POV. (Why does everybody keep emailing me personally?)

--Guido van Rossum (home page: http://www.python.org/~guido/)

---------- Forwarded message ----------
From: Daniel Berlin <dberlin at dberlin.org>
Date: Aug 13, 2005 7:33 PM
Subject: Re: [Python-Dev] PEP: Migrating the Python CVS to Subversion
To: gvanrossum at gmail.com

(Sorry for the lack of proper References: headers; this is a reply to the email archive.)

It's been a couple of years since I've been in the world of python-dev, but apparently I'm rejoining the mailing list at just the right time. Take all of this for what it is worth: I'm currently responsible for GCC's bugzilla and wiki, in addition to maintaining several optimization areas of the compiler :P. I'm also responsible for pushing GCC (my main love and world :P) towards Subversion. I should note my bias at this point: I now have full commit access to Subversion. However, I've also submitted patches to monotone, etc.

We had a long thread about the various alternatives (arch, bzr, etc.), and besides the "freeness" constraints on what we can run on gcc.gnu.org as an FSF project, it wouldn't have mattered anyway. This has been in the planning for about a year now (mainly waiting for new hardware). Originally we were hoping to move GCC to monotone, but it didn't mature fast enough (it's way too slow), and we couldn't make it centralized enough for our tastes (more on that later). The rest of the free tools other than Subversion (arch, monotone, git, darcs, etc.) simply couldn't handle our repository with reasonable speed/hardware.

GCC has project history dating back to 1987. It's a 4 GB CVS repo containing > 1000 tags and > 300 branches, and the ChangeLog alone covers 30k trunk revisions. Distributed systems that carry full history often can't deal with this fast enough or in any space-efficient way. arch was an example of this: it had a retarded mechanism that forced users to care about caching certain revisions to speed it up, instead of doing it on its own. I've never tried converting this repo to bazaar-ng; it wasn't far enough along when I started, it had no cvs2bzr-type program, and we aren't about to lose all our history. Except for monotone (builtin cvs_import) and Subversion (cvs2svn), none of the cvs2* programs I've run across can either run in reasonable time (those that don't actually understand how to extract RCS revisions would literally take weeks to convert our repo) or handle all the complexities a repository with our history presents (branches of branches, etc.). Most simply crash in weird ways or run out of memory :).

Anyway: monotone took 45 minutes just to diff two old revisions that are one revision apart. CVS takes about 2 minutes for the same operation. SVN on fsfs takes 4 seconds. The converted SVN repo has > 100000 revisions and is only ~15% bigger than the CVS repo (mostly due to stupid copies it has to do to handle some tag fuckery people were doing in some cases; if it had been Subversion from the start, it would have been smaller). We have cvs2svn speedup patches, done together with the KDE folks, that make cvs2svn I/O-bound again instead of CPU-bound (it was O(N^2) in extracting CVS revision texts before). Converting the GCC repository now takes 17 hours, of which only 45 minutes is CPU time :). It used to take 52 hours.
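(For the curious, those diff numbers came from nothing fancier than timing the obvious commands. A rough harness along these lines would reproduce the comparison; the repository URLs, revision numbers, and the exact cvs/monotone arguments below are made up, so substitute ones that actually exist in your repositories.)

    import subprocess, time

    DEVNULL = open("/dev/null", "w")

    # Diff two adjacent revisions in each system and time it.
    # Everything below (URLs, revisions, tags, db names) is
    # hypothetical -- adjust for your own repositories.
    COMMANDS = {
        "svn (fsfs)": ["svn", "diff", "-r", "98000:98001",
                       "file:///svn/gcc"],
        "cvs": ["cvs", "-d", "/cvsroot/gcc", "rdiff",
                "-r", "some-old-tag", "-r", "the-next-tag", "gcc"],
        "monotone": ["monotone", "--db=gcc.db", "diff",
                     "--revision=OLD_ID", "--revision=NEW_ID"],
    }

    for name, cmd in COMMANDS.items():
        start = time.time()
        subprocess.call(cmd, stdout=DEVNULL, stderr=DEVNULL)
        print("%-10s %.1f seconds" % (name, time.time() - start))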
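(The conversion itself boils down to two commands. Here's a minimal driver sketch, assuming local repository paths, both made up here; check --help on your svnadmin and cvs2svn versions before trusting the exact options.)

    import subprocess

    CVS_REPO = "/cvsroot/gcc"   # hypothetical source repository
    SVN_REPO = "/svn/gcc"       # hypothetical target repository

    # Create the target repository on the fsfs backend (the one
    # the diff timings above were measured against).
    subprocess.call(["svnadmin", "create", "--fs-type", "fsfs",
                     SVN_REPO])

    # Convert the full CVS history into the repository we just
    # created. -s names the target; --existing-svnrepos tells
    # cvs2svn not to try creating it itself.
    subprocess.call(["cvs2svn", "--existing-svnrepos", "-s",
                     SVN_REPO, CVS_REPO])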
I've also talked with Linus about version control before. He believes extreme distributed development is the way to go.

I believe heavily that in most cases where you have a mix of corporations and free developers, extreme distribution ends up causing people to "hide the ball" more than they should. This is particularly prevalent in GCC. We don't want design and development done in private and then sent in as mega-patches presented as a fait accompli, only to watch those people whine as their designs get torn apart. We'd rather have the discussion on the mailing list and the work done in a visible place (i.e., a CVS branch stored somewhere central) than get patch bombs. As a result (and there are many other reasons; I'm just presenting one of them), we actually don't *want* to move away from a centralized model, precisely to help control the social and political problems we'd have to face if we went fully distributed.

Python may not face any of these problems to the degree that GCC does (I doubt many projects do, actually; GCC is a very weird and tense political situation :P), because of size, etc., in which case a distributed model may make more sense. However, you need to be careful to make sure people understand that it hasn't actually changed your real development process (PEPs, etc.), only the workflow used to implement it.