Version Control: FeatureBranch by Martin Fowler

liuxue.gu@hotmail.com · October 20, 2011 · 7 views

With the rise of Distributed Version Control Systems (DVCS) such as git and Mercurial, I've seen more conversations about strategies for branching and merging and how they fit in with [url=http://martinfowler.com/articles/continuousIntegration.html]Continuous Integration[/url]. There's a bit of confusion here, particularly on the practice of feature branching and how it fits in with CI.

[b]Simple (isolated) Feature Branch[/b]

The basic idea of a feature branch is that when you start work on a feature (or story if you prefer that term) you take a branch of the repository to work on that feature. In a DVCS, you'll do this in your personal repository, but the same kind of thing works in a centralized VCS too.

I'm going to illustrate this with a series of diagrams. I have a shared project mainline, colored blue, and two developers, colored purple and green (since the developers' names are Reverend Green and Professor Plum).

[attach]1590[/attach]

I'm using labeled colored boxes (e.g. P1 and P2) to represent local commits on the branch. Arrows between branches represent merges; the merge boxes are colored orange to make them stand out. In this case there are updates, say a couple of bug-fixes, applied to the mainline (presumably by Mrs Peacock). When these happen our developers merge them into their work. To give this a sense of time, I'll assume we're looking at a few days' work here, with each developer committing to their local branch roughly once a day.

In order to ensure things are working properly, they can run builds and tests on their branch. Indeed for this article I'll assume that each commit and merge comes with an automated build and test on the branch it's on.

The advantage of feature branching is that each developer can work on their own feature and be isolated from changes going on elsewhere. They can pull in changes from the mainline at their own pace, ensuring they don't break the flow of their feature. Furthermore it allows the team to choose its features for release. If Reverend Green takes too long, we can release with just Professor Plum's changes. Or we may want to delay Professor Plum's feature, perhaps because we are uncertain that the feature works the way we want to release it. In this case we just tell the professor not to merge his changes into mainline until we are ready for the feature. This is called [i]cherry-picking[/i]: the team decides which features to merge in before release.

Attractive though that picture looks, there can be trouble ahead.

[attach]1591[/attach]

Although our developers can develop their features in isolation, at some point their work does have to be integrated. In this case Professor Plum easily updates the mainline with his own changes. There's no merge here because he's already incorporated the mainline changes into his own branch (there will be a build). Things are, however, not so simple for Reverend Green: he needs to merge all of his changes (G1-6) with all of Professor Plum's (P1-5).

I've made this a big merge box as it's a scary merge. It may be just fine: the developers may have been working on completely separate parts of the code base with no interaction, in which case the merge will go smoothly. But they may be working on bits that do interact, in which case here lye dragons.

The dragons can come in many forms, and tooling can help slay [i]some[/i] of them. The most obvious dragon is the complexity of merging the source code and dealing with conflicts as developers edit the same files. Modern DVCSs actually handle this rather well, indeed somewhat magically. Git has quite the reputation for dealing with complicated merges. So much so that the textual issues of merging are much better than they used to be. Indeed, I'll go so far as to discount textual conflicts for the purposes of this article.

The problem I worry more about is a semantic conflict. A simple example of this is Professor Plum changing the name of a method that Reverend Green's code calls. Refactoring tools allow you to rename a method safely, but only on your code base. So if G1-6 contain new code that calls foo, Professor Plum can't tell in his code base as he doesn't have it. You only find out on the big merge.

A function rename is a relatively obvious case of a semantic conflict. In practice they can be much more subtle. Tests are the key to discovering them, but the more code there is to merge, the more likely you'll have conflicts and the harder it is to fix them. It's the risk of conflicts, particularly semantic conflicts, that makes big merges scary.

This fear of big merges also acts as a deterrent to refactoring. Keeping code clean is a constant effort; to do it well requires everyone to keep an eye out for cruft and fix it wherever they see it. However, this kind of refactoring on a feature branch is awkward because it makes the Big Scary Merge much worse. The result we see is that teams using feature branches shy away from refactoring, which leads to uglier code bases.

Indeed I see this as the decisive reason why Feature Branching is a bad idea. Once a team is afraid to refactor to keep their code healthy they are on a downward spiral with no pretty end.

[b]Continuous Integration[/b]

It's these problems that Continuous Integration was designed to solve. With Continuous Integration my diagram looks like this.

[attach]1592[/attach]

There's a lot more merging going on here, but merging is one of those things that's much easier to do frequently and small rather than rarely and large. As a result, if Professor Plum is changing some code that Reverend Green relies on, the Reverend will find it early, such as when he merges in P1-2. At that point he's only got to modify G1-2 to work with the changes, rather than G1-6.

CI is effective at removing the problem of big merges, but it's also a vital communication mechanism. In this scenario the potential conflict will actually appear when Professor Plum merges G1 and realizes that Reverend Green is actively building on Plum's libraries. At this point Professor Plum can go and find Reverend Green and they can discuss how their two features interact. It may be that Professor Plum's feature requires some changes that don't mesh well with Reverend Green's changes. By looking at both their features they can come up with a better design that affects both their work-streams. With the isolated feature branches our developers don't discover this till late, probably too late to do much about it. Communication is one of the key factors in software development and one of CI's most important features is that it facilitates human communication.

It's important to note that, most of the time, feature branching like this is a different approach to CI. One of the principles of CI is that everyone commits to the mainline every day. So unless feature branches last less than a day, running a feature branch is a different animal to CI. I've heard people say they are doing CI because they are running builds, perhaps using a CI server, on every branch with every commit. That's continuous building, and a Good Thing, but there's no [i]integration[/i], so it's not CI.

[b]Promiscuous Integration[/b]

Earlier I said parenthetically that there are other ways of doing feature branching. Say Professor Plum and Reverend Green take tea together early in the cycle. While chatting they discover they are working on features that interact. At this point they may choose to integrate with each other directly, like this.

[attach]1593[/attach]

With this approach they only push to the mainline at the end, as before. But they merge frequently with each other, so this avoids the Big Scary Merge. The point here is that the primary issue with the isolated feature branching scheme is its isolation. When you isolate the feature branches, there is a risk of a nasty conflict growing without you realizing it. Then the isolation is an illusion, and will be shattered painfully sooner or later.

So is this more ad-hoc integration a form of CI or a different animal entirely? I think it is a different animal; again, a key point of CI is that everyone integrates to the [i]mainline[/i] every day. Integrating across feature branches, which I shall call [i]promiscuous integration[/i] (PI), doesn't involve or even need a mainline. I think this difference is important.

"I see CI as primarily giving birth to a release candidate at each commit. The job of the CI system and deployment process is to disprove the production-readiness of a release candidate. This model relies on the need to have some mainline that represents the current shared, most up to date picture of complete."
-- Dave Farley

[b]Promiscuous Integration vs Continuous Integration[/b]

So if it's different, is PI better than CI, or, more realistically, under what circumstances is PI better than CI?

With CI, you lose the ability to use the VCS to do cherry picking. Every developer is touching mainline, so all features grow in the mainline. With CI, the mainline must always be healthy, so in theory (and often in practice) you can safely release after any commit. Having a half built feature or a feature you'd rather not release yet won't damage the other functionality of the software, but may require some masking if you don't want it to be visible in the user-interface. This can be as simple as not including a menu item in the UI to trigger the feature.

PI can provide some middle ground here. It allows Reverend Green the choice of when to incorporate Professor Plum's changes. If Professor Plum makes some core API changes in P2, then Reverend Green can import P1-2 but leave the others until Professor Plum's feature is put onto the release.

One worry with all this picking and choosing is that PI makes it really hard to keep track of who has what in their branch. In practice, it seems tooling pretty much solves this problem. DVCSs keep a clear track of changes and their origins and can figure out that when Professor Plum pulls G3 he already has G2 but doesn't have B2. I may have made mistakes drawing the diagram by hand, but tools do keep track of these things well.

On the whole, however, I don't think cherry-picking with the VCS is a good idea.

"Feature Branching is a poor man's modular architecture, instead of building systems with the ability to easy swap in and out features at runtime/deploytime they couple themselves to the source control providing this mechanism through manual merging."
-- Dan Bodart

I much prefer designing the software in such a way that makes it easy to enable or disable features through configuration changes. Two useful techniques for this are [url=http://martinfowler.com/bliki/FeatureToggle.html]FeatureToggles[/url] and [url=http://martinfowler.com/bliki/BranchByAbstraction.html]BranchByAbstraction[/url]. These require you to put some thought into what needs to be modularized and how to control that variation, but we've found the result to be far less messy than relying on the VCS.

The main thing that makes me nervous about PI is the influence on human communication. With CI the mainline acts as a communication point. Even if Professor Plum and Reverend Green never talk, they will discover the nascent conflict within a day of it forming. With PI they have to notice they are working on interacting code. An up-to-date mainline also makes it easy for someone to be sure they are integrating with everyone; they don't have to poke around to find out who is doing what, so there is less chance of some changes being hidden until a late integration.

PI arose out of open-source work, and it could be that the less intensive tempo of open-source is a factor here. In a full time job, you work several hours a day on a project. This makes it easier for features to be worked on in priority order. With an open source project people often put in an hour here, and the next hour a few days later. A feature may take one developer quite a while to complete while other developers with more time are able to get features into a releasable state earlier. In this situation cherry picking can be more important.

It's important to realize that the tools you use are largely independent of the integration strategy you use. Although many people associate DVCSs with feature branching, they can be used with CI. All you need to do is mark one branch on one repository as the mainline. If everyone pulls and pushes to that every day, then you have a CI mainline. Indeed with a disciplined team, I would usually prefer to use a DVCS on a CI project than a centralized one. With a less disciplined team I would worry that a DVCS would nudge people towards long lived branches, while a centralized VCS and a reluctance to branch nudges them towards frequent mainline commits. Paul Hammant may be right: "I wonder though, if a team should not be adept with trunk-based development before they move to distributed."

http://martinfowler.com/bliki/FeatureBranch.html
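As a footnote to the point about enabling or disabling features through configuration, the following is a rough Python sketch of a feature toggle that hides a half-built feature by leaving its menu item out of the UI. Everything in it is an assumption made for illustration: the `FeatureToggles` class, the `export_pdf` feature name, and the menu items are hypothetical and are not taken from the article or from any particular library.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureToggles:
    """Holds the names of features switched on by configuration."""
    enabled: set = field(default_factory=set)

    @classmethod
    def from_config(cls, config):
        # config maps a feature name to True/False, e.g. loaded from a file.
        return cls({name for name, on in config.items() if on})

    def is_on(self, name):
        return name in self.enabled


def build_menu(toggles):
    """Return the menu items to show; unfinished features stay hidden."""
    menu = ["Open", "Save"]
    if toggles.is_on("export_pdf"):  # hypothetical half-built feature
        menu.append("Export as PDF")
    return menu


# Same code on the mainline, two different configurations.
release_menu = build_menu(FeatureToggles.from_config({"export_pdf": False}))
preview_menu = build_menu(FeatureToggles.from_config({"export_pdf": True}))
```

The unfinished code lives on the mainline in both cases; only the configuration decides whether users can reach it, so choosing what to release no longer depends on which branches were merged.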

[code] I've heard people say they are doing CI because they are running builds, perhaps using a CI server, on every branch with every commit. That's continuous building, and a Good Thing, but there's no integration, so it's not CI. -- Martin Fowler [/code]
