At Wed, 21 May 2008 10:43:43 -0700 (PDT), Linus Torvalds wrote:
On Wed, 21 May 2008, Takashi Iwai wrote:
Well, what I meant is fixes to the subsystem (say, ALSA) made by people outside it. Not every ALSA bugfix patch goes upstream via the ALSA tree. You, Andrew and others pick up individual ALSA fix patches, and those will be missing from the ALSA subsystem tree.
Well, that's actually fairly rare, but when it happens, either:
- if you didn't get the fix (ie you're just seeing random patches go in that happen to touch alsa), why should you then merge the WHOLE TREE with all my experimental stuff anyway? You can largely ignore it, knowing it's fixed, and when you ask me to pull, we'll have a good end result.
- if you got the same fix as a patch, just apply it to your tree (ie just ignore what happens upstream). This happens all the time - people duplicate patches simply because two people apply it.
But the real issue here is that my tree sometimes gets ten THOUSAND commits during the merge window. Do you really want to pull those thousands of commits into your tree just for one or two possible ALSA fixes?
Indeed, that's the whole question. My statement follows below.
In _my_ tree, at least the people involved with asking me to pull end up also having (a) people test it and (b) aware that it's in my tree, so they work on trying to fix it. But if ALSA just merges at random times, neither of those two cases are true. Nobody will know about or test some random state that ALSA merged into its own tree.
Ask yourself (and ignore the ALSA parts - think of some totally *different* development area) which you think is better
- developing in one area based on a stable base, with the people who do development in that area knowing about that area.
- or develop on top of a churning sea of thousands of changes to other sub-areas that you don't know anything about?
In other words, the reason I ask people to not do lots of merges is more than just "it looks confusing". It's literally a matter of "it's bad development practice". It causes problems. The confusing history is actually *real* - it's not just a "visual artifact" of looking at the result in gitk. The confusing history is a real phenomenon, and implies that people are doing development not based on some tested base.
Yeah, I've been always amazed by gitk graphs :)
And what if you need a fix for the fix that isn't in the ALSA tree...? IMO, either a rebase or a merge is better than cherry-picks.
First off, I don't see why you even need cherry-picks in the first place. I think your argument is bogus, and you're making it because you want to get the end result, not because the argument is valid on its own.
Here, let's see what I committed to the sound subsystem since 2.6.24 (ignoring merges):
git log --no-merges v2.6.24.. --committer=torvalds sound/
and look over that list. Remember: this is not some short timeframe, this is over TWO whole merge windows, ie this is way more commits than we would normally _ever_ get out of sync over.
Realistically, which of those commits aren't (a) either already from you sent to me just as a way to get a quick fix into my tree without merging the whole thing or (b) stuff that can't just be in my tree and doesn't have to be in the ALSA tree until the next release?
Honestly, now: does *any* of those commits look like "we should merge all the other changes just because we need that commit _now_ in ALSA"?
I really doubt it.
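To put rough numbers on that question for any tree, one can count how many commits a merge would drag in versus how many of them actually touch the subsystem. The following is only a minimal sketch on a throwaway repository: the branch name `subsystem`, the `sound/` and `net/` layout, and the commit messages are all invented for illustration, and it relies on nothing beyond standard `git rev-list` options.

```shell
#!/bin/sh
# Toy reproduction: every repository, branch, and file name here is made up.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email demo@example.com
git config user.name "Demo User"

# Common base shared by "mainline" and the subsystem branch.
mkdir sound net
echo base > sound/core.c
echo base > net/core.c
git add .
git commit -q -m "base"
mainline=$(git rev-parse --abbrev-ref HEAD)   # whatever the default branch is called
git branch subsystem                          # stand-in for the subsystem tree

# Mainline moves on: three unrelated commits and one sound/ fix.
for i in 1 2 3; do
    echo "change $i" >> net/core.c
    git commit -q -am "net: change $i"
done
echo fix >> sound/core.c
git commit -q -am "sound: small fix"

# How many commits would a merge pull in, and how many of them touch sound/?
total=$(git rev-list --count --no-merges "subsystem..$mainline")
sound=$(git rev-list --count --no-merges "subsystem..$mainline" -- sound/)
echo "merge would bring in $total commits, $sound of them touching sound/"
```

On a real tree the same two `git rev-list` calls would be pointed at the actual subsystem branch and the upstream branch; the ratio between the two counts is what the question above is about.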
Don't get me wrong: I haven't suggested frequent rebases at all. This thread actually began because the present alsa.git tree needs an update before my patches can be applied properly. [BACKGROUND: We are trying to run the alsa.git tree with multiple committers. And the current git-rebase doesn't care about sign-offs when a patch was committed by others. But this is another topic...]
However, I have to point out that a backport or backmerge is rare but certainly does happen.
For example, assume that we now need to change the code that touches device creation. In your current tree, the driver core has changed its API, so we need that change as well. However, picking this particular change might not be enough if it's part of a long series of patches.
BTW, about stability: we have an independent ALSA tree containing only a subset of the kernel tree (the sound part). We apply patches to it continuously, without rebasing or merging. People other than development-kernel testers usually use this tree.
So I'd seriously suggest submaintainers merge *AT*MOST* once a week, and preferably much much less often than that. There simply isn't any real reason to do it more often. Because it can cause problems.
That's why my suggested rule is:
- merge with mainline at major releases
  This is "safe". Yes, releases still have bugs, but on the other hand, they have much fewer problems than random git trees of the day, so they are a lot safer targets to merge.
- merge with mainline if you know there are real conflicts that need to be resolved.
  This isn't "safe", but it's about trying to resolve conflicts early, so at some point the downside of merging with a "random point" is smaller than the downside of delaying the merge!
but perhaps the most important rule is that things should never be *really* black-and-white, and in the end the really fundamental rule should be:
- Use your own judicious good sense, and merge at other points as necessary, but just keep in mind that a merge is a big change.
Yes, merging with git may be technically really really trivial and take all of two seconds of your time, but:
(a) you *do* potentially get thousands of new commits that aren't actually related to your work and that you probably don't know well.
(b) others, when they look at your history, will have a harder time following it.
so while I can give you a few guidelines, in the end those guidelines are just _examples_ of when merges can make sense. You need to understand what the impact of a merge is - and that while git makes merging technically pretty damn trivial most of the time, a merge should still be a big deal, and something you think about.
So the kinds of merges I *really* dislike are the ones that are basically "let's do a regular merge every day to keep up-to-date". That's fine if you don't do any development at all and "git pull" is just basically a "track the current development kernel for testing", but if it involves a merge, it means that there is something wrong in your development model.
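As an illustration of the "merge with mainline at major releases" rule, a subsystem branch can merge a release tag rather than the moving tip of mainline. This is just a sketch on a throwaway repository: the tag name `v-demo-release`, the branch name `subsystem`, and the file names are hypothetical, and a real workflow would merge an actual release tag.

```shell
#!/bin/sh
# Toy reproduction: tag, branch, and file names are invented for this sketch.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email demo@example.com
git config user.name "Demo User"

echo base > core.c
git add core.c
git commit -q -m "base"
mainline=$(git rev-parse --abbrev-ref HEAD)   # whatever the default branch is called
git branch subsystem

# Mainline: one change that makes the release, then post-release churn.
echo released >> core.c
git commit -q -am "core: change that made the release"
git tag v-demo-release
echo untested > churn.c
git add churn.c
git commit -q -m "churn: random change of the day"

# Subsystem development happens on its own branch.
git checkout -q subsystem
echo driver > driver.c
git add driver.c
git commit -q -m "driver: subsystem work"

# Merge the release tag, not whatever mainline happens to be today.
git merge -q --no-edit v-demo-release

# The post-release churn stays out of the subsystem branch.
behind=$(git rev-list --count "subsystem..$mainline")
echo "post-release mainline commits not merged: $behind"
```

The subsystem branch now sits on a released base, and the untested post-release commit is only picked up at the next deliberate merge.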
Oh, this is really helpful. Maybe it should be documented somewhere as a reference...
But my question is about the divergence between the development and for-linus branches: how to apply patches that exist only in the for-linus tree back to the development branch.
How often does it happen? And how big/important are those? I really think it's probably a "maybe once or twice a release cycle".
And then, the actual answer can be different depending on the details. For example, there are really three things you can do:
- ignore it. Is it a cleanup patch (like the sparse patches) or just fairly trivial stuff that doesn't matter in real life ("remove duplicated unlikely()" patch or the /proc fixups)?
This is often the right thing to do. You _will_ merge eventually anyway, we know that. I'd expect merges to happen at least once in the development cycle, maybe twice.
Yes, the patch may touch the sound system, but do you really _care_ about it happening right now, or can you just wait until the next merge you do?
Well, there is another case to think about: core API changes or changes to header files, for example. These happen pretty often, at practically every kernel release :) And the code I'm working on is for the next kernel release, so it should follow these changes, too. That is, I need the top-most development tree. This is another "divergence".
Or, I could postpone the changes touching these until the next kernel release -- then the tree gets merged anyhow and patches can be applied safely. But, of course, it means the fix or improvement will be delayed for one kernel release cycle.
- cherry-pick it. Is it a small, simple patch that you want, but that isn't really worth pulling in all the other stuff that you simply don't know?
This isn't wrong. It shouldn't be *common*, but it's not wrong to have the same patch in two different branches. It makes sense if it is something you really want, but it's still not important or complex enough to actually merge everything else!
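A small sketch of that case, again on a throwaway repository with made-up names (`subsystem`, `sound/`, `net/`): the single fix is cherry-picked into the subsystem branch, and a later merge with mainline still goes through cleanly because both sides made the identical change. This is only the simple case; if either side later edits the same lines differently, a conflict can still occur.

```shell
#!/bin/sh
# Toy reproduction: branch and file names are invented for this sketch.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email demo@example.com
git config user.name "Demo User"

mkdir sound net
echo base > sound/core.c
echo base > net/core.c
git add .
git commit -q -m "base"
mainline=$(git rev-parse --abbrev-ref HEAD)   # whatever the default branch is called
git branch subsystem

# Mainline gets unrelated churn plus one small sound/ fix.
echo churn >> net/core.c
git commit -q -am "net: unrelated churn"
echo fix >> sound/core.c
git commit -q -am "sound: small fix"
fix=$(git rev-parse HEAD)

# Take only the fix into the subsystem branch; -x records where it came from.
git checkout -q subsystem
git cherry-pick -x "$fix" >/dev/null

# Much later: the eventual merge sees the same change on both sides.
git merge -q --no-edit "$mainline"
fix_lines=$(grep -c '^fix$' sound/core.c)
echo "fix appears $fix_lines time(s) after the merge"
```

The same patch now exists as two commits in history, which is the duplication described above, but the file content ends up identical to what a plain merge would have produced.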
Hm, that's what I didn't consider seriously. I thought cherry-picked patches could easily cause merge errors.
- and finally: merge. It really can be the RightThing(tm). Is it a biggish infrastructure change? Is it a series of several related and dependent commits?
In other words: is it something big enough that you'd rather merge everything else too (which at least has gotten tested together)? If so, merging is absolutely the right thing to do!
So merging on its own is not "wrong or evil" at all. Merging is a very good operation to do, but *mindless* merging is bad. That's really all that I'm really trying to argue against.
If you thought it through, and decided that yes, you really want to merge, then you should merge. I just think a lot of people merge without even thinking about all the other things it involves, just because git made it *so* easy to do.
Yeah, that's exactly what I feel now, too. There is no crystal-clear guideline, but common sense works best...
Thanks,
Takashi