Tue, 01 Dec 2009

Linux Next Graphing

A while back Rusty posted about graphing the size of the daily linux-next patches.

Since we are heading towards the merge window for 2.6.33 and hence sfr has been getting home later and later, I thought I'd take another look at it.

The dodgy script I've been using to create this is out here. This also creates the raw data file which is here.

You can see some periods where there was no linux-next release, like around the 2.6.27 release. You can also see that linux-next is never zero size. Either Linus doesn't take everything in linux-next, or new stuff for the following release is coming in before the last release is done with. sfr mentioned that there is some stuff in linux-next that's been in there for ages and hasn't been merged up to Linus.

There is a difference between how Rusty got his data and how I did. Rusty used the size of the bz2 patch out on kernel.org. These patches are against the release and release candidates (ie. against 2.6.30, 2.6.30-rc1, 2.6.30-rc2, etc). I'm using the linux-next git tree to determine how big linux-next is for that day. Since sfr bases linux-next off Linus' git origin each day, I take the difference between Linus' git origin and the linux-next release to determine the size. Since Linus' origin is at least as new as the RCs, my size is never larger than Rusty's. This is especially noticeable in the merge window (the ~2 weeks between the release and rc1). In the merge window, Rusty's size continues to grow until rc1 is released, but mine starts to go down almost immediately after the main release as Linus starts merging trees into his git origin and making life easier for sfr. Also, Rusty is using patch size (bz2 compressed) and I'm using the number of lines changed (insertions + deletions).
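For anyone wanting to reproduce the "lines changed" number, here is a minimal sketch of the idea rather than my actual script. It assumes a local clone that contains both Linus' tree and linux-next, and it sums insertions and deletions from the summary line that `git diff --shortstat` prints. The function names and the `base`/`next_ref` arguments are made up for illustration; which ref you use for Linus' origin on a given day is up to you.

```python
import re
import subprocess


def lines_changed(shortstat: str) -> int:
    """Sum insertions and deletions from a `git diff --shortstat` summary line.

    git omits the insertions or deletions part when it is zero, and uses
    the singular ("1 insertion(+)") for a count of one, so both are
    treated as optional here.
    """
    ins = re.search(r"(\d+) insertions?\(\+\)", shortstat)
    dels = re.search(r"(\d+) deletions?\(-\)", shortstat)
    return (int(ins.group(1)) if ins else 0) + (int(dels.group(1)) if dels else 0)


def next_size(repo: str, base: str, next_ref: str) -> int:
    """Size of linux-next for one day: the diff from `base` to `next_ref`.

    `base` is assumed to be the commit from Linus' tree that the day's
    linux-next was built on, and `next_ref` the linux-next release itself.
    """
    out = subprocess.run(
        ["git", "-C", repo, "diff", "--shortstat", base, next_ref],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return lines_changed(out)
```

Calling `next_size()` once per linux-next release and writing out the date and the result gives one row of raw data per day.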

It seems that maintainers are working on and merging new code constantly throughout the cycle. Ideally (yeah, coz I'm the authority on this!), we wouldn't see a lot of new code hit linux-next just before the merge window opens, as new code should hopefully be being tested at this point. If the rate of new code was slowing down before the merge window, we'd see the line flatten to horizontal before the release. I guess we're hacking until the last minute, who would have thought!?!? ;-)

The peaks of linux-next seem to be a reasonable predictor of the relative size of the following kernel release, i.e. if linux-next is bigger, so is the following release, although it's not perfect (e.g. 2.6.29 vs 2.6.32).

Release         Actual line changes   linux-next changes          linux-next/Actual %
2.6.29          1879345               1222635 (peak at 2.6.28)    65%
2.6.30          1547035               1168031 (peak at 2.6.29)    76%
2.6.31          1419059               1118892 (peak at 2.6.30)    79%
2.6.32 (-rc8)   1618369               1247456 (peak at 2.6.31)    77%
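
The last column is just the linux-next peak divided by the actual line changes, rounded to the nearest percent. A quick sketch that recomputes it from the numbers in the table (the `RELEASES` list and `next_share` helper are only names I've picked for this example):

```python
# (release, actual line changes, linux-next peak before that release),
# copied from the table above.
RELEASES = [
    ("2.6.29", 1879345, 1222635),
    ("2.6.30", 1547035, 1168031),
    ("2.6.31", 1419059, 1118892),
    ("2.6.32 (-rc8)", 1618369, 1247456),
]


def next_share(actual: int, next_peak: int) -> int:
    """linux-next peak as a whole-number percentage of the actual release size."""
    return round(100 * next_peak / actual)


for release, actual, next_peak in RELEASES:
    print(f"{release:<14} {next_share(actual, next_peak)}%")
```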

These last two ideas are interesting to combine. When a release is delayed, it results in more code for the following release, since code is being developed right up until the merge window opens. So delaying a release is a double-edged sword. It improves the current release (more testing/debugging), but makes the following release bigger. If we were developing earlier in the cycle and then just testing as the merge window approached, we wouldn't have this phenomenon. I suspect this is already known, but hopefully this backs it up a bit.

I haven't attempted to confirm what Rusty noticed about hackers working more on weekends but if someone wants to analyse the raw data....

Since I've got this scripted up, I'll endeavour to keep this graph updated out here.

