On Mon, 23 Jan 2012 18:12:18 +1100, Martin Pool <...@sourcefrog.net> wrote:
On 23 January 2012 17:41, Eli Zaretskii <...@gnu.org> wrote:
I think the shortest path to a nontrivial answer to that is to run the
command with -Dhpss and have a look at what the traffic actually
comprises.
--
Martin
On Mon, 23 Jan 2012 04:44:33 -0500, Eli Zaretskii <...@gnu.org> wrote:
OK, but how can I run that command, when the offending revision is
already in my repository? What am I missing?
On Mon, 23 Jan 2012 11:18:22 +0100, John Arbash Meinel <...@arbash-meinel.com> wrote:
Some other bits that you could do:
1) Look at .bzr/repository/packs/*
There should be a fairly recent file that contains that pull. See what
size it is. If this is comparable to the 58MB then we genuinely
downloaded content that we wanted to keep. If it is a lot smaller,
then it is possible that on the server side it is stored
inefficiently, and we copied it, and compressed it on the fly before
writing it locally.
2) Use the name of that file to then inspect the associated index
files (tix = text content, rix = revision, cix = inventory stuff,
iix/six are probably not very interesting). For example, my most
recent file in bzr is: c0ba9a41c20d1b447d3b603361b63bbf.pack
You can use
    head -n5 .bzr/repository/indices/c0ba9a41c20d1b447d3b603361b63bbf.tix
to get the summary information, and you can use this to get the detail:
    bzr dump-btree [--raw] .bzr/repository/indices/c0ba9a41c20d1b447d3b603361b63bbf.tix
Note that it will probably be a bit verbose, but if you look around in
it, you can see how many files are affected, etc. In my case, a 2.7MB
bzr pack file had 1712 entries in the .tix (1700 files were affected),
549 entries in .rix (549 total revisions), and 2,159 entries in .cix
(which has to do with inventory management.)
With the raw data, you can start working out what the size on disk
actually consists of.
3) If you want to test the fetch again, you can create a new
repository, and branch your old revision into it (so it shouldn't copy
any new data) and then do the fetch again. So something like:
    bzr branch -r 106888 . ../../somewhere-not-in-the-shared-repo --no-tree
    cd ../../somewhere-not-in-the-shared-repo
    bzr pull bzr+...@bzr.savannah.gnu.org/emacs/trunk -Dhpss
John
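Steps 1 and 2 above both start from the most recent pack file and its size. A minimal sketch of that lookup, assuming a POSIX shell and the default pack directory named in step 1; the `newest_pack` function is a hypothetical helper for illustration, not a bzr command:

```shell
# newest_pack [DIR]: print "<size-in-bytes> <pack-name>" for the most
# recently modified *.pack file in DIR (default: .bzr/repository/packs).
# Hypothetical helper for illustration; it is not part of bzr.
newest_pack() {
    dir=${1:-.bzr/repository/packs}
    # ls -t sorts by modification time, newest first
    newest=$(ls -t "$dir"/*.pack 2>/dev/null | head -n 1)
    # No pack files found: report failure instead of printing garbage
    [ -n "$newest" ] || return 1
    size=$(wc -c < "$newest" | tr -d ' ')
    # Strip the .pack suffix so the name can be reused for the index files
    printf '%s %s\n' "$size" "$(basename "$newest" .pack)"
}
```

Running `newest_pack` from the top of a branch prints the byte count to compare against the amount downloaded, plus the hash-like name shared by the matching .tix, .rix and .cix index files.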
On Mon, 23 Jan 2012 05:50:25 -0500, Eli Zaretskii <...@gnu.org> wrote:
The pack is this:
-rw-rw-rw- 1 eliz eliz 28040427 Jan 19 06:35 36bfdda5be84a32615e6db8f9eaabed3.pack
I verified (by looking at .bzr.log) that there was no repacking since
then.
What am I looking for, though? E.g., the .rix index corresponding to
the above pack has 35 revisions, while the corresponding .tix file has
4088 texts. Is the latter unusually large?
Will do, thanks.
On Mon, 23 Jan 2012 12:22:37 +0100, John Arbash Meinel <...@arbash-meinel.com> wrote:
...
So 28MB vs the ~58MB you said it downloaded. I can certainly come up
with a scenario where this could happen. Namely:
https://bugs.launchpad.net/bzr/+bug/402669
I believe the emacs main repository has a lot of people committing
directly to the repository, so you probably get a fair number of
single commit pack files. Which coupled with bug #402669, means that
all of those single-commits have the fulltext of all the texts
present, and that gets transmitted. Then when you receive the 35
revisions locally, you can combine any files that were changed
multiple times.
Averaged across a lot of histories (bzr, mysql, linux kernel, emacs I
think), a good heuristic is <10 texts changed per commit. Your pack is
averaging over 100 texts changed per commit (4088 texts / 35 revisions),
or about 10x normal. So yes, it is larger than expected.
repo     # texts   # revs   t/r
bzr       172249    63446   2.7
emacs     264859   118524   2.2
mysql     388608    74779   5.2
Now that is averaged over a lot of history, and development workflow
impacts this a lot. Merges, in particular, can swing it high or low.
In the case of MySQL, they tend to do a lot more merges that touch the
same files, so a merge creates one commit that touches lots of files.
While bzr tends to do merges that are orthogonal, so the actual merge
commit doesn't introduce new content, so the merge looks more like a
commit that doesn't change much.
John
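The texts-per-revision figures quoted above are plain division of the .tix entry count by the .rix entry count. A small sketch of that arithmetic, assuming a POSIX shell with awk; the `texts_per_rev` name is made up for this illustration and the inputs are the counts quoted in this thread:

```shell
# texts_per_rev TEXTS REVS: print TEXTS / REVS rounded to one decimal place.
# Hypothetical helper; the counts come from the .tix and .rix indexes.
texts_per_rev() {
    awk -v t="$1" -v r="$2" 'BEGIN {
        if (r == 0) exit 1          # refuse to divide by zero
        printf "%.1f\n", t / r
    }'
}

texts_per_rev 172249 63446   # bzr, whole history
texts_per_rev 4088 35        # the single pack discussed in this thread
```

The first call prints 2.7, matching the table, and the second prints 116.8, which is where the "about 10x normal" reading for the 35-revision pack comes from.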
On Mon, 23 Jan 2012 06:55:31 -0500, Eli Zaretskii <...@gnu.org> wrote:
As I wrote earlier, only 7 files were changed and the diffs are less
than 200 lines.
On Mon, 23 Jan 2012 15:24:35 +0100, John Arbash Meinel <...@arbash-meinel.com> wrote:
bzr log -n0 -r 106891 shows:
    99634.2.1005 Glenn Morris 2012-01-10
        Update short copyright year to 2012 (do not merge to trunk)
and
    99634.2.1006 Glenn Morris 2012-01-10
        Add 2012 to FSF copyright years for Emacs files (do not merge to trunk)
Which together modify about 2000 files, and
    99634.21.7 Kenichi Handa 2012-01-13 [merge]
Which also has about 2000 entries (though those may not have been
modified vs trunk).
Now, it is possible that the changes introduced by 99634.2.1005 and
99634.2.1006 were reverted when they were merged to trunk. However,
that history is still part of the ancestry and those 2000 texts are
still copied around.
John
On Mon, 23 Jan 2012 19:09:52 +0200, Eli Zaretskii <...@gnu.org> wrote:
Since the log messages say "do not merge to trunk" (because the trunk
already had such Copyright changes), I assumed Glenn didn't. Glenn?
And those 2000 entries again change the Copyright notices...
So it appears we made a mess of the history by cross-merging the
same changes back and forth between the trunk and the branch, is that
so? If so, how can we avoid this in the future?
Could you perhaps take a look in the Emacs trunk at admin/bzrmerge.el
(which is used to do these merges), and give your opinion about the
method it uses?
Thanks.
On Mon, 23 Jan 2012 22:48:38 +0200, Eli Zaretskii <...@gnu.org> wrote:
Stefan?
Of course, there's a way to cherry-pick in Bazaar. The thing is,
cherry-picks are not recorded in the history DAG, while Stefan wanted
the merged revisions to be recorded.
On Mon, 23 Jan 2012 20:35:50 -0500, Stefan Monnier <...@iro.umontreal.ca> wrote:
Yes?
Stefan
On Tue, 24 Jan 2012 06:05:13 +0200, Eli Zaretskii <...@gnu.org> wrote:
Could you please describe what "do not merge to trunk" does?
On Tue, 24 Jan 2012 13:41:34 +0100, John Arbash Meinel <...@arbash-meinel.com> wrote:
Given the specific history, I'm pretty sure this was done with
something like:
    cd trunk
    bzr merge ../other-branch
    bzr revert .
    bzr commit -m "Merge but ignore the copyright changes"
I realize the specifics may be different, but that could easily be why
you would have lots of changes in the ancestry, but not actually shown
in the trunk revisions.
John
On Tue, 24 Jan 2012 09:51:34 -0500, Stefan Monnier <...@IRO.UMontreal.CA> wrote:
It tells bzrmerge.el that this revision is one that should probably not
be merged, so bzrmerge.el prompts the user to confirm whether or not to
merge the corresponding revision.
Yes, that's pretty much what happens if the user decides not to merge
that revision.
Stefan
On Tue, 24 Jan 2012 19:12:59 +0200, Eli Zaretskii <...@gnu.org> wrote:
Could you please suggest a better way, if it exists, of doing this?
The basic requirement is to record merges from the stable branch to
trunk in the trunk history (so cherrypicking is out). This is because
we merge from the branch to the trunk from time to time and in chunks;
recording the merges in the history helps us to know where to begin
the next merge, because "bzr missing" etc. do that job.
The disadvantage of the above method, whereby a revision is merged and
then backed out from the tree, is that metadata says revision REV1 is
in the trunk, but no "bzr diff" command will ever show you any of the
changes done in that revision on the branch. The excessive size of
the transferred data in the case in point is just an extreme example
of where this can lead, but even with "normal" metadata sizes this
leads to confusing contradictions between what the various bzr
commands tell about a particular merged revision.
TIA
On Tue, 24 Jan 2012 12:42:42 -0500, Stefan Monnier <...@IRO.UMontreal.CA> wrote:
There is no better way!
Think about it:
if the patch really was already applied (and assuming "bzr merge"
doesn't end up generating spurious conflicts), then
    cd trunk
    bzr merge ../other-branch
    bzr revert .
    bzr commit -m "Merge but ignore the copyright changes"
will do the same as
    cd trunk
    bzr merge ../other-branch
    bzr commit -m "Merge but ignore the copyright changes"
because after "bzr merge" there will be no local changes.
Stefan
On Tue, 24 Jan 2012 23:14:52 +0200, Eli Zaretskii <...@gnu.org> wrote:
Yes, I know you think that, but I wanted a second opinion.
In any case, merging both from trunk to branch and back is not a good
idea, as the example of copyright changes demonstrates.
On Tue, 24 Jan 2012 16:57:41 -0500, Glenn Morris <...@gnu.org> wrote:
That isn't really what happened. I'd already updated the trunk under the
assumption that the emacs-23 branch was dead, then it was decided to
make another release from that branch and I was asked to make the same
change there [1]. Since we have already established that there is no
other way to do the merge emacs-23 -> trunk, I don't see any other way
this could have been done.
Nobody else seems particularly interested in either of these tedious
jobs (updating years, merging between branches) so I did them to the
best of my abilities.
Anyway, I don't think this is on-topic for the bazaar list...
[1] Except it's not literally the same change because the notices have
different formats in the two branches. Last year I tried making the
change in emacs-23 and then merging it to trunk, and it was a huge PITA
resolving conflicts in thousands (literally) of files.
So I would not have done it that way this year even had I known in
advance that I would have to do it on two branches. Sorry, 28.8k modem
users.
http://lists.gnu.org/archive/html/savannah-hackers-public/2012-01/msg00007.html
On Wed, 25 Jan 2012 06:01:34 +0200, Eli Zaretskii <...@gnu.org> wrote:
The problem that I'm talking about is that those changes got to the
trunk twice, not once. These are the relevant revisions:
    99634.2.1006 Glenn Morris 2012-01-10
        Add 2012 to FSF copyright years for Emacs files (do not merge to trunk)
    99634.21.7 Kenichi Handa 2012-01-13 [merge]
You are talking about the former, but what about the latter?
I have no doubt, and thank you.
On Tue, 24 Jan 2012 21:44:29 -0500, Stefan Monnier <...@iro.umontreal.ca> wrote:
AFAICT you did it just fine, indeed. The size of those revisions is
a bit annoying, but I don't think it's that bad. Furthermore, I don't
know how we could have done it better. AFAIK, even if we had applied
the patch to the emacs-23 branch first and then merged it into trunk, and
it had all gone smoothly (e.g. no spurious conflicts), Bzr
would have shown the same large data size on both branches, i.e. it
would not have helped.
Stefan
On Wed, 25 Jan 2012 06:10:03 +0200, Eli Zaretskii <...@gnu.org> wrote:
You are missing the point. I didn't complain about the large size of
the revision that changes all 2000 files. I complained about the
revision that changes twice as many, for some reason. See the
numbers shown by John in his analysis of the merge commit.
On Wed, 25 Jan 2012 08:27:54 -0500, Stefan Monnier <...@iro.umontreal.ca> wrote:
Same difference: if we merge emacs-23 into the trunk, then when we fetch
the trunk we'll fetch the large diff on the trunk plus the large diff on
emacs-23 because Bazaar always wants to have the complete DAG locally.
Stefan
On Wed, 25 Jan 2012 15:49:51 +0200, Eli Zaretskii <...@gnu.org> wrote:
But in this case, we fetched the large diff on the emacs-23 branch
plus _twice_ the large diff from the trunk. It is that "twice" part
that shouldn't have happened.
On Wed, 25 Jan 2012 10:03:20 -0500, Stefan Monnier <...@iro.umontreal.ca> wrote:
I haven't followed the thread closely enough to know what you're
referring to.
Indeed, I don't think it should happen. Can you point to the data that
shows that it happened?
Stefan
On Wed, 25 Jan 2012 16:09:39 +0100, John Arbash Meinel <...@arbash-meinel.com> wrote:
So doing a revert merge has to create a new node in the text history,
because you have to assert that the text that would otherwise have
superseded the original doesn't. (rev A is the original text, rev B is
the updated copyright; you have to create a rev C that supersedes rev B
and has the content of rev A.)
However, this particular content compresses with 100% efficiency as
long as it is in the same group as text A, which will happen once we
have triggered recompression, but doesn't happen on the original
commit. (We don't delta compress at commit time, unless you trigger an
autopack.)
John
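The effect described above, where text C costs almost nothing once it is compressed alongside the identical text A, can be imitated with ordinary tools. This is only a rough sketch: it uses gzip as a stand-in and is not bzr's own group compression, and the file names A and C simply mirror the revision names used in the explanation:

```shell
# Stand-in demo using gzip, NOT bzr's own group compression.
# A = original text, C = the reverting text (same bytes as A).
work=$(mktemp -d)
# 8 KiB of incompressible data stays well inside gzip's 32 KiB window
head -c 8192 /dev/urandom > "$work/A"
cp "$work/A" "$work/C"

# Read a stream on stdin and print its length in bytes
size_of() { wc -c | tr -d ' '; }

alone=$(gzip -c < "$work/C" | size_of)                   # C compressed on its own
only_a=$(gzip -c < "$work/A" | size_of)                  # A compressed on its own
with_a=$(cat "$work/A" "$work/C" | gzip -c | size_of)    # A and C in one stream
extra=$((with_a - only_a))                               # marginal cost of C next to A

echo "C alone: $alone bytes; C next to A: about $extra extra bytes"
rm -rf "$work"
```

Compressed separately, C costs as much as A again; compressed in the same stream, it shrinks to a handful of back-references. That is the difference between a freshly committed text and one that has been through recompression.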
On Wed, 25 Jan 2012 20:12:51 +0200, Eli Zaretskii <...@gnu.org> wrote:
Look at the diffs of these two revisions on the trunk:
    99634.2.1006 Glenn Morris 2012-01-10
        Add 2012 to FSF copyright years for Emacs files (do not merge to trunk)
    99634.21.7 Kenichi Handa 2012-01-13 [merge]