src/ceph/doc/changelog/v0.61.8.txt

   1 commit d783e33b672ec324eb48d588f956da0c51ff5dac
   2 Author: Gary Lowell <gary.lowell@inktank.com>
   3 Date:   Sun Aug 18 23:54:49 2013 -0700
   4
   5     v0.61.8
   6
   7 commit 21a6e2479133a3debb9ab9057ff9fae70c9eede9
   8 Author: Samuel Just <sam.just@inktank.com>
   9 Date:   Thu Aug 8 15:12:46 2013 -0700
  10
  11     RadosClient: shutdown monclient after dropping lock
  12
  13     Otherwise, the monclient shutdown may deadlock waiting
  14     on a context trying to take the RadosClient lock.
  15
  16     Fixes: #5897
  17     Signed-off-by: Samuel Just <sam.just@inktank.com>
  18     Reviewed-by: Sage Weil <sage@inktank.com>
  19     (cherry picked from commit 0aacd10e2557c55021b5be72ddf39b9cea916be4)
  20
  21 commit 64bef4ae4bab28b0b82a1481381b0c68a22fe1a4
  22 Author: Sage Weil <sage@inktank.com>
  23 Date:   Sat Aug 17 09:05:32 2013 -0700
  24
  25     mon/OSDMonitor: make 'osd pool mksnap ...' not expose uncommitted state
  26
  27     [This is a backport of d1501938f5d07c067d908501fc5cfe3c857d7281]
  28
  29     We were returning success without waiting if the pending pool state had
  30     the snap.
  31
  32     Signed-off-by: Sage Weil <sage@inktank.com>
  33     Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
  34
  35 commit 411871f6bcc9a4b81140c2e98d13dc123860f6f7
  36 Author: Sage Weil <sage@inktank.com>
  37 Date:   Fri Aug 16 10:52:02 2013 -0700
  38
  39     mon/OSDMonitor: make 'osd pool rmsnap ...' not racy/crashy
  40
  41     NOTE: This is a manual backport of d90683fdeda15b726dcf0a7cab7006c31e99f14.
  42     Due to all kinds of collateral changes in the mon the original patch
  43     doesn't cleanly apply.
  44
  45     Ensure that the snap does in fact exist before we try to remove it.  This
  46     avoids a crash where we get two dup rmsnap requests (due to thrashing, or
  47     a reconnect, or something), the committed (p) value does have the snap, but
  48     the uncommitted (pp) does not.  This fails the old test such that we try
  49     to remove it from pp again, and assert.
  50
  51     Restructure the flow so that it is easier to distinguish the committed
  52     short return from the uncommitted return (which must still wait for the
  53     commit).
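
The restructured flow described above can be sketched as follows. This is a toy model in Python, not the monitor's C++ code: the function name `rmsnap` and the sets `committed` and `pending` are hypothetical stand-ins for the command handler and the committed (p) and uncommitted (pp) pool states.

```python
def rmsnap(committed, pending, snap):
    """Decide how to handle a 'remove snap' request.

    committed: set of snap names in the committed pool state (p)
    pending:   set of snap names in the uncommitted pool state (pp)
    Returns "done" (reply right away) or "wait" (reply after the commit).
    """
    if snap not in committed:
        # The snap is already gone in committed state: short return.
        return "done"
    if snap not in pending:
        # Duplicate request: the removal is staged but not yet committed.
        # Removing it again is what used to trip the assert, so only wait.
        return "wait"
    pending.remove(snap)  # stage the removal in the uncommitted state
    return "wait"


committed = {"snap1"}
pending = {"snap1"}
first = rmsnap(committed, pending, "snap1")   # stages the removal
second = rmsnap(committed, pending, "snap1")  # duplicate, must not crash
```

Both calls return "wait"; the second one no longer tries to remove the snap from the pending state a second time.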
  54
  55              0> 2013-07-16 14:21:27.189060 7fdf301e9700 -1 osd/osd_types.cc: In function 'void pg_pool_t::remove_snap(snapid_t)' thread 7fdf301e9700 time 2013-07-16 14:21:27.187095
  56         osd/osd_types.cc: 662: FAILED assert(snaps.count(s))
  57
  58          ceph version 0.66-602-gcd39d8a (cd39d8a6727d81b889869e98f5869e4227b50720)
  59          1: (pg_pool_t::remove_snap(snapid_t)+0x6d) [0x7ad6dd]
  60          2: (OSDMonitor::prepare_command(MMonCommand*)+0x6407) [0x5c1517]
  61          3: (OSDMonitor::prepare_update(PaxosServiceMessage*)+0x1fb) [0x5c41ab]
  62          4: (PaxosService::dispatch(PaxosServiceMessage*)+0x937) [0x598c87]
  63          5: (Monitor::handle_command(MMonCommand*)+0xe56) [0x56ec36]
  64          6: (Monitor::_ms_dispatch(Message*)+0xd1d) [0x5719ad]
  65          7: (Monitor::handle_forward(MForward*)+0x821) [0x572831]
  66          8: (Monitor::_ms_dispatch(Message*)+0xe44) [0x571ad4]
  67          9: (Monitor::ms_dispatch(Message*)+0x32) [0x588c52]
  68          10: (DispatchQueue::entry()+0x549) [0x7cf1d9]
  69          11: (DispatchQueue::DispatchThread::entry()+0xd) [0x7060fd]
  70          12: (()+0x7e9a) [0x7fdf35165e9a]
  71          13: (clone()+0x6d) [0x7fdf334fcccd]
  72          NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
  73
  74     Signed-off-by: Sage Weil <sage@inktank.com>
  75     Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
  76
  77 commit 50698d1862065c8d74338fd08c7e5af66e222490
  78 Author: Sage Weil <sage@inktank.com>
  79 Date:   Tue Aug 13 12:52:41 2013 -0700
  80
  81     librados: fix async aio completion wakeup
  82
  83     For aio flush, we register a wait on the most recent write.  The write
  84     completion code, however, was *only* waking the waiter if they were waiting
  85     on that write, without regard to previous writes (completed or not).
  86     For example, we might have 6 and 7 outstanding and wait on 7.  If they
  87     finish in order all is well, but if 7 finishes first we do the flush
  88     completion early.  Similarly, if we
  89
  90      - start 6
  91      - start 7
  92      - finish 7
  93      - flush; wait on 7
  94      - finish 6
  95
  96     we can hang forever.
  97
  98     Fix by doing any completions that are prior to the oldest pending write in
  99     the aio write completion handler.
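
A minimal sketch of the fixed wakeup rule, assuming writes carry increasing sequence numbers. The class `AioWriteTracker` and its method names are hypothetical and only model the bookkeeping; this is not the librados implementation.

```python
class AioWriteTracker:
    """Toy model of write/flush bookkeeping."""

    def __init__(self):
        self.last_seq = 0     # sequence number of the most recent write
        self.pending = set()  # sequence numbers of writes still in flight
        self.waiters = {}     # seq -> list of flush callbacks waiting on it

    def start_write(self):
        self.last_seq += 1
        self.pending.add(self.last_seq)
        return self.last_seq

    def flush(self, callback):
        if not self.pending:
            callback()  # nothing outstanding, so the flush is complete
            return
        # Register a wait on the most recent write.
        self.waiters.setdefault(self.last_seq, []).append(callback)

    def finish_write(self, seq):
        self.pending.discard(seq)
        # Complete every waiter registered on a write that is older than
        # the oldest pending write; if nothing is pending, complete all.
        oldest = min(self.pending) if self.pending else None
        for waited in sorted(self.waiters):
            if oldest is not None and oldest <= waited:
                break
            for cb in self.waiters.pop(waited):
                cb()


events = []
t = AioWriteTracker()
a = t.start_write()                         # plays the role of write 6
b = t.start_write()                         # plays the role of write 7
t.finish_write(b)                           # 7 finishes first
t.flush(lambda: events.append("flushed"))   # flush waits on 7
early = list(events)                        # still empty: 6 is pending
t.finish_write(a)                           # finishing 6 wakes the flush
```

With the old "only wake waiters on this exact write" rule, the last call would have found no waiter on write 6 and the flush would never complete.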
 100
 101     Refs: #5919
 102
 103     Signed-off-by: Sage Weil <sage@inktank.com>
 104     Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
 105     Tested-by: Oliver Francke <Oliver.Francke@filoo.de>
 106     (cherry picked from commit 16ed0b9af8bc08c7dabead1c1a7c1a22b1fb02fb)
 107
 108 commit ef731dfc84a71d3c3262f5cff9a9d33a60255485
 109 Author: Josh Durgin <josh.durgin@inktank.com>
 110 Date:   Mon Aug 12 19:17:09 2013 -0700
 111
 112     librados: fix locking for AioCompletionImpl refcounting
 113
 114     Add an already-locked helper so that C_Aio{Safe,Complete} can
 115     increment the reference count when their caller holds the
 116     lock. C_AioCompleteAndSafe's caller is not holding the lock, so call
 117     regular get() to ensure no racing updates can occur.
 118
 119     This eliminates all direct manipulations of AioCompletionImpl->ref,
 120     and makes the necessary locking clear.
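
The locked and already-locked helper pattern can be illustrated with a small Python analogue. The class `Completion` and the method names `get` and `_get` are hypothetical here; the real code is C++ and may differ in detail.

```python
import threading


class Completion:
    """Toy refcounted object with explicit locking rules."""

    def __init__(self):
        self.lock = threading.Lock()
        self.ref = 1

    def get(self):
        # For callers that do NOT hold self.lock.
        with self.lock:
            self._get()

    def _get(self):
        # For callers that already hold self.lock.
        # Note: locked() only reports that *some* thread holds the lock.
        assert self.lock.locked(), "caller must hold the lock"
        self.ref += 1


c = Completion()
c.get()          # caller without the lock
with c.lock:
    c._get()     # caller that already holds the lock
```

Routing every increment through one of the two helpers means no caller touches `ref` directly, which is the property the commit message describes.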
 121
 122     The only place C_AioCompleteAndSafe is used is in handling
 123     aio_flush_async(). This could cause a missing completion.
 124
 125     Refs: #5919
 126     Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
 127     Reviewed-by: Sage Weil <sage@inktank.com>
 128     Tested-by: Oliver Francke <Oliver.Francke@filoo.de>
 129     (cherry picked from commit 7a52e2ff5025754f3040eff3fc52d4893cafc389)
 130
 131 commit 32631685199f2e47c2ba0ed27d16eff80fa6917d
 132 Author: Sage Weil <sage@inktank.com>
 133 Date:   Fri Jul 12 14:27:04 2013 -0700
 134
 135     mon/Paxos: bootstrap peon too if monmap updates
 136
 137     If we get a monmap update, the leader bootstraps.  Peons should do the
 138     same.
 139
 140     Signed-off-by: Sage Weil <sage@inktank.com>
 141     Reviewed-by: Greg Farnum <greg@inktank.com>
 142     (cherry picked from commit efe5b67bb700ef6218d9579abf43cc9ecf25ef52)
 143
 144 commit 1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e
 145 Author: Sage Weil <sage@inktank.com>
 146 Date:   Tue Jun 25 13:16:45 2013 -0700
 147
 148     osd: fix race when queuing recovery ops
 149
 150     Previously we would sample how many ops to start under the lock, drop it,
 151     and start that many.  This is racy because multiple threads can jump in
 152     and we start too many ops.  Instead, claim as many slots as we can and
 153     release them back later if we do not end up using them.
 154
 155     Take care to re-wake the work-queue since we are releasing more resources
 156     for wq use.
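
A sketch of the claim-then-release idea, under the assumption that a simple counter guards the number of active recovery ops. The class `RecoverySlots` and its methods are hypothetical and are not taken from the OSD source.

```python
import threading


class RecoverySlots:
    def __init__(self, max_active):
        self.lock = threading.Lock()
        self.max_active = max_active
        self.active = 0

    def claim(self):
        """Atomically reserve all free slots; return how many were taken."""
        with self.lock:
            n = max(self.max_active - self.active, 0)
            self.active += n
            return n

    def release(self, n):
        """Give back slots that were claimed but not used."""
        with self.lock:
            self.active -= n
        # A real work queue would be re-woken here, since capacity is free.


slots = RecoverySlots(max_active=5)
first = slots.claim()       # takes all 5 slots
second = slots.claim()      # a racing thread finds nothing left
slots.release(first - 2)    # only 2 ops were actually started
third = slots.claim()       # the 3 returned slots can be claimed again
```

Because the count is updated inside the same critical section that reads it, two threads can never both believe the same slot is free.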
 157
 158     Signed-off-by: Sage Weil <sage@inktank.com>
 159     Reviewed-by: Samuel Just <sam.just@inktank.com>
 160     (cherry picked from commit 01d3e094823d716be0b39e15323c2506c6f0cc3b)
 161
 162 commit 4433f9ad8b338b6a55e205602434b307287bfaa3
 163 Author: Sage Weil <sage@inktank.com>
 164 Date:   Mon Jun 24 16:37:29 2013 -0700
 165
 166     osd: tolerate racing threads starting recovery ops
 167
 168     We sample the (max - active) recovery ops to know how many to start, but
 169     do not hold the lock over the full duration, such that it is possible to
 170     start too many ops.  This isn't problematic except that our condition
 171     checks for being == max but not beyond it, and we will continue to start
 172     recovery ops when we shouldn't.  Fix this by adjusting the conditional
 173     to be <=.
 174
 175     Reported-by: Stefan Priebe <s.priebe@profihost.ag>
 176     Signed-off-by: Sage Weil <sage@inktank.com>
 177     Reviewed-by: David Zafman <david.zafman@inktank.com>
 178     (cherry picked from commit 3791a1e55828ba541f9d3e8e3df0da8e79c375f9)
 179
 180 commit 0964d53ef3e8e386e0a1635d2240aefad7b8e2c1
 181 Author: Sage Weil <sage@inktank.com>
 182 Date:   Fri Aug 9 18:02:32 2013 -0700
 183
 184     ceph-disk: fix mount options passed to move_mount
 185
 186     Commit 6cbe0f021f62b3ebd5f68fcc01a12fde6f08cff5 added a mount_options but
 187     in certain cases it may be blank.  Fill in with the defaults, just as we
 188     do in mount().
 189
 190     Backport: cuttlefish
 191     Reviewed-by: Dan Mick <dan.mick@inktank.com>
 192     Signed-off-by: Sage Weil <sage@inktank.com>
 193     (cherry picked from commit cb50b5a7f1ab2d4e7fdad623a0e7769000755a70)
 194
 195 commit d6be5ed2601b8cf45570afe7ca75ce5aba3f8b4f
 196 Author: Yehuda Sadeh <yehuda@inktank.com>
 197 Date:   Mon Aug 12 10:05:44 2013 -0700
 198
 199     rgw: fix multi delete
 200
 201     Fixes: #5931
 202     Backport: bobtail, cuttlefish
 203
 204     Fix a bad check, where we compare the wrong field. Instead of
 205     comparing the ret code to 0, we compare the string value to 0
 206     which generates implicit casting, hence the crash.
 207
 208     Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
 209     Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
 210     (cherry picked from commit f9f1c48ad799da2b4be0077bf9d61ae116da33d7)
 211
 212     Conflicts:
 213         src/rgw/rgw_rest_s3.cc
 214
 215 commit ecaa46a13837305b9382ab319d43890729c54f1e
 216 Author: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
 217 Date:   Tue Jul 23 21:56:09 2013 +0200
 218
 219     ceph.spec.in: obsolete ceph-libs only on the affected distro
 220
 221     The ceph-libs package existed only on Red Hat based distros,
 222     there was e.g. never such a package on SUSE. Therefore: make
 223     sure the 'Obsoletes' is only set on these affected distros.
 224
 225     Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
 226
 227 commit 81aa68c309a6d4eaecc54f8d735efde8843fed8c
 228 Author: Gary Lowell <glowell@inktank.com>
 229 Date:   Wed Jul 3 11:28:28 2013 -0700
 230
 231     ceph.spec.in:  Obsolete ceph-libs
 232
 233     Signed-off-by: Gary Lowell  <gary.lowell@inktank.com>
 234     Reviewed-by: Sage Weil <sage@inktank.com>
 235
 236 commit 2a34df68bb02d14f6a25bd13dff600a4d629ad05
 237 Author: Joao Eduardo Luis <joao.luis@inktank.com>
 238 Date:   Fri Aug 9 14:48:15 2013 -0700
 239
 240     common: pick_addresses: fix bug with observer class that triggered #5205
 241
 242     The Observer class we defined to observe conf changes and thus avoid
 243     triggering #5205 (as fixed by eb86eebe1ba42f04b46f7c3e3419b83eb6fe7f9a),
 244     was always returning the same const static array, which would lead us to
 245     always populate the observer's list with an observer for 'public_addr'.
 246
 247     This would of course become a problem when trying to obtain the observer
 248     for 'cluster_addr' during md_config_t::set_val() -- thus triggering the
 249     same assert as initially reported on #5205.
 250
 251     Backport: cuttlefish
 252     Fixes: #5205
 253
 254     Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
 255     Reviewed-by: Sage Weil <sage@inktank.com>
 256     (cherry picked from commit 7ed6de9dd7aed59f3c5dd93e012cf080bcc36d8a)
 257
 258 commit 1243c9749ed27850c5d041023780efcdf7b31a68
 259 Author: Alfredo Deza <alfredo@deza.pe>
 260 Date:   Thu Aug 8 16:09:26 2013 -0700
 261
 262     make sure we are using the mount options
 263
 264     Signed-off-by: Alfredo Deza <alfredo@deza.pe>
 265     (cherry picked from commit 34831d0989d4bcec4920068b6ee09ab6b3234c91)
 266
 267 commit a9a370be2d8155b696ebe2866febb0571da5740f
 268 Author: Samuel Just <sam.just@inktank.com>
 269 Date:   Fri Aug 2 11:58:52 2013 -0700
 270
 271     PG: set !flushed in Reset()
 272
 273     Otherwise, we might serve a pull before we start_flush in the
 274     ReplicaActive constructor.
 275
 276     Fixes: #5799
 277     Signed-off-by: Samuel Just <sam.just@inktank.com>
 278     Reviewed-by: Sage Weil <sage@inktank.com>
 279     (cherry picked from commit 9e7d6d547e0e8a6db6ba611882afa9bf74ea0195)
 280
 281 commit 65bfa4941f983c988837cd010f731966ff53fd19
 282 Author: Sage Weil <sage@inktank.com>
 283 Date:   Fri Jul 26 14:02:07 2013 -0700
 284
 285     osd: make open classes on start optional
 286
 287     This is cuttlefish; default to the old behavior!
 288
 289     Signed-off-by: Sage Weil <sage@inktank.com>
 290     (cherry picked from commit 6f996223fb34650771772b88355046746f238cf2)
 291
 292 commit e8253ae5451b1c8e3d7d50199b8db7b2d4c66486
 293 Author: Sage Weil <sage@inktank.com>
 294 Date:   Fri Jul 26 13:58:46 2013 -0700
 295
 296     osd: load all classes on startup
 297
 298     This avoids creating a wide window between when ceph-osd is started and
 299     when a request arrives needing a class and it is loaded.  In particular,
 300     upgrading the packages in that window may cause linkage errors (if the
 301     class API has changed, for example).
 302
 303     Fixes: #5752
 304     Signed-off-by: Sage Weil <sage@inktank.com>
 305     Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
 306     (cherry picked from commit c24e652d8c5e693498814ebe38c6adbec079ea36)
 307
 308     Conflicts:
 309         src/osd/ClassHandler.cc
 310
 311 commit 7a1d6d3e727fd8b6947c658e171bf7ec31cd7966
 312 Author: Sage Weil <sage@inktank.com>
 313 Date:   Sun Jul 28 15:42:08 2013 -0700
 314
 315     ceph_test_rados: print version banner on startup
 316
 317     It is helpful when looking at qa run logs to see what version of the
 318     tester is running.
 319
 320     Signed-off-by: Sage Weil <sage@inktank.com>
 321     (cherry picked from commit 12c1f1157c7b9513a3d9f716a8ec62fce00d28f5)
 322
 323 commit 86769f05ccc54bfec403bb9ea9a3a951bbcea301
 324 Author: Sage Weil <sage@inktank.com>
 325 Date:   Thu Jun 13 22:08:36 2013 -0700
 326
 327     ceph_test_rados: add --pool <name> arg
 328
 329     Signed-off-by: Sage Weil <sage@inktank.com>
 330     (cherry picked from commit bcfbd0a3ffae6947464d930f636c8b35d1331e9d)
 331
 332 commit b70a9abc5e3ae01204256f414bd7e69d083ed7c6
 333 Author: Sage Weil <sage@inktank.com>
 334 Date:   Fri Jul 26 14:07:02 2013 -0700
 335
 336     upstart: stop ceph-create-keys when the monitor stops
 337
 338     This avoids lingering ceph-create-keys tasks.
 339
 340     Backport: cuttlefish
 341     Signed-off-by: Sage Weil <sage@inktank.com>
 342     (cherry picked from commit a90a2b42db8de134b8ea5d81cab7825fb9ec50b4)
 343
 344 commit 5af48dc7c7e3a0d7f7bc22af58831d58d165e657
 345 Author: Samuel Just <sam.just@inktank.com>
 346 Date:   Fri Jul 26 13:42:27 2013 -0700
 347
 348     FileStore: fix fd leak in _check_global_replay_guard
 349
 350     Bug introduced in f3f92fe21061e21c8b259df5ef283a61782a44db.
 351
 352     Fixes: #5766
 353     Backport: cuttlefish
 354     Signed-off-by: Samuel Just <sam.just@inktank.com>
 355     Reviewed-by: Sage Weil <sage@inktank.com>
 356     (cherry picked from commit c562b72e703f671127d0ea2173f6a6907c825cd1)
 357
 358 commit 17aa2d6d16c77028bae1d2a77903cdfd81efa096
 359 Author: Sage Weil <sage@inktank.com>
 360 Date:   Thu Jul 25 11:10:53 2013 -0700
 361
 362     mon/Paxos: share uncommitted value when leader is/was behind
 363
 364     If the leader has an older lc than we do, and we are sharing states to
 365     bring them up to date, we still want to also share our uncommitted value.
 366     This particular case was broken by b26b7f6e, which was only contemplating
 367     the case where the leader was ahead of us or at the same point as us, but
 368     not the case where the leader was behind.  Note that the call to
 369     share_state() a few lines up will bring them fully up to date, so
 370     after they receive and store_state() for this message they will be at the
 371     same lc as we are.
 372
 373     Fixes: #5750
 374     Backport: cuttlefish
 375     Signed-off-by: Sage Weil <sage@inktank.com>
 376     Reviewed-by: Greg Farnum <greg@inktank.com>
 377     Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
 378     (cherry picked from commit 05b6c7e8645081f405c616735238ae89602d3cc6)
 379
 380 commit 09a664e25391dbad9a479bae33904d28231f429d
 381 Merge: 8f010af b0535fc
 382 Author: Sage Weil <sage@inktank.com>
 383 Date:   Thu Jul 25 15:21:31 2013 -0700
 384
 385     Merge remote-tracking branch 'gh/cuttlefish-next' into cuttlefish
 386
 387 commit b0535fcf854c5042d6b5ff481aabca08026d8f7f
 388 Author: Samuel Just <sam.just@inktank.com>
 389 Date:   Tue Jul 23 18:04:40 2013 -0700
 390
 391     HashIndex: reset attr upon split or merge completion
 392
 393     A replay of an in progress merge or split might make
 394     our counts unreliable.
 395
 396     Fixes: #5723
 397     Signed-off-by: Samuel Just <sam.just@inktank.com>
 398     Reviewed-by: Sage Weil <sage@inktank.com>
 399     (cherry picked from commit 0dc3efdd885377a07987d868af5bb7a38245c90b)
 400
 401 commit 8f73302b4e637ca8b85d68ea7503279faecb57d8
 402 Author: Samuel Just <sam.just@inktank.com>
 403 Date:   Tue Jul 23 17:34:25 2013 -0700
 404
 405     test/filestore/store_test: add test for 5723
 406
 407     Signed-off-by: Samuel Just <sam.just@inktank.com>
 408     Reviewed-by: Sage Weil <sage@inktank.com>
 409     (cherry picked from commit 37a4c4af54879512429bb114285bcb4c7c3488d5)
 410
 411     Conflicts:
 412         src/os/LFNIndex.cc
 413         src/test/filestore/store_test.cc
 414
 415 commit 6a7b9e5f0c1d2344209c69ab9992f94221a16468
 416 Author: Samuel Just <sam.just@inktank.com>
 417 Date:   Tue Jul 23 13:51:26 2013 -0700
 418
 419     FileStore::_collection_rename: fix global replay guard
 420
 421     If the rename is being replayed, we might have already
 422     performed the rename, skip it.  Also, we must set the
 423     collection replay guard only after we have done the
 424     rename.
 425
 426     Signed-off-by: Samuel Just <sam.just@inktank.com>
 427     Reviewed-by: Sage Weil <sage@inktank.com>
 428     (cherry picked from commit 870c474c5348831fcb13797d164f49682918fb30)
 429
 430 commit 7d98651775265896c22bacfc4afcfccbb0128470
 431 Author: Samuel Just <sam.just@inktank.com>
 432 Date:   Mon Jul 22 13:46:10 2013 -0700
 433
 434     PGLog::rewind_divergent_log: unindex only works from tail, index() instead
 435
 436     Fixes: #5714
 437     Signed-off-by: Samuel Just <sam.just@inktank.com>
 438     Reviewed-by: Sage Weil <sage@inktank.com>
 439     (cherry picked from commit 6957dbc75cc2577652b542aa3eae69f03060cb63)
 440
 441     The original patch covered the same code in PGLog.cc.
 442
 443     Conflicts:
 444
 445         src/osd/PGLog.cc
 446         src/osd/PG.cc
 447
 448 commit 611a06ae6c9cba468db206dfc82ec883c7a394af
 449 Author: Sage Weil <sage@inktank.com>
 450 Date:   Thu Jul 18 09:55:43 2013 -0700
 451
 452     msg/Pipe: do not hold pipe_lock for verify_authorizer()
 453
 454     We shouldn't hold the pipe_lock while doing the ms_verify_authorizer
 455     upcalls.
 456
 457     Fix by unlocking a bit earlier, and verifying our state is still correct
 458     in the failure path.
 459
 460     This regression was introduced by ecab4bb9513385bd765cca23e4e2fadb7ac4bac2.
 461
 462     Signed-off-by: Sage Weil <sage@inktank.com>
 463     Reviewed-by: Greg Farnum <greg@inktank.com>
 464     (cherry picked from commit 723d691f7a1f53888618dfc311868d1988f61f56)
 465
 466     Conflicts:
 467
 468         src/msg/Pipe.cc
 469
 470 commit 45bda482fa8a23f4b80d115e29d6f04cb5e226d6
 471 Author: Sage Weil <sage@inktank.com>
 472 Date:   Tue Jul 16 14:17:05 2013 -0700
 473
 474     msg/Pipe: a bit of additional debug output
 475
 476     Signed-off-by: Sage Weil <sage@inktank.com>
 477     (cherry picked from commit 16568d9e1fb8ac0c06ebaa1e1dc1d6a432a5e4d4)
 478
 479 commit 806eab59ad1a32aedb662c51de3b4a1d61fcbb62
 480 Author: Sage Weil <sage@inktank.com>
 481 Date:   Tue Jul 16 13:13:46 2013 -0700
 482
 483     msg/Pipe: hold pipe_lock during important parts of accept()
 484
 485     Previously we did not bother with locking for accept() because we were
 486     not visible to any other threads.  However, we need to close accepting
 487     Pipes from mark_down_all(), which means we need to handle interference.
 488
 489     Fix up the locking so that we hold pipe_lock when looking at Pipe state
 490     and verify that we are still in the ACCEPTING state any time we retake
 491     the lock.
 492
 493     Signed-off-by: Sage Weil <sage@inktank.com>
 494     (cherry picked from commit ecab4bb9513385bd765cca23e4e2fadb7ac4bac2)
 495
 496 commit ce6a0b74459996f91a0511a4a7147179bcd47876
 497 Author: Greg Farnum <greg@inktank.com>
 498 Date:   Wed Jul 17 15:23:12 2013 -0700
 499
 500     msgr: fix a typo/goto-cross from dd4addef2d
 501
 502     We didn't build or review carefully enough!
 503
 504     Signed-off-by: Greg Farnum <greg@inktank.com>
 505     Reviewed-by: Sage Weil <sage@inktank.com>
 506     (cherry picked from commit 1a84411209b13084b3edb87897d5d678937e3299)
 507
 508 commit 1ed51ad535612d5c444a3cc35a331f5e6a68ce30
 509 Author: Sage Weil <sage@inktank.com>
 510 Date:   Mon Jul 15 17:16:23 2013 -0700
 511
 512     msgr: close accepting_pipes from mark_down_all()
 513
 514     We need to catch these pipes too, particularly when doing a rebind(),
 515     to avoid them leaking through.
 516
 517     Signed-off-by: Sage Weil <sage@inktank.com>
 518     (cherry picked from commit 687fe888b32ac9d41595348dfc82111c8dbf2fcb)
 519
 520 commit 2f696f17a413015a3038d5aa76d18fe94f503f03
 521 Author: Sage Weil <sage@inktank.com>
 522 Date:   Mon Jul 15 17:14:25 2013 -0700
 523
 524     msgr: maintain list of accepting pipes
 525
 526     New pipes exist in a sort of limbo before we know who the peer is and
 527     add them to rank_pipe.  Keep a list of them in accepting_pipes for that
 528     period.
 529
 530     Signed-off-by: Sage Weil <sage@inktank.com>
 531     (cherry picked from commit dd4addef2d5b457cc9a58782fe42af6b13c68b81)
 532
 533 commit 540a6f49d402c1990f0e0fe9f8897dd664e79501
 534 Author: Sage Weil <sage@inktank.com>
 535 Date:   Tue Jul 16 16:25:28 2013 -0700
 536
 537     msgr: adjust nonce on rebind()
 538
 539     We can have a situation where:
 540
 541      - we have a pipe to a peer
 542      - pipe goes to standby (on peer)
 543      - we rebind to a new port
 544      - ....
 545      - we rebind again to the same old port
 546      - we connect to peer
 547
 548     and get reattached to the ancient pipe from two instances back.  Avoid that
 549     by picking a new nonce each time we rebind.
 550
 551     Add 1,000,000 each time so that the port is still legible in the printed
 552     output.
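
A small arithmetic illustration of the nonce bump. The starting value 12345 and the helper `next_nonce` are hypothetical; the only detail taken from the text above is the step of 1,000,000 per rebind.

```python
NONCE_STEP = 1_000_000


def next_nonce(nonce):
    """Return the nonce to use after a rebind."""
    return nonce + NONCE_STEP


nonce = 12345          # hypothetical starting value below NONCE_STEP
seen = [nonce]
for _ in range(3):     # three rebinds in a row
    nonce = next_nonce(nonce)
    seen.append(nonce)
```

Every value in `seen` is distinct, so a peer cannot mistake a new instance for an older one, and the low six digits stay the same across rebinds, which keeps the printed address easy to read.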
 553
 554     Signed-off-by: Sage Weil <sage@inktank.com>
 555     (cherry picked from commit 994e2bf224ab7b7d5b832485ee14de05354d2ddf)
 556
 557     Conflicts:
 558
 559         src/msg/Accepter.cc
 560
 561 commit f938a5bf604885ffba65a9b86e19258ca254e58c
 562 Author: Sage Weil <sage@inktank.com>
 563 Date:   Mon Jul 15 17:10:23 2013 -0700
 564
 565     msgr: mark_down_all() after, not before, rebind
 566
 567     If we are shutting down all old connections and binding to new ports,
 568     we want to avoid a sequence like:
 569
 570      - close all previous connections
 571      - new connection comes in on old port
 572      - rebind to new ports
 573      -> connection from old port leaks through
 574
 575     As a first step, close all connections after we shut down the old
 576     accepter and before we start the new one.
 577
 578     Signed-off-by: Sage Weil <sage@inktank.com>
 579     (cherry picked from commit 07a0860a1899c7353bb506e33de72fdd22b857dd)
 580
 581     Conflicts:
 582
 583         src/msg/SimpleMessenger.cc
 584
 585 commit 07b9ebf4212d53606ce332ff927a2ff68ed26978
 586 Author: Sage Weil <sage@inktank.com>
 587 Date:   Tue Jul 16 13:01:18 2013 -0700
 588
 589     msg/Pipe: unlock msgr->lock earlier in accept()
 590
 591     Small cleanup.  Nothing needs msgr->lock for the previously larger
 592     window.
 593
 594     Signed-off-by: Sage Weil <sage@inktank.com>
 595     (cherry picked from commit ad548e72fd94b4a16717abd3b3f1d1be4a3476cf)
 596
 597 commit ae85a0a101d624363fe761c06ecd52d3d38ba4a2
 598 Author: Sage Weil <sage@inktank.com>
 599 Date:   Tue Jul 16 10:09:02 2013 -0700
 600
 601     msg/Pipe: avoid creating empty out_q entry
 602
 603     We need to maintain the invariant that all sub queues in out_q are never
 604     empty.  Fix discard_requeued_up_to() to avoid creating an entry unless we
 605     know it is already present.
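
The invariant above can be illustrated with a Python analogue, where a `defaultdict` plays the role of a map whose lookup creates missing entries. The function names and the use of plain integers as messages are assumptions made for this sketch; it is not the Pipe code.

```python
from collections import defaultdict, deque


def discard_up_to_buggy(out_q, priority, seq):
    q = out_q[priority]          # the lookup itself creates an empty entry
    while q and q[0] <= seq:
        q.popleft()


def discard_up_to_fixed(out_q, priority, seq):
    if priority not in out_q:    # check without creating anything
        return
    q = out_q[priority]
    while q and q[0] <= seq:
        q.popleft()
    if not q:
        del out_q[priority]      # never leave an empty sub-queue behind


buggy = defaultdict(deque)
discard_up_to_buggy(buggy, "highest", 10)   # leaves an empty entry

fixed = defaultdict(deque)
discard_up_to_fixed(fixed, "highest", 10)   # leaves the map untouched
```

Code that treats "the map has an entry" as "there is something to send" is misled by the buggy variant, which is the kind of spurious reconnect trigger the commit message describes.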
 606
 607     This bug leads to an incorrect reconnect attempt when
 608
 609      - we accept a pipe (lossless peer)
 610      - they send some stuff, maybe
 611      - fault
 612      - we initiate reconnect, even though we have nothing queued
 613
 614     In particular, we shouldn't reconnect because we aren't checking for
 615     resets, and the fact that our out_seq is 0 while the peer's might be
 616     something else entirely will trigger asserts later.
 617
 618     This fixes at least one source of #5626, and possibly #5517.
 619
 620     Backport: cuttlefish
 621     Signed-off-by: Sage Weil <sage@inktank.com>
 622     (cherry picked from commit 9f1c27261811733f40acf759a72958c3689c8516)
 623
 624 commit 21e27262edc6f5f090ea8915517ee867e30b9066
 625 Author: Sage Weil <sage@inktank.com>
 626 Date:   Mon Jul 15 14:47:05 2013 -0700
 627
 628     msg/Pipe: assert lock is held in various helpers
 629
 630     These all require that we hold pipe_lock.
 631
 632     Signed-off-by: Sage Weil <sage@inktank.com>
 633     (cherry picked from commit 579d858aabbe5df88543d096ef4dbddcfc023cca)
 634
 635 commit 25f4786ac41869b3f135bd072000634765bb8919
 636 Author: Sage Weil <sage@inktank.com>
 637 Date:   Sun Jul 14 08:55:52 2013 -0700
 638
 639     msg/Pipe: be a bit more explicit about encoding outgoing messages
 640
 641     Signed-off-by: Sage Weil <sage@inktank.com>
 642     (cherry picked from commit 4282971d47b90484e681ff1a71ae29569dbd1d32)
 643
 644 commit 48105a32605aa59b6970eb89fce4ecc4201e8d04
 645 Author: Sage Weil <sage@inktank.com>
 646 Date:   Fri Jul 12 16:21:24 2013 -0700
 647
 648     msg/Pipe: fix RECONNECT_SEQ behavior
 649
 650     Calling handle_ack() here has no effect because we have already
 651     spliced sent messages back into our out queue.  Instead, pull them out
 652     of there and discard.  Add a few assertions along the way.
 653
 654     Signed-off-by: Sage Weil <sage@inktank.com>
 655     Reviewed-by: Greg Farnum <greg@inktank.com>
 656     (cherry picked from commit 495ee108dbb39d63e44cd3d4938a6ec7d11b12e3)
 657
 658 commit 1eab069017ce6b71e4bc2bb9679dbe31b50ae938
 659 Author: Sage Weil <sage@inktank.com>
 660 Date:   Mon Jun 17 13:32:38 2013 -0700
 661
 662     msgr: reaper: make sure pipe has been cleared (under pipe_lock)
 663
 664     All paths to pipe shutdown should have cleared the con->pipe reference
 665     already.  Assert as much.
 666
 667     Also, do it under pipe_lock!
 668
 669     Signed-off-by: Sage Weil <sage@inktank.com>
 670     (cherry picked from commit 9586305a2317c7d6bbf31c9cf5b67dc93ccab50d)
 671
 672 commit db06a5092bc45d0479fe492a5d592713a7c53494
 673 Author: Sage Weil <sage@inktank.com>
 674 Date:   Mon Jun 17 14:14:02 2013 -0700
 675
 676     msg/Pipe: goto fail_unlocked on early failures in accept()
 677
 678     Instead of duplicating an incomplete cleanup sequence (that does not
 679     clear_pipe()), goto fail_unlocked and do the cleanup in a generic way.
 680     s/rc/r/ while we are here.
 681
 682     Signed-off-by: Sage Weil <sage@inktank.com>
 683     (cherry picked from commit ec612a5bda119cea52bbac9b2a49ecf1e83b08e5)
 684
 685 commit 8612e50fd70bfceebd6c291e6cab10d9dfd39e8c
 686 Author: Sage Weil <sage@inktank.com>
 687 Date:   Mon Jun 17 13:32:07 2013 -0700
 688
 689     msgr: clear con->pipe inside pipe_lock on mark_down
 690
 691     We need to do this under protection of the pipe_lock.
 692
 693     Signed-off-by: Sage Weil <sage@inktank.com>
 694     (cherry picked from commit afafb87e8402242d3897069f4b94ba46ffe0c413)
 695
 696 commit 8aafe131acadc22cb069f3d98bba6922ab09c749
 697 Author: Sage Weil <sage@inktank.com>
 698 Date:   Mon Jun 17 12:47:11 2013 -0700
 699
 700     msgr: clear_pipe inside pipe_lock on mark_down_all
 701
 702     Observed a segfault in rebind -> mark_down_all -> clear_pipe -> put that
 703     may have been due to a racing thread clearing the connection_state pointer.
 704     Do the clear_pipe() call under the protection of pipe_lock, as we do in
 705     all other contexts.
 706
 707     Signed-off-by: Sage Weil <sage@inktank.com>
 708     (cherry picked from commit 5fc1dabfb3b2cbffdee3214d24d7769d6e440e45)
 709
 710     Conflicts:
 711
 712         src/msg/SimpleMessenger.cc
 713
 714 commit 2f7979d1262e9d4899be76963a1620db46b334e8
 715 Author: Samuel Just <sam.just@inktank.com>
 716 Date:   Thu Jul 18 19:26:02 2013 -0700
 717
 718     ReplicatedPG: track temp collection contents, clear during on_change
 719
 720     We also assert in on_flushed() that the temp collection is actually
 721     empty.
 722
 723     Fixes: #5670
 724     Signed-off-by: Samuel Just <sam.just@inktank.com>
 725     Reviewed-by: Sage Weil <sage@inktank.com>
 726     (cherry picked from commit 47516d9c4b7f023f3a16e166749fa7b1c7b3b24c)
 727
 728     Conflicts:
 729
 730         src/osd/ReplicatedPG.cc
 731
 732 commit c7e2945a42541f966017180684dd969389eef3ac
 733 Author: Samuel Just <sam.just@inktank.com>
 734 Date:   Thu Jul 18 19:25:14 2013 -0700
 735
 736     PG, ReplicatedPG: pass a transaction down to ReplicatedPG::on_change
 737
 738     Signed-off-by: Samuel Just <sam.just@inktank.com>
 739     Reviewed-by: Sage Weil <sage@inktank.com>
 740     (cherry picked from commit 9f56a7b8bfcb63cb4fbbc0c9b8ff01de9e518c57)
 741
 742 commit 7ffc65fc4d7d842954cf791c016fd2711f644a9c
 743 Author: Samuel Just <sam.just@inktank.com>
 744 Date:   Wed Jul 17 15:04:10 2013 -0700
 745
 746     PG: start flush on primary only after we process the master log
 747
 748     Once we start serving reads, stray objects must have already
 749     been removed.  Therefore, we have to flush all operations
 750     up to the transaction writing out the authoritative log.
 751     On replicas, we flush in Stray() if we will not eventually
 752     be activated and in ReplicaActive if we are in the acting
 753     set.  This way a replica won't serve a replica read until
 754     the store is consistent.
 755
 756     Signed-off-by: Samuel Just <sam.just@inktank.com>
 757     Reviewed-by: Sage Weil <sage@inktank.com>
 758     (cherry picked from commit b41f1ba48563d1d3fd17c2f62d10103b5d63f305)
 759
 760 commit 850da0890da5df7e670df9268afe420d0c906c38
 761 Author: Samuel Just <sam.just@inktank.com>
 762 Date:   Wed Jul 17 12:51:19 2013 -0700
 763
 764     ReplicatedPG: replace clean_up_local with a debug check
 765
 766     Stray objects should have been cleaned up in the merge_log
 767     transactions.  Only on the primary have those operations
 768     necessarily been flushed at activate().
 769
 770     Fixes: #5084
 771     Signed-off-by: Samuel Just <sam.just@inktank.com>
 772     Reviewed-by: Sage Weil <sage@inktank.com>
 773     (cherry picked from commit 278c7b59228f614addf830cb0afff4988c9bc8cb)
 774
 775 commit 95b1b5da439f1b7e2fb1886aaeec2d61532183f0
 776 Author: Samuel Just <sam.just@inktank.com>
 777 Date:   Thu Jul 18 10:12:17 2013 -0700
 778
 779     FileStore: add global replay guard for split, collection_rename
 780
 781     In the event of a split or collection rename, we need to ensure that
 782     we don't replay any operations on objects within those collections
 783     prior to that point.  Thus, we mark a global replay guard on the
 784     collection after doing a syncfs and make sure to check that in
 785     _check_replay_guard() for all object operations.
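
A rough sketch of how a per-collection guard can filter journal replay. It assumes each journaled operation carries a monotonically increasing position and that the guard records the position of the split or rename; the function `replay` and the tuple layout are hypothetical and much simpler than the FileStore logic.

```python
def replay(journal, guards):
    """Return the journal entries that should be re-applied.

    journal: list of (position, collection, description) tuples
    guards:  dict mapping collection name -> guard position
    """
    applied = []
    for pos, coll, desc in journal:
        guard = guards.get(coll)
        if guard is not None and pos < guard:
            # Happened before the split/rename point: do not replay.
            continue
        applied.append((pos, coll, desc))
    return applied


journal = [
    (1, "c1", "write obj_a"),
    (2, "c2", "write obj_b"),
    (4, "c1", "write obj_c"),
]
guards = {"c1": 3}   # collection c1 was split at position 3
result = replay(journal, guards)
```

Only the operation on `c2` and the later operation on `c1` are replayed; the earlier write to `c1` is skipped because it predates the guard.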
 786
 787     Fixes: #5154
 788     Signed-off-by: Samuel Just <sam.just@inktank.com>
 789     Reviewed-by: Sage Weil <sage@inktank.com>
 790     (cherry picked from commit f3f92fe21061e21c8b259df5ef283a61782a44db)
 791
 792     Conflicts:
 793
 794         src/os/FileStore.cc
 795
 796 commit d92a43d8ff0123b234e47a94c2ce73fcaae7f625
 797 Author: Samuel Just <sam.just@inktank.com>
 798 Date:   Mon Jul 15 13:44:20 2013 -0700
 799
 800     OSD: add config option for peering_wq batch size
 801
 802     Large peering_wq batch sizes may excessively delay
 803     peering messages resulting in unreasonably long
 804     peering.  Making the batch size configurable may speed up peering.
 805
 806     Backport: cuttlefish
 807     Related: #5084
 808     Signed-off-by: Samuel Just <sam.just@inktank.com>
 809     Reviewed-by: Sage Weil <sage@inktank.com>
 810     (cherry picked from commit 39e5a2a406b77fa82e9a78c267b679d49927e3c3)