X-Git-Url: https://gerrit.opnfv.org/gerrit/gitweb?a=blobdiff_plain;f=src%2Fceph%2Fdoc%2Fchangelog%2Fv0.48.1argonaut.txt;fp=src%2Fceph%2Fdoc%2Fchangelog%2Fv0.48.1argonaut.txt;h=0000000000000000000000000000000000000000;hb=7da45d65be36d36b880cc55c5036e96c24b53f00;hp=cdd557f97a914260d1b1231431d1131f9e3fd770;hpb=691462d09d0987b47e112d6ee8740375df3c51b2;p=stor4nfv.git
diff --git a/src/ceph/doc/changelog/v0.48.1argonaut.txt b/src/ceph/doc/changelog/v0.48.1argonaut.txt
deleted file mode 100644
index cdd557f..0000000
--- a/src/ceph/doc/changelog/v0.48.1argonaut.txt
+++ /dev/null
@@ -1,1286 +0,0 @@
-commit a7ad701b9bd479f20429f19e6fea7373ca6bba7c
-Author: Sage Weil
-Date:   Mon Aug 13 14:58:51 2012 -0700
-
-    v0.48.1argonaut
-
-commit d4849f2f8a8c213c266658467bc5f22763010bc2
-Author: Yehuda Sadeh
-Date:   Wed Aug 1 13:22:38 2012 -0700
-
-    rgw: fix usage trim call encoding
-
-    Fixes: #2841.
-    Usage trim operation was encoding the wrong op structure (usage read).
-    Since the structures somewhat overlapped it somewhat worked, but user
-    info wasn't encoded.
-
-    Backport: argonaut
-    Signed-off-by: Yehuda Sadeh
-
-commit 515952d07107d442889754ec3bd6a344fad25d58
-Author: Yehuda Sadeh
-Date:   Wed Aug 8 15:21:53 2012 -0700
-
-    cls_rgw: fix rgw_cls_usage_log_trim_op encode/decode
-
-    It was not encoding user, adding that and reset version
-    compatibility.
-    This changes affects command interface, makes use of
-    radosgw-admin usage trim incompatible. Use of old
-    radosgw-admin usage trim should be avoided, as it may
-    remove more data than requested. In any case, upgraded
-    server code will not handle old client's trim requests.
-
-    backport: argonaut
-    Signed-off-by: Yehuda Sadeh
-
-commit 2e77130d5c80220be1612b5499d422de620d2d0b
-Author: Yehuda Sadeh
-Date:   Tue Jul 31 16:17:22 2012 -0700
-
-    rgw: expand date format support
-
-    Relaxing the date format parsing function to allow UTC
-    instead of GMT.
-
-    Signed-off-by: Yehuda Sadeh
-
-commit 14fa77d9277b5ef5d0c6683504b368773b39ccc4
-Author: Yehuda Sadeh
-Date:   Thu Aug 2 11:13:05 2012 -0700
-
-    rgw: complete multipart upload can handle chunked encoding
-
-    Fixes: #2878
-    We now allow complete multipart upload to use chunked encoding
-    when sending request data. With chunked encoding the HTTP_LENGTH
-    header is not required.
-
-    Backport: argonaut
-    Signed-off-by: Yehuda Sadeh
-
-commit a06f7783fbcc02e775fc36f30e422fe0f9e0ec2d
-Author: Yehuda Sadeh
-Date:   Wed Aug 1 11:19:32 2012 -0700
-
-    rgw_xml: xml_handle_data() appends data string
-
-    Fixes: #2879.
-    xml_handle_data() appends data to the object instead of just
-    replacing it. Parsed data can arrive in pieces, specifically
-    when data is escaped.
-
-    Backport: argonaut
-    Signed-off-by: Yehuda Sadeh
-
-commit a8b224b9c4877a559ce420a2e04f19f68c8c5680
-Author: Yehuda Sadeh
-Date:   Wed Aug 1 13:09:41 2012 -0700
-
-    rgw: ETag is unquoted in multipart upload complete
-
-    Fixes #2877.
-    Removing quotes from ETag before comparing it to what we
-    have when completing a multipart upload.
-
-    Backport: argonaut
-    Signed-off-by: Yehuda Sadeh
-
-commit 22259c6efda9a5d55221fd036c757bf123796753
-Author: Josh Durgin
-Date:   Wed Aug 8 15:24:57 2012 -0700
-
-    MonMap: return error on failure in build_initial
-
-    If mon_host fails to parse, return an error instead of success.
-    This avoids failing later on an assert monmap.size() > 0 in the
-    monmap in MonClient.
-
-    Fixes: #2913
-    Signed-off-by: Josh Durgin
-
-commit 49b2c7b5a79b8fb4a3941eca2cb0dbaf22f658b7
-Author: Josh Durgin
-Date:   Wed Aug 8 15:10:27 2012 -0700
-
-    addr_parsing: report correct error message
-
-    getaddrinfo uses its return code to report failures.
-
-    Signed-off-by: Josh Durgin
-
-commit 7084f29544f431b7c6a3286356f2448ae0333eda
-Author: Sage Weil
-Date:   Wed Aug 8 14:01:53 2012 -0700
-
-    mkcephfs: use default osd_data, _journal values
-
-    Signed-off-by: Sage Weil
-    Reviewed-by: Greg Farnum
-
-commit 96b1a496cdfda34a5efdb6686becf0d2e7e3a1c0
-Author: Sage Weil
-Date:   Wed Aug 8 14:01:35 2012 -0700
-
-    mkcephfs: use new default keyring locations
-
-    The ceph-conf command only parses the conf; it does not apply default
-    config values. This breaks mkcephfs if values are not specified in the
-    config.
-
-    Let ceph-osd create its own key, fix copying, and fix creation/copying for
-    the mds.
-
-    Fixes: #2845
-    Reported-by: Florian Haas
-    Signed-off-by: Sage Weil
-    Reviewed-by: Greg Farnum
-
-commit 4bd466d6ed49c7192df4a5bf0d63bda5d7d7dd9a
-Author: Sage Weil
-Date:   Tue Jul 31 14:01:57 2012 -0700
-
-    osd: peering: detect when log source osd goes down
-
-    The Peering state has a generic check based on the prior set osds that
-    will restart peering if one of them goes down (or one of the interesting
-    down ones comes up). The GetLog state, however, can pull the log from
-    a peer that is not in the prior set if it got a notify from them (e.g., an
-    osd in an old interval that was down when the prior set was calculated).
-    If that osd goes down, we don't detect it and will block forward.
-
-    Fix by adding a simple check in GetLog for the newest_update_osd going
-    down.
-
-    (BTW GetMissing does not suffer from this problem because
-    peer_missing_requested is a subset of the prior set, so the Peering check
-    is sufficient.)
-
-    Signed-off-by: Sage Weil
-    Reviewed-by: Samuel Just
-
-commit 87defa88a0c6d6aafaa65437a6e4ddd92418f834
-Author: Sylvain Munaut
-Date:   Tue Jul 31 11:55:56 2012 -0700
-
-    rbd: fix off-by-one error in key name
-
-    Fixes: #2846
-    Signed-off-by: Sylvain Munaut
-
-commit 37d5b46269c8a4227e5df61a88579d94f7b56772
-Author: Sylvain Munaut
-Date:   Tue Jul 31 11:54:29 2012 -0700
-
-    secret: return error on empty secret
-
-    Signed-off-by: Sylvain Munaut
-
-commit 7b9d37c662313929b52011ddae47cc8abab99095
-Author: Sage Weil
-Date:   Sat Jul 28 10:05:47 2012 -0700
-
-    osd: set STRAY on pg load when non-primary
-
-    The STRAY bit indicates that we should annouce ourselves to the primary,
-    but it is only set in start_peering_interval(). We also need to set it
-    initially, so that a PG that is loaded but whose role does not change
-    (e.g., the stray replica stays a stray) will notify the primary.
-
-    Observed:
-     - osd starts up
-     - mapping does not change, STRAY not set
-     - does not announce to primary
-     - primary does not re-check must_have_unfound, objects appear unfound
-
-    Fix this by initializing STRAY when pg is loaded or created whenever we
-    are not the primary.
-
-    Fixes: #2866
-    Signed-off-by: Sage Weil
-
-commit 96feca450c5505a06868bc012fe998a03371b77f
-Author: Sage Weil
-Date:   Fri Jul 27 16:03:26 2012 -0700
-
-    osd: peering: make Incomplete a Peering substate
-
-    This allows us to still catch changes in the prior set that would affect
-    our conclusions (that we are incomplete) and, when they happen, restart
-    peering.
-
-    Consider:
-     - calc prior set, osd A is down
-     - query everyone else, no good info
-     - set down, go to Incomplete (previously WaitActingChange) state.
-     - osd A comes back up (we do nothing)
-     - osd A sends notify message with good info (we ignore)
-
-    By making this a Peering substate, we catch the Peering AdvMap reaction,
-    which will notice a prior set down osd is now up and move to Reset.
-
-    Fixes: #2860
-    Signed-off-by: Sage Weil
-
-commit a71e442fe620fa3a22ad9302413d8344a3a1a969
-Author: Sage Weil
-Date:   Fri Jul 27 15:39:40 2012 -0700
-
-    osd: peering: move to Incomplete when.. incomplete
-
-    PG::choose_acting() may return false and *not* request an acting set change
-    if it can't find any suitable peers with enough info to recover. In that
-    case, we should move to Incomplete, not WaitActingChange, just like we do
-    a bit lower in GetLog() if we have non-contiguous logs. The state name is
-    more accurate, and this is also needed to fix bug #2860.
-
-    Signed-off-by: Sage Weil
-
-commit 623026d9bc8ea4c845eb3b06d79e0ca9bef50deb
-Merge: 87b6e80 9db7809
-Author: Sage Weil
-Date:   Fri Jul 27 14:00:52 2012 -0700
-
-    Merge remote-tracking branch 'gh/stable' into stable-next
-
-commit 9db78090451e609e3520ac3e57a5f53da03f9ee2
-Author: Sage Weil
-Date:   Thu Jul 26 16:35:00 2012 -0700
-
-    osd: fixing sharing of past_intervals on backfill restart
-
-    We need to share past_intervals whenever we instantiate the PG on a peer.
-    In the PG activation case, this is based on whether our peer_info[] value
-    for that peer is dne(). However, the backfill code was updating the
-    peer info (history) in the block preceeding the dne() check, which meant
-    we never shared past_intervals in this case and the peer would have to
-    chew through a potentially large number of maps if the PG has not been
-    clean recently.
-
-    Fix by checking dne() prior to the backfill block. We still need to fill
-    in the message later because it isn't yet instantiated.
-
-    Fixes: #2849
-    Signed-off-by: Sage Weil
-    Reviewed-by: Yehuda Sadeh
-
-commit 87b6e8045a3a1ff6439d2684e960ad0dc8988b33
-Merge: 81d72e5 7dfdf4f
-Author: Sage Weil
-Date:   Thu Jul 26 15:04:12 2012 -0700
-
-    Merge remote-tracking branch 'gh/wip-rbd-bid' into stable-next
-
-commit 81d72e5d7ba4713eb7c290878d901e21c0709028
-Author: Sage Weil
-Date:   Mon Jul 23 10:47:10 2012 -0700
-
-    mon: make 'ceph osd rm ...' wipe out all state bits, not just EXISTS
-
-    This ensures that when a new osd reclaims that id it behaves as if it were
-    really new.
-
-    Backport: argonaut
-    Signed-off-by: Sage Weil
-
-commit ad9c37f2c029f6eb372efb711b234014397057e9
-Author: Sage Weil
-Date:   Mon Jul 9 20:54:19 2012 -0700
-
-    test_stress_watch: just one librados instance
-
-    This was creating a new cluster connection/session per iteration, and
-    along with it a few service threads and sockets and so forth.
-
-    Unfortunately, librados leaks like a sieve, starting with CephContext
-    and ceph::crypto::init(). See #845 and #2067.
-
-    Signed-off-by: Sage Weil
-
-commit c60afe1842a48dd75944822c0872fce6a7229f5a
-Merge: 8833050 35b1326
-Author: Sage Weil
-Date:   Thu Jul 26 15:03:50 2012 -0700
-
-    Merge commit '35b13266923f8095650f45562d66372e618c8824' into stable-next
-
-    First batch of msgr fixes.
-
-commit 88330505cc772a5528e9405d515aa2b945b0819e
-Author: Samuel Just
-Date:   Mon Jul 9 15:53:31 2012 -0700
-
-    ReplicatedPG: fix replay op ordering
-
-    After a client reconnect, the client replays outstanding ops. The
-    OSD then immediately responds with success if the op has already
-    committed (version < ReplicatedPG::get_first_in_progress).
-    Otherwise, we stick it in waiting_for_ondisk to be replied to when
-    eval_repop concludes that waitfor_disk is empty.
-
-    Fixes #2508
-
-    Signed-off-by: Samuel Just
-
-    Conflicts:
-
-        src/osd/ReplicatedPG.cc
-
-commit 682609a9343d0488788b1c6b03bc437b7905e4d6
-Author: Sage Weil
-Date:   Wed Jul 18 12:55:35 2012 -0700
-
-    objecter: always resend linger registrations
-
-    If a linger op (watch) is sent to the OSD and updates the object, and then
-    the client loses the reply, it will resend the request. The OSD will see
-    that it is a dup, however, and not set up the in-memory session state for
-    the watch. This in turn will break the watch (i.e., notifies won't
-    get delivered).
-
-    Instead, always resend linger registration ops, so that we always have a
-    unique reqid and do the correct session registeration for each session.
-
-     * track the tid of the registation op for each LingerOp
-     * mark registrations ops as should_resend=false; cancel as needed
-     * when we send a new registration op, cancel the old one to ensure we
-       ignore the reply. This is needed becuase we resend linger ops on any
-       pg change, not just a primary change.
-     * drop the first_send arg to send_linger(), as we can now infer that
-       from register_tid == 0.
-
-    The bug was easily reproduced with ms inject socket failures = 500 and the
-    test_stress_watch utility.
-
-    Fixes: #2796
-    Signed-off-by: Sage Weil
-    Reviewed-by: Josh Durgin
-
-commit 4d7d3e276967d555fed8a689976047f72c96c2db
-Author: Sage Weil
-Date:   Mon Jul 9 13:22:42 2012 -0700
-
-    osd: guard class call decoding
-
-    Backport: argonaut
-    Signed-off-by: Sage Weil
-
-commit 7fbbe4652ffb2826978aa1f1cacce4456d2ef1fc
-Author: Sage Weil
-Date:   Thu Jul 5 18:08:58 2012 -0700
-
-    librados: take lock when signaling notify cond
-
-    When we are signaling the cond to indicate that a notify is complete,
-    take the appropriate lock. This removes the possibility of a race
-    that loses our signal. (That would be very difficult given that there
-    are network round trips involved, but this makes the lock/cond usage
-    "correct.")
-
-    Signed-off-by: Sage Weil
-
-commit 6ed01df412b4f4745c8f427a94446987c88b6bef
-Author: Sage Weil
-Date:   Sun Jul 22 07:46:11 2012 -0700
-
-    workqueue: kick -> wake or _wake, depending on locking
-
-    Break kick() into wake() and _wake() methods, depending on whether the
-    lock is already held. (The rename ensures that we audit/fix all
-    callers.)
-
-    Signed-off-by: Sage Weil
-
-    Conflicts:
-
-        src/common/WorkQueue.h
-        src/osd/OSD.cc
-
-commit d2d40dc3059d91450925534f361f2c03eec9ef88
-Author: Sage Weil
-Date:   Wed Jul 4 15:11:21 2012 -0700
-
-    client: fix locking for SafeCond users
-
-    Need to wait on flock, not client_lock.
-
-    Signed-off-by: Sage Weil
-
-commit c963a21a8620779d97d6cbb51572551bdbb50d0b
-Author: Sage Weil
-Date:   Thu Jul 26 15:01:05 2012 -0700
-
-    filestore: check for EIO in read path
-
-    Check for EIO in read methods and helpers. Try to do checks in low-level
-    methods (e.g., lfn_*()) to avoid duplication in higher-level methods.
-
-    The transaction apply function already checks for EIO on writes, and will
-    generate a nicer error message, so we can largely ignore the write path,
-    as long as errors get passed up correctly.
-
-    Signed-off-by: Sage Weil
-
-commit 6bd89aeb1bf3b1cbb663107ae6bcda8a84dd8601
-Author: Sage Weil
-Date:   Thu Jul 26 09:07:46 2012 -0700
-
-    filestore: add 'filestore fail eio' option, default true
-
-    By default we will assert/fail/crash on EIO from the underlying fs. We
-    already do this in the write path, but not the read path, or in various
-    internal infrastructure.
-
-    Signed-off-by: Sage Weil
-
-commit e9b5a289838f17f75efbf9d1640b949e7485d530
-Author: Sage Weil
-Date:   Tue Jul 24 13:53:03 2012 -0700
-
-    config: fix 'config set' admin socket command
-
-    Fixes: #2832
-    Backport: argonaut
-    Signed-off-by: Sage Weil
-
-commit 1a6cd9659abcdad0169fe802ed47967467c448b3
-Author: Sage Weil
-Date:   Wed Jul 25 16:35:09 2012 -0700
-
-    osd: break potentially large transaction into pieces
-
-    We do a similar trick elsewhere. Control this via a tunable. Eventually
-    we'll control the others (in a non-stable branch).
-
-    Signed-off-by: Sage Weil
-
-commit 15e1622959f5a46f7a98502cdbaebfda2247a35b
-Author: Sage Weil
-Date:   Wed Jul 25 14:53:34 2012 -0700
-
-    osd: only commit past intervals at end of parallel build
-
-    We don't check for gaps in the past intervals, so we should only commit
-    this when we are completely done. Otherwise a partial run and rsetart will
-    leave the gap in place, which may confuse the peering code that relies on
-    this information.
-
-    Signed-off-by: Sage Weil
-
-commit 16302acefd8def98fc4597366d6ba2845e17fcb6
-Author: Sage Weil
-Date:   Wed Jul 25 10:57:35 2012 -0700
-
-    osd: generate past intervals in parallel on boot
-
-    Even though we aggressively share past_intervals with notifies etc, it is
-    still possible for an osd to get buried behind a pile of old maps and need
-    to generate these if it has been out of the cluster for a while. This has
-    happened to us in the past but, sadly, we did not merge the work then.
-    On the bright side, this implementation is much much much cleaner than the
-    old one because of the pg_interval_t helper we've since switched to.
-
-    On bootup, we look at the intervals each pg needs and calclate the union,
-    and then iterate over that map range. The inner bit of the loop is
-    functionally identical to PG::build_past_intervals(), keeping the per-pg
-    state in the pistate struct.
-
-    Backport: argonaut
-    Signed-off-by: Sage Weil
-    Reviewed-by: Yehuda Sadeh
-    Reviewed-by: Josh Durgin
-
-commit fca65ff52a5f7d49bcac83b3b2232963a879e446
-Author: Sage Weil
-Date:   Wed Jul 25 10:58:07 2012 -0700
-
-    osd: move calculation of past_interval range into helper
-
-    PG::generate_past_intervals() first calculates the range over which it
-    needs to generate past intervals. Do this in a helper function.
-
-    Signed-off-by: Sage Weil
-    Reviewed-by: Yehuda Sadeh
-    Reviewed-by: Josh Durgin
-
-commit 5979351ef3d3d03bced9286f79cbc22524c4a8de
-Author: Sage Weil
-Date:   Wed Jul 25 10:58:28 2012 -0700
-
-    osd: fix map epoch boot condition
-
-    We only want to join the cluster if we can catch up to the latest
-    osdmap with a small number of maps, in this case a single map message.
-
-    Backport: argonaut
-    Signed-off-by: Sage Weil
-    Reviewed-by: Yehuda Sadeh
-
-commit 8c7186d02627f8255273009269d50955172efb52
-Author: Sage Weil
-Date:   Tue Jul 24 20:18:01 2012 -0700
-
-    mon: ignore pgtemp messages from down osds
-
-    Signed-off-by: Sage Weil
-
-commit b17f54671f350fd4247f895f7666d46860736728
-Author: Sage Weil
-Date:   Tue Jul 24 20:16:04 2012 -0700
-
-    mon: ignore osd_alive messages from down osds
-
-    Signed-off-by: Sage Weil
-
-commit 7dfdf4f8de16155edd434534e161e06ba7c79d7d
-Author: Josh Durgin
-Date:   Mon Jul 23 14:05:53 2012 -0700
-
-    librbd: replace assign_bid with client id and random number
-
-    The assign_bid method has issues with replay because it is a write
-    that also returns data. This means that the replayed operation would
-    return success, but no data, and cause a create to fail. Instead, let
-    the client set the bid based on its global id and a random number.
-
-    This only affects the creation of new images, since the bid is put
-    into an opaque string as part of the object prefix.
-
-    Keep the server side assign_bid around in case there are old clients
-    still using it.
-
-    Signed-off-by: Josh Durgin
-
-commit dc2d67112163bee8b111f75ae3e3ca42884b09b4
-Author: Dan Mick
-Date:   Mon Jul 9 14:11:23 2012 -0700
-
-    librados: add new constructor to form a Rados object from IoCtx
-
-    This creates a separate reference to an existing connection, for
-    use when a client holding IoCtx needs to consult another (say,
-    for rbd cloning)
-
-    Signed-off-by: Dan Mick
-    Reviewed-by: Josh Durgin
-
-commit c99671201de9d9cdf03bbf0f4e28e8afb70c280c
-Author: Sage Weil
-Date:   Wed Jul 18 19:49:58 2012 -0700
-
-    add CRUSH_TUNABLES feature bit
-
-    Signed-off-by: Sage Weil
-
-commit 0b579546cfddec35095b2aec753028d8e63f3533
-Author: Josh Durgin
-Date:   Wed Jul 18 10:24:58 2012 -0700
-
-    ObjectCacher: fix cache_bytes_hit accounting
-
-    Misses are not hits!
-
-    Signed-off-by: Josh Durgin
-
-commit 2869039b79027e530c2863ebe990662685e4bbe6
-Author: Pascal de Bruijn | Unilogic Networks B.V
-Date:   Wed Jul 11 15:23:16 2012 +0200
-
-    Robustify ceph-rbdnamer and adapt udev rules
-
-    Below is a patch which makes the ceph-rbdnamer script more robust and
-    fixes a problem with the rbd udev rules.
-
-    On our setup we encountered a symlink which was linked to the wrong rbd:
-
-        /dev/rbd/mypool/myrbd -> /dev/rbd1
-
-    While that link should have gone to /dev/rbd3 (on which a
-    partition /dev/rbd3p1 was present).
-
-    Now the old udev rule passes %n to the ceph-rbdnamer script, the problem
-    with %n is that %n results in a value of 3 (for rbd3), but in a value of
-    1 (for rbd3p1), so it seems it can't be depended upon for rbdnaming.
-
-    In the patch below the ceph-rbdnamer script is made more robust and it
-    now it can be called in various ways:
-
-        /usr/bin/ceph-rbdnamer /dev/rbd3
-        /usr/bin/ceph-rbdnamer /dev/rbd3p1
-        /usr/bin/ceph-rbdnamer rbd3
-        /usr/bin/ceph-rbdnamer rbd3p1
-        /usr/bin/ceph-rbdnamer 3
-
-    Even with all these different styles of calling the modified script, it
-    should now return the same rbdname. This change "has" to be combined
-    with calling it from udev with %k though.
-
-    With that fixed, we hit the second problem. We ended up with:
-
-        /dev/rbd/mypool/myrbd -> /dev/rbd3p1
-
-    So the rbdname was symlinked to the partition on the rbd instead of the
-    rbd itself. So what probably went wrong is udev discovering the disk and
-    running ceph-rbdnamer which resolved it to myrbd so the following
-    symlink was created:
-
-        /dev/rbd/mypool/myrbd -> /dev/rbd3
-
-    However partitions would be discovered next and ceph-rbdnamer would be
-    run with rbd3p1 (%k) as parameter, resulting in the name myrbd too, with
-    the previous correct symlink being overwritten with a faulty one:
-
-        /dev/rbd/mypool/myrbd -> /dev/rbd3p1
-
-    The solution to the problem is in differentiating between disks and
-    partitions in udev and handling them slightly differently. So with the
-    patch below partitions now get their own symlinks in the following style
-    (which is fairly consistent with other udev rules):
-
-        /dev/rbd/mypool/myrbd-part1 -> /dev/rbd3p1
-
-    Please let me know any feedback you have on this patch or the approach
-    used.
-
-    Regards,
-    Pascal de Bruijn
-    Unilogic B.V.
-
-    Signed-off-by: Pascal de Bruijn
-    Signed-off-by: Josh Durgin
-
-commit 426384f6beccabf9e9b9601efcb8147904ec97c2
-Author: Sage Weil
-Date:   Mon Jul 16 16:02:14 2012 -0700
-
-    log: apply log_level to stderr/syslog logic
-
-    In non-crash situations, we want to make sure the message is both below the
-    syslog/stderr threshold and also below the normal log threshold. Otherwise
-    we get anything we gather on those channels, even when the log level is
-    low.
-
-    Signed-off-by: Sage Weil
-
-commit 8dafcc5c1906095cb7d15d648a7c1d7524df3768
-Author: Sage Weil
-Date:   Mon Jul 16 15:40:53 2012 -0700
-
-    log: fix event gather condition
-
-    We should gather an event if it is below the log or gather threshold.
-
-    Previously we were only gathering if we were going to print it, which makes
-    the dump no more useful than what was already logged.
-
-    Signed-off-by: Sage Weil
-
-commit ec5cd6def9817039704b6cc010f2797a700d8500
-Author: Samuel Just
-Date:   Mon Jul 16 13:11:24 2012 -0700
-
-    PG::RecoveryState::Stray::react(LogEvt&): reset last_pg_scrub
-
-    We need to reset the last_pg_scrub data in the osd since we
-    are replacing the info.
-
-    Probably fixes #2453
-
-    In cases like 2453, we hit the following backtrace:
-
-     0> 2012-05-19 17:24:09.113684 7fe66be3d700 -1 osd/OSD.h: In function 'void OSD::unreg_last_pg_scrub(pg_t, utime_t)' thread 7fe66be3d700 time 2012-05-19 17:24:09.095719
-    osd/OSD.h: 840: FAILED assert(last_scrub_pg.count(p))
-
-     ceph version 0.46-313-g4277d4d (commit:4277d4d3378dde4264e2b8d211371569219c6e4b)
-     1: (OSD::unreg_last_pg_scrub(pg_t, utime_t)+0x149) [0x641f49]
-     2: (PG::proc_primary_info(ObjectStore::Transaction&, pg_info_t const&)+0x5e) [0x63383e]
-     3: (PG::RecoveryState::ReplicaActive::react(PG::RecoveryState::MInfoRec const&)+0x4a) [0x633eda]
-     4: (boost::statechart::detail::reaction_result boost::statechart::simple_state, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl, boost::statechart::custom_reaction, boost::statechart::custom_reaction >, boost::statechart::simple_state, (boost::statechart::history_mode)0> >(boost::statechart::simple_state, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*)+0x130) [0x6466a0]
-     5: (boost::statechart::simple_state, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x81) [0x646791]
-     6: (boost::statechart::state_machine, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x63dfcb]
-     7: (boost::statechart::state_machine, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x11) [0x63e0f1]
-     8: (PG::RecoveryState::handle_info(int, pg_info_t&, PG::RecoveryCtx*)+0x177) [0x616987]
-     9: (OSD::handle_pg_info(std::tr1::shared_ptr)+0x665) [0x5d3d15]
-     10: (OSD::dispatch_op(std::tr1::shared_ptr)+0x2a0) [0x5d7370]
-     11: (OSD::_dispatch(Message*)+0x191) [0x5dd4a1]
-     12: (OSD::ms_dispatch(Message*)+0x153) [0x5ddda3]
-     13: (SimpleMessenger::dispatch_entry()+0x863) [0x77fbc3]
-     14: (SimpleMessenger::DispatchThread::entry()+0xd) [0x746c5d]
-     15: (()+0x7efc) [0x7fe679b1fefc]
-     16: (clone()+0x6d) [0x7fe67815089d]
-     NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
-
-    Because we don't clear the scrub state before reseting info,
-    the last_scrub_stamp state in the info.history structure
-    changes without updating the osd state resulting in the
-    above assert failure.
-
-    Backport: stable
-
-    Signed-off-by: Samuel Just
-
-commit 248cfaddd0403c7bae8e1533a3d2e27d1a335b9b
-Author: Samuel Just
-Date:   Mon Jul 9 17:57:03 2012 -0700
-
-    ReplicatedPG: don't warn if backfill peer stats don't match
-
-    pinfo.stats might be wrong if we did log-based recovery on the
-    backfilled portion in addition to continuing backfill.
-
-    bug #2750
-
-    Signed-off-by: Samuel Just
-
-commit bcb1073f9171253adc37b67ee8d302932ba1667b
-Author: Sage Weil
-Date:   Sun Jul 15 20:30:34 2012 -0700
-
-    mon/MonitorStore: always O_TRUNC when writing states
-
-    It is possible for a .new file to already exist, potentially with a
-    larger size. This would happen if:
-
-     - we were proposing a different value
-     - we crashed (or were stopped) before it got renamed into place
-     - after restarting, a different value was proposed and accepted.
-
-    This isn't so unlikely for the log state machine, where we're
-    aggregating random messages. O_TRUNC ensure we avoid getting the tail
-    end of some previous junk.
-
-    I observed #2593 and found that a logm state value had a larger size on
-    one mon (after slurping) than the others, pointing to put_bl_sn_map().
-
-    While we are at it, O_TRUNC put_int() too; the same type of bug is
-    possible there, too.
-
-    Fixes: #2593
-    Signed-off-by: Sage Weil
-
-commit 41a570778a51fe9a36a5b67a177d173889e58363
-Author: Sage Weil
-Date:   Sat Jul 14 14:31:34 2012 -0700
-
-    osd: based misdirected op role calc on acting set
-
-    We want to look at the acting set here, nothing else. This was causing us
-    to erroneously queue ops for later (wasting memory) and to erroneously
-    print out a 'misdrected op' message in the cluster log (confusion and
-    incorrect [but ignored] -ENXIO reply).
-
-    Fixes: #2022
-    Signed-off-by: Sage Weil
-
-commit b3d077c61e977e8ebb91288aa2294fb21c197fe7
-Author: Josh Durgin
-Date:   Fri Jul 13 09:42:20 2012 -0700
-
-    qa: download tests from specified branch
-
-    These python tests aren't installed, so they need to be downloaded
-
-    Signed-off-by: Josh Durgin
-
-commit e855cb247b5a9eda6845637e2da5b6358f69c2ed
-Author: Yehuda Sadeh
-Date:   Mon Jun 25 09:47:37 2012 -0700
-
-    rgw: don't override subuser perm mask if perm not specified
-
-    Bug #2650. We were overriding subuser perm mask whenever subuser
-    was modified, even if perm mask was not passed.
-
-    Signed-off-by: Yehuda Sadeh
-
-commit d6c766ea425d87a2f2405c08dcec66f000a4e1a0
-Author: James Page
-Date:   Wed Jul 11 11:34:21 2012 -0700
-
-    debian: fix ceph-fs-common-dbg depends
-
-    Signed-off-by: James Page
-
-commit 95e8d87bc3fb12580e4058401674b93e19df6e02
-Author: Yehuda Sadeh
-Date:   Wed Jul 11 11:52:24 2012 -0700
-
-    rados tool: remove -t param option for target pool
-
-    Bug #2772. This fixes an issue that was introduced when we
-    added the 'rados cp' command. The -t param was already used
-    for rados bench. With this change the only way to specify
-    a target pool is using --target-pool.
-    Though this problem is post argonaut, the 'rados cp' command
-    has been backported, so we need this fix there too.
-
-    Backport: argonaut
-
-    Signed-off-by: Yehuda Sadeh
-
-commit 5b10778399d5bee602e57035df7d40092a649c06
-Author: Sage Weil
-Date:   Wed Jul 11 09:19:00 2012 -0700
-
-    Makefile: don't install crush headers
-
-    This is leftover from when we built a libcrush.so. We can re-add when we
-    start doing that again.
-
-    Reported-by: Laszlo Boszormenyi
-    Signed-off-by: Sage Weil
-
-commit 35b13266923f8095650f45562d66372e618c8824
-Author: Sage Weil
-Date:   Tue Jul 10 13:18:27 2012 -0700
-
-    msgr: take over existing Connection on Pipe replacement
-
-    If a new pipe/socket is taking over an existing session, it should also
-    take over the Connection* associated with the existing session. Because
-    we cannot clear existing->connection_state, we just take another reference.
-
-    Clean up the comments a bit while we're here.
-
-    This affects MDS<->client sessions when reconnecting after a socket fault.
-    It probably also affects intra-cluster (osd/osd, mds/mds, mon/mon)
-    sessions as well, but I did not confirm that.
-
-    Backport: argonaut
-    Signed-off-by: Sage Weil
-
-commit b387077b1d019ee52b28bc3bc5305bfb53dfd892
-Author: Sage Weil
-Date:   Sun Jul 8 20:33:12 2012 -0700
-
-    debian: include librados-config in librados-dev
-
-    Reported-by: Laszlo Boszormenyi
-    Signed-off-by: Sage Weil
-
-commit 03c2dc244af11b711e2514fd5f32b9bfa34183f6
-Author: Sage Weil
-Date:   Tue Jul 3 13:04:28 2012 -0700
-
-    lockdep: increase max locks
-
-    Hit this limit with the rados api tests.
-
-    Signed-off-by: Sage Weil
-
-commit b554d112c107efe78ec64f85b5fe588f1e7137ce
-Author: Sage Weil
-Date:   Tue Jul 3 12:07:28 2012 -0700
-
-    config: add unlocked version of get_my_sections; use it internally
-
-    Signed-off-by: Sage Weil
-
-commit 01da287b8fdc07262be252f1a7c115734d3cc328
-Author: Sage Weil
-Date:   Tue Jul 3 08:20:06 2012 -0700
-
-    config: fix lock recursion in get_val_from_conf_file()
-
-    Introduce a private, already-locked version.
-
-    Signed-off-by: Sage Weil
-
-commit c73c64a0f722477a5b0db93da2e26e313a5f52ba
-Author: Sage Weil
-Date:   Tue Jul 3 08:15:08 2012 -0700
-
-    config: fix recursive lock in parse_config_files()
-
-    The _impl() helper is only called from parse_config_files(); don't retake
-    the lock.
-
-    Signed-off-by: Sage Weil
-
-commit 6646e891ff0bd31c935d1ce0870367b1e086ddfd
-Author: Sage Weil
-Date:   Tue Jul 3 18:51:02 2012 -0700
-
-    rgw: initialize fields of RGWObjEnt
-
-    This fixes various valgrind warnings triggered by the s3test
-    test_object_create_unreadable.
-
-    Signed-off-by: Sage Weil
-
-commit b33553aae63f70ccba8e3d377ad3068c6144c99a
-Author: Yehuda Sadeh
-Date:   Fri Jul 6 13:14:53 2012 -0700
-
-    rgw: handle response-* params
-
-    Handle response-* params that set response header field values.
-    Fixes #2734, #2735.
-    Backport: argonaut
-
-    Signed-off-by: Yehuda Sadeh
-
-commit 74f687501a8a02ef248a76f061fbc4d862a9abc4
-Author: Sage Weil
-Date:   Wed Jul 4 13:59:04 2012 -0700
-
-    osd: add missing formatter close_section() to scrub status
-
-    Also add braces to make the open/close matchups easier to see. Broken
-    by f36617392710f9b3538bfd59d45fd72265993d57.
-
-    Signed-off-by: Sage Weil
-
-commit 020b29961303b12224524ddf78c0c6763a61242e
-Author: Mike Ryan
-Date:   Wed Jun 27 14:14:30 2012 -0700
-
-    pg: report scrub status
-
-    Signed-off-by: Mike Ryan
-
-commit db6d83b3ed51c07b361b27d2e5ce3227a51e2c60
-Author: Mike Ryan
-Date:   Wed Jun 27 13:30:45 2012 -0700
-
-    pg: track who we are waiting for maps from
-
-    Signed-off-by: Mike Ryan
-
-commit e1d4855fa18b1cda85923ad9debd95768260d4eb
-Author: Mike Ryan
-Date:   Tue Jun 26 16:25:27 2012 -0700
-
-    pg: reduce scrub write lock window
-
-    Wait for all replicas to construct the base scrub map before finalizing
-    the scrub and locking out writes.
-
-    Signed-off-by: Mike Ryan
-
-commit 27409aa1612c1512bf393de22b62bbfe79b104c1
-Author: Yehuda Sadeh
-Date:   Thu Jul 5 15:52:51 2012 -0700
-
-    rgw: don't store bucket info indexed by bucket_id
-
-    Issue #2701. This info wasn't really used anywhere and we weren't
-    removing it. It was also sharing the same pool namespace as the
-    info indexed by bucket name, which is bad.
-
-    Signed-off-by: Yehuda Sadeh
-
-commit 9814374a2b40e15c13eb03ce6b8e642b0f7f93e4
-Author: Yehuda Sadeh
-Date:   Thu Jul 5 14:59:22 2012 -0700
-
-    test_rados_tool.sh: test copy pool
-
-    Signed-off-by: Yehuda Sadeh
-
-commit d75100667a539baf47c79d752b787ed5dcb51d7a
-Author: Yehuda Sadeh
-Date:   Thu Jul 5 13:42:23 2012 -0700
-
-    rados tool: copy object in chunks
-
-    Instead of reading the entire object and then writing it,
-    we read it in chunks.
-
-    Signed-off-by: Yehuda Sadeh
-
-commit 16ea64fbdebb7a74e69e80a18d98f35d68b8d9a1
-Author: Yehuda Sadeh
-Date:   Fri Jun 29 14:43:00 2012 -0700
-
-    rados tool: copy entire pool
-
-    A new rados tool command that copies an entire pool
-    into another existing pool.
-
-    Signed-off-by: Yehuda Sadeh
-
-commit 960c2124804520e81086df97905a299c8dd4e08c
-Author: Yehuda Sadeh
-Date:   Fri Jun 29 14:09:08 2012 -0700
-
-    rados tool: copy object
-
-    New rados command: rados cp [dest-obj]
-
-    Requires specifying source pool. Target pool and locator can be specified.
-    The new command preserves object xattrs and omap data.
-
-    Signed-off-by: Yehuda Sadeh
-
-commit 23d31d3e2aa7f2b474a7b8e9d40deb245d8be9de
-Author: Sage Weil
-Date:   Fri Jul 6 08:47:44 2012 -0700
-
-    ceph.spec.in: add ceph-disk-{activate,prepare}
-
-    Reported-by: Jimmy Tang
-    Signed-off-by: Sage Weil
-
-commit ea11c7f9d8fd9795e127cfd7e8a1f28d4f5472e9
-Author: Wido den Hollander
-Date:   Thu Jul 5 15:29:54 2012 +0200
-
-    Allow URL-safe base64 cephx keys to be decoded.
-
-    In these cases + and / are replaced by - and _ to prevent problems when using
-    the base64 strings in URLs.
- - Signed-off-by: Wido den Hollander - Signed-off-by: Sage Weil - -commit f67fe4e368b5f250f0adfb183476f5f294e8a529 -Author: Wido den Hollander -Date: Wed Jul 4 15:46:04 2012 +0200 - - librados: Bump the version to 0.48 - - Signed-off-by: Wido den Hollander - Signed-off-by: Sage Weil - -commit 35b9ec881aecf84b3a49ec0395d7208de36dc67d -Author: Yehuda Sadeh -Date: Tue Jun 26 17:28:51 2012 -0700 - - rgw-admin: use correct modifier with strptime - - Bug #2658: used %I (12h) instead of %H (24h) - - Signed-off-by: Yehuda Sadeh - -commit da251fe88503d32b86113ee0618db7c446d34853 -Author: Yehuda Sadeh -Date: Thu Jun 21 15:40:27 2012 -0700 - - rgw: send both swift x-storage-token and x-auth-token - - older clients need x-storage-token, newer x-auth-token - - Signed-off-by: Yehuda Sadeh - -commit 4c19ecb9a34e77e71d523a0a97e17f747bd5767d -Author: Yehuda Sadeh -Date: Thu Jun 21 15:17:19 2012 -0700 - - rgw: radosgw-admin date params now also accept time - - The date format now is "YYYY-MM-DD[ hh:mm:ss]". Got rid of - the --time param for the old ops log stuff. - - Signed-off-by: Yehuda Sadeh - - Conflicts: - - src/test/cli/radosgw-admin/help.t - -commit 6958aeb898fc683159483bfbb798f069a9b5330a -Author: Yehuda Sadeh -Date: Thu Jun 21 13:14:47 2012 -0700 - - rgw-admin: fix usage help - - s/show/trim - - Signed-off-by: Yehuda Sadeh - -commit 83c043f803ab2ed74fa9a84ae9237dd7df2a0c57 -Author: Sage Weil -Date: Tue Jul 3 14:07:16 2012 -0700 - - radosgw-admin: fix cli test - - Signed-off-by: Sage Weil - -commit 5674158163e9c1d50985796931240b237676b74d -Author: Sage Weil -Date: Tue Jul 3 11:32:57 2012 -0700 - - ceph: fix cli help test - - Signed-off-by: Sage Weil - -commit 151bf0eef59acae2d1fcf3f0feb8b6aa963dc2f6 -Author: Samuel Just -Date: Tue Jul 3 11:23:16 2012 -0700 - - ReplicatedPG: remove faulty scrub assert in sub_op_modify_applied - - This assert assumed that all ops submitted before MOSDRepScrub was - submitted were processed by the time that MOSDRepScrub was - processed.
In fact, MOSDRepScrub's scrub_to may refer to a - last_update yet to be seen by the replica. - - Bug #2693 - - Signed-off-by: Samuel Just - -commit 32833e88a1ad793fa4be86101ce9c22b6f677c06 -Author: Kyle Bader -Date: Tue Jul 3 11:20:38 2012 -0700 - - ceph: better usage - - Signed-off-by: Kyle Bader - -commit 67455c21879c9c117f6402259b5e2da84524e169 -Author: Sage Weil -Date: Tue Jul 3 09:20:35 2012 -0700 - - debian: strip new ceph-mds package - - Reported-by: Amon Ott - Signed-off-by: Sage Weil - -commit b53cdb97d15f9276a9b26bec9f29034149f93358 -Author: Sage Weil -Date: Tue Jul 3 06:46:10 2012 -0700 - - config: remove bad argparse_flag argument in parse_option() - - This is wrong, and thankfully valgrind picks it up. - - Signed-off-by: Sage Weil - -commit f7d4e39740fd2afe82ac40c711bd3fe7a282e816 -Author: Sage Weil -Date: Sun Jul 1 17:23:28 2012 -0700 - - msgr: restart_queue when replacing existing pipe and taking over the queue - - The queue may have been previously stopped (by discard_queue()), and needs - to be restarted. - - Fixes consistent failures from the mon_recovery.py integration tests. - - Signed-off-by: Sage Weil - -commit 5dfd2a512d309f7f641bcf7c43277f08cf650b01 -Author: Sage Weil -Date: Sun Jul 1 15:37:31 2012 -0700 - - msgr: choose incoming connection if ours is STANDBY - - If the connect_seq matches, but our existing connection is in STANDBY, take - the incoming one. Otherwise, the other end will wait indefinitely for us - to connect but we won't. - - Alternatively, we could "win" the race and trigger a connection by sending - a keepalive (or similar), but that is more work; we may as well accept the - incoming connection we have now. - - This removes STANDBY from the acceptable WAIT case states. It also keeps - responsibility squarely on the shoulders of the peer with something to - deliver. 
- - Without this patch, a 3-osd vstart cluster with - 'ms inject socket failures = 100' and rados bench write -b 4096 would start - generating slow request warnings after a few minutes due to the osds - failing to connect to each other. With the patch, I complete a 10-minute - run without problems. - - Signed-off-by: Sage Weil - -commit b7007a159f6d941fa8313a24af5810ce295b36ca -Author: Sage Weil -Date: Thu Jun 28 17:50:47 2012 -0700 - - msgr: preserve incoming message queue when replacing pipes - - If we replace an existing pipe with a new one, move the incoming queue - of messages that have not yet been dispatched over to the new Pipe so that - they are not lost. - - Alternatively, we could set in_seq = existing->in_seq - existing->in_qlen, - but that would make the other end resend those messages, which is a waste - of bandwidth. - - Very easy to reproduce the original bug with 'ms inject socket failures'. - - Signed-off-by: Sage Weil - -commit 1f3a722e150f9f27fe7919e9579b5a88dcd15639 -Author: Sage Weil -Date: Thu Jun 28 17:45:24 2012 -0700 - - msgr: move dispatch_entry into DispatchQueue class - - A bit cleaner. - - Signed-off-by: Sage Weil - -commit 03445290dad5b1213dd138cacf46e379400201c9 -Author: Sage Weil -Date: Thu Jun 28 17:38:34 2012 -0700 - - msgr: move incoming queue to separate class - - This extricates the incoming queue and its funky relationship with - DispatchQueue from Pipe and moves it into IncomingQueue. There is now a - single IncomingQueue attached to each Pipe. DispatchQueue is now no - longer tied to Pipe. - - This modularizes the code a bit better (though that is still a work in - progress) and (more importantly) will make it possible to move the - incoming messages from one pipe to another in accept().
- - Signed-off-by: Sage Weil - -commit 0dbc54169512da776c16161ec3b8fa0b3f08e248 -Author: Sage Weil -Date: Wed Jun 27 17:06:40 2012 -0700 - - msgr: make D_CONNECT constant non-zero, fix ms_handle_connect() callback - - A while ago we inadvertently broke ms_handle_connect() callbacks because - of a check for m being non-zero in the dispatch_entry() thread. Adjust the - enums so that they get delivered again. - - This fixes hangs when, for example, the ceph tool sends a command, gets a - connection reset, and doesn't get the connect callback to resend after - reconnecting to a new monitor. - - Signed-off-by: Sage Weil - -commit 2429556a51e8f60b0d9bdee71ef7b34b367f2f38 -Author: Sage Weil -Date: Tue Jun 26 17:10:40 2012 -0700 - - msgr: fix pipe replacement assert - - We may replace an existing pipe in the STANDBY state if the previous - attempt failed during accept() (see previous patches). - - This might fix #1378. - - Signed-off-by: Sage Weil - -commit 204bc594be1a6046d1b362693d086b49294c2a27 -Author: Sage Weil -Date: Tue Jun 26 17:07:31 2012 -0700 - - msgr: do not try to reconnect con with CLOSED pipe - - If we have a con with a closed pipe, drop the message. For lossless - sessions, the state will be STANDBY if we should reconnect. For lossy - sessions, we will end up with CLOSED and we *should* drop the message. - - Signed-off-by: Sage Weil - -commit e6ad6d25a58b8e34a220d090d01e26293c2437b4 -Author: Sage Weil -Date: Tue Jun 26 17:06:41 2012 -0700 - - msgr: move to STANDBY if we replace during accept and then fail - - If we replace an existing pipe during accept() and then fail, move to - STANDBY so that our connection state (connect_seq, etc.) is preserved. - Otherwise, we will throw out that information and falsely trigger a - RESETSESSION on the next connection attempt. - - Signed-off-by: Sage Weil
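The STANDBY versus CLOSED rule described in the two msgr commits above can be summarised as a toy model. The sketch below is Python, not the messenger's C++ code, and every name in it (`PipeState`, `state_after_failure`, `on_send`) is hypothetical. It encodes only what the commit messages say: a failed lossless session parks in STANDBY and reconnects when there is something to send, while a failed lossy session ends up CLOSED and its messages are dropped.

```python
from enum import Enum


class PipeState(Enum):
    # Hypothetical subset of pipe states, for illustration only.
    OPEN = "open"
    STANDBY = "standby"
    CLOSED = "closed"


def state_after_failure(lossy: bool) -> PipeState:
    """State a pipe is left in once its connection fails (toy rule)."""
    # Lossy sessions give up. Lossless sessions park in STANDBY so that
    # connection state such as connect_seq is kept for the next attempt.
    return PipeState.CLOSED if lossy else PipeState.STANDBY


def on_send(state: PipeState) -> str:
    """Decide what happens to an outgoing message for a given pipe state."""
    if state is PipeState.CLOSED:
        return "drop"       # never try to reconnect a CLOSED pipe
    if state is PipeState.STANDBY:
        return "reconnect"  # the peer with something to deliver reconnects
    return "queue"          # pipe is usable; hand the message to it


print(on_send(state_after_failure(lossy=True)))
print(on_send(state_after_failure(lossy=False)))
```

Keeping STANDBY distinct from CLOSED is what lets the lossless case retain its sequence state instead of triggering a spurious RESETSESSION on the next attempt.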