ceph/vsm OSD nearfull thresholds not in alignment

jcalcote
Hi all --

This is a query to the community for insight into how we might solve a potentially confusing issue for VSM users. Here's a screenshot to illustrate the problem:



Note specifically that the warning in the "Cluster Status" widget at the top states there is "1 near full osd(s)", while the "OSD" widget in the bottom left corner shows there are 7 near full osds.

This difference comes from the fact that ceph has a near-full threshold, which defaults to 85%, while VSM has its own near-full threshold, which defaults to 75%.
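To make the mismatch concrete, here is a minimal sketch of the two counts diverging. The utilization figures are invented for illustration (they are not read from the screenshot), and `count_near_full` is a hypothetical helper, not a VSM or ceph function.

```python
# Hypothetical per-OSD utilization (fraction of capacity used).
osd_utilization = [0.88, 0.80, 0.79, 0.78, 0.77, 0.76, 0.755, 0.60]

CEPH_NEARFULL = 0.85  # ceph default near-full threshold
VSM_NEARFULL = 0.75   # VSM default near-full threshold


def count_near_full(utilizations, threshold):
    """Count OSDs whose utilization is at or above the threshold."""
    return sum(1 for used in utilizations if used >= threshold)


ceph_count = count_near_full(osd_utilization, CEPH_NEARFULL)
vsm_count = count_near_full(osd_utilization, VSM_NEARFULL)
print(ceph_count, vsm_count)  # prints "1 7"
```

The same set of OSDs yields two different near-full counts purely because of the threshold each side applies.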

An obvious first reaction might be to reset the hard-coded VSM default to the same value as ceph's, but that's really a non-solution - as soon as the two values deviate (because the user changes the value in VSM settings), the problem surfaces again.

I don't know how this issue originated - perhaps early ceph versions didn't have an integrated near-full threshold. It doesn't matter - the point is, it does have one now, and it likes to report that value in health status output, so the potential for user confusion is now inherent in the conflict.

Here are some ideas I have:

1. Filter the ceph near-full statement out of the text sent to the "Cluster Status" edit box.
2. Add code to VSM to keep the near-full threshold value in sync with that of ceph and change the VSM setting value so that it actually sets Ceph's near-full threshold value.
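As a rough sketch of idea 1, something like the following could drop the near-full item before the text reaches the widget. It assumes the health summary arrives as a semicolon-separated string with the status word already removed. That format, the sample input, and the name `strip_near_full` are assumptions for illustration, not VSM's actual code.

```python
import re

# Matches a summary item such as "1 near full osd(s)".
NEAR_FULL = re.compile(r"\d+ near full osd\(s\)")


def strip_near_full(summary):
    """Return the summary with any near-full item removed.

    Assumes items are separated by ';' (an assumption about the
    health text, not a documented contract).
    """
    items = [item.strip() for item in summary.split(";")]
    kept = [item for item in items if item and not NEAR_FULL.fullmatch(item)]
    return "; ".join(kept)


filtered = strip_near_full("1 near full osd(s); 3 pgs degraded")
print(filtered)  # prints "3 pgs degraded"
```

This only hides the conflicting number; it does nothing about the two thresholds themselves, which is what idea 2 addresses.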

Thoughts?
John

Re: ceph/vsm OSD nearfull thresholds not in alignment

ywang19
Administrator
There are settings in cluster.manifest for the near-full thresholds; these settings are expected to be copied to all ceph nodes at cluster creation.

If /etc/ceph/ceph.conf is modified directly, VSM is not notified, which causes the two to go out of sync.

Another case: if the cluster is imported, those settings are updated from ceph.conf. Normally, VSM treats the copy of ceph.conf in its database as the central point. One workaround is to manually run the "import_ceph_conf" tool to load the newer ceph.conf into the database.
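To illustrate the out-of-sync case, here is a minimal sketch that compares the near-full ratio in two copies of ceph.conf. It assumes the threshold is set through the `mon osd nearfull ratio` option in the `[global]` section and that the file is simple enough for Python's `configparser`. The function name and both sample texts are hypothetical; this is not how import_ceph_conf works internally.

```python
import configparser


def nearfull_ratio(conf_text, default=0.85):
    """Read the near-full ratio from ceph.conf text, or return the default."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        return default
    for key, value in parser.items("global"):
        # ceph accepts spaces or underscores in option names.
        if key.replace(" ", "_") == "mon_osd_nearfull_ratio":
            return float(value)
    return default


# Hypothetical contents: the database copy vs. the edited file on disk.
db_copy = "[global]\nmon osd nearfull ratio = 0.75\n"
on_disk = "[global]\nmon_osd_nearfull_ratio = 0.85\n"

out_of_sync = nearfull_ratio(db_copy) != nearfull_ratio(on_disk)
print(out_of_sync)  # prints "True"
```

A check along these lines could flag when a direct edit to /etc/ceph/ceph.conf has diverged from the copy in the database.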


-yaguang

Re: ceph/vsm OSD nearfull thresholds not in alignment

jcalcote
Thanks Yaguang - it may be that my changes to make the last-touched ceph.conf file authoritative will fix this problem without any other changes. I'll check after the pull request is merged, and work on a fix if near-full still isn't being kept in sync. --John