Design of Ceph RADOS namespaces

1. Description

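There is not much material on RADOS namespaces. RGW only began relying on them to consolidate its pools in Luminous (the RADOS feature itself is older). Their purpose is to isolate users logically inside a pool. The official description reads: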

RADOS currently uses pools both for data distribution (pools are sharded into PGs, which map to OSDs) and as the granularity for security (capabilities can restrict access by pool).
Overloading pools for both purposes makes it hard to do multi-tenancy because it is not a good idea to have a very large number of pools.

A namespace would be a division of a pool into separate logical namespaces. Instead of (pool, object) to identify an object,
it would be (pool, namespace, object) where the default namespace is the empty string.

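In short, implementing multi-tenancy with separate pools produces a very large number of pools, which is not a good design. The same reasoning explains why RGW on Jewel creates so many pools while RGW on Luminous is left with only a handful.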
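The (pool, namespace, object) addressing described above can be illustrated with a small in-memory model. This is only a sketch: `ObjectId`, `store`, and `list_namespace` are hypothetical names invented for illustration and are not part of librados.

```python
from typing import NamedTuple


class ObjectId(NamedTuple):
    """Toy object identity; a stand-in, not a librados type."""
    pool: str
    namespace: str  # "" is the default namespace
    name: str


# Hypothetical in-memory "cluster": maps an object identity to its payload.
store: dict[ObjectId, bytes] = {}

# The same object name can exist in several namespaces of one pool without clashing.
store[ObjectId("default.rgw.meta", "users.uid", "tupu")] = b"user info"
store[ObjectId("default.rgw.meta", "users.keys", "tupu")] = b"key record"
# Leaving the namespace unspecified corresponds to the empty string.
store[ObjectId("default.rgw.meta", "", "tupu")] = b"default-namespace object"


def list_namespace(pool: str, namespace: str) -> list[str]:
    """Return the object names stored under one (pool, namespace) pair."""
    return sorted(o.name for o in store if o.pool == pool and o.namespace == namespace)


print(len(store))
print(list_namespace("default.rgw.meta", "users.uid"))
```

Because the namespace is part of the key, the three entries stay distinct even though they share a pool and an object name.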

// RGW on Jewel
.rgw.root
default.rgw.control
default.rgw.data.root
default.rgw.gc
default.rgw.log
default.rgw.users.uid
default.rgw.users.keys
default.rgw.users.email
default.rgw.users.swift
default.rgw.buckets.index
default.rgw.buckets.data
.
.
.

// RGW on Luminous
# rados lspools
default.rgw.buckets.data
.rgw.root
default.rgw.buckets.index
default.rgw.log
default.rgw.meta
default.rgw.control

Luminous uses namespaces to fold the user-related data (uid, keys, email, swift) into the default.rgw.meta pool, replacing the earlier approach of one pool per kind of data.

2. Verification

// Jewel
# radosgw-admin zone get --rgw-zone=default
{
    "id": "e3833f0a-3f19-4021-a30e-32de342674e9",
    "name": "default",
    "domain_root": "default.rgw.data.root",
    "control_pool": "default.rgw.control",
    "gc_pool": "default.rgw.gc",
    "log_pool": "default.rgw.log",
    "intent_log_pool": "default.rgw.intent-log",
    "usage_log_pool": "default.rgw.usage",
    "user_keys_pool": "default.rgw.users.keys",
    "user_email_pool": "default.rgw.users.email",
    "user_swift_pool": "default.rgw.users.swift",
    "user_uid_pool": "default.rgw.users.uid",
    ...
    ...
}

// Luminous
# radosgw-admin zone get --rgw-zone=default
{
    "id": "e1cbec25-48d7-4b4a-a15c-74e0796f8b85",
    "name": "default",
    "domain_root": "default.rgw.meta:root",
    "control_pool": "default.rgw.control",
    "gc_pool": "default.rgw.log:gc",
    "lc_pool": "default.rgw.log:lc",
    "log_pool": "default.rgw.log",
    "intent_log_pool": "default.rgw.log:intent",
    "usage_log_pool": "default.rgw.log:usage",
    "reshard_pool": "default.rgw.log:reshard",
    "user_keys_pool": "default.rgw.meta:users.keys",
    "user_email_pool": "default.rgw.meta:users.email",
    "user_swift_pool": "default.rgw.meta:users.swift",
    "user_uid_pool": "default.rgw.meta:users.uid",
    ...
    ...
}

The Jewel gc pool has become default.rgw.log:gc, and keys, email, swift, and uid each now live in their own namespace inside the meta pool.
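The "pool:namespace" values in the Luminous zone output can be split mechanically. The sketch below assumes that a colon never appears inside a pool name and that a value without a colon refers to the default (empty) namespace. `split_pool_spec` and `namespaces_by_pool` are hypothetical helper names, and the JSON is a subset of the output shown above.

```python
import json


def split_pool_spec(spec: str) -> tuple[str, str]:
    """Split "pool:namespace" into (pool, namespace); no colon means namespace ""."""
    pool, _, namespace = spec.partition(":")
    return pool, namespace


def namespaces_by_pool(zone: dict) -> dict[str, set[str]]:
    """Collect, for each pool named in a zone config, the namespaces it is used with."""
    result: dict[str, set[str]] = {}
    for key, value in zone.items():
        # Assumption: pool settings are "domain_root" plus every "*_pool" key.
        if key == "domain_root" or key.endswith("_pool"):
            pool, namespace = split_pool_spec(value)
            result.setdefault(pool, set()).add(namespace)
    return result


zone = json.loads("""
{
    "name": "default",
    "domain_root": "default.rgw.meta:root",
    "control_pool": "default.rgw.control",
    "gc_pool": "default.rgw.log:gc",
    "log_pool": "default.rgw.log",
    "user_keys_pool": "default.rgw.meta:users.keys",
    "user_uid_pool": "default.rgw.meta:users.uid"
}
""")

for pool, namespaces in sorted(namespaces_by_pool(zone).items()):
    print(pool, sorted(namespaces))
```

Grouping by pool makes the consolidation visible: several settings that were separate pools on Jewel now resolve to the same pool with different namespaces.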

# rados ls -p default.rgw.meta --all
root    .bucket.meta.2018-07-17-r-12:e1cbec25-48d7-4b4a-a15c-74e0796f8b85.2123293.3
root    .bucket.meta.2018-07-11-18:e1cbec25-48d7-4b4a-a15c-74e0796f8b85.1789297.57
root    2018-07-30-03
root    2018-08-16-10
users.uid    tupu
users.keys    5R4ZY0NYTLR4O6G8ZYDH
users.uid    tupu.buckets

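--all lists the objects in every namespace; -N selects a single namespace. In the output above, the left column is the namespace and the right column is the object name. When no namespace is specified, the default is '' (the empty string).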
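The two-column listing can be grouped by namespace with a few lines of Python. This is a sketch under assumptions: the columns are separated by whitespace, object names contain no spaces, and a line whose first column is empty belongs to the default namespace. `group_by_namespace` is a hypothetical helper, and the `orphan` line is made up to exercise the default-namespace case.

```python
def group_by_namespace(listing: str) -> dict[str, list[str]]:
    """Group `rados ls --all` style lines as {namespace: [object, ...]}."""
    groups: dict[str, list[str]] = {}
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace():
            # Assumption: an empty first column means the default namespace.
            namespace, obj = "", line.strip()
        else:
            fields = line.split(None, 1)
            if len(fields) == 2:
                namespace, obj = fields[0], fields[1].strip()
            else:
                # A lone field is treated as an object in the default namespace.
                namespace, obj = "", fields[0]
        groups.setdefault(namespace, []).append(obj)
    return groups


SAMPLE = "\n".join([
    "root    2018-07-30-03",
    "root    2018-08-16-10",
    "users.uid    tupu",
    "users.keys    5R4ZY0NYTLR4O6G8ZYDH",
    "users.uid    tupu.buckets",
    "\torphan",  # hypothetical object in the default namespace
])

groups = group_by_namespace(SAMPLE)
for namespace, objects in sorted(groups.items()):
    print(repr(namespace), objects)
```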

# rados ls -p default.rgw.buckets.data -N '' | head -n 5
.dir.e1cbec25-48d7-4b4a-a15c-74e0796f8b85.2611842.17.55
.dir.e1cbec25-48d7-4b4a-a15c-74e0796f8b85.2606095.1.45
.dir.e1cbec25-48d7-4b4a-a15c-74e0796f8b85.3417948.2.18
.dir.e1cbec25-48d7-4b4a-a15c-74e0796f8b85.1664628.1407.69
.dir.e1cbec25-48d7-4b4a-a15c-74e0796f8b85.1778823.87.27

The check is run against buckets.data because the '' namespace of the meta pool holds no data.

Here is the official explanation of the pools and of each namespace under meta:

.rgw.root
    Unspecified region, zone, and global information records, one per object.
<zone>.rgw.control
    notify.<N>
<zone>.rgw.meta
    Multiple namespaces with different kinds of metadata:

namespace: root
    <bucket> .bucket.meta.<bucket>:<marker> # see put_bucket_instance_info()

    The tenant is used to disambiguate buckets, but not bucket instances. Example:

    .bucket.meta.prodtx:test%25star:default.84099.6
    .bucket.meta.testcont:default.4126.1
    .bucket.meta.prodtx:testcont:default.84099.4
    prodtx/testcont
    prodtx/test%25star
    testcont

namespace: users.uid
    Contains _both_ per-user information (RGWUserInfo) in “<user>” objects and per-user lists of buckets in omaps of “<user>.buckets” objects. The “<user>” may contain the tenant if non-empty, for example:

    prodtx$prodt
    test2.buckets
    prodtx$prodt.buckets
    test2

namespace: users.email
    Unimportant

namespace: users.keys
    47UA98JSTJZ9YAN3OS3O

    This allows radosgw to look up users by their access keys during authentication.

namespace: users.swift
    test:tester
<zone>.rgw.buckets.index
    Objects are named “.dir.<marker>”, each contains a bucket index. If the index is sharded, each shard appends the shard index after the marker.
<zone>.rgw.buckets.data
    default.7593.4__shadow_.488urDFerTYXavx4yAd-Op8mxehnvTI_1 <marker>_<key>
An example of a marker would be “default.16004.1” or “default.7593.4”. The current format is “<zone>.<instance_id>.<bucket_id>”. But once generated, a marker is not parsed again, so its format may change freely in the future.
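The naming patterns quoted above can be written down as small helpers. This is a sketch only: the function names are hypothetical, not RGW APIs, and the "." separator before the shard index is an assumption, since the quoted text only says the shard index is appended after the marker. The marker is treated as an opaque string, in line with the note that it is not parsed again once generated.

```python
from typing import Optional


def user_object(user: str, tenant: str = "") -> str:
    """Name of the per-user info object in the users.uid namespace."""
    return f"{tenant}${user}" if tenant else user


def user_buckets_object(user: str, tenant: str = "") -> str:
    """Name of the object whose omap lists the user's buckets."""
    return user_object(user, tenant) + ".buckets"


def index_object(marker: str, shard: Optional[int] = None) -> str:
    """Name of a bucket index object; the marker is passed through unparsed."""
    name = f".dir.{marker}"
    # Assumption: a sharded index appends ".<shard>" after the marker.
    return name if shard is None else f"{name}.{shard}"


print(user_object("prodt", tenant="prodtx"))
print(user_buckets_object("test2"))
print(index_object("default.7593.4"))
print(index_object("default.7593.4", shard=3))
```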
References: