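RGW Indexless in Practice
1. Background
Why did we suddenly consider testing indexless buckets?
First, ever since we started using RGW we ran into bucket index shard problems. We eventually kept performance under control by periodically adjusting the number of shards (rgw_override_bucket_index_max_shards in the configuration file, or bucket_index_max_shards in the zone configuration).
Second, our workload sometimes has to delete large amounts of RGW data. During those deletes the few OSDs that hold the index pool come under very heavy load, and the SSD OSDs crashed frequently. Adding more SSD OSDs spreads the RocksDB load caused by deletes, but a bottleneck always remains.
Third, our use case does not need the extra index metadata. We only need Ceph to store image data; metadata such as file names does not have to be maintained by Ceph (if you need to attach extra tags to objects, you do need the index).
These three points together led us to test indexless buckets.
Why does the index cause so many problems? On every write, an index entry for the object is inserted into the bucket's shard, which lives in the RocksDB of the OSDs that hold the index pool. With many small files RocksDB grows very large, and this is also where the bottleneck lies during deletes.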
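2. Pros and cons of indexless
Pros:
- Shorter I/O path, which can improve write performance to some extent
- No index is stored, so deletes put no pressure on the index pool
Cons:
- Bucket statistics are unavailable and the bucket cannot be listed
- Multisite cannot be used, so data cannot be synchronized
- Deleting a bucket removes it immediately regardless of whether it still holds data, because the bucket does not know whether it contains any data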
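The trade-off in the list above can be illustrated with a toy model. This is only a sketch and not how RGW is implemented: the classes IndexedBucket and IndexlessBucket and the index_ops counter are hypothetical names, and a plain dict stands in for the omap of an index shard.

```python
class IndexedBucket:
    """Toy bucket that records every object key in an index (stand-in for omap)."""

    def __init__(self):
        self.index = {}     # key -> size, stands in for the index shard's omap
        self.index_ops = 0  # number of index mutations performed so far

    def put(self, key, size):
        # Every write also inserts one index entry.
        self.index[key] = size
        self.index_ops += 1

    def list(self):
        return sorted(self.index)

    def delete_bucket(self):
        # Every index entry has to be removed before the bucket can go away.
        self.index_ops += len(self.index)
        self.index.clear()


class IndexlessBucket:
    """Toy bucket that stores data only and keeps no index at all."""

    def __init__(self):
        self.index_ops = 0

    def put(self, key, size):
        # Data is written, but no index entry is created.
        pass

    def list(self):
        raise NotImplementedError("indexless buckets cannot be listed")

    def delete_bucket(self):
        # Nothing to clean up in the index; the bucket does not know its contents.
        pass
```

In this model an indexed bucket with N objects costs N index operations on write and another N on bucket deletion, while the indexless bucket costs none but cannot answer a list request.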
3. Procedure
A zonegroup and a zone can each be configured with multiple placements.
a. First, create a dedicated pool for indexless
# ceph osd pool create com-zone1.rgw.buckets.indexless 256 256
pool 'com-zone1.rgw.buckets.indexless' created
b. Add another placement to the zonegroup
// Show the default placement
# radosgw-admin zonegroup get --rgw-zonegroup=com-zonegroup
{
  "id": "f0df824c-162b-42f7-9327-d38ade3e13b6",
  "name": "com-zonegroup",
  "api_name": "com-zonegroup",
  "is_master": "true",
  "endpoints": [
    "http://172.25.52.205:7480"
  ],
  "hostnames": [],
  "hostnames_s3website": [],
  "master_zone": "15e4026c-2d5e-4add-9013-a88c047010c6",
  "zones": [
    {
      "id": "15e4026c-2d5e-4add-9013-a88c047010c6",
      "name": "com-zone1",
      "endpoints": [
        "http://172.25.52.205:7480",
        "http://172.25.52.206:7480"
      ],
      "log_meta": "false",
      "log_data": "false",
      "bucket_index_max_shards": 128,
      "read_only": "false",
      "tier_type": "",
      "sync_from_all": "true",
      "sync_from": []
    }
  ],
  "placement_targets": [
    { "name": "default-placement", "tags": [] }
  ],
  "default_placement": "default-placement",
  "realm_id": "193bd183-e66f-41a9-b750-e1085669a6d9"
}
---------------------------------------------------------
// Add the new placement
# radosgw-admin zonegroup placement add --placement-id=indexless-placement
[
  {
    "key": "default-placement",
    "val": { "name": "default-placement", "tags": [] }
  },
  {
    "key": "indexless-placement",
    "val": { "name": "indexless-placement", "tags": [] }
  }
]
---------------------------------------------------------
// Check the result
# radosgw-admin zonegroup get --rgw-zonegroup=com-zonegroup
{
  "id": "f0df824c-162b-42f7-9327-d38ade3e13b6",
  "name": "com-zonegroup",
  "api_name": "com-zonegroup",
  "is_master": "true",
  "endpoints": [
    "http://172.25.52.205:7480"
  ],
  "hostnames": [],
  "hostnames_s3website": [],
  "master_zone": "15e4026c-2d5e-4add-9013-a88c047010c6",
  "zones": [
    {
      "id": "15e4026c-2d5e-4add-9013-a88c047010c6",
      "name": "com-zone1",
      "endpoints": [
        "http://172.25.52.205:7480",
        "http://172.25.52.206:7480"
      ],
      "log_meta": "false",
      "log_data": "false",
      "bucket_index_max_shards": 128,
      "read_only": "false",
      "tier_type": "",
      "sync_from_all": "true",
      "sync_from": []
    }
  ],
  "placement_targets": [
    { "name": "default-placement", "tags": [] },
    { "name": "indexless-placement", "tags": [] }
  ],
  "default_placement": "default-placement",
  "realm_id": "193bd183-e66f-41a9-b750-e1085669a6d9"
}
The placement indexless-placement has now been added.
c. Add the placement to the zone. Note that the placement ID here must be the same as the one in the zonegroup.
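Instead of reading the JSON by eye, the check can be scripted. The following is a minimal sketch that assumes the output of `radosgw-admin zonegroup get` has been parsed into a Python dict (for example with json.load); the helper names placement_target_names and has_placement_target are made up for illustration.

```python
def placement_target_names(zonegroup):
    """Return the names of all placement targets in a parsed zonegroup document."""
    return [target["name"] for target in zonegroup.get("placement_targets", [])]


def has_placement_target(zonegroup, name):
    """True if the zonegroup already lists a placement target called `name`."""
    return name in placement_target_names(zonegroup)


# Trimmed-down copy of the zonegroup shown above, used as sample input.
sample_zonegroup = {
    "name": "com-zonegroup",
    "placement_targets": [
        {"name": "default-placement", "tags": []},
        {"name": "indexless-placement", "tags": []},
    ],
    "default_placement": "default-placement",
}

if __name__ == "__main__":
    print(has_placement_target(sample_zonegroup, "indexless-placement"))
```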
// Show the current zone configuration
# radosgw-admin zone get --rgw-zone=com-zone1
{
  "id": "15e4026c-2d5e-4add-9013-a88c047010c6",
  "name": "com-zone1",
  "domain_root": "com-zone1.rgw.meta:root",
  "control_pool": "com-zone1.rgw.control",
  "gc_pool": "com-zone1.rgw.log:gc",
  "lc_pool": "com-zone1.rgw.log:lc",
  "log_pool": "com-zone1.rgw.log",
  "intent_log_pool": "com-zone1.rgw.log:intent",
  "usage_log_pool": "com-zone1.rgw.log:usage",
  "reshard_pool": "com-zone1.rgw.log:reshard",
  "user_keys_pool": "com-zone1.rgw.meta:users.keys",
  "user_email_pool": "com-zone1.rgw.meta:users.email",
  "user_swift_pool": "com-zone1.rgw.meta:users.swift",
  "user_uid_pool": "com-zone1.rgw.meta:users.uid",
  "system_key": {
    "access_key": "5R4ZY0NYTLR4O6G8ZYDH",
    "secret_key": "jznAGs4SlN1tySvD9Qs2hw2AOgbJhDK11BFkHYgw"
  },
  "placement_pools": [
    {
      "key": "default-placement",
      "val": {
        "index_pool": "com-zone1.rgw.buckets.index",
        "data_pool": "com-zone1.rgw.buckets.data",
        "data_extra_pool": "com-zone1.rgw.buckets.non-ec",
        "index_type": 0,
        "compression": ""
      }
    }
  ],
  "metadata_heap": "",
  "tier_config": [],
  "realm_id": "193bd183-e66f-41a9-b750-e1085669a6d9"
}
// Add the placement
# radosgw-admin zone placement add --placement-id=indexless-placement --index-pool=com-zone1.rgw.buckets.indexless --data-pool=com-zone1.rgw.buckets.data --data-extra-pool=com-zone1.rgw.buckets.non-ec --placement-index-type=1
{
  "id": "15e4026c-2d5e-4add-9013-a88c047010c6",
  "name": "com-zone1",
  "domain_root": "com-zone1.rgw.meta:root",
  "control_pool": "com-zone1.rgw.control",
  "gc_pool": "com-zone1.rgw.log:gc",
  "lc_pool": "com-zone1.rgw.log:lc",
  "log_pool": "com-zone1.rgw.log",
  "intent_log_pool": "com-zone1.rgw.log:intent",
  "usage_log_pool": "com-zone1.rgw.log:usage",
  "reshard_pool": "com-zone1.rgw.log:reshard",
  "user_keys_pool": "com-zone1.rgw.meta:users.keys",
  "user_email_pool": "com-zone1.rgw.meta:users.email",
  "user_swift_pool": "com-zone1.rgw.meta:users.swift",
  "user_uid_pool": "com-zone1.rgw.meta:users.uid",
  "system_key": {
    "access_key": "5R4ZY0NYTLR4O6G8ZYDH",
    "secret_key": "jznAGs4SlN1tySvD9Qs2hw2AOgbJhDK11BFkHYgw"
  },
  "placement_pools": [
    {
      "key": "default-placement",
      "val": {
        "index_pool": "com-zone1.rgw.buckets.index",
        "data_pool": "com-zone1.rgw.buckets.data",
        "data_extra_pool": "com-zone1.rgw.buckets.non-ec",
        "index_type": 0,
        "compression": ""
      }
    },
    {
      "key": "indexless-placement",
      "val": {
        "index_pool": "com-zone1.rgw.buckets.indexless",
        "data_pool": "com-zone1.rgw.buckets.data",
        "data_extra_pool": "com-zone1.rgw.buckets.non-ec",
        "index_type": 1,
        "compression": ""
      }
    }
  ],
  "metadata_heap": "",
  "tier_config": [],
  "realm_id": "193bd183-e66f-41a9-b750-e1085669a6d9"
}
// Commit the configuration
# radosgw-admin period update --commit
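Note that index-type 1 means indexless and 0 means a normal index.
Apart from the newly created index pool, all other pools are shared with default-placement.
Every change must be followed by a commit of the configuration (radosgw-admin period update --commit).
d. Change the zonegroup's default placement to indexless-placement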
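The 0/1 values are easy to misread in a long zone dump. As a small sketch, assuming the output of `radosgw-admin zone get` has been parsed into a dict, the function below maps each placement to a readable index type. The labels "Normal" and "Indexless" are taken from the index_type field of `radosgw-admin bucket stats`; the function and constant names are hypothetical.

```python
# Mapping used in this write-up: 0 is a normal index, 1 is indexless.
INDEX_TYPE_NAMES = {0: "Normal", 1: "Indexless"}


def placement_index_types(zone):
    """Map each placement ID in a parsed zone document to a readable index type."""
    result = {}
    for entry in zone.get("placement_pools", []):
        index_type = entry["val"]["index_type"]
        # Values outside the known mapping are reported rather than guessed at.
        result[entry["key"]] = INDEX_TYPE_NAMES.get(index_type, f"unknown({index_type})")
    return result


# Trimmed-down copy of the zone shown above, used as sample input.
sample_zone = {
    "name": "com-zone1",
    "placement_pools": [
        {"key": "default-placement",
         "val": {"index_pool": "com-zone1.rgw.buckets.index", "index_type": 0}},
        {"key": "indexless-placement",
         "val": {"index_pool": "com-zone1.rgw.buckets.indexless", "index_type": 1}},
    ],
}

if __name__ == "__main__":
    for placement, kind in placement_index_types(sample_zone).items():
        print(placement, kind)
```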
// Dump the configuration
# sudo radosgw-admin zonegroup get > zonegroup.json
// Edit zonegroup.json and change "default_placement" to "indexless-placement", then load it back
# sudo radosgw-admin zonegroup set < zonegroup.json
{
  "id": "f0df824c-162b-42f7-9327-d38ade3e13b6",
  "name": "com-zonegroup",
  "api_name": "com-zonegroup",
  "is_master": "true",
  "endpoints": [
    "http://172.25.52.205:7480"
  ],
  "hostnames": [],
  "hostnames_s3website": [],
  "master_zone": "15e4026c-2d5e-4add-9013-a88c047010c6",
  "zones": [
    {
      "id": "15e4026c-2d5e-4add-9013-a88c047010c6",
      "name": "com-zone1",
      "endpoints": [
        "http://172.25.52.205:7480",
        "http://172.25.52.206:7480"
      ],
      "log_meta": "false",
      "log_data": "false",
      "bucket_index_max_shards": 2,
      "read_only": "false",
      "tier_type": "",
      "sync_from_all": "true",
      "sync_from": []
    }
  ],
  "placement_targets": [
    { "name": "default-placement", "tags": [] },
    { "name": "indexless-placement", "tags": [] }
  ],
  "default_placement": "indexless-placement",
  "realm_id": "193bd183-e66f-41a9-b750-e1085669a6d9"
}
// Commit the configuration
# sudo radosgw-admin period update --commit
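The edit made to zonegroup.json between the get and the set can also be scripted instead of done by hand. This is a sketch under the assumption that the dumped file is plain JSON as shown here; set_default_placement is a hypothetical helper that refuses to point the default at a placement target the zonegroup does not define.

```python
import copy
import json


def set_default_placement(zonegroup, placement_id):
    """Return a copy of the zonegroup whose default_placement is `placement_id`."""
    names = [target["name"] for target in zonegroup.get("placement_targets", [])]
    if placement_id not in names:
        raise ValueError(f"placement target {placement_id!r} is not defined in the zonegroup")
    updated = copy.deepcopy(zonegroup)  # leave the caller's document untouched
    updated["default_placement"] = placement_id
    return updated


def rewrite_zonegroup_file(path, placement_id):
    """Load a dumped zonegroup file, switch the default placement, write it back."""
    with open(path) as handle:
        zonegroup = json.load(handle)
    updated = set_default_placement(zonegroup, placement_id)
    with open(path, "w") as handle:
        json.dump(updated, handle, indent=4)
```

Calling rewrite_zonegroup_file("zonegroup.json", "indexless-placement") between the two radosgw-admin commands would produce the file that is then fed to `zonegroup set`.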
4. Inspecting the underlying data
a. The indexless pool and the data pool
// List the objects in the indexless pool
# rados ls -p com-zone1.rgw.buckets.indexless
.dir.15e4026c-2d5e-4add-9013-a88c047010c6.24816.1.0
.dir.15e4026c-2d5e-4add-9013-a88c047010c6.24816.1.1
// List the omap keys of the indexless objects (both return nothing)
# rados listomapkeys -p com-zone1.rgw.buckets.indexless .dir.15e4026c-2d5e-4add-9013-a88c047010c6.24816.1.1
# rados listomapkeys -p com-zone1.rgw.buckets.indexless .dir.15e4026c-2d5e-4add-9013-a88c047010c6.24816.1.0
// List the data pool
# rados ls -p com-zone1.rgw.buckets.data
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1__multipart_5b854200-ab65-11e8-825b-2cf0ee1dc504_ffplay.png.2~A6J-EtBII4ysYQvh0fKZjuinXbNSVL-.2
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1_5e24b050-ab5f-11e8-bd3d-2cf0ee1dc504_conn2016.png
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1__shadow_5b854200-ab65-11e8-825b-2cf0ee1dc504_ffplay.png.2~A6J-EtBII4ysYQvh0fKZjuinXbNSVL-.2_1
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1_5b854200-ab65-11e8-825b-2cf0ee1dc504_ffplay.png
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1_1e56d566-ab5a-11e8-a2cb-2cf0ee1dc504.png
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1__shadow_5b854200-ab65-11e8-825b-2cf0ee1dc504_ffplay.png.2~A6J-EtBII4ysYQvh0fKZjuinXbNSVL-.1_1
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1__shadow_.YV7FPknwVrxkqH2N-NOQ5MYMbir4ELh_1
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1_c8c32618-ab59-11e8-8994-2cf0ee1dc504.png
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1_3cf2a4e8-ab5b-11e8-aefe-2cf0ee1dc504.png
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1__multipart_5b854200-ab65-11e8-825b-2cf0ee1dc504_ffplay.png.2~A6J-EtBII4ysYQvh0fKZjuinXbNSVL-.1
15e4026c-2d5e-4add-9013-a88c047010c6.24816.1_1f701f2e-ab5b-11e8-b92e-2cf0ee1dc504.png
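As you can see, the index shards no longer store any index entries for the objects, while the underlying data objects are stored exactly as they are for a normal bucket.
b. Bucket data: indexless versus a normal indexed bucket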
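Since an indexless bucket cannot be listed through RGW, the `rados ls` output of the data pool is the only place where its object names appear. The sketch below classifies those raw names by the pattern visible in the listing (bucket marker, then `_` for a head object, `__multipart_` for a multipart part, `__shadow_` for a tail or shadow object). It is a heuristic based on the names shown here, not a full parser of RGW's naming rules, and classify_rados_object is a hypothetical helper.

```python
def classify_rados_object(name, marker):
    """Classify a raw data-pool object name that belongs to the bucket `marker`.

    Returns a (kind, rest) tuple where kind is one of
    "head", "multipart", "shadow" or "other".
    """
    prefix = marker + "_"
    if not name.startswith(prefix):
        return ("other", name)  # belongs to another bucket, or is not a data object
    rest = name[len(prefix):]
    if rest.startswith("_multipart_"):
        return ("multipart", rest[len("_multipart_"):])
    if rest.startswith("_shadow_"):
        return ("shadow", rest[len("_shadow_"):])
    return ("head", rest)


def head_object_keys(names, marker):
    """Return the S3 object keys recoverable from head objects in a listing."""
    keys = []
    for name in names:
        kind, rest = classify_rados_object(name, marker)
        if kind == "head":
            keys.append(rest)
    return keys
```

Feeding the lines of `rados ls -p com-zone1.rgw.buckets.data` together with the bucket marker 15e4026c-2d5e-4add-9013-a88c047010c6.24816.1 into head_object_keys gives a rough object list for the bucket, at the cost of scanning the whole pool.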
// Stats of a normal indexed bucket
# radosgw-admin bucket stats --bucket=2018-08-29-11
{
  "bucket": "2018-08-29-11",
  "zonegroup": "923eb5b0-5426-4dc1-87d4-f81cf5c15646",
  "placement_rule": "default-placement",
  "explicit_placement": {
    "data_pool": "",
    "data_extra_pool": "",
    "index_pool": ""
  },
  "id": "e1cbec25-48d7-4b4a-a15c-74e0796f8b85.4254769.5",
  "marker": "e1cbec25-48d7-4b4a-a15c-74e0796f8b85.4254769.5",
  "index_type": "Normal",
  "owner": "com",
  "ver": "0#7,1#5,2#5,3#3,4#1,5#5,6#5,7#3,8#1,9#1,10#3,11#1,12#1,13#5,14#3,15#3,16#5,17#3,18#5,19#5,20#3",
  "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0,11#0,12#0,13#0,14#0,15#0,16#0,17#0,18#0,19#0,20#0",
  "mtime": "2018-08-29 11:13:05.647788",
  "max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#,11#,12#,13#,14#,15#,16#,17#,18#,19#,20#",
  "usage": {
    "rgw.main": {
      "size": 141080,
      "size_actual": 212992,
      "size_utilized": 141080,
      "size_kb": 138,
      "size_kb_actual": 208,
      "size_kb_utilized": 138,
      "num_objects": 26
    }
  },
  "bucket_quota": {
    "enabled": false,
    "check_on_raw": false,
    "max_size": -1,
    "max_size_kb": 0,
    "max_objects": -1
  }
}
# radosgw-admin bucket limit check
{
  "bucket": "2018-08-29-15",
  "tenant": "",
  "num_objects": 0,
  "num_shards": 21,
  "objects_per_shard": 0,
  "fill_status": "OK"
}
--------------------------------------------------------
// Stats of an indexless bucket
# radosgw-admin bucket stats --bucket=TPData
{
  "bucket": "TPData",
  "zonegroup": "f0df824c-162b-42f7-9327-d38ade3e13b6",
  "placement_rule": "indexless-placement",
  "explicit_placement": {
    "data_pool": "",
    "data_extra_pool": "",
    "index_pool": ""
  },
  "id": "15e4026c-2d5e-4add-9013-a88c047010c6.24816.1",
  "marker": "15e4026c-2d5e-4add-9013-a88c047010c6.24816.1",
  "index_type": "Indexless",
  "owner": "combi",
  "ver": "0#1,1#1",
  "master_ver": "0#0,1#0",
  "mtime": "2018-08-29 15:03:53.520191",
  "max_marker": "0#,1#",
  "usage": {},
  "bucket_quota": {
    "enabled": false,
    "check_on_raw": false,
    "max_size": -1,
    "max_size_kb": 0,
    "max_objects": -1
  }
}
# radosgw-admin bucket limit check
[
  {
    "user_id": "combi",
    "buckets": [
      {
        "bucket": "TPData",
        "tenant": "",
        "num_objects": 0,
        "num_shards": 2,
        "objects_per_shard": 0,
        "fill_status": "OK"
      }
    ]
  },
  {
    "user_id": "comk8s",
    "buckets": []
  }
]
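As you can see, an indexless bucket carries no statistics at all, so there is no way to know how many objects it holds. This is a double-edged sword: deleting the bucket simply removes it, whereas an indexed bucket can only be removed after every one of its index entries has been deleted. That per-entry deletion is also the reason for the heavy load on RocksDB.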
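Monitoring scripts that read `radosgw-admin bucket stats` need to treat the two cases differently, because an empty usage section on an indexless bucket does not mean the bucket is empty. A minimal sketch, assuming the stats output has been parsed into a dict; object_count is a hypothetical helper that returns None when the count is unknowable.

```python
def object_count(stats):
    """Return the object count from parsed bucket stats, or None if unknown.

    For an indexless bucket there is no index to count from, so the
    result is None rather than a misleading 0.
    """
    if stats.get("index_type") == "Indexless":
        return None
    usage = stats.get("usage", {})
    # Sum over all usage categories, e.g. "rgw.main".
    return sum(section.get("num_objects", 0) for section in usage.values())


# Trimmed-down copies of the two stats documents shown above.
normal_stats = {
    "bucket": "2018-08-29-11",
    "index_type": "Normal",
    "usage": {"rgw.main": {"size": 141080, "num_objects": 26}},
}
indexless_stats = {"bucket": "TPData", "index_type": "Indexless", "usage": {}}

if __name__ == "__main__":
    print(object_count(normal_stats))
    print(object_count(indexless_stats))
```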