How RGW data is stored in Ceph, part 2: hands-on practice
First, the reference documents:
Configuration options
- rgw_max_chunk_size : 4194304
- rgw_obj_stripe_size : 4194304
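I. Verifying whole-object (non-multipart) upload
1. Uploading an object smaller than the chunk size (4M)
1.1 Use python boto3 to upload the file lion.png to bucket TPData2, then inspect it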
# radosgw-admin bucket list --bucket TPData2
{
    "name": "lion.png",
    "instance": "",
    "ver": { "pool": 11, "epoch": 1 },
    "locator": "",
    "exists": "true",
    "meta": {
        "category": 1,
        "size": 227247,
        "mtime": "2018-06-12 02:30:08.160356Z",
        "etag": "bcf0777b8994d10aee935d3ba6ba392e",
        "owner": "tupu",
        "owner_display_name": "tupudis",
        "content_type": "",
        "accounted_size": 227247,
        "user_data": ""
    },
    "tag": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.24145.2",
    "flags": 0,
    "pending_map": [],
    "versioned_epoch": 0
}
The object size is 227247 bytes, about 227 KB.
1.2.1 Inspect how the file is actually stored in rados
# rados ls -p default.rgw.buckets.data
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_ffplay-3.4.dmg
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_.POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2_1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1_1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_lion.png
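The object 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_lion.png is the rados object that actually stores lion.png. Because the file is smaller than the max chunk size, there is only this one object.
1.2.2 Check the size of this object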
# rados stat 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_lion.png -p default.rgw.buckets.data
default.rgw.buckets.data/573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_lion.png mtime 2018-06-12 10:30:08.000000, size 227247
The rados object's size is 227247 bytes, the same as the size of the rgw object.
1.2.3 Inspect the xattrs
// list the object's xattrs
# rados -p default.rgw.buckets.data listxattr 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_lion.png
user.rgw.acl
user.rgw.etag
user.rgw.idtag
user.rgw.manifest
user.rgw.pg_ver
user.rgw.source_zone
user.rgw.tail_tag
user.rgw.x-amz-content-sha256
user.rgw.x-amz-date
// export the manifest
# rados -p default.rgw.buckets.data getxattr 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_lion.png user.rgw.manifest > /tmp/lion.png.manifest
// decode the manifest
# ceph-dencoder type RGWObjManifest import /tmp/lion.png.manifest decode dump_json
{
    "objs": [],
    "obj_size": 227247,
    "explicit_objs": "false",
    "head_size": 227247,    <------------ head size equals the object size
    "max_head_size": 4194304,
    "prefix": ".yNQczYBScqbqyPgVBv6Th832_qXyOob_",
    "rules": [
        {
            "key": 0,
            "val": {
                "start_part_num": 0,
                "start_ofs": 4194304,
                "part_size": 0,
                "stripe_max_size": 4194304,
                "override_prefix": ""
            }
        }
    ],
    "tail_instance": "",
    "tail_placement": {
        "bucket": {
            "name": "TPData2",
            "marker": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1",
            "bucket_id": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1",    <------------ bucket id
            "tenant": "",
            "explicit_placement": { "data_pool": "", "data_extra_pool": "", "index_pool": "" }
        },
        "placement_rule": "default-placement"
    }
}
user.rgw.manifest is the extended attribute that stores the mapping between the rgw object and its rados objects.
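To make the role of the manifest concrete, here is a minimal Python sketch that reads the JSON printed by ceph-dencoder and reports whether the whole object lives in the head object. The function name summarize_manifest and the trimmed LION_MANIFEST string are illustrative assumptions, not part of any Ceph tooling; only the obj_size and head_size fields from the dump are used.

```python
import json


def summarize_manifest(manifest_json: str) -> dict:
    """Summarize how an rgw object is split between head and tail."""
    m = json.loads(manifest_json)
    obj_size = m["obj_size"]
    head_size = m["head_size"]
    return {
        "obj_size": obj_size,
        "head_size": head_size,
        # Bytes that must live outside the head object.
        "tail_size": obj_size - head_size,
        "fits_in_head": head_size == obj_size,
    }


# Trimmed copy of the lion.png manifest (only the fields used here).
LION_MANIFEST = (
    '{"objs": [], "obj_size": 227247, "head_size": 227247, '
    '"max_head_size": 4194304}'
)

print(summarize_manifest(LION_MANIFEST))
```

For lion.png the tail size is 0, which agrees with the single rados object seen in the pool listing.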
2. Uploading an object larger than the max chunk size
2.1 Use python boto3 to upload the file conn2016.pdf to bucket TPData2, then inspect it
# radosgw-admin bucket list --bucket TPData2
{
    "name": "conn2016.pdf",
    "instance": "",
    "ver": { "pool": 11, "epoch": 1 },
    "locator": "",
    "exists": "true",
    "meta": {
        "category": 1,
        "size": 4632900,
        "mtime": "2018-06-12 02:17:22.467932Z",
        "etag": "568dbefc09c209fa4d5598258d7f0831",
        "owner": "tupu",
        "owner_display_name": "tupudis",
        "content_type": "",
        "accounted_size": 4632900,
        "user_data": ""
    },
    "tag": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.24143.1",
    "flags": 0,
    "pending_map": [],
    "versioned_epoch": 0
},
The file size is 4632900 B, about 4.6 MB.
2.2.1 Inspect how the file is actually stored in rados
# rados ls -p default.rgw.buckets.data
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf    // conn2016.pdf head object
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_ffplay-3.4.dmg
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_.POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_1    // conn2016.pdf shadow object
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2_1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1_1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_lion.png
The two annotated rados objects above are where conn2016.pdf is actually stored.
2.2.2 Check the sizes of these two objects
// head obj
# rados stat 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf -p default.rgw.buckets.data
default.rgw.buckets.data/573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf mtime 2018-06-12 10:17:22.000000, size 4194304
// shadow obj
# rados stat 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_.POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_1 -p default.rgw.buckets.data
default.rgw.buckets.data/573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_.POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_1 mtime 2018-06-12 10:17:22.000000, size 438596
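The sizes add up: 4194304 + 438596 = 4632900, equal to the size of the rgw object. But how do we know that this shadow object belongs to this head object? Because the manifest is stored on the head object, as shown below.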
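The split observed here can be reproduced with a few lines of Python. This is a sketch under the assumption that the head object holds up to rgw_max_chunk_size bytes and the remainder is cut into pieces of at most rgw_obj_stripe_size bytes, which is what the sizes above suggest; plain_upload_sizes is a hypothetical helper, not an RGW function.

```python
def plain_upload_sizes(obj_size, max_chunk=4194304, stripe=4194304):
    """Return (head_size, [tail stripe sizes]) for a non-multipart upload.

    Assumes the head takes the first max_chunk bytes and the rest is
    stored in shadow objects of at most `stripe` bytes each.
    """
    head = min(obj_size, max_chunk)
    remaining = obj_size - head
    tail = []
    while remaining > 0:
        piece = min(remaining, stripe)
        tail.append(piece)
        remaining -= piece
    return head, tail


print(plain_upload_sizes(227247))   # lion.png: everything fits in the head
print(plain_upload_sizes(4632900))  # conn2016.pdf: head plus one shadow
```

With the two file sizes from this article, the helper yields no tail for lion.png and a single 438596-byte tail piece for conn2016.pdf, matching the rados stat output.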
2.2.3 Inspect the xattrs
// list the object's xattrs
# rados -p default.rgw.buckets.data listxattr 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf
user.rgw.acl
user.rgw.etag
user.rgw.idtag
user.rgw.manifest
user.rgw.pg_ver
user.rgw.source_zone
user.rgw.tail_tag
user.rgw.x-amz-content-sha256
user.rgw.x-amz-date
// export the manifest
# rados -p default.rgw.buckets.data getxattr 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf user.rgw.manifest > /tmp/conn2016.pdf.manifest
// decode the manifest
# ceph-dencoder type RGWObjManifest import /tmp/conn2016.pdf.manifest decode dump_json
{
    "objs": [],
    "obj_size": 4632900,
    "explicit_objs": "false",
    "head_size": 4194304,
    "max_head_size": 4194304,
    "prefix": ".POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_",    // this prefix is the random string in the shadow object's name
    "rules": [
        {
            "key": 0,
            "val": {
                "start_part_num": 0,
                "start_ofs": 4194304,
                "part_size": 0,
                "stripe_max_size": 4194304,
                "override_prefix": ""
            }
        }
    ],
    "tail_instance": "",
    "tail_placement": {
        "bucket": {
            "name": "TPData2",
            "marker": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1",
            "bucket_id": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1",
            "tenant": "",
            "explicit_placement": { "data_pool": "", "data_extra_pool": "", "index_pool": "" }
        },
        "placement_rule": "default-placement"
    }
}
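The fields in this dump are enough to rebuild the rados object names seen in the pool listing. The following Python sketch assumes the naming pattern inferred from that listing (bucket marker, then `_` plus the key for the head, or `__shadow_` plus the manifest prefix and a stripe index for the tail); the helper names are hypothetical and this is not the RGW implementation.

```python
def head_object_name(bucket_marker: str, key: str) -> str:
    """Name of the head rados object, as observed: <marker>_<key>."""
    return f"{bucket_marker}_{key}"


def shadow_object_name(bucket_marker: str, prefix: str, stripe_index: int) -> str:
    """Name of a tail rados object: <marker>__shadow_<prefix><index>."""
    return f"{bucket_marker}__shadow_{prefix}{stripe_index}"


MARKER = "573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1"
PREFIX = ".POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_"

print(head_object_name(MARKER, "conn2016.pdf"))
print(shadow_object_name(MARKER, PREFIX, 1))
```

Both printed names appear verbatim in the rados ls output, which is how the shadow object can be tied back to its head.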
The prefix ".POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_" appears inside the name of the shadow object 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_.POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_1, and that is how the two are linked.
2.2.4 Other metadata: acl and etag
// export the acl
# rados -p default.rgw.buckets.data getxattr 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf user.rgw.acl > /tmp/conn2016.pdf.acl
// decode the acl
# ceph-dencoder type RGWAccessControlPolicy import /tmp/conn2016.pdf.acl decode dump_json
{
    "acl": {
        "acl_user_map": [
            { "user": "tupu", "acl": 15 }
        ],
        "acl_group_map": [],
        "grant_map": [
            {
                "id": "tupu",
                "grant": {
                    "type": { "type": 0 },
                    "id": "tupu",
                    "email": "",
                    "permission": { "flags": 15 },
                    "name": "tupudis",
                    "group": 0,
                    "url_spec": ""
                }
            }
        ]
    },
    "owner": { "id": "tupu", "display_name": "tupudis" }
}
// read the etag
# rados -p default.rgw.buckets.data getxattr 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf user.rgw.etag
568dbefc09c209fa4d5598258d7f0831
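For an object that was not uploaded via multipart, the etag is simply the MD5 of its content, and it is stored in the extended attributes of the head object. The object's ACL is recorded in the head object's xattrs as well, and user-specified metadata is stored there too.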
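A quick way to check the etag claim is to compute the MD5 locally and compare it with user.rgw.etag. The sketch below uses only hashlib; plain_etag mirrors the statement above, while multipart_etag follows the common S3 convention (MD5 over the concatenated binary part digests, followed by a dash and the part count) and is an assumption here rather than something this article verifies.

```python
import hashlib


def plain_etag(data: bytes) -> str:
    """etag of a non-multipart object: hex MD5 of the content."""
    return hashlib.md5(data).hexdigest()


def multipart_etag(parts: list) -> str:
    """Assumed S3-style multipart etag: md5(md5(p1) + md5(p2) + ...)-N."""
    digests = b"".join(hashlib.md5(p).digest() for p in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"


print(plain_etag(b"hello"))
print(multipart_etag([b"first part", b"second part"]))
```

To compare against a real object, read the file in binary mode and pass its bytes to plain_etag; the result should equal the value returned by the getxattr call for user.rgw.etag.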
II. Verifying multipart upload
1 Use python boto3 to upload the file ffplay-3.4.dmg to bucket TPData2, then inspect it
# radosgw-admin bucket list --bucket TPData2
{
    "name": "ffplay-3.4.dmg",
    "instance": "",
    "ver": { "pool": 11, "epoch": 1 },
    "locator": "",
    "exists": "true",
    "meta": {
        "category": 1,
        "size": 16235019,
        "mtime": "2018-06-12 01:52:50.259141Z",
        "etag": "c3b5e76658302f7b9cf4e721a847f86d-2",
        "owner": "tupu",
        "owner_display_name": "tupudis",
        "content_type": "",
        "accounted_size": 16235019,
        "user_data": ""
    },
    "tag": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.4884.5",
    "flags": 0,
    "pending_map": [],
    "versioned_epoch": 0
}
The file size is 16235019 B, about 16.2 MB.
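The upload script itself is not shown in this article. As a rough sketch, one way such an upload might be done is boto3's upload_file with a TransferConfig whose multipart threshold and part size are both 8 MiB (these are also boto3's defaults); that would be consistent with the "-2" suffix on the etag above. The endpoint, credentials, and the expected_parts helper are placeholders and assumptions, not details from the original setup.

```python
import math

MB = 1024 * 1024


def expected_parts(size, threshold=8 * MB, chunksize=8 * MB):
    """Number of multipart parts, or 0 when a single PUT would be used."""
    if size < threshold:
        return 0
    return math.ceil(size / chunksize)


def upload(path, bucket, key, endpoint_url, access_key, secret_key):
    """Upload a file to an S3-compatible endpoint such as RGW (placeholder values)."""
    # Imported lazily so expected_parts() works without boto3 installed.
    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    config = TransferConfig(multipart_threshold=8 * MB, multipart_chunksize=8 * MB)
    s3.upload_file(path, bucket, key, Config=config)


for size in (227247, 4632900, 16235019):
    print(size, expected_parts(size))
```

Under these assumptions lion.png and conn2016.pdf go up in a single request, and only ffplay-3.4.dmg is split, into two parts.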
2.1 Inspect how the file is actually stored in rados
# rados ls -p default.rgw.buckets.data
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_conn2016.pdf
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1    // ffplay part 1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_ffplay-3.4.dmg    // standalone head object: holds the manifest for the multipart and shadow objects, i.e. the rgw-to-rados object mapping; its size is 0
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_.POU2vUerD5G1ZPvT9BuuArlE_gHyFyJ_1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2_1    // shadow of ffplay part 2
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2    // ffplay part 2
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1_1    // shadow of ffplay part 1
573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_lion.png
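The chunk size is 4M and the stripe size is 4M, while the part size used here is 8M. Uploading the roughly 16 MB object (actually a little under 16 MB) therefore produces two parts of at most 8M each, and each part is stored as two rados objects: one multipart object and one shadow object.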
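The resulting layout can be sketched in Python. The code assumes the pattern seen in the pool listing: within each part, the first stripe goes to a `__multipart_<prefix>.<part>` object and every further stripe to `__shadow_<prefix>.<part>_<n>`. The function multipart_layout and its defaults are illustrative, not RGW code.

```python
def multipart_layout(obj_size, prefix, part_size=8388608, stripe=4194304):
    """Return [(rados name suffix, size)] for a multipart upload.

    Assumes fixed-size parts (the last one may be shorter) and that each
    part is cut into stripes of at most `stripe` bytes.
    """
    layout = []
    offset = 0
    part_num = 1
    while offset < obj_size:
        this_part = min(part_size, obj_size - offset)
        done = 0
        stripe_idx = 0
        while done < this_part:
            size = min(stripe, this_part - done)
            if stripe_idx == 0:
                name = f"__multipart_{prefix}.{part_num}"
            else:
                name = f"__shadow_{prefix}.{part_num}_{stripe_idx}"
            layout.append((name, size))
            done += size
            stripe_idx += 1
        offset += this_part
        part_num += 1
    return layout


PREFIX = "ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1"
for name, size in multipart_layout(16235019, PREFIX):
    print(size, name)
```

For the 16235019-byte file this gives four data objects, two per part, whose sizes sum to the object size.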
Note the 2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1 in the names above. The part most likely to confuse you is the 2~, which is the prefix of the upload_id:
#define MULTIPART_UPLOAD_ID_PREFIX_LEGACY "2/"
#define MULTIPART_UPLOAD_ID_PREFIX "2~" // must contain a unique char that may not come up in gen_rand_alpha()
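Because the upload_id always starts with one of these two prefixes, the multipart prefix found in object names can be split back into the object key and the upload_id. The Python sketch below is a hypothetical helper written for this article's naming pattern (`<key>.<upload_id>`); it is not how RGW parses names internally, and a key that itself contains ".2~" or ".2/" is resolved by taking the last occurrence.

```python
# The two upload_id prefixes from the defines above, new style first.
UPLOAD_ID_PREFIXES = ("2~", "2/")


def split_multipart_prefix(prefix: str):
    """Split '<key>.<upload_id>' into (key, upload_id)."""
    for marker in UPLOAD_ID_PREFIXES:
        idx = prefix.rfind("." + marker)
        if idx != -1:
            # Everything before the dot is the key, the rest is the upload_id.
            return prefix[:idx], prefix[idx + 1:]
    raise ValueError(f"no upload_id prefix found in {prefix!r}")


key, upload_id = split_multipart_prefix(
    "ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1"
)
print(key)
print(upload_id)
```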
2.2 Check the sizes of these 5 objects
# rados stat 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1 -p default.rgw.buckets.data
default.rgw.buckets.data/573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1 mtime 2018-06-12 09:52:50.000000, size 4194304
# rados stat 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1_1 -p default.rgw.buckets.data
default.rgw.buckets.data/573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.1_1 mtime 2018-06-12 09:52:49.000000, size 4194304
# rados stat 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2 -p default.rgw.buckets.data
default.rgw.buckets.data/573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__multipart_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2 mtime 2018-06-12 09:52:49.000000, size 4194304
# rados stat 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2_1 -p default.rgw.buckets.data
default.rgw.buckets.data/573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1__shadow_ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1.2_1 mtime 2018-06-12 09:52:49.000000, size 3652107
# rados stat 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_ffplay-3.4.dmg -p default.rgw.buckets.data
default.rgw.buckets.data/573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_ffplay-3.4.dmg mtime 2018-06-12 09:52:50.000000, size 0
The sizes add up: 4194304 + 4194304 + 4194304 + 3652107 = 16235019, equal to the size of the rgw object. The standalone head object itself has size 0.
2.3 Inspect the xattrs of the standalone object
// export the manifest
# rados -p default.rgw.buckets.data getxattr 573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1_ffplay-3.4.dmg user.rgw.manifest > /tmp/ffplay-3.4.dmg.manifest
// decode the manifest
# ceph-dencoder type RGWObjManifest import /tmp/ffplay-3.4.dmg.manifest decode dump_json
{
    "objs": [],
    "obj_size": 16235019,
    "explicit_objs": "false",
    "head_size": 0,
    "max_head_size": 0,
    "prefix": "ffplay-3.4.dmg.2~XOxNrJHJCqQG209kH60-Jgo0J1ih4I1",
    "rules": [
        {
            "key": 0,
            "val": {
                "start_part_num": 1,
                "start_ofs": 0,
                "part_size": 8388608,
                "stripe_max_size": 4194304,
                "override_prefix": ""
            }
        },
        {
            "key": 8388608,
            "val": {
                "start_part_num": 2,
                "start_ofs": 8388608,
                "part_size": 7846411,
                "stripe_max_size": 4194304,
                "override_prefix": ""
            }
        }
    ],
    "tail_instance": "",
    "tail_placement": {
        "bucket": {
            "name": "TPData2",
            "marker": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1",
            "bucket_id": "573d2f8c-12ca-4284-a5d3-398aa32ceadf.4883.1",
            "tenant": "",
            "explicit_placement": { "data_pool": "", "data_extra_pool": "", "index_pool": "" }
        },
        "placement_rule": "default-placement"
    }
}
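The rules in this manifest describe where each part starts and how long it is. As a small sanity check, the Python sketch below walks the rules and confirms that the parts are contiguous and end exactly at obj_size. It assumes one rule per part, as in this dump; RGW may describe several equal-sized parts with a single rule, which this hypothetical helper does not handle.

```python
def part_ranges(manifest: dict):
    """Return [(part_num, start, end)] from the manifest rules, by offset."""
    ranges = []
    for rule in sorted(manifest["rules"], key=lambda r: r["key"]):
        val = rule["val"]
        start = val["start_ofs"]
        ranges.append((val["start_part_num"], start, start + val["part_size"]))
    return ranges


def is_consistent(manifest: dict) -> bool:
    """True if the parts are contiguous and cover exactly obj_size bytes."""
    expected = manifest["head_size"]
    for _, start, end in part_ranges(manifest):
        if start != expected:
            return False
        expected = end
    return expected == manifest["obj_size"]


# Only the fields used here, copied from the dump above.
FFPLAY_MANIFEST = {
    "obj_size": 16235019,
    "head_size": 0,
    "rules": [
        {"key": 0, "val": {"start_part_num": 1, "start_ofs": 0, "part_size": 8388608}},
        {"key": 8388608, "val": {"start_part_num": 2, "start_ofs": 8388608, "part_size": 7846411}},
    ],
}

print(part_ranges(FFPLAY_MANIFEST))
print(is_consistent(FFPLAY_MANIFEST))
```

The second part ends at 8388608 + 7846411 = 16235019, the same total obtained by adding up the four rados object sizes.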