study cloud-init (by quqi99)

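This article looks at the user-data merging problem in cloud-init, including the difference between user-data and the /etc/cloud/cloud.cfg.d directory, how to use cloud-init configuration, and how to debug and locate problems. Through examples and source-code analysis, the author shows what can cause the merge to fail and what the official documentation says.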


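Author: Zhang Hua  Published: 2024-03-29
Copyright notice: This article may be freely reproduced, provided that the reproduction states, as hyperlinks, the original source of the article, the author information, and this copyright notice (http://blog.csdn.net/quqi99)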

Problem

The user-data passed to a VM is not merged with the configuration under /etc/cloud/cloud.cfg.d/ in the VM's original image. For a detailed description of the problem see: https://github.com/canonical/cloud-init/issues/5112

Basic concepts

This is also my first time working with cloud-init, so I need to get familiar with some concepts first. The basic concepts of cloud-init are:

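  • metadata: typically contains information about the VM, such as the instance id and display name
  • userdata: files, scripts, YAML, and other data needed to configure the system
  • part handler: used to define custom userdata types
  • handlers: parse and process the various kinds of userdata; there are currently four default handlers: boot hook, cloud config, shell script, upstart job
  • datasource: where the data comes from, e.g. openstack, aws
  • modules: the components that actually do the customization work on an instance
  • instance: a cloud host instance
  • frequencies: how often a handler/module runs; in general only three settings are valid: PER_INSTANCE, PER_ALWAYS, PER_ONCE
  • stage: cloud-init has four stages: local, init, config, final (systemctl -a |grep cloud)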
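
Of the concepts above, frequencies is the easiest to misread, so here is a minimal sketch of what the three settings mean in practice. It is only an illustration: the RunTracker class and its in-memory set are hypothetical and are not cloud-init's implementation. The sketch assumes PER_INSTANCE means "once for each instance id", PER_ONCE means "once ever", and PER_ALWAYS means "on every boot".

```python
PER_INSTANCE, PER_ALWAYS, PER_ONCE = "PER_INSTANCE", "PER_ALWAYS", "PER_ONCE"


class RunTracker:
    """Hypothetical helper that remembers which modules have already run."""

    def __init__(self):
        self._done = set()

    def should_run(self, name, freq, instance_id):
        if freq == PER_ALWAYS:
            # Runs on every boot, nothing to remember.
            return True
        if freq == PER_INSTANCE:
            key = (name, instance_id)  # remembered per instance id
        elif freq == PER_ONCE:
            key = (name, None)         # remembered regardless of instance id
        else:
            raise ValueError(f"unknown frequency: {freq}")
        if key in self._done:
            return False
        self._done.add(key)
        return True


tracker = RunTracker()
print(tracker.should_run("runcmd", PER_INSTANCE, "instance-a"))  # first boot
print(tracker.should_run("runcmd", PER_INSTANCE, "instance-a"))  # reboot, same instance
print(tracker.should_run("runcmd", PER_INSTANCE, "instance-b"))  # new instance from the same image
```

This is also why the commands later in this article use "cloud-init clean" before rebooting: without clearing the recorded state, a per-instance module would not run a second time on the same instance.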

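The four stages do the following:

  • local: the stage in which the instance does not yet know how to configure its NICs; its job is to fetch the configuration from the config-drive and write it to /etc/network/interfaces. If no config-drive is used, all NICs are configured for DHCP by default
  • init, config, final: the modules for these three stages are defined in /etc/cloud/cloud.cfg under cloud_init_modules, cloud_config_modules, and cloud_final_modules.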

The commands that systemd invokes for the four stages are:

  • cloud-init-local.service: /usr/bin/cloud-init init --local; /bin/touch /run/cloud-init/network-config-ready
  • cloud-init.service: /usr/bin/cloud-init init
  • cloud-config.service: /usr/bin/cloud-init modules --mode=config
  • cloud-final.service: /usr/bin/cloud-init modules --mode=final
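
When re-running the stages by hand for debugging, it helps to know which commands must have run before a given stage. The sketch below simply encodes the list above as data; the STAGES table mirrors the four units listed, while the commands_up_to helper is a hypothetical name introduced here for illustration.

```python
# (stage, systemd unit, commands) in boot order, copied from the list above.
STAGES = [
    ("local", "cloud-init-local.service",
     ["/usr/bin/cloud-init init --local",
      "/bin/touch /run/cloud-init/network-config-ready"]),
    ("init", "cloud-init.service", ["/usr/bin/cloud-init init"]),
    ("config", "cloud-config.service",
     ["/usr/bin/cloud-init modules --mode=config"]),
    ("final", "cloud-final.service",
     ["/usr/bin/cloud-init modules --mode=final"]),
]


def commands_up_to(stage):
    """Return every command from the first stage through `stage`, in order."""
    commands = []
    for name, _unit, stage_commands in STAGES:
        commands.extend(stage_commands)
        if name == stage:
            return commands
    raise KeyError(stage)


for command in commands_up_to("config"):
    print(command)
```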

The data directory is /var/lib/cloud/. The configuration sources are loaded in the following order, from lowest to highest priority: built-in config (/var/lib/cloud/instances/90a1a29a-1f70-471a-a920-d1a6919270d0/user-data.txt) --> /etc/cloud/cloud.cfg{,.d} --> /run/cloud-init/cloud.cfg --> kernel cmdline
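
"Lowest to highest priority" can be pictured as layering dictionaries, where a key set in a later layer overrides the same key from an earlier one. The following is a simplified sketch of that idea only: it overrides whole top-level keys, the layer_configs helper is hypothetical, and the keys and values in each layer are invented for illustration. cloud-init's real merging is more elaborate.

```python
def layer_configs(layers):
    """Merge dicts in order; later (higher-priority) layers win per key."""
    merged = {}
    for cfg in layers:
        merged.update(cfg)
    return merged


# Lowest priority first; all values here are made up for the example.
layers = [
    {"datasource_list": ["NoCloud", "None"], "disable_root": True},  # built-in config
    {"disable_root": False},                                         # /etc/cloud/cloud.cfg{,.d}
    {},                                                              # /run/cloud-init/cloud.cfg
    {"datasource_list": ["LXD"]},                                    # kernel cmdline
]

effective = layer_configs(layers)
print(effective)
```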

A first cloud-init example

cat << EOF |tee /tmp/my-user-data.yaml 
#cloud-config 
runcmd:
  - echo 'Hello, World!' > /tmp/hello-world.txt
EOF
lxc launch ubuntu:jammy test --config=user.user-data="$(cat /tmp/my-user-data.yaml)"
#lxc init ubuntu:jammy test 
#lxc config set test user.user-data - < /tmp/my-user-data.yaml
#lxc start test

lxc shell test
cat /var/lib/cloud/instances/6da68c17-4f11-4a76-937d-ea40cca547ef/cloud-config.txt
cat /var/lib/cloud/instances/6da68c17-4f11-4a76-937d-ea40cca547ef/network-config.json
cloud-init query userdata
cloud-init schema --system --annotate
cloud-init schema -c /var/lib/cloud/instances/291e6f16-9b4b-4909-95f8-e22ac771c321/user-data.txt --annotate
cloud-init status --long
cat /run/cloud-init/ds-identify.log
cloud-init status
cloud-init clean --logs --reboot

root@test:~# cloud-init status --wait
status: done
root@test:~# cloud-init query userdata
#cloud-config 
runcmd:
  - echo 'Hello, World!' > /tmp/hello-world.txt
root@test:~# cloud-init schema --system --annotate
Found cloud-config data types: user-data, network-config
1. user-data at /var/lib/cloud/instances/6da68c17-4f11-4a76-937d-ea40cca547ef/cloud-config.txt:
  Valid schema user-data
2. network-config at /var/lib/cloud/instances/6da68c17-4f11-4a76-937d-ea40cca547ef/network-config.json:
  Valid schema network-config

How to debug cloud-init

export _CLOUD_INIT_SAVE_STDIN=1
export _CLOUD_INIT_SAVE_STDOUT=1
rm -rf /var/lib/cloud  #run the complete cloud-init again
cloud-init -d init
cloud-init -d init --local
#/usr/bin/cloud-init modules --mode=config
#/usr/bin/cloud-init modules --mode=final
tail -f /var/log/cloud-init-output.log
vim /usr/lib/python3/dist-packages/cloudinit/cmd/main.py  #go to main_init() and insert the next line as a breakpoint
import pdb;pdb.set_trace()

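One trick: I deliberately gave /etc/cloud/cloud.cfg.d/91_my.cfg a malformed format and then ran the "cloud-init schema --system --annotate" command below. This produces a backtrace, which makes it easy to see where to set breakpoints.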

root@test2:~# cloud-init schema --system --annotate
Traceback (most recent call last):
  File "/usr/bin/cloud-init", line 33, in <module>
    sys.exit(load_entry_point('cloud-init==23.4.4', 'console_scripts', 'cloud-init')())
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 1042, in main
    schema_parser(parser_schema)
  File "/usr/lib/python3/dist-packages/cloudinit/config/schema.py", line 1440, in get_parser
    f"{read_cfg_paths().get_runpath('instance_data')}"
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/__init__.py", line 22, in read_cfg_paths
    init.read_cfg()
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 267, in read_cfg
    self._cfg = self._read_cfg(extra_fns)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 279, in _read_cfg
    base_cfg=fetch_base_config(instance_data_file=instance_data_file),
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 1077, in fetch_base_config
    util.read_conf_with_confd(
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1145, in read_conf_with_confd
    confd_cfg = read_conf_d(confd, instance_data_file=instance_data_file)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1104, in read_conf_d
    return mergemanydict(cfgs)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 909, in mergemanydict
    merger = mergers.construct(mergers_to_apply)
  File "/usr/lib/python3/dist-packages/cloudinit/mergers/__init__.py", line 142, in construct
    raise ImportError(msg)
ImportError: Could not find merger module named 'm_l' with attribute 'Merger' (searched ['cloudinit.mergers.m_l'])
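
The frames of a backtrace like the one above can be turned mechanically into pdb "b file:line" commands. The sketch below shows one way to do that; breakpoints_from_traceback is a hypothetical helper written for this article, and the sample text is a shortened copy of the last two frames above.

```python
import re

# Matches the 'File "...", line N, in func' lines of a Python traceback.
FRAME_RE = re.compile(
    r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)'
)


def breakpoints_from_traceback(text):
    """Return one pdb break command per traceback frame, outermost first."""
    return [f"b {m['path']}:{m['line']}" for m in FRAME_RE.finditer(text)]


sample = '''Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 909, in mergemanydict
    merger = mergers.construct(mergers_to_apply)
  File "/usr/lib/python3/dist-packages/cloudinit/mergers/__init__.py", line 142, in construct
    raise ImportError(msg)
'''

for command in breakpoints_from_traceback(sample):
    print(command)
```

The printed lines can be pasted at the pdb prompt once the pdb.set_trace() call from the debugging section has stopped execution.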

How to modify an LXD image

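To reproduce the problem we need to add /etc/cloud/cloud.cfg.d/91_my.cfg to the LXD image, using the steps below. In fact, there is no need to go to the trouble of modifying the image. As in the bug description (https://github.com/canonical/cloud-init/issues/5112), simply launch any LXD container that has no /etc/cloud/cloud.cfg.d/91_my.cfg, then add /etc/cloud/cloud.cfg.d/91_my.cfg, and finally run "cloud-init clean --logs --reboot" to reboot.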

lxc launch ubuntu:jammy test
lxc shell test
cat << EOF |tee /etc/cloud/cloud.cfg.d/91_my.cfg
#cloud-config
runcmd:
- echo 'running command with runcmd from cloud.cfg.d' >> /var/log/cloud-init.log
EOF
exit
lxc snapshot test cloud-init-snapshot
lxc publish test/cloud-init-snapshot --alias cloud-init-snapshot
#lxc image export <image-alias> <destination-path>
lxc image list
cat << EOF |tee /tmp/my-user-data.yaml 
#cloud-config
merge_how:
 - name: list
   settings: [append]
 - name: dict
   settings: [no_replace, recurse_list]
runcmd:
- echo 'running command with runcmd from userdata' >> /var/log/cloud-init.log
EOF
lxc launch cloud-init-snapshot test2 --config=user.user-data="$(cat /tmp/my-user-data.yaml)"
lxc shell test2
cat /etc/cloud/cloud.cfg.d/91_my.cfg
cat /var/lib/cloud/instances/ba383df7-cb79-4629-890b-84dbf3def51c/user-data.txt
grep -r 'running command with runcmd' /var/log/cloud-init.log

Bug or feature?

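The code below clearly shows that the files under /etc/cloud/cloud.cfg.d/ are merged with one another, but nothing ever says that /etc/cloud/cloud.cfg.d and /var/lib/cloud/instances/6da68c17-4f11-4a76-937d-ea40cca547ef/cloud-config.txt are merged.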

$ find . \( -path '*test*' -o -path '*.tox' -o -path '*venv*' -o -path '*.git' \) -prune -o -type f -name '*.py' -print |xargs -i grep --color -H 'cloud.cfg.d' {}
./setup.py:    (ETC + "/cloud/cloud.cfg.d", glob("config/cloud.cfg.d/*")),
./cloudinit/util.py:    files in /etc/cloud/cloud.cfg.d and merge all configs into a single dict.
./cloudinit/sources/__init__.py:            " /etc/cloud/cloud.cfg.d/"
./cloudinit/sources/DataSourceMAAS.py:                fpath = "/etc/cloud/cloud.cfg.d/" + fname + ".cfg"
./cloudinit/cmd/devel/logs.py:    ApportFile("/etc/cloud/cloud.cfg.d/99-installer.cfg", "InstallerCloudCfg"),
./cloudinit/stages.py:    " (/etc/cloud/cloud.cfg and /etc/cloud/cloud.cfg.d), metadata,"
./cloudinit/distros/alpine.py:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
./cloudinit/distros/azurelinux.py:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
./cloudinit/distros/debian.py:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
./cloudinit/distros/mariner.py:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
./cloudinit/warnings.py:   /etc/cloud/cloud.cfg.d/99-ec2-datasource.cfg
./cloudinit/warnings.py:into /etc/cloud/cloud.cfg.d/99-warnings.cfg
./cloudinit/config/cc_ubuntu_autoinstall.py:        ``#cloud-config`` user-data or ``/etc/cloud/cloud.cfg.d`` validate
./cloudinit/config/cc_apt_configure.py:    flist = glob.glob(subp.target_path(path="/etc/cloud/cloud.cfg.d/*dpkg*"))

$ grep -r 'cloud.cfg.d' ./cloudinit/util.py -B10
def read_conf_with_confd(cfgfile, *, instance_data_file=None) -> dict:
    """Read yaml file along with optional ".d" directory, return merged config

    Given a yaml file, load the file as a dictionary. Additionally, if there
    exists a same-named directory with .d extension, read all files from
    that directory in order and return the merged config. The template
    file is optional and will be applied to any applicable jinja file
    in the configs.

    For example, this function can read both /etc/cloud/cloud.cfg and all
    files in /etc/cloud/cloud.cfg.d and merge all configs into a single dict.

This comment says the same thing: https://github.com/canonical/cloud-init/issues/2611#issuecomment-1541879693
In fact, the official documentation says so as well: https://cloudinit.readthedocs.io/en/latest/reference/merging.html#other-uses
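
To make the difference between "merged" and "not merged" concrete, here is a small sketch using the two runcmd lists from the reproduction steps. The merge_cfg helper and its list_strategy parameter are hypothetical and do not reproduce cloud-init's merger classes; they only contrast a list that is replaced with a list that is appended to.

```python
def merge_cfg(base, override, list_strategy="replace"):
    """Hypothetical helper: combine two config dicts.

    "replace": a list in `override` replaces the list in `base`.
    "append":  items of a list in `override` are appended to the list in `base`.
    """
    merged = dict(base)
    for key, value in override.items():
        both_lists = isinstance(value, list) and isinstance(merged.get(key), list)
        if list_strategy == "append" and both_lists:
            merged[key] = merged[key] + value
        else:
            merged[key] = value
    return merged


image_cfg = {"runcmd": [
    "echo 'running command with runcmd from cloud.cfg.d' >> /var/log/cloud-init.log",
]}
user_data = {"runcmd": [
    "echo 'running command with runcmd from userdata' >> /var/log/cloud-init.log",
]}

replaced = merge_cfg(image_cfg, user_data)
appended = merge_cfg(image_cfg, user_data, list_strategy="append")
print(len(replaced["runcmd"]), len(appended["runcmd"]))
```

The merge_how block in the user-data asks for the "append" outcome, in which both echo commands would run; the issue is about that request not being applied across /etc/cloud/cloud.cfg.d and user-data.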
