  • Why rsyslog collected etcd logs into kernel.log

    etcd version:

    etcd Version: 3.4.9
    Git SHA: 54ba95891
    Go Version: go1.12.17
    Go OS/Arch: linux/amd64
    

    Testing shows that the following rsyslog configuration is what sends the logs into kernel.log:

    #### RULES ####
    
    # Log all kernel messages to the console.
    # Logging much else clutters up the screen.
    kern.*                                                  /var/log/kernel
    

    rsyslogd version:

    rsyslogd 8.29.0, compiled with:
    	PLATFORM:				x86_64-redhat-linux-gnu
    	PLATFORM (lsb_release -d):
    	FEATURE_REGEXP:				Yes
    	GSSAPI Kerberos 5 support:		No
    	FEATURE_DEBUG (debug build, slow code):	No
    	32bit Atomic operations supported:	Yes
    	64bit Atomic operations supported:	Yes
    	memory allocator:			system default
    	Runtime Instrumentation (slow code):	No
    	uuid support:				Yes
    	Number of Bits in RainerScript integers: 64
    
    See http://www.rsyslog.com for more information.
    

    An etcd log entry in journald looks like this:

    {
    	"__CURSOR" : "s=78fdc2bf435b4aa6b7df9f50ff1e9c9f;i=662c5390;b=f3260b93a64641889bbf8fed67f4365a;m=2f4d3ea8e9fc;t=5ee329ba40462;x=82eb53b68142dca7",
    	"__REALTIME_TIMESTAMP" : "1669276010546274",
    	"__MONOTONIC_TIMESTAMP" : "52008810244604",
    	"_BOOT_ID" : "f3260b93a64641889bbf8fed67f4365a",
    	"PRIORITY" : "7",
    	"SYSLOG_IDENTIFIER" : "etcd",
    	"_TRANSPORT" : "journal",
    	"_PID" : "840",
    	"_UID" : "997",
    	"_GID" : "993",
    	"_COMM" : "etcd",
    	"_EXE" : "/usr/local/bin/etcd",
    	"_CMDLINE" : "/usr/local/bin/etcd --config-file /etc/etcd/etcd.conf.yml --log-output stderr",
    	"_CAP_EFFECTIVE" : "0",
    	"_SYSTEMD_CGROUP" : "/system.slice/etcd.service",
    	"_SYSTEMD_UNIT" : "etcd.service",
    	"_SYSTEMD_SLICE" : "system.slice",
    	"_MACHINE_ID" : "c94b645006e94b62b253832779707d12",
    	"_HOSTNAME" : "SVR15178IN5112",
    	"PACKAGE" : "etcdserver/api/v3rpc",
    	"MESSAGE" : "start time = 2022-11-24 15:46:50.545251478 +0800 CST m=+51575280.245951347, time spent = 821.324\uffffffc2\uffffffb5s, remote = 10.4.241.133:54952, response type = /etcdserverpb.KV/Txn, request count = 0, request size = 0, response count = 0, response size = 31, request content = compare:<key:\"cilium/state/identities/v1/id/292984\" version:0 > success:<request_put:<key:\"cilium/state/identities/v1/id/292984\" value_size:196 >> failure:<>",
    	"_SOURCE_REALTIME_TIMESTAMP" : "1669276010546119"
    }
    

    At first glance, the cause is that the library etcd uses to write logs to journald does not set a FACILITY field.

    When rsyslog ingests such an entry, it computes FACILITY as PRIORITY >> 3. The result is 0, which rsyslog interprets as the kern facility.
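
    The arithmetic, as a minimal sketch (it restates the linked macros, not rsyslog's actual code path):

    # A syslog PRI value encodes facility*8 + severity. The etcd entry above
    # carries only PRIORITY=7 (debug) and no SYSLOG_FACILITY, so the bare
    # severity effectively gets treated as a full PRI value:
    priority = 7
    facility = priority >> 3   # -> 0, i.e. kern
    severity = priority & 0x7  # -> 7, i.e. debug
    print(facility, severity)  # 0 7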

    Some of rsyslog's constants: https://github.com/rsyslog/rsyslog/blob/d083a2a2c20df6852a53e45f1e7a3f47679236d6/runtime/rsyslog.h#L202

    The rsyslog macro that computes FACILITY: https://github.com/rsyslog/rsyslog/blob/d083a2a2c20df6852a53e45f1e7a3f47679236d6/runtime/rsyslog.h#L251

  • [Translation] When the Linux OOM killer triggers

    Original: Roughly when the Linux Out-Of-Memory killer triggers (as of mid-2019)

    Originally published 2019-08-11.

    For unrelated reasons, I recently wanted to know when (and why) the Linux OOM killer triggers, and when it doesn't. There is not much detailed documentation on this, and some of it is out of date. I can't add detailed documentation here either, since that would require a solid understanding of the kernel code, but I can at least write down some rough notes for my own use.

    These days there are two different kinds of OOM killer: the global OOM killer and the cgroup-based OOM killer (the latter implemented by the cgroup memory controller). I'm mainly interested in the global one, partly because the cgroup OOM killer is relatively predictable.

    Put simply, the global OOM killer triggers when the kernel has trouble allocating physical pages. When the kernel tries to allocate pages (for whatever purpose, whether for kernel use or for a process that needs memory) and the attempt fails, it tries various ways to reclaim and compact memory. If that succeeds, or at least makes some progress, the kernel keeps retrying the allocation (as far as I can tell from the code); if it fails to free pages or make progress, it triggers the OOM killer in many (but not all) cases.

    (For example, a failed allocation of a large run of contiguous pages does not trigger it; see Decoding the Linux kernel's page allocation failure messages. It only triggers for contiguous allocations of 32 KB or less. git blame shows it has been this way since 2007.)
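
    That size threshold comes from the kernel constant PAGE_ALLOC_COSTLY_ORDER, which is 3, i.e. at most 2^3 contiguous pages; a quick check of the arithmetic:

    # order <= PAGE_ALLOC_COSTLY_ORDER (3) means at most 2**3 = 8 pages;
    # with 4 KiB pages that is the 32 KB threshold mentioned above.
    PAGE_ALLOC_COSTLY_ORDER = 3
    page_size_kib = 4
    print((1 << PAGE_ALLOC_COSTLY_ORDER) * page_size_kib)  # -> 32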

  • [Translation] The SO_REUSEPORT socket option

    Original: The SO_REUSEPORT socket option [LWN.net]

    One of the features merged in the 3.9 development cycle was TCP and UDP support for the SO_REUSEPORT socket option; that support was implemented in a series of patches by Tom Herbert. The new socket option allows multiple sockets on the same host to bind to the same port, and is intended to improve the performance of multithreaded network server applications running on top of multicore systems.

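    A minimal Python sketch of what the option enables (assumes Linux 3.9+ and a Python build that exposes SO_REUSEPORT; the port number is arbitrary):

    import socket

    def listener(port):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Without SO_REUSEPORT, the second bind() below would fail
        # with EADDRINUSE.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind(("127.0.0.1", port))
        s.listen()
        return s

    a = listener(8080)
    b = listener(8080)  # succeeds: the two sockets now share the port
    print(a.getsockname(), b.getsockname())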

  • Parsing the data in kafka-lib's Fetch API

  • Journald Crash

    Yesterday Yage and I debugged an etcd leader switch. It was interesting, so here's a note.

    The root cause: using filebeat to collect journal logs triggered a journald flush, which caused high disk IO, which in turn caused the etcd leader switch.

  • Docker Proxy

    Proxy settings behave differently in dockerd (the Docker daemon) and in the docker CLI.

    Proxy settings in dockerd take effect when it talks to a registry, e.g. docker pull / push / login.

    Commands in a Dockerfile during docker build, and containers started with docker run, use the proxy settings of the docker CLI (not automatically via environment variables; see details below).
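
    As a concrete sketch of the CLI half (proxy.example.com is a placeholder), the docker client reads per-client proxy settings from ~/.docker/config.json and injects them into builds and containers:

    {
      "proxies": {
        "default": {
          "httpProxy": "http://proxy.example.com:3128",
          "httpsProxy": "http://proxy.example.com:3128",
          "noProxy": "localhost,127.0.0.1"
        }
      }
    }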

  • Docker Push 500 Path Not Found

    Harbor has been throwing quite a few 500 errors over the past few days, triggering alerts. Let's dig into the cause.

    The error entry in registry.log:

    Apr 6 11:46:40 172.25.0.1 registry[3915]: time="2022-04-06T03:46:40.800755496Z" level=error msg="response completed with error" auth.user.name="harbor_registry_user" err.code=unknown err.detail="s3aws: Path not found: /docker/registry/v2/repositories/gitlab-ci/100017681/_uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370/data" err.message="unknown error" go.version=go1.15.6 http.request.host=hub.xxx.com http.request.id=5bfcac34-34fa-4a05-ae52-b3fbb3714de2 http.request.method=PATCH http.request.remoteaddr=10.109.6.219 http.request.uri="/v2/gitlab-ci/100017681/blobs/uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370?_state=YmUOEA6F6k5OZsTu9CyQfl5CRNqlVEvx7XKgf_BVqLl7Ik5hbWUiOiJnaXRsYWItY2kvMTAwMDE3NjgxIiwiVVVJRCI6IjlhYzNlYTQ0LWNiYjYtNGExYi04MDQ4LTZlNTk3NjdjNzM3MCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMi0wNC0wNlQwMzo0NjozNi4yNTQ4MzA2MzRaIn0%3D" http.request.useragent="docker/17.09.0-ce go/go1.8.3 git-commit/afdb6d4 kernel/4.19.118-1.el7.centos.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/17.09.0-ce (linux))" http.response.contenttype="application/json; charset=utf-8" http.response.duration=2.255332326s http.response.status=500 http.response.written=203 vars.name="gitlab-ci/100017681" vars.uuid=9ac3ea44-cbb6-4a1b-8048-6e59767c7370

    Let's read the code and trace the docker push flow.

    On the Docker CLI side, uploading a blob works like this (a minimal sketch of the two calls follows the list):

    1. docker cli calls POST /v2/<name>/blobs/uploads/ to obtain an upload UUID
    2. docker cli calls PATCH /v2/<name>/blobs/uploads/<uuid> to upload the data
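
    A minimal Python sketch of the two calls (the host and repo come from the logs below; a real client also finishes with a PUT to commit the blob, and authentication is omitted here):

    import requests
    from urllib.parse import urljoin

    # Step 1: POST opens an upload session; the Location header carries the
    # upload UUID (plus the _state parameter visible in the logs below).
    r1 = requests.post("https://hub.xxx.com/v2/gitlab-ci/100017681/blobs/uploads/")
    location = urljoin(r1.url, r1.headers["Location"])

    # Step 2: PATCH streams the blob bytes to the upload URL from step 1.
    r2 = requests.patch(location, data=b"<blob bytes>",
                        headers={"Content-Type": "application/octet-stream"})
    print(r1.status_code, r2.status_code)  # 202 expected for both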

    Now let's focus on how the Registry code handles these two operations:

    1. buh.StartBlobUpload
      1. calls into s3.Writer(ctx context.Context, path string, append bool) with append = false
      2. calls d.S3.CreateMultipartUpload
      3. generates a UUID
      4. calls the S3 API to create an object: POST /{Bucket}/{Key+}?uploads
      5. TL;DR: create an object in S3, then generate a UUID and return it to the client.
    2. buh.PatchBlobData
      1. calls into buh.ResumeBlobUpload
      2. calls blobs.Resume(buh, buh.UUID)
      3. calls into s3.Writer(ctx context.Context, path string, append bool) with append = true
      4. iterates over the results of d.S3.ListMultipartUploads; when the upload is not found, returns storagedriver.PathNotFoundError
      5. TL;DR: look up the object by UUID and write the data into it.

    The flow itself looks fine: create the object, then write to it. The problem is that the S3 object creation in step one is handled asynchronously, so when the second request arrives, the object may not exist yet.

    Verifying with logs

    Looking at the S3 and Harbor logs basically confirms this, though I haven't found code-level proof of the asynchrony; that would mean digging into the S3 SDK, which I can't face right now.

    Let's take one failed request and walk it through the logs.

    Its entries in registry.log:

    Apr  6 11:46:38 172.25.0.1 registry[3915]: time="2022-04-06T03:46:38.546533285Z" level=debug msg="authorizing request" go.version=go1.15.6 http.request.host=hub.xxx.com http.request.id=5bfcac34-34fa-4a05-ae52-b3fbb3714de2 http.request.method=PATCH http.request.remoteaddr=10.109.6.219 http.request.uri="/v2/gitlab-ci/100017681/blobs/uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370?_state=YmUOEA6F6k5OZsTu9CyQfl5CRNqlVEvx7XKgf_BVqLl7Ik5hbWUiOiJnaXRsYWItY2kvMTAwMDE3NjgxIiwiVVVJRCI6IjlhYzNlYTQ0LWNiYjYtNGExYi04MDQ4LTZlNTk3NjdjNzM3MCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMi0wNC0wNlQwMzo0NjozNi4yNTQ4MzA2MzRaIn0%3D" http.request.useragent="docker/17.09.0-ce go/go1.8.3 git-commit/afdb6d4 kernel/4.19.118-1.el7.centos.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/17.09.0-ce \(linux\))" vars.name="gitlab-ci/100017681" vars.uuid=9ac3ea44-cbb6-4a1b-8048-6e59767c7370
    Apr  6 11:46:38 172.25.0.1 registry[3915]: time="2022-04-06T03:46:38.549190549Z" level=info msg="authorized request" go.version=go1.15.6 http.request.host=hub.xxx.com http.request.id=5bfcac34-34fa-4a05-ae52-b3fbb3714de2 http.request.method=PATCH http.request.remoteaddr=10.109.6.219 http.request.uri="/v2/gitlab-ci/100017681/blobs/uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370?_state=YmUOEA6F6k5OZsTu9CyQfl5CRNqlVEvx7XKgf_BVqLl7Ik5hbWUiOiJnaXRsYWItY2kvMTAwMDE3NjgxIiwiVVVJRCI6IjlhYzNlYTQ0LWNiYjYtNGExYi04MDQ4LTZlNTk3NjdjNzM3MCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMi0wNC0wNlQwMzo0NjozNi4yNTQ4MzA2MzRaIn0%3D" http.request.useragent="docker/17.09.0-ce go/go1.8.3 git-commit/afdb6d4 kernel/4.19.118-1.el7.centos.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/17.09.0-ce \(linux\))" vars.name="gitlab-ci/100017681" vars.uuid=9ac3ea44-cbb6-4a1b-8048-6e59767c7370
    Apr  6 11:46:38 172.25.0.1 registry[3915]: time="2022-04-06T03:46:38.549281495Z" level=debug msg="(*linkedBlobStore).Resume" auth.user.name="harbor_registry_user" go.version=go1.15.6 http.request.host=hub.xxx.com http.request.id=5bfcac34-34fa-4a05-ae52-b3fbb3714de2 http.request.method=PATCH http.request.remoteaddr=10.109.6.219 http.request.uri="/v2/gitlab-ci/100017681/blobs/uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370?_state=YmUOEA6F6k5OZsTu9CyQfl5CRNqlVEvx7XKgf_BVqLl7Ik5hbWUiOiJnaXRsYWItY2kvMTAwMDE3NjgxIiwiVVVJRCI6IjlhYzNlYTQ0LWNiYjYtNGExYi04MDQ4LTZlNTk3NjdjNzM3MCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMi0wNC0wNlQwMzo0NjozNi4yNTQ4MzA2MzRaIn0%3D" http.request.useragent="docker/17.09.0-ce go/go1.8.3 git-commit/afdb6d4 kernel/4.19.118-1.el7.centos.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/17.09.0-ce \(linux\))" vars.name="gitlab-ci/100017681" vars.uuid=9ac3ea44-cbb6-4a1b-8048-6e59767c7370
    Apr  6 11:46:38 172.25.0.1 registry[3915]: time="2022-04-06T03:46:38.552098655Z" level=debug msg="s3aws.GetContent("/docker/registry/v2/repositories/gitlab-ci/100017681/_uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370/startedat")" auth.user.name="harbor_registry_user" go.version=go1.15.6 http.request.host=hub.xxx.com http.request.id=5bfcac34-34fa-4a05-ae52-b3fbb3714de2 http.request.method=PATCH http.request.remoteaddr=10.109.6.219 http.request.uri="/v2/gitlab-ci/100017681/blobs/uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370?_state=YmUOEA6F6k5OZsTu9CyQfl5CRNqlVEvx7XKgf_BVqLl7Ik5hbWUiOiJnaXRsYWItY2kvMTAwMDE3NjgxIiwiVVVJRCI6IjlhYzNlYTQ0LWNiYjYtNGExYi04MDQ4LTZlNTk3NjdjNzM3MCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMi0wNC0wNlQwMzo0NjozNi4yNTQ4MzA2MzRaIn0%3D" http.request.useragent="docker/17.09.0-ce go/go1.8.3 git-commit/afdb6d4 kernel/4.19.118-1.el7.centos.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/17.09.0-ce \(linux\))" trace.duration=2.760534ms trace.file="/go/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).GetContent" trace.id=aa22d5f1-24db-4d80-8867-bec6b3daadc2 trace.line=95 vars.name="gitlab-ci/100017681" vars.uuid=9ac3ea44-cbb6-4a1b-8048-6e59767c7370
    Apr  6 11:46:40 172.25.0.1 registry[3915]: time="2022-04-06T03:46:40.800564701Z" level=debug msg="s3aws.Writer("/docker/registry/v2/repositories/gitlab-ci/100017681/_uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370/data", true)" auth.user.name="harbor_registry_user" go.version=go1.15.6 http.request.host=hub.xxx.com http.request.id=5bfcac34-34fa-4a05-ae52-b3fbb3714de2 http.request.method=PATCH http.request.remoteaddr=10.109.6.219 http.request.uri="/v2/gitlab-ci/100017681/blobs/uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370?_state=YmUOEA6F6k5OZsTu9CyQfl5CRNqlVEvx7XKgf_BVqLl7Ik5hbWUiOiJnaXRsYWItY2kvMTAwMDE3NjgxIiwiVVVJRCI6IjlhYzNlYTQ0LWNiYjYtNGExYi04MDQ4LTZlNTk3NjdjNzM3MCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMi0wNC0wNlQwMzo0NjozNi4yNTQ4MzA2MzRaIn0%3D" http.request.useragent="docker/17.09.0-ce go/go1.8.3 git-commit/afdb6d4 kernel/4.19.118-1.el7.centos.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/17.09.0-ce \(linux\))" trace.duration=2.248339896s trace.file="/go/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).Writer" trace.id=8dfdaa69-9ace-4c3e-86cc-ef9c330b5106 trace.line=142 vars.name="gitlab-ci/100017681" vars.uuid=9ac3ea44-cbb6-4a1b-8048-6e59767c7370
    Apr  6 11:46:40 172.25.0.1 registry[3915]: time="2022-04-06T03:46:40.800637996Z" level=error msg="error resolving upload: s3aws: Path not found: /docker/registry/v2/repositories/gitlab-ci/100017681/_uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370/data" auth.user.name="harbor_registry_user" go.version=go1.15.6 http.request.host=hub.xxx.com http.request.id=5bfcac34-34fa-4a05-ae52-b3fbb3714de2 http.request.method=PATCH http.request.remoteaddr=10.109.6.219 http.request.uri="/v2/gitlab-ci/100017681/blobs/uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370?_state=YmUOEA6F6k5OZsTu9CyQfl5CRNqlVEvx7XKgf_BVqLl7Ik5hbWUiOiJnaXRsYWItY2kvMTAwMDE3NjgxIiwiVVVJRCI6IjlhYzNlYTQ0LWNiYjYtNGExYi04MDQ4LTZlNTk3NjdjNzM3MCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMi0wNC0wNlQwMzo0NjozNi4yNTQ4MzA2MzRaIn0%3D" http.request.useragent="docker/17.09.0-ce go/go1.8.3 git-commit/afdb6d4 kernel/4.19.118-1.el7.centos.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/17.09.0-ce \(linux\))" vars.name="gitlab-ci/100017681" vars.uuid=9ac3ea44-cbb6-4a1b-8048-6e59767c7370
    Apr  6 11:46:40 172.25.0.1 registry[3915]: time="2022-04-06T03:46:40.800755496Z" level=error msg="response completed with error" auth.user.name="harbor_registry_user" err.code=unknown err.detail="s3aws: Path not found: /docker/registry/v2/repositories/gitlab-ci/100017681/_uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370/data" err.message="unknown error" go.version=go1.15.6 http.request.host=hub.xxx.com http.request.id=5bfcac34-34fa-4a05-ae52-b3fbb3714de2 http.request.method=PATCH http.request.remoteaddr=10.109.6.219 http.request.uri="/v2/gitlab-ci/100017681/blobs/uploads/9ac3ea44-cbb6-4a1b-8048-6e59767c7370?_state=YmUOEA6F6k5OZsTu9CyQfl5CRNqlVEvx7XKgf_BVqLl7Ik5hbWUiOiJnaXRsYWItY2kvMTAwMDE3NjgxIiwiVVVJRCI6IjlhYzNlYTQ0LWNiYjYtNGExYi04MDQ4LTZlNTk3NjdjNzM3MCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMi0wNC0wNlQwMzo0NjozNi4yNTQ4MzA2MzRaIn0%3D" http.request.useragent="docker/17.09.0-ce go/go1.8.3 git-commit/afdb6d4 kernel/4.19.118-1.el7.centos.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/17.09.0-ce \(linux\))" http.response.contenttype="application/json; charset=utf-8" http.response.duration=2.255332326s http.response.status=500 http.response.written=203 vars.name="gitlab-ci/100017681" vars.uuid=9ac3ea44-cbb6-4a1b-8048-6e59767c7370
    

    The corresponding Registry access logs:

    The step-one request:

    request-1

    The step-two request:

    request-2

    The S3 access log for the step-one request:

    request-3

    Two things stand out:

    1. The S3 request took 2.2 seconds to complete, but the Registry side had already returned after 58 ms, which is why we suspect asynchrony.
    2. By the time S3 finished processing, the client had already issued step two, hence the PathNotFoundError.
  • Some notes on OIDC authentication in Harbor

    The version used is Harbor v2.2.0. This is just a record of some OIDC-related details; it assumes some prior knowledge of SSO and the Docker HTTP V2 API.

    The Harbor Core component has two fairly independent jobs: providing the token service, and reverse-proxying the Registry behind it. Both interact with OIDC.

    Broadly speaking, Harbor uses OIDC in two places: logging in on the web UI, and authenticating docker login/pull/push.

    An important OIDC-related table in the database is oidc_user. It has two important columns: secret, which is the password, and token, which is used for verification (one more layer of security on top of the password?).

  • Brief cgroup v2 study notes

    Enabling cgroup v2

    Test environment:

    # cat /etc/centos-release
    CentOS Linux release 7.6.1810 (Core)
    
    # systemctl --version
    systemd 219
    +PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN
    
    # uname -a
    Linux VMS171583 5.10.56-trip20211224.el7.centos.x86_64 #1 SMP Fri Dec 24 02:11:17 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
    

    If a controller (cpu, memory, etc.) is already bound to cgroup v1, it cannot also be bound to cgroup v2.

    systemd is the PID 1 process, so the first step is to get systemd onto cgroup v2.

    grubby --update-kernel=ALL --args=systemd.unified_cgroup_hierarchy=1
    

    The command above adds a kernel boot parameter; after a reboot, systemd should use cgroup v2.

    But in my test environment this systemd version does not yet support cgroup v2, so the parameter has no effect: https://github.com/systemd/systemd/issues/19760

    So cgroup v1 has to be forcibly disabled instead:

    grubby --update-kernel=ALL --args=cgroup_no_v1=all
    

    After rebooting, mount | grep cgroup shows that the cgroup v1 mounts are gone. (Perhaps systemd can no longer apply resource controls to services? Not verified.)

    Mounting cgroup v2

    # mkdir /tmp/abcd
    # mount -t cgroup2 nodev /tmp/abcd/
    # mount | grep cgroup2
    nodev on /tmp/abcd type cgroup2 (rw,relatime)
    

    cgroup v2 is mounted; now we can use it for resource control.

    Enabling a CPU limit

    Check which controllers are available in v2:

    # cat cgroup.controllers
    cpuset cpu io memory hugetlb pids
    

    Create a child cgroup:

    # cd /tmp/abcd
    [root@VMS171583 abcd]# mkdir t
    

    Enable the CPU controller:

    cd /tmp/abcd
    
    [root@VMS171583 abcd]# ll t
    total 0
    -r--r--r-- 1 root root 0 Mar 22 15:10 cgroup.controllers
    -r--r--r-- 1 root root 0 Mar 22 15:10 cgroup.events
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.freeze
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.max.depth
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.max.descendants
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.procs
    -r--r--r-- 1 root root 0 Mar 22 15:10 cgroup.stat
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.subtree_control
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.threads
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.type
    -r--r--r-- 1 root root 0 Mar 22 15:10 cpu.stat
    
    echo "+cpu" > cgroup.subtree_control
    
    [root@VMS171583 abcd]# ll t
    total 0
    -r--r--r-- 1 root root 0 Mar 22 15:10 cgroup.controllers
    -r--r--r-- 1 root root 0 Mar 22 15:10 cgroup.events
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.freeze
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.max.depth
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.max.descendants
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.procs
    -r--r--r-- 1 root root 0 Mar 22 15:10 cgroup.stat
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.subtree_control
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.threads
    -rw-r--r-- 1 root root 0 Mar 22 15:10 cgroup.type
    -rw-r--r-- 1 root root 0 Mar 22 15:14 cpu.max
    -r--r--r-- 1 root root 0 Mar 22 15:10 cpu.stat
    -rw-r--r-- 1 root root 0 Mar 22 15:14 cpu.weight
    -rw-r--r-- 1 root root 0 Mar 22 15:14 cpu.weight.nice
    

    The CPU interface files were created automatically in the child.

    Now set the CPU limit (20000 100000 means a 20 ms quota per 100 ms period, i.e. 20% of one CPU):

    cd /tmp/abcd/t
    echo 20000 100000 > cpu.max
    

    Run a Python busy-loop script and watch its CPU usage sit at 100%.
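
    A minimal stand-in for that script (any busy loop works; busy.py is just an illustrative name):

    # busy.py: burn one CPU; print the PID so it can be written into
    # cgroup.procs below.
    import os
    print(os.getpid())
    while True:
        pass

    Then move the process into the cgroup: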

    cd /tmp/abcd/t
    echo $pythonPID > cgroup.procs
    

    The python process's CPU usage is now capped at 20%.
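
    As an extra check (field names are from the cgroup v2 documentation; values vary per run), the throttling counters in cpu.stat should keep growing while the loop runs:

    # nr_throttled and throttled_usec increasing over time confirms the
    # 20% cap is being enforced.
    print(open("/tmp/abcd/t/cpu.stat").read())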

  • Docker Push 503

    A quick note on a docker push that kept retrying and eventually failed with received unexpected HTTP status: 503 Service Unavailable.
