Articles published by wenmo8

Simplifying ASP.NET Core Dependency Injection (DI) Registration with Scrutor

Author: wenmo8
Date: 2024-12-12
Category: Other

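Scrutor is an open-source library that simplifies dependency injection (DI) registration in ASP.NET Core applications. It scans the types in an assembly and registers services automatically according to the rules you provide.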

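Basic usage

Install the NuGet package with the .NET CLI:

dotnet add package Scrutor

Steps

A scan is described by four parts:

Selector: specifies which implementations to register.
Registration strategy: specifies how to handle duplicate services.
Services: specifies the service types each implementation is registered as.
Lifetime: the lifetime of the service, such as Singleton, Scoped, or Transient.

Select the assembly: specify which assembly to scan for types.

services.Scan(scan => scan
    .FromAssemblyOf<MyClass>()); // scan the assembly that contains MyClass

Add classes: filter the classes to register.

services.Scan(scan => scan
    .FromAssemblyOf<MyClass>()
    .AddClasses(classes => classes.Where(t => t.Name.EndsWith("Service")))); // only classes whose name ends with "Service"

Specify how to register: specify which interfaces the classes are registered as.

services.Scan(scan => scan
    .FromAssemblyOf<MyClass>()
    .AddClasses(classes => classes.Where(t => t.Name.EndsWith("Service")))
    .AsImplementedInterfaces()); // register as every implemented interface

Set the lifetime: set the lifetime of the registered services.

services.Scan(scan => scan
    .FromAssemblyOf<MyClass>()
    .AddClasses(classes => classes.Where(t => t.Name.EndsWith("Service")))
    .AsImplementedInterfaces()
    .WithScopedLifetime()); // Scoped lifetime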

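Advanced usage

Scanning for services

FromAssemblyOf<T>(): scans the assembly that contains the specified type T.
FromAssembliesOf(params Type[] types): accepts several types; Scrutor scans every assembly that contains one of them.
FromCallingAssembly(): scans the assembly that calls the Scan method.
FromExecutingAssembly(): scans the assembly in which the Scan method is executing.
FromEntryAssembly(): scans the application's entry assembly (usually the one that contains the Main method).
And so on.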

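Decorating services

Service decoration is a technique for adding extra functionality to a service dynamically, without changing the existing service implementation.

Example

The decorator pattern lets you add behavior to a service without modifying the existing class. For example, to add a logging decorator to the IMessageSender interface: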

public interface IMessageSender
{
 void SendMessage(string message);
}

public class EmailMessageSender : IMessageSender
{
 public void SendMessage(string message)
 {
  Console.WriteLine($"Sending email: {message}");
 }
}

public class LoggingMessageSender : IMessageSender
{
 private readonly IMessageSender _inner;
 public LoggingMessageSender(IMessageSender inner)
 {
  _inner = inner;
 }
 public void SendMessage(string message)
 {
  Console.WriteLine("Starting to send message...");
  _inner.SendMessage(message);
  Console.WriteLine("Message sent successfully.");
 }
}

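// Register the services and the decorator with Scrutor.
// The decorator itself is excluded from the scan so that it is only applied through Decorate.
services.Scan(scan => scan
    .FromAssemblyOf<IMessageSender>()
    .AddClasses(classes => classes
        .AssignableTo<IMessageSender>()
        .Where(t => t != typeof(LoggingMessageSender)))
    .AsImplementedInterfaces()
    .WithTransientLifetime());
services.Decorate<IMessageSender, LoggingMessageSender>();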

Lifetime management

To give specific services different lifetimes, chain several AddClasses calls, each followed by its own lifetime setting:

services.Scan(scan => scan
    .FromAssemblyOf<CombinedService>()
    .AddClasses(classes => classes.AssignableTo<ICombinedService>()) // filter the services
        .AsSelfWithInterfaces() // register as the class itself and as its interfaces
        .WithSingletonLifetime() // Singleton lifetime
    // other services get a Scoped lifetime
    .AddClasses(x => x.AssignableTo(typeof(IOpenGeneric<>))) // open generic types can be closed
        .AsMatchingInterface()
        .WithScopedLifetime()); // Scoped lifetime

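Registration strategy

services.Scan(scan => scan
    .FromAssemblyOf<MyService>()
    .AddClasses()
    .UsingRegistrationStrategy(RegistrationStrategy.Skip) // skip the registration if the service already exists
    .AsSelf()
    .WithTransientLifetime());

Chained registration

Scrutor lets you combine several scans into a single call chain and apply a different subset of rules to different classes: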

services.Scan(scan => scan
 .FromAssemblyOf<Startup>()
 .AddClasses(classes => classes.Where(t => t.Name.EndsWith("Repository")))
  .AsImplementedInterfaces()
  .WithTransientLifetime()
 .AddClasses(classes => classes.Where(t => t.Name.EndsWith("Service")))
  .AsImplementedInterfaces()
  .WithScopedLifetime());

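Summary

Scrutor simplifies dependency injection configuration in real projects, especially large ones. Its flexible scanning and registration mechanism keeps service registration concise and maintainable.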

Repository:
https://github.com/khellang/Scrutor

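A Brief Look at Rvalue References, Move Semantics, Perfect Forwarding, std::move and std::forward, with a Glimpse of Template Metaprogramming

Author: wenmo8
Date: 2024-12-12
Category: Other

I won't explain what rvalue references, move semantics and perfect forwarding are; a quick search turns up plenty of material. This post focuses on std::move and std::forward.

std::move std::forward

Let's look at the source. The GCC version is gcc version 7.3.0 (GCC). Running grep -r "forward(" /usr/include/c++/7.3.0/bits/ shows that both move and forward live in /usr/include/c++/7.3.0/bits/move.h. The source is as follows: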

/**
 *  @brief  Convert a value to an rvalue.
 *  @param  __t  A thing of arbitrary type.
 *  @return The parameter cast to an rvalue-reference to allow moving it.
 */
template<typename _Tp>
  constexpr typename std::remove_reference<_Tp>::type&&
  move(_Tp&& __t) noexcept
  { return static_cast<typename std::remove_reference<_Tp>::type&&>(__t); }

/**
 *  @brief  Forward an lvalue.
 *  @return The parameter cast to the specified type.
 *
 *  This function is used to implement "perfect forwarding".
 */
template<typename _Tp>
  constexpr _Tp&&
  forward(typename std::remove_reference<_Tp>::type& __t) noexcept
  { return static_cast<_Tp&&>(__t); }

/**
 *  @brief  Forward an rvalue.
 *  @return The parameter cast to the specified type.
 *
 *  This function is used to implement "perfect forwarding".
 */
template<typename _Tp>
  constexpr _Tp&&
  forward(typename std::remove_reference<_Tp>::type&& __t) noexcept
  {
    static_assert(!std::is_lvalue_reference<_Tp>::value, "template argument"
                  " substituting _Tp is an lvalue reference type");
    return static_cast<_Tp&&>(__t);
  }

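Both are essentially just casts. std::move does not perform any "move" by itself; it only casts its argument to an rvalue reference.

Written with C++14 features, they become even simpler: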

// C++14 version of std::move
template<typename _Tp>
constexpr decltype(auto) move(_Tp&& __t) noexcept
{
    return static_cast<std::remove_reference_t<_Tp>&&>(__t);
}

// C++14 version of std::forward for lvalues
template<typename _Tp>
constexpr decltype(auto) forward(std::remove_reference_t<_Tp>& __t) noexcept
{
    return static_cast<_Tp&&>(__t);
}

// C++14 version of std::forward for rvalues
template<typename _Tp>
constexpr decltype(auto) forward(std::remove_reference_t<_Tp>&& __t) noexcept
{
    static_assert(!std::is_lvalue_reference_v<_Tp>,
                  "template argument substituting _Tp is an lvalue reference type");
    return static_cast<_Tp&&>(__t);
}

Here is a test program:

#include <iostream>
#include <utility>      // for std::move, std::forward
#include <type_traits>  // for remove_reference_t, is_lvalue_reference_v

// C++14 version of std::move
template<typename _Tp>
constexpr decltype(auto) move(_Tp&& __t) noexcept
{
    return static_cast<std::remove_reference_t<_Tp>&&>(__t);
}

// C++14 version of std::forward for lvalues
template<typename _Tp>
constexpr decltype(auto) forward(std::remove_reference_t<_Tp>& __t) noexcept
{
    return static_cast<_Tp&&>(__t);
}

// C++14 version of std::forward for rvalues
template<typename _Tp>
constexpr decltype(auto) forward(std::remove_reference_t<_Tp>&& __t) noexcept
{
    static_assert(!std::is_lvalue_reference_v<_Tp>,
                  "template argument substituting _Tp is an lvalue reference type");
    return static_cast<_Tp&&>(__t);
}

// Test class with move and copy constructors
class Widget {
public:
    Widget() { std::cout << "Widget default constructor\n"; }

    Widget(const Widget&) {
        std::cout << "Widget copy constructor\n";
    }

    Widget(Widget&&) noexcept {
        std::cout << "Widget move constructor\n";
    }
};

// Function to test std::forward
template <typename T>
void forward_test(T&& arg) {
    Widget w = std::forward<T>(arg);
}

// Note: the calls below are qualified with std::, so they use the standard
// library versions; write ::move / ::forward to exercise the versions above.
int main() {
    // Test std::move
    Widget widget1;
    std::cout << "Using std::move:\n";
    Widget widget2 = std::move(widget1);  // Should call move constructor

    // Test std::forward with lvalue
    std::cout << "\nUsing std::forward with lvalue:\n";
    Widget widget3;
    forward_test(widget3);  // Should call copy constructor

    // Test std::forward with rvalue
    std::cout << "\nUsing std::forward with rvalue:\n";
    forward_test(Widget());  // Should call move constructor

    return 0;
}

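Because is_lvalue_reference_v is only available since C++17, compile with: g++ test_move_forward.cpp -o test_move_forward -std=c++17

Tag dispatch

There is a global set called names, and two functions need to be defined: a function template that takes a universal reference, and a function that takes a plain int (it looks up the name by id; that implementation is omitted). The code is as follows: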

#include <iostream>
#include <string>
#include <type_traits>
#include <unordered_set>
#include <utility>  // for std::forward

// Global data structure
std::unordered_set<std::string> names;

// Logging function
void log(const char* message) {
    std::cout << "Log:" << message << std::endl;
}

// Template version
template<typename T>
void logAndAdd(T&& name) {
    log("logAndAdd (perfect forwarding)");
    names.emplace(std::forward<T>(name));
}

void logAndAdd(int idx) {
    log("logAndAdd (int version)");
    // Logic for the int case
}

int main() {
    std::string name = "Alice";
    int idx = 42;

    // Test an lvalue
    logAndAdd(name);  // Should call the template version

    // Test an rvalue
    logAndAdd(std::string("Bob"));  // Should call the template version

    // Test the int type
    logAndAdd(idx);

    // Test the short type
    short idx2 = 222;
    logAndAdd(idx2);

    return 0;
}

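Without the two lines that test the short type, the code above works. With them, the short argument matches the perfect-forwarding template (an exact match beats the short-to-int promotion), and compilation fails because a std::string cannot be constructed from a short. Let's first solve this with tag dispatch, implemented here with if constexpr and std::enable_if_t: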

#include <chrono>
#include <ctime>    // for std::ctime
#include <iostream>
#include <string>
#include <type_traits>
#include <unordered_set>
#include <utility>  // for std::forward, std::move

// Global data structure
std::unordered_set<std::string> names;

// Logging function
void log(const char* message) {
    auto now = std::chrono::system_clock::now();
    auto time = std::chrono::system_clock::to_time_t(now);
    std::cout << "Log [" << std::ctime(&time) << "]:" << message << std::endl;
}

// Perfect-forwarding version
template<typename T>
auto logAndAddImpl(T&& name) -> std::enable_if_t<
    !std::is_convertible_v<T, int>, void
> {
    log("logAndAdd (perfect forwarding)");
    names.emplace(std::forward<T>(name));
}

// Plain version: handles int and types implicitly convertible to int
void logAndAddImpl(int idx) {
    log("logAndAdd (int version)");
    // Logic for the int case,
    // for example converting the int to a string and adding it to the set
    names.insert(std::to_string(idx));
}

// Dispatch function
template<typename T>
void logAndAdd(T&& name) {
    if constexpr (std::is_convertible_v<T, int>) {
        logAndAddImpl(static_cast<int>(std::forward<T>(name)));
    } else {
        logAndAddImpl(std::forward<T>(name));
    }
}

// Additional non-template version dedicated to int
void logAndAdd(int idx) {
    logAndAddImpl(idx);
}

int main() {
    std::string name = "Alice";
    int idx = 42;
    short idx2 = 222;

    // Test an lvalue
    std::cout << "Testing lvalue:\n";
    logAndAdd(name);  // Should call the perfect-forwarding version

    // Test an rvalue
    std::cout << "\nTesting rvalue:\n";
    logAndAdd(std::string("Bob"));  // Should call the perfect-forwarding version

    // Test the int type
    std::cout << "\nTesting int type:\n";
    logAndAdd(idx);  // Should call the plain version

    // Test the short type
    std::cout << "\nTesting short type:\n";
    logAndAdd(idx2);  // Should call the plain version

    // Print the names in the global data structure
    std::cout << "\nNames in the global set:\n";
    for (const auto& name : names) {
        std::cout << name << std::endl;
    }
    return 0;
}

SFINAE (enable_if)

The code is as follows:

#include <chrono>
#include <ctime>    // for std::ctime
#include <iostream>
#include <string>
#include <type_traits>
#include <unordered_set>
#include <utility>  // for std::forward, std::move

// Global data structure
std::unordered_set<std::string> names;

// Logging function
void log(const char* message) {
    auto now = std::chrono::system_clock::now();
    auto time = std::chrono::system_clock::to_time_t(now);
    std::cout << "Log [" << std::ctime(&time) << "]:" << message << std::endl;
}

// Perfect-forwarding version
template<typename T>
auto logAndAdd(T&& name) -> std::enable_if_t<
    !std::is_convertible_v<T, int>, void
> {
    log("logAndAdd (perfect forwarding)");
    names.emplace(std::forward<T>(name));
}

// Plain version: handles int and types implicitly convertible to int
template<typename T>
auto logAndAdd(T&& idx) -> std::enable_if_t<
    std::is_convertible_v<T, int>, void
> {
    log("logAndAdd (int version)");
    // Logic for the int case,
    // for example converting the int to a string and adding it to the set
    names.insert(std::to_string(static_cast<int>(idx)));
}

// Additional non-template version dedicated to int
void logAndAdd(int idx) {
    log("logAndAdd (int version)");
    names.insert(std::to_string(idx));
}

int main() {
    std::string name = "Alice";
    int idx = 42;
    short idx2 = 222;

    // Test an lvalue
    std::cout << "Testing lvalue:\n";
    logAndAdd(name);  // Should call the perfect-forwarding version

    // Test an rvalue
    std::cout << "\nTesting rvalue:\n";
    logAndAdd(std::string("Bob"));  // Should call the perfect-forwarding version

    // Test the int type
    std::cout << "\nTesting int type:\n";
    logAndAdd(idx);  // Should call the plain version

    // Test the short type
    std::cout << "\nTesting short type:\n";
    logAndAdd(idx2);  // Should call the plain version

    // Print the names in the global data structure
    std::cout << "\nNames in the global set:\n";
    for (const auto& name : names) {
        std::cout << name << std::endl;
    }
    return 0;
}

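Template specialization is another option. I won't write the code for it here; it makes my head hurt.

Summary

Once you step into templates, the sea is deep. Two books I recommend: Effective Modern C++ and C++ Templates. If you know other good books, please recommend them in the comments. Thanks!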

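Analyzing RocksDB Memory Overuse

Author: wenmo8
Date: 2024-12-12
Category: Other

Author: Zeng Luobin, vivo Internet Server Team

Some developers using the RocksDB storage engine have seen memory usage exceed expectations. This article analyzes the problem in depth, covering how memory is used, RocksDB's memory management mechanisms, and common memory usage problems, and proposes solutions and tuning suggestions. The goal is to help developers better understand and optimize RocksDB memory usage and improve system performance and stability.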

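1. Background

1.1 Introduction

In modern database systems, RocksDB is a high-performance key-value storage engine widely used in scenarios that need high throughput and low latency. In practice, however, RocksDB's memory usage has often been observed to exceed the configured threshold, which seriously threatens system stability and availability.

RocksDB provides the block-cache-size parameter to control cache usage. Developers can set the cache size with the following snippet: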

std::shared_ptr<rocksdb::Cache> cache = rocksdb::NewLRUCache(cache_size, -1, true);

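In practice, however, RocksDB's memory footprint often exceeds the configured cache_size. This unpredictability makes memory allocation hard to control and can even trigger OOM (Out of Memory) errors, seriously affecting service continuity and reliability.

Other developers have reported similar memory overuse problems, and the issue has drawn wide attention in the GitHub community.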

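1.2 Memory analysis workflow

Many Linux command-line tools can help when analyzing memory. The basic approach is:

Image source:
https://learn.lianglianglee.com/

1. Start with free and top to check overall system memory usage.
2. Then use vmstat and pidstat to watch the trend over a period of time and determine the type of memory problem.
3. Finally, analyze in detail: memory allocation analysis, cache/buffer analysis, per-process memory usage analysis, and so on.

Steps 1 and 2 reveal the symptoms of a memory problem. The hardest part is usually step 3, analyzing how the memory is actually used. There you need to form hypotheses about the root cause based on the business code and then verify them with tools. The process is much like running an experiment: form a hypothesis, collect data, verify the hypothesis, draw a conclusion. The rest of this article also uses memory tools for the analysis, for readers' reference.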

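2. Problem description

Against the background described above, our production environment ran into a similar problem. The application uses glibc's ptmalloc as its memory allocator and contains two RocksDB instances that store different types of data. Their block-cache-size values are configured as 4 GB and 8 GB respectively. Actual memory consumption, however, far exceeded these values, so overall memory usage was significantly higher than expected.

Running free -g showed that the program's memory usage had reached 59 GB, close to the physical server's memory capacity. Periodically running vmstat 3 showed that memory usage kept rising from service startup until it approached 100%. System memory was extremely tight, with a risk of triggering an OOM (Out of Memory) error.

This confirmed that there was a memory management problem. Further analysis together with the source code was needed to identify the root cause of the abnormal memory usage and to find a fix.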

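3. Analysis

3.1 Memory leak analysis

All of the following analysis was done in an internal test environment on a 16-core, 32 GB machine. At first we suspected that RocksDB had a memory leak, continually allocating memory without releasing it.

Common tools for analyzing memory leaks include valgrind, memleak, strace, and jemalloc's jeprof. We used jeprof. It works by instrumenting malloc and free calls and collecting data, and it can be configured to dump the data periodically.

Using the db.getProperty() method provided by RocksDB to read the memory used by each module gave the following result: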

rocksdb.estimate-table-readers-mem: 16014055104  // key figure
rocksdb.block-cache-usage: 1073659024  // key figure

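Two items account for most of the memory: block-cache-usage and estimate-table-readers-mem. These properties correspond to RocksDB's block_cache and its indexes/filters respectively.

Over time, however, block_cache and indexes/filters reach an equilibrium and stop growing, which is inconsistent with the hypothesis that RocksDB leaks memory.

Next we analyzed the call stacks through which RocksDB allocates memory. Because glibc ptmalloc cannot print call stacks, we switched from glibc ptmalloc to jemalloc and used jeprof to print the allocation call stacks. jemalloc is installed as follows: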

# Switching to jemalloc requires no code changes in the service.
# It can be injected at load time by pointing LD_PRELOAD at libjemalloc.so.
# jemalloc must first be installed on the Linux machine:
wget -O jemalloc-5.1.0.tar.gz https://github.com/jemalloc/jemalloc/archive/5.1.0.tar.gz
tar zxvf jemalloc-5.1.0.tar.gz
cd jemalloc-5.1.0/
./autogen.sh
./configure --prefix=/usr/local/jemalloc-5.1.0 --enable-prof
make && make install_bin install_include install_lib

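In the commands above, --enable-prof enables jemalloc's profiling (jeprof) support.

After installation, jemalloc's malloc and free are enabled through the LD_PRELOAD environment variable. LD_PRELOAD works by replacing glibc's malloc/free with jemalloc's implementations when the program is loaded.

Start the program with the following commands: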

export MALLOC_CONF="prof:true,lg_prof_interval:29"
LD_PRELOAD=/usr/local/jemalloc-5.1.0/lib/libjemalloc.so ./process_start

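In the commands above, export MALLOC_CONF="prof:true,lg_prof_interval:29" turns on jeprof data collection; a profile is recorded roughly every 2^29 bytes (512 MB) of allocation activity. The resulting dump files can be converted into a call graph with the following command: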
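The lg_prof_interval value is a base-2 logarithm, so it can help to convert it to bytes explicitly. The following is a minimal sketch of that arithmetic; the helper name dump_interval_bytes is hypothetical and not part of jemalloc.

```python
def dump_interval_bytes(lg_prof_interval: int) -> int:
    """Convert a base-2 logarithm (as used by lg_prof_interval) to bytes."""
    return 1 << lg_prof_interval


interval = dump_interval_bytes(29)
print(interval)               # 536870912
print(interval / (1 << 20))   # 512.0 (MiB)
```

Raising the value by one doubles the interval, so 30 would correspond to 1 GiB of allocation activity between dumps.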

jeprof  --show_bytes --pdf ./process_start jeprof.34447.0.f.heap > result.pdf

The call graph (only partially captured) shows RocksDB allocating memory through its normal path, rocksdb::AllocateBlock. No memory leak was observed.

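3.2 Analyzing the system glibc ptmalloc

After searching for similar problems, we found that other developers had also run into glibc not releasing allocated memory, so we suspected that glibc's allocation behavior was the cause. The glibc version in our production environment is 2.17.

Looking at /proc/meminfo on a production machine, most of the memory was used by the program's stack and heap. Active(anon), the anonymous memory, was at 52 GB; this memory had been allocated and not released.

Memory allocated by glibc all belongs to this category.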
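Reading the Active(anon) figure out of /proc/meminfo is easy to script. The sketch below parses meminfo-style text; the sample values are made up for illustration (chosen so that Active(anon) is 52 GiB), and parse_meminfo is a hypothetical helper name.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


# Illustrative sample, not real output from the machine in the article.
SAMPLE = """\
MemTotal:       65806460 kB
MemFree:          512340 kB
Active(anon):   54525952 kB
Inactive(anon):   204800 kB
"""

info = parse_meminfo(SAMPLE)
anon_gib = info["Active(anon)"] / (1024 * 1024)
print(f"Active(anon): {anon_gib:.1f} GiB")  # Active(anon): 52.0 GiB
```

On a real Linux host the same function can be fed the contents of /proc/meminfo read with open("/proc/meminfo").read().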

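Next, pmap -X pid showed that the process had many 64 MB memory segments.

Why are so many 64 MB regions created? This is related to glibc's memory allocator: the heap regions that glibc reserves with mmap default to 64 MB each on 64-bit systems.

This led to a new hypothesis: is glibc's allocation mechanism causing memory to be requested continually but never released?

To analyze how glibc has allocated memory, you can use the interface glibc provides, malloc_info (https://man7.org/linux/man-pages/man3/malloc_info.3.html):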

The malloc_info() function exports an XML string that describes the current state of the memory-allocation implementation in the caller. The string is printed on the file stream stream. The exported string includes information about all arenas.

The declaration of malloc_info is shown below. It writes the memory allocation state directly to a file as XML.

#include <malloc.h>
int malloc_info(int options, FILE *stream);

Add code to the program that prints the memory information, triggered at regular intervals:

FILE *filePointer;
filePointer = fopen("mem_info.log", "a");
if (filePointer != nullptr) {
  malloc_info(0, filePointer);
  fclose(filePointer);
}

The following is the output of malloc_info (excerpt):

<malloc version="1">
<heap nr="0">
<sizes>
  <size from="17" to="32" total="32" count="1"/>
  <size from="33" to="48" total="48" count="1"/>
  <size from="81" to="96" total="1824" count="19"/>
  <size from="97" to="112" total="112" count="1"/>
  <size from="33" to="33" total="42636" count="1292"/>
  // ....
</sizes>
<total type="fast" count="22" size="2016"/>
<total type="rest" count="5509" size="33761685"/>
<system type="current" size="230117376"/>
<system type="max" size="230117376"/>
<aspace type="total" size="230117376"/>
<aspace type="mprotect" size="230117376"/>
</heap>

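Explanation of the XML:

nr is the arena number. There is usually one arena per thread, and threads may contend for arenas.
<size from="17" to="32" total="32" count="1"/>: free chunks within a given size range are kept in one list, and this is one such list. from is the lower bound and to is the upper bound. This line means that there is 1 free chunk in the range [17,32], totaling 32 bytes.
<total type="fast" count="22" size="2016"/>: the fastbin lists currently hold 22 free chunks, totaling 2016 bytes.
<total type="rest" count="5509" size="33761685"/>: the number and total size of free chunks in all lists other than fastbin. Here there are 5509 chunks totaling 33761685 bytes.

So fast plus rest is the free memory currently held by glibc and not returned to the operating system. Summing the memory of all fast and rest entries in the file with awk gave about 4 GB.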
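The article does this summation with awk; the same idea can be sketched in Python. This is only an illustration under a few assumptions: it parses a single malloc_info snapshot, heap 0 reuses the figures from the excerpt above, heap 1 is made up, and free_bytes_per_heap is a hypothetical helper name. Only the totals inside each heap element are summed, because the document-level totals would otherwise be counted twice.

```python
import xml.etree.ElementTree as ET

# One malloc_info snapshot. Heap 0 uses the numbers quoted in the article;
# heap 1 and the document-level totals are invented for illustration.
SAMPLE = """\
<malloc version="1">
<heap nr="0">
<total type="fast" count="22" size="2016"/>
<total type="rest" count="5509" size="33761685"/>
</heap>
<heap nr="1">
<total type="fast" count="3" size="192"/>
<total type="rest" count="40" size="1048576"/>
</heap>
<total type="fast" count="25" size="2208"/>
<total type="rest" count="5549" size="34810261"/>
</malloc>
"""


def free_bytes_per_heap(xml_text: str) -> dict:
    """Return {arena number: fast + rest free bytes} for one snapshot."""
    root = ET.fromstring(xml_text)
    result = {}
    for heap in root.findall("heap"):
        result[heap.get("nr")] = sum(
            int(total.get("size"))
            for total in heap.findall("total")
            if total.get("type") in ("fast", "rest")
        )
    return result


per_heap = free_bytes_per_heap(SAMPLE)
print(per_heap)                # {'0': 33763701, '1': 1048768}
print(sum(per_heap.values()))  # 34812469
```

A log file that accumulates several snapshots is not a single well-formed XML document, so each snapshot would need to be split out before parsing.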

The RocksDB process was using 20.48 GB of memory at this point, while block-cache-usage and estimate-table-readers-mem added up to only 15.9 GB (1073659024 bytes + 16014055104 bytes). That leaves a gap of roughly 4 GB, which matches the free memory held by glibc.
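The reconciliation can be checked with a few lines of arithmetic. This sketch assumes that the "G" figures in the text are binary gigabytes (GiB), which is consistent with the quoted 15.9 figure; the variable names are illustrative.

```python
GIB = 1 << 30

block_cache_usage = 1073659024     # rocksdb.block-cache-usage, bytes
table_readers_mem = 16014055104    # rocksdb.estimate-table-readers-mem, bytes
process_usage_gib = 20.48          # process memory usage quoted in the text

accounted_gib = (block_cache_usage + table_readers_mem) / GIB
gap_gib = process_usage_gib - accounted_gib

print(f"accounted for by RocksDB: {accounted_gib:.2f} GiB")  # 15.91 GiB
print(f"unexplained gap:          {gap_gib:.2f} GiB")        # 4.57 GiB
```

The gap of roughly 4.6 GiB is in the same range as the approximately 4 GB of free memory found in the malloc_info output.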

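This confirmed that the memory pressure on the machine was caused by glibc's ptmalloc allocator holding on to allocated memory instead of returning it to the operating system.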

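4. Solution

Once the problem was traced to glibc ptmalloc, the fix was relatively simple: the industry has better alternatives to ptmalloc, such as jemalloc and tcmalloc.

After deploying jemalloc to production, memory usage was indeed lower than with ptmalloc, as expected. Previously the machine's memory usage had stayed high; with jemalloc it dropped from above 95% to a little over 80%. As memory was released, more of it became available for handling requests, and I/O and CPU utilization went down, as the comparison charts of memory, disk I/O utilization and CPU idle rate showed.

With these performance metrics improved, service availability and response time (RT) improved as well.

5. Summary

When we started analyzing the memory overuse, we suspected that poor memory management inside RocksDB was responsible. After deeper investigation, the actual cause turned out to be mainly the memory reclamation behavior of glibc's ptmalloc. The analysis was fairly laborious: it required suitable memory analysis tools, digging in layer by layer, and repeatedly forming and verifying hypotheses.

In the end, the memory overuse was explained and resolved. By digging step by step and continually hypothesizing and verifying, we found the real problem. We hope this gives readers some ideas for solving similar problems.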

References:

1. https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters

2. https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB

3. http://jemalloc.net/jemalloc.3.html

4. https://paper.seebug.org/papers/Archive/refs/heap/glibc%E5%86%85%E5%AD%98%E7%AE%A1%E7%90%86ptmalloc%E6%BA%90%E4%BB%A3%E7%A0%81%E5%88%86%E6%9E%90.pdf

5. https://man7.org/linux/man-pages/man3/malloc_info.3.html

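[VMware VCF] Managing the Password Policies of Components in a VCF Environment

Author: wenmo8
Date: 2024-12-12
Category: Other

The "Password Management" feature in SDDC Manager lets you centrally manage the user passwords of the components in a VCF environment, for example updating (Update), rotating (Rotate) and remediating (Remediate) component passwords. You can also schedule password rotation so that passwords do not expire, and components do not go down, because someone forgot or for other reasons, which would in turn affect the business.

The SoS utility can check the state of the component user passwords in a VCF environment, such as the last changed date, the expiry date and the number of days until expiry, as shown below.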

vcf@vcf-mgmt01-sddc01 [ ~ ]$ sudo /opt/vmware/sddc-support/sos --password-health
[sudo] password for vcf
Welcome to Supportability and Serviceability(SoS) utility!
Performing SoS operation for vcf-mgmt01 domain components
Health Check : /var/log/vmware/vcf/sddc-support/healthcheck-2024-12-07-12-29-31-149728
Health Check log : /var/log/vmware/vcf/sddc-support/healthcheck-2024-12-07-12-29-31-149728/sos.log
NOTE : The Health check operation was invoked without --skip-known-host-check, additional identity checks will be included for Connectivity Health, Password Health and Certificate Health Checks because of security reasons.

SDDC Manager : vcf-mgmt01-sddc01.mulab.local                                                                                
+-------------------------+-----------+
|          Stage          |   Status  |
+-------------------------+-----------+
|         Bringup         | Completed |
| Management Domain State | Completed |
+-------------------------+-----------+
+--------------------+---------------+
|     Component      |    Identity   |
+--------------------+---------------+
|    SDDC-Manager    | 192.168.32.70 |
| Number of Servers  |       4       |
+--------------------+---------------+
Password Expiry Status : GREEN                                                                                 
+-----+-----------------------------------------+---------------------------+-------------------+--------------+-----------------+-------+
| SL# |                Component                |            User           | Last Changed Date | Expiry Date  | Expires in Days | State |
+-----+-----------------------------------------+---------------------------+-------------------+--------------+-----------------+-------+
|  1  |   ESXI : vcf-mgmt01-esxi01.mulab.local  | svc-vcf-vcf-mgmt01-esxi01 |    Dec 02, 2024   |    Never     |      Never      | GREEN |
|     |                                         |            root           |    Dec 02, 2024   |    Never     |      Never      | GREEN |
|  2  |   ESXI : vcf-mgmt01-esxi02.mulab.local  | svc-vcf-vcf-mgmt01-esxi02 |    Dec 02, 2024   |    Never     |      Never      | GREEN |
|     |                                         |            root           |    Dec 02, 2024   |    Never     |      Never      | GREEN |
|  3  |   ESXI : vcf-mgmt01-esxi03.mulab.local  | svc-vcf-vcf-mgmt01-esxi03 |    Dec 02, 2024   |    Never     |      Never      | GREEN |
|     |                                         |            root           |    Dec 02, 2024   |    Never     |      Never      | GREEN |
|  4  |   ESXI : vcf-mgmt01-esxi04.mulab.local  | svc-vcf-vcf-mgmt01-esxi04 |    Dec 02, 2024   |    Never     |      Never      | GREEN |
|     |                                         |            root           |    Dec 02, 2024   |    Never     |      Never      | GREEN |
|  5  |    NSX : vcf-mgmt01-nsx01.mulab.local   |           admin           |    Dec 07, 2024   | Mar 07, 2025 |     90 days     | GREEN |
|     |                                         |            root           |    Dec 07, 2024   | Mar 07, 2025 |     90 days     | GREEN |
|     |                                         |           audit           |    Dec 07, 2024   | Mar 07, 2025 |     90 days     | GREEN |
|  6  |   SDDC : vcf-mgmt01-sddc01.mulab.local  |            vcf            |    Dec 07, 2024   | Dec 07, 2025 |     365 days    | GREEN |
|     |                                         |            root           |    Dec 07, 2024   | Mar 07, 2025 |     90 days     | GREEN |
|     |                                         |           backup          |    Dec 07, 2024   | Dec 07, 2025 |     365 days    | GREEN |
|  7  | vCenter : vcf-mgmt01-vcsa01.mulab.local |            root           |    Dec 07, 2024   | Mar 07, 2025 |     89 days     | GREEN |
+-----+-----------------------------------------+---------------------------+-------------------+--------------+-----------------+-------+

Legend:

 GREEN - No attention required, health status is NORMAL
 YELLOW - May require attention, health status is WARNING
 RED - Requires immediate attention, health status is CRITICAL


Health Check completed successfully for : [VCF-SUMMARY, PASSWORD-CHECK]                                                                                
vcf@vcf-mgmt01-sddc01 [ ~ ]$

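The output above gives a clear picture of the password state of each component user. You may be wondering whether the default "password policies" of these components can be adjusted, for example password expiration, password complexity and account lockout. The answer is yes. First, let's use the product documentation "Information Security and Access of Identity and Access Management for VMware Cloud Foundation" to review the default password policies of the components in a VCF environment.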
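The "Expiry Date" and "Expires in Days" columns of the SoS report follow directly from the last changed date and the maximum password age. The sketch below reproduces that arithmetic for the dates shown in the report; the function names are illustrative and are not part of SoS or any VMware tool.

```python
from datetime import date, timedelta


def expiry_date(last_changed: date, max_days: int) -> date:
    """Date on which a password changed on last_changed expires."""
    return last_changed + timedelta(days=max_days)


def days_left(last_changed: date, max_days: int, today: date) -> int:
    """Whole days remaining until expiry, as seen on the given day."""
    return (expiry_date(last_changed, max_days) - today).days


changed = date(2024, 12, 7)
print(expiry_date(changed, 90))    # 2025-03-07
print(expiry_date(changed, 365))   # 2025-12-07
print(days_left(changed, 90, today=date(2024, 12, 7)))  # 90
```

With a 90-day policy, a password changed on Dec 07, 2024 expires on Mar 07, 2025, and with a 365-day policy on Dec 07, 2025, matching the NSX and SDDC Manager rows of the report.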

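1. Password expiration policy

Component	Level	Setting	Default	Description	Notes
ESXi host	Local user	Security.PasswordMaxDays	99999 (never)	Number of days after which the password expires.	You manage the password expiration policy per host through the advanced system settings in the vSphere Client or Host Client. You can change the setting on each ESXi host to refine it and to comply with your organization's policies and regulatory standards.
vCenter Server	Global	Maximum (days)	90	Maximum number of days before the password expires.	You manage the password expiration policy per instance. You can change the settings on each vCenter Server instance to refine them and to comply with your organization's policies and regulatory standards.
		Minimum (days)	0	Minimum number of days between password changes.
		Warning	7	Number of days before expiration at which a warning is issued.
	Local user	Password Expires	Yes	Whether the root password expires.
		Password validity	90	Number of days after which the password expires.
		Email for expiration warning	-	Email address for password expiration warnings.
		Warning (days)	7	Number of days before expiration at which a warning is issued.
	Single Sign-On	Maximum lifetime	90	Number of days after which the password expires.	You manage the vCenter Single Sign-On password expiration policy per built-in identity provider domain. The policy applies only to user accounts in the vCenter Single Sign-On built-in identity provider domain (for example, vsphere.local). It does not apply to local system accounts or to the domain's default administrator account (for example, administrator@vsphere.local). You can change the settings of the vCenter Single Sign-On identity provider domain to refine them and to comply with your organization's policies and regulatory standards.
NSX + NSX Edge	Local user	maxdays	90	Maximum number of days before the password expires.	You manage the NSX password expiration policy per user through the API for the built-in NSX accounts on the NSX Local Manager cluster and NSX Edge nodes. You can change the configuration on each NSX Local Manager cluster and each NSX Edge node to refine it and to comply with your organization's policies and regulatory standards.
SDDC Manager	Local user	maxdays	90	Maximum number of days before the password expires. (VMware Cloud Foundation 4.5 and later)	You manage the password expiration policy per user. You can change the user's configuration to refine it and to comply with your organization's policies and regulatory standards.
		maxdays	365	Maximum number of days before the password expires. (VMware Cloud Foundation 4.4 and later)
		mindays	0	Minimum number of days between password changes.
		warndays	7	Number of days before expiration at which a warning is issued.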

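2. Password complexity policy

Component	Level	Setting	Default	Description	Notes
ESXi host	Local user	Security.PasswordQualityControl	`retry=3 min=disabled,disabled,disabled,7,7`	retry is the number of attempts allowed when setting or updating a password; a value of 3 means the user may retry 3 times if the password does not meet the requirements. min sets the character class and minimum length requirements. Passwords that use only one or two character classes are not allowed, and passphrases are not allowed either, because the first three items are disabled. Passwords that use three or four character classes must be at least 7 characters long.	An uppercase letter at the start of the password does not count toward the number of character classes used. A digit at the end of the password does not count toward the number of character classes used. You manage the password complexity policy per host through the advanced system settings in the vSphere Client or Host Client. You can edit the configuration to refine it and to comply with your organization's policies and regulatory standards.
ESXi host	Local user	Security.PasswordHistory	0	Number of previous passwords remembered; 0 means no restriction.
vCenter Server	Local user	dcredit	-1	Number of digits (such as 0, 1, 2) the password must contain; -1 means at least one.	You manage the local user password complexity policy per instance through the /etc/pam.d/system-password file. You can change the settings on each vCenter Server instance to refine them and to comply with your organization's policies and regulatory standards.
		ucredit	-1	Number of uppercase letters (such as A, B, C) the password must contain; -1 means at least one.
		lcredit	-1	Number of lowercase letters (such as a, b, c) the password must contain; -1 means at least one.
		ocredit	-1	Number of other characters (such as !, @, #) the password must contain; -1 means at least one.
		minlen	6	Minimum password length; 6 means at least six characters.
		difok	4	Number of characters in the new password that must differ from the old one; 4 means at least four characters must be different.
		remember	5	Number of previous passwords remembered; 5 means the new password must not be any of the previous 5 passwords.
	Single Sign-On	Restrict reuse	5	Number of previous passwords remembered; 5 means the new password must not be any of the previous 5 passwords.	You manage the vCenter Single Sign-On password complexity policy per built-in identity provider domain. The policy applies only to user accounts in the vCenter Single Sign-On built-in identity provider domain (for example, vsphere.local). It does not apply to local system accounts or to the built-in identity provider's default administrator account (for example, administrator@vsphere.local).
		Maximum length	20	Maximum password length (characters).
		Minimum length	8	Minimum password length (characters).
		Special characters	1	Minimum number of special characters.
		Alphabetic characters	2	Minimum number of alphabetic characters.
		Uppercase characters	1	Minimum number of uppercase characters.
		Lowercase characters	1	Minimum number of lowercase characters.
		Numeric characters	1	Minimum number of numeric characters.
		Identical adjacent characters	1	Maximum number of identical adjacent characters.
NSX + NSX Edge	Local user	dcredit	-1	Number of digits (such as 0, 1, 2) the password must contain; -1 means at least one.	You manage the password complexity policy per node through the /etc/pam.d/common-password file for the built-in NSX accounts on the NSX Manager cluster and NSX Edge nodes. You can change the configuration on each NSX Manager node and each NSX Edge node to refine it and to comply with your organization's policies and regulatory standards.
		ucredit	-1	Number of uppercase letters (such as A, B, C) the password must contain; -1 means at least one.
		lcredit	-1	Number of lowercase letters (such as a, b, c) the password must contain; -1 means at least one.
		ocredit	-1	Number of other characters (such as !, @, #) the password must contain; -1 means at least one.
		minlen	15	Minimum password length; 15 means at least fifteen characters.
		difok	0	Number of characters in the new password that must differ from the old one; 0 means no restriction.
		retry	3	Number of attempts allowed when setting or updating a password; 3 means the user may retry 3 times if the password does not meet the requirements.
SDDC Manager	Local user	dcredit	-1	Number of digits (such as 0, 1, 2) the password must contain; -1 means at least one.	You manage the password complexity policy through the /etc/pam.d/system-password file. You can edit the configuration to refine it and to comply with your organization's policies and regulatory standards.
		ucredit	-1	Number of uppercase letters (such as A, B, C) the password must contain; -1 means at least one.
		lcredit	-1	Number of lowercase letters (such as a, b, c) the password must contain; -1 means at least one.
		ocredit	-1	Number of other characters (such as !, @, #) the password must contain; -1 means at least one.
		minlen	8	Minimum password length; 8 means at least eight characters.
		minclass	4	Minimum number of character classes the password must use (for example uppercase, lowercase, digits).
		difok	4	Number of characters in the new password that must differ from the old one; 4 means at least four characters must be different.
		retry	3	Number of attempts allowed when setting or updating a password; 3 means the user may retry 3 times if the password does not meet the requirements.
		maxsequence	0	Maximum length of a monotonic character sequence (such as 12345) allowed in the password; 0 means no restriction.
		remember	5	Number of previous passwords remembered; 5 means the new password must not be any of the previous 5 passwords.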
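The ESXi Security.PasswordQualityControl value packs several settings into one string. As a reading aid, the following sketch splits the default value into its parts, following the description in the table (one class, two classes, passphrase, three classes, four classes). The function name and the labels are illustrative assumptions, not a VMware API.

```python
def parse_quality_control(value: str):
    """Split a string such as 'retry=3 min=disabled,disabled,disabled,7,7'.

    Returns (retry count, {password kind: minimum length or None}),
    where None means that kind of password is not allowed.
    """
    settings = dict(item.split("=", 1) for item in value.split())
    labels = [
        "one character class",
        "two character classes",
        "passphrase",
        "three character classes",
        "four character classes",
    ]
    minimums = {}
    for label, raw in zip(labels, settings["min"].split(",")):
        minimums[label] = None if raw == "disabled" else int(raw)
    return int(settings["retry"]), minimums


retry, minimums = parse_quality_control(
    "retry=3 min=disabled,disabled,disabled,7,7"
)
print(retry)  # 3
for kind, length in minimums.items():
    print(kind, "->", "not allowed" if length is None else f"min length {length}")
```

For the default value this reports that only passwords using three or four character classes are accepted, each with a minimum length of 7.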

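3. Account lockout policy

Component	Level	Setting	Default	Description	Notes
ESXi host	Local user	Security.AccountLockFailures	5	Maximum number of failed authentication attempts before the account is locked.	ESXi account lockout is supported for SSH and the API. If a user tries to log in through SSH or the API with incorrect local account credentials, the account is locked. The DCUI and the ESXi Shell do not support account lockout. You manage the account lockout policy per host through the advanced system settings in the vSphere Client or Host Client. You can edit the configuration to refine it and to comply with your organization's policies and regulatory standards.
ESXi host	Local user	Security.AccountUnlockTime	900	Time the account stays locked (in seconds).
vCenter Server	Local user	deny	3	Maximum number of failed authentication attempts before the account is locked.	You manage the local user account lockout policy per instance through the /etc/pam.d/system-auth file. You can change the settings on each vCenter Server instance to refine them and to comply with your organization's policies and regulatory standards.
		unlock_time	900	Time the account stays locked (in seconds).
		root_unlock_time	300	Time the root account stays locked (in seconds).
	Single Sign-On	Maximum number of failed login attempts	5	Maximum number of failed authentication attempts before the account is locked.	You manage the vCenter Single Sign-On account lockout policy per built-in identity provider domain. You can edit the configuration to refine it and to comply with your organization's policies and regulatory standards.
		Time interval between failures	180	Time window for counting login failures (in seconds); for example, the account is locked only after 5 consecutive failures within 180 seconds.
		Unlock time	900	Time the account stays locked (in seconds). If set to 0, an administrator must unlock the account explicitly.
NSX + NSX Edge	Local user (API)	max-auth-failures	5	Maximum number of failed authentication attempts before the account is locked.	Using authentication policies, you manage the account lockout policy per NSX Local Manager cluster instance and per NSX Edge node. You can configure the account lockout policy for the NSX Manager user interface and API as well as for the command-line interface (CLI) of the NSX Local Manager cluster and NSX Edge nodes. You can change the configuration on each NSX Local Manager cluster and each NSX Edge node to refine it and to comply with your organization's policies and regulatory standards.
		lockout-reset-period	180	Time window for counting login failures (in seconds); for example, the account is locked only after 5 consecutive failures within 180 seconds.
		lockout-period	900	Time the account stays locked (in seconds).
	Local user (CLI)	max-auth-failures	5	Maximum number of failed authentication attempts before the account is locked.
	Local user (CLI)	lockout-period	900	Time the account stays locked (in seconds).
SDDC Manager	Local user	deny	3	Maximum number of failed authentication attempts before the account is locked.	You manage the account lockout policy through the /etc/pam.d/system-auth file. You can edit the configuration to refine it and to comply with your organization's policies and regulatory standards.
		unlock_time	86400	Time the account stays locked (in seconds).
		root_unlock_time	300	Time the root account stays locked (in seconds).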
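To make the interplay of the three lockout parameters concrete (maximum failures, failure window, lockout period), here is a small illustrative model. It is only a sketch of the general idea using the defaults quoted for the NSX API (5 failures, 180 seconds, 900 seconds); the class is hypothetical and does not reproduce how any VMware component implements lockout internally.

```python
from collections import deque


class LockoutTracker:
    """Toy model: lock after max_failures failures within reset_period seconds."""

    def __init__(self, max_failures=5, reset_period=180, lockout_period=900):
        self.max_failures = max_failures
        self.reset_period = reset_period
        self.lockout_period = lockout_period
        self.failures = deque()   # timestamps of recent failures
        self.locked_until = 0

    def is_locked(self, now):
        return now < self.locked_until

    def record_failure(self, now):
        """Record a failed login at time now (seconds); return True if locked."""
        if self.is_locked(now):
            return True
        # Forget failures that fall outside the counting window.
        while self.failures and now - self.failures[0] >= self.reset_period:
            self.failures.popleft()
        self.failures.append(now)
        if len(self.failures) >= self.max_failures:
            self.locked_until = now + self.lockout_period
            self.failures.clear()
            return True
        return False


tracker = LockoutTracker()
results = [tracker.record_failure(t) for t in (0, 10, 20, 30, 40)]
print(results)                 # [False, False, False, False, True]
print(tracker.is_locked(939))  # True
print(tracker.is_locked(940))  # False
```

Five failures spread over more than 180 seconds never accumulate inside the window, so in this model they do not lock the account.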

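4. Managing password policies

Now that you know the password policies of the components in a VCF environment, you may need to manage them for compliance or security reasons. For that, refer to the following content in the product documentation "VMware Cloud Foundation Operations Guide" and then manage the password policy of each component as needed.

According to that documentation, password policies can be managed in several ways. The most basic way is to change the parameters manually on each component, policy by policy, until every component has been adjusted. That is tedious, and for large environments with hundreds or thousands of hosts the other method in the documentation is far more convenient: configuring the policies with PowerShell commands. This requires the following PowerShell module:

PowerShell Module for VMware Cloud Foundation Password Management (PowerShell Gallery)
PowerShell Module for VMware Cloud Foundation Password Management (Github)

Refer to the documentation, then install the VCF password management module with the commands below. Note: before installing and using this module, see the article "Using PowerVCF to connect to and manage a VMware Cloud Foundation environment" to prepare the other runtime dependencies of this module.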

Install-Module -Name VMware.CloudFoundation.PasswordManagement -Scope CurrentUser
Get-Module -Name VMware.CloudFoundation.PasswordManagement -ListAvailable

List the commands that the module supports.

Get-Command -Module VMware.CloudFoundation.PasswordManagement

Verify that the environment meets the requirements for running the PowerShell module.

Test-VcfPasswordManagementPrereq

Connect to SDDC Manager with PowerVCF.

Request-VCFToken -fqdn vcf-mgmt01-sddc01.mulab.local -username administrator@vsphere.local -password Vcf521@password

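Use the following commands to get the default password policies of all components for a given VCF version, or to write them directly to a JSON file.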

Get-PasswordPolicyDefault -version '5.2.0.0'
Get-PasswordPolicyDefault -generateJson -jsonFile passwordPolicyConfig.json -version '5.2.0.0'

Use the following commands to generate a password policy report for the components in the VCF environment.

All workload domains:

Invoke-PasswordPolicyManager -sddcManagerFqdn vcf-mgmt01-sddc01.mulab.local -sddcManagerUser administrator@vsphere.local -sddcManagerPass Vcf521@password -sddcRootPass Vcf521@password -reportPath "D:\Reporting" -darkMode -allDomains

A specific workload domain:

Invoke-PasswordPolicyManager -sddcManagerFqdn vcf-mgmt01-sddc01.mulab.local -sddcManagerUser administrator@vsphere.local -sddcManagerPass Vcf521@password -sddcRootPass Vcf521@password -reportPath "D:\Reporting" -darkMode -workloadDomain vcf-mgmt01

Use the following commands to generate a password rotation report for the components in the VCF environment.

All workload domains:

Invoke-PasswordRotationManager  -sddcManagerFqdn vcf-mgmt01-sddc01.mulab.local -sddcManagerUser administrator@vsphere.local -sddcManagerPass Vcf521@password -sddcRootPass Vcf521@password -reportPath "D:\Reporting" -darkMode -allDomains

A specific workload domain:

Invoke-PasswordRotationManager -sddcManagerFqdn vcf-mgmt01-sddc01.mulab.local -sddcManagerUser administrator@vsphere.local -sddcManagerPass Vcf521@password -sddcRootPass Vcf521@password -reportPath "D:\Reporting" -darkMode -workloadDomain vcf-mgmt01

Based on the JSON file and the report files, configure the password policies of all components in the VCF environment in one pass.

Start-PasswordPolicyConfig -sddcManagerFqdn vcf-mgmt01-sddc01.mulab.local -sddcManagerUser administrator@vsphere.local -sddcManagerPass Vcf521@password -sddcRootPass Vcf521@password -reportPath "D:\Reporting" -policyFile "passwordPolicyConfig.json"

Retrieve the password expiration policy of a local user on a specific VCF component.

Request-LocalUserPasswordExpiration -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password -domain vcf-mgmt01 -product vcenterServer -vmName vcf-mgmt01-vcsa01 -guestUser root -guestPassword Vcf521@password -localUser "root"

Update the password expiration period (in days) of local users on a specific VCF component.

Update-LocalUserPasswordExpiration -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password -domain vcf-mgmt01 -vmName vcf-mgmt01-vcsa01 -guestUser root -guestPassword Vcf521@password -localUser "root","sshuser" -minDays 0 -maxDays 180 -warnDays 14

Retrieve the password expiration policy of the SDDC Manager component.

Request-SddcManagerPasswordExpiration -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password  -rootPass Vcf521@password

Update the password expiration policy of the SDDC Manager component.

Update-SddcManagerPasswordExpiration -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password  -rootPass Vcf521@password -minDays 0 -maxDays 400 -warnDays 14

Retrieve the password complexity policy of the ESXi components.

Request-EsxiPasswordComplexity -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password -domain vcf-mgmt01 -cluster vcf-mgmt01-cluster01

Update the password complexity policy of the ESXi components.

Update-EsxiPasswordComplexity -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password -domain vcf-mgmt01 -cluster vcf-mgmt01-cluster01 -policy "retry=5 min=disabled,disabled,disabled,8,8" -history 3

Retrieve the account lockout policy of the ESXi components.

Request-EsxiAccountLockout -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password -domain vcf-mgmt01 -cluster vcf-mgmt01-cluster01

Update the account lockout policy of the ESXi components.

Update-EsxiAccountLockout -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password -domain vcf-mgmt01 -cluster vcf-mgmt01-cluster01 -failures 3 -unlockInterval 600

Retrieve the password rotation settings for all components of the specified workload domain managed by SDDC Manager.

Request-PasswordRotationPolicy -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password

Update the password rotation settings for a vCenter Server component managed by SDDC Manager.

Update-PasswordRotationPolicy -server vcf-mgmt01-sddc01.mulab.local -user administrator@vsphere.local -pass Vcf521@password -domain vcf-mgmt01 -resource vcenterServer -resourceName vcf-mgmt01-vcsa01.mulab.local -credential SSH -credentialName root -autoRotate enabled -frequencyInDays 90

Efficient File Handling: A Practical Guide to Python pathlib

Author: wenmo8
Date: 2024-12-12
Category: Other

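When working with file paths in Python, pathlib is strongly recommended.

pathlib handles file paths in an object-oriented way, which avoids many pitfalls and makes many path-related operations easier.

This article summarizes the common ways to handle file paths with pathlib.

1. Common operations

First, here is how to use pathlib for some routine file path operations.

1.1. Constructing a path

To build a path object, simply pass the file or folder path string to Path.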

from pathlib import Path

fp = "D:\\temp\\pathlib"
path = Path(fp)
path

# The path object (as displayed on Windows)
# WindowsPath('D:/temp/pathlib')

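When a path object is constructed, Path automatically becomes a WindowsPath or a PosixPath. The choice depends on the operating system the code runs on (Windows in this example), not on the contents of the string.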
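
As a minimal sketch of that behaviour: the concrete class comes from the running OS, while the "pure" path classes can represent either flavour on any machine because they never touch the file system. The paths below are illustrative only.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Path() picks its concrete class from the OS the interpreter runs on.
local = Path("temp") / "pathlib"
local_class = type(local).__name__  # 'WindowsPath' on Windows, 'PosixPath' elsewhere

# Pure paths only manipulate the path string, so both flavours work anywhere.
win = PureWindowsPath("D:\\temp\\pathlib")
posix = PurePosixPath("/tmp/pathlib")

print(local_class)
print(win.drive)    # D:
print(win.parts)    # ('D:\\', 'temp', 'pathlib')
print(posix.parts)  # ('/', 'tmp', 'pathlib')
```

Pure paths are handy when a script on Linux has to parse Windows-style paths taken from a log or a config file.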

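1.2. Joining and splitting paths

When joining and splitting paths as plain strings, the most tedious part is dealing with the different path separators (\ and /) used by different systems.

Path objects avoid this problem.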

new_path = path.joinpath("abc")
new_path
# WindowsPath('D:/temp/pathlib/abc')

new_path = Path(fp, "test.py")
new_path
# WindowsPath('D:/temp/pathlib/test.py')

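Whether you join with joinpath or pass several segments directly to Path, you never need to spell out the path separator.

Splitting a path is just as convenient: Path exposes several attributes that return the individual parts of a file path.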

my_path = Path(fp, "program.py")
my_path
# WindowsPath('D:/temp/pathlib/program.py')

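# Full file name
my_path.name
# 'program.py'

# Parent directory
my_path.parent
# WindowsPath('D:/temp/pathlib')

# File name without the suffix
my_path.stem
# 'program'

# File suffix
my_path.suffix
# '.py'

# Replace the suffix (returns a new path; the file on disk is not renamed)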
my_path.with_suffix(".go")
# WindowsPath('D:/temp/pathlib/program.go')
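
Besides joinpath, the `/` operator joins segments, and a few more attributes cover multi-part suffixes and ancestor directories. The sketch below uses PurePosixPath so the output is the same on every OS; the file names are made up for illustration.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/temp/pathlib")

# The / operator is equivalent to joinpath().
archive = base / "src" / "program.tar.gz"

print(archive)                       # /temp/pathlib/src/program.tar.gz
print(archive.suffix)                # .gz   (only the last suffix)
print(archive.suffixes)              # ['.tar', '.gz']
print(archive.stem)                  # program.tar
print(archive.with_name("main.py"))  # /temp/pathlib/src/main.py
print(archive.parents[1])            # /temp/pathlib
```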

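1.3. Relative and absolute paths

To convert a relative path to an absolute one, the resolve method of the Path object is recommended. It anchors the path at the current working directory.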

path = Path("main.py")
path
# WindowsPath('main.py')

# Convert to an absolute path
path.resolve()
# WindowsPath('D:/projects/python/samples/main.py')
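
A small sketch of the round trip between relative and absolute paths. It assumes main.py is either absent or a regular file in the current directory (resolve also follows symbolic links, which would change the result).

```python
from pathlib import Path

relative = Path("main.py")
absolute = relative.resolve()  # anchored at the current working directory

print(relative.is_absolute())  # False
print(absolute.is_absolute())  # True

# relative_to() goes the other way: strip a known prefix from a path.
cwd = Path.cwd().resolve()
back = absolute.relative_to(cwd)
print(back)  # main.py
```

relative_to raises ValueError when the path does not start with the given prefix, so it doubles as a check that a file really lives under a directory.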

1.4. Traversing directories

Traversing a directory is another common path operation.

fp = "D:\\temp\\pathlib\\a"
path = Path(fp)

for f in path.glob("*.txt"):
    print(f)

# D:\temp\pathlib\a\1.txt
# D:\temp\pathlib\a\2.txt
# D:\temp\pathlib\a\3.txt

The glob method only matches entries directly inside the directory. To also match files in subdirectories, use the rglob method.

for f in path.rglob("*.txt"):
    print(f)

# D:\temp\pathlib\a\1.txt
# D:\temp\pathlib\a\2.txt
# D:\temp\pathlib\a\3.txt
# D:\temp\pathlib\a\sub_a\sub_1.txt
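
The difference can be reproduced anywhere with a throwaway directory. This is a minimal sketch; the file names are invented, and the results are sorted because glob and rglob do not guarantee any order.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub_a").mkdir()
    for name in ("1.txt", "2.txt", "sub_a/sub_1.txt", "notes.md"):
        (root / name).write_text("demo", encoding="utf-8")

    # glob: direct children only
    top_level = sorted(p.name for p in root.glob("*.txt"))
    # rglob: the whole tree below root
    recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))
    # glob with a leading **/ behaves like rglob
    starred = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.txt"))

print(top_level)             # ['1.txt', '2.txt']
print(recursive)             # ['1.txt', '2.txt', 'sub_a/sub_1.txt']
print(recursive == starred)  # True
```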

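1.5. Reading and writing files

The traditional way to read or write a file takes two steps: open the file with the open function, then read or write.

# Write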
with open("d:\\readme.txt", "w") as f:
    f.write("abcdefg")

# Read
with open("d:\\readme.txt", "r") as f:
    content = f.read()
    print(content)
    # abcdefg

With a Path object, reading and writing become simpler and the code is clearer.

fp = "d:\\readme.txt"
path = Path(fp)
path.write_text("uvwxyz")

content = path.read_text()
print(content)
# uvwxyz
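
For text that is not pure ASCII it is safer to pass the encoding explicitly, since the default depends on the platform. A minimal sketch using a temporary file (the file name and contents are illustrative):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "readme.txt"

    # write_text returns the number of characters written
    written = readme.write_text("café", encoding="utf-8")
    text = readme.read_text(encoding="utf-8")

    # read_bytes / write_bytes are the binary counterparts
    raw = readme.read_bytes()

print(written)   # 4
print(len(raw))  # 5  ('é' takes two bytes in UTF-8)
print(text == "café")  # True
```

Both write_text and write_bytes overwrite the file; to append, fall back to `path.open("a")`.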

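2. More convenient operations

Beyond the common operations above, Path also makes the following slightly more complex path operations easier.

2.1. Checking whether a file or directory exists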

fp = "D:\\temp\\pathlib\\a"
path = Path(fp)

path.is_dir() # True
path.is_file() # False
path.exists() # True
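
The three checks never raise for a missing path; they simply return False. A short sketch with a temporary directory (names are illustrative):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    existing = folder / "a.txt"
    missing = folder / "missing.txt"
    existing.touch()  # create an empty file

    # (exists, is_dir, is_file) for each path
    checks = {
        "folder": (folder.exists(), folder.is_dir(), folder.is_file()),
        "existing": (existing.exists(), existing.is_dir(), existing.is_file()),
        "missing": (missing.exists(), missing.is_dir(), missing.is_file()),
    }

for label, result in checks.items():
    print(label, result)
# folder (True, True, False)
# existing (True, False, True)
# missing (False, False, False)
```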

2.2. Creating directories

When creating a directory, a Path object can take care of the usual edge cases for us.

path = Path("D:\\temp\\a\\b\\c\\d")
path.mkdir(exist_ok=True, parents=True)

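The exist_ok and parents parameters save a lot of manual checks when creating a folder.

exist_ok=True means that if folder d already exists, nothing is created and no error is raised. With the default exist_ok=False, an existing folder raises FileExistsError.

parents=True means that any missing parent folders of folder d are created automatically. With the default parents=False, a missing parent folder raises FileNotFoundError.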
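
The two failure modes and the way the flags suppress them can be seen in a short sketch that works inside a temporary directory:

```python
import tempfile
from pathlib import Path

errors = []
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b" / "c" / "d"

    try:
        target.mkdir()  # parents a/b/c do not exist yet
    except FileNotFoundError:
        errors.append("FileNotFoundError")

    target.mkdir(parents=True)  # creates a, b, c and d

    try:
        target.mkdir()  # d already exists
    except FileExistsError:
        errors.append("FileExistsError")

    target.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
    created = target.is_dir()

print(errors)   # ['FileNotFoundError', 'FileExistsError']
print(created)  # True
```

Note that exist_ok=True only tolerates an existing directory; if the path exists as a regular file, FileExistsError is still raised.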

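2.3. Automatic path normalization

With Path, you do not need to worry much about the path separators of different operating systems.

On Windows, the Linux-style separator can be used as well; for example, both of the following forms work.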

fp = "D:\\temp\\pathlib\\a"
path = Path(fp)

fp = "D:/temp/pathlib/a"
path = Path(fp)
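
A sketch of what the normalization does, using PureWindowsPath so it runs on any OS. Redundant separators, single dots and trailing slashes are collapsed, while ".." segments are deliberately left alone.

```python
from pathlib import PureWindowsPath

backslashes = PureWindowsPath("D:\\temp\\pathlib\\a")
slashes = PureWindowsPath("D:/temp/pathlib/a")
messy = PureWindowsPath("D:/temp//pathlib/./a/")

print(backslashes == slashes)  # True
print(messy == slashes)        # True
print(str(slashes))            # D:\temp\pathlib\a
print(slashes.as_posix())      # D:/temp/pathlib/a

# ".." is not collapsed, because doing so could be wrong with symbolic links
dotted = PureWindowsPath("D:/temp/pathlib/../a")
print(dotted.parts)            # ('D:\\', 'temp', 'pathlib', '..', 'a')
```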

3. Comparison with os.path

pathlib is intended mainly as a replacement for os.path. The two compare as follows:

Operation	pathlib	os.path
Read a file's entire contents	`path.read_text()`	`open(path).read()`
Get the absolute file path	`path.resolve()`	`os.path.abspath(path)`
Get the file name	`path.name`	`os.path.basename(path)`
Get the parent directory	`path.parent`	`os.path.dirname(path)`
Get the file extension	`path.suffix`	`os.path.splitext(path)[1]`
File name without the extension	`path.stem`	`os.path.splitext(os.path.basename(path))[0]`
Relative path	`path.relative_to(parent)`	`os.path.relpath(path, parent)`
Check whether a path is a file	`path.is_file()`	`os.path.isfile(path)`
Check whether a path is a directory	`path.is_dir()`	`os.path.isdir(path)`
Create a directory	`path.mkdir(parents=True)`	`os.makedirs(path)`
Get the current directory	`pathlib.Path.cwd()`	`os.getcwd()`
Get the home directory	`pathlib.Path.home()`	`os.path.expanduser("~")`
Find files by pattern	`path.glob(pattern)`	`glob.iglob(pattern)`
Find files recursively	`path.rglob(pattern)`	`glob.iglob(pattern, recursive=True)`
Normalize path separators	`pathlib.Path(name)`	`os.path.normpath(name)`
Join paths	`Path(parent, name)`	`os.path.join(parent, name)`
Get the file size	`path.stat().st_size`	`os.path.getsize(path)`
Walk a file tree	`path.walk()` (Python 3.12+)	`os.walk(path)`
Move (rename) a file to a new path	`path.rename(target)`	`os.rename(path, target)`
Delete a file	`path.unlink()`	`os.remove(path)`

Comparing the two approaches side by side makes the benefits of pathlib's improvements easy to see.
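
A few of these pairs can be checked directly. The sketch below uses posixpath (the POSIX implementation behind os.path) together with PurePosixPath so the result does not depend on the OS; the sample path is illustrative. Note that splitext applied to a full path keeps the directory part, so the stem comparison strips the directory with basename first.

```python
import posixpath  # what os.path resolves to on Linux and macOS
from pathlib import PurePosixPath

raw = "/temp/pathlib/program.py"
path = PurePosixPath(raw)

# (pathlib result, os.path-style result) for each operation
pairs = {
    "name": (path.name, posixpath.basename(raw)),
    "parent": (str(path.parent), posixpath.dirname(raw)),
    "suffix": (path.suffix, posixpath.splitext(raw)[1]),
    "stem": (path.stem, posixpath.splitext(posixpath.basename(raw))[0]),
    "join": (str(PurePosixPath("/temp", "pathlib")), posixpath.join("/temp", "pathlib")),
}

for label, (new, old) in pairs.items():
    print(f"{label:7} {new!r:20} {old!r:20} {new == old}")

# splitext on the full path keeps the directory:
print(posixpath.splitext(raw)[0])  # /temp/pathlib/program
```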
