2026年CentOS上SQL Server资源占用监控与管理完整指南(2026)

一、SQL Server资源管理概述

SQL Server on Linux的资源管理是确保数据库性能稳定的关键。与Windows版SQL Server不同,Linux版本的资源管理有其独特之处,需要管理员深入了解CPU、内存、磁盘I/O和网络等系统资源的分配与监控。

SQL Server资源组件
内存管理器:统一缓冲池,支持动态内存配置
调度器:支持多处理器调度,优化并发处理
I/O管理器:异步I/O,优化磁盘访问
网络接口:支持高带宽网络传输

资源管理的重要性
– 防止单一查询占用过多资源
– 确保多用户并发访问的公平性
– 优化整体系统性能
– 避免资源耗尽导致服务中断

二、系统资源要求

2.1 硬件要求

组件 最低要求 推荐配置 说明
CPU 2核心 8+核心 支持SMP和NUMA架构
内存 2GB 16GB+ SQL Server建议预留充足内存
磁盘空间 6GB 100GB+ 数据文件和日志文件
磁盘I/O 500 IOPS 5000+ IOPS SSD推荐

2.2 内存要求详解

最低内存要求说明

最小:2GB(仅SQL Server引擎)
推荐:4GB+(生产环境基本要求)
理想:8GB+(流畅运行)
最佳:16GB+(高性能应用)

内存分配建议
– 预留2-4GB给操作系统
– SQL Server最大服务器内存设置:总内存 – 4GB
– 考虑其他服务(如SSMS、Agent)的内存需求

2.3 CPU要求详解

CPU架构支持
– x64架构(推荐)
– ARM64架构(SQL Server 2019+支持)

CPU配置建议
– 生产环境:8核心以上
– 启用SQL Server的CPU亲和性
– 考虑NUMA架构优化

三、内存管理与配置

3.1 查看内存使用

# 查看系统内存概况
free -h

# 查看SQL Server进程内存使用
ps aux | grep mssql

# 查看详细内存信息
cat /proc/meminfo

# 查看SQL Server内存占用(持续监控)
watch -n 5 'ps aux | grep mssql | grep -v grep'

SQL Server内存使用示例

USER       PID   %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mssql    12345   45.2  62.1 16384  8000 ?        Ssl   02:00 245:30 /opt/mssql/bin/sqlservr

3.2 配置最大服务器内存

# 查看当前配置
sudo /opt/mssql/bin/mssql-conf get memory.memorylimitmb

# 设置最大内存(例如32GB = 32768MB)
sudo /opt/mssql/bin/mssql-conf set memory.memorylimitmb 32768

# 设置最小内存(例如16GB = 16384MB)
sudo /opt/mssql/bin/mssql-conf set memory.memorylimitmb 16384

# 重启SQL Server生效
sudo systemctl restart mssql-server

3.3 使用T-SQL查看内存配置

-- 查看当前内存配置
SELECT 
    name,
    value,
    value_in_use,
    description
FROM sys.configurations
WHERE name IN ('max server memory (MB)', 'min server memory (MB)');

-- 查看SQL Server内存使用详情
SELECT 
    (SELECT COUNT(*) * 8 FROM sys.dm_exec_requests) AS allocated_memory_kb,
    (SELECT COUNT(*) * 8 FROM sys.dm_os_memory_clerks) AS total_memory_kb,
    (SELECT physical_memory_kb FROM sys.dm_os_sys_info) AS total_physical_memory_kb;

3.4 内存压力诊断

-- 查看内存 clerk 信息
SELECT 
    type,
    name,
    pages_kb / 1024 AS pages_mb,
    virtual_memory_committed_kb / 1024 AS virtual_mb
FROM sys.dm_os_memory_clerks
ORDER BY pages_kb DESC;

-- 查看_buffer_pool信息
SELECT 
    COUNT(*) * 8 / 1024 AS buffer_pool_size_mb,
    SUM(free_pages * 8) / 1024 AS free_pages_mb
FROM sys.dm_os_memory_cache_entries;

-- 查看内存压力信号
SELECT 
    signal_wait_time_ms,
    page_life_expectancy,
    batch_requests_per_sec,
    buffer_cache_hit_ratio
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR';

四、CPU管理与配置

4.1 查看CPU使用

# 查看CPU使用率(总体)
top
htop

# 查看SQL Server CPU使用
top -p $(pgrep -f sqlservr)

# 查看CPU核心数
nproc
lscpu | grep -E "^CPU\(s\)|Core|Socket"

# 查看每个核心的负载
mpstat -P ALL 1 5

# CPU使用详情
cat /proc/cpuinfo

4.2 配置CPU亲和性

# 查看当前CPU亲和性配置
sudo /opt/mssql/bin/mssql-conf get processaffinity

# 设置CPU亲和性(使用前4个核心)
sudo /opt/mssql/bin/mssql-conf set processaffinity "0-3"

# 禁用CPU亲和性(让系统自动调度)
sudo /opt/mssql/bin/mssql-conf set processaffinity 0

# 重启生效
sudo systemctl restart mssql-server

4.3 使用T-SQL查看CPU使用

-- 查看当前CPU使用情况
SELECT 
    scheduler_id,
    cpu_id,
    is_online,
    is_idle,
    current_tasks_count,
    runnable_tasks_count,
    active_workers_count,
    pending_disk_io_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';

-- 查看最近执行的查询CPU使用
SELECT TOP 20
    query_text,
    total_worker_time / 1000 AS cpu_time_ms,
    execution_count,
    total_worker_time / execution_count AS avg_cpu_time
FROM (
    SELECT 
        SUBSTRING(st.text, (qs.statement_start_offset/2) + 1,
            ((CASE statement_end_offset
                WHEN -1 THEN DATALENGTH(st.text)
                ELSE qs.statement_end_offset END - qs.statement_start_offset)/2) + 1) AS query_text,
        qs.total_worker_time,
        qs.execution_count,
        qs.statement_start_offset,
        qs.statement_end_offset
    FROM sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
) AS q
ORDER BY total_worker_time DESC;

4.4 高CPU使用诊断

-- 查看正在运行的查询CPU占用
SELECT 
    r.session_id,
    r.status,
    r.cpu_time,
    r.total_elapsed_time,
    t.text AS query_text
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.cpu_time > 10000
ORDER BY r.cpu_time DESC;

-- 查看CPU密集型查询
SELECT 
    SUBSTRING(t.text, (r.statement_start_offset/2) + 1,
        ((CASE statement_end_offset
            WHEN -1 THEN DATALENGTH(t.text)
            ELSE r.statement_end_offset END - r.statement_start_offset)/2) + 1) AS query_text,
    r.cpu_time,
    r.total_elapsed_time,
    r.execution_count
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.status = 'running'
ORDER BY r.cpu_time DESC;

五、磁盘I/O监控

5.1 查看磁盘使用

# 查看磁盘空间
df -h

# 查看SQL Server数据目录
ls -lh /var/opt/mssql/data/

# 查看磁盘I/O统计
iostat -x 1 5

# 查看磁盘延迟
iostat -d -x 1 2 | awk 'NR>3 {print $1, $4, $7, $8}'

# 查看文件I/O延迟
iotop -b -n 3 | grep mssql

5.2 SQL Server I/O统计

-- 查看数据库文件I/O统计
SELECT 
    DB_NAME(database_id) AS DatabaseName,
    file_id,
    num_of_reads,
    num_of_writes,
    num_of_bytes_read / 1024 / 1024 AS mb_read,
    num_of_bytes_written / 1024 / 1024 AS mb_written,
    io_stall_read_ms / NULLIF(num_of_reads, 0) AS avg_read_latency_ms,
    io_stall_write_ms / NULLIF(num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL)
WHERE database_id > 4
ORDER BY io_stall DESC;

-- 查看所有文件的I/O等待
SELECT 
    name AS file_name,
    physical_name,
    type_desc,
    state_desc,
    is_read_only,
    size * 8 / 1024 AS size_mb,
    growth * 8 / 1024 AS growth_mb
FROM sys.master_files
WHERE database_id > 4;

5.3 I/O瓶颈诊断

-- 查看I/O等待统计
SELECT 
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms,
    wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (
    'PAGEIOLATCH_SH', 'PAGEIOLATCH_EX', 'PAGEIOLATCH_UP',
    'IO_COMPLETION', 'ASYNC_IO_COMPLETION', 'LOGMGR_FLUSH'
)
ORDER BY wait_time_ms DESC;

-- 查看Paginated读取
SELECT 
    database_id,
    file_id,
    num_of_reads,
    num_of_writes,
    io_stall_read_ms,
    io_stall_write_ms,
    CAST(num_of_bytes_read / 1048576.0 AS DECIMAL(10,2)) AS mb_read,
    CAST(num_of_bytes_written / 1048576.0 AS DECIMAL(10,2)) AS mb_written
FROM sys.dm_io_virtual_file_stats(NULL, NULL)
ORDER BY io_stall_read_ms DESC;

六、网络资源监控

6.1 查看网络使用

# 查看网络连接
netstat -an | grep 1433

# 查看SQL Server网络连接状态
ss -tlnp | grep 1433

# 查看网络流量
nload
nethogs

# 查看SQL Server端口
cat /var/opt/mssql/log/errorlog | grep "listening"

6.2 SQL Server网络配置

# 查看TCP端口配置
sudo /opt/mssql/bin/mssql-conf get network.tcpport

# 修改TCP端口(例如改为14333)
sudo /opt/mssql/bin/mssql-conf set network.tcpport 14333

# 查看网络加密配置
sudo /opt/mssql/bin/mssql-conf get network.forceencryption

# 启用强制加密
sudo /opt/mssql/bin/mssql-conf set network.forceencryption 1

# 重启生效
sudo systemctl restart mssql-server

6.3 网络性能监控

-- 查看当前连接
SELECT 
    session_id,
    login_time,
    host_name,
    program_name,
    client_interface_name,
    client_net_address,
    auth_scheme,
    encrypt_option
FROM sys.dm_exec_connections;

-- 查看连接数统计
SELECT 
    host_name,
    program_name,
    COUNT(*) AS connection_count,
    MIN(login_time) AS first_connect,
    MAX(login_time) AS last_connect
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY host_name, program_name
ORDER BY connection_count DESC;

-- 查看网络I/O
SELECT 
    session_id,
    client_net_address,
    num_reads,
    num_writes,
    bytes_sent / 1024 AS kb_sent,
    bytes_received / 1024 AS kb_received
FROM sys.dm_exec_connections;

七、资源监控脚本

7.1 综合监控脚本

#!/bin/bash
# monitor_sqlserver_resources.sh - SQL Server资源综合监控脚本

OUTPUT_FILE="/tmp/mssql_resource_monitor_$(date +%Y%m%d_%H%M%S).txt"
SA_PASSWORD="YourStrong@Password"

echo "=== SQL Server资源监控报告 ===" > $OUTPUT_FILE
echo "监控时间:$(date)" >> $OUTPUT_FILE
echo "" >> $OUTPUT_FILE

# 1. 系统资源
echo "【系统资源】" >> $OUTPUT_FILE
echo "CPU核心数:$(nproc)" >> $OUTPUT_FILE
echo "总内存:$(free -h | grep Mem | awk '{print $2}')" >> $OUTPUT_FILE
echo "磁盘空间:" >> $OUTPUT_FILE
df -h /var/opt/mssql >> $OUTPUT_FILE
echo "" >> $OUTPUT_FILE

# 2. SQL Server进程
echo "【SQL Server进程】" >> $OUTPUT_FILE
ps aux | grep mssql | grep -v grep >> $OUTPUT_FILE 2>&1
echo "" >> $OUTPUT_FILE

# 3. 内存使用
echo "【内存使用】" >> $OUTPUT_FILE
sqlcmd -S localhost -U sa -P "$SA_PASSWORD" -C -Q "
SELECT 
    name AS config_name,
    value AS current_value,
    value_in_use AS running_value,
    description
FROM sys.configurations
WHERE name IN ('max server memory (MB)', 'min server memory (MB)');
" >> $OUTPUT_FILE 2>&1
echo "" >> $OUTPUT_FILE

# 4. 连接数
echo "【当前连接数】" >> $OUTPUT_FILE
sqlcmd -S localhost -U sa -P "$SA_PASSWORD" -C -Q "
SELECT 
    COUNT(*) AS total_connections,
    SUM(CASE WHEN status = 'running' THEN 1 ELSE 0 END) AS active_connections,
    SUM(CASE WHEN is_user_process = 1 THEN 1 ELSE 0 END) AS user_connections
FROM sys.dm_exec_sessions;
" >> $OUTPUT_FILE 2>&1
echo "" >> $OUTPUT_FILE

# 5. 性能指标
echo "【性能指标】" >> $OUTPUT_FILE
sqlcmd -S localhost -U sa -P "$SA_PASSWORD" -C -Q "
SELECT 
    AVG(batch_requests_per_sec) AS batch_requests,
    AVG(buffer_cache_hit_ratio) AS cache_hit_ratio,
    AVG(page_life_expectancy) AS page_life_expectancy,
    AVG(workfiles_created_per_sec) AS workfiles_created
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Batch Requests/sec', 'Buffer cache hit ratio', 'Page life expectancy', 'Workfiles Created/sec');
" >> $OUTPUT_FILE 2>&1

echo ""
echo "报告已保存到:$OUTPUT_FILE"
cat $OUTPUT_FILE

7.2 资源告警脚本

#!/bin/bash
# resource_alert.sh - SQL Server资源告警脚本

ALERT_EMAIL="admin@example.com"
SA_PASSWORD="YourStrong@Password"
CPU_THRESHOLD=80
MEM_THRESHOLD=90
CONN_THRESHOLD=80

log_alert() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a /tmp/mssql_alerts.log
}

# CPU检查
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
if (( $(echo "$CPU_USAGE > $CPU_THRESHOLD" | bc -l) )); then
    log_alert "警告:CPU使用率超过${CPU_THRESHOLD}%,当前:${CPU_USAGE}%"
fi

# 内存检查
MEM_USAGE=$(free | grep Mem | awk '{print int($3/$2 * 100)}')
if [ $MEM_USAGE -gt $MEM_THRESHOLD ]; then
    log_alert "警告:内存使用率超过${MEM_THRESHOLD}%,当前:${MEM_USAGE}%"
fi

# 连接数检查
MAX_CONN=$(sqlcmd -S localhost -U sa -P "$SA_PASSWORD" -C -h -1 -Q "SELECT value FROM sys.configurations WHERE name = 'max server connections'" 2>/dev/null | tr -d ' ')
CUR_CONN=$(sqlcmd -S localhost -U sa -P "$SA_PASSWORD" -C -h -1 -Q "SELECT COUNT(*) FROM sys.dm_exec_sessions WHERE is_user_process = 1" 2>/dev/null | tr -d ' ')

if [ ! -z "$MAX_CONN" ] && [ ! -z "$CUR_CONN" ]; then
    CONN_PCT=$((CUR_CONN * 100 / MAX_CONN))
    if [ $CONN_PCT -gt $CONN_THRESHOLD ]; then
        log_alert "警告:数据库连接数超过${CONN_THRESHOLD}%,当前:${CONN_PCT}%(${CUR_CONN}/${MAX_CONN})"
    fi
fi

八、性能优化建议

8.1 内存优化

-- 优化内存配置建议
-- 1. 设置合理的最大服务器内存
EXEC sp_configure 'max server memory (MB)', 8192;
RECONFIGURE;

-- 2. 启用锁页内存(需root权限)
sudo bash -c 'echo "LockPagesInMemory" >> /etc/security/limits.d/mssql.conf'
sudo bash -c 'echo "* - memlock unlimited" >> /etc/security/limits.d/mssql.conf'

-- 3. 监控内存压力
SELECT * FROM sys.dm_os_ring_buffers 
WHERE ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR';

8.2 CPU优化

-- 启用SQL Server的"max degree of parallelism"
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;

-- 配置"cost threshold for parallelism"
EXEC sp_configure 'cost threshold for parallelism', 25;
RECONFIGURE;

-- 查看并行查询
SELECT 
    query,
    plan_handle,
    query_plan
FROM sys.dm_exec_query_stats qs
WHERE EXISTS (
    SELECT 1 FROM sys.dm_exec_query_plan(qs.plan_handle) p
    WHERE p.query_plan.value('//@Parallel', 'int') = 1
);

8.3 I/O优化

-- 查看延迟高的数据文件
SELECT 
    DB_NAME(mf.database_id) AS database_name,
    mf.name AS logical_name,
    mf.type_desc,
    io_stalls_read_ms / NULLIF(num_of_reads, 0) AS avg_read_latency,
    io_stalls_write_ms / NULLIF(num_of_writes, 0) AS avg_write_latency
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs
JOIN sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
WHERE vfs.database_id > 4
ORDER BY io_stalls_read_ms + io_stalls_write_ms DESC;

-- 创建聚集索引优化I/O
CREATE CLUSTERED INDEX IX_TableName_PrimaryKey
ON TableName(PrimaryKeyColumn)
WITH (DROP_EXISTING = ON);

九、资源限制配置

9.1 配置资源 governor

-- 创建资源池
CREATE RESOURCE POOL Pool_Production
WITH (
    min_cpu_percent = 20,
    max_cpu_percent = 80,
    min_memory_percent = 30,
    max_memory_percent = 70
);

CREATE RESOURCE POOL Pool_Development
WITH (
    min_cpu_percent = 10,
    max_cpu_percent = 50,
    min_memory_percent = 10,
    max_memory_percent = 30
);

-- 创建工作负载组
CREATE WORKLOAD GROUP Group_Production
USING Pool_Production;

CREATE WORKLOAD GROUP Group_Development
USING Pool_Development;

-- 启用Resource Governor
ALTER RESOURCE GOVERNOR RECONFIGURE;

9.2 查询限制

-- 限制查询执行时间(秒)
EXEC sp_configure 'query wait (s)', 30;
RECONFIGURE;

-- 限制最大查询内存(KB)
EXEC sp_configure 'max server memory (MB)', 8192;
RECONFIGURE;

-- 限制工作线程数
EXEC sp_configure 'max worker threads', 512;
RECONFIGURE;

十、常见问题与解决方案

Q1:SQL Server内存占用过高?

原因分析
– 未设置最大内存限制
– 存在内存泄漏
– 配置了过大的缓冲池

解决方案

# 设置最大内存(例如32GB)
sudo /opt/mssql/bin/mssql-conf set memory.memorylimitmb 32768
sudo systemctl restart mssql-server

# 检查内存泄漏
sqlcmd -S localhost -U sa -P "YourPassword" -Q "
SELECT type, name, pages_kb / 1024 AS mb
FROM sys.dm_os_memory_clerks
ORDER BY pages_kb DESC;
"

Q2:CPU使用率持续100%?

原因分析
– 存在CPU密集型查询
– 统计信息过时导致错误执行计划
– 缺少索引导致全表扫描

解决方案

-- 找出CPU密集型查询
SELECT TOP 10
    qs.total_worker_time / 1000 AS cpu_time_ms,
    qs.execution_count,
    qp.query_plan,
    SUBSTRING(st.text, (qs.statement_start_offset/2)+1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)+1) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
ORDER BY qs.total_worker_time DESC;

Q3:磁盘I/O成为瓶颈?

解决方案
1. 将数据和日志文件分散到不同磁盘
2. 使用SSD提升I/O性能
3. 启用数据文件初始化
4. 调整条带大小匹配I/O模式

# 查看当前I/O统计
sqlcmd -S localhost -U sa -P "YourPassword" -Q "
SELECT * FROM sys.dm_io_virtual_file_stats(NULL, NULL);
"

十一、资源监控最佳实践

11.1 监控频率建议

监控项 频率 告警阈值
CPU 每1分钟 >80%
内存 每1分钟 >90%
磁盘I/O 每5分钟 >50ms
连接数 每5分钟 >80%
查询性能 每15分钟 >5秒

11.2 建立监控基线

-- 创建性能基线表
CREATE TABLE PerformanceBaseline (
    ID INT IDENTITY PRIMARY KEY,
    CollectionTime DATETIME DEFAULT GETDATE(),
    CPUUsage DECIMAL(5,2),
    MemoryUsageMB DECIMAL(10,2),
    ActiveConnections INT,
    BatchRequestsPerSec DECIMAL(10,2),
    BufferCacheHitRatio DECIMAL(5,2),
    PageLifeExpectancy INT
);

-- 创建定期收集作业
CREATE PROCEDURE CollectBaselineMetrics
AS
BEGIN
    INSERT INTO PerformanceBaseline (CPUUsage, MemoryUsageMB, ActiveConnections, BatchRequestsPerSec, BufferCacheHitRatio, PageLifeExpectancy)
    VALUES (
        (SELECT cpu_busy FROM sys.dm_exec_sessions WHERE session_id = @@SPID),
        (SELECT working_set_kb / 1024 FROM sys.dm_os_process_memory),
        (SELECT COUNT(*) FROM sys.dm_exec_sessions WHERE is_user_process = 1),
        (SELECT AVG(batch_requests_per_sec) FROM sys.dm_os_performance_counters WHERE counter_name = 'Batch Requests/sec'),
        (SELECT AVG(buffer_cache_hit_ratio) FROM sys.dm_os_performance_counters WHERE counter_name = 'Buffer cache hit ratio'),
        (SELECT AVG(page_life_expectancy) FROM sys.dm_os_performance_counters WHERE counter_name = 'Page life expectancy')
    );
END;

11.3 自动化监控方案

  1. 使用Zabbix/Prometheus:系统级监控
  2. 使用SQL Server监控工具:数据库级监控
  3. 使用Grafana:可视化展示
  4. 设置邮件/短信告警:及时通知

总结

CentOS上SQL Server资源监控与管理的关键要点:

  • 内存管理:设置最大/最小服务器内存,监控缓冲池使用
  • CPU管理:配置亲和性,识别CPU密集型查询
  • I/O监控:监控读写延迟,优化数据文件分布
  • 网络监控:跟踪连接数,监控网络I/O
  • 自动化:脚本监控、告警设置、定期基线收集
  • 优化:根据监控数据调整配置,解决性能瓶颈

掌握这些资源监控技巧,可以确保SQL Server在CentOS环境下稳定高效运行。

本文基于SQL Server 2022和CentOS Stream 9编写,适用于CentOS 7+/RHEL 7+/CentOS Stream环境。

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注