MySQL 8 from the C++ source: how the server receives a client's SQL network request and processes the SQL command

A deep dive into MySQL 8: how the network layer reads client data

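In the MySQL server, network communication is one of the core steps in handling a client request. Understanding how MySQL reads client data off the network clarifies how the server works internally and gives useful leads for performance tuning and troubleshooting. This article walks through the MySQL 8 C++ source to show, step by step, how the server reads the data a client sends.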

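1. Entry point for reading network data: `do_command`

In MySQL, each client connection is managed by a THD object, the thread/connection descriptor. THD is the core class representing a client session; it holds the state and the interfaces the server uses to interact with that client. When the server is ready to read from the client, the flow starts in do_command, which asks the session's protocol object for the next command: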

rc = thd->m_mem_cnt.reset();
if (rc)
    thd->m_mem_cnt.set_thd_error_status();
else {
    thd->m_server_idle = true;
    rc = thd->get_protocol()->get_command(&com_data, &command);
    thd->m_server_idle = false;
}

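In this code:

m_mem_cnt.reset(): resets the session's memory counter, which is used to monitor memory usage. If the reset fails, an error status is set on the THD and no command is read.
m_server_idle: marks whether the server is idle on this connection. It is used mainly for performance instrumentation and event tracking, for example recording IDLE events while waiting for a new network packet.
get_protocol()->get_command: calls get_command on the protocol class (Protocol_classic for the classic protocol) to read the client's command from the network.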

2. Reading a packet from the network: `Protocol_classic::get_command`

Protocol_classic::get_command implements MySQL's classic protocol. It reads one packet from the network and extracts the client's command type:

int Protocol_classic::get_command(COM_DATA *com_data, enum_server_command *cmd) {
    if (const int rc = read_packet()) return rc;

    if (input_packet_length == 0) {
        input_raw_packet[0] = (uchar)COM_SLEEP;
        input_packet_length = 1;
    }

    input_raw_packet[input_packet_length] = '\0'; // safety: NUL-terminate the packet

    *cmd = (enum enum_server_command)(uchar)input_raw_packet[0];
    if (*cmd >= COM_END) *cmd = COM_END;  // out-of-range command byte

    input_packet_length--;
    input_raw_packet++;
    return parse_packet(com_data, *cmd);
}

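The key steps are:

read_packet: reads one packet from the network. If the read fails, the error code is returned immediately.
Packet length check: a zero-length packet is treated as a COM_SLEEP command, so later code always has a command byte to look at.
Command extraction: the first byte of the packet identifies the command type (COM_QUERY, COM_STMT_EXECUTE, and so on). Any value at or beyond COM_END is mapped to COM_END, which marks an invalid command. The pointer and length are then advanced past the command byte.
Call to parse_packet: parses the rest of the packet and extracts the arguments of the command.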
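
The command-byte handling above can be sketched in isolation. This is a minimal, standalone illustration and not the server's code: `ParsedCommand` and `split_command` are hypothetical names, and `kComEnd = 34` is an assumed placeholder for the real COM_END sentinel, whose value depends on the server version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// COM_SLEEP = 0 and COM_QUERY = 3 match the classic protocol.
// kComEnd is an assumed placeholder for the version-dependent COM_END.
constexpr std::uint8_t kComSleep = 0;
constexpr std::uint8_t kComQuery = 3;
constexpr std::uint8_t kComEnd = 34;

// Hypothetical result type: the command plus the bytes that follow it.
struct ParsedCommand {
  std::uint8_t command;
  const std::uint8_t *payload;
  std::size_t payload_length;
};

// Splits a raw packet into its command byte and remaining payload.
ParsedCommand split_command(const std::uint8_t *packet, std::size_t length) {
  assert(packet != nullptr || length == 0);
  // An empty packet is treated as COM_SLEEP with no payload.
  if (length == 0) return {kComSleep, nullptr, 0};
  std::uint8_t cmd = packet[0];
  // Clamp out-of-range command bytes to the sentinel.
  if (cmd >= kComEnd) cmd = kComEnd;
  // Skip the command byte, as get_command does before parse_packet.
  return {cmd, packet + 1, length - 1};
}
```

Calling `split_command` on the bytes `03 'S' 'E' 'L'` would yield command 3 (COM_QUERY) with a three-byte payload starting at `'S'`.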

3. The low-level packet read: `my_net_read`

The core of read_packet is a call to my_net_read, the central function of MySQL's network layer for reading a packet from the connection:

ulong my_net_read(NET *net) {
    size_t len;
    if (!vio_is_blocking(net->vio)) vio_set_blocking_flag(net->vio, true);

    if (net->compress)
        net_read_compressed_packet(net, len);
    else
        net_read_uncompressed_packet(net, len);

    return static_cast<ulong>(len);
}

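my_net_read does two things:

Blocking mode: if the connection's VIO is not already blocking, it is switched to blocking, so the read below waits for a complete packet instead of having to deal with partial, non-blocking reads.
Compressed versus uncompressed packets:
- If compression is enabled on the connection (net->compress is true), net_read_compressed_packet reads the packet and decompresses it.
- Otherwise, net_read_uncompressed_packet reads the raw data directly.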
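
Both read paths start from the same framing: in the classic protocol each packet begins with a 4-byte header, a 3-byte little-endian payload length followed by a 1-byte sequence number. The sketch below decodes that header from a byte buffer; `PacketHeader` and `decode_header` are hypothetical names used only for illustration, and `kNetHeaderSize` mirrors the server's NET_HEADER_SIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNetHeaderSize = 4;  // mirrors NET_HEADER_SIZE

// Hypothetical struct holding the two header fields.
struct PacketHeader {
  std::uint32_t payload_length;  // 3-byte little-endian length
  std::uint8_t sequence_id;      // packet sequence number
};

// Decodes the 4-byte packet header at the start of buf.
PacketHeader decode_header(const std::uint8_t *buf, std::size_t available) {
  assert(available >= kNetHeaderSize);
  const std::uint32_t len = static_cast<std::uint32_t>(buf[0]) |
                            (static_cast<std::uint32_t>(buf[1]) << 8) |
                            (static_cast<std::uint32_t>(buf[2]) << 16);
  return {len, buf[3]};
}
```

The payload itself starts `kNetHeaderSize` bytes into the buffer, which is why the server code adds NET_HEADER_SIZE when it computes read positions.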

4. Parsing the packet: `Protocol_classic::parse_packet`

Once the packet is in memory, parse_packet interprets its contents according to the command type:

bool Protocol_classic::parse_packet(union COM_DATA *data, enum_server_command cmd) {
    switch (cmd) {
        case COM_INIT_DB:
            data->com_init_db.db_name = reinterpret_cast<const char *>(input_raw_packet);
            data->com_init_db.length = input_packet_length;
            break;

        case COM_QUERY:
            data->com_query.query = reinterpret_cast<const char *>(input_raw_packet);
            data->com_query.length = input_packet_length;
            break;

        case COM_STMT_EXECUTE:
            data->com_stmt_execute.stmt_id = uint4korr(input_raw_packet);
            // parse further parameters...
            break;

        default:
            break;
    }

    return false;
}

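The core of parse_packet is a switch on the command type (cmd) that fills the matching member of the COM_DATA union with the key fields of the packet:

COM_INIT_DB: the database name and its length.
COM_QUERY: the SQL query string and its length. The pointer refers to the packet buffer itself; nothing is copied.
COM_STMT_EXECUTE: the prepared statement ID, read as a 4-byte little-endian integer with uint4korr, followed by the execution parameters.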
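
To make the COM_STMT_EXECUTE case concrete, here is a minimal sketch of decoding the start of its payload: a 4-byte little-endian statement ID followed by a 1-byte flags field, as laid out in the classic protocol. `read_u32_le` plays the role of uint4korr; `StmtExecuteHead` and `parse_stmt_execute_head` are hypothetical names, and the sketch deliberately stops before the iteration count and parameter values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads a 4-byte little-endian integer (what uint4korr does).
std::uint32_t read_u32_le(const std::uint8_t *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical holder for the first two COM_STMT_EXECUTE fields.
struct StmtExecuteHead {
  std::uint32_t stmt_id;
  std::uint8_t flags;
};

// Returns true on error and false on success, following the
// convention parse_packet uses. `payload` excludes the command byte.
bool parse_stmt_execute_head(const std::uint8_t *payload, std::size_t length,
                             StmtExecuteHead *out) {
  assert(out != nullptr);
  if (length < 5) return true;  // too short for stmt_id + flags
  out->stmt_id = read_u32_le(payload);
  out->flags = payload[4];
  return false;
}
```

The length check matters because the payload comes straight from the client: reading four bytes from a shorter packet would run past the data the client actually sent.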

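5. Multi-packet handling and offset updates: `net_read_update_offsets`

In some cases the data a client sends spans more than one packet, for example a very large query or a bulk operation. While reading, MySQL uses net_read_update_offsets to keep the buffer offsets in the NET structure consistent: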

static ulong net_read_update_offsets(NET *net, size_t start_of_packet, size_t first_packet_offset, size_t buf_length, uint multi_byte_packet) {
    net->read_pos = net->buff + first_packet_offset + NET_HEADER_SIZE;
    net->buf_length = buf_length;
    net->remain_in_buf = (ulong)(buf_length - start_of_packet);

    const ulong len = ((ulong)(start_of_packet - first_packet_offset) - NET_HEADER_SIZE - multi_byte_packet);
    if (net->remain_in_buf) {
        net->save_char = net->read_pos[len + multi_byte_packet];
    }
    net->read_pos[len] = '\0';  // safety: NUL-terminate the payload
    return len;
}

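The function does the following:

Updates the offsets in the NET structure (read_pos, buf_length, remain_in_buf) so that later reads pick up at the right place when several packets share the buffer.
Computes the payload length by subtracting the packet header, and the extra header bytes of a multi-packet payload, from the span between the two offsets.
Writes a terminating '\0' just past the end of the payload, so code that treats the payload as a C string stops at the packet boundary. When more data remains in the buffer, a byte of it is first stored in net->save_char so it can be restored later.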
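
The protocol rule behind multi-packet payloads is simple: a packet whose payload is exactly the maximum size (0xFFFFFF bytes in the classic protocol) signals that another packet follows, and the payloads are concatenated. The sketch below shows that rule on an in-memory byte stream. It is not a reimplementation of net_read_update_offsets; `reassemble` is a hypothetical helper, and `max_chunk` is a parameter only so the behaviour can be tried with small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kHeaderSize = 4;  // 3-byte length + 1-byte sequence

// Concatenates the payloads of consecutive packets in `stream` into `out`.
// A payload of exactly `max_chunk` bytes means "another packet follows".
// Returns true on error (truncated input), false on success.
bool reassemble(const std::vector<std::uint8_t> &stream, std::size_t max_chunk,
                std::vector<std::uint8_t> *out) {
  assert(out != nullptr);
  out->clear();
  std::size_t pos = 0;  // invariant: pos <= stream.size()
  for (;;) {
    if (stream.size() - pos < kHeaderSize) return true;
    const std::size_t len =
        static_cast<std::size_t>(stream[pos]) |
        (static_cast<std::size_t>(stream[pos + 1]) << 8) |
        (static_cast<std::size_t>(stream[pos + 2]) << 16);
    pos += kHeaderSize;
    if (stream.size() - pos < len) return true;
    out->insert(out->end(),
                stream.begin() + static_cast<std::ptrdiff_t>(pos),
                stream.begin() + static_cast<std::ptrdiff_t>(pos + len));
    pos += len;
    // A short (or empty) packet ends the logical payload.
    if (len != max_chunk) return false;
  }
}
```

One consequence of the rule is that a payload whose size is an exact multiple of the maximum must be followed by an empty packet, otherwise the reader keeps waiting for more data.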

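6. Summary

MySQL 8's network layer handles client data through a layered, modular design. From the entry point in do_command, through the low-level read in my_net_read, to the parsing in parse_packet, each layer has one job: reading bytes, framing packets, or interpreting commands. Multi-packet handling and offset management in the NET structure let the same path cope with payloads that do not fit in a single packet.

A solid understanding of this read path makes it easier to tune network performance and to track down problems related to client/server communication.

In the MySQL 8 C++ source, handling a network request and its SQL command involves several functions and classes. The key ones, and their roles, are described below: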

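1. The `do_command` function

do_command is the core function the MySQL server uses to handle a client command: it reads one command from the client and executes it. It is defined in sql/sql_parse.cc.

Function prototype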

bool do_command(THD *thd);

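Main steps

Initialization:
- Clear any leftover error state.
- Set the thread's current query block to null.
- Clear the error message and reset the diagnostics area.
Read the command:
- Set the network read timeout to net_wait_timeout while waiting for the next command.
- Call thd->get_protocol()->get_command to read the command from the client.
Handle the command:
- If reading the command failed, report the error and return.
- Otherwise, restore the read timeout to net_read_timeout and call dispatch_command to process the command.
Cleanup:
- On both the success and the failure path, the statement instrumentation must be closed before returning.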

2. The `dispatch_command` function

dispatch_command calls the appropriate handler for the command type the client sent (COM_QUERY, COM_STMT_PREPARE, and so on). It is also defined in sql/sql_parse.cc.

Function prototype

bool dispatch_command(THD *thd, COM_DATA *com_data, enum enum_server_command command);

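Main steps

Check the command type:
- Call the handler that matches the command type. For example:
  - COM_QUERY: parse and execute the SQL text (through dispatch_sql_command in recent 8.0 releases; earlier versions named this function mysql_parse).
  - COM_STMT_PREPARE: parse and prepare a prepared statement through mysqld_stmt_prepare.
  - COM_QUIT: handle the client disconnecting.
Execute the command:
- The selected handler carries out the command.
Return the result:
- Return false to keep the connection open, or true to request that the connection thread shut down.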
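
The dispatch step boils down to a switch on the command byte. The following is a deliberately simplified, standalone sketch of that shape, not the server's implementation: `Session`, `Command`, and `dispatch` are hypothetical names, the handlers only record what they would do, and only the return convention (true means "close the connection") follows do_command's documented contract.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Command bytes as used by the classic protocol.
enum class Command : std::uint8_t {
  Sleep = 0,
  Quit = 1,
  InitDb = 2,
  Query = 3,
  StmtPrepare = 22,
};

// Hypothetical stand-in for per-connection state.
struct Session {
  std::string current_db;
  std::vector<std::string> log;  // records what each handler did
};

// Routes one command to its handler.
// Returns true when the connection should be closed.
bool dispatch(Session *session, Command cmd, const std::string &payload) {
  assert(session != nullptr);
  switch (cmd) {
    case Command::InitDb:
      session->current_db = payload;
      session->log.push_back("use " + payload);
      return false;
    case Command::Query:
      session->log.push_back("query: " + payload);
      return false;
    case Command::StmtPrepare:
      session->log.push_back("prepare: " + payload);
      return false;
    case Command::Quit:
      session->log.push_back("quit");
      return true;  // ask the caller to end the connection loop
    default:
      session->log.push_back("unknown command");
      return false;
  }
}
```

Keeping the routing in one switch means every command passes through a single choke point, which is where the real server also places its auditing and instrumentation hooks.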

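3. The `handle_connection` function

handle_connection is the thread function that serves a new connection. It is defined in sql/conn_handler/connection_handler_per_thread.cc. It initializes the thread for the new connection and then calls do_command in a loop to process the client's commands.

Function prototype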

extern "C" void *handle_connection(void *arg);

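Main steps

Initialize the thread:
- Initialize the thread's resources and state.
- Set up the thread's TLS (thread-local storage).
Serve the connection:
- Call do_command repeatedly to process client commands.
- When the client disconnects, clean up the resources held for the connection.
Clean up the thread:
- Release the thread's resources.
- Exit the thread.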
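
The middle step is a plain loop: keep calling do_command while the connection is alive, and stop as soon as it returns true. The sketch below captures only that control flow under stated assumptions: `serve_connection` is a hypothetical function, and the two callbacks stand in for the server's liveness check and for do_command.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Runs the command loop for one connection and returns how many
// commands were processed.
//   connection_alive: stand-in for the server's liveness check
//   do_command_step:  stand-in for do_command; true means "shut down"
std::size_t serve_connection(const std::function<bool()> &connection_alive,
                             const std::function<bool()> &do_command_step) {
  assert(static_cast<bool>(connection_alive));
  assert(static_cast<bool>(do_command_step));
  std::size_t processed = 0;
  while (connection_alive()) {
    ++processed;
    if (do_command_step()) break;  // command requested shutdown
  }
  return processed;
}
```

Because the loop has two exits, a connection can end either when the liveness check fails (for example after a KILL) or when a command such as COM_QUIT asks for shutdown.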

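4. The `Connection_handler_manager` class

Connection_handler_manager manages the server's connection handler. It is defined under sql/conn_handler/ (connection_handler_manager.h and connection_handler_manager.cc). Its main responsibilities are:

Receiving new connections once they have been accepted.
Handing each connection to the active connection handler, which creates a thread to serve it.
Managing the lifecycle of connections.

Main methods

process_new_connection: processes a new connection.
add_connection: adds the new connection to the connection handler. This method belongs to the Connection_handler interface that the manager delegates to.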

/**
  Read one command from connection and execute it (query or simple command).
  This function is called in loop from thread function.

  For profiling to work, it must never be called recursively.

  @retval
    0  success
  @retval
    1  request of thread shutdown (see dispatch_command() description)
*/

bool do_command(THD *thd) {
  bool return_value;
  int rc;
  NET *net = nullptr;
  enum enum_server_command command = COM_SLEEP;
  COM_DATA com_data;
  DBUG_TRACE;
  assert(thd->is_classic_protocol());

  /*
    indicator of uninitialized lex => normal flow of errors handling
    (see my_message_sql)
  */
  thd->lex->set_current_query_block(nullptr);

  /*
    XXX: this code is here only to clear possible errors of init_connect.
    Consider moving to prepare_new_connection_state() instead.
    That requires making sure the DA is cleared before non-parsing statements
    such as COM_QUIT.
  */
  thd->clear_error();  // Clear error message
  thd->get_stmt_da()->reset_diagnostics_area();

  /*
    This thread will do a blocking read from the client which
    will be interrupted when the next command is received from
    the client, the connection is closed or "net_wait_timeout"
    number of seconds has passed.
  */
  net = thd->get_protocol_classic()->get_net();
  my_net_set_read_timeout(net, thd->variables.net_wait_timeout);
  net_new_transaction(net);

  /*
    WL#15369 : to make connections threads sleep to test if
    ER_THREAD_STILL_ALIVE, and ER_NUM_THREADS_STILL_ALIVE are
    being logged in the intended way, i.e. when connection threads
    are still alive, even after forcefully disconnecting them in
    close_connections().
 */
  DBUG_EXECUTE_IF("simulate_connection_thread_hang", sleep(15););

  /*
    Synchronization point for testing of KILL_CONNECTION.
    This sync point can wait here, to simulate slow code execution
    between the last test of thd->killed and blocking in read().

    The goal of this test is to verify that a connection does not
    hang, if it is killed at this point of execution.
    (Bug#37780 - main.kill fails randomly)

    Note that the sync point wait itself will be terminated by a
    kill. In this case it consumes a condition broadcast, but does
    not change anything else. The consumed broadcast should not
    matter here, because the read/recv() below doesn't use it.
  */
  DEBUG_SYNC(thd, "before_do_command_net_read");

  /* For per-query performance counters with log_slow_statement */
  struct System_status_var query_start_status;
  thd->clear_copy_status_var();
  if (opt_log_slow_extra) {
    thd->copy_status_var(&query_start_status);
  }

  rc = thd->m_mem_cnt.reset();
  if (rc)
    thd->m_mem_cnt.set_thd_error_status();
  else {
    /*
      Because of networking layer callbacks in place,
      this call will maintain the following instrumentation:
      - IDLE events
      - SOCKET events
      - STATEMENT events
      - STAGE events
      when reading a new network packet.
      In particular, a new instrumented statement is started.
      See init_net_server_extension()
    */
    thd->m_server_idle = true;
    rc = thd->get_protocol()->get_command(&com_data, &command);
    thd->m_server_idle = false;
  }

  if (rc) {
#ifndef NDEBUG
    char desc[VIO_DESCRIPTION_SIZE];
    vio_description(net->vio, desc);
    DBUG_PRINT("info", ("Got error %d reading command from socket %s",
                        net->error, desc));
#endif  // NDEBUG

    MYSQL_NOTIFY_STATEMENT_QUERY_ATTRIBUTES(thd->m_statement_psi, false);

    /* Instrument this broken statement as "statement/com/error" */
    thd->m_statement_psi = MYSQL_REFINE_STATEMENT(
        thd->m_statement_psi, com_statement_info[COM_END].m_key);

    /* Check if we can continue without closing the connection */

    /* The error must be set. */
    assert(thd->is_error());
    thd->send_statement_status();

    /* Mark the statement completed. */
    MYSQL_END_STATEMENT(thd->m_statement_psi, thd->get_stmt_da());
    thd->m_statement_psi = nullptr;
    thd->m_digest = nullptr;

    if (rc < 0) {
      return_value = true;  // We have to close it.
      goto out;
    }
    net->error = NET_ERROR_UNSET;
    return_value = false;
    goto out;
  }

#ifndef NDEBUG
  char desc[VIO_DESCRIPTION_SIZE];
  vio_description(net->vio, desc);
  DBUG_PRINT("info", ("Command on %s = %d (%s)", desc, command,
                      Command_names::str_notranslate(command).c_str()));
  expected_from_debug_flag = TDM::ANY;
  DBUG_EXECUTE_IF("tdon", { expected_from_debug_flag = TDM::ON; });
  DBUG_EXECUTE_IF("tdzero", { expected_from_debug_flag = TDM::ZERO; });
  DBUG_EXECUTE_IF("tdna", { expected_from_debug_flag = TDM::NOT_AVAILABLE; });
#endif  // NDEBUG
  DBUG_PRINT("info", ("packet: '%*.s'; command: %d",
                      (int)thd->get_protocol_classic()->get_packet_length(),
                      thd->get_protocol_classic()->get_raw_packet(), command));
  if (thd->get_protocol_classic()->bad_packet)
    assert(0);  // Should be caught earlier

  // Reclaim some memory
  thd->get_protocol_classic()->get_output_packet()->shrink(
      thd->variables.net_buffer_length);
  /* Restore read timeout value */
  my_net_set_read_timeout(net, thd->variables.net_read_timeout);

  DEBUG_SYNC(thd, "before_command_dispatch");

  return_value = dispatch_command(thd, &com_data, command);
  thd->get_protocol_classic()->get_output_packet()->shrink(
      thd->variables.net_buffer_length);

out:
  /* The statement instrumentation must be closed in all cases. */
  assert(thd->m_digest == nullptr);
  assert(thd->m_statement_psi == nullptr);
  return return_value;
}

/**
  Perform one connection-level (COM_XXXX) command.

  @param thd             connection handle
  @param command         type of command to perform
  @param com_data        com_data union to store the generated command

  @todo
    set thd->lex->sql_command to SQLCOM_END here.
  @todo
    The following has to be changed to an 8 byte integer

  @retval
    0   ok
  @retval
    1   request of thread shutdown, i. e. if command is