Help Needed: How to Query DML SQL Statements Executed a Month Ago

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 求助如何查询一个月前的DML sql执行语句

| username: Alan

I want to retrieve a piece of data from a month ago. The general log was not enabled, but the binlog was enabled. I can’t find the storage directory for the binlog.
image

| username: 啦啦啦啦啦 | Original post link

Check the data directory of the pump component with tiup; by default, it is in data.pump.

| username: 普罗米修斯 | Original post link

tiup cluster edit-config ${Cluster_Name} to view the path

| username: 像风一样的男子 | Original post link

  1. Deploy a drainer component to output binlog logs as log files. The initial-commit-ts start time can be set by yourself.
    TiDB Binlog 配置说明 | PingCAP 文档中心
  2. Use Reparo to parse the binlog in the file. Set the downstream service type to print to only parse and print to standard output, and you can find the SQL you need during this process.
    Reparo 使用文档 | PingCAP 文档中心
| username: 小龙虾爱大龙虾 | Original post link

TiDB’s binlog includes two components, pump and drainer. Pump is responsible for collecting logs from each node, while drainer is responsible for the final sorting and output to downstream. The downstream can be a file, so you need to check the configuration of your drainer component to find the location where the file is stored. You can do this by using the command tiup cluster show-config <cluster_name>.

| username: Alan | Original post link

I saw the earliest one is only 2023-11-10 :grimacing:

| username: Alan | Original post link

Okay, thanks, I found it. But there are only logs up to November 10th. :sob:

| username: 像风一样的男子 | Original post link

Check how long the data retention period is set for the pump.

| username: Alan | Original post link

Indeed, it’s only 7 days. I want to ask if enabling the general log has a significant impact on performance. The customer requires 180 days of logs.

| username: 像风一样的男子 | Original post link

This is generally enabled temporarily when troubleshooting issues. It is not recommended to keep it enabled for a long time as it will consume a lot of resources.

| username: tidb菜鸟一只 | Original post link

Generally, no one keeps the general log on all the time. The logs would become enormous, and how would you search through them… If you really want to record them, I suggest deploying an ELK stack and storing the logs in Elasticsearch…

| username: Mzb329 | Original post link

  1. First, locate where the binary logs (binlog) are stored. Typically, on Linux systems, it is /var/log/mysql, and on Windows systems, it is C:\ProgramData\MySQL\MySQL Server 8.0\Data.
  2. Identify the binary log file corresponding to the time point one month ago.
  3. Query the database operation records from one month ago using the following command:
mysql -e "SET GLOBAL log_bin_trust_function_names=1; SET GLOBAL log_bin_trust_timestamps=1; SELECT * FROM information_schema.general_log WHERE event_time >= DATE_SUB(NOW(), INTERVAL 1 MONTH) AND command_type = 'Query' AND argument LIKE '%INSERT%' OR argument LIKE '%UPDATE%' OR argument LIKE '%DELETE%';"

This command will find all database operation records within the past month (up to now).
4. Based on the query results, find the required data. Executing these operations may affect the database, so back up the data before performing the operations.

| username: Alan | Original post link

Got it, thanks.

| username: Alan | Original post link

Got it, thanks.

| username: Alan | Original post link

Thank you

| username: Kongdom | Original post link

The commercial version supports audit requirements, so you can suggest that the customer deploy the commercial version.

| username: zhanggame1 | Original post link

You should regularly back up the logs when enabling logging, right?

| username: 有猫万事足 | Original post link

If there was a BR backup before, you can consider using BR to restore the table where the DDL was executed to a new empty cluster and check the records inside. Version 6.1 should support BR. However, if you haven’t done a BR backup before, this method won’t be useful as a temporary solution.

| username: dba远航 | Original post link

Storing a large number of logs requires a lot of hard disk space, so it’s best to use backups to address this.

| username: Kongdom | Original post link

:yum: So it’s still important to develop the habit of regular backups~