No space left on device, DM cannot start

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: no space left on device,dm无法启动

| username: TiDBer_mGBoAnW9

The disk still has 1.5 TB free, and DM used to start normally.

| username: WalterWj | Original post link

Why is this a relative path…

| username: TiDBer_mGBoAnW9 | Original post link

A global search did not find the file either.

| username: WalterWj | Original post link

If there is no disk, of course you can't find it. As a fallback, I think it would be placed in the DM data directory. Do you have a disk?

| username: TiDBer_mGBoAnW9 | Original post link

There is a disk.

| username: WalterWj | Original post link

Maybe there was no disk at that time. Try restarting the task.

| username: TiDBer_mGBoAnW9 | Original post link

Restarted. It’s the same.

| username: 大飞哥online | Original post link

Could you upload the configuration file for us to take a look?

| username: TiDBer_mGBoAnW9 | Original post link

name: "shard_merge"
task-mode: all # Perform full data migration + incremental data migration
meta-schema: "dm_meta"
ignore-checking-items: ["auto_increment_ID"]

target-database:
  host: "192.168.0.1"
  port: 4000
  user: "root"
  password: ""

syncers:
  global:
    worker-count: 16
    batch: 100
    safe-mode: true
    compact: false
    multiple-rows: false

mysql-instances:
  - source-id: "instance-1"        # Data source ID, can be obtained from the data source configuration
    route-rules: ["store-route-rule", "sale-route-rule"] # Table route rules applied to this data source
    filter-rules: ["store-filter-rule", "sale-filter-rule"] # Binlog event filter rules applied to this data source
    block-allow-list: "log-bak-ignored" # Block & Allow Lists rules applied to this data source
  - source-id: "instance-2"
    route-rules: ["store-route-rule", "sale-route-rule"]
    filter-rules: ["store-filter-rule", "sale-filter-rule"]
    block-allow-list: "log-bak-ignored"

# Other common configurations shared by all instances

routes:
  store-route-rule:
    schema-pattern: "store_*"
    target-schema: "store"
  sale-route-rule:
    schema-pattern: "store_*"
    table-pattern: "sale_*"
    target-schema: "store"
    target-table: "sale"

filters:
  sale-filter-rule:
    schema-pattern: "store_*"
    table-pattern: "sale_*"
    events: ["truncate table", "drop table", "delete"]
    action: Ignore
  store-filter-rule:
    schema-pattern: "store_*"
    events: ["drop database"]
    action: Ignore

block-allow-list:
  log-bak-ignored:
    do-dbs: ["user", "store_*"]
    ignore-tables:
      - db-name: "user"
        tbl-name: "log_bak"

| username: tidb菜鸟一只 | Original post link

Execute df -h on your machine to check if other directories are full. Was it deployed to the default disk? This clearly indicates a lack of disk space…
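
A relative path is resolved against the working directory of the process that opens it, so the filesystem that matters is whichever one that path lands on, not necessarily the large data disk. Also, "no space left on device" can be returned when a filesystem runs out of inodes even though bytes remain. Nothing in this thread confirms that this is what happened here. The following is only a minimal Python sketch of checking both, and the function names and the 1 GiB threshold are made up for illustration.

```python
import os


def space_report(path="."):
    """Describe free bytes and free inodes on the filesystem that holds `path`."""
    st = os.statvfs(path)  # POSIX only
    return {
        "path": os.path.abspath(path),  # a relative path resolves against the current directory
        "free_bytes": st.f_bavail * st.f_frsize,  # space available to non-root users
        "total_bytes": st.f_blocks * st.f_frsize,
        "free_inodes": st.f_favail,
        "total_inodes": st.f_files,  # may be 0 on filesystems without a fixed inode table
    }


def exhausted_resources(report, min_free_bytes=1 << 30):
    """Return which resources look exhausted: 'bytes' and/or 'inodes'."""
    problems = []
    if report["free_bytes"] < min_free_bytes:
        problems.append("bytes")
    if report["total_inodes"] > 0 and report["free_inodes"] == 0:
        problems.append("inodes")
    return problems


if __name__ == "__main__":
    # "." is a placeholder; point it at the directory the failing process writes to.
    report = space_report(".")
    print(report, exhausted_resources(report))
```

Run it from the same working directory as the process that reports the error, since that is where a relative path would be resolved.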

| username: TiDBer_mGBoAnW9 | Original post link

None of them are full. I have just added the configuration above.

| username: 有猫万事足 | Original post link

This error is reported by the sync unit.
So it is the DM-worker running this task that has run out of space, not your DM cluster's control machine or the DM-master node.

tiup dm exec <dm-name> --command 'df -h'

Using the command above, you can see the disk usage of all machines in the cluster. Check which one has run out of space.
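
With many machines, reading the output by eye is error-prone. As a rough sketch, assuming you run the same command with 'df -P' instead of 'df -h' (POSIX output format, one filesystem per line) and save each host's output as text, a small Python helper can flag the full filesystems. The function name, the 90% threshold, and the sample data below are all made up for illustration.

```python
def full_mounts(df_output, threshold=90):
    """Return (mount_point, use_percent) for filesystems at or above `threshold` percent."""
    hits = []
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 5)  # the mount point is last and may contain spaces
        if len(fields) < 6:
            continue
        try:
            used_pct = int(fields[4].rstrip("%"))
        except ValueError:
            continue  # capacity column is not a number, e.g. "-"
        if used_pct >= threshold:
            hits.append((fields[5], used_pct))
    return hits


# Invented sample: the root filesystem is full while the data disk is nearly empty.
SAMPLE = """\
Filesystem     1024-blocks      Used  Available Capacity Mounted on
/dev/sda1         51474912  51474912          0     100% /
/dev/sdb1       1610612736  36700160 1573912576       3% /data
tmpfs              8133456         0    8133456       0% /dev/shm
"""

if __name__ == "__main__":
    print(full_mounts(SAMPLE))
```

A host can show plenty of free space on its data disk while another mount on the same machine is full, which would be consistent with the 1.5 TB reported earlier.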

| username: TiDB_C罗 | Original post link

Check on Prometheus to see which node is full.
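
If the cluster's Prometheus scrapes node_exporter (an assumption, not confirmed in this thread), the ratio node_filesystem_avail_bytes / node_filesystem_size_bytes shows free space per instance and mount point. Below is a hedged Python sketch that builds the instant-query URL for Prometheus's HTTP API and picks out nearly full filesystems from the decoded JSON response. The base URL, function names, and the 10% threshold are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Fraction of free space per filesystem, from node_exporter metrics.
QUERY = "node_filesystem_avail_bytes / node_filesystem_size_bytes"


def query_url(base_url, promql=QUERY):
    """Build the instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def low_space(response, min_free_ratio=0.10):
    """Return (instance, mountpoint, free_ratio) for series below `min_free_ratio`."""
    hits = []
    for series in response["data"]["result"]:
        ratio = float(series["value"][1])  # value is [timestamp, "number as string"]
        if ratio < min_free_ratio:
            labels = series["metric"]
            hits.append((labels.get("instance"), labels.get("mountpoint"), ratio))
    return hits


def fetch(base_url):
    """Run the query against a live Prometheus; base_url is a placeholder you supply."""
    with urllib.request.urlopen(query_url(base_url), timeout=10) as resp:
        return json.load(resp)
```

Calling low_space(fetch("http://<prometheus-host>:9090")) would list the instance and mount point of each filesystem with less than 10% free, where the host and port are whatever your deployment uses.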

| username: redgame | Original post link

The node is full.