Can TiDB be used as a database for storing large amounts of text, images, videos, and other data?

translator_bot · June 22, 2024, 9:15pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 需要存储大量文本，图片，视频等数据，可以选用TiDB作为数据库吗？

| username: TiDBer_0qkgbkPp

Can TiDB be chosen as a database for storing large amounts of text, images, videos, and other data?

| username: 我是咖啡哥

Never store images and videos in the database…

| username: 裤衩儿飞上天

TiDB itself is fine for this, but don’t put the image and video files directly into the database.

| username: 张雨齐0720

A distributed file system is probably the best choice, with the database storing only the file paths. That said, nothing stops a database from storing files. I once worked on a system where files were serialized and stored in the database, but those were small files such as ID photos.

| username: 啦啦啦啦啦

Files can be stored in it, but this is not a database’s strong suit.

| username: tidb菜鸟一只

We store the files on OSS (object storage) and save a link to each one in the database.

| username: TiDBer_0qkgbkPp

Thank you very much! I would like to ask another question: when a file is sent, how does a distributed database know which folder it should be stored in? How does a distributed database know what the file path is?

| username: 张雨齐0720

If you want to store files in the database, the application must first receive the files, and the application should have its own logic.

There are two scenarios:

First scenario:
Persisting to disk: The received file should have a fixed storage path. First, use sftp or other transfer tools to store the file locally, then notify the program. The application reads the file, serializes it, and stores it in the database.

Second scenario:
Not persisting to disk: After the application receives the file transfer request, it directly stores the received file stream into the database.

Of course, neither approach is recommended. It is better to put the files on shared or distributed storage such as NFS or OSS and store only the file path in the database.
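For readers who still want to see what the two scenarios look like, here is a minimal sketch. It is only an illustration under assumptions: `sqlite3` stands in for TiDB so the code runs anywhere (with TiDB you would connect through a MySQL-compatible driver instead), and the table name `file_blobs`, the function names, and the `MAX_BYTES` cap are all hypothetical.

```python
import io
import sqlite3
from pathlib import Path
from typing import BinaryIO

# sqlite3 is a stand-in for TiDB; the table and column names are hypothetical.
SCHEMA = """
CREATE TABLE IF NOT EXISTS file_blobs (
    id INTEGER PRIMARY KEY,
    file_name TEXT NOT NULL,
    content BLOB NOT NULL
)
"""

# Hypothetical cap: only small files (such as ID photos) belong in a row.
MAX_BYTES = 1024 * 1024


def store_from_stream(conn: sqlite3.Connection, file_name: str, stream: BinaryIO) -> int:
    """Scenario 2: write the received stream into the database without touching disk."""
    data = stream.read(MAX_BYTES + 1)  # read one extra byte to detect oversized input
    if len(data) > MAX_BYTES:
        raise ValueError(f"{file_name} is larger than {MAX_BYTES} bytes")
    cur = conn.execute(
        "INSERT INTO file_blobs (file_name, content) VALUES (?, ?)",
        (file_name, data),
    )
    conn.commit()
    return cur.lastrowid


def store_from_disk(conn: sqlite3.Connection, path: Path) -> int:
    """Scenario 1: the file already landed at a fixed path; read it and store it."""
    with path.open("rb") as fh:
        return store_from_stream(conn, path.name, fh)


def load(conn: sqlite3.Connection, file_id: int) -> bytes:
    """Read the stored bytes back by row id."""
    row = conn.execute(
        "SELECT content FROM file_blobs WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    return row[0]
```

Usage would be something like `conn = sqlite3.connect(":memory:")`, `conn.execute(SCHEMA)`, then `store_from_stream(conn, "id_photo.jpg", io.BytesIO(photo_bytes))`. The size cap reflects the point made earlier in the thread that this only worked for small files.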

| username: ti-tiger

Do we need to first figure out how this file was uploaded, whether it was the front-end program or something else?

| username: TiDBer_0qkgbkPp

Thank you for your reply! After reading everyone’s answers, I decided not to store the data directly in the database. What confuses me now is how to store the file data obtained from a server onto the disk (do I need to write an application myself, specify a fixed storage path, and receive the file?). Then, how does the distributed database get this file path?

| username: TiDBer_0qkgbkPp

In other words, I want to take files from one server and store them on several idle servers, using a distributed database to manage the data.

| username: 张雨齐0720

NFS/OSS shared storage is the simplest way to handle this. If there is no shared storage, then the only other option is sftp. You need to write your own sftp implementation class, have the remote server open port 22, and provide you with a username and password. You can then call the implementation class to read and write data on the remote server.
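As a rough idea of what such an sftp implementation class could look like, here is a sketch that assumes the third-party `paramiko` library is installed. The class name `SftpStore`, the helper `remote_path_for`, and the host and directory values in the usage note are hypothetical, and error handling is kept to a minimum.

```python
import posixpath


def remote_path_for(remote_dir: str, file_name: str) -> str:
    """Build the remote path from the base name only, so '../x' cannot leave remote_dir."""
    base = posixpath.basename(file_name)
    if not base:
        raise ValueError(f"no file name in {file_name!r}")
    return posixpath.join(remote_dir, base)


class SftpStore:
    """Hypothetical wrapper that reads and writes files on a remote server over SFTP."""

    def __init__(self, host: str, username: str, password: str,
                 remote_dir: str, port: int = 22):
        self.host = host
        self.username = username
        self.password = password
        self.remote_dir = remote_dir
        self.port = port

    def _connect(self):
        # Imported lazily so the module loads even where paramiko is not installed.
        import paramiko

        client = paramiko.SSHClient()
        # Trust only hosts already in known_hosts; unknown hosts are rejected.
        client.load_system_host_keys()
        client.connect(self.host, port=self.port,
                       username=self.username, password=self.password)
        return client

    def upload(self, local_path: str, file_name: str) -> str:
        """Copy a local file to the remote directory and return its remote path."""
        remote_path = remote_path_for(self.remote_dir, file_name)
        client = self._connect()
        try:
            sftp = client.open_sftp()
            try:
                sftp.put(local_path, remote_path)
            finally:
                sftp.close()
        finally:
            client.close()
        return remote_path

    def download(self, remote_path: str, local_path: str) -> None:
        """Copy a remote file back to the local machine."""
        client = self._connect()
        try:
            sftp = client.open_sftp()
            try:
                sftp.get(remote_path, local_path)
            finally:
                sftp.close()
        finally:
            client.close()
```

A call such as `SftpStore("files.example.internal", "user", "secret", "/data/uploads").upload("/tmp/photo.jpg", "photo.jpg")` returns the remote path, and that returned path is what the application would then write into the database.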

| username: ti-tiger

I think what’s missing here is a program, and this program should be responsible for the “take” action you mentioned. Either you write it yourself, or you look for third-party software.
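A minimal sketch of such a program might look like the following. The key point is that the database does not discover the path on its own: the application chooses where the file goes and writes that path into a table. Everything here is an assumption for illustration: `sqlite3` stands in for TiDB (which you would reach through a MySQL-compatible driver), `storage_root` stands in for a mounted NFS or shared directory, and the `files` table and function names are hypothetical.

```python
import hashlib
import sqlite3
from pathlib import Path

# sqlite3 is a stand-in for TiDB; the table and column names are hypothetical.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    id INTEGER PRIMARY KEY,
    original_name TEXT NOT NULL,
    storage_path TEXT NOT NULL,
    size_bytes INTEGER NOT NULL,
    sha256 TEXT NOT NULL
)
"""


def ingest(conn: sqlite3.Connection, storage_root: Path,
           original_name: str, data: bytes) -> int:
    """Write the file under storage_root and record where it went."""
    digest = hashlib.sha256(data).hexdigest()
    # The application decides the path: a two-character fan-out directory
    # plus the content hash, keeping the original file extension.
    relative = Path(digest[:2]) / f"{digest}{Path(original_name).suffix}"
    target = storage_root / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)

    # Only metadata and the relative path go into the database.
    cur = conn.execute(
        "INSERT INTO files (original_name, storage_path, size_bytes, sha256) "
        "VALUES (?, ?, ?, ?)",
        (original_name, relative.as_posix(), len(data), digest),
    )
    conn.commit()
    return cur.lastrowid


def read_back(conn: sqlite3.Connection, storage_root: Path, file_id: int) -> bytes:
    """Look up the stored path by id, then read the file from storage."""
    row = conn.execute(
        "SELECT storage_path FROM files WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    return (storage_root / row[0]).read_bytes()
```

Storing the path relative to `storage_root` is a design choice in this sketch: if the share is later mounted somewhere else, the rows in the database stay valid.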

| username: huanglao2002

Both object storage and NAS are well suited to this. The database is then used to index the files.

| username: 会飞的土拨鼠

You can store the addresses of images and videos in the database instead of storing them directly in the database.

| username: ohammer

In the database, we generally store the file paths. In our environment, images and videos are stored in object storage. There is a URL field in the table.
