[Product Research] Do you have application scenarios that require TiDB to support the JSON data format? New shoulder bag giveaway!

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 【产品调研】你有需要 TiDB 支持 JSON 数据格式的应用场景吗?新款挎包抽奖送!

| username: Billmay表妹

[Product Research for This Issue]

Do you have application scenarios that require TiDB to support JSON data format?

We look forward to you sharing application scenarios, current or planned, in which you need TiDB to support the JSON data format.


[What Problems Can JSON Solve]

For example: a common storage format for web applications, flexible data structures, and access to and analysis of semi-structured data.

[Participate in the Research]

You can describe the usage scenario like this:

  1. Application Scenario: (Industry) As a blockchain company, (Application Scenario) we need to collect various Web3 events, which have various flexible schema formats. Therefore, if we import all events into one table instead of creating thousands of small tables, JSON support can greatly improve development and management efficiency.

  2. Demand Assessment: Strong demand, can greatly improve efficiency.

  3. Technical Indicators: 10+ billion records, max record size < 1 KB; frequent queries and inserts, almost no updates needed.

  4. Additional Notes: The most commonly used function is json_extract.
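To make the example scenario above more concrete, here is a minimal sketch of keeping events with different shapes in one table and reading fields back with json_extract. The table name, column names, and event fields are hypothetical. SQLite (through Python's built-in sqlite3 module) is used only as a stand-in so the sketch runs without a database server, and it assumes the bundled SQLite has its JSON functions enabled. The `$.field` path style shown here is also what MySQL's JSON_EXTRACT uses, but behavior on TiDB should be checked against its documentation.

```python
import json
import sqlite3

# Hypothetical schema: one table for all events, with the payload kept as JSON.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

# Two events with different shapes (field names are made up for illustration).
events = [
    {"type": "transfer", "from": "0xabc", "to": "0xdef", "amount": 12},
    {"type": "mint", "token_id": 7, "owner": "0xabc"},
]
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(json.dumps(e),) for e in events],
)

# Read one field from every event, whatever its shape.
types = [
    row[0]
    for row in conn.execute(
        "SELECT json_extract(payload, '$.type') FROM events ORDER BY id"
    )
]

# Filter on a field that only some events have; rows without it are skipped
# because json_extract returns NULL for a missing path.
owned = conn.execute(
    "SELECT id, json_extract(payload, '$.type') FROM events "
    "WHERE json_extract(payload, '$.owner') = ?",
    ("0xabc",),
).fetchall()

print(types)
print(owned)
```

With a layout like this, a new event type only needs a new payload shape, not a new table, which is the efficiency gain the scenario describes.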

You can also reply below with any specific JSON functions you need supported.

Reference: Can TiDB support MongoDB-style indexing of array elements in JSON? - TiDB Q&A Community

If you particularly, particularly, particularly need support for the JSON data format, you can also add me on WeChat: billmay, and I can arrange for you to have an in-depth chat with the PM about the future of JSON support.

[Research Participation Rewards]

50 experience points and 50 forum points

[Lottery Rewards]

One lucky participant will receive a new shoulder bag.

[Follow-up Plan]

After the research, we will randomly select 3-5 participants with strong demand scenarios for 1v1 in-depth communication about product requirements. If there are iteration plans, they will be updated in this post as soon as possible.

| username: ShawnYan | Original post link

Scenario: User comments or microtalk

| username: 半瓶醋仙 | Original post link

Scenario: Fetching data in a way similar to a Python web crawler, for example collecting user subscriptions so that advertisements can be sent based on their preferences.
Demand intensity: It’s just like my XML (Little Donkey), not much difference.

| username: Kongdom | Original post link

Yes. For example, when we integrate with WeChat and Alipay, we often need to store JSON strings. We also generally use JSON for rule parsing, especially when the rules are dynamic.

| username: xfworld | Original post link

  • Save irregular configuration information

  • When the structure is uncertain, the best solution is JSON

| username: TiDBer_pFFcXLgY | Original post link

Scenario: In the financial industry, using JSON for data such as holdings would be much more convenient.
Demand intensity: It’s for critical business use, and the demand is quite high.

| username: Z六月星星 | Original post link

Yes, for storing raw data obtained from third-party interfaces.

| username: Hacker007 | Original post link

  1. JSON is a frequently encountered data format in big data, especially for web scraping and event tracking data. If TiDB supported the JSON format, parsing and extracting data would be very convenient (and many databases already support it).
  2. Because it is not currently supported, we do not use TiDB in these scenarios and rely on alternative solutions instead, so the demand is not strong.

| username: ddhe9527 | Original post link

I hope TiDB can be compatible with these JSON features of MySQL 5.7, as we use them a lot in production.
https://dev.mysql.com/doc/refman/5.7/en/json.html
https://dev.mysql.com/doc/refman/5.7/en/json-function-reference.html

| username: xiaohetao | Original post link

Yes, we use it for transforming and processing irregular data, such as enterprise survey data.

| username: neolithic | Original post link

It can be supported.

| username: Yves | Original post link

Currently, it is supported as an experimental feature - JSON Type | PingCAP Documentation Center

| username: wfxxh | Original post link

Yes, we need it.

| username: lxs_data | Original post link

As far as I understand, web crawlers, data returned by APIs, and configuration files all rely on the JSON data format. At present, I am practicing storing data from a Python web crawler in MySQL in JSON format. There are definitely application scenarios for the JSON data format, and the one I know best right now is web crawler data.

| username: TiDBer_CQ | Original post link

Supporting the JSON data format is a necessity. Currently, the required use cases include configuration information for various data sources and data returned by third-party interfaces, which need to use the JSON data format.

| username: Jiawei | Original post link

I think it’s necessary to support not only the JSON format but also some JSON processing functions, so that working with JSON is more convenient.

| username: Billmay表妹 | Original post link

Please elaborate in detail~ For example, which functions~

| username: 数据小黑 | Original post link

  1. Application Scenario: As a software development company serving fast-moving consumer goods (FMCG) sales, on the data analysis side we need to build various user profiles and analysis reports, similar to Taobao’s personalized recommendations. The data usually needs to be organized and stored in JSON format.
  2. Requirement Assessment: Moderate demand. JSON storage is more efficient, but without it relational tables can achieve the same goals.
  3. Technical Specifications: Over 100 million records, max record size < 512 KB; bulk inserts, frequent queries (mostly point queries), and a small number of updates.
  4. Additional Notes: Is it possible to support full-text scanning of JSON content?

| username: ealam_小羽 | Original post link

There are scenarios that require it!

I have several different types of data that need to be unified into a single business data table. This table describes a company’s data at various stages and traces its origins (such as company IPO application data, company IPO data, etc.).

Currently, to unify the data, the table has a single data-type column, and the data values are stored as JSON. :joy: