Tang Liu: Reflections on Product Quality - My Basic Understanding

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 唐刘:关于产品质量的思考 - 我的基本认知

| username: 社区小助手

The article “TiDB in 2023 - A Simple Review” mentions a problem I have always faced: how can I confidently tell customers that each new TiDB release is of good quality and that they can use it with peace of mind?

Frankly, this question is not easy to answer. I plan to share my thoughts on product quality through a series of articles, and this is the first one, mainly discussing my basic understanding of quality.

Note that these are my personal views and are not necessarily correct. I will keep learning and updating my understanding of quality. Also, the products I discuss are mainly infrastructure software such as TiDB, so these views may not apply to other kinds of products.

High-Quality Products Are Made Through Use

“High-quality products are made through use.” This sentence actually has a second half; in full it reads, “High-quality products are made through use, not just through testing.” This is a conviction I have formed over recent years. Opening with this viewpoint might make people think I am making excuses for not being able to release high-quality products directly, or it might make users anxious, as if they were guinea pigs for testing the product.

Why do I have this understanding? You can look at the causal loop diagram below:

Regarding the causal loop diagram, I will write an article to introduce it later. You can also check out Wiki - Causal loop diagram first.

The above is a reinforcing loop. Starting from the point “Quality Product,” the overall loop logic is as follows:

When we have a high-quality product, we will gain more customers. Customers can be acquired through sales efforts, customer self-recommendation, brand promotion, and so on, which we will not discuss in detail here.

When we have more customers, the product will face more business scenarios and handle more loads.

More business scenarios and loads are more likely to push the product to the limits of its capabilities, allowing us to discover more bugs and defects.

We will then put more effort into fixing these quality issues, thereby improving product quality. The improvement in product quality will further attract more customers.

In the loop diagram above, I started the discussion from the point “Quality Product,” which means we first need a product of decent quality. If our product quality is poor, the same reinforcing loop works against us: we lose more customers and can no longer keep refining the product.
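The loop described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the function name `simulate_loop`, the quality score in [0, 1], the threshold of 0.5, and every constant are hypothetical values chosen for the example, not data from TiDB or any real product.

```python
# Toy model of the reinforcing loop; every constant below is a hypothetical
# illustration value, not a measurement from TiDB or any real product.

def simulate_loop(initial_quality, rounds=10, threshold=0.5):
    """Return the (quality, customers) history of a simple reinforcing loop.

    quality is a score in [0, 1]; customers is an arbitrary count.
    """
    quality = initial_quality
    customers = 100.0
    history = [(quality, customers)]
    for _ in range(rounds):
        # Quality above the threshold attracts customers; below it, they leave.
        customers = max(customers * (1.0 + (quality - threshold)), 0.0)
        # More customers mean more scenarios and load, which expose more bugs.
        bugs_found = 0.01 * customers
        # Fixing those bugs closes part of the remaining gap to perfect quality.
        fix_ratio = bugs_found / (bugs_found + 10.0)
        quality = quality + (1.0 - quality) * fix_ratio * 0.5
        history.append((quality, customers))
    return history


if __name__ == "__main__":
    for start in (0.7, 0.2):
        final_quality, final_customers = simulate_loop(start)[-1]
        print(f"start={start}: quality={final_quality:.2f}, "
              f"customers={final_customers:.0f}")
```

In this sketch, a product that starts above the threshold gains customers and quality round after round, while one that starts well below it keeps losing customers and barely improves, because fewer customers expose fewer bugs to fix.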

So, how do we release a product of decent quality? Testing is one important, indeed crucial, part of the answer. But we need to be clear that testing is only an abstraction of our customers’ business systems, built on our own understanding of what our product can do. That understanding cannot cover every real-world situation, so we also need to refine the product in a variety of real-world scenarios to keep improving its quality.

As for the “guinea pig” concern mentioned above, I believe this applies not only to TiDB but to most software products on the market. We usually first find some sample customers to refine the product with, and only then promote it to more customers. Use by more customers helps us discover more problems, which in turn lets us keep improving the product. This matches the causal loop diagram above.

More Features, More Bugs

Building on the previous causal loop, we can add another reinforcing loop. When we handle more customer scenarios and loads, customers will ask more of us, that is, they will request more features. We then need to invest in developing new features. The more new features we develop, the more dimensions our product is competitive in, which naturally attracts more customers.

This outer reinforcing loop looks very promising, but there are two points that need special attention:

The development of new features, from initial design to final release, to customers actually starting to use these new features, takes a long time, usually measured in quarters. Therefore, the improvement in product competitiveness will have a certain delay. This is why I added a delay mark between new features and product competitiveness. Although there are delays in other aspects as well, I want to emphasize this point here.

More importantly, R&D bandwidth is a physical constraint; the company cannot increase R&D resources indefinitely. When we invest more R&D bandwidth in developing new features, it naturally squeezes the bandwidth available for quality improvement. Both the bugs introduced by new features and the backlog of unresolved bugs then reduce product quality.

Note: I drew a negative feedback connection between Feature Development Efforts and Quality Improvement Efforts above. Although this forms a balancing loop, a balancing loop needs a target to converge to, so the diagram is not perfect. Let’s leave it as it is for now.

Here comes my second understanding, “More features, more bugs.”

This is a lesson I learned firsthand. In previous TiDB versions, we sometimes went too far and developed too many features, and the more features a version had, the less stable it was at first. So we spent a lot of effort improving quality. Note that there is another balancing loop here: when we invest more resources in quality improvement, it naturally slows the development of new features, and fewer new features hurt the product’s later competitiveness. This is why, starting from version 7.5, we have been trying to balance competitiveness and quality while controlling the number of new features.
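The trade-off between feature work and quality work under a fixed R&D budget can also be sketched in a few lines. Again, this is a toy model under assumptions: `open_bugs_after`, the capacity of 100 units, and the two rates are hypothetical illustration values, not figures from any TiDB release.

```python
# Toy model of a fixed R&D budget split between features and bug fixing.
# All rates are hypothetical illustration values.

def open_bugs_after(releases, feature_share, capacity=100.0,
                    bugs_per_feature_unit=0.5, fixes_per_quality_unit=0.4):
    """Return (features_shipped, open_bugs) after a number of releases."""
    if not 0.0 <= feature_share <= 1.0:
        raise ValueError("feature_share must be between 0 and 1")
    features_shipped = 0.0
    open_bugs = 0.0
    for _ in range(releases):
        feature_effort = capacity * feature_share
        quality_effort = capacity - feature_effort  # the rest of the fixed budget
        features_shipped += feature_effort
        # Every code change can introduce bugs, so new features add to the backlog.
        open_bugs += feature_effort * bugs_per_feature_unit
        # Quality work can only remove bugs that actually exist.
        open_bugs -= min(open_bugs, quality_effort * fixes_per_quality_unit)
    return features_shipped, open_bugs


if __name__ == "__main__":
    for share in (0.8, 0.4):
        features, bugs = open_bugs_after(releases=4, feature_share=share)
        print(f"feature_share={share}: features={features:.0f}, open_bugs={bugs:.0f}")
```

With these made-up rates, the feature-heavy split ships more features but lets the bug backlog grow every release, while the more balanced split ships fewer features and keeps the backlog cleared. Where the right balance lies depends on the real rates, which this sketch does not claim to know.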

Another reality we need to face is that any feature development, and even any bug fix, involves code changes. In an extremely complex product, any code change can introduce new bugs. I believe no developer wants to write buggy code, but whether bugs appear is not something developers can simply will away.

Of course, there are many ways to improve the quality of new feature development and the speed of bug fixes, which I will explain in detail in later articles. What I want to emphasize is that the points above reflect how I see reality. Only by accepting this can we make better trade-offs and create competitive, high-quality products.

Summary

When I wrote down the points above, I was quite surprised. I admit that ten years ago I would definitely not have seen things this way. Of course, I do not assume my understanding is correct, and it may well change soon. I will update the articles accordingly.

Writing this reminds me of a joke about programming languages, attributed to Bjarne Stroustrup, the creator of C++: “There are only two kinds of languages in the world: those people complain about and those nobody uses.” The same goes for products. A good product must have a large number of users, and the more it is used, the more complaints it will naturally receive. The end result, of course, is that it gets better and better. This may be an inevitable part of a product’s growth.

Updates

Interestingly, on the day I published this article, 2024-02-24, there was also a post discussing software quality on the front page of Hacker News: How to think about software quality (2022) | Hacker News. It seems that not only I but probably developers worldwide are troubled by this issue.

I also found a very early post about MySQL quality, and some of its views coincide with mine: 7 Reasons why MySQL Quality will never be the same

| username: TiDBer_jYQINSnf

Awesome! Now that we have the guidelines, are there any articles about the specific methods PingCAP uses internally to ensure the quality of a version?

| username: 望海崖2084

It reminds me of when I had just graduated and worked in quality management at a car factory, before I switched careers. My conclusion then was that good quality is designed. In my later work, I increasingly felt that balancing and finding the middle ground is also an important part of achieving success.

| username: redgame

The article provides in-depth reflections on the quality of software products and the development process, demonstrating an understanding of the complex relationship between product quality and feature development.

| username: WinterLiu

You’re thoughtful.

| username: TiDBer_aaO4sU46

The article is long, but I finished reading it and gained a lot.

| username: TiDBer_5cwU0ltE

Product quality is indeed a topic that accompanies the product lifecycle. After reading your thoughts, I also have some ideas: the completeness of product documentation is sometimes very important. If the official documentation is done well, it can serve as a great promotion. Additionally, with the support of related forums and other feedback channels, the product quality will get better and better. I feel that TiDB does a good job in this regard.

| username: 清风明月

Gained a lot

| username: Aionn

Learned a lot, very beneficial.

| username: TiDBer_iLonNMYE

I have read it. There is another path to improving product quality, which is innovation proposed from a theoretical perspective or published in the form of papers and then validated in practice. The prosperous development of databases from network to relational to NoSQL to distributed over the past fifty years has been inseparable from continuous theoretical advancements. The progress of theory is also driven by real-world practical problems. They coexist and depend on each other, continuously spiraling upwards.

| username: wailon

Good quality is definitely designed.
There is no absolute definition of quality, but PDCA (Plan-Do-Check-Act) is a cycle: it is not a one-time activity or a single function, and it does not necessarily apply to the whole.

| username: 小于同学

I have learned a lot and benefited greatly.

| username: TiDBer_rvITcue9

Learned.

| username: itfarmer

:+1:

| username: 纯白镇的小智

The increase in functional requirements will inevitably lead to more bugs. Quality assurance is a key area, and my deepest impression of TiDB is that this database performs quite well in terms of quality.