You can search for the documentation on Baidu. Columnar storage does not need indexes because it is queried mainly for column-wise statistical analysis. Such queries usually scan large amounts of data, so filtering through an index is rarely needed.
There is no documentation, but there is source code that you can directly look at…
The official blog has some posts you can refer to:
The biggest difference between columnar storage and row storage is the execution model used to read data. Row storage uses the volcano model (an iterator that hands back one row at a time), while columnar storage uses a vectorized engine that processes batches of column values.
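To make the volcano model versus vector engine contrast concrete, here is a minimal sketch in Python. It is only an illustration of the two styles, not how any particular database implements them; the data, function names, and batch size are all made up for the example.

```python
# Hypothetical (id, amount) rows; not taken from any real system.
ROWS = [(1, 10), (2, 20), (3, 30), (4, 40)]


def scan_rows(rows):
    """Volcano-style scan operator: hands back one row per request."""
    for row in rows:
        yield row


def filter_rows(child, min_amount):
    """Volcano-style filter operator: pulls one row at a time from its child."""
    for row in child:
        if row[1] >= min_amount:
            yield row


def volcano_sum(rows, min_amount):
    """Sum the amount column by pulling rows one by one through the operators."""
    total = 0
    for _id, amount in filter_rows(scan_rows(rows), min_amount):
        total += amount
    return total


def scan_batches(column, batch_size):
    """Vectorized-style scan: hands back a batch of values from one column."""
    for start in range(0, len(column), batch_size):
        yield column[start:start + batch_size]


def vectorized_sum(amounts, min_amount, batch_size=2):
    """Sum a single column, filtering and adding a whole batch per step."""
    total = 0
    for batch in scan_batches(amounts, batch_size):
        total += sum(v for v in batch if v >= min_amount)
    return total


amounts = [amount for _id, amount in ROWS]
print(volcano_sum(ROWS, 20), vectorized_sum(amounts, 20))
```

Both functions return the same answer. The difference is the unit of work: the volcano version moves a full row between operators on every step, while the vectorized version only touches the one column it needs and handles many values per step.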
I casually drew something to help with understanding.
For a row-based table, a lookup by rowid locates the row directly and then returns the required columns. A lookup through an index first uses the indexed field to find the rowid, then goes back to the table to fetch the row, and then returns the required columns. If only the indexed column is needed, there is no need to go back to the table.
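The row-store lookup paths described above can be sketched roughly as follows. This is a toy model built on plain dictionaries, assuming a single secondary index on a `name` column; the table contents and every name in it are hypothetical.

```python
# Hypothetical row store: rowid -> full row.
TABLE = {
    101: {"name": "alice", "city": "Beijing", "amount": 10},
    102: {"name": "bob", "city": "Shanghai", "amount": 20},
    103: {"name": "carol", "city": "Beijing", "amount": 30},
}

# Hypothetical secondary index on "name": value -> list of rowids.
NAME_INDEX = {"alice": [101], "bob": [102], "carol": [103]}

# Records every rowid fetched from the table, to show when a table read happens.
table_reads = []


def fetch_row(rowid, columns):
    """Lookup by rowid: locate the row directly, return only the asked columns."""
    table_reads.append(rowid)
    row = TABLE[rowid]
    return {col: row[col] for col in columns}


def lookup_by_name(name, columns):
    """Lookup through the index: find rowids first, then go back to the table."""
    rowids = NAME_INDEX.get(name, [])
    if set(columns) <= {"name"}:
        # Only the indexed column is needed, so the table is never touched.
        return [{"name": name} for _ in rowids]
    return [fetch_row(rowid, columns) for rowid in rowids]


print(lookup_by_name("bob", ["city"]))  # index, then back to the table
print(lookup_by_name("bob", ["name"]))  # answered from the index alone
```

The `table_reads` list grows only on the first call, which is the "no need to go back to the table" case in miniature.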
Column-based retrieval usually summarizes one or a few columns, so it reads just those columns and computes the summary directly. For a single-value query, it uses the rowkey to fetch the matching value from the relevant column.
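A comparable sketch for the column-store side, again only as an illustration. It assumes each column is kept as its own list and that a rowkey maps to a position shared by all columns; real column stores use more elaborate layouts, and the names and data here are hypothetical.

```python
# Hypothetical column store: each column is stored as a separate list.
# Values at the same position belong to the same logical row.
COLUMNS = {
    "rowkey": [101, 102, 103],
    "city": ["Beijing", "Shanghai", "Beijing"],
    "amount": [10, 20, 30],
}


def column_sum(column):
    """Summary query: read only the one column being aggregated."""
    return sum(COLUMNS[column])


def point_lookup(rowkey, column):
    """Single-value query: find the position for the rowkey, then read the
    value at that position in the requested column."""
    position = COLUMNS["rowkey"].index(rowkey)
    return COLUMNS[column][position]


print(column_sum("amount"))
print(point_lookup(102, "city"))
```

Note that `column_sum` never reads the `city` values at all, which is why summarizing a few columns over many rows suits this layout.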
You can think of column storage as a row-based table in which every field has its own index. Each field in column storage is effectively an index already, so there is no need to create separate indexes.