Why a PostgreSQL index only scan cannot return ctid

2024-04-02 19:55


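As we know, an index scan in PostgreSQL uses the ctid stored in each index entry to fetch the row from the table. And when every column a query needs is present in the index, PostgreSQL can use an index only scan instead.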
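The difference between the two access paths can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `toy_tid`, `toy_row`, `toy_index_entry`, `heap_fetch`, `index_only_value` and `ROWS_PER_BLOCK` are hypothetical names invented for illustration, not PostgreSQL structures, and the heap is modelled as a flat array with a fixed number of rows per block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a ctid: heap block number plus 1-based offset. */
typedef struct { unsigned block; unsigned offset; } toy_tid;

/* Hypothetical table row with an indexed column c1 and a non-indexed c2. */
typedef struct { int c1; int c2; } toy_row;

/* Hypothetical index entry: the indexed key plus the ctid of its row. */
typedef struct { int key; toy_tid tid; } toy_index_entry;

#define ROWS_PER_BLOCK 4   /* assumption made only for this toy heap layout */

/* Plain index scan: follow the ctid from the index entry into the heap. */
static const toy_row *heap_fetch(const toy_row *heap, size_t nrows, toy_tid tid)
{
    size_t idx;

    assert(tid.offset >= 1 && tid.offset <= ROWS_PER_BLOCK);
    idx = (size_t) tid.block * ROWS_PER_BLOCK + (tid.offset - 1);
    assert(idx < nrows);
    return &heap[idx];
}

/* Index only scan: the answer is taken from the index entry itself. */
static int index_only_value(const toy_index_entry *entry)
{
    return entry->key;
}
```

In this model a query that needs only `c1` can be answered by `index_only_value`, while a query that also needs `c2` has to call `heap_fetch` with the entry's `tid`.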

Given that, can a query that references ctid still use an index only scan?

In principle it should work. In Oracle, for example:

SQL> select rowid,id from t1 where id = 1;
---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |     1 |    25 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IDX_T1 |     1 |    25 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

The query includes rowid, yet the plan still has no TABLE ACCESS BY INDEX ROWID BATCHED step for going back to the table. PostgreSQL, however, does not seem to behave this way.

Index only scan:

bill=# explain analyze select c1 from t1 where c1 = 10;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Index Only Scan using idx_t1 on t1  (cost=0.29..10.74 rows=523 width=4) (actual time=0.021..0.117 rows=523 loops=1)
   Index Cond: (c1 = 10)
   Heap Fetches: 0
 Planning Time: 0.076 ms
 Execution Time: 0.196 ms
(5 rows)

With ctid added:

bill=# explain analyze select ctid,c1 from t1 where c1 = 10;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Scan using idx_t1 on t1  (cost=0.29..81.71 rows=523 width=10) (actual time=0.038..0.447 rows=523 loops=1)
   Index Cond: (c1 = 10)
 Planning Time: 0.098 ms
 Execution Time: 0.537 ms
(4 rows)

The index only scan is gone, replaced by an ordinary index scan.

Why is that? Every btree index entry necessarily contains a ctid, so why does referencing ctid rule out an index only scan?

I came across a similar question asked online.

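The answer given there attributes this to HOT. At first glance that sounds plausible, but on reflection it does not hold up: even with HOT, tuple visibility is still decided by consulting the visibility map (the vm file), so the ctid could be handled the same way, by checking its visibility through the visibility map. At the very least, when the table contains no invisible rows, an index only scan ought to be usable.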
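The role of the visibility map can be sketched as follows, and it also shows where a figure like `Heap Fetches: 0` comes from. This is a simplified illustration, not PostgreSQL's executor code: `toy_tid` and `count_heap_fetches` are hypothetical names, and the visibility map is modelled as one boolean per heap block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ctid: heap block number plus 1-based offset. */
typedef struct { unsigned block; unsigned offset; } toy_tid;

/*
 * Count how many matching index entries would still need a heap visit.
 * all_visible[b] is a toy visibility map: true means every tuple on heap
 * block b is known to be visible to all transactions, so an entry pointing
 * there can be returned straight from the index.
 */
static size_t count_heap_fetches(const toy_tid *tids, size_t ntids,
                                 const bool *all_visible, size_t nblocks)
{
    size_t fetches = 0;

    for (size_t i = 0; i < ntids; i++)
    {
        assert(tids[i].block < nblocks);
        if (!all_visible[tids[i].block])
            fetches++;          /* visibility unknown: must check the heap */
    }
    return fetches;
}
```

When every block is marked all-visible the function returns 0, which corresponds to the case argued above: nothing about visibility alone would force a trip to the heap.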

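The real reason is that the choice between an index scan and an index only scan is made by the optimizer during planning, before the visibility map is ever consulted. The check_index_only function decides whether an index only scan is possible: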

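/* Build the set of table columns this index can return in an index only scan. */
for (i = 0; i < index->ncolumns; i++)
{
	int			attno = index->indexkeys[i];

	/* attno 0 marks an index expression; those are skipped here. */
	if (attno == 0)
		continue;
	if (index->canreturn[i])
		index_canreturn_attrs =
			bms_add_member(index_canreturn_attrs,
						   attno - FirstLowInvalidHeapAttributeNumber);
	else
		index_cannotreturn_attrs =
			bms_add_member(index_cannotreturn_attrs,
						   attno - FirstLowInvalidHeapAttributeNumber);
}
/* Drop any column that some index column is unable to return. */
index_canreturn_attrs = bms_del_members(index_canreturn_attrs,
										index_cannotreturn_attrs);

/* Possible only if every column the query uses can be returned. */
result = bms_is_subset(attrs_used, index_canreturn_attrs);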

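A brief explanation of the logic: to decide whether an index only scan is possible, PostgreSQL puts the index's columns into a bitmap, index_canreturn_attrs, and the columns the query uses into another bitmap, attrs_used, and then checks whether attrs_used is a subset of index_canreturn_attrs. If it is, an index only scan can be used. The contents of index_canreturn_attrs come from the index's key column definitions (the information recorded in pg_index), which list only the indexed table columns and therefore never include ctid. Once a query references ctid, attrs_used contains a member that index_canreturn_attrs can never have, so the subset test fails.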
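The subset test can be reproduced in a small standalone sketch. It is not the PostgreSQL implementation: `attr_set`, `attr_set_add`, `attr_set_is_subset` and `index_only_possible` are hypothetical names, a single 32-bit word stands in for a Bitmapset, and the two constants are assumed to mirror SelfItemPointerAttributeNumber and FirstLowInvalidHeapAttributeNumber as defined in recent PostgreSQL releases (the exact values may differ between versions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, modelled on recent PostgreSQL releases. */
#define SELF_ITEM_POINTER_ATTNO  (-1)   /* attribute number of ctid */
#define FIRST_LOW_INVALID_ATTNO  (-7)   /* offset that keeps bit positions positive */

/* Toy stand-in for a Bitmapset: one 32-bit word is enough here. */
typedef uint32_t attr_set;

/* Add an attribute number, shifted so that system columns fit as well. */
static attr_set attr_set_add(attr_set set, int attno)
{
    int bit = attno - FIRST_LOW_INVALID_ATTNO;

    assert(bit > 0 && bit < 32);
    return set | (UINT32_C(1) << bit);
}

/* True when every member of a is also a member of b. */
static bool attr_set_is_subset(attr_set a, attr_set b)
{
    return (a & ~b) == 0;
}

/*
 * Same shape as the planner check: collect the table columns the index can
 * return, then test whether the columns used by the query are a subset.
 * index_keys holds table attribute numbers; 0 marks an expression column.
 */
static bool index_only_possible(const int *index_keys, const bool *can_return,
                                int nkeys, attr_set attrs_used)
{
    attr_set can = 0;
    attr_set cannot = 0;

    for (int i = 0; i < nkeys; i++)
    {
        if (index_keys[i] == 0)
            continue;
        if (can_return[i])
            can = attr_set_add(can, index_keys[i]);
        else
            cannot = attr_set_add(cannot, index_keys[i]);
    }
    can &= ~cannot;
    return attr_set_is_subset(attrs_used, can);
}
```

For an index on the first table column only, a query whose attrs_used is {c1} passes the check, while {ctid, c1} fails because the bit for attribute number -1 is never set on the index side. That matches the two plans shown earlier.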
