Postgresql 中文操作指南
64.5. Index Uniqueness Checks #
PostgreSQL 使用 unique indexes 强制执行 SQL 唯一性约束,该索引不允许有多个具有相同键的条目。支持此功能的访问方法将 amcanunique 设置为 true。(目前,只有 b-tree 支持此功能。)在实施唯一性时,不会考虑 INCLUDE 子句中列出的列。
PostgreSQL enforces SQL uniqueness constraints using unique indexes, which are indexes that disallow multiple entries with identical keys. An access method that supports this feature sets amcanunique true. (At present, only b-tree supports it.) Columns listed in the INCLUDE clause are not considered when enforcing uniqueness.
由于 MVCC,在索引中始终需要允许物理存在重复条目:这些条目可能引用单个逻辑行的连续版本。我们实际想要强制执行的行为是,没有任何 MVCC 快照可以包含两个具有相等索引键的行。当将新行插入到唯一索引中时,这会分解为必须检查的以下情况:
Because of MVCC, it is always necessary to allow duplicate entries to exist physically in an index: the entries might refer to successive versions of a single logical row. The behavior we actually want to enforce is that no MVCC snapshot could include two rows with equal index keys. This breaks down into the following cases that must be checked when inserting a new row into a unique index:
此外,在根据上述规则报告唯一性违规之前,访问方法必须重新检查要插入的行是否有效。如果已提交死点,则不应报告违规。(在当前事务刚创建的行插入的常规情况下,不会发生这种情况。不过,它可能在 CREATE UNIQUE INDEX CONCURRENTLY 期间发生。)
Furthermore, immediately before reporting a uniqueness violation according to the above rules, the access method must recheck the liveness of the row being inserted. If it is committed dead then no violation should be reported. (This case cannot occur during the ordinary scenario of inserting a row that’s just been created by the current transaction. It can happen during CREATE UNIQUE INDEX CONCURRENTLY, however.)
我们需要索引访问方法自行应用这些测试,这意味着它必须进入堆中才能检查根据索引内容显示为具有重复键的任何行的提交状态。这毫无疑问是丑陋且不可模块化的,但它节省了冗余的工作:如果我们进行了单独探测,那么在找到插入新行索引条目的位置时,对冲突行的索引查找基本上会被重复。更重要的是,在冲突检查不是新索引条目插入的组成部分时,没有明显的方法可以避免竞争条件。
We require the index access method to apply these tests itself, which means that it must reach into the heap to check the commit status of any row that is shown to have a duplicate key according to the index contents. This is without a doubt ugly and non-modular, but it saves redundant work: if we did a separate probe then the index lookup for a conflicting row would be essentially repeated while finding the place to insert the new row’s index entry. What’s more, there is no obvious way to avoid race conditions unless the conflict check is an integral part of insertion of the new index entry.
如果唯一约束是可以延迟的,那么还有附加的复杂性:我们需要能够为新行插入索引条目,但将任何唯一性违规错误延迟到语句结束甚至是更晚。为了避免对索引进行不必要的重复搜索,索引访问方法应在最初插入期间进行初步唯一性检查。如果这表明绝对没有冲突的实时元组,那么我们就完成了。否则,我们安排在强制执行约束时进行重新检查。如果在重新检查时,已插入的元组和另一个具有相同键的元组都处于活动状态,则必须报告错误。(请注意,出于此目的,“已激活” 实际上意味着“索引条目 HOT 链中的任何元组都已激活”。)为了实现这一点,将向 aminsert 函数传递一个 checkUnique 参数,其具有以下值之一:
If the unique constraint is deferrable, there is additional complexity: we need to be able to insert an index entry for a new row, but defer any uniqueness-violation error until end of statement or even later. To avoid unnecessary repeat searches of the index, the index access method should do a preliminary uniqueness check during the initial insertion. If this shows that there is definitely no conflicting live tuple, we are done. Otherwise, we schedule a recheck to occur when it is time to enforce the constraint. If, at the time of the recheck, both the inserted tuple and some other tuple with the same key are live, then the error must be reported. (Note that for this purpose, “live” actually means “any tuple in the index entry’s HOT chain is live”.) To implement this, the aminsert function is passed a checkUnique parameter having one of the following values: