Hibernate ORM 中文操作指南
7. Tuning and performance
一旦你有一个正在使用 Hibernate 访问数据库的程序,你肯定会发现在某些地方性能令人失望或无法接受。
Once you have a program up and running using Hibernate to access the database, it’s inevitable that you’ll find places where performance is disappointing or unacceptable.
幸运的是,只要记住一些简单的原则,大多数性能问题都可以通过 Hibernate 为你提供的工具轻松解决。
Fortunately, most performance problems are relatively easy to solve with the tools that Hibernate makes available to you, as long as you keep a couple of simple principles in mind.
首先也是最重要的:你使用 Hibernate 的原因是它让事情变得更容易。如果对于某个问题,它反而让事情变得更难,那就停止使用它,改用别的工具来解决这个问题。
First and most important: the reason you’re using Hibernate is that it makes things easier. If, for a certain problem, it’s making things harder, stop using it. Solve this problem with a different tool instead.
第二:在使用 Hibernate 的程序中,有两个主要的潜在性能瓶颈来源:
Second: there are two main potential sources of performance bottlenecks in a program that uses Hibernate:
-
too many round trips to the database, and
-
memory consumption associated with the first-level (session) cache.
因此,性能调优主要涉及减少对数据库的访问次数,和/或控制会话缓存的大小。
So performance tuning primarily involves reducing the number of accesses to the database, and/or controlling the size of the session cache.
但在我们讨论这些更高级的话题之前,我们应该先调整连接池。
But before we get to those more advanced topics, we should start by tuning the connection pool.
7.1. Tuning the connection pool
Hibernate 中内置的连接池适合测试,但不适用于生产环境。相反,Hibernate 支持一系列不同的连接池,包括我们最喜欢的 Agroal。
The connection pool built in to Hibernate is suitable for testing, but isn’t intended for use in production. Instead, Hibernate supports a range of different connection pools, including our favorite, Agroal.
要选择并配置 Agroal,除了我们已经在 Basic configuration settings 中看到的设置之外,您还需要设置一些额外的配置属性。前缀为 hibernate.agroal 的属性会被传递给 Agroal:
To select and configure Agroal, you’ll need to set some extra configuration properties, in addition to the settings we already saw in Basic configuration settings. Properties with the prefix hibernate.agroal are passed through to Agroal:
# configure Agroal connection pool
hibernate.agroal.maxSize 20
hibernate.agroal.minSize 10
hibernate.agroal.acquisitionTimeout PT1s
hibernate.agroal.reapTimeout PT10s
只要你至少设置了一个前缀为 hibernate.agroal 的属性,就会自动选择 AgroalConnectionProvider。有很多可供选择:
As long as you set at least one property with the prefix hibernate.agroal, the AgroalConnectionProvider will be selected automatically. There are many to choose from:
表格 43. 配置 Agroal 的设置
Table 43. Settings for configuring Agroal
Configuration property name |
Purpose |
hibernate.agroal.maxSize |
The maximum number of connections present on the pool |
hibernate.agroal.minSize |
The minimum number of connections present on the pool |
hibernate.agroal.initialSize |
The number of connections added to the pool when it is started |
hibernate.agroal.maxLifetime |
The maximum amount of time a connection can live, after which it is removed from the pool |
hibernate.agroal.acquisitionTimeout |
The maximum amount of time a thread can wait for a connection, after which an exception is thrown instead |
hibernate.agroal.reapTimeout |
The duration for eviction of idle connections |
hibernate.agroal.leakTimeout |
The duration of time a connection can be held without causing a leak to be reported |
hibernate.agroal.idleValidationTimeout |
A foreground validation is executed if a connection has been idle on the pool for longer than this duration |
hibernate.agroal.validationTimeout |
The interval between background validation checks |
hibernate.agroal.initialSql |
A SQL command to be executed when a connection is created |
以下设置对 Hibernate 支持的所有连接池通用:
The following settings are common to all connection pools supported by Hibernate:
表格 44. 连接池的通用设置
Table 44. Common settings for connection pools
hibernate.connection.autocommit |
The default autocommit mode |
hibernate.connection.isolation |
The default transaction isolation level |
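例如,这两个通用设置可以与连接池属性写在一起(以下取值仅为示意,并非推荐值):

For example, these two common settings may be specified alongside the pool properties (the values below are illustrative only, not recommendations):

```properties
# illustrative values only; isolation accepts a level name such as READ_COMMITTED
hibernate.connection.autocommit false
hibernate.connection.isolation READ_COMMITTED
```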
7.2. Enabling statement batching
一种轻易提高某些事务性能的方法,几乎不需要任何工作,就是启用自动 DML 语句批处理。批处理仅有助于程序在单个事务中对同一张表执行多次插入、更新或删除的情况。
An easy way to improve performance of some transactions, with almost no work at all, is to turn on automatic DML statement batching. Batching only helps in cases where a program executes many inserts, updates, or deletes against the same table in a single transaction.
我们只需要设置一个属性:
All we need to do is set a single property:
表 45. 启用 JDBC 批量处理
Table 45. Enabling JDBC batching
Configuration property name |
Purpose |
Alternative |
hibernate.jdbc.batch_size |
Maximum batch size for SQL statement batching |
setJdbcBatchSize() |
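作为示意,启用批处理后,同一事务中的多次 persist 可以合并为少量 JDBC 批次。以下代码假设存在一个已映射的 Book 实体和一个已配置的 SessionFactory(均未在此给出):

As a sketch (assuming a mapped Book entity and a configured SessionFactory, neither shown here), with batching enabled the inserts below may be grouped into JDBC batches:

```java
// Sketch only: assumes a mapped Book entity and a configured SessionFactory.
// The batch size may also be set globally with hibernate.jdbc.batch_size.
sessionFactory.inTransaction(session -> {
    session.setJdbcBatchSize(20);   // batch up to 20 statements at a time
    for (Book book : newBooks) {
        session.persist(book);      // inserts are flushed in batches
    }
});
```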
比 DML 语句批处理更好的方法是使用 HQL update 或 delete 查询,甚至是调用存储过程的原生 SQL! |
Even better than DML statement batching is the use of HQL update or delete queries, or even native SQL that calls a stored procedure! |
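例如,一条 HQL 更新查询可以在一次数据库往返中修改多行(示意代码;基于上文示例中 Book 的 price 和 title 属性):

For example, a single HQL update query can modify many rows in one round trip (a sketch, based on the price and title properties of Book seen in the examples above):

```java
// Sketch only: one mutation query instead of N individual updates.
sessionFactory.inTransaction(session -> {
    int rows = session.createMutationQuery(
                "update Book set price = price * 0.9 where title like :match")
            .setParameter("match", "%Hibernate%")
            .executeUpdate();
});
```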
7.3. Association fetching
在 ORM 中实现高性能意味着尽量减少与数据库的往返次数。无论在何时使用 Hibernate 编写数据访问代码时,这个目标都应该放在首位。ORM 中最基本的经验法则就是:
Achieving high performance in ORM means minimizing the number of round trips to the database. This goal should be uppermost in your mind whenever you’re writing data access code with Hibernate. The most fundamental rule of thumb in ORM is:
-
explicitly specify all the data you’re going to need right at the start of a session/transaction, and fetch it immediately in one or two queries,
-
and only then start navigating associations between persistent entities.

毫无疑问,Java 程序中数据访问代码性能低下的最常见原因就是 N+1 selects 问题。此处,初始查询先从数据库中检索出 N 行的列表,然后再用 N 条后续查询去获取相关实体的关联实例。
Without question, the most common cause of poorly-performing data access code in Java programs is the problem of N+1 selects. Here, a list of N rows is retrieved from the database in an initial query, and then associated instances of a related entity are fetched using N subsequent queries.
这不是 Hibernate 的缺陷或限制;这个问题甚至影响了在 DAO 背后的典型的手写 JDBC 代码。只有你能作为开发者解决这个问题,因为只有你知道在给定的工作单元中你事先需要什么数据。但这没关系。Hibernate 会为你提供你需要的所有工具。
This isn’t a bug or limitation of Hibernate; this problem even affects typical handwritten JDBC code behind DAOs. Only you, the developer, can solve this problem, because only you know ahead of time what data you’re going to need in a given unit of work. But that’s OK. Hibernate gives you all the tools you need.
在这一部分中,我们将讨论避免与数据库进行此类“多嘴”交互的不同方式。
In this section we’re going to discuss different ways to avoid such "chatty" interaction with the database.
Hibernate 提供了多种策略,用于高效获取关联并避免 N+1 选择:
Hibernate provides several strategies for efficiently fetching associations and avoiding N+1 selects:
-
outer join fetching—where an association is fetched using a left outer join,
-
batch fetching—where an association is fetched using a subsequent select with a batch of primary keys, and
-
subselect fetching—where an association is fetched using a subsequent select with keys re-queried in a subselect.
在这些策略中,你几乎始终应该使用外连接获取。但我们先来考虑一下其他方法。
Of these, you should almost always use outer join fetching. But let’s consider the alternatives first.
7.4. Batch fetching and subselect fetching
考虑以下代码:
Consider the following code:
List<Book> books =
session.createSelectionQuery("from Book order by isbn", Book.class)
.getResultList();
books.forEach(book -> book.getAuthors().forEach(author -> out.println(book.title + " by " + author.name)));
这段代码效率非常低下,默认情况下会导致执行 N+1 条 select 语句,其中 N 是 Book 的数量。
This code is very inefficient, resulting, by default, in the execution of N+1 select statements, where N is the number of Books.
让我们看看如何改进。
Let’s see how we can improve on that.
SQL for batch fetching
启用批量提取后,Hibernate 可能会在 PostgreSQL 上执行以下 SQL:
With batch fetching enabled, Hibernate might execute the following SQL on PostgreSQL:
/* initial query for Books */
select b1_0.isbn,b1_0.price,b1_0.published,b1_0.publisher_id,b1_0.title
from Book b1_0
order by b1_0.isbn
/* first batch of associated Authors */
select a1_0.books_isbn,a1_1.id,a1_1.bio,a1_1.name
from Book_Author a1_0
join Author a1_1 on a1_1.id=a1_0.authors_id
where a1_0.books_isbn = any (?)
/* second batch of associated Authors */
select a1_0.books_isbn,a1_1.id,a1_1.bio,a1_1.name
from Book_Author a1_0
join Author a1_1 on a1_1.id=a1_0.authors_id
where a1_0.books_isbn = any (?)
第一条 select 语句查询并检索 Book。第二条和第三条查询则分批获取关联的 Author。所需的批次数取决于配置的批次大小。此处需要两个批次,因此执行了两条 SQL 语句。
The first select statement queries and retrieves the Books. The second and third queries fetch the associated Authors in batches. The number of batches required depends on the configured batch size. Here, two batches were required, so two SQL statements were executed.
用于批量获取的 SQL 因数据库而略有不同。此处在 PostgreSQL 上,Hibernate 将一批主键值作为 SQL ARRAY 传递。 |
The SQL for batch fetching looks slightly different depending on the database. Here, on PostgreSQL, Hibernate passes a batch of primary key values as a SQL ARRAY. |
SQL for subselect fetching
另一方面,使用子查询抓取,Hibernate 将执行以下 SQL:
On the other hand, with subselect fetching, Hibernate would execute this SQL:
/* initial query for Books */
select b1_0.isbn,b1_0.price,b1_0.published,b1_0.publisher_id,b1_0.title
from Book b1_0
order by b1_0.isbn
/* fetch all associated Authors */
select a1_0.books_isbn,a1_1.id,a1_1.bio,a1_1.name
from Book_Author a1_0
join Author a1_1 on a1_1.id=a1_0.authors_id
where a1_0.books_isbn in (select b1_0.isbn from Book b1_0)
请注意,第一个查询在第二个查询的子查询中被重新执行。执行子查询可能会相对便宜,因为数据可能已由数据库缓存。很巧妙,对吧?
Notice that the first query is re-executed in a subselect in the second query. The execution of the subselect is likely to be relatively inexpensive, since the data should already be cached by the database. Clever, huh?
Enabling the use of batch or subselect fetching
批次获取和子查询获取默认都处于禁用状态,但我们可以使用属性在全局范围内启用其中之一。
Both batch fetching and subselect fetching are disabled by default, but we may enable one or the other globally using properties.
表 46. 启用批次和子查询获取的配置设置
Table 46. Configuration settings to enable batch and subselect fetching
Configuration property name |
Property value |
Alternatives |
hibernate.default_batch_fetch_size |
A sensible batch size >1 to enable batch fetching |
@BatchSize(), setFetchBatchSize() |
hibernate.use_subselect_fetch |
true to enable subselect fetching |
@Fetch(SUBSELECT), setSubselectFetchingEnabled() |
或者,我们也可以在给定的会话中启用其中之一:
Alternatively, we can enable one or the other in a given session:
session.setFetchBatchSize(5);
session.setSubselectFetchingEnabled(true);
我们可以通过使用 @Fetch 注释对集合或多值关联进行注释来请求更具选择性的子查询获取。 |
We may request subselect fetching more selectively by annotating a collection or many-valued association with the @Fetch annotation. |
@ManyToMany @Fetch(SUBSELECT)
Set<Author> authors;
请注意,@Fetch(SUBSELECT) 的效果与 @Fetch(SELECT) 相同,区别仅在于 HQL 或条件查询执行之后的行为。查询执行后,@Fetch(SUBSELECT) 能够更高效地获取关联。
Note that @Fetch(SUBSELECT) has the same effect as @Fetch(SELECT), except after execution of an HQL or criteria query. But after query execution, @Fetch(SUBSELECT) is able to much more efficiently fetch associations.
稍后,我们将了解如何使用 fetch profiles 更加有选择性地执行此操作。
Later, we’ll see how we can use fetch profiles to do this even more selectively.
仅此而已。太简单了,对吧?
That’s all there is to it. Too easy, right?
遗憾的是,这并非故事的结局。虽然批次获取可以缓解涉及 N+1 select 的问题,但并不能真正解决它。真正正确的解决方案是使用联接来获取关联。只有在外连接获取会导致笛卡尔积和庞大结果集的罕见情况下,批次获取(或子查询获取)才是最佳解决方案。
Sadly, that’s not the end of the story. While batch fetching might mitigate problems involving N+1 selects, it won’t solve them. The truly correct solution is to fetch associations using joins. Batch fetching (or subselect fetching) can only be the best solution in rare cases where outer join fetching would result in a cartesian product and a huge result set.
但批次获取和子查询获取有一个重要的共同特性:它们可以延迟执行。原则上,这非常方便。当我们查询数据并随后浏览对象图时,延迟获取可以省去提前规划的麻烦。但事实证明,这是一个我们不得不放弃的便利。
But batch fetching and subselect fetching have one important characteristic in common: they can be performed lazily. This is, in principle, pretty convenient. When we query data, and then navigate an object graph, lazy fetching saves us the effort of planning ahead. It turns out that this is a convenience we’re going to have to surrender.
7.5. Join fetching
外部联接获取通常是获取关联的最佳方法,也是我们大多数时间使用的方法。遗憾的是,联接获取从其本质来说根本无法延迟。因此,为了利用联接获取,我们必须提前规划。我们的普遍建议是:
Outer join fetching is usually the best way to fetch associations, and it’s what we use most of the time. Unfortunately, by its very nature, join fetching simply can’t be lazy. So to make use of join fetching, we must plan ahead. Our general advice is:
现在,我们并不是说应该默认映射关联以便快速提取!这将是一个可怕的想法,会导致提取整个数据库几乎所有数据的简单会话操作。因此: |
Now, we’re not saying that associations should be mapped for eager fetching by default! That would be a terrible idea, resulting in simple session operations that fetch almost the entire database. Therefore: |
听起来这个提示与前面的提示相矛盾,但事实并非如此。它表示你必须明确指定关联的快速提取,仅在需要的时候和需要的地方提取。 |
It sounds as if this tip is in contradiction to the previous one, but it’s not. It’s saying that you must explicitly specify eager fetching for associations precisely when and where they are needed. |
如果我们需要在某些特定事务中执行急切联接获取,我们有四种不同的方式指定。
If we need eager join fetching in some particular transaction, we have four different ways to specify that.
Passing a JPA EntityGraph |
We’ve already seen this in Entity graphs and eager fetching |
Specifying a named fetch profile |
We’ll discuss this approach later in Named fetch profiles |
Using left join fetch in HQL/JPQL |
See A Guide to Hibernate Query Language for details |
Using From.fetch() in a criteria query |
Same semantics as join fetch in HQL |
通常情况下,查询是最方便的选项。下面是如何在 HQL 中请求联接获取:
Typically, a query is the most convenient option. Here’s how we can ask for join fetching in HQL:
List<Book> booksWithJoinFetchedAuthors =
session.createSelectionQuery("from Book join fetch authors order by isbn", Book.class)
.getResultList();
这是同一查询,使用条件 API 编写:
And this is the same query, written using the criteria API:
var builder = sessionFactory.getCriteriaBuilder();
var query = builder.createQuery(Book.class);
var book = query.from(Book.class);
book.fetch(Book_.authors);
query.select(book);
query.orderBy(builder.asc(book.get(Book_.isbn)));
List<Book> booksWithJoinFetchedAuthors =
session.createSelectionQuery(query).getResultList();
不管哪种方式,都会执行一条 SQL select 语句:
Either way, a single SQL select statement is executed:
select b1_0.isbn,a1_0.books_isbn,a1_1.id,a1_1.bio,a1_1.name,b1_0.price,b1_0.published,b1_0.publisher_id,b1_0.title
from Book b1_0
join (Book_Author a1_0 join Author a1_1 on a1_1.id=a1_0.authors_id)
on b1_0.isbn=a1_0.books_isbn
order by b1_0.isbn
好多了!
Much better!
联接获取尽管无法延迟,但显然比批次或子查询获取效率更高,而我们建议避免使用延迟获取的依据正在于此。
Join fetching, despite its non-lazy nature, is clearly more efficient than either batch or subselect fetching, and this is the source of our recommendation to avoid the use of lazy fetching.
有一种有趣的情况会让联接获取变得低效:当我们并行获取两个多值关联时。想象一下,我们想在某个工作单元中同时获取 Author.books 和 Author.royaltyStatements。在一个查询中联接这两个集合会导致表的笛卡尔积以及庞大的 SQL 结果集。子查询获取在此派上用场:它允许我们使用联接获取 books,并通过单个后续 select 获取 royaltyStatements。 |
There’s one interesting case where join fetching becomes inefficient: when we fetch two many-valued associations in parallel. Imagine we wanted to fetch both Author.books and Author.royaltyStatements in some unit of work. Joining both collections in a single query would result in a cartesian product of tables, and a large SQL result set. Subselect fetching comes to the rescue here, allowing us to fetch books using a join, and royaltyStatements using a single subsequent select. |
当然,避免多次往返数据库的另一种方法是在 Java 客户端中缓存我们所需的数据。如果我们期望在本地缓存中找到关联数据,那么我们很有可能根本不需要联接抓取。
Of course, an alternative way to avoid many round trips to the database is to cache the data we need in the Java client. If we’re expecting to find the associated data in a local cache, we probably don’t need join fetching at all.
但是,如果我们不能确定所有关联数据都在缓存中,该怎么办?在这种情况下,我们或许可以通过启用批次获取来降低缓存未命中的成本。 |
But what if we can’t be certain that all associated data will be in the cache? In that case, we might be able to reduce the cost of cache misses by enabling batch fetching. |
7.6. The second-level cache
减少对数据库访问次数的经典方法是使用二级缓存,允许以会话为单位共享内存中缓存的数据。
A classic way to reduce the number of accesses to the database is to use a second-level cache, allowing data cached in memory to be shared between sessions.
从本质上讲,二级缓存往往会破坏关系数据库中事务处理的 ACID 特性。我们并不会使用带两阶段提交的分布式事务来确保对缓存和数据库的更改以原子方式发生。因此,二级缓存通常是提升系统性能最简单的途径,但代价是让有关并发的推理变得更加困难。因此,缓存是难以隔离和重现的错误的潜在来源。
By nature, a second-level cache tends to undermine the ACID properties of transaction processing in a relational database. We don’t use a distributed transaction with two-phase commit to ensure that changes to the cache and database happen atomically. So a second-level cache is often by far the easiest way to improve the performance of a system, but only at the cost of making it much more difficult to reason about concurrency. And so the cache is a potential source of bugs which are difficult to isolate and reproduce.
因此,默认情况下,实体没有资格存储在二级缓存中。我们必须使用 org.hibernate.annotations 中的 @Cache 注释显式地标记将在二级缓存中存储的每个实体。
Therefore, by default, an entity is not eligible for storage in the second-level cache. We must explicitly mark each entity that will be stored in the second-level cache with the @Cache annotation from org.hibernate.annotations.
但仍不够。Hibernate 本身不包含二级缓存的实现,因此,有必要配置外部 cache provider。
But that’s still not enough. Hibernate does not itself contain an implementation of a second-level cache, so it’s necessary to configure an external cache provider.
默认情况下已禁用高速缓存。为了最大限度降低数据丢失的风险,我们强制你暂停并思考,然后再将任何实体放入高速缓存。
Caching is disabled by default. To minimize the risk of data loss, we force you to stop and think before any entity goes into the cache.
Hibernate 将二级缓存分割成多个命名区域(regions),每个区域对应一个:
Hibernate segments the second-level cache into named regions, one for each:
-
mapped entity hierarchy or
-
collection role.
例如,可能存在 Author、Book、Author.books 和 Book.authors 的独立缓存区域。
For example, there might be separate cache regions for Author, Book, Author.books, and Book.authors.
每个区域都可以使用自己的过期、持久化和复制策略。这些策略必须在 Hibernate 之外进行配置。
Each region is permitted its own policies for expiry, persistence, and replication. These policies must be configured externally to Hibernate.
适当的策略取决于实体所表示的数据类型。例如,程序可能对“参考”数据、交易数据和用于分析的数据使用不同的缓存策略。通常,这些策略的实现是基础缓存实现的责任。
The appropriate policies depend on the kind of data an entity represents. For example, a program might have different caching policies for "reference" data, for transactional data, and for data used for analytics. Ordinarily, the implementation of those policies is the responsibility of the underlying cache implementation.
7.7. Specifying which data is cached
默认情况下,没有任何数据符合存储在二级缓存中的条件。
By default, no data is eligible for storage in the second-level cache.
可以使用 @Cache 注解为实体层级或集合角色分配一个区域。如果不明确指定区域名称,则区域名称就是实体类或集合角色的名称。
An entity hierarchy or collection role may be assigned a region using the @Cache annotation. If no region name is explicitly specified, the region name is just the name of the entity class or collection role.
@Entity
@Cache(usage=NONSTRICT_READ_WRITE, region="Publishers")
class Publisher {
...
@Cache(usage=READ_WRITE, region="PublishedBooks")
@OneToMany(mappedBy=Book_.PUBLISHER)
Set<Book> books;
...
}
由 @Cache 注解定义的缓存会被 Hibernate 自动利用来:
The cache defined by a @Cache annotation is automatically utilized by Hibernate to:
-
retrieve an entity by id when find() is called, or
-
to resolve an association by id.
@Cache 注解必须指定在实体继承层次的根类上。将注解放在子类实体上是错误的。
The @Cache annotation must be specified on the root class of an entity inheritance hierarchy. It’s an error to place it on a subclass entity.
@Cache 注释总是指定一个 CacheConcurrencyStrategy ,即管理并发事务对二级缓存访问的策略。
The @Cache annotation always specifies a CacheConcurrencyStrategy, a policy governing access to the second-level cache by concurrent transactions.
表格 47。缓存并发
Table 47. Cache concurrency
Concurrency policy |
Interpretation |
Explanation |
READ_ONLY |
Immutable data; read-only access |
Indicates that the cached object is immutable, and is never updated. If an entity with this cache concurrency is updated, an exception is thrown. This is the simplest, safest, and best-performing cache concurrency strategy. It’s particularly suitable for so-called "reference" data. |
NONSTRICT_READ_WRITE |
Concurrent updates are extremely improbable; read/write access with no locking |
Indicates that the cached object is sometimes updated, but that it’s extremely unlikely that two transactions will attempt to update the same item of data at the same time. This strategy does not use locks. When an item is updated, the cache is invalidated both before and after completion of the updating transaction. But without locking, it’s impossible to completely rule out the possibility of a second transaction storing or retrieving stale data in or from the cache during the completion process of the first transaction. |
READ_WRITE |
Concurrent updates are possible but not common; read/write access using soft locks |
Indicates a non-vanishing likelihood that two concurrent transactions attempt to update the same item of data simultaneously. This strategy uses "soft" locks to prevent concurrent transactions from retrieving or storing a stale item from or in the cache during the transaction completion process. A soft lock is simply a marker entry placed in the cache while the updating transaction completes. A second transaction may not read the item from the cache while the soft lock is present, and instead simply proceeds to read the item directly from the database, exactly as if a regular cache miss had occurred. Similarly, the soft lock also prevents this second transaction from storing a stale item to the cache when it returns from its round trip to the database with something that might not quite be the latest version. |
TRANSACTIONAL |
Concurrent updates are frequent; transactional access |
Indicates that concurrent writes are common, and the only way to maintain synchronization between the second-level cache and the database is via the use of a fully transactional cache provider. In this case, the cache and the database must cooperate via JTA or the XA protocol, and Hibernate itself takes on little responsibility for maintaining the integrity of the cache. |
哪些策略有意义也可能取决于基础二级缓存实现。
Which policies make sense may also depend on the underlying second-level cache implementation.
JPA 有一个类似的注释,名为 @Cacheable 。不幸的是,它对我们几乎没用,因为: |
JPA has a similar annotation, named @Cacheable. Unfortunately, it’s almost useless to us, since: |
它没有提供任何方式来指定被缓存实体的性质以及应如何管理其缓存,而且
it provides no way to specify any information about the nature of the cached entity and how its cache should be managed, and
它不能用于注解关联,因此我们甚至无法用它将集合角色标记为有资格存储在二级缓存中。
it may not be used to annotate associations, and so we can’t even use it to mark collection roles as eligible for storage in the second-level cache.
7.8. Caching by natural id
如果我们的实体有一个 natural id,我们可以通过注释实体 @NaturalIdCache 来启用一个附加缓存,该缓存保存从自然 ID 到主键 ID 的交叉引用。默认情况下,自然 ID 缓存存储在二级缓存的专用区域中,单独于缓存的实体数据。
If our entity has a natural id, we can enable an additional cache, which holds cross-references from natural id to primary id, by annotating the entity @NaturalIdCache. By default, the natural id cache is stored in a dedicated region of the second-level cache, separate from the cached entity data.
@Entity
@Cache(usage=READ_WRITE, region="Book")
@NaturalIdCache(region="BookIsbn")
class Book {
...
@NaturalId
String isbn;
@NaturalId
int printing;
...
}
当使用 Session 中执行自然 ID 查找的操作之一来检索实体时,将使用此缓存。
This cache is utilized when the entity is retrieved using one of the operations of Session which performs lookup by natural id.
由于自然 ID 缓存不包含实体的实际状态,因此除非实体已经符合存储在二级缓存中的条件(即也注释了 @Cache),否则对实体 @NaturalIdCache 进行注释毫无意义。 |
Since the natural id cache doesn’t contain the actual state of the entity, it doesn’t make sense to annotate an entity @NaturalIdCache unless it’s already eligible for storage in the second-level cache, that is, unless it’s also annotated @Cache. |
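按自然 ID 的查找示意如下(示意代码;ISBN 取值纯属虚构):

A lookup by natural id looks like this (a sketch; the ISBN value is invented):

```java
// Sketch only: composite natural id lookup (isbn + printing), which can be
// satisfied from the natural id cache together with the second-level cache.
Book book = session.byNaturalId(Book.class)
        .using("isbn", "9781234567897")
        .using("printing", 1)
        .load();
```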
值得注意的是,与实体的主标识符不同,自然 ID 可能可变。
It’s worth noticing that, unlike the primary identifier of an entity, a natural id might be mutable.
现在我们必须考虑一个精细之处,在处理所谓的“参考数据”时经常会出现,即能轻松装入内存且不太会发生变化的数据。
We must now consider a subtlety that often arises when we have to deal with so-called "reference data", that is, data which fits easily in memory, and doesn’t change much.
7.9. Caching and association fetching
让我们再次考虑我们的 Publisher 类:
Let’s consider again our Publisher class:
@Cache(usage=NONSTRICT_READ_WRITE, region="Publishers")
@Entity
class Publisher { ... }
关于出版商的数据不会经常发生变化,而且数量也不多。假设我们已做好一切设置,使出版商几乎总是可以在二级缓存中找到。
Data about publishers doesn’t change very often, and there aren’t so many of them. Suppose we’ve set everything up so that the publishers are almost always available in the second-level cache.
那么在这种情况下,我们需要仔细考虑类型为 Publisher 的关联。
Then in this case we need to think carefully about associations of type Publisher.
@ManyToOne
Publisher publisher;
这个关联不需要延迟获取,因为我们希望它在内存中可用,因此我们不会将其设置为 fetch=LAZY。但另一方面,如果我们将其保留为立即获取,默认情况下 Hibernate 通常会使用联接来获取它。这会给数据库带来完全不必要的负载。
There’s no need for this association to be lazily fetched, since we’re expecting it to be available in memory, so we won’t set it fetch=LAZY. But on the other hand, if we leave it marked for eager fetching then, by default, Hibernate will often fetch it using a join. This places completely unnecessary load on the database.
解决方案是 @Fetch 注释:
The solution is the @Fetch annotation:
@ManyToOne @Fetch(SELECT)
Publisher publisher;
通过为关联添加 @Fetch(SELECT) 注解,我们抑制了联接获取,让 Hibernate 有机会在缓存中找到关联的 Publisher。
By annotating the association @Fetch(SELECT), we suppress join fetching, giving Hibernate a chance to find the associated Publisher in the cache.
因此,我们得出了以下经验法则:
Therefore, we arrive at this rule of thumb:
指向“参考数据”或几乎总是可以在缓存中找到的任何其他数据的多对一关联,应映射为 EAGER、SELECT。 |
Many-to-one associations to "reference data", or to any other data that will almost always be available in the cache, should be mapped EAGER,SELECT. |
而其他关联,正如我们已经明确指出的,应为 LAZY。
Other associations, as we’ve already made clear, should be LAZY.
一旦我们将实体或集合标记为符合存储在二级缓存中的条件,我们仍然需要设置一个实际的缓存。
Once we’ve marked an entity or collection as eligible for storage in the second-level cache, we still need to set up an actual cache.
7.10. Configuring the second-level cache provider
配置二级缓存提供程序是一个相当复杂的话题,超出了本文档的范围。但如果有帮助的话,我们经常使用以下配置来测试 Hibernate,它使用 EHCache 作为缓存实现,如上文 Optional dependencies 中所示:
Configuring a second-level cache provider is a rather involved topic, and quite outside the scope of this document. But in case it helps, we often test Hibernate with the following configuration, which uses EHCache as the cache implementation, as above in Optional dependencies:
表格 48. EHCache 配置
Table 48. EHCache configuration
Configuration property name |
Property value |
hibernate.cache.region.factory_class |
jcache |
hibernate.javax.cache.uri |
/ehcache.xml |
如果你正在使用 EHCache,你还需要包含一个 ehcache.xml 文件,该文件明确配置属于你的实体和集合的每个缓存区域的行为。有关配置 EHCache 的更多信息,请参阅 here。
If you’re using EHCache, you’ll also need to include an ehcache.xml file that explicitly configures the behavior of each cache region belonging to your entities and collections. You’ll find more information about configuring EHCache here.
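一个极简的 ehcache.xml 示意如下(区域名 Publishers 对应上文示例;过期时间和容量仅为占位值):

A minimal ehcache.xml might look like the sketch below (the region name Publishers matches the earlier example; the expiry and capacity values are placeholders):

```xml
<!-- Sketch only: one <cache> element per region named in @Cache annotations -->
<config xmlns="http://www.ehcache.org/v3">
    <cache alias="Publishers">
        <expiry><ttl unit="minutes">30</ttl></expiry>
        <heap unit="entries">1000</heap>
    </cache>
</config>
```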
我们可能使用 JCache 的任何其他实现,例如 Caffeine。JCache 会自动选择它在类路径中找到的任何实现。如果类路径上有多个实现,我们必须使用以下内容消除歧义:
We may use any other implementation of JCache, such as Caffeine. JCache automatically selects whichever implementation it finds on the classpath. If there are multiple implementations on the classpath, we must disambiguate using:
表格 49. 消除 JCache 实现歧义
Table 49. Disambiguating the JCache implementation
Configuration property name |
Property value |
hibernate.javax.cache.provider |
The implementation of javax.cache.spi.CachingProvider, for example: |
org.ehcache.jsr107.EhcacheCachingProvider |
for EHCache |
com.github.benmanes.caffeine.jcache.spi.CaffeineCachingProvider |
for Caffeine |
或者,要使用 Infinispan 作为缓存实现,需要以下设置:
Alternatively, to use Infinispan as the cache implementation, the following settings are required:
表格 50. Infinispan 提供程序配置
Table 50. Infinispan provider configuration
Configuration property name |
Property value |
hibernate.cache.region.factory_class |
infinispan |
hibernate.cache.infinispan.cfg |
Path to an Infinispan configuration file, for example: |
org/infinispan/hibernate/cache/commons/builder/infinispan-configs.xml |
for a distributed cache |
org/infinispan/hibernate/cache/commons/builder/infinispan-configs-local.xml |
to test with local cache |
当需要分布式缓存时,通常使用 Infinispan。有关将 Infinispan 与 Hibernate 配合使用的更多信息,请参阅 here。
Infinispan is usually used when distributed caching is required. There’s more about using Infinispan with Hibernate here.
最后,有一种方法可以全局禁用二级缓存:
Finally, there’s a way to globally disable the second-level cache:
表格 51. 禁用缓存的设置
Table 51. Setting to disable caching
Configuration property name |
Property value |
hibernate.cache.use_second_level_cache |
true to enable caching, or false to disable it |
设置 hibernate.cache.region.factory_class 时,此属性默认为 true。
When hibernate.cache.region.factory_class is set, this property defaults to true.
此设置使我们能够在进行故障排除或性能分析时轻松完全禁用二级缓存。 |
This setting lets us easily disable the second-level cache completely when troubleshooting or profiling performance. |
您可以在 User Guide 中找到有关二级缓存的更多信息。
You can find much more information about the second-level cache in the User Guide.
7.11. Caching query result sets
我们上面介绍的缓存仅用于优化按 ID 或自然 ID 的查找。Hibernate 还可以缓存查询的结果集,尽管这很少是高效的做法。
The caches we’ve described above are only used to optimize lookups by id or by natural id. Hibernate also has a way to cache the result sets of queries, though this is only rarely an efficient thing to do.
必须显式启用查询缓存:
The query cache must be enabled explicitly:
表格 52. 启用查询缓存的设置
Table 52. Setting to enable the query cache
Configuration property name |
Property value |
hibernate.cache.use_query_cache |
true to enable the query cache |
要缓存查询结果,请调用 SelectionQuery.setCacheable(true):
To cache the results of a query, call SelectionQuery.setCacheable(true):
session.createQuery("from Product where discontinued = false")
.setCacheable(true)
.getResultList();
默认情况下,查询结果集存储在名为 default-query-results-region 的缓存区域中。由于不同的查询应具有不同的缓存策略,因此通常明确指定区域名称:
By default, the query result set is stored in a cache region named default-query-results-region. Since different queries should have different caching policies, it’s common to explicitly specify a region name:
session.createQuery("from Product where discontinued = false")
.setCacheable(true)
.setCacheRegion("ProductCatalog")
.getResultList();
结果集与一个逻辑时间戳(logical timestamp)一起缓存。所谓“逻辑”,是指它实际上不会随时间线性增加,尤其不是系统时间。
A result set is cached together with a logical timestamp. By "logical", we mean that it doesn’t actually increase linearly with time, and in particular it’s not the system time.
当 Product 更新时,Hibernate 并不会遍历查询缓存,使受该变更影响的每个已缓存结果集失效。相反,缓存中有一个特殊区域,保存每个表最近一次更新的逻辑时间戳。这称为 update timestamps cache,保留在 default-update-timestamps-region 区域中。
When a Product is updated, Hibernate does not go through the query cache and invalidate every cached result set that’s affected by the change. Instead, there’s a special region of the cache which holds a logical timestamp of the most-recent update to each table. This is called the update timestamps cache, and it’s kept in the region default-update-timestamps-region.
你有责任确保此缓存区域配置了适当的策略。具体来说,更新时间戳绝不应过期或被逐出。
It’s your responsibility to ensure that this cache region is configured with appropriate policies. In particular, update timestamps should never expire or be evicted.
当从缓存中读取查询结果集时,Hibernate 会将其时间戳与影响查询结果的每个表的时间戳进行比较,只有在结果集未过时的情况下才返回它。如果结果集已过时,Hibernate 会针对数据库重新执行查询,并更新缓存的结果集。
When a query result set is read from the cache, Hibernate compares its timestamp with the timestamp of each of the tables that affect the results of the query, and only returns the result set if the result set isn’t stale. If the result set is stale, Hibernate goes ahead and re-executes the query against the database and updates the cached result set.
与任何二级缓存的通常情形一样,查询缓存可能破坏事务的 ACID 特性。
As is generally the case with any second-level cache, the query cache can break the ACID properties of transactions.
7.12. Second-level cache management
二级缓存大多数是透明的。与 Hibernate 会话交互的程序逻辑不知道该缓存,并且不受缓存策略变更的影响。
For the most part, the second-level cache is transparent. Program logic which interacts with the Hibernate session is unaware of the cache, and is not impacted by changes to caching policies.
至多,与缓存的交互可以通过指定显式的 CacheMode 来加以控制:
At worst, interaction with the cache may be controlled by specifying an explicit CacheMode:
session.setCacheMode(CacheMode.IGNORE);
或者,使用 JPA-standard API:
Or, using JPA-standard APIs:
entityManager.setCacheRetrieveMode(CacheRetrieveMode.BYPASS);
entityManager.setCacheStoreMode(CacheStoreMode.BYPASS);
JPA 定义的缓存模式分为两种:CacheRetrieveMode 和 CacheStoreMode。
The JPA-defined cache modes come in two flavors: CacheRetrieveMode and CacheStoreMode.
表 53. JPA 定义的缓存检索模式
Table 53. JPA-defined cache retrieval modes
Mode |
Interpretation |
CacheRetrieveMode.USE |
Read data from the cache if available |
CacheRetrieveMode.BYPASS |
Don’t read data from the cache; go direct to the database |
如果我们担心从缓存读取陈旧数据,则可以选择 CacheRetrieveMode.BYPASS。
We might select CacheRetrieveMode.BYPASS if we’re concerned about the possibility of reading stale data from the cache.
表 54. JPA 定义的缓存存储模式
Table 54. JPA-defined cache storage modes
Mode |
Interpretation |
CacheStoreMode.USE |
Write data to the cache when read from the database or when modified; do not update already-cached items when reading |
CacheStoreMode.REFRESH |
Write data to the cache when read from the database or when modified; always update cached items when reading |
CacheStoreMode.BYPASS |
Don’t write data to the cache |
如果我们查询不需要缓存的数据,则应该选择 CacheStoreMode.BYPASS。
We should select CacheStoreMode.BYPASS if we’re querying data that doesn’t need to be cached.
在运行返回大型结果集的查询之前(查询返回的是我们认为不久后不需要的大量数据),最好将 CacheStoreMode 设置为 BYPASS。这样做可以节省工作,并防止新读取的数据将先前缓存的数据推出。 |
It’s a good idea to set the CacheStoreMode to BYPASS just before running a query which returns a large result set full of data that we don’t expect to need again soon. This saves work, and prevents the newly-read data from pushing out the previously cached data. |
在 JPA 中,我们将使用此惯用语:
In JPA we would use this idiom:
entityManager.setCacheStoreMode(CacheStoreMode.BYPASS);
List<Publisher> allpubs =
entityManager.createQuery("from Publisher", Publisher.class)
.getResultList();
entityManager.setCacheStoreMode(CacheStoreMode.USE);
但是 Hibernate 有更好的方法:
But Hibernate has a better way:
List<Publisher> allpubs =
session.createSelectionQuery("from Publisher", Publisher.class)
.setCacheStoreMode(CacheStoreMode.BYPASS)
.getResultList();
Hibernate 的 CacheMode 将一个 CacheRetrieveMode 与一个 CacheStoreMode 打包在一起。
A Hibernate CacheMode packages a CacheRetrieveMode with a CacheStoreMode.
表 55. Hibernate 缓存模式和 JPA 等效项
Table 55. Hibernate cache modes and JPA equivalents
Hibernate CacheMode |
Equivalent JPA modes |
NORMAL |
CacheRetrieveMode.USE, CacheStoreMode.USE |
IGNORE |
CacheRetrieveMode.BYPASS, CacheStoreMode.BYPASS |
GET |
CacheRetrieveMode.USE, CacheStoreMode.BYPASS |
PUT |
CacheRetrieveMode.BYPASS, CacheStoreMode.USE |
REFRESH |
CacheRetrieveMode.BYPASS, CacheStoreMode.REFRESH |
没有特别的理由优先选择 Hibernate 的 CacheMode 而不是 JPA 的等效项。这个枚举之所以存在,只是因为早在 JPA 引入缓存模式之前,Hibernate 就已经有了缓存模式。
There’s no particular reason to prefer Hibernate’s CacheMode over the JPA equivalents. This enumeration only exists because Hibernate had cache modes long before they were added to JPA.
对于“参考”数据,即始终应在二级缓存中找到的数据,最好在启动时 prime 缓存。有一种方法非常简单:只需在获取 EntityManager 或 SessionFactory 后立即执行查询。 |
For "reference" data, that is, for data which is expected to always be found in the second-level cache, it’s a good idea to prime the cache at startup. There’s a really easy way to do this: just execute a query immediately after obtaining the EntityManager or SessionFactory. |
SessionFactory sessionFactory =
        setupHibernate(new Configuration())
            .buildSessionFactory();
sessionFactory.inSession(session -> {
    session.createSelectionQuery("from Country")
            .setReadOnly(true)
            .getResultList();
    session.createSelectionQuery("from Product where discontinued = false")
            .setReadOnly(true)
            .getResultList();
});
极少数情况下,显式控制缓存是必要的或有益的,例如,逐出我们明知已经过时的某些数据。Cache 接口允许以编程方式逐出缓存项。
Very occasionally, it’s necessary or advantageous to control the cache explicitly, for example, to evict some data that we know to be stale. The Cache interface allows programmatic eviction of cached items.
sessionFactory.getCache().evictEntityData(Book.class, bookId);
通过 Cache 接口管理二级缓存不具有事务感知能力。Cache 的任何操作都不遵守底层缓存相关的任何隔离或事务语义。具体来说,通过此接口的方法进行逐出,会导致立即的"强制"移除,不受任何当前事务和/或锁定方案的约束。
Second-level cache management via the Cache interface is not transaction-aware. None of the operations of Cache respect any isolation or transactional semantics associated with the underlying caches. In particular, eviction via the methods of this interface causes an immediate "hard" removal outside any current transaction and/or locking scheme.
但是,通常在修改后,Hibernate 会自动逐出或更新缓存的数据,此外,未使用的缓存数据最终会根据配置的策略过期。
Ordinarily, however, Hibernate automatically evicts or updates cached data after modifications, and, in addition, cached data which is unused will eventually be expired according to the configured policies.
这与一级缓存中发生的情况完全不同。
This is quite different to what happens with the first-level cache.
7.13. Session cache management
当不再需要实体实例时,它们不会自动从会话缓存中清除。相反,它们在程序丢弃它们所在的会话之前一直固定在内存中。
Entity instances aren’t automatically evicted from the session cache when they’re no longer needed. Instead, they stay pinned in memory until the session they belong to is discarded by your program.
方法 detach() 和 clear() 允许你从会话缓存中移除实体,使其可以被垃圾回收。由于大多数会话的生命周期都很短,你不会经常需要这些操作。而且,如果你发现自己在某种情况下觉得确实需要它们,则应该认真考虑一种替代方案:无状态会话(stateless session)。
The methods detach() and clear() allow you to remove entities from the session cache, making them available for garbage collection. Since most sessions are rather short-lived, you won’t need these operations very often. And if you find yourself thinking you do need them in a certain situation, you should strongly consider an alternative solution: a stateless session.
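The effect of detach() and clear() on the session cache can be pictured with a toy plain-Java model (an illustration only, not Hibernate's persistence context): entities stay reachable from the session's internal map, and hence pinned in memory, until they are explicitly removed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a first-level cache: entities remain pinned in the map
// until detach() or clear() removes them, making them eligible for
// garbage collection once the program holds no other references.
class ToySessionCache {
    private final Map<Long, Object> context = new HashMap<>();

    void put(long id, Object entity) { context.put(id, entity); }
    Object get(long id) { return context.get(id); }
    void detach(long id) { context.remove(id); } // like Session.detach(entity)
    void clear() { context.clear(); }            // like Session.clear()
    int size() { return context.size(); }
}
```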
7.14. Stateless sessions
Hibernate 一个可以说未被充分重视的特性是 StatelessSession 接口,它提供了一种面向命令、更底层的与数据库交互的方法。
An arguably-underappreciated feature of Hibernate is the StatelessSession interface, which provides a command-oriented, more bare-metal approach to interacting with the database.
你可以从 SessionFactory 获取无状态会话:
You may obtain a stateless session from the SessionFactory:
StatelessSession ss = getSessionFactory().openStatelessSession();
一个无状态会话:
A stateless session:
-
doesn’t have a first-level cache (persistence context), nor does it interact with any second-level caches, and
-
doesn’t implement transactional write-behind or automatic dirty checking, so all operations are executed immediately when they’re explicitly called.
对于无状态会话,我们始终使用分离的对象。因此,编程模型有点不同:
For a stateless session, we’re always working with detached objects. Thus, the programming model is a bit different:
表 56. StatelessSession 的重要方法
Table 56. Important methods of the StatelessSession
Method name and parameters |
Effect |
get(Class, Object) |
Obtain a detached object, given its type and its id, by executing a select |
fetch(Object) |
Fetch an association of a detached object |
refresh(Object) |
Refresh the state of a detached object by executing a select |
insert(Object) |
Immediately insert the state of the given transient object into the database |
update(Object) |
Immediately update the state of the given detached object in the database |
delete(Object) |
Immediately delete the state of the given detached object from the database |
upsert(Object) |
Immediately insert or update the state of the given detached object using a SQL merge into statement |
在某些情况下,这使得无状态会话更易于使用、更易于推理,但需要注意的是,无状态会话更容易受到数据别名(data aliasing)效应的影响,因为很容易得到两个表示同一数据库表行、但并不相同的 Java 对象。 |
In certain circumstances, this makes stateless sessions easier to work with and simpler to reason about, but with the caveat that a stateless session is much more vulnerable to data aliasing effects, since it’s easy to get two non-identical Java objects which both represent the same row of a database table. |
如果我们在无状态会话中使用 fetch(),我们可以非常轻松地得到表示同一数据库行的两个对象!
If we use fetch() in a stateless session, we can very easily obtain two objects representing the same database row!
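This aliasing effect is easy to demonstrate with a plain-Java sketch (hypothetical classes, not Hibernate API): without a persistence context there is no identity map, so every fetch builds a fresh object.

```java
// Without a first-level cache there is no identity map, so fetching the
// same row twice yields two distinct (non-identical) Java objects.
class Row {
    final long id;
    Row(long id) { this.id = id; }
}

class NoIdentityMapLoader {
    // simulates a stateless read: always constructs a new object
    Row get(long id) { return new Row(id); }
}
```

Contrast this with a stateful session, whose persistence context guarantees at most one object per database row within a given session.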
特别是,由于没有持久化上下文,我们可以安全地执行批量处理任务,而无需分配大量内存。使用 StatelessSession 免去了调用以下方法的需要:
In particular, the absence of a persistence context means that we can safely perform bulk-processing tasks without allocating huge quantities of memory. Use of a StatelessSession alleviates the need to call:
-
clear() or detach() to perform first-level cache management, and
-
setCacheMode() to bypass interaction with the second-level cache.
无状态会话可能很有用,但对于庞大数据集的批量操作,Hibernate 根本无法与存储过程竞争!
Stateless sessions can be useful, but for bulk operations on huge datasets, Hibernate can’t possibly compete with stored procedures! |
7.15. Optimistic and pessimistic locking
最后,我们在上面没有提到的一个负载下的行为方面是行级数据争用。当许多事务尝试读取和更新相同的数据时,程序可能会因锁升级、死锁和锁获取超时错误而失去响应。
Finally, an aspect of behavior under load that we didn’t mention above is row-level data contention. When many transactions try to read and update the same data, the program might become unresponsive with lock escalation, deadlocks, and lock acquisition timeout errors.
Hibernate 中有两种基本的数据并发方法:
There are two basic approaches to data concurrency in Hibernate:
-
optimistic locking using @Version columns, and
-
database-level pessimistic locking using the SQL for update syntax (or equivalent).
在 Hibernate 社区中,使用乐观锁定更常见,而 Hibernate 让这件事变得非常容易。
In the Hibernate community it’s much more common to use optimistic locking, and Hibernate makes that incredibly easy.
在多用户系统中若有可能的话,避免跨用户交互持有一个悲观锁。事实上,一般的做法是避免使用跨越用户交互的事务。对于多用户系统,乐观锁是王者。 |
Where possible, in a multiuser system, avoid holding a pessimistic lock across a user interaction. Indeed, the usual practice is to avoid having transactions that span user interactions. For multiuser systems, optimistic locking is king. |
也就是说,悲观锁也有其用武之地,它有时可以降低事务回滚的概率。
That said, there is also a place for pessimistic locks, which can sometimes reduce the probability of transaction rollbacks.
因此,会话的 find()、lock() 和 refresh() 方法接受一个可选的 LockMode。我们还可以为查询指定一个 LockMode。锁定模式可用于请求悲观锁,或自定义乐观锁的行为:
Therefore, the find(), lock(), and refresh() methods of the session accept an optional LockMode. We can also specify a LockMode for a query. The lock mode can be used to request a pessimistic lock, or to customize the behavior of optimistic locking:
表 57. 乐观和悲观锁模式
Table 57. Optimistic and pessimistic lock modes
LockMode type |
Meaning |
READ |
An optimistic lock obtained implicitly whenever an entity is read from the database using select |
OPTIMISTIC |
An optimistic lock obtained when an entity is read from the database, and verified using a select to check the version when the transaction completes |
OPTIMISTIC_FORCE_INCREMENT |
An optimistic lock obtained when an entity is read from the database, and enforced using an update to increment the version when the transaction completes |
WRITE |
A pessimistic lock obtained implicitly whenever an entity is written to the database using update or insert |
PESSIMISTIC_READ |
A pessimistic for share lock |
PESSIMISTIC_WRITE |
A pessimistic for update lock |
PESSIMISTIC_FORCE_INCREMENT |
A pessimistic lock enforced using an immediate update to increment the version |
NONE |
No lock; assigned when an entity is read from the second-level cache |
请注意,即使实体没有被修改,OPTIMISTIC 锁也总是会在事务结束时进行验证。这与大多数人谈论"乐观锁"时的含义略有不同。永远不需要对已修改的实体请求 OPTIMISTIC 锁,因为在执行 SQL update 时总是会验证版本号。
Note that an OPTIMISTIC lock is always verified at the end of the transaction, even when the entity has not been modified. This is slightly different to what most people mean when they talk about an "optimistic lock". It’s never necessary to request an OPTIMISTIC lock on a modified entity, since the version number is always verified when a SQL update is executed.
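The version check behind optimistic locking can be sketched in plain Java (an illustration of the principle only; Hibernate actually performs the check in the where clause of the SQL update statement):

```java
// Models "update ... set version = ?, title = ? where id = ? and version = ?":
// the update succeeds only if the row still carries the version we read.
class VersionedRow {
    long version = 0;
    String title;

    boolean update(long expectedVersion, String newTitle) {
        if (version != expectedVersion) {
            return false; // concurrent update detected: optimistic failure
        }
        title = newTitle;
        version++; // the version is incremented on every successful update
        return true;
    }
}
```

In Hibernate, a failed check of this kind surfaces as an optimistic locking failure, for example a StaleObjectStateException or the JPA OptimisticLockException.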
JPA 有自己的 LockModeType,它列举了大部分相同的模式。但是,JPA 的 LockModeType.READ 是 OPTIMISTIC 的同义词,它与 Hibernate 的 LockMode.READ 不同。同样,LockModeType.WRITE 是 OPTIMISTIC_FORCE_INCREMENT 的同义词,与 LockMode.WRITE 不同。
JPA has its own LockModeType, which enumerates most of the same modes. However, JPA’s LockModeType.READ is a synonym for OPTIMISTIC — it’s not the same as Hibernate’s LockMode.READ. Similarly, LockModeType.WRITE is a synonym for OPTIMISTIC_FORCE_INCREMENT and is not the same as LockMode.WRITE.
7.16. Collecting statistics
我们可以通过设置此配置属性来让 Hibernate 收集其活动相关统计信息:
We may ask Hibernate to collect statistics about its activity by setting this configuration property:
Configuration property name |
Property value |
hibernate.generate_statistics |
true to enable collection of statistics |
这些统计数据由 Statistics 对象公开:
The statistics are exposed by the Statistics object:
long failedVersionChecks =
sessionFactory.getStatistics()
.getOptimisticFailureCount();
long publisherCacheMissCount =
sessionFactory.getStatistics()
.getEntityStatistics(Publisher.class.getName())
.getCacheMissCount();
Hibernate 的统计信息支持可观察性。 Micrometer 和 SmallRye Metrics 都能够公开这些指标。
Hibernate’s statistics enable observability. Both Micrometer and SmallRye Metrics are capable of exposing these metrics.
7.17. Tracking down slow queries
当在生产环境中发现某个 SQL 查询性能欠佳时,有时很难准确地跟踪到查询的 Java 代码位于何处。Hibernate 提供了两个可以使识别缓慢查询及其来源变得更简单的配置属性。
When a poorly-performing SQL query is discovered in production, it can sometimes be hard to track down exactly where in the Java code the query originates. Hibernate offers two configuration properties that can make it easier to identify a slow query and find its source.
表 58. 跟踪慢查询的设置
Table 58. Settings for tracking slow queries
Configuration property name |
Purpose |
Property value |
hibernate.log_slow_query |
Log slow queries at the INFO level |
The minimum execution time, in milliseconds, which characterizes a "slow" query |
hibernate.use_sql_comments |
Prepend comments to the executed SQL |
true or false |
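For example, in a properties file the two settings might look like this (the threshold value here is illustrative):

```properties
# log any query slower than 250 milliseconds at the INFO level
hibernate.log_slow_query=250
# prepend the HQL text as a comment to the generated SQL
hibernate.use_sql_comments=true
```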
当 hibernate.use_sql_comments 启用时,HQL 查询的文本会作为注释添加到生成的 SQL 前面,这通常可以让用户轻松地在 Java 代码中找到 HQL。
When hibernate.use_sql_comments is enabled, the text of the HQL query is prepended as a comment to the generated SQL, which usually makes it easy to find the HQL in the Java code.
可以自定义注释文本:
The comment text may be customized:
-
by calling Query.setComment(comment) or Query.setHint(AvailableHints.HINT_COMMENT,comment), or
-
via the @NamedQuery annotation.
一旦确定了慢查询,让查询变快的最佳方法之一,就是真的去找一位擅长让查询变快的专家谈谈。这种人被称为"数据库管理员"(DBA),而如果您正在阅读本文档,您可能不是其中之一。数据库管理员了解大量 Java 开发人员不知道的东西。因此,如果足够幸运身边有一位 DBA,您就不必靠邓宁-克鲁格式的自信来解决慢查询问题。 |
Once you’ve identified a slow query, one of the best ways to make it faster is to actually go and talk to someone who is an expert at making queries go fast. These people are called "database administrators", and if you’re reading this document you probably aren’t one. Database administrators know lots of stuff that Java developers don’t. So if you’re lucky enough to have a DBA about, you don’t need to Dunning-Kruger your way out of a slow query. |
一个专业定义的索引可能是修复慢查询所需的一切。
An expertly-defined index might be all you need to fix a slow query.
7.18. Adding indexes
@Index 注释可用于向表格添加索引:
The @Index annotation may be used to add an index to a table:
@Entity
@Table(indexes=@Index(columnList="title, year, publisher_id"))
class Book { ... }
甚至可以为编制索引的列指定排序,或者指定索引应该不区分大小写:
It’s even possible to specify an ordering for an indexed column, or that the index should be case-insensitive:
@Entity
@Table(indexes=@Index(columnList="(lower(title)), year desc, publisher_id"))
class Book { ... }
这使我们可以为特定查询创建自定义索引。
This lets us create a customized index for a particular query.
请注意,诸如 lower(title) 的 SQL 表达式必须用括号括起来放在索引定义的 columnList 中。
Note that SQL expressions like lower(title) must be enclosed in parentheses in the columnList of the index definition.
关于索引的信息是否应放在 Java 代码的注解中,这一点并不确定。索引通常由数据库管理员维护和修改,理想情况下由精通调优特定 RDBMS 性能的专家负责。因此,最好将索引定义保留在 DBA 可以轻松阅读和修改的 SQL DDL 脚本中。请记住,我们可以使用属性 javax.persistence.schema-generation.create-script-source 要求 Hibernate 执行 DDL 脚本。 |
It’s not clear that information about indexes belongs in annotations of Java code. Indexes are usually maintained and modified by a database administrator, ideally by an expert in tuning the performance of one particular RDBMS. So it might be better to keep the definition of indexes in a SQL DDL script that your DBA can easily read and modify. Remember, we can ask Hibernate to execute a DDL script using the property javax.persistence.schema-generation.create-script-source. |
7.19. Dealing with denormalized data
一个规范化架构中的典型关系数据库表只包含少数列,因此,有选择地查询列并只填充实体类的某些字段几乎没有好处。
A typical relational database table in a well-normalized schema has a relatively small number of columns, and so there’s little to be gained by selectively querying columns and populating only certain fields of an entity class.
但是偶尔,我们听到有人询问如何映射有数百列或更多列的表!这种情况可能在以下情况下出现:
But occasionally, we hear from someone asking how to map a table with a hundred columns or more! This situation can arise when:
-
data is intentionally denormalized for performance,
-
the results of a complicated analytic query are exposed via a view, or
-
someone has done something crazy and wrong.
让我们假设我们处理的不是最后一种情况。那么我们希望能够在查询这张巨型表时不必返回它的所有列。乍一看,Hibernate 并没有为这个问题提供现成的完美解决方案。但这种第一印象是有误导性的。实际上,Hibernate 提供了多种应对这种情况的方法,而真正的问题在于如何在这些方法之间做出选择。我们可以:
Let’s suppose that we’re not dealing with the last possibility. Then we would like to be able to query the monster table without returning all of its columns. At first glance, Hibernate doesn’t offer a perfect bottled solution to this problem. This first impression is misleading. Actually, Hibernate features more than one way to deal with this situation, and the real problem is deciding between the ways. We could:
-
map multiple entity classes to the same table or view, being careful about "overlaps" where a mutable column is mapped to more than one of the entities,
-
use HQL or native SQL queries returning results into record types instead of retrieving entity instances, or
-
use the bytecode enhancer and @LazyGroup for attribute-level lazy fetching.
其他一些 ORM 解决方案将第三个选项作为处理巨型表的推荐方法,但这一直不是 Hibernate 团队或 Hibernate 社区的偏好。使用前两个选项中的一个更加类型安全。
Some other ORM solutions push the third option as the recommended way to handle huge tables, but this has never been the preference of the Hibernate team or Hibernate community. It’s much more typesafe to use one of the first two options.
7.20. Reactive programming with Hibernate
最后,如今许多需要高可扩展性的系统都会使用响应式编程和响应式流。Hibernate Reactive 将 O/R 映射带入了响应式编程的世界。你可以从其 Reference Documentation 了解更多有关 Hibernate Reactive 的信息。
Finally, many systems which require high scalability now make use of reactive programming and reactive streams. Hibernate Reactive brings O/R mapping to the world of reactive programming. You can learn much more about Hibernate Reactive from its Reference Documentation.
Hibernate Reactive 可以与同一程序中的原生 Hibernate 一起使用,并且可以重复使用相同的实体类。这意味着您可以在需要的地方使用响应式编程模型,或许仅仅在您系统中的一个或两个位置。您不必使用响应式流重写整个程序。 |
Hibernate Reactive may be used alongside vanilla Hibernate in the same program, and can reuse the same entity classes. This means you can use the reactive programming model exactly where you need it—perhaps only in one or two places in your system. You don’t need to rewrite your whole program using reactive streams. |