Schema Management

Apache Cassandra is a data store that requires a schema definition prior to any data interaction. Spring Data for Apache Cassandra can support you with schema creation.

Keyspaces and Lifecycle Scripts

The first thing to start with is a Cassandra keyspace. A keyspace is a logical grouping of tables that share the same replication factor and replication strategy. Keyspace management is part of the CqlSession configuration, which holds the KeyspaceSpecification and runs CQL scripts on startup and shutdown.

Declaring a keyspace with a specification allows creating and dropping the keyspace. The CQL is derived from the specification, so you need not write CQL yourself. The following example specifies a Cassandra keyspace by using Java and XML configuration:

Example 1. Specifying a Cassandra keyspace
Java
/*
 * Copyright 2020-2024 the original author or authors.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.springframework.data.cassandra.example;

import java.util.Arrays;
import java.util.List;

import org.springframework.beans.factory.BeanClassLoaderAware;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;
import org.springframework.data.cassandra.core.cql.keyspace.CreateKeyspaceSpecification;
import org.springframework.data.cassandra.core.cql.keyspace.DataCenterReplication;
import org.springframework.data.cassandra.core.cql.keyspace.DropKeyspaceSpecification;
import org.springframework.data.cassandra.core.cql.keyspace.KeyspaceOption;

@Configuration
public class CreateKeyspaceConfiguration extends AbstractCassandraConfiguration implements BeanClassLoaderAware {

	@Override
	protected List<CreateKeyspaceSpecification> getKeyspaceCreations() {

		CreateKeyspaceSpecification specification = CreateKeyspaceSpecification.createKeyspace("my_keyspace")
				.with(KeyspaceOption.DURABLE_WRITES, true)
				.withNetworkReplication(DataCenterReplication.of("foo", 1), DataCenterReplication.of("bar", 2));

		return Arrays.asList(specification);
	}

	@Override
	protected List<DropKeyspaceSpecification> getKeyspaceDrops() {
		return Arrays.asList(DropKeyspaceSpecification.dropKeyspace("my_keyspace"));
	}

	// ...

	@Override
	protected String getKeyspaceName() {
		return null;
	}
}
XML
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:cassandra="http://www.springframework.org/schema/data/cassandra"
  xsi:schemaLocation="
    http://www.springframework.org/schema/data/cassandra
    https://www.springframework.org/schema/data/cassandra/spring-cassandra.xsd
    http://www.springframework.org/schema/beans
    https://www.springframework.org/schema/beans/spring-beans.xsd">

    <cassandra:session>

        <cassandra:keyspace action="CREATE_DROP" durable-writes="true" name="my_keyspace">
            <cassandra:replication class="NETWORK_TOPOLOGY_STRATEGY">
                <cassandra:data-center name="foo" replication-factor="1" />
                <cassandra:data-center name="bar" replication-factor="2" />
            </cassandra:replication>
        </cassandra:keyspace>

    </cassandra:session>
</beans>

Keyspace creation allows rapid bootstrapping without the need for external keyspace management. This can be useful in certain scenarios but should be used with care. Dropping a keyspace on application shutdown removes the keyspace and all data in its tables.

Initializing a SessionFactory

The org.springframework.data.cassandra.core.cql.session.init package provides support for initializing an existing SessionFactory. You may sometimes need to initialize a keyspace that runs on a server somewhere.

Initializing a Keyspace

You can provide arbitrary CQL that is executed in the configured keyspace on CqlSession initialization and shutdown, as the following example shows:

Java
package org.springframework.data.cassandra.example;

import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;
import org.springframework.data.cassandra.core.cql.session.init.KeyspacePopulator;
import org.springframework.data.cassandra.core.cql.session.init.ResourceKeyspacePopulator;
import org.springframework.lang.Nullable;

@Configuration
public class KeyspacePopulatorConfiguration extends AbstractCassandraConfiguration {

	@Nullable
	@Override
	protected KeyspacePopulator keyspacePopulator() {
		return new ResourceKeyspacePopulator(new ClassPathResource("com/foo/cql/db-schema.cql"),
				new ClassPathResource("com/foo/cql/db-test-data.cql"));
	}

	@Nullable
	@Override
	protected KeyspacePopulator keyspaceCleaner() {
		return new ResourceKeyspacePopulator(scriptOf("DROP TABLE my_table;"));
	}

	// ...

	@Override
	protected String getKeyspaceName() {
		return null;
	}
}
XML
<cassandra:initialize-keyspace session-factory-ref="cassandraSessionFactory">
    <cassandra:script location="classpath:com/foo/cql/db-schema.cql"/>
    <cassandra:script location="classpath:com/foo/cql/db-test-data.cql"/>
</cassandra:initialize-keyspace>

The preceding example runs the two specified scripts against the keyspace. The first script creates a schema, and the second populates tables with a test data set. The script locations can also be patterns with wildcards in the usual Ant style used for resources in Spring (for example, classpath*:/com/foo/**/cql/*-data.cql). If you use a pattern, the scripts are run in the lexical order of their URL or filename.
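
To illustrate the ordering rule, the following sketch sorts a few hypothetical script names the way a plain lexical comparison would. It uses only the JDK and does not call Spring's resource resolution; the class name ScriptOrderSketch and the file names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ScriptOrderSketch {

	// Returns a copy of the given names in lexical (String.compareTo) order.
	static List<String> lexicalOrder(List<String> names) {
		List<String> sorted = new ArrayList<>(names);
		Collections.sort(sorted);
		return sorted;
	}

	public static void main(String[] args) {
		// Hypothetical file names that a wildcard pattern might match.
		List<String> unpadded = List.of("2-test-data.cql", "10-more-data.cql", "1-schema.cql");
		System.out.println(lexicalOrder(unpadded));
		// prints [1-schema.cql, 10-more-data.cql, 2-test-data.cql]

		List<String> padded = List.of("02-test-data.cql", "10-more-data.cql", "01-schema.cql");
		System.out.println(lexicalOrder(padded));
		// prints [01-schema.cql, 02-test-data.cql, 10-more-data.cql]
	}
}
```

Because lexical order compares character by character, 10-more-data.cql sorts before 2-test-data.cql. Zero-padded prefixes such as 01- and 02- are one way to keep the intended order.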

The default behavior of the keyspace initializer is to unconditionally run the provided scripts. This may not always be what you want — for instance, if you run the scripts against a keyspace that already has test data in it. The likelihood of accidentally deleting data is reduced by following the common pattern (shown earlier) of creating the tables first and then inserting the data. The first step fails if the tables already exist.

However, to gain more control over the creation and deletion of existing data, the XML namespace provides a few additional options. The first is a flag to switch the initialization on and off. You can set this according to the environment (such as pulling a boolean value from system properties or from an environment bean). The following example gets a value from a system property:

<cassandra:initialize-keyspace session-factory-ref="cassandraSessionFactory"
    enabled="#{systemProperties.INITIALIZE_KEYSPACE}">    <!-- (1) -->
    <cassandra:script location="..."/>
</cassandra:initialize-keyspace>
(1) Get the value for enabled from a system property called INITIALIZE_KEYSPACE.

The second option to control what happens with existing data is to be more tolerant of failures. To this end, you can control the ability of the initializer to ignore certain errors in the CQL it executes from the scripts, as the following example shows:

Java
package org.springframework.data.cassandra.example;

import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;
import org.springframework.data.cassandra.core.cql.session.init.KeyspacePopulator;
import org.springframework.data.cassandra.core.cql.session.init.ResourceKeyspacePopulator;
import org.springframework.lang.Nullable;

@Configuration
public class KeyspacePopulatorFailureConfiguration extends AbstractCassandraConfiguration {

	@Nullable
	@Override
	protected KeyspacePopulator keyspacePopulator() {

		ResourceKeyspacePopulator populator = new ResourceKeyspacePopulator(
				new ClassPathResource("com/foo/cql/db-schema.cql"));

		populator.setIgnoreFailedDrops(true);

		return populator;
	}

	// ...

	@Override
	protected String getKeyspaceName() {
		return null;
	}
}
XML
<cassandra:initialize-keyspace session-factory-ref="cassandraSessionFactory" ignore-failures="DROPS">
    <cassandra:script location="..."/>
</cassandra:initialize-keyspace>

The preceding example expects that the scripts are sometimes run against an empty keyspace, in which case the DROP statements in the scripts fail. Failed CQL DROP statements are therefore ignored, but other failures cause an exception. This is useful if you do not want to use DROP … IF EXISTS (or similar) but you want to unconditionally remove all test data before re-creating it. In that case, the first script is usually a set of DROP statements, followed by a set of CREATE statements.

The ignore-failures option can be set to NONE (the default), DROPS (ignore failed drops), or ALL (ignore all failures).
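
The three settings can be pictured as a small decision rule. The following sketch is a simplified model written for illustration, not the library's implementation: the IgnoreFailuresSketch class and its isIgnorable method are hypothetical, and treating every statement that starts with DROP as a drop is an assumption.

```java
import java.util.Locale;

public class IgnoreFailuresSketch {

	// Mirrors the three values of the ignore-failures option described above.
	enum IgnoreFailures { NONE, DROPS, ALL }

	// Decides whether a statement that has failed may be skipped under the given mode.
	// Assumption: a "drop" is any statement whose text starts with DROP (case-insensitive).
	static boolean isIgnorable(IgnoreFailures mode, String failedStatement) {
		switch (mode) {
			case ALL:
				return true;
			case DROPS:
				return failedStatement.trim().toUpperCase(Locale.ROOT).startsWith("DROP");
			default:
				return false;
		}
	}

	public static void main(String[] args) {
		String drop = "DROP TABLE my_table";
		String create = "CREATE TABLE my_table (id uuid PRIMARY KEY)";
		for (IgnoreFailures mode : IgnoreFailures.values()) {
			System.out.println(mode + ": failed drop ignored=" + isIgnorable(mode, drop)
					+ ", failed create ignored=" + isIgnorable(mode, create));
		}
	}
}
```

In this model, only ALL lets a failed CREATE pass silently, which is why DROPS is the narrower choice when a script begins with DROP statements that may hit an empty keyspace.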

Statements should be separated by ; or, if the script contains no ; character at all, by a new line. You can control the separator globally or script by script, as the following example shows:

Java
package org.springframework.data.cassandra.example;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.data.cassandra.SessionFactory;
import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;
import org.springframework.data.cassandra.core.cql.session.init.CompositeKeyspacePopulator;
import org.springframework.data.cassandra.core.cql.session.init.ResourceKeyspacePopulator;
import org.springframework.data.cassandra.core.cql.session.init.SessionFactoryInitializer;

@Configuration
public class SessionFactoryInitializerConfiguration extends AbstractCassandraConfiguration {

	@Bean
	SessionFactoryInitializer sessionFactoryInitializer(SessionFactory sessionFactory) {

		SessionFactoryInitializer initializer = new SessionFactoryInitializer();
		initializer.setSessionFactory(sessionFactory);

		ResourceKeyspacePopulator populator1 = new ResourceKeyspacePopulator();
		populator1.setSeparator(";");
		populator1.setScripts(new ClassPathResource("com/myapp/cql/db-schema.cql"));

		ResourceKeyspacePopulator populator2 = new ResourceKeyspacePopulator();
		populator2.setSeparator("@@");
		// ClassPathResource takes a plain class path location, without the "classpath:" prefix
		populator2.setScripts(new ClassPathResource("com/myapp/cql/db-test-data-1.cql"),
				new ClassPathResource("com/myapp/cql/db-test-data-2.cql"));

		initializer.setKeyspacePopulator(new CompositeKeyspacePopulator(populator1, populator2));

		return initializer;
	}

	// ...

	@Override
	protected String getKeyspaceName() {
		return null;
	}
}
XML
<cassandra:initialize-keyspace session-factory-ref="cassandraSessionFactory" separator="@@">
    <cassandra:script location="classpath:com/myapp/cql/db-schema.cql" separator=";"/>
    <cassandra:script location="classpath:com/myapp/cql/db-test-data-1.cql"/>
    <cassandra:script location="classpath:com/myapp/cql/db-test-data-2.cql"/>
</cassandra:initialize-keyspace>

In this example, the two test-data scripts use @@ as statement separator and only the db-schema.cql uses ;. This configuration specifies that the default separator is @@ and overrides that default for the db-schema script.
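
The effect of a separator can be sketched with a naive splitter. This is an illustrative model that uses only the JDK, not the library's parser: the StatementSplitSketch class and its split method are hypothetical, and the sketch ignores comments and quoting rules that a real script parser has to handle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class StatementSplitSketch {

	static final String DEFAULT_SEPARATOR = ";";

	// Splits a script into trimmed, non-empty statements.
	// If the default separator is requested but the script does not contain it,
	// the sketch falls back to splitting on new lines.
	static List<String> split(String script, String separator) {
		String effective = separator;
		if (DEFAULT_SEPARATOR.equals(separator) && !script.contains(separator)) {
			effective = "\n";
		}
		List<String> statements = new ArrayList<>();
		for (String part : script.split(Pattern.quote(effective))) {
			String trimmed = part.trim();
			if (!trimmed.isEmpty()) {
				statements.add(trimmed);
			}
		}
		return statements;
	}

	public static void main(String[] args) {
		// A schema script that uses ';' between statements.
		System.out.println(split("CREATE TABLE a (id int PRIMARY KEY);\nCREATE TABLE b (id int PRIMARY KEY);", ";"));
		// A data script that uses '@@', so a ';' inside a value does not split the statement.
		System.out.println(split("INSERT INTO t (id, txt) VALUES (1, 'a;b')@@\nINSERT INTO t (id, txt) VALUES (2, 'c')", "@@"));
	}
}
```

A custom separator such as @@ is mainly useful when the data itself may contain ; characters, as in the second call above.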

If you need more control than you get from the XML namespace, you can use the SessionFactoryInitializer directly and define it as a component in your application.

Initialization of Other Components that Depend on the Keyspace

A large class of applications (those that do not use the database until after the Spring context has started) can use the keyspace initializer with no further complications. If your application is not one of those, you might need to read the rest of this section.

The keyspace initializer depends on a SessionFactory instance and runs the provided scripts in its initialization callback (analogous to an init-method in an XML bean definition, a @PostConstruct method in a component, or the afterPropertiesSet() method in a component that implements InitializingBean). If other beans depend on the same SessionFactory and use it in an initialization callback, there might be a problem because the data has not yet been initialized. A common example of this is a cache that initializes eagerly and loads data from the database on application startup.

To get around this issue, you have two options: change your cache initialization strategy to a later phase or ensure that the keyspace initializer is initialized first.

Changing your cache initialization strategy might be easy if the application is under your control, and difficult otherwise. Some suggestions for how to implement this include:

  • Make the cache initialize lazily on first usage, which improves application startup time.

  • Have your cache or a separate component that initializes the cache implement Lifecycle or SmartLifecycle. When the application context starts, you can automatically start a SmartLifecycle by setting its autoStartup flag, and you can manually start a Lifecycle by calling ConfigurableApplicationContext.start() on the enclosing context.

  • Use a Spring ApplicationEvent or similar custom observer mechanism to trigger the cache initialization. ContextRefreshedEvent is always published by the context when it is ready for use (after all beans have been initialized), so that is often a useful hook (this is how the SmartLifecycle works by default).
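
The first suggestion, lazy initialization on first usage, can be sketched with the JDK alone. The LazyCacheSketch class, its nested LazyCache type, and the loader function are hypothetical names for this illustration; in an application, the loader would query Cassandra through the session factory.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyCacheSketch {

	// A cache that loads its entries on the first lookup instead of at construction time.
	static final class LazyCache<K, V> {

		private final Supplier<Map<K, V>> loader;
		private volatile Map<K, V> entries;

		LazyCache(Supplier<Map<K, V>> loader) {
			this.loader = loader;
		}

		V get(K key) {
			Map<K, V> local = entries;
			if (local == null) {
				// Double-checked locking so that the loader runs at most once.
				synchronized (this) {
					local = entries;
					if (local == null) {
						local = Map.copyOf(loader.get());
						entries = local;
					}
				}
			}
			return local.get(key);
		}
	}

	public static void main(String[] args) {
		AtomicInteger loads = new AtomicInteger();
		// Hypothetical loader; stands in for a query against the initialized keyspace.
		LazyCache<String, String> cache = new LazyCache<>(() -> {
			loads.incrementAndGet();
			return Map.of("k", "v");
		});
		System.out.println(loads.get());    // prints 0: nothing is loaded at construction time
		System.out.println(cache.get("k")); // prints v: the first lookup triggers the load
		cache.get("k");
		System.out.println(loads.get());    // prints 1: the second lookup reuses the loaded entries
	}
}
```

Because no data is read until the first lookup, the cache bean can be created before the keyspace initializer has run, as long as nothing calls get during context startup.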

Ensuring that the keyspace initializer is initialized first can also be easy. Some suggestions on how to implement this include:

  • Rely on the default behavior of the Spring BeanFactory, which is that beans are initialized in registration order. You can easily arrange that by adopting the common practice of a set of <import/> elements in XML configuration that order your application modules and ensuring that the database and database initialization are listed first.

  • Separate the SessionFactory and the business components that use it and control their startup order by putting them in separate ApplicationContext instances (for example, the parent context contains the SessionFactory, and the child context contains the business components). This structure is common in Spring web applications but can be more generally applied.

  • Use the Schema management for Tables and User-defined Types to initialize the keyspace using Spring Data Cassandra’s built-in schema generator.

Tables and User-defined Types

Spring Data for Apache Cassandra approaches data access with mapped entity classes that fit your data model. You can use these entity classes to create Cassandra table specifications and user type definitions.

Schema creation is tied to CqlSession initialization by SchemaAction. The following actions are supported:

  • SchemaAction.NONE: No tables or types are created or dropped. This is the default setting.

  • SchemaAction.CREATE: Create tables, indexes, and user-defined types from entities annotated with @Table and types annotated with @UserDefinedType. Creating a table or type that already exists causes an error.

  • SchemaAction.CREATE_IF_NOT_EXISTS: Like SchemaAction.CREATE but with IF NOT EXISTS applied. Existing tables or types do not cause any errors but may remain stale.

  • SchemaAction.RECREATE: Drops and recreates existing tables and types that are known to be used. Tables and types that are not configured in the application are not dropped.

  • SchemaAction.RECREATE_DROP_UNUSED: Drops all tables and types and recreates only known tables and types.

SchemaAction.RECREATE and SchemaAction.RECREATE_DROP_UNUSED drop your tables and lose all data. RECREATE_DROP_UNUSED also drops tables and types that are not known to the application.
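
As a minimal sketch, assuming a configuration class that extends AbstractCassandraConfiguration, a schema action can be selected by overriding getSchemaAction(). The class name SchemaActionConfiguration and the keyspace name my_keyspace are placeholders chosen for this example.

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;
import org.springframework.data.cassandra.config.SchemaAction;

@Configuration
public class SchemaActionConfiguration extends AbstractCassandraConfiguration {

	// Create missing tables and types on startup, and leave existing ones untouched.
	@Override
	public SchemaAction getSchemaAction() {
		return SchemaAction.CREATE_IF_NOT_EXISTS;
	}

	// Placeholder keyspace name; replace with the keyspace your application uses.
	@Override
	protected String getKeyspaceName() {
		return "my_keyspace";
	}
}
```

CREATE_IF_NOT_EXISTS is used here because it does not drop data. The RECREATE variants are better kept to throwaway test keyspaces.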

Enabling Tables and User-Defined Types for Schema Management

Metadata-based Mapping explains object mapping with conventions and annotations. To prevent unwanted classes from being created as a table or a type, schema management is only active for entities annotated with @Table and user-defined types annotated with @UserDefinedType. Entities are discovered by scanning the classpath. Entity scanning requires one or more base packages. Tuple-typed columns that use TupleValue do not provide any typing details. Consequently, you must annotate such column properties with @CassandraType(type = TUPLE, typeArguments = …) to specify the desired column type.

The following example shows how to specify entity base packages in Java and XML configuration:

Example 2. Specifying entity base packages
Java
package org.springframework.data.cassandra.example;

import org.springframework.context.annotation.Configuration;
import org.springframework.data.cassandra.config.AbstractCassandraConfiguration;

@Configuration
public class EntityBasePackagesConfiguration extends AbstractCassandraConfiguration {

	@Override
	public String[] getEntityBasePackages() {
		return new String[] { "com.foo", "com.bar" };
	}

	// ...

	@Override
	protected String getKeyspaceName() {
		return null;
	}
}
XML
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:cassandra="http://www.springframework.org/schema/data/cassandra"
  xsi:schemaLocation="
    http://www.springframework.org/schema/data/cassandra
    https://www.springframework.org/schema/data/cassandra/spring-cassandra.xsd
    http://www.springframework.org/schema/beans
    https://www.springframework.org/schema/beans/spring-beans.xsd">

    <cassandra:mapping entity-base-packages="com.foo,com.bar"/>
</beans>