Elasticsearch Object Mapping

Spring Data Elasticsearch object mapping is the process that maps a Java object (the domain entity) to the JSON representation stored in Elasticsearch and back. The class used internally for this mapping is MappingElasticsearchConverter.

Meta Model Object Mapping

The metamodel-based approach uses domain type information for reading from and writing to Elasticsearch. This allows registering Converter instances for specific domain type mappings.

Mapping Annotation Overview

The MappingElasticsearchConverter uses metadata to drive the mapping of objects to documents. The metadata is taken from the entity's properties, which can be annotated.

The following annotations are available:

  • @Document: Applied at the class level to indicate this class is a candidate for mapping to the database. The most important attributes are (check the API documentation for the complete list of attributes):

    • indexName: the name of the index to store this entity in. This can contain a SpEL template expression like "log-#{T(java.time.LocalDate).now().toString()}"

    • createIndex: flag whether to create an index on repository bootstrapping. Default value is true. See Automatic creation of indices with the corresponding mapping

  • @Id: Applied at the field level to mark the field used for identity purpose.

  • @Transient, @ReadOnlyProperty, @WriteOnlyProperty: see the following section Controlling which properties are written to and read from Elasticsearch for detailed information.

  • @PersistenceConstructor: Marks a given constructor - even a package protected one - to use when instantiating the object from the database. Constructor arguments are mapped by name to the key values in the retrieved Document.

  • @Field: Applied at the field level and defines properties of the field, most of the attributes map to the respective Elasticsearch Mapping definitions (the following list is not complete, check the annotation Javadoc for a complete reference):

    • name: The name of the field as it will be represented in the Elasticsearch document, if not set, the Java field name is used.

    • type: The field type, can be one of Text, Keyword, Long, Integer, Short, Byte, Double, Float, Half_Float, Scaled_Float, Date, Date_Nanos, Boolean, Binary, Integer_Range, Float_Range, Long_Range, Double_Range, Date_Range, Ip_Range, Object, Nested, Ip, TokenCount, Percolator, Flattened, Search_As_You_Type. See Elasticsearch Mapping Types. If the field type is not specified, it defaults to FieldType.Auto. This means that no mapping entry is written for the property and that Elasticsearch adds a mapping entry dynamically when the first data for this property is stored (check the Elasticsearch documentation for dynamic mapping rules).

    • format: One or more built-in date formats, see the next section Date format mapping.

    • pattern: One or more custom date formats, see the next section Date format mapping.

    • store: Flag whether the original field value should be stored in Elasticsearch, default value is false.

    • analyzer, searchAnalyzer, normalizer for specifying custom analyzers and normalizer.

  • @GeoPoint: Marks a field as geo_point datatype. Can be omitted if the field is an instance of the GeoPoint class.

  • @ValueConverter: Defines a class to be used to convert the given property. Unlike a registered Spring Converter, this only converts the annotated property and not every property of the given type.
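
As a rough illustration of how these annotations combine, the following sketch declares an entity. The class name Article, the index name "articles", and all field names are invented for this example; only the annotations and attributes come from the list above. It assumes Spring Data Elasticsearch is on the classpath and is not meant as a recommended mapping.

```java
import java.time.LocalDate;

import org.springframework.data.annotation.Id;
import org.springframework.data.annotation.Transient;
import org.springframework.data.elasticsearch.annotations.DateFormat;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

// Hypothetical entity: class, index and field names are assumptions.
@Document(indexName = "articles")
public class Article {

  // Marks the identity field of the document.
  @Id
  private String id;

  // Analyzed full-text field.
  @Field(type = FieldType.Text)
  private String title;

  // Stored under a different name than the Java field.
  @Field(name = "author_name", type = FieldType.Keyword)
  private String author;

  // Date field using one of the built-in formats.
  @Field(type = FieldType.Date, format = DateFormat.basic_date)
  private LocalDate published;

  // Neither written to the mapping nor sent to Elasticsearch.
  @Transient
  private String cachedSummary;

  // getters and setters omitted
}
```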

The mapping metadata infrastructure is defined in a separate spring-data-commons project that is technology agnostic.

Controlling which properties are written to and read from Elasticsearch

This section details the annotations that define whether the value of a property is written to or read from Elasticsearch.

@Transient: A property annotated with this annotation will not be written to the mapping, its value will not be sent to Elasticsearch, and when documents are returned from Elasticsearch, this property will not be set in the resulting entity.

@ReadOnlyProperty: A property with this annotation will not have its value written to Elasticsearch, but when returning data, the property will be filled with the value returned in the document from Elasticsearch. One use case is runtime fields defined in the index mapping.

@WriteOnlyProperty: A property with this annotation will have its value stored in Elasticsearch but will not be set with any value when reading a document. This can be used, for example, for synthesized fields that should go into the Elasticsearch index but are not used elsewhere.

Date format mapping

Properties that derive from TemporalAccessor or are of type java.util.Date must either have a @Field annotation of type FieldType.Date, or a custom converter must be registered for this type. This section describes the use of FieldType.Date.

There are two attributes of the @Field annotation that define which date format information is written to the mapping (also see Elasticsearch Built In Formats and Elasticsearch Custom Date Formats).

The format attribute is used to define at least one of the predefined formats. If it is not defined, then a default value of date_optional_time and epoch_millis is used.

The pattern attribute can be used to add additional custom format strings. If you want to use only custom date formats, you must set the format attribute to an empty array {}.

The following table shows the different attributes and the mapping created from their values:

annotation                                                                          format string in Elasticsearch mapping

@Field(type=FieldType.Date)                                                         "date_optional_time||epoch_millis"
@Field(type=FieldType.Date, format=DateFormat.basic_date)                           "basic_date"
@Field(type=FieldType.Date, format={DateFormat.basic_date, DateFormat.basic_time})  "basic_date||basic_time"
@Field(type=FieldType.Date, pattern="dd.MM.uuuu")                                   "date_optional_time||epoch_millis||dd.MM.uuuu"
@Field(type=FieldType.Date, format={}, pattern="dd.MM.uuuu")                        "dd.MM.uuuu"

If you are using a custom date format, you need to use uuuu for the year instead of yyyy. This is due to a change in Elasticsearch 7.
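
The difference between the two year letters can be seen with plain java.time, assuming the change referred to is the move to java.time-style patterns, where u means the proleptic year and y means year-of-era. The class and method names below are invented for this sketch; it only uses the JDK and does not call Elasticsearch.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class YearPatternDemo {

  // Formats a date with the given java.time pattern (hypothetical helper).
  public static String format(LocalDate date, String pattern) {
    return DateTimeFormatter.ofPattern(pattern, Locale.ROOT).format(date);
  }

  public static void main(String[] args) {
    LocalDate modern = LocalDate.of(2024, 3, 5);
    LocalDate beforeYearOne = LocalDate.of(-1, 3, 5);

    // For current dates both patterns give the same text: 05.03.2024
    System.out.println(format(modern, "dd.MM.uuuu"));
    System.out.println(format(modern, "dd.MM.yyyy"));

    // For years before year 1 the two patterns diverge, because
    // yyyy prints the year-of-era and drops the era information.
    System.out.println(format(beforeYearOne, "dd.MM.uuuu"));
    System.out.println(format(beforeYearOne, "dd.MM.yyyy"));
  }
}
```

With ordinary dates the two letters are indistinguishable, which is why a yyyy pattern can appear to work until an unusual value shows up.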

Check the code of the org.springframework.data.elasticsearch.annotations.DateFormat enum for a complete list of predefined values and their patterns.

Range types

When a field is annotated with one of the types Integer_Range, Float_Range, Long_Range, Double_Range, Date_Range, or Ip_Range, the field must be an instance of a class that is mapped to an Elasticsearch range, for example:

class SomePersonData {

    @Field(type = FieldType.Integer_Range)
    private ValidAge validAge;

    // getter and setter
}

class ValidAge {
    @Field(name="gte")
    private Integer from;

    @Field(name="lte")
    private Integer to;

    // getter and setter
}

As an alternative, Spring Data Elasticsearch provides a Range<T> class, so the previous example can be written as:

class SomePersonData {

    @Field(type = FieldType.Integer_Range)
    private Range<Integer> validAge;

    // getter and setter
}

Supported classes for the type <T> are Integer, Long, Float, Double, Date and classes that implement the TemporalAccessor interface.
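
To picture what such a range property looks like once it is turned into a document, the following sketch builds the nested structure by hand using the gte and lte keys from the ValidAge example. The class and the closedRange helper are hypothetical and exist only for illustration; the real conversion is done by the converter, not by code like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RangeDocumentSketch {

  // Hypothetical helper: map form of a closed integer range,
  // keyed by gte (lower bound) and lte (upper bound).
  public static Map<String, Object> closedRange(int from, int to) {
    if (from > to) {
      throw new IllegalArgumentException("from must not exceed to");
    }
    Map<String, Object> range = new LinkedHashMap<>();
    range.put("gte", from);
    range.put("lte", to);
    return range;
  }

  public static void main(String[] args) {
    Map<String, Object> document = new LinkedHashMap<>();
    document.put("validAge", closedRange(18, 65));
    System.out.println(document); // prints {validAge={gte=18, lte=65}}
  }
}
```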

Mapped field names

Without further configuration, Spring Data Elasticsearch uses the property name of an object as the field name in Elasticsearch. This can be changed for individual fields by using the @Field annotation on that property.

It is also possible to define a FieldNamingStrategy in the configuration of the client (Elasticsearch Clients). If, for example, a SnakeCaseFieldNamingStrategy is configured, the property sampleProperty of the object would be mapped to sample_property in Elasticsearch. A FieldNamingStrategy applies to all entities; it can be overridden by setting a specific name with @Field on a property.
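
The renaming in the sampleProperty example can be sketched in plain Java. This is an illustration of the idea only, not the library's SnakeCaseFieldNamingStrategy; the class and method below are hypothetical, and the real strategy may treat edge cases such as acronyms or digits differently.

```java
public class SnakeCaseSketch {

  // Illustrative conversion: puts '_' before each upper-case letter
  // (except at position 0) and lower-cases that letter.
  public static String toSnakeCase(String propertyName) {
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < propertyName.length(); i++) {
      char c = propertyName.charAt(i);
      if (Character.isUpperCase(c)) {
        if (i > 0) {
          result.append('_');
        }
        result.append(Character.toLowerCase(c));
      } else {
        result.append(c);
      }
    }
    return result.toString();
  }

  public static void main(String[] args) {
    System.out.println(toSnakeCase("sampleProperty")); // prints sample_property
  }
}
```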

Non-field-backed properties

Normally the properties used in an entity are fields of the entity class. There may be cases where a property value is calculated in the entity and should be stored in Elasticsearch. In this case, the getter method (getProperty()) can be annotated with the @Field annotation; in addition, the method must be annotated with @AccessType(AccessType.Type.PROPERTY). The third annotation needed in such a case is @WriteOnlyProperty, because such a value is only written to Elasticsearch. A full example:

@Field(type = Keyword)
@WriteOnlyProperty
@AccessType(AccessType.Type.PROPERTY)
public String getProperty() {
	return "some value that is calculated here";
}

Other property annotations

@IndexedIndexName

This annotation can be set on a String property of an entity. This property will not be written to the mapping, it will not be stored in Elasticsearch and its value will not be read from an Elasticsearch document. After an entity is persisted, for example with a call to ElasticsearchOperations.save(T entity), the entity returned from that call will contain the name of the index that an entity was saved to in that property. This is useful when the index name is dynamically set by a bean, or when writing to a write alias.

Putting some value into such a property does not set the index into which an entity is stored!

Mapping Rules

Type Hints

Mapping uses type hints embedded in the document sent to the server to allow generic type mapping. Those type hints are represented as _class attributes within the document and are written for each aggregate root.

Example 1. Type Hints
public class Person {              1
  @Id String id;
  String firstname;
  String lastname;
}
{
  "_class" : "com.example.Person", 1
  "id" : "cb7bef",
  "firstname" : "Sarah",
  "lastname" : "Connor"
}
1 By default the domain type's class name is used for the type hint.

Type hints can be configured to hold custom information. Use the @TypeAlias annotation to do so.

Make sure to add types with @TypeAlias to the initial entity set (AbstractElasticsearchConfiguration#getInitialEntitySet) to already have entity information available when first reading data from the store.

Example 2. Type Hints with Alias
@TypeAlias("human")                1
public class Person {

  @Id String id;
  // ...
}
{
  "_class" : "human",              1
  "id" : ...
}
1 The configured alias is used when writing the entity.
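
Why aliased types should be known up front can be pictured with a small lookup table: a default type hint is a class name that can be loaded directly, while an alias such as "human" only resolves if the alias has been registered beforehand. The registry below is a simplified, hypothetical model and not the converter's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class TypeAliasRegistrySketch {

  // Stand-in for a domain type that carries the alias "human".
  public static class Person {}

  private final Map<String, Class<?>> aliases = new HashMap<>();

  // Corresponds loosely to adding a type to the initial entity set.
  public void register(String alias, Class<?> type) {
    aliases.put(alias, type);
  }

  // Resolves a _class value: first by alias, then as a class name.
  public Optional<Class<?>> resolve(String typeHint) {
    Class<?> byAlias = aliases.get(typeHint);
    if (byAlias != null) {
      return Optional.of(byAlias);
    }
    try {
      return Optional.of(Class.forName(typeHint));
    } catch (ClassNotFoundException e) {
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    TypeAliasRegistrySketch registry = new TypeAliasRegistrySketch();
    System.out.println(registry.resolve("human").isPresent()); // prints false
    registry.register("human", Person.class);
    System.out.println(registry.resolve("human").isPresent()); // prints true
  }
}
```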

Type hints will not be written for nested objects unless the property's type is Object or an interface, or the actual value type does not match the property's declaration.

Disabling Type Hints

It may be necessary to disable writing of type hints when the index that should be used already exists without having the type hints defined in its mapping and with the mapping mode set to strict. In this case, writing the type hint will produce an error, as the field cannot be added automatically.

Type hints can be disabled for the whole application by overriding the method writeTypeHints() in a configuration class derived from AbstractElasticsearchConfiguration (see Elasticsearch Clients).

As an alternative, they can be disabled for a single index with the @Document annotation:

@Document(indexName = "index", writeTypeHint = WriteTypeHint.FALSE)

We strongly advise against disabling type hints. Only do this if you are forced to. Disabling type hints can lead to documents not being retrieved correctly from Elasticsearch in the case of polymorphic data, or document retrieval may fail completely.

Geospatial Types

Geospatial types like Point and GeoPoint are converted into lat/lon pairs.

Example 3. Geospatial types
public class Address {
  String city, street;
  Point location;
}
{
  "city" : "Los Angeles",
  "street" : "2800 East Observatory Road",
  "location" : { "lat" : 34.118347, "lon" : -118.3026284 }
}

GeoJson Types

Spring Data Elasticsearch supports the GeoJson types by providing an interface GeoJson and implementations for the different geometries. They are mapped to Elasticsearch documents according to the GeoJson specification. The corresponding properties of the entity are specified as geo_shape in the index mappings when the index mappings are written. (See the Elasticsearch documentation as well.)

Example 4. GeoJson types
public class Address {

  String city, street;
  GeoJsonPoint location;
}
{
  "city": "Los Angeles",
  "street": "2800 East Observatory Road",
  "location": {
    "type": "Point",
    "coordinates": [-118.3026284, 34.118347]
  }
}
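
Note that the two representations order the coordinates differently: the lat/lon object in Example 3 names each value, while a GeoJson Point lists longitude first and latitude second. The sketch below builds both shapes by hand to make the ordering explicit; the class and method names are hypothetical and the code does not use the library's GeoJsonPoint class.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class GeoShapesSketch {

  // lat/lon object as in the geo_point example.
  public static Map<String, Object> latLon(double lat, double lon) {
    Map<String, Object> point = new LinkedHashMap<>();
    point.put("lat", lat);
    point.put("lon", lon);
    return point;
  }

  // GeoJson Point: coordinates are [longitude, latitude].
  public static Map<String, Object> geoJsonPoint(double lat, double lon) {
    Map<String, Object> point = new LinkedHashMap<>();
    point.put("type", "Point");
    point.put("coordinates", Arrays.asList(lon, lat));
    return point;
  }

  public static void main(String[] args) {
    System.out.println(latLon(34.118347, -118.3026284));
    System.out.println(geoJsonPoint(34.118347, -118.3026284));
  }
}
```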

The following GeoJson types are implemented:

  • GeoJsonPoint

  • GeoJsonMultiPoint

  • GeoJsonLineString

  • GeoJsonMultiLineString

  • GeoJsonPolygon

  • GeoJsonMultiPolygon

  • GeoJsonGeometryCollection

Collections

Values inside collections follow the same mapping rules as aggregate roots when it comes to type hints and custom conversions.

Example 5. Collections
public class Person {

  // ...

  List<Person> friends;

}
{
  // ...

  "friends" : [ { "firstname" : "Kyle", "lastname" : "Reese" } ]
}

Maps

Values inside maps follow the same mapping rules as aggregate roots when it comes to type hints and custom conversions. However, the map key needs to be a String to be processed by Elasticsearch.

Example 6. Maps
public class Person {

  // ...

  Map<String, Address> knownLocations;

}
{
  // ...

  "knownLocations" : {
    "arrivedAt" : {
       "city" : "Los Angeles",
       "street" : "2800 East Observatory Road",
       "location" : { "lat" : 34.118347, "lon" : -118.3026284 }
     }
  }
}

Custom Conversions

Looking at the Configuration from the previous section, ElasticsearchCustomConversions allows registering specific rules for mapping domain and simple types.

Example 7. Meta Model Object Mapping Configuration
@Configuration
public class Config extends ElasticsearchConfiguration  {

  @NonNull
  @Override
  public ClientConfiguration clientConfiguration() {
    return ClientConfiguration.builder()
        .connectedTo("localhost:9200")
        .build();
  }

  @Bean
  @Override
  public ElasticsearchCustomConversions elasticsearchCustomConversions() {
    return new ElasticsearchCustomConversions(
      Arrays.asList(new AddressToMap(), new MapToAddress()));       1
  }

  @WritingConverter                                                 2
  static class AddressToMap implements Converter<Address, Map<String, Object>> {

    @Override
    public Map<String, Object> convert(Address source) {

      LinkedHashMap<String, Object> target = new LinkedHashMap<>();
      target.put("ciudad", source.getCity());
      // ...

      return target;
    }
  }

  @ReadingConverter                                                 3
  static class MapToAddress implements Converter<Map<String, Object>, Address> {

    @Override
    public Address convert(Map<String, Object> source) {

      // ...
      return address;
    }
  }
}
{
  "ciudad" : "Los Angeles",
  "calle" : "2800 East Observatory Road",
  "localidad" : { "lat" : 34.118347, "lon" : -118.3026284 }
}
1 Add Converter implementations.
2 Set up the Converter used for writing DomainType to Elasticsearch.
3 Set up the Converter used for reading DomainType from search result.