Configuring a Step for Restart

Restarting a Job was discussed in the “Configuring and Running a Job” section. A restart has numerous impacts on steps and may therefore require some specific configuration.

Setting a Start Limit

There are many scenarios where you may want to control the number of times a Step can be started. For example, you might need to configure a particular Step so that it runs only once, because it invalidates some resource that must be fixed manually before the Step can be run again. The start limit is configurable at the step level, since different steps may have different requirements. A Step that can be executed only once can exist in the same Job as a Step that can be run any number of times.

Java

The following code fragment shows an example of a start limit configuration in Java:

Java Configuration
@Bean
public Step step1(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
	return new StepBuilder("step1", jobRepository)
				.<String, String>chunk(10, transactionManager)
				.reader(itemReader())
				.writer(itemWriter())
				.startLimit(1)
				.build();
}
XML

The following code fragment shows an example of a start limit configuration in XML:

XML Configuration
<step id="step1">
    <tasklet start-limit="1">
        <chunk reader="itemReader" writer="itemWriter" commit-interval="10"/>
    </tasklet>
</step>

The step shown in the preceding example can be run only once. Attempting to run it again causes a StartLimitExceededException to be thrown. Note that the default value for the start-limit is Integer.MAX_VALUE.
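
The start-limit rule described above can be pictured with a small plain-Java model. This is only a sketch under stated assumptions, not Spring Batch source code: the class `StartLimitSketch`, the method `checkStartAllowed`, and the exception `StartLimitExceeded` are hypothetical names introduced for illustration, and the model assumes that a step may start only while its number of previous starts is below the limit.

```java
// Simplified model of a start-limit check. Not Spring Batch source code;
// all names in this class are hypothetical and exist only for illustration.
final class StartLimitSketch {

    /** Hypothetical stand-in for the framework's StartLimitExceededException. */
    static final class StartLimitExceeded extends RuntimeException {
        StartLimitExceeded(String message) {
            super(message);
        }
    }

    /** Default limit when none is configured, as stated in the text above. */
    static final int DEFAULT_START_LIMIT = Integer.MAX_VALUE;

    /**
     * Throws if the step has already been started as many times as the limit allows.
     *
     * @param previousStarts how many times the step was started before this attempt
     * @param startLimit     the configured start limit
     */
    static void checkStartAllowed(String stepName, int previousStarts, int startLimit) {
        if (previousStarts >= startLimit) {
            throw new StartLimitExceeded(
                    "Start limit of " + startLimit + " exceeded for step " + stepName);
        }
    }

    public static void main(String[] args) {
        // First start with a limit of 1: allowed.
        checkStartAllowed("step1", 0, 1);
        try {
            // Second start with a limit of 1: rejected.
            checkStartAllowed("step1", 1, 1);
        } catch (StartLimitExceeded e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the default limit of Integer.MAX_VALUE, the check in this model effectively never rejects a start, which matches the "run any number of times" behavior of a step that has no explicit limit.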

Restarting a Completed Step

In a restartable job, there may be one or more steps that should always be run, regardless of whether they succeeded the first time. Examples include a validation step or a Step that cleans up resources before processing. During normal processing of a restarted job, any step with a status of COMPLETED (meaning that it has already completed successfully) is skipped. Setting allow-start-if-complete to true overrides this behavior so that the step always runs.
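
The skip rule can be expressed as a small decision function. The following plain-Java sketch is a simplified model, not the framework's implementation: `RestartSkipSketch`, its `Status` enum, and `shouldRun` are hypothetical names, and the model covers only the cases named in the text (it ignores start limits and any other statuses).

```java
// Simplified model of the "skip completed steps on restart" rule.
// Not Spring Batch source code; all names here are hypothetical.
final class RestartSkipSketch {

    /** Only the outcomes needed for this illustration. */
    enum Status { COMPLETED, FAILED }

    /**
     * Decides whether a step should run when its job is (re)started.
     *
     * @param lastStatus           status of the step's previous execution,
     *                             or null if the step has never run
     * @param allowStartIfComplete the step's allow-start-if-complete setting
     */
    static boolean shouldRun(Status lastStatus, boolean allowStartIfComplete) {
        if (lastStatus == Status.COMPLETED) {
            // A completed step is skipped unless it is configured to always run.
            return allowStartIfComplete;
        }
        // A step that never ran, or that failed, is run on restart.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldRun(Status.COMPLETED, false)); // skipped by default
        System.out.println(shouldRun(Status.COMPLETED, true));  // always runs
        System.out.println(shouldRun(Status.FAILED, false));    // reruns after failure
    }
}
```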

Java

The following code fragment shows how to configure a step in Java so that it always runs when the job is restarted:

Java Configuration
@Bean
public Step step1(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
	return new StepBuilder("step1", jobRepository)
				.<String, String>chunk(10, transactionManager)
				.reader(itemReader())
				.writer(itemWriter())
				.allowStartIfComplete(true)
				.build();
}
XML

The following code fragment shows how to configure a step in XML so that it always runs when the job is restarted:

XML Configuration
<step id="step1">
    <tasklet allow-start-if-complete="true">
        <chunk reader="itemReader" writer="itemWriter" commit-interval="10"/>
    </tasklet>
</step>

Step Restart Configuration Example

Java

The following Java example shows how to configure a job to have steps that can be restarted:

Java Configuration
@Bean
public Job footballJob(JobRepository jobRepository, Step playerLoad, Step gameLoad, Step playerSummarization) {
	return new JobBuilder("footballJob", jobRepository)
				.start(playerLoad)
				.next(gameLoad)
				.next(playerSummarization)
				.build();
}

@Bean
public Step playerLoad(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
	return new StepBuilder("playerLoad", jobRepository)
			.<String, String>chunk(10, transactionManager)
			.reader(playerFileItemReader())
			.writer(playerWriter())
			.build();
}

@Bean
public Step gameLoad(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
	return new StepBuilder("gameLoad", jobRepository)
			.allowStartIfComplete(true)
			.<String, String>chunk(10, transactionManager)
			.reader(gameFileItemReader())
			.writer(gameWriter())
			.build();
}

@Bean
public Step playerSummarization(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
	return new StepBuilder("playerSummarization", jobRepository)
			.startLimit(2)
			.<String, String>chunk(10, transactionManager)
			.reader(playerSummarizationSource())
			.writer(summaryWriter())
			.build();
}
XML

The following XML example shows how to configure a job to have steps that can be restarted:

XML Configuration
<job id="footballJob" restartable="true">
    <step id="playerLoad" next="gameLoad">
        <tasklet>
            <chunk reader="playerFileItemReader" writer="playerWriter"
                   commit-interval="10" />
        </tasklet>
    </step>
    <step id="gameLoad" next="playerSummarization">
        <tasklet allow-start-if-complete="true">
            <chunk reader="gameFileItemReader" writer="gameWriter"
                   commit-interval="10"/>
        </tasklet>
    </step>
    <step id="playerSummarization">
        <tasklet start-limit="2">
            <chunk reader="playerSummarizationSource" writer="summaryWriter"
                   commit-interval="10"/>
        </tasklet>
    </step>
</job>

The preceding example configuration is for a job that loads information about football games and summarizes it. It contains three steps: playerLoad, gameLoad, and playerSummarization. The playerLoad step loads player information from a flat file, while the gameLoad step does the same for games. The final step, playerSummarization, then summarizes the statistics for each player, based upon the provided games. It is assumed that the file loaded by playerLoad must be loaded only once but that gameLoad can load any games found within a particular directory, deleting them after they have been successfully loaded into the database. As a result, the playerLoad step contains no additional configuration. It has no explicit start limit and is skipped if it has already completed. The gameLoad step, however, needs to run every time, in case extra files have been added since it last ran. It has allow-start-if-complete set to true so that it is always started. (It is assumed that the database table that games are loaded into has a process indicator on it, to ensure that new games can be properly found by the summarization step.) The summarization step, which is the most important in the job, is configured with a start limit of 2. This is useful because, if the step continually fails, a new exit code is returned to the operators that control job execution, and the step cannot start again until manual intervention has taken place.

This job provides an example for this document and is not the same as the footballJob found in the samples project.

The remainder of this section describes what happens for each of the three runs of the footballJob example.

Run 1:

  1. playerLoad runs and completes successfully, adding 400 players to the PLAYERS table.

  2. gameLoad runs and processes 11 files worth of game data, loading their contents into the GAMES table.

  3. playerSummarization begins processing and fails after 5 minutes.

Run 2:

  1. playerLoad does not run, since it has already completed successfully, and allow-start-if-complete is false (the default).

  2. gameLoad runs again and processes another 2 files, loading their contents into the GAMES table as well (with a process indicator indicating they have yet to be processed).

  3. playerSummarization begins processing of all remaining game data (filtering using the process indicator) and fails again after 30 minutes.

Run 3:

  1. playerLoad does not run, since it has already completed successfully, and allow-start-if-complete is false (the default).

  2. gameLoad runs again and processes another 2 files, loading their contents into the GAMES table as well (with a process indicator indicating they have yet to be processed).

  3. playerSummarization is not started and the job is immediately killed, since this is the third execution of playerSummarization, and its limit is only 2. Either the limit must be raised or the Job must be executed as a new JobInstance.
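
The three runs can be replayed with a small plain-Java simulation. This is a sketch under stated assumptions rather than Spring Batch code: `FootballJobRestartSketch`, `StepConfig`, and `run` are hypothetical names, step outcomes are scripted by the caller instead of being produced by real readers and writers, and the model assumes that the completed-step check is applied before the start-limit check and that a failed or rejected step ends the run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified simulation of the footballJob restart scenario.
// Not Spring Batch source code; all names here are hypothetical.
final class FootballJobRestartSketch {

    /** Restart-related settings of one step. */
    static final class StepConfig {
        final String name;
        final boolean allowStartIfComplete;
        final int startLimit;

        StepConfig(String name, boolean allowStartIfComplete, int startLimit) {
            this.name = name;
            this.allowStartIfComplete = allowStartIfComplete;
            this.startLimit = startLimit;
        }
    }

    private final List<StepConfig> steps;
    // State that survives across runs of the same job instance.
    private final Map<String, Integer> startCounts = new HashMap<>();
    private final Map<String, Boolean> completed = new HashMap<>();

    FootballJobRestartSketch(List<StepConfig> steps) {
        this.steps = steps;
    }

    /** The job from the example: default settings, always-run, and start limit 2. */
    static FootballJobRestartSketch footballJob() {
        return new FootballJobRestartSketch(Arrays.asList(
                new StepConfig("playerLoad", false, Integer.MAX_VALUE),
                new StepConfig("gameLoad", true, Integer.MAX_VALUE),
                new StepConfig("playerSummarization", false, 2)));
    }

    /**
     * Runs the job once and returns one log entry per step that was considered.
     *
     * @param failingSteps names of the steps scripted to fail in this run
     */
    List<String> run(Set<String> failingSteps) {
        List<String> log = new ArrayList<>();
        for (StepConfig step : steps) {
            boolean alreadyCompleted = completed.getOrDefault(step.name, false);
            if (alreadyCompleted && !step.allowStartIfComplete) {
                log.add(step.name + ": skipped");
                continue;
            }
            int starts = startCounts.getOrDefault(step.name, 0);
            if (starts >= step.startLimit) {
                log.add(step.name + ": start limit exceeded");
                break; // the job stops without starting the step
            }
            startCounts.put(step.name, starts + 1);
            if (failingSteps.contains(step.name)) {
                completed.put(step.name, false);
                log.add(step.name + ": failed");
                break; // a failed step ends this run
            }
            completed.put(step.name, true);
            log.add(step.name + ": completed");
        }
        return log;
    }

    public static void main(String[] args) {
        FootballJobRestartSketch job = footballJob();
        Set<String> summaryFails = Collections.singleton("playerSummarization");
        for (int runNumber = 1; runNumber <= 3; runNumber++) {
            System.out.println("Run " + runNumber + ": " + job.run(summaryFails));
        }
    }
}
```

In this model, the third run logs playerLoad as skipped, gameLoad as completed, and playerSummarization as rejected by its start limit, mirroring the walkthrough above. Creating a fresh `FootballJobRestartSketch` resets the counters, which loosely corresponds to executing the Job as a new JobInstance.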