ShedLock is used to run scheduled tasks in distributed services, such as periodically deleting data from the database or migrating data. It is used heavily in this project's distributed services, for two main reasons.

  • Timed tasks. Some operations have to run in the background while the service is operating normally, so a scheduled task is essential to meet our business needs.
  • Distributed services. Imagine the following scenario: as the business grows, a single service instance can no longer handle the load, so we deploy multiple instances to spread it. The problem is that every instance now runs the same timed tasks, doing the same work and causing concurrency problems. That is not what we want: even with multiple instances, each timed task should run only once per schedule. Because all instances still share the same database, we can use the database to control which instance runs the task.

Usage

Taking PostgreSQL as an example, ShedLock can be set up as follows.

First we need to import the corresponding dependencies.

<!-- shed lock -->
<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-spring</artifactId>
    <version>${schedlock.version}</version>
</dependency>
<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-provider-jdbc-template</artifactId>
    <version>${schedlock.version}</version>
</dependency>

Then we need to create a table, shedlock, to store the scheduler lock information.

CREATE TABLE shedlock (
  name VARCHAR(64),
  lock_until TIMESTAMP(3) NULL,
  locked_at TIMESTAMP(3) NULL,
  locked_by VARCHAR(255),
  PRIMARY KEY (name)
)

Usually, we just use the name of the lock as the primary key of the table.

Then we need to configure a LockProvider bean, which ShedLock uses to read and write the lock table in the database.

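import javax.sql.DataSource;

import net.javacrumbs.shedlock.core.LockProvider;
import net.javacrumbs.shedlock.provider.jdbctemplate.JdbcTemplateLockProvider;
import net.javacrumbs.shedlock.spring.annotation.EnableSchedulerLock;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;

@Configuration
@EnableScheduling
@EnableSchedulerLock(defaultLockAtMostFor = "30s")
public class SchedulerConfiguration {

    // Stores lock state in the shedlock table of the given data source.
    @Bean
    public LockProvider lockProvider(final DataSource dataSource) {
        return new JdbcTemplateLockProvider(dataSource);
    }
}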

With the above configuration we are ready to create the task.

Creating tasks

To create a scheduled task controlled by ShedLock, we annotate the corresponding method with both @Scheduled and @SchedulerLock.

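import net.javacrumbs.shedlock.core.SchedulerLock;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class SchedulerTaskTrigger {

    // lockAtLeastForString/lockAtMostForString are attributes of
    // net.javacrumbs.shedlock.core.SchedulerLock; the newer
    // net.javacrumbs.shedlock.spring.annotation.SchedulerLock names them
    // lockAtLeastFor/lockAtMostFor.
    @Scheduled(fixedDelayString = "PT30S")
    @SchedulerLock(name = "TaskScheduler_scheduledTask",
      lockAtLeastForString = "30s",
      lockAtMostForString = "3h")
    public void scheduledTask() {
        // ...
    }
}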

The logic inside the task is then executed according to this configuration. Here the minimum lock time is set to 30 seconds and the maximum lock time to 3 hours. The lock is therefore held for at least 30 seconds even if the task finishes sooner, and for at most 3 hours even if the instance running the task crashes and never releases it. Between those bounds, the lock is released when the task finishes.
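The arithmetic behind these two settings can be sketched in plain Java. This is a minimal illustration, not ShedLock's own code: the class LockTimingSketch and its two methods are hypothetical names, and the sketch assumes that lock_until is set to locked_at plus the maximum lock time when the lock is taken, then moved to the later of the finish time and locked_at plus the minimum lock time when the task ends.

```java
import java.time.Duration;
import java.time.Instant;

class LockTimingSketch {

    // Assumed value of lock_until right after the lock is acquired:
    // the latest moment the lock can be held if it is never released.
    static Instant lockUntilOnAcquire(Instant lockedAt, Duration lockAtMostFor) {
        return lockedAt.plus(lockAtMostFor);
    }

    // Assumed value of lock_until once the task ends:
    // never earlier than lockedAt + lockAtLeastFor.
    static Instant lockUntilOnRelease(Instant lockedAt, Instant finishedAt, Duration lockAtLeastFor) {
        Instant earliestRelease = lockedAt.plus(lockAtLeastFor);
        return finishedAt.isAfter(earliestRelease) ? finishedAt : earliestRelease;
    }

    public static void main(String[] args) {
        Instant lockedAt = Instant.parse("2024-01-01T00:00:00Z");
        Duration atLeast = Duration.ofSeconds(30);
        Duration atMost = Duration.ofHours(3);

        // Upper bound written when the task starts.
        System.out.println(lockUntilOnAcquire(lockedAt, atMost));
        // Task finished after 5 seconds: the lock stays until the 30-second mark.
        System.out.println(lockUntilOnRelease(lockedAt, lockedAt.plusSeconds(5), atLeast));
        // Task finished after 2 minutes: the lock is released at the finish time.
        System.out.println(lockUntilOnRelease(lockedAt, lockedAt.plusSeconds(120), atLeast));
    }
}
```

With the values from the task above, a run that finishes in 5 seconds still blocks other instances until 30 seconds have passed, which guards against a second instance starting the same short task because of small clock differences between nodes.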

Also, when multiple instances of the service run at the same time, only one of them executes the task at any given moment.

If each task takes 30s to execute, the shedlock table contains a row like the following after one run.

[Screenshot: row in the shedlock table after the first execution]

After the next execution:

[Screenshot: row in the shedlock table after the next execution]

In both rows, lock_until is 30s after locked_at, and the lock_until of one run becomes the locked_at of the next.

If we look at the SQL executed for a task, we see that ShedLock runs the following two statements each time a job is executed.

-- task started: acquire the lock if the previous one has expired
UPDATE shedlock SET lock_until = ?, locked_at = ?, locked_by = ? WHERE name = ? AND lock_until <= ?;

-- task execution
...

-- task end: release the lock
UPDATE shedlock SET lock_until = ? WHERE name = ?;
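To see why only one instance wins, the two UPDATE statements can be mimicked with an in-memory map standing in for the shedlock table. This is a sketch under stated assumptions, not ShedLock's implementation: InMemoryShedlockTable, Row, createRow, tryLock and unlock are hypothetical names, the row is assumed to exist already, and a synchronized method stands in for the atomicity the database gives a single UPDATE.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

class InMemoryShedlockTable {

    // One row of the table (hypothetical in-memory stand-in).
    static final class Row {
        final Instant lockUntil;
        final Instant lockedAt;
        final String lockedBy;

        Row(Instant lockUntil, Instant lockedAt, String lockedBy) {
            this.lockUntil = lockUntil;
            this.lockedAt = lockedAt;
            this.lockedBy = lockedBy;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    // Creates an unlocked row so that the UPDATE below has something to match.
    synchronized void createRow(String name, Instant now) {
        rows.putIfAbsent(name, new Row(now, now, "nobody"));
    }

    // Mirrors: UPDATE shedlock SET lock_until = ?, locked_at = ?, locked_by = ?
    //          WHERE name = ? AND lock_until <= ?
    // Returns true when one row would be updated, i.e. the caller holds the lock.
    synchronized boolean tryLock(String name, String instance, Instant now, Duration lockAtMostFor) {
        Row row = rows.get(name);
        if (row == null || row.lockUntil.isAfter(now)) {
            return false; // no matching row, or the lock has not expired yet
        }
        rows.put(name, new Row(now.plus(lockAtMostFor), now, instance));
        return true;
    }

    // Mirrors: UPDATE shedlock SET lock_until = ? WHERE name = ?
    synchronized void unlock(String name, Instant newLockUntil) {
        Row row = rows.get(name);
        if (row != null) {
            rows.put(name, new Row(newLockUntil, row.lockedAt, row.lockedBy));
        }
    }

    public static void main(String[] args) {
        InMemoryShedlockTable table = new InMemoryShedlockTable();
        Instant now = Instant.parse("2024-01-01T00:00:00Z");
        Duration atMost = Duration.ofHours(3);
        table.createRow("TaskScheduler_scheduledTask", now);

        // Two instances fire at almost the same moment; only the first update matches.
        System.out.println("A locked: " + table.tryLock("TaskScheduler_scheduledTask", "instance-A", now, atMost));
        System.out.println("B locked: " + table.tryLock("TaskScheduler_scheduledTask", "instance-B", now.plusSeconds(1), atMost));

        // A finishes and moves lock_until to the 30-second mark; B can lock from then on.
        table.unlock("TaskScheduler_scheduledTask", now.plusSeconds(30));
        System.out.println("B locked later: " + table.tryLock("TaskScheduler_scheduledTask", "instance-B", now.plusSeconds(30), atMost));
    }
}
```

The key point is the `lock_until <= ?` condition: the instance whose update matches the row owns the lock, and every other instance sees zero updated rows and skips the run.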

ShedLock solves the scheduled-task problem for distributed services by relying on the fact that all instances use the same database. Although the instances are distributed, they still depend on one shared resource, the database. Other solutions, such as ZooKeeper or Redis, work on the same principle. At its core this is a distributed locking problem, and the natural solution is to find a resource that all the distributed instances share.

Reference https://blog.csdn.net/topdeveloperr/article/details/123847669