Debugging Deadlocks Using Java Synchronization Aids

One of the best-known deadlock scenarios is the one that arises in the classic "dining philosophers" problem. In short, n philosophers sit at a round table, intending to eat Chinese food.

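On the table there are n chopsticks, one between every two philosophers. As the venue is a pleasant and productive place, they do not only eat; they alternate between eating and thinking. To eat, a philosopher must first acquire two chopsticks, eat, then put them back on the table and return to thinking. Without going into further detail, it is easy to see that if every philosopher grabs the chopstick on the right and then waits for the one on the left, a deadlock appears.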

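Moving on to threads, the simplest deadlock situation is when one thread holds a lock forever while others try to acquire the same lock and block waiting. With two threads, if the former holds lock A and tries to acquire lock B, while the latter holds lock B and tries to acquire lock A, they wait forever.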
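The two-thread case can be sketched in plain Java without actually hanging the JVM. The class and method names below (LockOrderDemo, holdThenTry, cycleObserved) are hypothetical, and the sketch deliberately swaps the blocking lock() on the second lock for a bounded tryLock(), so the wait cycle is observed rather than suffered.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDemo {

    /** Holds 'first', then tries to take 'second' for a bounded time; returns whether it succeeded. */
    static boolean holdThenTry(Lock first, Lock second, CyclicBarrier sync) throws Exception {
        first.lock();
        try {
            sync.await(); // both threads now hold their first lock
            boolean acquired = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                second.unlock();
            }
            sync.await(); // neither thread releases before both attempts have ended
            return acquired;
        } finally {
            first.unlock();
        }
    }

    /** Returns true if neither thread could obtain its second lock, i.e. the wait cycle occurred. */
    static boolean cycleObserved() {
        Lock lockA = new ReentrantLock();
        Lock lockB = new ReentrantLock();
        CyclicBarrier sync = new CyclicBarrier(2);
        ExecutorService exec = Executors.newFixedThreadPool(2);
        try {
            // Opposite acquisition order: A then B versus B then A.
            Future<Boolean> t1 = exec.submit(() -> holdThenTry(lockA, lockB, sync));
            Future<Boolean> t2 = exec.submit(() -> holdThenTry(lockB, lockA, sync));
            boolean firstGotIt = t1.get();
            boolean secondGotIt = t2.get();
            return !firstGotIt && !secondGotIt;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("Wait cycle observed: " + cycleObserved());
    }
}
```

With a plain lock() in place of tryLock(), both threads would block indefinitely, which is the deadlock described above.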

Similarly, at the database level, a deadlock appears when two or more concurrent transactions block each other, each waiting for the others to release the resources it needs.

Abstract

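Not long ago, I ran into a database deadlock that unfortunately appeared in one of our products. The problem was identified and successfully fixed. What caught my attention during the investigation, however, was the attempt to reproduce the issue in isolation by making sure the deadlocking operations ran concurrently.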

This article first shows how a "simple" deadlock is produced, then focuses on how to execute two operations "as simultaneously as possible" by leveraging two Java synchronization aids: CyclicBarrier and CountDownLatch.

Setup

The proof of concept is built with:

Java 21
PostgreSQL 13
PostgreSQL Driver version 42.7.5

For simplicity, Spring Boot version 3.4.4, Flyway 10.20.1 and Apache Maven 3.9.9 are used as well.

The Deadlock

Let Entity1 and Entity2 be two simple entities. They are similar, each described only by a unique identifier (the primary key) and a text.

@Entity
@Table(name = "entity1")
public class Entity1 {
 
    @Id
    @Column(name = "id")
    private Long id;
 
    @Column(name = "text", nullable = false)
    private String text;
     
    ...
}
 
@Entity
@Table(name = "entity2")
public class Entity2 {
 
    @Id
    @Column(name = "id")
    private Long id;
 
    @Column(name = "text", nullable = false)
    private String text;
 
    ...
}

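EntityProcessor is a component that defines two operations, each executed in its own transaction and both involving the same two entities, an Entity1 and an Entity2 respectively. Obviously, the entities are assumed to exist.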

1. process1():
    - begins the transaction
    - reads Entity1 from the database
    - modifies its text
    - reads Entity2 from the database
    - modifies its text
    - commits the transaction
 
2. process2():
    - begins the transaction
    - reads Entity2 from the database
    - modifies its text
    - reads Entity1 from the database
    - modifies its text
    - commits the transaction

@Service
public class EntityProcessor {
 
    private final Entity1Repository entity1Repo;
    private final Entity2Repository entity2Repo;
 
    public EntityProcessor(Entity1Repository entity1Repo, Entity2Repository entity2Repo) {
        this.entity1Repo = entity1Repo;
        this.entity2Repo = entity2Repo;
    }
 
    @Transactional
    public void process1(long entity1Id, long entity2Id) {
        final int index = 1;
 
        processEntity1(index, entity1Id);
 
        processEntity2(index, entity2Id);
    }
 
    @Transactional
    public void process2(long entity1Id, long entity2Id) {
        final int index = 2;
 
        processEntity2(index, entity2Id);
 
        processEntity1(index, entity1Id);
    }
 
    private void processEntity1(int index, long entityId) {
        Entity1 entity1 = entity1Repo.findById(entityId)
                .orElseThrow(() -> new RuntimeException("Entity1 not found"));
 
        entity1.setText("Set by process " + index);
    }
 
    private void processEntity2(int index, long entityId) {
        Entity2 entity2 = entity2Repo.findById(entityId)
                .orElseThrow(() -> new RuntimeException("Entity2 not found"));
 
        entity2.setText("Set by process " + index);
    }
}

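As part of this POC, the "business" goal is to modify the text of both entities, regardless of the order. Because of their opposite locking sequences, the operations (transactions) above lead to a deadlock when executed concurrently. To fix the problem, or at least significantly reduce the risk of blocking, the two entities should be locked in the same order.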
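To illustrate the same-order rule outside the database, here is a minimal in-memory sketch. Row, updateBoth and bothComplete are hypothetical names, and a ReentrantLock merely stands in for the row lock taken by an update; it is an analogy under those assumptions, not the fix applied in the POC.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class OrderedLockingSketch {

    /** Hypothetical stand-in for a database row; the lock plays the role of the row lock. */
    static final class Row {
        final long id;
        final ReentrantLock lock = new ReentrantLock();
        String text = "";

        Row(long id) {
            this.id = id;
        }
    }

    /** Updates both rows, always locking the lower id first regardless of argument order. */
    static void updateBoth(Row x, Row y, String value) {
        Row first = x.id <= y.id ? x : y;
        Row second = (first == x) ? y : x;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                x.text = value;
                y.text = value;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    /** Runs two tasks that pass the rows in opposite order; returns true if both finish consistently. */
    static boolean bothComplete(int rounds) {
        Row row1 = new Row(1L);
        Row row2 = new Row(2L);
        ExecutorService exec = Executors.newFixedThreadPool(2);
        try {
            Future<?> f1 = exec.submit(() -> {
                for (int i = 0; i < rounds; i++) {
                    updateBoth(row1, row2, "Set by process 1");
                }
            });
            Future<?> f2 = exec.submit(() -> {
                for (int i = 0; i < rounds; i++) {
                    updateBoth(row2, row1, "Set by process 2");
                }
            });
            // A timeout guards the sketch itself; with consistent ordering it should not be hit.
            f1.get(5, TimeUnit.SECONDS);
            f2.get(5, TimeUnit.SECONDS);
            return row1.text.equals(row2.text);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            exec.shutdownNow();
        }
    }
}
```

Because both callers end up acquiring the lock of the row with the lower id first, no thread can hold one lock while waiting for a thread that holds the other, so the wait cycle cannot form.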

When process1() and process2() (as defined above) are executed in parallel, the exception below is raised, highlighting the deadlock.

[virtual-61] DEBUG SqlExceptionHelper#could not execute statement [update entity1 set text=? where id=?]
org.postgresql.util.PSQLException: ERROR: deadlock detected
  Detail: Process 36964 waits for ShareLock on transaction 29998; blocked by process 48636.
Process 48636 waits for ShareLock on transaction 29999; blocked by process 36964.
  Hint: See server log for query details.
  Where: while updating tuple (0,7) in relation "entity1"
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2733)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2420)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:372)
    ...
    at com.hcd.deadlock.service.EntityProcessor$$SpringCGLIB$$0.process2()

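The stack trace describes the problem clearly.

Parallel Execution

The deadlock was introduced in the previous section. The main goal of this article is to put together a suitable setup that ensures the two transactions are actually executed concurrently.

In this experiment, two Java synchronization aids are used: CyclicBarrier and CountDownLatch. A unit test is written for each, with the following common setup.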

@SpringBootTest
@Rollback(false)
// Not named Test: that simple name would clash with JUnit's @Test annotation.
class DeadlockTest {
     
    @Autowired
    private EntityProcessor entityProcessor;
 
    @Autowired
    private Entity1Repository entity1Repo;
 
    @Autowired
    private Entity2Repository entity2Repo;
 
    private Entity1 entity1;
    private Entity2 entity2;
 
    @BeforeEach
    void setUp() {
        entity1 = entity1Repo.save(new Entity1(1L));
        entity2 = entity2Repo.save(new Entity2(2L));
    }
 
    @AfterEach
    void tearDown() {
        entity1Repo.delete(entity1);
        entity2Repo.delete(entity2);
    }
}

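First, the Entity1 and the Entity2 are created, then EntityProcessor#process1() and EntityProcessor#process2() are executed simultaneously. Finally, the cleanup is done and the two entities are deleted. Thus, although these tests are not transactional (in order to simulate a possible real use case), the state of the database remains unchanged.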

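Aid 1: CountDownLatch

A CountDownLatch is "a synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes" [1].

The test prepares and schedules two threads to execute EntityProcessor#process1() and EntityProcessor#process2() respectively, and configures a CountDownLatch. The CountDownLatch is initialized with a count of 1, so it acts as a toggle, or a simple gate.

The threads behave similarly and are modeled by the ProcessTask Runnable. They first invoke latch.await() and consequently block in a waiting state. The main thread then calls latch.countDown(), which brings the count to zero and starts the actual parallel processing.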

@Test
void run()  {
    CountDownLatch latch = new CountDownLatch(1);
 
    try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
        Future<?> future1 = exec.submit(new ProcessTask(latch,
                () -> entityProcessor.process1(entity1.getId(), entity2.getId())));

        Future<?> future2 = exec.submit(new ProcessTask(latch,
                () -> entityProcessor.process2(entity1.getId(), entity2.getId())));
 
        latch.countDown();
 
        future1.get();
        future2.get();
 
    } catch (ExecutionException | InterruptedException e) {
        throw new RuntimeException(e);
    }
 
    log.info("All processors completed.");
}
 
private record ProcessTask(CountDownLatch latch, Runnable runnable) implements Runnable {
     
    @Override
    public void run() {     
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
 
        runnable.run();
    }
}
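Stripped of Spring and the database, the gate behavior can be sketched as follows. LatchGateSketch and runBehindGate are hypothetical names, and a fixed thread pool is assumed here instead of the virtual-thread executor used in the test above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LatchGateSketch {

    /** Starts 'workers' tasks that all block on one latch; returns the events in recorded order. */
    static List<String> runBehindGate(int workers) {
        CountDownLatch gate = new CountDownLatch(1); // a count of 1 makes it a simple on/off gate
        List<String> events = new CopyOnWriteArrayList<>();
        ExecutorService exec = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 1; i <= workers; i++) {
                final int id = i;
                Callable<Void> task = () -> {
                    gate.await();               // blocks until the count reaches zero
                    events.add("worker-" + id); // the "actual processing"
                    return null;
                };
                futures.add(exec.submit(task));
            }
            events.add("gate-opened"); // recorded before countDown(), so it always comes first
            gate.countDown();          // releases every waiting worker at once
            for (Future<?> future : futures) {
                future.get();
            }
            return events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            exec.shutdownNow();
        }
    }
}
```

No worker can record its event before the gate is opened, whereas the order among the workers themselves is left to the scheduler, which is exactly the "as simultaneously as possible" start the test is after.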

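Aid 2: CyclicBarrier

A CyclicBarrier "allows a set of threads to all wait for each other to reach a common barrier point" [2], in this case the start of the actual processing: the execution of EntityProcessor#process1() and EntityProcessor#process2().

As above, the test prepares and schedules the two threads and configures a CyclicBarrier for three parties: the two tasks and the main thread, which invokes barrier.await() (call #1). The threads behave similarly to the former ones and are modeled by the ProcessTask Runnable. They first invoke barrier.await() (calls #2 and #3). Once all three parties have called barrier.await(), they all proceed together, which starts the actual parallel processing.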

@Test
void run()  {
    CyclicBarrier barrier = new CyclicBarrier(3);
 
    try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
        Future<?> future1 = exec.submit(new ProcessTask(barrier,
                () -> entityProcessor.process1(entity1.getId(), entity2.getId())));

        Future<?> future2 = exec.submit(new ProcessTask(barrier,
                () -> entityProcessor.process2(entity1.getId(), entity2.getId())));
 
        barrier.await();
 
        future1.get();
        future2.get();
 
    } catch (ExecutionException | InterruptedException | BrokenBarrierException e) {
        throw new RuntimeException(e);
    }
 
    log.info("All processors completed.");
}
 
 
private record ProcessTask(CyclicBarrier barrier, Runnable runnable) implements Runnable {
 
    @Override
    public void run() {
        try {
            barrier.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            // Do not start the task if the barrier could not be passed.
            throw new RuntimeException(e);
        }
 
        runnable.run();
    }
}
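The three-party arrangement can also be tried in isolation. In the sketch below, BarrierStartSketch and runBehindBarrier are hypothetical names, and an optional barrier action (not used in the test above) is added only to make the moment the barrier trips visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BarrierStartSketch {

    /** Two workers and the calling thread meet at one barrier; returns the events in recorded order. */
    static List<String> runBehindBarrier() {
        List<String> events = new CopyOnWriteArrayList<>();
        // Three parties. The barrier action runs once, in the last thread to arrive,
        // before any of the waiting parties is released.
        CyclicBarrier barrier = new CyclicBarrier(3, () -> events.add("barrier-tripped"));
        ExecutorService exec = Executors.newFixedThreadPool(2);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 1; i <= 2; i++) {
                final int id = i;
                Callable<Void> task = () -> {
                    barrier.await();            // parties one and two
                    events.add("worker-" + id); // the "actual processing"
                    return null;
                };
                futures.add(exec.submit(task));
            }
            barrier.await(); // party three: the calling thread
            for (Future<?> future : futures) {
                future.get();
            }
            return events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            exec.shutdownNow();
        }
    }
}
```

Unlike the latch version, no thread plays a special "opener" role here: whichever of the three parties arrives last trips the barrier.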

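Takeaways

In the case of CountDownLatch, the processing threads are independent and run in parallel, but they start the actual processing, runnable.run(), only when released by the main thread, that is, when the latch is counted down to zero.

When CyclicBarrier is used, the situation is very similar: each party waits at the barrier, and the last one to arrive releases it.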

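Unlike a CountDownLatch, which is for one-time use only, a CyclicBarrier can be reused several times, which makes it particularly useful when there are multiple phases in which a set of threads must wait for each other and synchronize repeatedly.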
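The reuse can be made concrete with a small multi-phase sketch. BarrierReuseSketch and runPhases are hypothetical names; the barrier action simply counts how many times the same barrier instance trips.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class BarrierReuseSketch {

    /** Runs 'workers' threads through 'phases' phases; returns how often the barrier tripped. */
    static int runPhases(int workers, int phases) {
        AtomicInteger trips = new AtomicInteger();
        // One barrier instance serves every phase; it resets itself after each trip.
        CyclicBarrier barrier = new CyclicBarrier(workers, trips::incrementAndGet);
        // The pool must be able to run all parties at once, otherwise the barrier never trips.
        ExecutorService exec = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                Callable<Void> task = () -> {
                    for (int p = 0; p < phases; p++) {
                        // The work of phase p would go here.
                        barrier.await(); // wait for the other workers before the next phase
                    }
                    return null;
                };
                futures.add(exec.submit(task));
            }
            for (Future<?> future : futures) {
                future.get();
            }
            return trips.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            exec.shutdownNow();
        }
    }
}
```

A CountDownLatch cannot be reset once its count has reached zero, so the same loop would need a fresh latch for every phase.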

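This article presents two ways of reproducing a deadlock in isolation, which can be very useful when such problems are detected.

When either of the tests above is run, the deadlock appears and the corresponding exception is thrown. Although the purpose here is not to explain how to resolve deadlocks in general, it is worth repeating that ordering concurrent locking operations consistently solves such problems, or at least significantly reduces the risk of deadlock.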