Troubleshooting: Instance connection lost

Posted Jul 9, 2024

By So Hee Park 5 min read

🔴 Instance connection problem

About 5 to 6 hours after deploying with nohup, the connection to the instance was lost.

🟠 Situation Analysis

The EC2 instance on AWS was running without problems.
I connected to the Ubuntu instance over SSH.

The command I ran was as follows:

  
nohup java -jar ./build/libs/drug_store_be-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod >>/dev/null 2>&1 &

When I checked with netstat -ltpn, the listening sockets with their four-digit port numbers were shown.

netstat -ltpn

However, after 5 to 6 hours, the connection was lost.
When I then ran netstat -ltpn, nothing was printed.

🔵 Tryout 1: nohup.out

I wanted to look at the nohup.out file.
Without the redirect to /dev/null, nohup writes the program's output to nohup.out, so I changed the command to the following:

  
nohup java -jar ./build/libs/drug_store_be-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod &

As a result, the nohup.out file was created.
I ran the following command to follow nohup.out as it was written:

tail -f nohup.out

After 5 to 6 hours, when the instance lost its connection again, I ran the following command to see the end of nohup.out:

tail -n 100 nohup.out

✔️ Result

However, there did not seem to be any error in the nohup.out file.
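Reading the last 100 lines by eye can miss an error that was logged earlier. A minimal sketch of a filter that pulls out only the suspicious lines is below. The helper name scan_log and the keyword list are assumptions for illustration, not part of the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the most recent lines of a log that look like errors.
# Usage: scan_log <file> [max_lines]
scan_log() {
  local file="$1" max="${2:-20}"
  # -n prefixes each match with its line number; -E enables the alternation.
  # The keyword list is an assumption; adjust it to the application's log format.
  grep -nE 'ERROR|WARN|Exception|OutOfMemoryError' "$file" | tail -n "$max"
}
```

For example, scan_log nohup.out prints up to 20 matching lines with their line numbers, and prints nothing if no line matches.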

🔵 Tryout 2: SWAP

Maybe there is too little memory? I am using a t2.micro EC2 instance, which has only 1 GiB of RAM.
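One way to test the memory hypothesis directly is to look in the kernel log for signs that the Linux OOM killer terminated the Java process. The sketch below is an assumption about how one might check this, not a step from the original investigation; the helper name find_oom_events is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: filter kernel log text (read from stdin) for OOM-killer activity.
find_oom_events() {
  # Case-insensitive match on phrases the kernel typically logs when it kills a process.
  # "|| true" keeps the exit status 0 when nothing matches.
  grep -iE 'out of memory|oom-killer|killed process' || true
}

# Typical use on the instance (reading the kernel log may need sudo):
#   sudo dmesg -T | find_oom_events
#   sudo journalctl -k | find_oom_events
```

If the filter prints nothing, the kernel log holds no record of an OOM kill, which would point away from memory as the cause.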

🟠 SWAP

Swap is space on an HDD or SSD that temporarily holds data
that is not actively being used in RAM.

RAM: Random Access Memory

It acts as an overflow area for your computer's memory.
Swap is not specific to EC2; it is a mechanism used by the virtual memory management system in Linux.

On Linux, processes are mainly loaded into RAM to run.
However, a situation can arise where more memory is needed than the system's physical RAM capacity.

✔️ Paging

When RAM is fully utilized, the OS can move inactive pages of memory to swap space, freeing RAM for other tasks. 🔴 This is like my situation, where I need more memory.
Swap is then used as an alternative memory space.
Swap uses disk space to provide more memory.

✔️ How does SWAP help?

extend virtual memory
- virtual memory: RAM + swap space (the system appears to have more memory than its physical RAM)
handle memory overcommitment: paging
help prevent OOM errors
- OOM: Out of Memory
- safety net when the system is out of physical RAM
- graceful degradation
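The "RAM + swap space" sum from the list above can be read straight from /proc/meminfo. The following is a small sketch under the assumption of a standard Linux meminfo layout; the function name virtual_memory_kb is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add MemTotal and SwapTotal (both reported in kB)
# from a meminfo-style file. Defaults to /proc/meminfo.
virtual_memory_kb() {
  local meminfo="${1:-/proc/meminfo}"
  # The second field of each matching line is the size in kB.
  awk '/^(MemTotal|SwapTotal):/ { sum += $2 } END { print sum + 0 }' "$meminfo"
}

# Usage on the instance:
#   virtual_memory_kb
```

With no swap configured, the result is simply MemTotal; after a swap file is enabled, the number grows by the size of that file.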

✔️ check current SWAP

free -h

The output showed that there was no swap.

✔️ Create the SWAP file

I wanted to create a 4 GB swap file.

sudo fallocate -l 4G /swapfile

✔️ Set the correct permissions

sudo chmod 600 /swapfile

✔️ Set up the SWAP area

sudo mkswap /swapfile

✔️ Enable the Swap File

sudo swapon /swapfile

✔️ Verify the Swap

sudo swapon --show

✔️ Make the Swap File Permanent

To ensure the swap file is used at boot, add it to /etc/fstab

  
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

Now run free -h again and you will see the 4 GB of swap.
However, the lost instance connection problem was NOT solved.
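One caveat about the echo ... | sudo tee -a /etc/fstab step above: it appends a new line every time it is run, so repeating the setup leaves duplicate entries. A hedged sketch of an idempotent variant follows; the helper name ensure_line is hypothetical, and falling back to sudo tee for non-writable files is an assumption about how one would run it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file only if that exact line is not there yet.
# Usage: ensure_line <file> <line>
ensure_line() {
  local file="$1" line="$2"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
  if grep -qxF -- "$line" "$file" 2>/dev/null; then
    return 0
  fi
  if [ -w "$file" ]; then
    printf '%s\n' "$line" >> "$file"
  else
    # Files such as /etc/fstab are root-owned, so fall back to sudo tee.
    printf '%s\n' "$line" | sudo tee -a "$file" > /dev/null
  fi
}

# Usage on the instance:
#   ensure_line /etc/fstab '/swapfile none swap sw 0 0'
```

Running it a second time with the same arguments leaves the file unchanged.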

💡 Useful bash commands

✔️ Check Disk Capacity

df -h

✔️ Check Memory

free -h

✔️ Check Capacity in Real Time

Real-time capacity monitoring: watch re-runs a command at a fixed interval (here every 1 second).

  
watch -n 1 <command>
watch -n 1 free -h

💡 Reference

https://velog.io/@kwontae1313/AWS-EC2-%EB%A9%94%EB%AA%A8%EB%A6%AC%EC%9A%A9%EB%9F%89-%EC%A6%9D%EC%84%A4

🔵 Tryout 3: Invalid character found in method name

After adding swap, the problem was not fixed. The instance connection went down again 🥲 But this time, nohup.out showed me something.

🔴 Error: Invalid character found in method name

In nohup.out: 🔴 Error: Invalid character found in method name. HTTP method names must be tokens

💡 Reference

https://medium.com/@beganjimoni23/invalid-character-found-in-method-name-http-method-names-must-be-tokens-11678f35f67f

The referenced blog post said to update the spring-cloud-starter-netflix-eureka-client implementation dependency in build.gradle.

🟢 Compatible versions

The cause of the first error was that the downgraded Spring Boot parent version and the Spring Cloud version were not compatible.

💡 Spring Boot Parent

Spring Boot provides a parent POM.
This parent POM includes configuration and dependency management that simplify project setup for Spring Boot.

💡 Spring Cloud

Spring Cloud builds on Spring Boot and
provides tools for building distributed systems.

Thus, I decided to update my build.gradle.

✔️ build.gradle

  
implementation 'org.springframework.boot:spring-boot-starter-actuator'
implementation 'org.springframework.cloud:spring-cloud-starter-gateway'
implementation 'org.springframework.cloud:spring-cloud-starter-netflix-eureka-client'

Then, I had a new error!

🔴 Unable to load io.netty.resolver

🔴 Error: Unable to load io.netty.resolver.dns.macos.MacOSDnsServerAddressStreamProvider

💡 Reference

https://medium.com/@boysbee/unable-to-load-io-netty-resolver-dns-macos-macosdnsserveraddressstreamprovider-46d89bf74d42

So I decided to change the build.gradle file again, and the netty error was gone.

  
dependencies {
    // fix for the instance shutting down
    implementation 'org.springframework.boot:spring-boot-starter-actuator'
    implementation 'org.springframework.cloud:spring-cloud-starter-gateway'
    implementation 'org.springframework.cloud:spring-cloud-starter-netflix-eureka-client'
    // native macOS DNS resolver for netty (Apple Silicon classifier)
    runtimeOnly 'io.netty:netty-resolver-dns-native-macos:4.1.76.Final:osx-aarch_64'
}

// fix for the instance shutting down
// springCloudVersion is assumed to be defined elsewhere in build.gradle (for example in an ext block)
dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
    }
}

🔵 Tryout 4: HikariCP connection pool

🔴 Error: HikariCP connection pool attempting to validate database connections that are already closed

It seemed the connection pool was trying to set a network timeout on a connection that was already closed.

🟠 Cause of problem

Network issues or database restart: I don't think this is the cause, because neither the network nor the DB was restarted.
maxLifetime configuration: If HikariCP's maxLifetime is too long, connections might stay in the pool for longer than they should, causing them to become invalid if the database or network state changes.

✔️ update application.yaml

I updated the HikariCP maxLifetime to 30 minutes in the application.yaml file.
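HikariCP expresses maxLifetime in milliseconds, so "30 minutes" has to be converted before it is written into the configuration. The sketch below does the arithmetic and prints the value as a Spring Boot command-line override. It assumes the application uses Spring Boot's standard spring.datasource.hikari.* properties; the helper name minutes_to_ms is hypothetical, and passing the value on the command line is an alternative to editing application.yaml, not what was done in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert minutes to milliseconds.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

MAX_LIFETIME_MS="$(minutes_to_ms 30)"

# Assumed property name: spring.datasource.hikari.max-lifetime (value in ms).
echo "--spring.datasource.hikari.max-lifetime=${MAX_LIFETIME_MS}"
# prints: --spring.datasource.hikari.max-lifetime=1800000
```

The printed argument can be appended to the java -jar command next to --spring.profiles.active=prod to try a different lifetime without rebuilding the jar.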

🟢 Solution

Finally, the instance is now running without stopping.
However, whenever I push, the CI/CD pipeline does not complete and the deployment fails.
The next step is continuous deployment!


This post is licensed under CC BY 4.0 by the author.
