In this guide, we will learn how to fix a broken Virtuozzo container. The server in question is a cloud storage server, which works much like a high-availability cluster: when a hardware node goes down, the VPS containers on that node automatically migrate to an available active hardware node. This setup has 5 hardware nodes, which we also call Parallels Cloud Storage (PCS).
Sometimes, during a hardware failure or server outage, some containers end up broken because there was not enough time to recover them from the downtime.
A broken container is tagged with ‘B’ in the output of the ‘shaman stat’ command, for example:
  8826061 on 101.30.110.101 0
  8833527 on 101.30.110.103 0
B 8916305 on Unknown 0
  8959776 on 101.30.110.103 0
In this case, container ID ‘8916305’ is broken.
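On a cluster with many containers, picking the broken rows out by eye is error-prone. The following is a minimal sketch, assuming that a broken row starts with a standalone ‘B’ field followed by the container ID, as in the listing above. The function name broken_containers is hypothetical, and the real ‘shaman stat’ output may contain other lines, which this filter simply ignores.

```shell
# Print the IDs of containers flagged as broken.
# Assumption: a broken row looks like "B <CTID> on <node> <priority>",
# i.e. the first whitespace-separated field is exactly "B".
broken_containers() {
    awk '$1 == "B" { print $2 }'
}
```

It reads the listing from standard input, so it could be used as: shaman stat | broken_containers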
Go to ‘/pstorage/pcs1-cluster/.shaman/broken’ to find out which PCS server actually holds the container.
root@pcs3 [/pstorage/pcs1-cluster/.shaman/broken]# cat ct-8916305
RELOCATE_PRIO=0
PATH=private/8916305
NODE=101.30.110.104
TIME=1572058900
INFO=Resource relocation failed
root@pcs3 [/pstorage/pcs1-cluster/.shaman/broken]#
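If this lookup needs to be scripted, note that the record looks like a shell fragment but should not be sourced: it contains a PATH= line, which would overwrite the shell's own PATH. Below is a small sketch that reads a single field with sed instead. The helper name record_field is hypothetical, and the KEY=value layout is assumed from the record shown above.

```shell
# Read one KEY=value field from a shaman "broken" record without sourcing it.
# Sourcing is avoided on purpose: the record's PATH= line would clobber
# the shell's command search path.
record_field() {
    local key=$1 file=$2
    # Print what follows "KEY=" on the first line that starts with it.
    sed -n "s/^${key}=//p" "$file" | head -n 1
}
```

For example, record_field NODE /pstorage/pcs1-cluster/.shaman/broken/ct-8916305 would print the node address stored in the record.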
The record (NODE=101.30.110.104) tells us that the container is actually located on PCS4, so we can simply delete the file ‘ct-8916305’:
root@pcs3 [/pstorage/pcs1-cluster/.shaman/broken]# rm ct-8916305
rm: remove regular file `ct-8916305'? y
After that, log in to the PCS node where the container is located and check the container's configuration file. Make sure all settings are correct.
root@pcs4 [/pstorage/pcs1-cluster/.shaman/broken]# cat /vz/private/8916305/ve.conf
DESCRIPTION="Test=20User=20Test=20User"
HOSTNAME="svr1.test.com.my"
IP_ADDRESS="192.168.26.8/255.255.255.0 192.168.26.9/255.255.255.0"
NAMESERVER="8.8.8.8 8.8.4.4"
NETFILTER="full"
NETIF=""
OFFLINE_MANAGEMENT="yes"
OFFLINE_SERVICE="vzpp"
ONBOOT="yes"
ORIGIN_SAMPLE="4vCPU-Supreme-Plan"
OSTEMPLATE=".centos-7-x86_64"
SEARCHDOMAIN="google.com"
SLMMODE="all"
TEMPLATES=""
UUID="0820b9ee-af22-4f0a-bb67-18578b1ce678"
CPULIMIT_MHZ="8192"
DISKSPACE="104857600:104857600"
SLMMEMORYLIMIT="8589934592:8589934592"
VE_ROOT="/vz/root/$VEID"
VE_PRIVATE="/vz/private/$VEID"
DISTRIBUTION="centos"
TECHNOLOGIES="x86_64 nptl"
VEID="8916305"
CONFIG_CUSTOMIZED="yes"
RATEBOUND="yes"
NUMPROC="500:500"
NAME="Test User Test User"
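Part of this check can be automated. The sketch below only confirms that VEID matches the container ID and that a few keys are present and non-empty. The function names and the list of keys are illustrative assumptions, not an official validation, so the values themselves still need a manual review.

```shell
# Value of KEY ($1) in the config file ($2), with the surrounding
# double quotes stripped. Prints nothing if the key is absent or empty.
conf_value() {
    sed -n "s/^$1=\"\(.*\)\"\$/\1/p" "$2" | head -n 1
}

# check_ve_conf CTID CONF
# Prints one line per problem found and returns non-zero if any check fails.
check_ve_conf() {
    local ctid=$1 conf=$2 status=0 key
    if [ "$(conf_value VEID "$conf")" != "$ctid" ]; then
        echo "VEID does not match $ctid"
        status=1
    fi
    # Illustrative selection of keys that should not be empty.
    for key in HOSTNAME IP_ADDRESS OSTEMPLATE VE_PRIVATE VE_ROOT; do
        if [ -z "$(conf_value "$key" "$conf")" ]; then
            echo "$key is missing or empty"
            status=1
        fi
    done
    return "$status"
}
```

For the container in this guide, the call would be: check_ve_conf 8916305 /vz/private/8916305/ve.conf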
Register the container on the hardware node:
root@pcs4 [/pstorage/pcs1-cluster/.shaman/broken]# vzctl register /vz/private/8916305 8916305
Container was successfully registered
Adding IP addresses to the pool: 192.168.26.8 192.168.26.9
Deleting offline management
Adding offline management to Container(1): 4643
root@pcs4 [/pstorage/pcs1-cluster/.shaman/broken]#
Start the container:
root@pcs4 [/pstorage/pcs1-cluster/.shaman/broken]# vzctl start 8916305
Mount image: /vz/private/8916305/root.hdd
Container is mounted
Starting the Container ...
Os release: 3.10.0-042stab139.1
CPU limit: 100%
Setting permissions 20002 dev 0x7d00
Adding offline management to Container(1): 4643
Adding IP addresses: 192.168.26.8/255.255.255.0 192.168.26.9/255.255.255.0
Hostname of the Container set: svr1.test.com.my
File resolv.conf was modified
Starting the Container ...
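As a recap, the commands used in this guide can be generated for any container ID. The sketch below only prints them and runs nothing, so the list can be reviewed first. The function name recovery_plan is hypothetical, and the default directory is the one from this setup. The ve.conf review stays a manual step between the first and second command, and the last two commands are meant for the node that holds the container.

```shell
# Print the recovery commands for one broken container, in the order
# used in this guide. Nothing is executed.
recovery_plan() {
    local ctid=$1
    # Directory holding the "broken" records; the default matches this setup.
    local broken_dir=${2:-/pstorage/pcs1-cluster/.shaman/broken}
    printf 'rm %s/ct-%s\n' "$broken_dir" "$ctid"
    printf 'vzctl register /vz/private/%s %s\n' "$ctid" "$ctid"
    printf 'vzctl start %s\n' "$ctid"
}
```

Calling recovery_plan 8916305 prints the three commands shown earlier for container 8916305.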