Monday, May 16, 2011

Replacing a mirrored root disk in SVM/SDS on Solaris 10, online

Scenario:
OS - Solaris 10
SUN hardware: E2900
Environment: system root disk is mirrored with SVM
Disk details:
c1t0d0
c1t1d0
Reason: one of the root disks failed, in this example c1t1d0. How we concluded that the disk had failed:
1) It could be throwing lots of read/write errors in /var/adm/messages
2) The number of h/w and transport errors is more than 15 in the "iostat -en" output
3) The disk is showing "not available" in the format output

1. c1t1d0 drive not available
/ssm@0,0/pci@18,700000/scsi@2/sd@1,0
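
Check 2) above can be scripted roughly like this. It is only a sketch: suspect_disks is a made-up helper name, it assumes the usual "s/w h/w trn tot device" column layout of "iostat -en", and it assumes "more than 15" means h/w plus transport errors added together. Adjust it if you read the threshold differently.

```shell
# Hypothetical helper: read "iostat -en" output on stdin and print the
# devices whose h/w + transport error count is above 15.
# Assumed columns: s/w h/w trn tot device
suspect_disks() {
    awk '$5 ~ /^c[0-9]/ && ($2 + $3) > 15 { print $5 }'
}

# On the real system:
# iostat -en | suspect_disks
```

The two header lines are skipped because their fifth field does not look like a cXtYdZ device name.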


1) A failed disk will cause the metastat -ac output to show the affected mirrors and submirrors in maintenance (maint) state

# metastat -ac
d60 m 19GB d61 d62 (maint)
d61 s 19GB c1t0d0s6
d62 s 19GB c1t1d0s6 (maint)
d40 m 11GB d41 d42 (maint)
d41 s 11GB c1t0d0s4
d42 s 11GB c1t1d0s4 (maint)
d10 m 9.8GB d11 d12 (maint)
d11 s 9.8GB c1t0d0s1
d12 s 9.8
d0 m 9.8GB d1 d2 (maint)
d1 s 9.8GB c1t0d0s0
d2 s 9.8GB c1t1d0s0 (maint)
d50 m 17GB d51 d52 (maint)
d51 s 17GB c1t0d0s5
d52 s 17GB c1t1d0s5 (maint)
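
To pull the failed submirrors and their slices out of that output without reading it by eye, something like the following may help. It is a sketch only: failed_submirrors is a made-up helper name, and it assumes the concise layout shown above (name, type, size, slice, state).

```shell
# Hypothetical helper: read "metastat -ac" output on stdin and print
# "submirror slice" for every submirror (type "s") flagged "(maint)".
failed_submirrors() {
    awk '$2 == "s" && index($0, "(maint)") { print $1, $4 }'
}

# Example using three lines copied from the output above:
printf '%s\n' \
    'd0 m 9.8GB d1 d2 (maint)' \
    'd1 s 9.8GB c1t0d0s0' \
    'd2 s 9.8GB c1t1d0s0 (maint)' | failed_submirrors

# On the real system:
# metastat -ac | failed_submirrors
```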

2) metadb -i may also show the state database replicas on the failed disk in an errored state (W flag = write errors), but this is not true in all cases

sh# metadb -i
flags first blk block count
a m p luo 16 8192 /dev/dsk/c1t0d0s7
a p luo 8208 8192 /dev/dsk/c1t0d0s7
a p luo 16400 8192 /dev/dsk/c1t0d0s7
a p luo 24592 8192 /dev/dsk/c1t0d0s7
W p l 16 8192 /dev/dsk/c1t1d0s7
W p l 8208 8192 /dev/dsk/c1t1d0s7
W p l 16400 8192 /dev/dsk/c1t1d0s7
W p l 24592 8192 /dev/dsk/c1t1d0s7


NOTE: the failed disk here is c1t1d0 and its MD devices are d2, d62, d42, d12 and d52. Please verify the disk target and MD devices on your system and change the commands below to match your failed target and MDs.

3) Detach and clear the failed submirrors

metadetach -f d0 d2
metadetach -f d60 d62
metadetach -f d40 d42
metadetach -f d10 d12
metadetach -f d50 d52

metaclear d2
metaclear d62
metaclear d42
metaclear d12
metaclear d52
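
If you have many mirrors, the same command list can be generated instead of typed. This is a dry-run sketch: print_detach_cmds is a made-up helper name, it only prints the commands so you can review them first, and it assumes ksh or bash (the old /bin/sh on Solaris 10 does not understand the ${var%%...} syntax).

```shell
# Hypothetical helper: print (do not run) the detach and clear commands
# for "mirror:submirror" pairs passed as arguments.
print_detach_cmds() {
    # First detach every failed submirror from its mirror...
    for pair in "$@"; do
        echo "metadetach -f ${pair%%:*} ${pair##*:}"
    done
    # ...then clear the detached submirrors.
    for pair in "$@"; do
        echo "metaclear ${pair##*:}"
    done
}

# Pairs taken from the example in this post:
print_detach_cmds d0:d2 d60:d62 d40:d42 d10:d12 d50:d52
```

Review the printed list against your own metastat -ac output before running any of it.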

4) Delete the metadb replicas on the failed disk and confirm

metadb -d c1t1d0s7

metastat -ac

5) Run cfgadm -al to get the attachment point of the failed disk; the output should look like below

sh# cfgadm -al | grep c1t1d0
c1::dsk/c1t1d0 disk connected configured unknown
sh#

6) Unconfigure the disk from the OS
cfgadm -c unconfigure c1::dsk/c1t1d0

/// If this doesn't work, DON'T USE the -f option. Check the result with the command given below ///


sh# cfgadm -al | grep c1t1d0

c1::dsk/c1t1d0 disk connected unconfigured unknown
#


7) Ask the SUN FE (field engineer) to replace the disk, then configure the new disk

cfgadm -al

cfgadm -c configure c1::dsk/c1t1d0

Run cfgadm -al again to confirm that the new disk is configured; the output should look like below

sh# cfgadm -al | grep c1t1d0
c1::dsk/c1t1d0 disk connected configured unknown
sh#

8) Check the new disk with format, then do the following to copy the partition table, install the boot block, and recreate and reattach the SVM devices

prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

/usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t1d0s0

metadb -c 4 -a c1t1d0s7

metainit d2 1 1 c1t1d0s0
metainit d62 1 1 c1t1d0s6
metainit d42 1 1 c1t1d0s4
metainit d12 1 1 c1t1d0s1
metainit d52 1 1 c1t1d0s5


metattach d0 d2
metattach d60 d62
metattach d40 d42
metattach d10 d12
metattach d50 d52


9) Hip hip hooray, you are all set. Keep running metastat -ac until the sync has completed.
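
Instead of rerunning metastat by hand, the wait can be wrapped in a small loop. This is a sketch under an assumption: that the concise output marks a syncing mirror with something like "(resync-NN%)". Check what your own metastat -ac prints before relying on it. resync_running is a made-up helper name.

```shell
# Hypothetical helper: succeed while any line of "metastat -ac" output
# (read on stdin) still mentions "resync".
# ASSUMPTION: syncing mirrors are marked like "(resync-NN%)".
resync_running() {
    grep resync >/dev/null
}

# On the real system (left commented out so the sketch has no side effects):
# while metastat -ac | resync_running; do sleep 60; done
# metastat -ac
```

When the loop exits, take a final look at metastat -ac to make sure nothing is left in maint state.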

1 comment:

Baban Gaigole said...

Precise and good explanation. Most people won't run cfgadm, which leads to more problems and more messages overflowing /var/adm/messages.
Follow
http://hashprompt.blogspot.in/2012/02/root-mirroring-in-solaris10-update9-svm.html
for the root mirroring procedure.