Thursday, May 1, 2008

Recovery of deleted files on linux

A recent thread on the oracle-l maillist (http://www.freelists.org/list/oracle-l) called “REDO LOG Concepts” caught my attention. Someone asked about the transaction state of the (oracle) database when the online redologfiles have been deleted prior to a commit.

Most people know that on linux and unixes a file keeps on living (=available) in the context of a process as long as the file is kept open by the process, despite deletion by another process. This file is gone for all processes that try to access it after the deletion, but the deleted files will only get really disposed once all processes that have the file opened close the the connection (file descriptor) to it.

I guess everybody who is doing/has done linux/unix administration has encountered the situation at least once where a filesystem gets filled with some kind of logfile or tracefile which grows rapidly and keeps on growing at a very fast pace, and deleting it does not release any space (because the process which is filling the file is not stopped, so the file is held open, so the file still exists, despite deletion)

Some suggestions where made for recovery and protecting against it (mirroring online redologfiles). I guess most people know recovery fundamentals, so I will not dive further into that matter.

A fellow oracle-l member (stefan knecht) however suggested:

I’ve run into this situation before as well. It happens, people accidentally delete files.

What would be interesting to research, is if you still can somehow recover those files. As they’re still there, and locked by the process accessing them (in your case LGWR), there might be a way to “steal” the filehandle from that process and just write the file to another location before it’s being closed. Though you would have to intercept the syscall closing the file.

That made me think: linux has the proc filesystem, which gives system information, but also process information. This process information is quite extensive. One of the sections (directory actually) is a list of all the file descriptors for a process. These file descriptors are shown as symlinks:


[oracle@centos50-oracle10203 bin]$ ./sqlplus '/as sysdba'

SQL*Plus: Release 10.2.0.3.0 - Production on Thu May 17 16:24:01 2007

Copyright (c) 1982, 2006, Oracle. All Rights Reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 167772160 bytes
Fixed Size 1260696 bytes
Variable Size 130024296 bytes
Database Buffers 33554432 bytes
Redo Buffers 2932736 bytes
Database mounted.
Database opened.
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options
[oracle@centos50-oracle10203 bin]$ ps -ef | grep pmon
oracle 8634 1 0 16:24 ? 00:00:00 ora_pmon_frits
oracle 8680 2642 0 16:24 pts/1 00:00:00 grep pmon
[oracle@centos50-oracle10203 bin]$ cd /proc/8634/fd
[oracle@centos50-oracle10203 fd]$ ls -ls
total 0
0 lr-x------ 1 oracle oracle 64 May 17 16:24 0 -> /dev/null
0 lr-x------ 1 oracle oracle 64 May 17 16:24 1 -> /dev/null
0 lrwx------ 1 oracle oracle 64 May 17 16:24 10 -> /oracle/db/10.2.0.3/dbs/hc_frits.dat
0 lrwx------ 1 oracle oracle 64 May 17 16:24 11 -> /oracle/db/10.2.0.3/admin/frits/adump/ora_8632.aud
0 lr-x------ 1 oracle oracle 64 May 17 16:24 12 -> /dev/zero
0 lr-x------ 1 oracle oracle 64 May 17 16:24 13 -> /dev/zero
0 lr-x------ 1 oracle oracle 64 May 17 16:24 14 -> /oracle/db/10.2.0.3/rdbms/mesg/oraus.msb
0 lrwx------ 1 oracle oracle 64 May 17 16:24 15 -> /oracle/db/10.2.0.3/dbs/hc_frits.dat
0 lrwx------ 1 oracle oracle 64 May 17 16:24 16 -> socket:[33865]
0 lrwx------ 1 oracle oracle 64 May 17 16:24 17 -> /oracle/db/10.2.0.3/dbs/lkFRITS
0 lr-x------ 1 oracle oracle 64 May 17 16:24 18 -> /oracle/db/10.2.0.3/rdbms/mesg/oraus.msb
0 lr-x------ 1 oracle oracle 64 May 17 16:24 2 -> /dev/null
0 lr-x------ 1 oracle oracle 64 May 17 16:24 3 -> /dev/null
0 lr-x------ 1 oracle oracle 64 May 17 16:24 4 -> /dev/null
0 l-wx------ 1 oracle oracle 64 May 17 16:24 5 -> /oracle/db/10.2.0.3/admin/frits/udump/frits_ora_8632.trc
0 l-wx------ 1 oracle oracle 64 May 17 16:24 6 -> /oracle/db/10.2.0.3/admin/frits/bdump/alert_frits.log
0 lrwx------ 1 oracle oracle 64 May 17 16:24 7 -> /oracle/db/10.2.0.3/dbs/lkinstfrits (deleted)
0 l-wx------ 1 oracle oracle 64 May 17 16:24 8 -> /oracle/db/10.2.0.3/admin/frits/bdump/alert_frits.log
0 l-wx------ 1 oracle oracle 64 May 17 16:24 9 -> pipe:[11466]

As you can see, these are the open files of the pmon process of an oracle 10.2.0.3 database just after startup.
(if you want to use ltrace, strace or gdb on a process, this is the place to find the mapping of file descriptor to a file, socket or pipe!)

Can I use a file descriptor to get data out of the file? Probably, because it’s a symlink which is perfectly normal to use. Well then, back to the suggestion made:

“As they’re still there, and locked by the process accessing them (in your case LGWR), there might be a way to “steal” the filehandle from that process and just write the file to another location before it’s being closed.”

A quick test on a test system (centos 4.4, oracle 10.2.0.3):


SQL> select * from v$logfile;

GROUP# STATUS TYPE MEMBER IS_
---------- ------- ------------------------------------------------------------------------ ---
3 ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_3_31l7kp85_.log NO
3 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_3_31l7kqsl_.log YES
2 ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_2_31l7ko7c_.log NO
2 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_2_31l7koq3_.log YES
1 ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log NO
1 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_1_31l7knrh_.log YES

6 rows selected.

SQL>

Let’s see which log group is the current one:


SQL> select * from v$log;

GROUP# THREAD# SEQUENCE# BYTES MEMBERS ARC STATUS FIRST_CHANGE# IRST_TIM
---------- ---------- ---------- ---------- ---------- --- ----------------------------- ---------
1 1 38 52428800 2 NO INACTIVE 1198861 10-APR-07
2 1 39 52428800 2 NO INACTIVE 1238951 10-APR-07
3 1 40 52428800 2 NO CURRENT 1274534 17-MAY-07

SQL>

Log group 3 is the current one. Well, lets delete one of the files of group 1 (I have two members in every log group, lets pretend we have only one member):


[oracle@filetest bin]$ rm /oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log

Now look at the file descriptors of the logwriter:


...
1 lrwx------ 1 root root 64 May 17 17:07 21 -> /oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log (deleted)
...

And lets make a copy of the file using the file descriptor:


[root@filetest ~]# cat /proc/4703/fd/21 > o1_mf_1_31l7knbs_.log
[root@filetest ~]# ls -ls o1_mf_1_31l7knbs_.log
51260 -rw-r--r-- 1 root root 52429312 May 17 17:11 o1_mf_1_31l7knbs_.log

It looks like we’ve made a copy of a deleted file, using the file descriptor in the proc filesystem. Now let’s shutdown the database, and start it up. This will go wrong, because a member of the online redolog group 1 can not be identified:


SQL> shut immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 167772160 bytes
Fixed Size 1260696 bytes
Variable Size 109052776 bytes
Database Buffers 54525952 bytes
Redo Buffers 2932736 bytes
Database mounted.
Database opened.

SQL>

Excuse me? Shouldn’t we get an error message saying an online logfile could not be identified? Let’s investigate further:


SQL> select * from v$logfile;

GROUP# STATUS TYPE MEMBER IS_
---------- ------- ------- ----------------------------------------------------------------- ---
3 ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_3_31l7kp85_.log NO
3 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_3_31l7kqsl_.log YES
2 ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_2_31l7ko7c_.log NO
2 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_2_31l7koq3_.log YES
1 INVALID ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log NO
1 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_1_31l7knrh_.log YES

6 rows selected.

SQL>

Well, apparently not….I recall situations where a lost online redologfile would cause the startup to fail, and needing to copy an existing one in the group to the missing one. I guess this is altered in some version of the database. Anybody know which version?

Well, it makes sense. The status of the file is set to ‘INVALID’ and the surviving file(s) are used. A message is printed in the alert.log:


Thu May 17 17:43:13 2007
Errors in file /oracle/db/10.2.0.3/admin/frits/bdump/frits_lgwr_13960.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
Thu May 17 17:43:13 2007
Errors in file /oracle/db/10.2.0.3/admin/frits/bdump/frits_lgwr_13960.trc:
ORA-00321: log 1 of thread 1, cannot update log file header
ORA-00312: online log 1 thread 1: '/oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log'

Now, let’s replace the missing one with our copy from the file descriptor:


[root@filetest ~]# cat o1_mf_1_31l7knbs_.log > /oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log
[root@filetest ~]# chown oracle:oinstall /oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log

Now switch from logfile until we reach the log group in which the invalid file resides (log group 1)…


SQL> select * from v$logfile;

GROUP# STATUS TYPE MEMBER IS_
---------- ------- ------------------------------------------------------------------------ ---
3 ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_3_31l7kp85_.log NO
3 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_3_31l7kqsl_.log YES
2 ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_2_31l7ko7c_.log NO
2 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_2_31l7koq3_.log YES
1 ONLINE /oracle/frits/data/FRITS/onlinelog/o1_mf_1_31l7knbs_.log NO
1 ONLINE /oracle/frits/flashreco/FRITS/onlinelog/o1_mf_1_31l7knrh_.log YES

6 rows selected.

SQL>

Look! The file is valid again!

Once a logfile is invalid for whatever reason, oracle continues to work as long as there are valid members in the group. During a logswitch, the files in the log group where oracle switches to are identified, even if they are invalid, and reset to a valid state if they are usable again.

Also, it is possible to make a copy of an already deleted file (as root on centos 4.4 (centos 4.4 is a free version of RHEL4) and as the owner of the file or root on centos 5 (RHEL5)) using the filedescriptor in proc.

1 comment:

Unknown said...

I think that various data corruption incidents may be also resolved by the sql mdf utility