/share/doc/smm/03.fsck/3.t

  1. .\" Copyright (c) 1982, 1993
  2. .\" The Regents of the University of California. All rights reserved.
  3. .\"
  4. .\" Redistribution and use in source and binary forms, with or without
  5. .\" modification, are permitted provided that the following conditions
  6. .\" are met:
  7. .\" 1. Redistributions of source code must retain the above copyright
  8. .\" notice, this list of conditions and the following disclaimer.
  9. .\" 2. Redistributions in binary form must reproduce the above copyright
  10. .\" notice, this list of conditions and the following disclaimer in the
  11. .\" documentation and/or other materials provided with the distribution.
  12. .\" 4. Neither the name of the University nor the names of its contributors
  13. .\" may be used to endorse or promote products derived from this software
  14. .\" without specific prior written permission.
  15. .\"
  16. .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  17. .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  18. .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  19. .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  20. .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  21. .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  22. .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  23. .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  24. .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  25. .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  26. .\" SUCH DAMAGE.
  27. .\"
  28. .\" $FreeBSD$
  29. .\" @(#)3.t 8.1 (Berkeley) 6/5/93
  30. .\"
  31. .ds RH Fixing corrupted file systems
  32. .NH
  33. Fixing corrupted file systems
  34. .PP
  35. A file system
  36. can become corrupted in several ways.
  37. The most common of these ways are
  38. improper shutdown procedures
  39. and hardware failures.
  40. .PP
  41. File systems may become corrupted during an
  42. .I "unclean halt" .
  43. This happens when proper shutdown
  44. procedures are not observed,
  45. when a mounted file system is physically write-protected,
  46. or when a mounted file system is taken off-line.
  47. The most common operator procedural failure is forgetting to
  48. .I sync
  49. the system before halting the CPU.
  50. .PP
  51. File systems may become further corrupted if proper startup
  52. procedures are not observed, e.g.,
  53. not checking a file system for inconsistencies,
  54. and not repairing inconsistencies.
  55. Allowing a corrupted file system to be used (and, thus, to be modified
  56. further) can be disastrous.
  57. .PP
  58. Any piece of hardware can fail at any time.
  59. Failures
  60. can be as subtle as a bad block
  61. on a disk pack, or as blatant as a non-functional disk-controller.
  62. .NH 2
  63. Detecting and correcting corruption
  64. .PP
  65. Normally
  66. .I fsck_ffs
  67. is run non-interactively.
  68. In this mode it will only fix
  69. corruptions that are expected to occur from an unclean halt.
  70. These actions are a proper subset of the actions that
  71. .I fsck_ffs
  72. will take when it is running interactively.
  73. Throughout this paper we assume that
  74. .I fsck_ffs
  75. is being run interactively,
  76. and all possible errors can be encountered.
  77. When an inconsistency is discovered in this mode,
  78. .I fsck_ffs
  79. reports the inconsistency so that the operator can
  80. choose a corrective action.
  81. .PP
  82. A quiescent\(dd
  83. .FS
  84. \(dd I.e., unmounted and not being written on.
  85. .FE
  86. file system may be checked for structural integrity
  87. by performing consistency checks on the
  88. redundant data intrinsic to a file system.
  89. The redundant data is either read from
  90. the file system,
  91. or computed from other known values.
  92. The file system
  93. .B must
  94. be in a quiescent state when
  95. .I fsck_ffs
  96. is run,
  97. since
  98. .I fsck_ffs
  99. is a multi-pass program.
  100. .PP
  101. In the following sections,
  102. we discuss methods to discover inconsistencies
  103. and possible corrective actions
  104. for the cylinder group blocks, the inodes, the indirect blocks, and
  105. the data blocks containing directory entries.
  106. .NH 2
  107. Super-block checking
  108. .PP
  109. The most commonly corrupted item in a file system
  110. is the summary information
  111. associated with the super-block.
  112. The summary information is prone to corruption
  113. because it is modified with every change to the file
  114. system's blocks or inodes,
  115. and is usually corrupted
  116. after an unclean halt.
  117. .PP
  118. The super-block is checked for inconsistencies
  119. involving file-system size, number of inodes,
  120. free-block count, and the free-inode count.
  121. The file-system size must be larger than the
  122. number of blocks used by the super-block
  123. and the number of blocks used by the list of inodes.
  124. The file-system size and layout information
  125. are the most critical pieces of information for
  126. .I fsck_ffs .
  127. While there is no way to actually check these sizes,
  128. since they are statically determined by
  129. .I newfs ,
  130. .I fsck_ffs
  131. can check that these sizes are within reasonable bounds.
  132. All other file system checks require that these sizes be correct.
  133. If
  134. .I fsck_ffs
  135. detects corruption in the static parameters of the default super-block,
  136. .I fsck_ffs
  137. requests the operator to specify the location of an alternate super-block.
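.PP
The bounds checks described above can be sketched roughly as follows.
This is a minimal illustration, not the code used by \fIfsck_ffs\fP:
the class and field names are hypothetical, and the size rule is read here
as "larger than the sum of the two block counts", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SuperBlock:
    # Hypothetical, simplified fields; a real super-block holds many more.
    total_blocks: int        # file-system size in blocks
    superblock_blocks: int   # blocks used by the super-block
    inode_list_blocks: int   # blocks used by the list of inodes
    total_inodes: int
    free_blocks: int
    free_inodes: int


def superblock_is_plausible(sb: SuperBlock) -> bool:
    """Return True if the sizes and counts are within reasonable bounds."""
    # The file system must be larger than its own metadata (assumed: the sum).
    if sb.total_blocks <= sb.superblock_blocks + sb.inode_list_blocks:
        return False
    # Free counts can never exceed the totals they are drawn from.
    if not 0 <= sb.free_blocks <= sb.total_blocks:
        return False
    if not 0 <= sb.free_inodes <= sb.total_inodes:
        return False
    return True
```

A check of this kind cannot prove the sizes correct; it can only reject values that are clearly impossible.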
  138. .NH 2
  139. Free block checking
  140. .PP
  141. .I Fsck_ffs
  142. checks that all the blocks
  143. marked as free in the cylinder group block maps
  144. are not claimed by any files.
  145. When all the blocks have been initially accounted for,
  146. .I fsck_ffs
  147. checks that
  148. the number of free blocks
  149. plus the number of blocks claimed by the inodes
  150. equals the total number of blocks in the file system.
  151. .PP
  152. If anything is wrong with the block allocation maps,
  153. .I fsck_ffs
  154. will rebuild them,
  155. based on the list it has computed of allocated blocks.
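.PP
As an illustration of the two checks and the rebuild step,
the sketch below models the free map and the claimed blocks as sets of block numbers.
The function name and the assumption that every block from 0 to the total
is a data block are simplifications made for the example only.

```python
def check_free_blocks(total_blocks, free_map, claimed):
    """Check the free map against the blocks claimed by inodes.

    free_map: set of block numbers marked free in the allocation maps.
    claimed:  set of block numbers claimed by inodes.
    Returns (consistent, rebuilt_free_map).
    """
    # No block may be both marked free and claimed by a file.
    overlap = free_map & claimed
    # Free blocks plus claimed blocks must account for every block.
    accounted = len(free_map) + len(claimed)
    consistent = not overlap and accounted == total_blocks
    # The rebuilt map is derived only from the computed list of claimed blocks.
    rebuilt = set(range(total_blocks)) - claimed
    return consistent, rebuilt
```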
  156. .PP
  157. The summary information associated with the super-block
  158. counts the total number of free blocks within the file system.
  159. .I Fsck_ffs
  160. compares this count to the
  161. number of free blocks it found within the file system.
  162. If the two counts do not agree, then
  163. .I fsck_ffs
  164. replaces the incorrect count in the summary information
  165. by the actual free-block count.
  166. .PP
  167. The summary information
  168. counts the total number of free inodes within the file system.
  169. .I Fsck_ffs
  170. compares this count to the number
  171. of free inodes it found within the file system.
  172. If the two counts do not agree, then
  173. .I fsck_ffs
  174. replaces the incorrect count in the
  175. summary information by the actual free-inode count.
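.PP
The two summary comparisons follow the same pattern, sketched below.
The dictionary keys are hypothetical names chosen for the example;
the stored counts are simply overwritten by the counts found during the scan.

```python
def reconcile_summary(summary, found):
    """Replace stored counts that disagree with the counts actually found.

    Both arguments are dicts with the (hypothetical) keys
    'free_blocks' and 'free_inodes'. Returns the names that were fixed.
    """
    fixed = []
    for name in ("free_blocks", "free_inodes"):
        if summary[name] != found[name]:
            summary[name] = found[name]   # the computed count wins
            fixed.append(name)
    return fixed
```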
  176. .NH 2
  177. Checking the inode state
  178. .PP
  179. An individual inode is not as likely to be corrupted as
  180. the allocation information.
  181. However, because of the great number of active inodes,
  182. a few of the inodes are usually corrupted.
  183. .PP
  184. The list of inodes in the file system
  185. is checked sequentially starting with inode 2
  186. (inode 0 marks unused inodes;
  187. inode 1 is saved for future generations)
  188. and progressing through the last inode in the file system.
  189. The state of each inode is checked for
  190. inconsistencies involving format and type,
  191. link count,
  192. duplicate blocks,
  193. bad blocks,
  194. and inode size.
  195. .PP
  196. Each inode contains a mode word.
  197. This mode word describes the type and state of the inode.
  198. Inodes must be one of six types:
  199. regular inode, directory inode, symbolic link inode,
  200. special block inode, special character inode, or socket inode.
  201. Inodes may be found in one of three allocation states:
  202. unallocated, allocated, and neither unallocated nor allocated.
  203. This last state suggests an incorrectly formatted inode.
  204. An inode can get in this state if
  205. bad data is written into the inode list.
  206. The only possible corrective action is for
  207. .I fsck_ffs
  208. to clear the inode.
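.PP
A rough sketch of the classification described above, using the file-type
constants from Python's \fIstat\fP module.
Treating a mode word of zero as "unallocated" is an assumption of the sketch,
and only the six types listed in the text are accepted.

```python
import stat

# The six inode types named in the text.
VALID_TYPES = {
    stat.S_IFREG, stat.S_IFDIR, stat.S_IFLNK,
    stat.S_IFBLK, stat.S_IFCHR, stat.S_IFSOCK,
}


def inode_state(mode: int) -> str:
    """Classify an inode from its mode word."""
    if mode == 0:
        return "unallocated"          # assumption: zero mode means unused
    if stat.S_IFMT(mode) in VALID_TYPES:
        return "allocated"
    # Neither unallocated nor allocated: the only remedy is to clear it.
    return "clear"
```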
  209. .NH 2
  210. Inode links
  211. .PP
  212. Each inode counts the
  213. total number of directory entries
  214. linked to the inode.
  215. .I Fsck_ffs
  216. verifies the link count of each inode
  217. by starting at the root of the file system,
  218. and descending through the directory structure.
  219. The actual link count for each inode
  220. is calculated during the descent.
  221. .PP
  222. If the stored link count is non-zero and the actual
  223. link count is zero,
  224. then no directory entry appears for the inode.
  225. If this happens,
  226. .I fsck_ffs
  227. will place the disconnected file in the
  228. .I lost+found
  229. directory.
  230. If the stored and actual link counts are non-zero and unequal,
  231. a directory entry may have been added or removed without the inode being
  232. updated.
  233. If this happens,
  234. .I fsck_ffs
  235. replaces the incorrect stored link count by the actual link count.
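.PP
The descent and the two link-count cases can be sketched as follows.
The in-memory layout (a dictionary from directory inode number to a list of
name and inode pairs) and the function names are hypothetical.

```python
from collections import Counter


def count_links(directories, root):
    """Count directory entries per inode by descending from the root.

    directories maps a directory inode number to its (name, inode) entries.
    """
    actual = Counter()
    seen = set()
    stack = [root]
    while stack:
        d = stack.pop()
        if d in seen:
            continue
        seen.add(d)
        for name, ino in directories[d]:
            actual[ino] += 1
            # Descend into subdirectories, but not through "." or "..".
            if ino in directories and name not in (".", ".."):
                stack.append(ino)
    return actual


def check_link_count(stored, actual):
    """Return the corrective action for one inode."""
    if stored != 0 and actual == 0:
        return "reconnect to lost+found"
    if stored != 0 and actual != 0 and stored != actual:
        return "set stored count to actual"
    return "ok"
```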
  236. .PP
  237. Each inode contains a list,
  238. or pointers to
  239. lists (indirect blocks),
  240. of all the blocks claimed by the inode.
  241. Since indirect blocks are owned by an inode,
  242. inconsistencies in an indirect block directly
  243. affect the inode that owns it.
  244. .PP
  245. .I Fsck_ffs
  246. compares each block number claimed by an inode
  247. against a list of already allocated blocks.
  248. If another inode already claims a block number,
  249. then the block number is added to a list of
  250. .I "duplicate blocks" .
  251. Otherwise, the list of allocated blocks
  252. is updated to include the block number.
  253. .PP
  254. If there are any duplicate blocks,
  255. .I fsck_ffs
  256. will perform a partial second
  257. pass over the inode list
  258. to find the inode of the duplicated block.
  259. The second pass is needed,
  260. since without examining the files associated with
  261. these inodes for correct content,
  262. not enough information is available
  263. to determine which inode is corrupted and should be cleared.
  264. If this condition does arise
  265. (only hardware failure will cause it),
  266. then the inode with the earliest
  267. modify time is usually incorrect,
  268. and should be cleared.
  269. If this happens,
  270. .I fsck_ffs
  271. prompts the operator to clear both inodes.
  272. The operator must decide which one should be kept
  273. and which one should be cleared.
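.PP
The duplicate-block bookkeeping and the partial second pass can be sketched as below.
The input layout (inode number mapped to its list of claimed block numbers)
and the function names are assumptions made for the example;
the choice of which inode to clear is still left to the operator.

```python
def find_duplicates(inode_blocks):
    """First pass: build the allocated set and the list of duplicate blocks."""
    allocated = set()
    duplicates = []
    for ino in sorted(inode_blocks):
        for blk in inode_blocks[ino]:
            if blk in allocated:
                duplicates.append(blk)   # already claimed by an earlier inode
            else:
                allocated.add(blk)
    return allocated, duplicates


def claimants(inode_blocks, duplicates):
    """Second pass: list every inode that claims each duplicate block."""
    return {
        blk: [ino for ino in sorted(inode_blocks) if blk in inode_blocks[ino]]
        for blk in set(duplicates)
    }
```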
  274. .PP
  275. .I Fsck_ffs
  276. checks the range of each block number claimed by an inode.
  277. If the block number is
  278. lower than the first data block in the file system,
  279. or greater than the last data block,
  280. then the block number is a
  281. .I "bad block number" .
  282. Many bad blocks in an inode are usually caused by
  283. an indirect block that was not written to the file system,
  284. a condition which can only occur if there has been a hardware failure.
  285. If an inode contains bad block numbers,
  286. .I fsck_ffs
  287. prompts the operator to clear it.
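.PP
The range check amounts to a comparison against the first and last data block.
In this sketch the bounds are passed in as plain integers, and the function name is hypothetical.

```python
def bad_block_numbers(claimed, first_data_block, last_data_block):
    """Return the claimed block numbers that fall outside the data area."""
    return [
        blk for blk in claimed
        if blk < first_data_block or blk > last_data_block
    ]
```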
  288. .NH 2
  289. Inode data size
  290. .PP
  291. Each inode contains a count of the number of data blocks
  292. that it contains.
  293. The number of actual data blocks
  294. is the sum of the allocated data blocks
  295. and the indirect blocks.
  296. .I Fsck_ffs
  297. computes the actual number of data blocks
  298. and compares that block count against
  299. the count stored in the inode.
  300. If an inode contains an incorrect count,
  301. .I fsck_ffs
  302. prompts the operator to fix it.
  303. .PP
  304. Each inode contains a thirty-two bit size field.
  305. The size is the number of data bytes
  306. in the file associated with the inode.
  307. The consistency of the byte size field is roughly checked
  308. by computing from the size field the maximum number of blocks
  309. that should be associated with the inode,
  310. and comparing that expected block count against
  311. the actual number of blocks the inode claims.
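.PP
The rough size check can be illustrated with a ceiling division.
The sketch assumes a single fixed block size and ignores fragments and
indirect blocks, so it is only an approximation of the comparison described above.

```python
def expected_data_blocks(size_bytes, block_size):
    """Maximum number of data blocks a file of this size should need."""
    return -(-size_bytes // block_size)   # ceiling division


def size_is_plausible(size_bytes, block_size, claimed_data_blocks):
    """True if the inode does not claim more blocks than its size allows."""
    return claimed_data_blocks <= expected_data_blocks(size_bytes, block_size)
```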
  312. .NH 2
  313. Checking the data associated with an inode
  314. .PP
  315. An inode can directly or indirectly
  316. reference three kinds of data blocks.
  317. All referenced blocks must be the same kind.
  318. The three types of data blocks are:
  319. plain data blocks, symbolic link data blocks, and directory data blocks.
  320. Plain data blocks
  321. contain the information stored in a file;
  322. symbolic link data blocks
  323. contain the path name stored in a link.
  324. Directory data blocks contain directory entries.
  325. .I Fsck_ffs
  326. can only check the validity of directory data blocks.
  327. .PP
  328. Each directory data block is checked for
  329. several types of inconsistencies.
  330. These inconsistencies include
  331. directory inode numbers pointing to unallocated inodes,
  332. directory inode numbers that are greater than
  333. the number of inodes in the file system,
  334. incorrect directory inode numbers for ``\fB.\fP'' and ``\fB..\fP'',
  335. and directories that are not attached to the file system.
  336. If the inode number in a directory data block
  337. references an unallocated inode,
  338. then
  339. .I fsck_ffs
  340. will remove that directory entry.
  341. Again,
  342. this condition can only arise when there has been a hardware failure.
  343. .PP
  344. .I Fsck_ffs
  345. also checks for directories with unallocated blocks (holes).
  346. Such directories should never be created.
  347. When found,
  348. .I fsck_ffs
  349. will prompt the user to adjust the length of the offending directory,
  350. which is done by shortening the size of the directory to the end of the
  351. last allocated block preceding the hole.
  352. Unfortunately, this means that another Phase 1 run has to be done.
  353. .I Fsck_ffs
  354. will remind the user to rerun \fIfsck_ffs\fP after repairing a
  355. directory containing an unallocated block.
  356. .PP
  357. If a directory entry inode number references
  358. outside the inode list, then
  359. .I fsck_ffs
  360. will remove that directory entry.
  361. This condition occurs if bad data is written into a directory data block.
  362. .PP
  363. The directory inode number entry for ``\fB.\fP''
  364. must be the first entry in the directory data block.
  365. The inode number for ``\fB.\fP''
  366. must reference itself;
  367. i.e., it must equal the inode number
  368. for the directory data block.
  369. The directory inode number entry
  370. for ``\fB..\fP'' must be
  371. the second entry in the directory data block.
  372. Its value must equal the inode number for the
  373. parent of the directory entry
  374. (or the inode number of the directory
  375. data block if the directory is the
  376. root directory).
  377. If the directory inode numbers are
  378. incorrect,
  379. .I fsck_ffs
  380. will replace them with the correct values.
  381. If there are multiple hard links to a directory,
  382. the first one encountered is considered the real parent
  383. to which ``\fB..\fP'' should point;
  384. \fIfsck_ffs\fP recommends deletion for the subsequently discovered names.
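.PP
The correction of the ``\fB.\fP'' and ``\fB..\fP'' inode numbers can be sketched as follows.
The list-of-pairs directory layout and the function name are hypothetical,
and the sketch only repairs inode numbers of entries that are already in the
expected positions; it does not handle missing entries.

```python
def fix_dot_entries(entries, self_ino, parent_ino):
    """Correct the inode numbers of "." and ".." in place.

    entries: list of [name, inode] pairs in directory order.
    For the root directory, pass parent_ino equal to self_ino.
    Returns the names whose inode numbers were replaced.
    """
    fixed = []
    expected = ((".", self_ino), ("..", parent_ino))
    for index, (name, ino) in enumerate(expected):
        if index >= len(entries) or entries[index][0] != name:
            continue                      # entry missing: not handled here
        if entries[index][1] != ino:
            entries[index][1] = ino       # replace with the correct value
            fixed.append(name)
    return fixed
```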
  385. .NH 2
  386. File system connectivity
  387. .PP
  388. .I Fsck_ffs
  389. checks the general connectivity of the file system.
  390. If directories are not linked into the file system, then
  391. .I fsck_ffs
  392. links the directory back into the file system in the
  393. .I lost+found
  394. directory.
  395. This condition only occurs when there has been a hardware failure.
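.PP
Finding directories that are not linked into the file system is a
reachability search from the root, sketched below.
The mapping from a directory inode to its child directory inodes is a
hypothetical simplification, and the sketch reports every unreachable
directory rather than only the top of each detached subtree.

```python
def unreachable_directories(directories, root):
    """Return the directory inodes that cannot be reached from the root.

    directories maps a directory inode number to the inode numbers
    of its child directories.
    """
    reachable = set()
    stack = [root]
    while stack:
        d = stack.pop()
        if d in reachable:
            continue
        reachable.add(d)
        stack.extend(directories.get(d, []))
    return sorted(set(directories) - reachable)
```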
  396. .ds RH "References"
  397. .SH
  398. \s+2Acknowledgements\s0
  399. .PP
  400. I thank Bill Joy, Sam Leffler, Robert Elz and Dennis Ritchie
  401. for their suggestions and help in implementing the new file system.
  402. Thanks also to Robert Henry for his editorial input to
  403. get this document together.
  404. Finally we thank our sponsors,
  405. the National Science Foundation under grant MCS80-05144,
  406. and the Defense Advanced Research Projects Agency (DoD) under
  407. Arpa Order No. 4031 monitored by Naval Electronic Systems Command under
  408. Contract No. N00039-82-C-0235. (Kirk McKusick, July 1983)
  409. .PP
  410. I would like to thank Larry A. Wehr for advice that led
  411. to the first version of
  412. .I fsck_ffs
  413. and Rick B. Brandt for adapting
  414. .I fsck_ffs
  415. to
  416. UNIX/TS. (T. Kowalski, July 1979)
  417. .sp 2
  418. .SH
  419. \s+2References\s0
  420. .LP
  421. .IP [Dolotta78] 20
  422. Dolotta, T. A., and Olsson, S. B. eds.,
  423. .I "UNIX User's Manual, Edition 1.1\^" ,
  424. January 1978.
  425. .IP [Joy83] 20
  426. Joy, W., Cooper, E., Fabry, R., Leffler, S., McKusick, M., and Mosher, D.
  427. 4.2BSD System Manual,
  428. .I "University of California at Berkeley" ,
  429. .I "Computer Systems Research Group Technical Report"
  430. #4, 1982.
  431. .IP [McKusick84] 20
  432. McKusick, M., Joy, W., Leffler, S., and Fabry, R.
  433. A Fast File System for UNIX,
  434. \fIACM Transactions on Computer Systems 2\fP, 3.
  435. pp. 181-197, August 1984.
  436. .IP [Ritchie78] 20
  437. Ritchie, D. M., and Thompson, K.,
  438. The UNIX Time-Sharing System,
  439. .I "The Bell System Technical Journal"
  440. .B 57 ,
  441. 6 (July-August 1978, Part 2), pp. 1905-29.
  442. .IP [Thompson78] 20
  443. Thompson, K.,
  444. UNIX Implementation,
  445. .I "The Bell System Technical Journal\^"
  446. .B 57 ,
  447. 6 (July-August 1978, Part 2), pp. 1931-46.
  448. .ds RH Appendix A \- Fsck_ffs Error Conditions
  449. .bp