.\"  Hey, Emacs, edit this file in -*- nroff-fill -*- mode
.\"-
.\" Copyright (c) 1997, 1998, 2003
.\"	Nan Yang Computer Services Limited.  All rights reserved.
.\"
.\"  This software is distributed under the so-called ``Berkeley
.\"  License'':
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by Nan Yang Computer
.\"      Services Limited.
.\" 4. Neither the name of the Company nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" This software is provided ``as is'', and any express or implied
.\" warranties, including, but not limited to, the implied warranties of
.\" merchantability and fitness for a particular purpose are disclaimed.
.\" In no event shall the company or contributors be liable for any
.\" direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute
.\" goods or services; loss of use, data, or profits; or business
.\" interruption) however caused and on any theory of liability, whether
.\" in contract, strict liability, or tort (including negligence or
.\" otherwise) arising in any way out of the use of this software, even if
.\" advised of the possibility of such damage.
.\"
.\" $FreeBSD$
.\"
.Dd May 16, 2002
.Dt VINUM 4
.Os
.Sh NAME
.Nm vinum
.Nd Logical Volume Manager
.Sh SYNOPSIS
.Cd "device vinum"
.Sh DESCRIPTION
.Nm
is a logical volume manager inspired by, but not derived from, the Veritas
Volume Manager.
It provides the following features:
.Bl -bullet
.It
It provides device-independent logical disks, called
.Em volumes .
Volumes are
not restricted to the size of any disk on the system.
.It
The volumes consist of one or more
.Em plexes ,
each of which contains the
entire address space of a volume.
This represents an implementation of RAID-1
(mirroring).
Multiple plexes can also be used for:
.\" XXX What about sparse plexes?  Do we want them?
.Bl -bullet
.It
Increased read throughput.
.Nm
will read data from the least active disk, so if a volume has plexes on multiple
disks, more data can be read in parallel.
.Nm
reads data from only one plex, but it writes data to all plexes.
.It
Increased reliability.
By storing plexes on different disks, data will remain
available even if one of the plexes becomes unavailable.
In comparison with a
RAID-5 plex (see below), using multiple plexes requires more storage space, but
gives better performance, particularly in the case of a drive failure.
.It
Additional plexes can be used for on-line data reorganization.
By attaching an
additional plex and subsequently detaching one of the older plexes, data can be
moved on-line without compromising access.
.It
An additional plex can be used to obtain a consistent dump of a file system.
By
attaching an additional plex and detaching at a specific time, the detached plex
becomes an accurate snapshot of the file system at the time of detachment.
.\" Make sure to flush!
.El
.It
Each plex consists of one or more logical disk slices, called
.Em subdisks .
A subdisk is a contiguous block of physical disk storage.
A plex may
consist of any reasonable number of subdisks (in other words, the real limit is
not the number, but other factors, such as memory and performance, associated
with maintaining a large number of subdisks).
.It
A number of mappings between subdisks and plexes are available:
.Bl -bullet
.It
.Em "Concatenated plexes"
consist of one or more subdisks, each of which
is mapped to a contiguous part of the plex address space.
.It
.Em "Striped plexes"
consist of two or more subdisks of equal size.
The plex
address space is mapped in
.Em stripes ,
integral fractions of the subdisk
size.
Consecutive plex address space is mapped to stripes in each subdisk in
turn.
.if t \{\
.ig
.\" FIXME
.br
.ne 1.5i
.PS
move right 2i
down
SD0: box
SD1: box
SD2: box

"plex 0" at SD0.n+(0,.2)
"subdisk 0" rjust at SD0.w-(.2,0)
"subdisk 1" rjust at SD1.w-(.2,0)
"subdisk 2" rjust at SD2.w-(.2,0)
.PE
..
.\}
The subdisks of a striped plex must all be the same size.
.It
.Em "RAID-5 plexes"
require at least three equal-sized subdisks.
They
resemble striped plexes, except that in each stripe, one subdisk stores parity
information.
This subdisk changes in each stripe: in the first stripe, it is the
first subdisk, in the second it is the second subdisk, etc.
In the event of a
single disk failure,
.Nm
will recover the data based on the information stored on the remaining subdisks.
This mapping is particularly suited to read-intensive access.
The subdisks of a
RAID-5 plex must all be the same size.
.\" Make sure to flush!
.El
.It
.Em Drives
are the lowest level of the storage hierarchy.
They represent disk special
devices.
.It
.Nm
offers automatic startup.
Unlike
.Ux
file systems,
.Nm
volumes contain all the configuration information needed to ensure that they are
started correctly when the subsystem is enabled.
This is also a significant
advantage over the Veritas\(tm File System.
This feature concerns only the presence
of the volumes.
It does not mean that the volumes will be mounted
automatically, since the standard startup procedures with
.Pa /etc/fstab
perform this function.
.El
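.Pp
The striped and RAID-5 mappings described above can be sketched with
.Xr sh 1
arithmetic.
This is an illustration of the description only, not the code used by
.Nm ;
the function names
.Li stripe_map
and
.Li parity_subdisk
are hypothetical, subdisks and stripes are counted from 0, and all sizes
are assumed to be in blocks.
.Bd -literal -offset indent
# Illustration only: locate a block of a striped plex.
# Arguments: plex offset, stripe size, number of subdisks.
stripe_map() {
  stripe=$(( $1 / $2 ))
  echo "subdisk $(( stripe % $3 )) offset $(( stripe / $3 * $2 + $1 % $2 ))"
}

# Illustration only: subdisk holding the parity of a RAID-5 stripe,
# following the rotation described above (stripe 0 -> subdisk 0, ...).
# Arguments: stripe number, number of subdisks.
parity_subdisk() {
  echo $(( $1 % $2 ))
}
.Ed
.Pp
For example,
.Ql "stripe_map 300 128 4"
prints
.Ql "subdisk 2 offset 44" ,
since block 300 lies in the third stripe of 128 blocks.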
.Sh KERNEL CONFIGURATION
.Nm
is currently supplied as a KLD module, and does not require
configuration.
As with other KLDs, it is absolutely necessary to match the KLD
to the version of the operating system.
Failure to do so will cause
.Nm
to issue an error message and terminate.
.Pp
It is possible to configure
.Nm
in the kernel, but this is not recommended.
To do so, add this line to the
kernel configuration file:
.Pp
.D1 Cd "device vinum"
.Ss Debug Options
The current version of
.Nm ,
both the kernel module and the user program
.Xr gvinum 8 ,
includes significant debugging support.
It is not recommended to remove
this support at the moment, but if you do, you must remove it from both the
kernel and the user components.
To do this, edit the files
.Pa /usr/src/sbin/vinum/Makefile
and
.Pa /usr/src/sys/modules/vinum/Makefile
and change the
.Va CFLAGS
variable to remove the
.Li -DVINUMDEBUG
option.
If you have
configured
.Nm
into the kernel, either specify the line
.Pp
.D1 Cd "options VINUMDEBUG"
.Pp
in the kernel configuration file or remove the
.Li -DVINUMDEBUG
option from
.Pa /usr/src/sbin/vinum/Makefile
as described above.
.Pp
If the
.Va VINUMDEBUG
variables do not match,
.Xr gvinum 8
will fail with a message
explaining the problem and what to do to correct it.
.Ss Other Options
.Cd "options VINUM_AUTOSTART"
.Pp
Make
.Nm
automatically scan all available disks at attach time.
This method is deprecated; it is primarily intended for environments
that do not want to rely on kernel environment variables set by
.Xr loader 8 .
.Pp
.Nm
was previously available in two versions: a freely available version which did
not contain RAID-5 functionality, and a full version including RAID-5
functionality, which was available only from Cybernet Systems Inc.
The present
version of
.Nm
includes the RAID-5 functionality.
.Sh RUNNING VINUM
.Nm
is part of the base
.Fx
system.
It does not require installation.
To start it, run the
.Xr gvinum 8
program, which will load the KLD if it is not already present.
.Nm
must be configured before it can be used.
See
.Xr gvinum 8
for information on how to create a
.Nm
configuration.
.Pp
Normally, you start a configured version of
.Nm
at boot time.
Set the variable
.Va start_vinum
in
.Pa /etc/rc.conf
to
.Dq Li YES
to start
.Nm
at boot time.
(See
.Xr rc.conf 5
for more details.)
.Pp
If
.Nm
is loaded as a KLD (the recommended way), the
.Nm vinum Cm stop
command will unload it
(see
.Xr gvinum 8 ) .
You can also do this with the
.Xr kldunload 8
command.
.Pp
The KLD can only be unloaded when idle, in other words when no volumes are
mounted and no other instances of the
.Xr gvinum 8
program are active.
Unloading the KLD does not harm the data in the volumes.
.Ss Configuring and Starting Objects
Use the
.Xr gvinum 8
utility to configure and start
.Nm
objects.
.Sh AUTOMATIC STARTUP
The
.Nm
subsystem can be automatically started at attach time.
There are two kernel environment variables that can be set in
.Xr loader.conf 5
to accomplish this.
.Bl -tag -width ".Va vinum.autostart" -offset indent
.It Va vinum.autostart
If this variable is set (to any value), the attach function will attempt
to scan all available disks for valid
.Nm
configuration records.
This is the preferred way if automatic startup is desired.
.Pp
Example:
.Dl vinum.autostart="YES"
.It Va vinum.drives
Alternatively, this variable can enumerate a list of disk devices
to scan for configuration records.
Note that only the
.Dq bare
device names need to be given, since
.Nm
will automatically scan all possible slices and partitions.
.Pp
Example:
.Dl vinum.drives="da0 da1"
.El
.Pp
If automatic startup is used, it is not necessary to set the
.Va start_vinum
variable of
.Xr rc.conf 5 .
Note that if
.Nm
is to supply the volume for the root file system, it is necessary
to start the subsystem early.
This can be achieved by specifying
.Pp
.Dl vinum_load="YES"
.Pp
in
.Xr loader.conf 5 .
.Sh IOCTL CALLS
.Xr ioctl 2
calls are intended for the use of the
.Xr gvinum 8
configuration program only.
They are described in the header file
.Pa /sys/dev/vinum/vinumio.h .
.Ss Disk Labels
Conventional disk special devices have a
.Em "disk label"
in the second sector of the device.
This disk label describes the layout of the partitions within
the device.
.Nm
does not subdivide volumes, so volumes do not contain a physical disk label.
For convenience,
.Nm
implements the ioctl calls
.Dv DIOCGDINFO
(get disk label),
.Dv DIOCGPART
(get partition information),
.Dv DIOCWDINFO
(write partition information) and
.Dv DIOCSDINFO
(set partition information).
.Dv DIOCGDINFO
and
.Dv DIOCGPART
refer to an internal
representation of the disk label which is not present on the volume.
As a
result, the
.Fl r
option of
.Xr disklabel 8 ,
which reads the
.Dq "raw disk" ,
will fail.
.Pp
In general,
.Xr disklabel 8
serves no useful purpose on a
.Nm
volume.
If you run it, it will show you
three partitions,
.Ql a ,
.Ql b
and
.Ql c ,
all the same except for the
.Va fstype ,
for example:
.Bd -literal
3 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:     2048        0    4.2BSD     1024  8192     0   # (Cyl.    0 - 0)
  b:     2048        0      swap                        # (Cyl.    0 - 0)
  c:     2048        0    unused        0     0         # (Cyl.    0 - 0)
.Ed
.Pp
.Nm
ignores the
.Dv DIOCWDINFO
and
.Dv DIOCSDINFO
ioctls, since there is nothing to change.
As a result, any attempt to modify the disk label will be silently ignored.
.Sh MAKING FILE SYSTEMS
Since
.Nm
volumes do not contain partitions, the names do not need to conform to the
standard rules for naming disk partitions.
For a physical disk partition, the
last letter of the device name specifies the partition identifier (a to h).
.Nm
volumes need not conform to this convention, but if they do not,
.Xr newfs 8
will complain that it cannot determine the partition.
To solve this problem,
use the
.Fl v
flag to
.Xr newfs 8 .
For example, if you have a volume
.Pa concat ,
use the following command to create a UFS file system on it:
.Pp
.Dl "newfs -v /dev/vinum/concat"
.Sh OBJECT NAMING
.Nm
assigns default names to plexes and subdisks, although they may be overridden.
We do not recommend overriding the default names.
Experience with the
Veritas\(tm
volume manager, which allows arbitrary naming of objects, has shown that this
flexibility does not bring a significant advantage, and it can cause confusion.
.Pp
Names may contain any non-blank character, but it is recommended to restrict
them to letters, digits and the underscore character.
The names of volumes,
plexes and subdisks may be up to 64 characters long, and the names of drives may
be up to 32 characters long.
When choosing volume and plex names, bear in mind
that automatically generated plex and subdisk names are longer than the name
from which they are derived.
.Bl -bullet
.It
When
.Nm
creates or deletes objects, it creates a directory
.Pa /dev/vinum ,
in which it makes device entries for each volume it finds.
It also creates
subdirectories,
.Pa /dev/vinum/plex
and
.Pa /dev/vinum/sd ,
in which it stores device entries for plexes and subdisks.
In addition, it creates two more directories,
.Pa /dev/vinum/vol
and
.Pa /dev/vinum/drive ,
in which it stores hierarchical information for volumes and drives.
.It
In addition,
.Nm
creates three super-devices,
.Pa /dev/vinum/control ,
.Pa /dev/vinum/Control
and
.Pa /dev/vinum/controld .
.Pa /dev/vinum/control
is used by
.Xr gvinum 8
when it has been compiled without the
.Dv VINUMDEBUG
option,
.Pa /dev/vinum/Control
is used by
.Xr gvinum 8
when it has been compiled with the
.Dv VINUMDEBUG
option, and
.Pa /dev/vinum/controld
is used by the
.Nm
daemon.
The two control devices for
.Xr gvinum 8
are used to synchronize the debug status of kernel and user modules.
.It
Unlike
.Ux
drives,
.Nm
volumes are not subdivided into partitions, and thus do not contain a disk
label.
Unfortunately, this confuses a number of utilities, notably
.Xr newfs 8 ,
which normally tries to interpret the last letter of a
.Nm
volume name as a partition identifier.
If you use a volume name which does not
end in the letters
.Ql a
to
.Ql c ,
you must use the
.Fl v
flag to
.Xr newfs 8
in order to tell it to ignore this convention.
.\"
.It
Plexes do not need to be assigned explicit names.
By default, a plex name is
the name of the volume followed by the letters
.Pa .p
and the number of the
plex.
For example, the plexes of volume
.Pa vol3
are called
.Pa vol3.p0 , vol3.p1
and so on.
These names can be overridden, but it is not recommended.
.It
Like plexes, subdisks are assigned names automatically, and explicit naming is
discouraged.
A subdisk name is the name of the plex followed by the letters
.Pa .s
and a number identifying the subdisk.
For example, the subdisks of
plex
.Pa vol3.p0
are called
.Pa vol3.p0.s0 , vol3.p0.s1
and so on.
.It
By contrast,
.Em drives
must be named.
This makes it possible to move a drive to a different location
and still recognize it automatically.
Drive names may be up to 32 characters
long.
.El
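.Pp
The default naming scheme for plexes and subdisks can be sketched in
.Xr sh 1 .
This is an illustration only; the function name
.Li default_names
is hypothetical and is not part of
.Nm .
.Bd -literal -offset indent
# Print the default plex and subdisk names for a volume.
# Arguments: volume name, number of plexes, subdisks per plex.
default_names() {
  p=0
  while [ "$p" -lt "$2" ]; do
    echo "$1.p$p"
    s=0
    while [ "$s" -lt "$3" ]; do
      echo "$1.p$p.s$s"
      s=$(( s + 1 ))
    done
    p=$(( p + 1 ))
  done
}
.Ed
.Pp
For example,
.Ql "default_names vol3 1 2"
prints
.Pa vol3.p0 , vol3.p0.s0
and
.Pa vol3.p0.s1 ,
one per line.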
.Ss Example
Assume the
.Nm
objects described in the section
.Sx "CONFIGURATION FILE"
in
.Xr gvinum 8 .
The directory
.Pa /dev/vinum
looks like:
.Bd -literal -offset indent
# ls -lR /dev/vinum
total 5
brwxr-xr--  1 root  wheel   25,   2 Mar 30 16:08 concat
brwx------  1 root  wheel   25, 0x40000000 Mar 30 16:08 control
brwx------  1 root  wheel   25, 0x40000001 Mar 30 16:08 controld
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 drive
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 plex
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 rvol
drwxrwxrwx  2 root  wheel       512 Mar 30 16:08 sd
brwxr-xr--  1 root  wheel   25,   3 Mar 30 16:08 strcon
brwxr-xr--  1 root  wheel   25,   1 Mar 30 16:08 stripe
brwxr-xr--  1 root  wheel   25,   0 Mar 30 16:08 tinyvol
drwxrwxrwx  7 root  wheel       512 Mar 30 16:08 vol
brwxr-xr--  1 root  wheel   25,   4 Mar 30 16:08 vol5

/dev/vinum/drive:
total 0
brw-r-----  1 root  operator    4,  15 Oct 21 16:51 drive2
brw-r-----  1 root  operator    4,  31 Oct 21 16:51 drive4

/dev/vinum/plex:
total 0
brwxr-xr--  1 root  wheel   25, 0x10000002 Mar 30 16:08 concat.p0
brwxr-xr--  1 root  wheel   25, 0x10010002 Mar 30 16:08 concat.p1
brwxr-xr--  1 root  wheel   25, 0x10000003 Mar 30 16:08 strcon.p0
brwxr-xr--  1 root  wheel   25, 0x10010003 Mar 30 16:08 strcon.p1
brwxr-xr--  1 root  wheel   25, 0x10000001 Mar 30 16:08 stripe.p0
brwxr-xr--  1 root  wheel   25, 0x10000000 Mar 30 16:08 tinyvol.p0
brwxr-xr--  1 root  wheel   25, 0x10000004 Mar 30 16:08 vol5.p0
brwxr-xr--  1 root  wheel   25, 0x10010004 Mar 30 16:08 vol5.p1

/dev/vinum/sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20000002 Mar 30 16:08 concat.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100002 Mar 30 16:08 concat.p0.s1
brwxr-xr--  1 root  wheel   25, 0x20010002 Mar 30 16:08 concat.p1.s0
brwxr-xr--  1 root  wheel   25, 0x20000003 Mar 30 16:08 strcon.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100003 Mar 30 16:08 strcon.p0.s1
brwxr-xr--  1 root  wheel   25, 0x20010003 Mar 30 16:08 strcon.p1.s0
brwxr-xr--  1 root  wheel   25, 0x20110003 Mar 30 16:08 strcon.p1.s1
brwxr-xr--  1 root  wheel   25, 0x20000001 Mar 30 16:08 stripe.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100001 Mar 30 16:08 stripe.p0.s1
brwxr-xr--  1 root  wheel   25, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
brwxr-xr--  1 root  wheel   25, 0x20000004 Mar 30 16:08 vol5.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100004 Mar 30 16:08 vol5.p0.s1
brwxr-xr--  1 root  wheel   25, 0x20010004 Mar 30 16:08 vol5.p1.s0
brwxr-xr--  1 root  wheel   25, 0x20110004 Mar 30 16:08 vol5.p1.s1

/dev/vinum/vol:
total 5
brwxr-xr--  1 root  wheel   25,   2 Mar 30 16:08 concat
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 concat.plex
brwxr-xr--  1 root  wheel   25,   3 Mar 30 16:08 strcon
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 strcon.plex
brwxr-xr--  1 root  wheel   25,   1 Mar 30 16:08 stripe
drwxr-xr-x  3 root  wheel       512 Mar 30 16:08 stripe.plex
brwxr-xr--  1 root  wheel   25,   0 Mar 30 16:08 tinyvol
drwxr-xr-x  3 root  wheel       512 Mar 30 16:08 tinyvol.plex
brwxr-xr--  1 root  wheel   25,   4 Mar 30 16:08 vol5
drwxr-xr-x  4 root  wheel       512 Mar 30 16:08 vol5.plex

/dev/vinum/vol/concat.plex:
total 2
brwxr-xr--  1 root  wheel   25, 0x10000002 Mar 30 16:08 concat.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 concat.p0.sd
brwxr-xr--  1 root  wheel   25, 0x10010002 Mar 30 16:08 concat.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 concat.p1.sd

/dev/vinum/vol/concat.plex/concat.p0.sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20000002 Mar 30 16:08 concat.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100002 Mar 30 16:08 concat.p0.s1

/dev/vinum/vol/concat.plex/concat.p1.sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20010002 Mar 30 16:08 concat.p1.s0

/dev/vinum/vol/strcon.plex:
total 2
brwxr-xr--  1 root  wheel   25, 0x10000003 Mar 30 16:08 strcon.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 strcon.p0.sd
brwxr-xr--  1 root  wheel   25, 0x10010003 Mar 30 16:08 strcon.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 strcon.p1.sd

/dev/vinum/vol/strcon.plex/strcon.p0.sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20000003 Mar 30 16:08 strcon.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100003 Mar 30 16:08 strcon.p0.s1

/dev/vinum/vol/strcon.plex/strcon.p1.sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20010003 Mar 30 16:08 strcon.p1.s0
brwxr-xr--  1 root  wheel   25, 0x20110003 Mar 30 16:08 strcon.p1.s1

/dev/vinum/vol/stripe.plex:
total 1
brwxr-xr--  1 root  wheel   25, 0x10000001 Mar 30 16:08 stripe.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 stripe.p0.sd

/dev/vinum/vol/stripe.plex/stripe.p0.sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20000001 Mar 30 16:08 stripe.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100001 Mar 30 16:08 stripe.p0.s1

/dev/vinum/vol/tinyvol.plex:
total 1
brwxr-xr--  1 root  wheel   25, 0x10000000 Mar 30 16:08 tinyvol.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 tinyvol.p0.sd

/dev/vinum/vol/tinyvol.plex/tinyvol.p0.sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100000 Mar 30 16:08 tinyvol.p0.s1

/dev/vinum/vol/vol5.plex:
total 2
brwxr-xr--  1 root  wheel   25, 0x10000004 Mar 30 16:08 vol5.p0
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 vol5.p0.sd
brwxr-xr--  1 root  wheel   25, 0x10010004 Mar 30 16:08 vol5.p1
drwxr-xr-x  2 root  wheel       512 Mar 30 16:08 vol5.p1.sd

/dev/vinum/vol/vol5.plex/vol5.p0.sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20000004 Mar 30 16:08 vol5.p0.s0
brwxr-xr--  1 root  wheel   25, 0x20100004 Mar 30 16:08 vol5.p0.s1

/dev/vinum/vol/vol5.plex/vol5.p1.sd:
total 0
brwxr-xr--  1 root  wheel   25, 0x20010004 Mar 30 16:08 vol5.p1.s0
brwxr-xr--  1 root  wheel   25, 0x20110004 Mar 30 16:08 vol5.p1.s1
.Ed
.Pp
In the case of unattached plexes and subdisks, the naming is reversed.
Subdisks
are named after the disk on which they are located, and plexes are named after
the subdisk.
.\" XXX
.Bf -symbolic
This mapping is still to be determined.
.Ef
.Ss Object States
Each
.Nm
object has a
.Em state
associated with it.
.Nm
uses this state to determine the handling of the object.
.Ss Volume States
Volumes may have the following states:
.Bl -hang -width 14n
.It Em down
The volume is completely inaccessible.
.It Em up
The volume is up and at least partially functional.
Not all plexes may be
available.
.El
.Ss "Plex States"
Plexes may have the following states:
.Bl -hang -width 14n
.It Em referenced
A plex entry which has been referenced as part of a volume, but which is
currently not known.
.It Em faulty
A plex which has gone completely down because of I/O errors.
.It Em down
A plex which has been taken down by the administrator.
.It Em initializing
A plex which is being initialized.
.El
.Pp
The remaining states represent plexes which are at least partially up.
.Bl -hang -width 14n
.It Em corrupt
A plex entry which is at least partially up.
Not all subdisks are available,
and an inconsistency has occurred.
If no other plex is uncorrupted, the volume
is no longer consistent.
.It Em degraded
A RAID-5 plex entry which is accessible, but one subdisk is down, requiring
recovery for many I/O requests.
.It Em flaky
A plex which is really up, but which has a reborn subdisk which we do not
completely trust, and which we do not want to read if we can avoid it.
.It Em up
A plex entry which is completely up.
All subdisks are up.
.El
.Ss "Subdisk States"
Subdisks can have the following states:
.Bl -hang -width 14n
.It Em empty
A subdisk entry which has been created completely.
All fields are correct, and
the disk has been updated, but the data on the disk is not valid.
.It Em referenced
A subdisk entry which has been referenced as part of a plex, but which is
currently not known.
.It Em initializing
A subdisk entry which has been created completely and which is currently being
initialized.
.El
.Pp
The following states represent invalid data.
.Bl -hang -width 14n
.It Em obsolete
A subdisk entry which has been created completely.
All fields are correct, the
config on disk has been updated, and the data was valid, but since then the
drive has been taken down, and as a result updates have been missed.
.It Em stale
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has
crashed and updates have been lost.
.El
.Pp
The following states represent valid, inaccessible data.
.Bl -hang -width 14n
.It Em crashed
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down.
No attempt has been made to write to the subdisk since the crash, so the
data is valid.
.It Em down
A subdisk entry which was up, which contained valid data, and which was taken
down by the administrator.
The data is valid.
.It Em reviving
The subdisk is currently in the process of being revived.
We can write but not
read.
.El
.Pp
The following states represent accessible subdisks with valid data.
.Bl -hang -width 14n
.It Em reborn
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down and up again.
No updates were lost, but it is possible that the subdisk
has been damaged.
We will not read from this subdisk if we have a choice.
If this
is the only subdisk which covers this address space in the plex, we set its
state to up under these circumstances, so this status implies that there is
another subdisk to fulfill the request.
.It Em up
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data is valid.
.El
.Ss "Drive States"
Drives can have the following states:
.Bl -hang -width 14n
.It Em referenced
At least one subdisk refers to the drive, but it is not currently accessible to
the system.
No device name is known.
.It Em down
The drive is not accessible.
.It Em up
The drive is up and running.
.El
.Sh SEE ALSO
.Xr loader.conf 5 ,
.Xr disklabel 8 ,
.Xr gvinum 8 ,
.Xr loader 8 ,
.Xr newfs 8
.Sh HISTORY
.Nm
first appeared in
.Fx 3.0 .
The RAID-5 component of
.Nm
was developed by Cybernet Inc.\&
.Pq Pa http://www.cybernet.com/ ,
for its NetMAX product.
.Sh AUTHORS
.An Greg Lehey Aq grog@lemis.com .
.Sh BUGS
.Nm
is a new product.
Bugs can be expected.
The configuration mechanism is not yet
fully functional.
If you have difficulties, please look at the section
.Sx "DEBUGGING PROBLEMS WITH VINUM"
before reporting problems.
.Pp
Kernels with the
.Nm
device appear to work, but are not supported.
If you have trouble with
this configuration, please first replace the kernel with a
.No non- Ns Nm
kernel and test with the KLD module.
.Pp
Detection of differences between the version of the kernel and the KLD is not
yet implemented.
.Pp
The RAID-5 functionality is new in
.Fx 3.3 .
Some problems have been
reported with
.Nm
in combination with soft updates, but these are not reproducible on all
systems.
If you are planning to use
.Nm
in a production environment, please test carefully.
.Sh DEBUGGING PROBLEMS WITH VINUM
Solving problems with
.Nm
can be a difficult affair.
This section suggests some approaches.
.Ss Configuration problems
It is relatively easy (too easy) to run into problems with the
.Nm
configuration.
If you do, the first thing you should do is stop configuration
updates:
.Pp
.Dl "vinum setdaemon 4"
.Pp
This will stop updates and any further corruption of the on-disk configuration.
.Pp
Next, look at the on-disk configuration, using a Bourne-style shell:
.Bd -literal
rm -f log
for i in /dev/da0s1h /dev/da1s1h /dev/da2s1h /dev/da3s1h; do
  (dd if=$i skip=8 count=6|tr -d '\e000-\e011\e200-\e377'; echo) >> log
done
.Ed
.Pp
The names of the devices are the names of all
.Nm
slices.
The file
.Pa log
should then contain something like this:
.Bd -literal
.if t .ps -3
.if t .vs -3
IN VINOpanic.lemis.comdrive1}6E7~^K6T^Yfoovolume obj state up
volume src state up
volume raid state down
volume r state down
volume foo state up
plex name obj.p0 state corrupt org concat vol obj
plex name obj.p1 state corrupt org striped 128b vol obj
plex name src.p0 state corrupt org striped 128b vol src
plex name src.p1 state up org concat vol src
plex name raid.p0 state faulty org disorg vol raid
plex name r.p0 state faulty org disorg vol r
plex name foo.p0 state up org concat vol foo
plex name foo.p1 state faulty org concat vol foo
sd name obj.p0.s0 drive drive2 plex obj.p0 state reborn len 409600b driveoffset 265b plexoffset 0b
sd name obj.p0.s1 drive drive4 plex obj.p0 state up len 409600b driveoffset 265b plexoffset 409600b
sd name obj.p1.s0 drive drive1 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 0b
sd name obj.p1.s1 drive drive2 plex obj.p1 state reborn len 204800b driveoffset 409865b plexoffset 128b
sd name obj.p1.s2 drive drive3 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 256b
sd name obj.p1.s3 drive drive4 plex obj.p1 state up len 204800b driveoffset 409865b plexoffset 384b
.if t .vs
.if t .ps
.Ed
.Pp
The first line contains the
.Nm
label and must start with the text
.Dq Li "IN VINO" .
It also contains the name of the system.
The exact definition is contained in
.Pa /usr/src/sys/dev/vinum/vinumvar.h .
The saved configuration begins in the middle of the line with the text
.Dq Li "volume obj state up" ;
on disk, it starts in sector 9.
The rest of the output shows the remainder of the on-disk configuration.
It may be necessary to increase the
.Cm count
argument of
.Xr dd 1
in order to see the complete configuration.
.Pp
The configuration on all disks should be the same.
If this is not the case,
please report the problem with the exact contents of the file
.Pa log .
There is probably little that can be done to recover the on-disk configuration,
but if you keep a copy of the files used to create the objects, you should be
able to re-create them.
The
.Ic create
command does not change the subdisk data, so this will not cause data
corruption.
You may need to use the
.Ic resetconfig
command if you have this kind of trouble.
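.Pp
The comparison across drives can be done mechanically.
The following is a minimal sketch and not part of
.Nm ;
the function name
.Li vinum_cfg_compare ,
the
.Pa cfg.*
output files and the value
.Li count=5
are assumptions made for illustration.
It skips the label and reads only the saved configuration starting in sector 9,
because the label is expected to differ from drive to drive
(in the example above it contains the name
.Li drive1 ) :
.Bd -literal
# Hypothetical helper: compare the saved configuration on several drives.
# Usage: vinum_cfg_compare device ...
vinum_cfg_compare() {
    ref=""
    status=0
    for dev in "$@"; do
        out="cfg.$(basename "$dev")"
        # Skip the label in sector 8 and read only the configuration,
        # which starts in sector 9.  count=5 is an assumption; increase
        # it if the configuration is longer.
        dd if="$dev" skip=9 count=5 2>/dev/null |
            LC_ALL=C tr -cd '[:print:][:space:]' > "$out"
        if [ -z "$ref" ]; then
            # The first device is the reference for all others.
            ref="$out"
        elif ! cmp -s "$ref" "$out"; then
            echo "$out differs from $ref"
            status=1
        fi
    done
    return $status
}
.Ed
.Pp
For example,
.Ql "vinum_cfg_compare /dev/da0s1h /dev/da1s1h /dev/da2s1h /dev/da3s1h"
prints the name of each dump that differs from the first one and returns a
non-zero status if any of them does.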
.Ss Kernel Panics
In order to analyse a panic which you suspect comes from
.Nm ,
you will need to build a debug kernel.
See the online handbook at
.Pa /usr/share/doc/en/books/developers-handbook/kerneldebug.html
(if installed) or
.Pa http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/developers-\%handbook/kerneldebug.html
for more details of how to do this.
.Pp
Perform the following steps to analyse a
.Nm
problem:
.Bl -enum
.It
Copy the following files to the directory in which you will be
performing the analysis, typically
.Pa /var/crash :
.Pp
.Bl -bullet -compact
.It
.Pa /usr/src/sys/modules/vinum/.gdbinit.crash ,
.It
.Pa /usr/src/sys/modules/vinum/.gdbinit.kernel ,
.It
.Pa /usr/src/sys/modules/vinum/.gdbinit.serial ,
.It
.Pa /usr/src/sys/modules/vinum/.gdbinit.vinum
and
.It
.Pa /usr/src/sys/modules/vinum/.gdbinit.vinum.paths
.El
.It
Make sure that you build the
.Nm
module with debugging information.
The standard
.Pa Makefile
builds a module with debugging symbols by default.
If the version of
.Nm
in
.Pa /boot/kernel
does not contain symbols, you will not get an error message, but the stack trace
will not show the symbols.
Check the module before starting
.Xr gdb 1 :
.Bd -literal
$ file /boot/kernel/vinum.ko
/boot/kernel/vinum.ko: ELF 32-bit LSB shared object, Intel 80386,
  version 1 (FreeBSD), not stripped
.Ed
.Pp
If the output shows that
.Pa /boot/kernel/vinum.ko
is stripped, you will have to find a version which is not.
Usually this will be either in
.Pa /usr/obj/sys/modules/vinum/vinum.ko
(if you have built
.Nm
with a
.Dq Li "make world" )
or
.Pa /usr/src/sys/modules/vinum/vinum.ko
(if you have built
.Nm
in this directory).
Modify the file
.Pa .gdbinit.vinum.paths
accordingly.
.It
Either take a dump or use remote serial
.Xr gdb 1
to analyse the problem.
To analyse a dump, say
.Pa /var/crash/vmcore.5 ,
link
.Pa /var/crash/.gdbinit.crash
to
.Pa /var/crash/.gdbinit
and enter:
.Bd -literal -offset indent
cd /var/crash
gdb -k kernel.debug vmcore.5
.Ed
.Pp
This example assumes that you have installed the correct debug kernel at
.Pa /var/crash/kernel.debug .
If not, substitute the correct name of the debug kernel.
.Pp
To perform remote serial debugging,
link
.Pa /var/crash/.gdbinit.serial
to
.Pa /var/crash/.gdbinit
and enter:
.Bd -literal -offset indent
cd /var/crash
gdb -k kernel.debug
.Ed
.Pp
In this case, the
.Pa .gdbinit
file performs the functions necessary to establish the connection.
The remote
machine must already be in debug mode: enter the kernel debugger and select
.Ic gdb
(see
.Xr ddb 4
for more details).
The serial
.Pa .gdbinit
file expects the serial connection to run at 38400 bits per second; if you run
at a different speed, edit the file accordingly (look for the
.Va remotebaud
specification).
.Pp
The following example shows a remote debugging session using the
.Ic debug
command of
.Xr gvinum 8 :
.Bd -literal
.if t .ps -3
.if t .vs -3
GDB 4.16 (i386-unknown-freebsd), Copyright 1996 Free Software Foundation, Inc.
Debugger (msg=0xf1093174 "vinum debug") at ../../i386/i386/db_interface.c:318
318                 in_Debugger = 0;
#1  0xf108d9bc in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6dedee0 "",
    flag=0x3, p=0xf68b7940) at
    /usr/src/sys/modules/Vinum/../../dev/Vinum/vinumioctl.c:102
102             Debugger ("vinum debug");
(kgdb) bt
#0  Debugger (msg=0xf0f661ac "vinum debug") at ../../i386/i386/db_interface.c:318
#1  0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
      flag=0x3, p=0xf688e6c0) at
      /usr/src/sys/modules/vinum/../../dev/vinum/vinumioctl.c:109
#2  0xf01833b7 in spec_ioctl (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:424
#3  0xf0182cc9 in spec_vnoperate (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:129
#4  0xf01eb3c1 in ufs_vnoperatespec (ap=0xf6923e0c) at ../../ufs/ufs/ufs_vnops.c:2312
#5  0xf017dbb1 in vn_ioctl (fp=0xf1007ec0, com=0xc008464b, data=0xf6923ed0 "",
      p=0xf688e6c0) at vnode_if.h:395
#6  0xf015dce0 in ioctl (p=0xf688e6c0, uap=0xf6923f84) at ../../kern/sys_generic.c:473
#7  0xf0214c0b in syscall (frame={tf_es = 0x27, tf_ds = 0x27, tf_edi = 0xefbfcff8,
      tf_esi = 0x1, tf_ebp = 0xefbfcf90, tf_isp = 0xf6923fd4, tf_ebx = 0x2,
      tf_edx = 0x804b614, tf_ecx = 0x8085d10, tf_eax = 0x36, tf_trapno = 0x7,
      tf_err = 0x2, tf_eip = 0x8060a34, tf_cs = 0x1f, tf_eflags = 0x286,
      tf_esp = 0xefbfcf78, tf_ss = 0x27}) at ../../i386/i386/trap.c:1100
#8  0xf020a1fc in Xint0x80_syscall ()
#9  0x804832d in ?? ()
#10 0x80482ad in ?? ()
#11 0x80480e9 in ?? ()
.if t .vs
.if t .ps
.Ed
.Pp
When entering from the debugger, it is important that the source of frame 1
(listed by the
.Pa .gdbinit
file at the top of the example) contains the text
.Dq Li "Debugger (\*[q]vinum debug\*[q]);" .
This indicates that the address specifications are correct.
If you get
some other output, your symbols and the kernel module are out of sync, and the
trace will be meaningless.
.El
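.Pp
The copying and linking in steps 1 and 3 can be scripted.
The following is a sketch only: the function name
.Li setup_gdbinit
and its three arguments are hypothetical, and it uses a symbolic link where the
text above just says
.Dq link :
.Bd -literal
# Hypothetical helper: setup_gdbinit srcdir dstdir crash|serial
setup_gdbinit() {
    src=$1
    dst=$2
    mode=$3
    case "$mode" in
    crash|serial) ;;
    *) echo "usage: setup_gdbinit srcdir dstdir crash|serial" >&2
       return 2 ;;
    esac
    # Copy the five .gdbinit files named in step 1.
    for f in crash kernel serial vinum vinum.paths; do
        cp "$src/.gdbinit.$f" "$dst/" || return 1
    done
    # Select the dump or serial variant; a symbolic link is an assumption.
    ln -sf ".gdbinit.$mode" "$dst/.gdbinit"
}
.Ed
.Pp
For a dump analysis in
.Pa /var/crash ,
this would be called as
.Ql "setup_gdbinit /usr/src/sys/modules/vinum /var/crash crash" .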
.Pp
For an initial investigation, the most important information is the output of
the
.Ic bt
(backtrace) command above.
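.Pp
The check for debugging symbols described in step 2 can be sketched as a small
shell function.
The name
.Li find_unstripped
is hypothetical; the function relies only on
.Xr file 1
reporting
.Dq "not stripped" ,
as in the example output above:
.Bd -literal
# Hypothetical helper: print the first module that still has symbols.
# Usage: find_unstripped module ...
find_unstripped() {
    for ko in "$@"; do
        [ -f "$ko" ] || continue
        # -b omits the file name, so only the description is searched.
        if file -b "$ko" | grep -q "not stripped"; then
            echo "$ko"
            return 0
        fi
    done
    return 1
}
.Ed
.Pp
Called with the three module paths mentioned in step 2, it prints the first one
that is suitable for
.Pa .gdbinit.vinum.paths ,
or returns a non-zero status if none of them is.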
.Ss Reporting Problems with Vinum
If you find any bugs in
.Nm ,
please report them to
.An Greg Lehey Aq grog@lemis.com .
Supply the following
information:
.Bl -bullet
.It
The output of the
.Nm vinum Cm list
command
(see
.Xr gvinum 8 ) .
.It
Any messages printed in
.Pa /var/log/messages .
All such messages will be identified by the text
.Dq Li vinum
at the beginning.
.It
If you have a panic, a stack trace as described above.
.El
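.Pp
Gathering the first two items can be scripted.
This is a sketch under assumptions: the function name
.Li collect_report ,
the
.Ev VINUM
variable and the output layout are invented for illustration, and the pattern
matches
.Dq Li vinum
anywhere in a line, because
.Xr syslogd 8
normally puts a time stamp and host name before the message text:
.Bd -literal
# Hypothetical helper: collect_report outfile [logfile]
collect_report() {
    out=$1
    log=${2:-/var/log/messages}
    {
        echo "== vinum list =="
        # VINUM may name a different binary; the default is vinum.
        ${VINUM:-vinum} list 2>&1
        echo "== messages =="
        grep vinum "$log"
    } > "$out"
}
.Ed
.Pp
A stack trace, if there was a panic, still has to be added by hand.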