/share/doc/IPv6/IMPLEMENTATION

https://bitbucket.org/freebsd/freebsd-head/

	Implementation Note

	KAME Project
	http://www.kame.net/
	$KAME: IMPLEMENTATION,v 1.216 2001/05/25 07:43:01 jinmei Exp $
	$FreeBSD$

NOTE: This document tries to describe the behavior and implementation
choices of the latest KAME/*BSD stack.  The description here may not be
applicable to KAME-integrated *BSD releases, as there is a certain amount
of change between them.  Still, some of the content can be useful for
KAME-integrated *BSD releases.
  13
Table of Contents

	1. IPv6
	1.1 Conformance
	1.2 Neighbor Discovery
	1.3 Scope Zone Index
	1.3.1 Kernel internal
	1.3.2 Interaction with API
	1.3.3 Interaction with users (command line)
	1.4 Plug and Play
	1.4.1 Assignment of link-local, and special addresses
	1.4.2 Stateless address autoconfiguration on hosts
	1.4.3 DHCPv6
	1.5 Generic tunnel interface
	1.6 Address Selection
	1.6.1 Source Address Selection
	1.6.2 Destination Address Ordering
	1.7 Jumbo Payload
	1.8 Loop prevention in header processing
	1.9 ICMPv6
	1.10 Applications
	1.11 Kernel Internals
	1.12 IPv4 mapped address and IPv6 wildcard socket
	1.12.1 KAME/BSDI3 and KAME/FreeBSD228
	1.12.2 KAME/FreeBSD[34]x
	1.12.2.1 KAME/FreeBSD[34]x, listening side
	1.12.2.2 KAME/FreeBSD[34]x, initiating side
	1.12.3 KAME/NetBSD
	1.12.3.1 KAME/NetBSD, listening side
	1.12.3.2 KAME/NetBSD, initiating side
	1.12.4 KAME/BSDI4
	1.12.4.1 KAME/BSDI4, listening side
	1.12.4.2 KAME/BSDI4, initiating side
	1.12.5 KAME/OpenBSD
	1.12.5.1 KAME/OpenBSD, listening side
	1.12.5.2 KAME/OpenBSD, initiating side
	1.12.6 More issues
	1.12.7 Interaction with SIIT translator
	1.13 sockaddr_storage
	1.14 Invalid addresses on the wire
	1.15 Node's required addresses
	1.15.1 Host case
	1.15.2 Router case
	1.16 Advanced API
	1.17 DNS resolver
	2. Network Drivers
	2.1 FreeBSD 2.2.x-RELEASE
	2.2 BSD/OS 3.x
	2.3 NetBSD
	2.4 FreeBSD 3.x-RELEASE
	2.5 FreeBSD 4.x-RELEASE
	2.6 OpenBSD 2.x
	2.7 BSD/OS 4.x
	3. Translator
	3.1 FAITH TCP relay translator
	3.2 IPv6-to-IPv4 header translator
	4. IPsec
	4.1 Policy Management
	4.2 Key Management
	4.3 AH and ESP handling
	4.4 IPComp handling
	4.5 Conformance to RFCs and IDs
	4.6 ECN consideration on IPsec tunnels
	4.7 Interoperability
	4.8 Operations with IPsec tunnel mode
	4.8.1 RFC2401 IPsec tunnel mode approach
	4.8.2 draft-touch-ipsec-vpn approach
	5. ALTQ
	6. Mobile IPv6
	6.1 KAME node as correspondent node
	6.2 KAME node as home agent/mobile node
	6.3 Old Mobile IPv6 code
	7. Coding style
	8. Policy on technology with intellectual property right restriction
  88
1. IPv6

1.1 Conformance

The KAME kit conforms, or tries to conform, to the latest set of IPv6
specifications.  For future reference we list some of the relevant documents
below (NOTE: this is not a complete list - a complete one is too hard to
maintain...).  For details please refer to the specific chapter in this
document, the RFCs, the manpages that come with KAME, or comments in the
source code.

Conformance tests have been performed on past and latest KAME STABLE kits
at the TAHI project.  Results can be viewed at http://www.tahi.org/report/KAME/.
We also attended Univ. of New Hampshire IOL tests (http://www.iol.unh.edu/)
in the past, with our past snapshots.
 103
RFC1639: FTP Operation Over Big Address Records (FOOBAR)
    * RFC2428 is preferred over RFC1639.  ftp clients will first try RFC2428,
      then RFC1639 if that fails.
RFC1886: DNS Extensions to support IPv6
RFC1933: (see RFC2893)
RFC1981: Path MTU Discovery for IPv6
RFC2080: RIPng for IPv6
    * KAME-supplied route6d, bgpd and hroute6d support this.
RFC2283: Multiprotocol Extensions for BGP-4
    * so-called "BGP4+".
    * KAME-supplied bgpd supports this.
RFC2292: Advanced Sockets API for IPv6
    * see RFC3542
RFC2362: Protocol Independent Multicast-Sparse Mode (PIM-SM)
    * RFC2362 defines the packet formats and the protocol of PIM-SM.
RFC2373: IPv6 Addressing Architecture
    * KAME supports node required addresses, and conforms to the scope
      requirement.
RFC2374: An IPv6 Aggregatable Global Unicast Address Format
    * KAME supports 64-bit length of Interface ID.
RFC2375: IPv6 Multicast Address Assignments
    * Userland applications use the well-known addresses assigned in the RFC.
RFC2428: FTP Extensions for IPv6 and NATs
    * RFC2428 is preferred over RFC1639.  ftp clients will first try RFC2428,
      then RFC1639 if that fails.
RFC2460: IPv6 specification
RFC2461: Neighbor discovery for IPv6
    * See 1.2 in this document for details.
RFC2462: IPv6 Stateless Address Autoconfiguration
    * See 1.4 in this document for details.
RFC2463: ICMPv6 for IPv6 specification
    * See 1.9 in this document for details.
RFC2464: Transmission of IPv6 Packets over Ethernet Networks
RFC2465: MIB for IPv6: Textual Conventions and General Group
    * Necessary statistics are gathered by the kernel.  Actual IPv6 MIB
      support is provided as a patchkit for ucd-snmp.
RFC2466: MIB for IPv6: ICMPv6 group
    * Necessary statistics are gathered by the kernel.  Actual IPv6 MIB
      support is provided as a patchkit for ucd-snmp.
RFC2467: Transmission of IPv6 Packets over FDDI Networks
RFC2472: IPv6 over PPP
RFC2492: IPv6 over ATM Networks
    * only PVC is supported.
RFC2497: Transmission of IPv6 packet over ARCnet Networks
RFC2545: Use of BGP-4 Multiprotocol Extensions for IPv6 Inter-Domain Routing
RFC2553: (see RFC3493)
RFC2671: Extension Mechanisms for DNS (EDNS0)
    * see USAGE for how to use it.
    * not supported on kame/freebsd4 and kame/bsdi4.
RFC2673: Binary Labels in the Domain Name System
    * KAME/bsdi4 supports A6, DNAME and binary label to some extent.
    * KAME apps/bind8 repository has resolver library with partial A6, DNAME
      and binary label support.
RFC2675: IPv6 Jumbograms
    * See 1.7 in this document for details.
RFC2710: Multicast Listener Discovery for IPv6
RFC2711: IPv6 router alert option
RFC2732: Format for Literal IPv6 Addresses in URL's
    * The spec is implemented in programs that handle URLs
      (like freebsd ftpio(3) and fetch(1), or netbsd ftp(1))
RFC2874: DNS Extensions to Support IPv6 Address Aggregation and Renumbering
    * KAME/bsdi4 supports A6, DNAME and binary label to some extent.
    * KAME apps/bind8 repository has resolver library with partial A6, DNAME
      and binary label support.
RFC2893: Transition Mechanisms for IPv6 Hosts and Routers
    * IPv4 compatible address is not supported.
    * automatic tunneling (4.3) is not supported.
    * "gif" interface implements IPv[46]-over-IPv[46] tunnel in a generic way,
      and it covers "configured tunnel" described in the spec.
      See 1.5 in this document for details.
RFC2894: Router renumbering for IPv6
RFC3041: Privacy Extensions for Stateless Address Autoconfiguration in IPv6
RFC3056: Connection of IPv6 Domains via IPv4 Clouds
    * So-called "6to4".
    * "stf" interface implements it.  Be sure to read
      draft-itojun-ipv6-transition-abuse-01.txt
      below before configuring it, as there can be security issues.
RFC3142: An IPv6-to-IPv4 transport relay translator
    * FAITH tcp relay translator (faithd) implements this.  See 3.1 for more
      details.
RFC3152: Delegation of IP6.ARPA
    * libinet6 resolvers contained in the KAME snaps support the use of
      the ip6.arpa domain (with the nibble format) for IPv6 reverse
      lookups.
RFC3484: Default Address Selection for IPv6
    * the selection algorithm for both source and destination addresses
      is implemented based on the RFC, though some rules are still omitted.
RFC3493: Basic Socket Interface Extensions for IPv6
    * IPv4 mapped address (3.7) and special behavior of IPv6 wildcard bind
      socket (3.8) are,
	- supported and turned on by default on KAME/FreeBSD[34]
	  and KAME/BSDI4,
	- supported but turned off by default on KAME/NetBSD and KAME/FreeBSD5,
	- not supported on KAME/FreeBSD228, KAME/OpenBSD and KAME/BSDI3.
      see 1.12 in this document for details.
    * The AI_ALL and AI_V4MAPPED flags are not supported.
RFC3542: Advanced Sockets API for IPv6 (revised)
    * For supported library functions/kernel APIs, see sys/netinet6/ADVAPI.
    * Some of the updates in the draft are not implemented yet.  See
      TODO.2292bis for more details.
RFC4007: IPv6 Scoped Address Architecture
    * some parts of the document (especially about the routing
      model) are not supported yet.
    * zone indices that contain scope types are not supported yet.
 208
draft-ietf-ipngwg-icmp-name-lookups-09: IPv6 Name Lookups Through ICMP
draft-ietf-ipv6-router-selection-07.txt:
	Default Router Preferences and More-Specific Routes
    * router-side: both router preference and specific routes are supported.
    * host-side: only router preference is supported.
draft-ietf-pim-sm-v2-new-02.txt
	A revised version of RFC2362, which includes the IPv6 specific
	packet format and protocol descriptions.
draft-ietf-dnsext-mdns-00.txt: Multicast DNS
    * kame/mdnsd has test implementation, which will not be built in
      default compilation.  The draft will experience a major change in the
      near future, so don't rely upon it.
draft-ietf-ipngwg-icmp-v3-02.txt: ICMPv6 for IPv6 specification (revised)
    * See 1.9 in this document for details.
draft-itojun-ipv6-tcp-to-anycast-01.txt:
	Disconnecting TCP connection toward IPv6 anycast address
draft-ietf-ipv6-rfc2462bis-06.txt: IPv6 Stateless Address
	Autoconfiguration (revised)
draft-itojun-ipv6-transition-abuse-01.txt:
	Possible abuse against IPv6 transition technologies (expired)
    * KAME does not implement RFC1933/2893 automatic tunnel.
    * "stf" interface implements some address filters.  Refer to stf(4)
      for details.  Since there's no way to make 6to4 interface 100% secure,
      we do not include "stf" interface into GENERIC.v6 compilation.
    * kame/openbsd completely disables IPv4 mapped address support.
    * kame/netbsd makes IPv4 mapped address support off by default.
    * See section 1.12.6 and 1.14 for more details.
draft-itojun-ipv6-flowlabel-api-01.txt: Socket API for IPv6 flow label field
    * no consideration is made against the use of routing headers and such.
 238
1.2 Neighbor Discovery

Our implementation of Neighbor Discovery is fairly stable.  Currently
Address Resolution, Duplicate Address Detection, and Neighbor
Unreachability Detection are supported.  In the near future we will be
adding an Unsolicited Neighbor Advertisement transmission command as
an administration tool.

Duplicate Address Detection (DAD) will be performed when an IPv6 address
is assigned to a network interface, or the network interface is enabled
(ifconfig up).  It is documented in RFC2462 5.4.
If DAD fails, the address will be marked "duplicated" and a message will be
generated to syslog (and usually to console).  The "duplicated" mark
can be checked with ifconfig.  It is the administrator's responsibility to
check for and recover from DAD failures.  We may try to improve failure
recovery in future KAME code.
 255
A successor version of RFC2462 (called rfc2462bis) clarifies the
behavior when DAD fails (i.e., a duplicate is detected): if the
duplicate address is a link-local address formed from an interface
identifier based on the hardware address which is supposed to be
uniquely assigned (e.g., EUI-64 for an Ethernet interface), IPv6
operation on the interface should be disabled.  The KAME
implementation supports this as follows: if this type of duplicate is
detected, the kernel marks the interface "disabled" in its ND specific
data structure.  Every IPv6 I/O operation in the kernel
checks this mark, and the kernel will drop packets received on or
being sent to the "disabled" interface.  Whether the IPv6 operation is
disabled or not can be confirmed by the ndp(8) command.  See the man
page for more details.

The DAD procedure may not be effective on certain network interfaces/drivers.
If a network driver needs a long initialization time (this situation is
common with wireless network interfaces), and the driver mistakenly raises
IFF_RUNNING before the driver becomes ready, the DAD code will try to
transmit DAD probes to the not-really-ready network driver and the packets
will not go out from the interface.  In such cases, network drivers should
be corrected.

Some network drivers loop multicast packets back to themselves,
even if instructed not to do so (especially in promiscuous mode).  In
such cases DAD may fail, because the DAD engine sees an inbound NS packet
(actually from the node itself) and considers it a sign of a
duplicate.  In this case, drivers should be corrected to honor
IFF_SIMPLEX behavior.  For example, you may need to check the source MAC
address on an inbound packet, and reject it if it is from the node
itself.
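
As a rough illustration of the check suggested above, a receive path could
compare the source MAC address of each inbound Ethernet frame with the
interface's own address.  The following is only a sketch in plain C, not
code from any KAME driver: the helper name ex_frame_is_own_echo() and the
assumption that the driver sees a raw Ethernet header (6-byte destination,
6-byte source, 2-byte type) are ours.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_ETHER_ADDR_LEN 6
#define EX_ETHER_HDR_LEN  14	/* dst(6) + src(6) + ethertype(2) */

/*
 * Hypothetical helper: return true if the frame's source MAC address
 * equals our own, i.e. the frame is our own transmission looped back
 * and should be dropped before it reaches the IPv6 input path.
 */
bool
ex_frame_is_own_echo(const uint8_t *frame, size_t len,
    const uint8_t own_mac[EX_ETHER_ADDR_LEN])
{
	if (frame == NULL || len < EX_ETHER_HDR_LEN)
		return false;	/* too short to judge; leave it to the stack */
	/* The source address follows the 6-byte destination address. */
	return memcmp(frame + EX_ETHER_ADDR_LEN, own_mac,
	    EX_ETHER_ADDR_LEN) == 0;
}
```

A driver would call such a helper early in its input routine and drop the
frame when it returns true, so that the DAD engine never sees the node's
own NS packet.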
 285
The Neighbor Discovery specification (RFC2461) does not talk about neighbor
cache handling in the following cases:
(1) when there was no neighbor cache entry, and the node received an
    unsolicited RS/NS/NA/redirect packet without link-layer address
(2) neighbor cache handling on medium without link-layer address
    (we need a neighbor cache entry for IsRouter bit)
For (1), we implemented a workaround based on discussions on the IETF ipngwg
mailing list.  For more details, see the comments in the source code and the
email thread started from (IPng 7155), dated Feb 6 1999.

The IPv6 on-link determination rule (RFC2461) is quite different from
assumptions in BSD IPv4 network code.  To implement the behavior in
RFC2461 section 6.3.6 (3), the kernel needs to know the default
outgoing interface.  To configure the default outgoing interface, use
commands like "ndp -I de0" as root.  Then the kernel will have a
"default" route to the interface with the cloning "C" bit being on.
This default route will cause a neighbor cache entry to be created for
every destination that does not match an explicit route entry.

Note that we intentionally disable configuring the default interface
by default.  This is because we found it sometimes caused inconvenient
situations while it was rarely useful in practice.  For example,
consider a destination that has both IPv4 and IPv6 addresses but is
only reachable via IPv4.  Since our getaddrinfo(3) prefers IPv6 by
default, a (TCP) application using the library with PF_UNSPEC first
tries to connect to the IPv6 address.  If we turn on RFC 2461 6.3.6
(3), we have to wait for quite a long period before the first attempt
to make a connection fails.  If we turn it off, the first attempt will
immediately fail with EHOSTUNREACH, and then the application can try
the next, reachable address.
 316
The notion of the default interface is also disabled when the node is
acting as a router.  The reason is that routers tend to control all
routes stored in the kernel and the default route automatically
installed would rather confuse the routers.  Note that the spec misuses
the words "host" and "node" in several places in Section 5.2 of RFC
2461.  We basically read the word "node" in this section as "host,"
and thus believe the implementation policy does not break the
specification.

To avoid possible DoS attacks and infinite loops, the KAME stack will accept
only 10 options on an ND packet.  Therefore, if you have 20 prefix options
attached to an RA, only the first 10 prefixes will be recognized.
If this troubles you, please contact the KAME team and/or modify
nd6_maxndopt in sys/netinet6/nd6.c.  If there is high demand we may
provide a sysctl knob for the variable.
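
The effect of the limit can be sketched as follows.  This is a userland
model, not the code in nd6.c: the function name and EX_MAXNDOPT are made
up for the example, and only the option layout (a type byte, then a length
byte counted in units of 8 octets) is taken from RFC2461.  Rejecting a
zero length is what prevents the parser from looping forever.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_MAXNDOPT 10	/* mirrors the limit of 10 described above */

/*
 * Walk a buffer of ND options and count how many of them would be
 * recognized under the limit.  Returns the number of options recognized
 * (at most maxopt), or -1 if an option examined before the limit is
 * malformed (zero length field or truncated option).
 */
int
ex_count_nd_options(const uint8_t *buf, size_t len, int maxopt)
{
	size_t off = 0;
	int n = 0;

	while (off < len) {
		size_t optlen;

		if (n >= maxopt)
			break;		/* ignore everything past the limit */
		if (len - off < 2)
			return -1;	/* no room for type + length */
		optlen = (size_t)buf[off + 1] * 8;
		if (optlen == 0 || optlen > len - off)
			return -1;	/* would never advance, or overruns */
		off += optlen;
		n++;
	}
	return n;
}
```

With 20 prefix information options (type 3, length 4, i.e. 32 bytes each)
in the buffer, the function stops after the first 10, which matches the
behavior described in the paragraph above.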
 332
Proxy Neighbor Advertisement support is implemented in the kernel.
For instance, you can configure it by using the following command:
	# ndp -s fe80::1234%ne0 0:1:2:3:4:5 proxy
where ne0 is the interface which attaches to the same link as the
proxy target.
There are certain limitations, though:
- It does not send unsolicited multicast NA on configuration.  This is MAY
  behavior in RFC2461.
- It does not add random delay before transmission of solicited NA.  This is
  SHOULD behavior in RFC2461.
- We cannot configure proxy NDP for an off-link address.  The target address
  for proxying must be a link-local address, or must be in prefixes
  configured to the node which does proxy NDP.
- RFC2461 is unclear about whether it is legal for a host to perform proxy ND.
  We do not prohibit hosts from doing proxy ND, but there will be very limited
  use in it.

Starting mid March 2000, we support Neighbor Unreachability Detection
(NUD) on p2p interfaces, including tunnel interfaces (gif).  NUD is
turned on by default.  Before March 2000 the KAME stack did not
perform NUD on p2p interfaces.  If the change raises any
interoperability issues, you can turn NUD off/on on a per-interface
basis.  Use "ndp -i interface -nud" to turn it off.  Consult ndp(8)
for details.
 357
RFC2461 specifies an upper-layer reachability confirmation hint.  Whenever
an upper-layer reachability confirmation hint comes, the ND process can use
it to optimize the neighbor discovery process - the ND process can omit the
real ND exchange and keep the neighbor cache state in REACHABLE.
We currently have two sources for hints: (1) setsockopt(IPV6_REACHCONF)
defined by the RFC3542 API, and (2) hints from tcp(6)_input.

It is questionable if they are really trustworthy.  For example, a
rogue userland program can use IPV6_REACHCONF to confuse the ND
process.  Neighbor cache is a system-wide information pool, and it is
bad to allow a single process to affect others.  Also, tcp(6)_input
can be hosed by hijack attempts.  It is wrong to allow hijack attempts
to affect the ND process.

Starting June 2000, the ND code has a protection mechanism against
incorrect upper-layer reachability confirmation.  The ND code counts
subsequent upper-layer hints.  If the number of hints reaches the
maximum, the ND code will ignore further upper-layer hints and run
the real ND process to confirm reachability to the peer.  sysctl
net.inet6.icmp6.nd6_maxnudhint defines the maximum number of subsequent
upper-layer hints to be accepted.
(From April 2000 to June 2000, we rejected setsockopt(IPV6_REACHCONF) from
non-root processes - after a local discussion, it looks like hints are not
that trustworthy even if they are from privileged processes.)
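
The counting scheme above can be modeled in a few lines of C.  This is a
simplified sketch and not the nd6.c code: the structure, the function
names, and in particular the reset of the counter once a real ND exchange
confirms reachability are our assumptions for illustration; consult the
kernel source for the actual behavior.

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-in for a neighbor cache entry. */
struct ex_neighbor {
	int  byhint;	/* consecutive upper-layer hints accepted so far */
	bool reachable;	/* true while the entry is treated as REACHABLE */
};

/*
 * Upper-layer hint (e.g. from TCP input or IPV6_REACHCONF).  Returns true
 * if the hint was accepted, false if it was ignored because the budget of
 * consecutive hints (maxnudhint) is used up and a real ND exchange is
 * needed instead.
 */
bool
ex_nud_hint(struct ex_neighbor *n, int maxnudhint)
{
	if (n->byhint >= maxnudhint)
		return false;
	n->byhint++;
	n->reachable = true;
	return true;
}

/*
 * Reachability confirmed by a real ND exchange.  Assumption of this
 * sketch: the hint budget is renewed at this point.
 */
void
ex_nd_confirmed(struct ex_neighbor *n)
{
	n->byhint = 0;
	n->reachable = true;
}
```

In this model a rogue process or a hijacked TCP session can keep an entry
in REACHABLE for at most maxnudhint consecutive hints before the stack
insists on a real ND exchange.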
 382
If inbound ND packets carry invalid values, the KAME kernel will
drop these packets and increment a statistics variable.  See
"netstat -sn", icmp6 section.  For a detailed debugging session, you can
turn on syslog output from the kernel on errors, by turning on sysctl MIB
net.inet6.icmp6.nd6_debug.  nd6_debug can be turned on at bootstrap
time, by defining the ND6_DEBUG kernel compilation option (so you can
debug behavior during bootstrap).  nd6_debug configuration should
only be used for test/debug purposes - for a production environment,
nd6_debug must be set to 0.  If you leave it at 1, malicious parties
can inject broken packets and fill up the /var/log partition.
 393
1.3 Scope Zone Index

IPv6 uses scoped addresses.  It is therefore very important to
specify the scope zone index (link index for a link-local address, or
site index for a site-local address) with an IPv6 address.  Without a
zone index, a scoped IPv6 address is ambiguous to the kernel, and
the kernel would not be able to determine the outbound zone for a
packet to the scoped address.  KAME code tries to address the issue in
several ways.

The entire architecture of scoped addresses is documented in RFC4007.
One non-trivial point of the architecture is that the link scope is
(theoretically) larger than the interface scope.  That is, two
different interfaces can belong to the same link.  However, in
normal operation, we can assume that there is a 1-to-1 relationship
between links and interfaces.  In other words, we can usually put
links and interfaces in the same scope type.  The current KAME
implementation assumes the 1-to-1 relationship.  In particular, we use
interface names such as "ne1" as unique link identifiers.  This would
be much more human-readable and intuitive than numeric identifiers,
but please keep in mind the theoretical difference between links
and interfaces.

Site-local addresses are very vaguely defined in the specs, and both
the specification and the KAME code need tons of improvements to
enable their actual use.  For example, it is still very unclear how we
define a site, or how we resolve host names in a site.  There is work
underway to define the behavior of routers at a site border, but we have
almost no code for site boundary node support (neither forwarding nor
routing) and we bet almost no one has.  At this moment we recommend
that you use global addresses for experiments - there are way too many
pitfalls if you use site-local addresses.
 426
1.3.1 Kernel internal

In the kernel, the link index for a link-local scope address is
embedded into the 2nd 16bit-word (the 3rd and 4th bytes) in the IPv6
address.
For example, you may see something like:
	fe80:1::200:f8ff:fe01:6317
in the routing table and the interface address structure (struct
in6_ifaddr).  The address above is a link-local unicast address which
belongs to a network link whose link identifier is 1 (note that it
equals the interface index by the assumption of our
implementation).  The embedded index enables us to identify IPv6
link-local addresses over multiple links effectively and with only a
little code change.

The use of the internal format must be limited to inside the kernel.  In
particular, addresses sent by an application should not contain the
embedded index (except via some very special APIs such as routing
sockets).  Instead, the index should be specified in the sin6_scope_id
field of a sockaddr_in6 structure.  Obviously, packets sent to or
received from the wire must not contain the embedded index either, since
the index is meaningful only within the sending/receiving node.
 449
In order to deal with the differences, several kernel routines are
provided.  These are available by including <netinet6/scope_var.h>.
The following functions are the most commonly used:

- int sa6_embedscope(struct sockaddr_in6 *sa6, int defaultok);
  Embed sa6->sin6_scope_id into sa6->sin6_addr.  If sin6_scope_id is
  0, defaultok is non-0, and the default zone ID (see RFC4007) is
  configured, the default ID will be used instead of the value of the
  sin6_scope_id field.  On success, sa6->sin6_scope_id will be reset
  to 0.

  This function returns 0 on success, or a non-0 error code otherwise.

- int sa6_recoverscope(struct sockaddr_in6 *sa6);
  Extract the embedded zone ID in sa6->sin6_addr and set
  sa6->sin6_scope_id to that ID.  The embedded ID will be reset to
  0.

  This function returns 0 on success, or a non-0 error code otherwise.

- int in6_clearscope(struct in6_addr *in6);
  Reset the embedded zone ID in 'in6' to 0.  This function never fails, and
  returns 0 if the original address is intact or non 0 if the address is
  modified.  The return value doesn't matter in most cases; currently, the
  only point where we care about the return value is ip6_input() for checking
  whether the source or destination address of the incoming packet is in
  the embedded form.

- int in6_setscope(struct in6_addr *in6, struct ifnet *ifp,
                   u_int32_t *zoneidp);
  Embed zone ID determined by the address scope type for 'in6' and the
  interface 'ifp' into 'in6'.  If zoneidp is non NULL, *zoneidp will
  also have the zone ID.

  This function returns 0 on success, or a non-0 error code otherwise.
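
To make the relationship between the two representations concrete, the
following userland sketch models what sa6_embedscope() and
sa6_recoverscope() do for a link-local address, under the embedding
described in this section.  It is not the kernel code: the ex_ names are
hypothetical, the default zone handling (defaultok) and the other scope
types are left out, and real kernel code must keep calling the scope6.c
routines rather than touching the bytes itself.

```c
#include <netinet/in.h>
#include <stdint.h>

/*
 * Model of the embedding step for a link-local address: move
 * sin6_scope_id into the 3rd and 4th bytes of the address and reset the
 * field to 0.  Returns 0 on success, -1 if the address is not link-local
 * or the ID does not fit into 16 bits.
 */
int
ex_embedscope(struct sockaddr_in6 *sa6)
{
	if (!IN6_IS_ADDR_LINKLOCAL(&sa6->sin6_addr) ||
	    sa6->sin6_scope_id > 0xffff)
		return -1;
	sa6->sin6_addr.s6_addr[2] = (uint8_t)(sa6->sin6_scope_id >> 8);
	sa6->sin6_addr.s6_addr[3] = (uint8_t)(sa6->sin6_scope_id & 0xff);
	sa6->sin6_scope_id = 0;
	return 0;
}

/*
 * Inverse step: extract the embedded ID into sin6_scope_id and clear it
 * from the address.  Returns 0 on success, -1 if not link-local.
 */
int
ex_recoverscope(struct sockaddr_in6 *sa6)
{
	if (!IN6_IS_ADDR_LINKLOCAL(&sa6->sin6_addr))
		return -1;
	sa6->sin6_scope_id =
	    ((uint32_t)sa6->sin6_addr.s6_addr[2] << 8) |
	    sa6->sin6_addr.s6_addr[3];
	sa6->sin6_addr.s6_addr[2] = 0;
	sa6->sin6_addr.s6_addr[3] = 0;
	return 0;
}
```

Under this model, fe80::200:f8ff:fe01:6317 with sin6_scope_id 1 becomes
the internal form fe80:1::200:f8ff:fe01:6317 shown earlier, and the
recover step restores the original pair.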
 485
The typical usage of these functions is as follows:

sa6_embedscope() will be used at the socket or transport layer to
convert a sockaddr_in6 structure passed by an application into the
kernel-internal form.  In this usage, the second argument is often the
'ip6_use_defzone' global variable.

sa6_recoverscope() will also be used at the socket or transport layer
to convert an in6_addr structure with the embedded zone ID into a
sockaddr_in6 structure with the corresponding ID in the sin6_scope_id
field (and without the embedded ID in sin6_addr).

in6_clearscope() will be used just before sending a packet to the wire
to remove the embedded ID.  In general, this must be done at the last
stage of an output path, since otherwise the address would lose the ID
and could be ambiguous with regard to scope.

in6_setscope() will be used when the kernel receives a packet from the
wire to construct the kernel internal form for each address field in
the packet (typical examples are the source and destination addresses
of the packet).  In the typical usage, the third argument 'zoneidp'
will be NULL.  A non-NULL value will be used when the validity of the
zone ID must be checked, e.g., when forwarding a packet to another
link (see ip6_forward() for this usage).
 510
An application, when sending a packet, is basically assumed to specify
the appropriate scope zone of the destination address by the
sin6_scope_id field (this might be done transparently from the
application with getaddrinfo() and the extended textual format - see
below), or at least the default scope zone(s) must be configured as a
last resort.  In some cases, however, an application could specify an
ambiguous address with regard to scope, expecting it to be disambiguated
in the kernel by some other means.  A typical usage is to specify the
outgoing interface through another API, which can disambiguate the
unspecified scope zone.  Such a usage is not recommended, but the
kernel implements a trick to deal with even this case.

A rough sketch of the trick can be summarized as the following
sequence.

   sa6_embedscope(dst, ip6_use_defzone);
   in6_selectsrc(dst, ..., &ifp, ...);
   in6_setscope(&dst->sin6_addr, ifp, NULL);

sa6_embedscope() first tries to convert sin6_scope_id (or the default
zone ID) into the kernel-internal form.  This can fail with an
ambiguous destination, but the kernel still tries to get the outgoing
interface (ifp) in the attempt of determining the source address of
the outgoing packet using in6_selectsrc().  If the interface is
detected, and the scope zone was originally ambiguous, in6_setscope()
can finally determine the appropriate ID with the address itself and
the interface, and construct the kernel-internal form.  See, for
example, comments in udp6_output() for a more concrete example.
 539
In any case, kernel routines except ones in netinet6/scope6.c MUST NOT
directly refer to the embedded form.  They MUST use the above
interface functions.  In particular, kernel routines MUST NOT have the
following code fragment:

	/* This is a bad practice.  Don't do this */
	if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
		sin6->sin6_addr.s6_addr16[1] = htons(ifp->if_index);

This is bad for several reasons.  First, address ambiguity is not
specific to link-local addresses (any non-global multicast addresses
are inherently ambiguous, and this is particularly true for
interface-local addresses).  Secondly, this is vulnerable to future
changes of the embedded form (the embedded position may change, or the
zone ID may not actually be the interface index).  Only scope6.c
routines should know the details.

The above code fragment should thus actually be as follows:

	/* This is correct. */
	in6_setscope(&sin6->sin6_addr, ifp, NULL);
	(and catch errors if possible and necessary)
 562
1.3.2 Interaction with API

There are several candidate APIs to deal with scoped addresses
without ambiguity.

The IPV6_PKTINFO ancillary data type or socket option defined in the
advanced API (RFC2292 or RFC3542) can specify
the outgoing interface of a packet.  Similarly, the IPV6_PKTINFO or
IPV6_RECVPKTINFO socket options tell the kernel to pass the incoming
interface to user applications.

These options are enough to disambiguate scoped addresses of an
incoming packet, because we can uniquely identify the corresponding
zone of the scoped address(es) by the incoming interface.  However,
they are too strong for outgoing packets.  For example, consider a
multi-sited node and suppose that more than one interface of the node
belongs to the same site.  When we want to send a packet to the site,
we can only specify one of the interfaces for the outgoing packet with
these options; we cannot just say "send the packet to (one of the
interfaces of) the site."

Another candidate is to use the sin6_scope_id member in the
sockaddr_in6 structure, defined in RFC2553.  The KAME kernel
interprets the sin6_scope_id field properly in order to disambiguate scoped
addresses.  For example, if an application passes a sockaddr_in6
structure that has a non-zero sin6_scope_id value to the sendto(2)
system call, the kernel should send the packet to the appropriate zone
according to the sin6_scope_id field.  Similarly, when the source or
the destination address of an incoming packet is a scoped one, the
kernel should detect the correct zone identifier based on the address
and the receiving interface, fill the identifier in the sin6_scope_id
field of a sockaddr_in6 structure, and then pass the packet to an
application via the recvfrom(2) system call, etc.
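
For example, an application could build a link-local destination for
sendto(2) as sketched below.  ex_make_scoped_dest() is a hypothetical
helper written for this note; it relies only on the standard inet_pton(3)
and if_nametoindex(3) calls, and on the assumption (made throughout this
document) that the interface index can serve as the link zone index.

```c
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill 'out' with the destination 'addr' whose zone is named by the
 * outgoing interface 'ifname'.  Returns 0 on success, -1 on a bad
 * address or an unknown interface name.
 */
int
ex_make_scoped_dest(const char *addr, const char *ifname, uint16_t port,
    struct sockaddr_in6 *out)
{
	unsigned int ifindex;

	memset(out, 0, sizeof(*out));
	out->sin6_family = AF_INET6;
#ifdef SIN6_LEN
	out->sin6_len = sizeof(*out);	/* BSD-style sockaddr length field */
#endif
	out->sin6_port = htons(port);
	if (inet_pton(AF_INET6, addr, &out->sin6_addr) != 1)
		return -1;
	ifindex = if_nametoindex(ifname);
	if (ifindex == 0)
		return -1;
	/* The zone goes into sin6_scope_id; sin6_addr stays untouched. */
	out->sin6_scope_id = ifindex;
	return 0;
}
```

The resulting structure can be passed as is to sendto(2) or connect(2),
e.g. after ex_make_scoped_dest("fe80::1", "ne0", 9, &dst).  Note that the
application never writes the zone into the address bytes; that conversion
is left to the kernel.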
 596
However, the semantics of sin6_scope_id are still vague and on the
way to standardization.  Additionally, not so many operating systems
support the behavior above at this moment.

In summary,
- If your target system is limited to KAME based ones (i.e. BSD
  variants and KAME snaps), use the sin6_scope_id field assuming the
  kernel behavior described above.
- Otherwise, (i.e. if your program should be portable on other systems
  than BSDs)
  + Use the advanced API to disambiguate scoped addresses of incoming
    packets.
  + To disambiguate scoped addresses of outgoing packets,
    * if it is okay to just specify the outgoing interface, use the
      advanced API.  This would be the case, for example, when you
      should only consider link-local addresses and your system
      assumes 1-to-1 relationship between links and interfaces.
    * otherwise, sorry but you lose.  Please rush the IETF IPv6
      community into standardizing the semantics of the sin6_scope_id
      field.

Routing daemons and configuration programs, like route6d and ifconfig,
will need to manipulate the "embedded" zone index.  These programs use
routing sockets and ioctls (like SIOCGIFADDR_IN6) and the kernel API
will return IPv6 addresses with the 2nd 16bit-word filled in.  The
APIs are for manipulating kernel internal structure.  Programs that
use these APIs have to be prepared for differences in kernels
anyway.
 625
 626getaddrinfo(3) and getnameinfo(3) support an extended numeric IPv6
 627syntax, as documented in RFC4007.  You can specify the outgoing link,
 628by using the name of the outgoing interface as the link, like
 629"fe80::1%ne0" (again, note that we assume there is 1-to-1 relationship
 630between links and interfaces.)  This way you will be able to specify a
 631link-local scoped address without much trouble.
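
The following sketch shows how an application might rely on this
syntax.  The function name parse_scoped_addr() is hypothetical; only
getaddrinfo(3) itself is part of the standard API.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Parse an extended numeric address such as "fe80::1%ne0" into a
 * sockaddr_in6, letting getaddrinfo(3) fill in sin6_scope_id.
 * Returns 0 on success or a getaddrinfo(3) error code.
 */
int
parse_scoped_addr(const char *text, struct sockaddr_in6 *out)
{
	struct addrinfo hints, *res;
	int error;

	assert(text != NULL && out != NULL);
	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET6;
	hints.ai_socktype = SOCK_DGRAM;
	hints.ai_flags = AI_NUMERICHOST;	/* numeric text only, no DNS */

	error = getaddrinfo(text, NULL, &hints, &res);
	if (error != 0)
		return error;
	memcpy(out, res->ai_addr, sizeof(*out));
	freeaddrinfo(res);
	return 0;
}
```

An interface name after "%" is only accepted when that interface
exists on the node; many implementations also accept a numeric zone
such as "fe80::1%1".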
 632
Other APIs like inet_pton(3) and inet_ntop(3) are inherently
unfriendly to scoped addresses, since they are unable to annotate
addresses with a zone identifier.
 636
1.3.3 Interaction with users (command line)

Most user applications now support the extended numeric IPv6
syntax.  In this case, you can specify the outgoing link by using the
name of the outgoing interface, like "fe80::1%ne0" (sorry for the
duplicated notice, but please recall again that we assume a 1-to-1
relationship between links and interfaces).  This is even the case for
some management tools such as route(8) or ndp(8).  For example, to
install the IPv6 default route by hand, you can type
	# route add -inet6 default fe80::9876:5432:1234:abcd%ne0
(although we suggest that you run dynamic routing instead of static
routes, in order to avoid configuration mistakes).

Some applications have command line options for specifying an
appropriate zone of a scoped address (like "ping6 -I ne0 ff02::1" to
specify the outgoing interface).  However, you can't always expect such
options.  Additionally, specifying the outgoing "interface" is in
theory an overspecification as a way to specify the outgoing "link"
(see above).  Thus, we recommend that you use the extended format
described above.  This applies even to applications that offer an
option for specifying the outgoing interface.

In any case, when you specify a scoped address on the command line,
NEVER write the embedded form (such as ff02:1::1 or fe80:2::fedc),
which should only be used inside the kernel (see Section 1.3.1), and
is not supposed to work.

1.4 Plug and Play

The KAME kit implements most of the IPv6 stateless address
autoconfiguration in the kernel.
Neighbor Discovery functions are implemented in the kernel as a whole.
Router Advertisement (RA) input for hosts is implemented in the
kernel.  Router Solicitation (RS) output for end hosts, RS input
for routers, and RA output for routers are implemented in
userland.

1.4.1 Assignment of link-local, and special addresses

An IPv6 link-local address is generated from the IEEE802 address
(ethernet MAC address).  Each interface is assigned an IPv6 link-local
address automatically when the interface becomes up (IFF_UP).  Also, a
direct route for the link-local address is added to the routing table.

Here is an output of the netstat command:

Internet6:
Destination                   Gateway                   Flags      Netif Expire
fe80::%ed0/64                 link#1                    UC           ed0
fe80::%ep0/64                 link#2                    UC           ep0

Interfaces that have no IEEE802 address (pseudo interfaces like tunnel
interfaces, or ppp interfaces) will borrow an IEEE802 address from other
interfaces, such as ethernet interfaces, whenever possible.
If there is no IEEE802 hardware attached, a last-resort pseudorandom value,
derived from MD5(hostname), will be used as the source of the link-local
address.  If it is not suitable for your usage, you will need to configure
the link-local address manually.
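
As an illustration of the first step, the sketch below derives a
link-local address from a 48-bit IEEE802 address.  It assumes the usual
modified EUI-64 mapping for ethernet (invert the universal/local bit,
insert ff:fe in the middle); the function name mac_to_linklocal() is
hypothetical and this is not the kernel code itself.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Build fe80::/64 plus a modified EUI-64 interface identifier from a
 * 48-bit IEEE802 (MAC) address.
 */
void
mac_to_linklocal(const uint8_t mac[6], struct in6_addr *out)
{
	uint8_t *a;

	assert(mac != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	a = out->s6_addr;
	a[0] = 0xfe;		/* link-local prefix fe80::/64 */
	a[1] = 0x80;
	a[8] = mac[0] ^ 0x02;	/* invert the universal/local bit */
	a[9] = mac[1];
	a[10] = mac[2];
	a[11] = 0xff;		/* ff:fe fills the middle 16 bits */
	a[12] = 0xfe;
	a[13] = mac[3];
	a[14] = mac[4];
	a[15] = mac[5];
}
```

For example, the MAC address 00:11:22:33:44:55 maps to
fe80::211:22ff:fe33:4455.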
 695
If an interface is not capable of handling IPv6 (for example, because it
lacks multicast support), a link-local address will not be assigned to
that interface.  See section 2 for details.

Each interface joins the solicited-node multicast address and the
link-local all-nodes multicast address (e.g. ff02::1:ff01:6317
and ff02::1, respectively, on the link the interface is attached to).
In addition to a link-local address, the loopback address (::1) will be
assigned to the loopback interface.  Also, ::1/128 and ff01::/32 are
automatically added to the routing table, and the loopback interface joins
the node-local multicast group ff01::1.
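
The solicited-node multicast address is formed from the low-order 24
bits of the unicast address under the prefix ff02::1:ff00:0/104.  A
minimal sketch (solicited_node_mcast() is a hypothetical helper, not a
kernel function):

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Compute the solicited-node multicast address ff02::1:ffXX:XXXX,
 * where XX:XXXX are the low-order 24 bits of the unicast address.
 */
void
solicited_node_mcast(const struct in6_addr *unicast, struct in6_addr *out)
{
	assert(unicast != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	out->s6_addr[0] = 0xff;		/* multicast */
	out->s6_addr[1] = 0x02;		/* link-local scope */
	out->s6_addr[11] = 0x01;
	out->s6_addr[12] = 0xff;
	memcpy(&out->s6_addr[13], &unicast->s6_addr[13], 3);
}
```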
 707
1.4.2 Stateless address autoconfiguration on hosts

In the IPv6 specification, nodes are separated into two categories:
routers and hosts.  Routers forward packets addressed to others; hosts do
not forward packets.  net.inet6.ip6.forwarding defines whether this
node is a router or a host (router if it is 1, host if it is 0).

It is NOT recommended to change net.inet6.ip6.forwarding while the node
is in operation.  The IPv6 specification defines behavior for "host" and
"router" quite differently, and switching from one to the other can cause
serious trouble.  It is recommended to configure the variable at bootstrap
time only.

The first step in stateless address configuration is Duplicate Address
Detection (DAD).  See 1.2 for more detail on DAD.

When a host hears a Router Advertisement from a router, the host may
autoconfigure itself by stateless address autoconfiguration.  This
behavior can be controlled by the net.inet6.ip6.accept_rtadv sysctl
variable and a per-interface flag managed in the kernel.  The latter,
which we call "if_accept_rtadv" here, can be changed by the ndp(8)
command (see the manpage for more details).  When the sysctl variable
is set to 1, and the flag is set, the host autoconfigures itself.  By
autoconfiguration, network address prefixes for the receiving
interface (usually global address prefixes) are added.  The default
route is also configured.

Routers periodically generate Router Advertisement packets.  To
request an adjacent router to generate an RA packet, a host can transmit
a Router Solicitation.  To generate an RS packet at any time, use the
"rtsol" command.  The "rtsold" daemon is also available.  "rtsold"
generates Router Solicitations whenever necessary, and it works well
for nomadic usage (notebooks/laptops).  If one wishes to ignore Router
Advertisements, use sysctl to set net.inet6.ip6.accept_rtadv to 0.
Additionally, the ndp(8) command can be used to control the behavior
on a per-interface basis.

To generate Router Advertisements from a router, use the "rtadvd" daemon.

Note that the IPv6 specification assumes the following items and that
nonconforming cases are left unspecified:
- Only hosts will listen to router advertisements
- Hosts have a single network interface (except loopback)
It is therefore unwise to enable net.inet6.ip6.accept_rtadv on routers
or multi-interface hosts.  A misconfigured node can behave strangely
(KAME code allows nonconforming configurations, for those who would like
to do some experiments).

To summarize the sysctl knobs:
	accept_rtadv	forwarding	role of the node
	---		---		---
	0		0		host (to be manually configured)
	0		1		router
	1		0		autoconfigured host
					(spec assumes that hosts have a single
					interface only, autoconfigured hosts
					with multiple interfaces are
					out-of-scope)
	1		1		invalid, or experimental
					(out-of-scope of spec)

The if_accept_rtadv flag is consulted only when accept_rtadv is 1 (the
latter two cases).  The flag does not have any effect when the sysctl
variable is 0.
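
The table and the flag rule above can be restated as a small decision
function.  This is only an illustration; the enum and function names
are hypothetical and do not exist in the KAME sources.

```c
#include <assert.h>

enum node_role {
	ROLE_MANUAL_HOST,	/* accept_rtadv=0, forwarding=0 */
	ROLE_ROUTER,		/* accept_rtadv=0, forwarding=1 */
	ROLE_AUTOCONF_HOST,	/* accept_rtadv=1, forwarding=0 */
	ROLE_EXPERIMENTAL	/* accept_rtadv=1, forwarding=1 */
};

/* Map the two sysctl values to the role listed in the table. */
enum node_role
classify_node(int accept_rtadv, int forwarding)
{
	if (accept_rtadv)
		return forwarding ? ROLE_EXPERIMENTAL : ROLE_AUTOCONF_HOST;
	return forwarding ? ROLE_ROUTER : ROLE_MANUAL_HOST;
}

/*
 * Whether an RA received on an interface is acted upon: the
 * per-interface flag is consulted only when the sysctl is 1.
 */
int
ra_accepted(int accept_rtadv, int if_accept_rtadv)
{
	return accept_rtadv && if_accept_rtadv;
}

/* Human-readable name of a role, for diagnostics. */
const char *
role_name(enum node_role role)
{
	static const char *const names[] = {
		"host (manually configured)", "router",
		"autoconfigured host", "invalid or experimental"
	};

	assert((unsigned)role < sizeof(names) / sizeof(names[0]));
	return names[role];
}
```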
 771
See 1.2 in this document for the relationship between DAD and
autoconfiguration.

1.4.3 DHCPv6

We supply a tiny DHCPv6 server/client in kame/dhcp6.  However, the
implementation is premature (for example, it does NOT implement
address lease/release), and it is not in the default compilation tree on
some platforms.  If you want to do some experiments, compile it on your
own.

DHCPv6 and autoconfiguration also need more work.  The "Managed" and
"Other" bits in RA have no special effect on the stateful autoconfiguration
procedure in the DHCPv6 client program (the "Managed" bit actually prevents
stateless autoconfiguration, but no special action will be taken for the
DHCPv6 client).

1.5 Generic tunnel interface

GIF (Generic InterFace) is a pseudo interface for configured tunnels.
Details are described in the gif(4) manpage.
Currently
	v6 in v6
	v6 in v4
	v4 in v6
	v4 in v4
are available.  Use "gifconfig" to assign physical (outer) source
and destination addresses to gif interfaces.
A configuration that uses the same address family for the inner and outer
IP headers (v4 in v4, or v6 in v6) is dangerous.  It is very easy to
configure interfaces and routing tables to perform an infinite level
of tunneling.  Please be warned.

gif can be configured to be ECN-friendly.  See 4.5 for ECN-friendliness
of tunnels, and the gif(4) manpage for how to configure it.

If you would like to configure an IPv4-in-IPv6 tunnel with a gif interface,
read gif(4) carefully.  You may need to remove the IPv6 link-local address
automatically assigned to the gif interface.

1.6 Address Selection

1.6.1 Source Address Selection

The KAME kernel chooses the source address for an outgoing packet
sent from a user application as follows:

1. if the source address is explicitly specified via an IPV6_PKTINFO
   ancillary data item or the socket option of that name, just use it.
   Note that this item/option overrides the bound address of the
   corresponding (datagram) socket.

2. if the corresponding socket is bound, use the bound address.

3. otherwise, the kernel first tries to find the outgoing interface of
   the packet.  If it fails, the source address selection also fails.
   If the kernel can find an interface, choose the most appropriate
   address based on the algorithm described in RFC3484.

   The policy table used in this algorithm is stored in the kernel.
   To install or view the policy, use the ip6addrctl(8) command.  The
   kernel does not have a pre-installed policy.  It is expected that the
   default policy described in RFC3484 should be installed at
   bootstrap time using this command.

   RFC3484 allows an implementation to add implementation-specific
   rules with higher precedence than the rule "Use longest matching
   prefix."  KAME's implementation has the following additional rules
   (which apply in the order listed):

   - prefer addresses on alive interfaces, that is, interfaces with
     the UP flag being on.  This rule is particularly useful for
     routers, since some routing daemons stop advertising prefixes
     (addresses) on interfaces that have gone down.

   - prefer addresses on "preferred" interfaces.  "Preferred"
     interfaces can be specified by the ndp(8) command.  By default,
     no interface is preferred, that is, this rule does not apply.
     Again, this rule is particularly useful for routers, since there
     is a convention, among router administrators, of assigning
     "stable" addresses on a particular interface (typically a
     loopback interface).

   In any case, addresses that break the scope zone of the
   destination, or addresses whose zone does not contain the outgoing
   interface, are never chosen.
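
The two KAME-specific rules and their position relative to "Use
longest matching prefix" can be sketched as a comparison function.
This is a simplified illustration: the struct and function names are
hypothetical, the earlier RFC3484 rules and the scope zone filtering
are deliberately left out, and the real kernel logic is more involved.

```c
#include <assert.h>
#include <stddef.h>

struct src_candidate {
	int if_up;		/* interface has the UP flag on */
	int if_preferred;	/* interface marked "preferred" via ndp(8) */
	int matchlen;		/* common prefix length with destination */
};

/* Return nonzero if candidate a should be chosen over candidate b. */
int
src_better(const struct src_candidate *a, const struct src_candidate *b)
{
	assert(a != NULL && b != NULL);
	/* KAME rule: prefer addresses on alive (UP) interfaces. */
	if (!a->if_up != !b->if_up)
		return a->if_up != 0;
	/* KAME rule: prefer addresses on "preferred" interfaces. */
	if (!a->if_preferred != !b->if_preferred)
		return a->if_preferred != 0;
	/* Lower precedence: use longest matching prefix. */
	return a->matchlen > b->matchlen;
}

/* Pick the best of n candidates, or NULL if there is none. */
const struct src_candidate *
pick_source(const struct src_candidate *c, size_t n)
{
	const struct src_candidate *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (best == NULL || src_better(&c[i], best))
			best = &c[i];
	return best;
}
```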
 856
When the procedure above fails, the kernel usually returns
EADDRNOTAVAIL to the application.

In some cases, the specification explicitly requires the
implementation to choose a particular source address.  The source
address for a Neighbor Advertisement (NA) message is an example.
Under the spec (RFC2461 7.2.2), the NA's source should be the target
address of the corresponding NS.  In this case we follow the
spec rather than the above rule.

If you would like to prohibit the use of deprecated addresses for some
reason, set net.inet6.ip6.use_deprecated to 0.  The issue
related to deprecated addresses is described in RFC2462 5.5.4 (NOTE:
there is some debate underway in IETF ipngwg on how to use
"deprecated" addresses).

As documented in the source address selection document, temporary
addresses for privacy extension are less preferred than public addresses
by default.  However, for administrators who are particularly concerned
about privacy, there is a system-wide sysctl(3) variable
"net.inet6.ip6.prefer_tempaddr".  When the variable is set to
non-zero, the kernel will prefer temporary addresses instead.  The
default value of this variable is 0.

1.6.2 Destination Address Ordering

KAME's getaddrinfo(3) supports the destination address ordering
algorithm described in RFC3484.  getaddrinfo(3) needs to know the
source address for each destination address and the policy entries
(described in the previous section) for the source and destination
addresses.  To get the source address, the library function opens a
UDP socket and tries to connect(2) to the destination.  To get the
policy entry, the function issues sysctl(3).
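
The source address probe can be sketched as follows.  This is an
illustration of the connect(2) technique, not the library code; the
function name probe_source() is hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Ask the kernel which source address it would use for dst.  No packet
 * is sent: connect(2) on a UDP socket only records the peer and runs
 * source address selection.  Returns 0 and fills *src on success, -1
 * if no socket or no usable source address is available.
 */
int
probe_source(const struct sockaddr_in6 *dst, struct sockaddr_in6 *src)
{
	socklen_t len = sizeof(*src);
	int s, ret = -1;

	assert(dst != NULL && src != NULL);
	s = socket(AF_INET6, SOCK_DGRAM, 0);
	if (s < 0)
		return -1;
	if (connect(s, (const struct sockaddr *)dst, sizeof(*dst)) == 0 &&
	    getsockname(s, (struct sockaddr *)src, &len) == 0)
		ret = 0;
	close(s);
	return ret;
}
```

A failure here usually means the destination is unreachable from this
node, which is itself useful input for ordering the destinations.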
 890
1.7 Jumbo Payload

KAME supports the Jumbo Payload hop-by-hop option used to send IPv6
packets with payloads longer than 65,535 octets.  But since currently
KAME does not support any physical interface whose MTU is more than
65,535, such payloads can be seen only on the loopback interface (i.e.
lo0).

If you want to try jumbo payloads, you first have to reconfigure the
kernel so that the MTU of the loopback interface is more than 65,535
bytes; add the following to the kernel configuration file:
	options		"LARGE_LOMTU"		#To test jumbo payload
and recompile the new kernel.

Then you can test jumbo payloads by the ping6 command with -b and -s
options.  The -b option must be specified to enlarge the size of the
socket buffer and the -s option specifies the length of the packet,
which should be more than 65,535.  For example, type as follows:
	% ping6 -b 70000 -s 68000 ::1

The IPv6 specification requires that the Jumbo Payload option must not
be used in a packet that carries a fragment header.  If this condition
is broken, an ICMPv6 Parameter Problem message must be sent to the
sender.  The KAME kernel follows the specification, but you cannot usually
see an ICMPv6 error caused by this requirement.

When the KAME kernel receives an IPv6 packet, it checks the frame length of
the packet and compares it to the length specified in the payload
length field of the IPv6 header or in the value of the Jumbo Payload
option, if any.  If the former is shorter than the latter, the KAME kernel
discards the packet and increments the statistics.  You can see the
statistics as output of the netstat command with the `-s -p ip6' option:
	% netstat -s -p ip6
	ip6:
		(snip)
		1 with data size < data length

So, the KAME kernel does not send an ICMPv6 error unless the erroneous
packet is an actual Jumbo Payload, that is, its packet size is more
than 65,535 bytes.  As described above, the KAME kernel currently does not
support physical interfaces with such a huge MTU, so it rarely returns an
ICMPv6 error.
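
The length comparison described above can be sketched like this.  The
names are hypothetical and the code is a simplified model of the
check, assuming that the payload length field is 0 whenever the Jumbo
Payload option is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct len_stats {
	unsigned long tooshort;	/* "with data size < data length" */
};

/*
 * received_len: octets actually received after the IPv6 header.
 * ip6_plen:     payload length field of the IPv6 header (host order).
 * jumbo_len:    value of the Jumbo Payload option, or NULL if absent.
 * Returns 0 to accept the packet, -1 to discard it (and count it).
 */
int
check_length(uint64_t received_len, uint16_t ip6_plen,
    const uint32_t *jumbo_len, struct len_stats *st)
{
	uint32_t declared = ip6_plen;

	assert(st != NULL);
	if (ip6_plen == 0 && jumbo_len != NULL)
		declared = *jumbo_len;	/* the option carries the length */
	if (received_len < declared) {
		st->tooshort++;
		return -1;
	}
	return 0;
}
```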
 933
TCP/UDP over jumbograms is not supported at this moment.  This is because
we have no medium (other than loopback) to test this.  Contact us if you
need this.

IPsec does not work on jumbograms.  This is due to some specification twists
in supporting AH with jumbograms (the AH header size influences the payload
length, and this makes it really hard to authenticate an inbound packet
that carries the jumbo payload option as well as AH).

There are fundamental issues in *BSD support for jumbograms.  We would like to
address those, but we need more time to finalize the task.  To name a few:
- The mbuf pkthdr.len field is typed as "int" in 4.4BSD, so it cannot hold
  a jumbogram with len > 2G on 32bit architecture CPUs.  If we would like to
  support jumbograms properly, the field must be expanded to hold 4G +
  IPv6 header + link-layer header.  Therefore, it must be expanded to at least
  int64_t (u_int32_t is NOT enough).
- We mistakenly use "int" to hold the packet length in many places.  We need
  to convert them into a larger numeric type.  This needs great care, as we
  may experience overflow during packet length computation.
- We mistakenly check the ip6_plen field of the IPv6 header for the packet
  payload length in various places.  We should be checking mbuf pkthdr.len
  instead.  ip6_input() will perform a sanity check on the jumbo payload
  option on input, and we can safely use mbuf pkthdr.len afterwards.
- TCP code needs careful updates in a bunch of places, of course.

1.8 Loop prevention in header processing

The IPv6 specification allows an arbitrary number of extension headers to
be placed onto packets.  If we implemented the IPv6 packet processing
code in the way the BSD IPv4 code is implemented, the kernel stack might
overflow due to a long function call chain.  KAME sys/netinet6 code
is carefully designed to avoid kernel stack overflow.  Because of
this, KAME sys/netinet6 code defines its own protocol switch
structure, "struct ip6protosw" (see netinet6/ip6protosw.h).

In addition to this, we restrict the number of extension headers
(including the IPv6 header) in each incoming packet, in order to
prevent a DoS attack that tries to send packets with a massive number
of extension headers.  The upper limit can be configured by the sysctl
value net.inet6.ip6.hdrnestlimit.  In particular, if the value is 0,
the node will allow an arbitrary number of headers.  As of this writing,
the default value is 50.
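
The idea of walking the header chain in a loop (rather than by nested
calls) under a nest limit can be sketched as follows.  This is not the
ip6protosw-based kernel code: count_headers() is a hypothetical
function, only a few extension header types are recognized, and the
exact way the kernel counts headers may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_HOPOPTS	0	/* hop-by-hop options header */
#define NH_ROUTING	43	/* routing header */
#define NH_FRAGMENT	44	/* fragment header */
#define NH_DSTOPTS	60	/* destination options header */

/*
 * Walk the header chain of an IPv6 packet iteratively and count the
 * IPv6 header plus the extension headers that follow it.  Returns the
 * count, or -1 if the packet is truncated or the count exceeds
 * "limit" (0 means no limit, like hdrnestlimit).
 */
int
count_headers(const uint8_t *pkt, size_t len, int limit)
{
	size_t off = 40;	/* the fixed IPv6 header is 40 octets */
	size_t hlen;
	uint8_t nh;
	int count = 1;		/* the IPv6 header itself */

	assert(pkt != NULL);
	if (len < 40)
		return -1;
	nh = pkt[6];		/* Next Header field of the IPv6 header */
	for (;;) {
		if (limit != 0 && count > limit)
			return -1;
		switch (nh) {
		case NH_HOPOPTS:
		case NH_ROUTING:
		case NH_DSTOPTS:
			if (len - off < 2)
				return -1;
			/* length is in 8-octet units, not counting the first */
			hlen = ((size_t)pkt[off + 1] + 1) * 8;
			break;
		case NH_FRAGMENT:
			hlen = 8;
			break;
		default:
			return count;	/* not an extension header we follow */
		}
		if (len - off < hlen)
			return -1;
		nh = pkt[off];	/* Next Header field of this header */
		off += hlen;
		count++;
	}
}
```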
 976
The IPv4 part (sys/netinet) remains untouched for compatibility.
Because of this, if you receive an IPsec-over-IPv4 packet with a massive
number of IPsec headers, the kernel stack may blow up.  IPsec-over-IPv6 is
okay.

1.9 ICMPv6

After RFC2463 was published, IETF ipngwg decided to disallow ICMPv6 error
packets against ICMPv6 redirects, to prevent an ICMPv6 storm on a network
medium.  KAME already implements this in the kernel.

RFC2463 requires rate limitation for ICMPv6 error packets generated by a
node, to avoid possible DoS attacks.  The KAME kernel implements two
rate-limitation mechanisms, tunable via sysctl:
- Minimum time interval between ICMPv6 error packets
	The KAME kernel will generate no more than one ICMPv6 error packet
	during the configured time interval.  net.inet6.icmp6.errratelimit
	controls the interval (default: disabled).
- Maximum ICMPv6 error packets per second
	The KAME kernel will generate no more than the configured number of
	packets in one second.  net.inet6.icmp6.errppslimit controls the
	maximum packets-per-second value (default: 200pps).
Basically, we need to pick values that are suitable for the bandwidth
of the link layer devices directly attached to the node.  In some cases the
default values may not fit well.  We are still unsure if the default value
is sane or not.  Comments are welcome.
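
The packets-per-second mechanism can be modeled with a counter that is
reset every second.  The sketch below is only a model of the idea; the
structure and function names are hypothetical, and treating a negative
limit as "disabled" is an assumption of this sketch.

```c
#include <assert.h>
#include <time.h>

struct pps_limiter {
	time_t	window;		/* the second currently being counted */
	int	count;		/* error packets sent in that second */
	int	maxpps;		/* limit; negative disables the check */
};

/* Return 1 if an error packet may be sent at time "now", else 0. */
int
pps_allow(struct pps_limiter *l, time_t now)
{
	assert(l != NULL);
	if (l->maxpps < 0)
		return 1;
	if (now != l->window) {	/* a new second has started */
		l->window = now;
		l->count = 0;
	}
	if (l->count >= l->maxpps)
		return 0;
	l->count++;
	return 1;
}
```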
1002
1.10 Applications

For userland programming, we support the IPv6 socket API as specified in
RFC2553/3493, RFC3542 and upcoming internet drafts.

TCP/UDP over IPv6 is available and quite stable.  You can enjoy "telnet",
"ftp", "rlogin", "rsh", "ssh", etc.  These applications are protocol
independent.  That is, they automatically choose IPv4 or IPv6
according to DNS.

1.11 Kernel Internals

 (*) TCP/UDP part is handled differently between operating system platforms.
     See 1.12 for details.

The current KAME has escaped from the IPv4 netinet logic.  While
ip_forward() calls ip_output(), ip6_forward() directly calls
if_output() since routers must not divide IPv6 packets into fragments.

An ICMPv6 error should contain as much of the original packet as possible,
up to 1280 bytes.  A UDP6/IP6 port unreachable, for instance, should
contain all extension headers and the *unchanged* UDP6 and IP6 headers.
So, all IP6 functions except TCP6 never convert network byte
order into host byte order, to save the original packet.

tcp6_input(), udp6_input() and icmp6_input() can't assume that the IP6
header immediately precedes the transport header, due to extension
headers.  So, in6_cksum() was implemented to handle packets whose IP6
header and transport header are not contiguous.  Neither a TCP/IP6 nor a
UDP/IP6 header structure exists for checksum calculation.
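
To illustrate why no combined header structure is needed, the sketch
below computes an upper-layer checksum from the pseudo-header fields
and a separately supplied upper-layer buffer.  It is a userland
illustration with hypothetical names, not the mbuf-based in6_cksum().

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte string to a running sum of big-endian 16-bit words. */
static uint64_t
sum_bytes(uint64_t sum, const uint8_t *p, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;	/* pad an odd trailing octet */
	return sum;
}

/*
 * Checksum over the IPv6 pseudo-header (source, destination,
 * upper-layer length, next header) followed by the upper-layer header
 * and data.  The upper-layer bytes are passed separately, so they need
 * not be adjacent to the IPv6 header in memory.  The result is in host
 * byte order.
 */
uint16_t
in6_pseudo_cksum(const struct in6_addr *src, const struct in6_addr *dst,
    uint8_t nxt, const uint8_t *data, uint32_t len)
{
	uint8_t tail[8];
	uint64_t sum = 0;

	assert(src != NULL && dst != NULL && data != NULL);
	tail[0] = (uint8_t)(len >> 24);	/* 32-bit upper-layer length */
	tail[1] = (uint8_t)(len >> 16);
	tail[2] = (uint8_t)(len >> 8);
	tail[3] = (uint8_t)len;
	tail[4] = tail[5] = tail[6] = 0;	/* three zero octets */
	tail[7] = nxt;			/* next header value */

	sum = sum_bytes(sum, src->s6_addr, 16);
	sum = sum_bytes(sum, dst->s6_addr, 16);
	sum = sum_bytes(sum, tail, sizeof(tail));
	sum = sum_bytes(sum, data, len);
	while (sum >> 16)		/* fold carries into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

When the checksum field of a received packet is already filled in,
running the same computation over it yields 0 for a valid packet.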
1033
To process the IP6 header, extension headers and transport headers easily,
KAME requires network drivers to store packets in one internal mbuf or
one or more external mbufs.  A typical old driver prepares two
internal mbufs for 100 - 208 bytes of data; however, KAME's reference
implementation stores it in one external mbuf.

"netstat -s -p ip6" tells you whether or not your driver conforms to
KAME's requirement.  In the following example, "cce0" violates the
requirement.  (For more information, refer to Section 2.)