- .\" Copyright (c) 1986, 1993
- .\" The Regents of the University of California. All rights reserved.
- .\"
- .\" Redistribution and use in source and binary forms, with or without
- .\" modification, are permitted provided that the following conditions
- .\" are met:
- .\" 1. Redistributions of source code must retain the above copyright
- .\" notice, this list of conditions and the following disclaimer.
- .\" 2. Redistributions in binary form must reproduce the above copyright
- .\" notice, this list of conditions and the following disclaimer in the
- .\" documentation and/or other materials provided with the distribution.
- .\" 3. All advertising materials mentioning features or use of this software
- .\" must display the following acknowledgement:
- .\" This product includes software developed by the University of
- .\" California, Berkeley and its contributors.
- .\" 4. Neither the name of the University nor the names of its contributors
- .\" may be used to endorse or promote products derived from this software
- .\" without specific prior written permission.
- .\"
- .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
- .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
- .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
- .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
- .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- .\" SUCH DAMAGE.
- .\"
- .\" @(#)tutor.me 8.1 (Berkeley) 8/14/93
- .\"
- .oh 'Introductory 4.4BSD IPC''PSD:20-%'
- .eh 'PSD:20-%''Introductory 4.4BSD IPC'
- .rs
- .sp 2
- .sz 14
- .ft B
- .ce 2
- An Introductory 4.4BSD
- Interprocess Communication Tutorial
- .sz 10
- .sp 2
- .ce
- .i "Stuart Sechrest"
- .ft
- .sp
- .ce 4
- Computer Science Research Group
- Computer Science Division
- Department of Electrical Engineering and Computer Science
- University of California, Berkeley
- .sp 2
- .ce
- .i ABSTRACT
- .sp
- .(c
- .pp
- Berkeley UNIX\(dg 4.4BSD offers several choices for interprocess communication.
- To aid the programmer in developing programs that are composed of
- cooperating
- processes, the different choices are discussed and a series of example
- programs is presented. These programs
- demonstrate in a simple way the use of pipes, socketpairs, sockets
- and the use of datagram and stream communication. The intent of this
- document is to present a few simple example programs, not to describe the
- networking system in full.
- .)c
- .sp 2
- .(f
- \(dg\|UNIX is a trademark of AT&T Bell Laboratories.
- .)f
- .b
- .sh 1 "Goals"
- .r
- .pp
- Facilities for interprocess communication (IPC) and networking
- were a major addition to UNIX in the Berkeley UNIX 4.2BSD release.
- These facilities required major additions and some changes
- to the system interface.
- The basic idea of this interface is to make IPC similar to file I/O.
- In UNIX a process has a set of I/O descriptors, from which one reads
- and to which one writes.
- Descriptors may refer to normal files, to devices (including terminals),
- or to communication channels.
- The use of a descriptor has three phases: its creation,
- its use for reading and writing, and its destruction. By using descriptors
- to write files, rather than simply naming the target file in the write
- call, one gains a surprising amount of flexibility. Often, the program that
- creates a descriptor will be different from the program that uses the
- descriptor. For example, the shell can create a descriptor for the output
- of the `ls'
- command that will cause the listing to appear in a file rather than
- on a terminal.
- Pipes are another form of descriptor that has been used in UNIX
- for some time.
- Pipes allow one-way data transmission from one process
- to another; the two processes and the pipe must be set up by a common
- ancestor.
- .pp
- The use of descriptors is not the only communication interface
- provided by UNIX.
- The signal mechanism sends a tiny amount of information from one
- process to another.
- The signaled process receives only the signal type,
- not the identity of the sender,
- and the number of possible signals is small.
- The signal semantics limit the flexibility of the signaling mechanism
- as a means of interprocess communication.
- .pp
- The identification of IPC with I/O is quite longstanding in UNIX and
- has proved quite successful. At first, however, IPC was limited to
- processes communicating within a single machine. With Berkeley UNIX
- 4.2BSD this expanded to include IPC between machines. This expansion
- has necessitated some change in the way that descriptors are created.
- Additionally, new possibilities for the meaning of read and write have
- been admitted. Originally the meanings, or semantics, of these terms
- were fairly simple. When you wrote something it was delivered. When
- you read something, you were blocked until the data arrived.
- Other possibilities exist,
- however. One can write without full assurance of delivery if one can
- check later to catch occasional failures. Messages can be kept as
- discrete units or merged into a stream.
- One can ask to read, but insist on not waiting if nothing is immediately
- available. These new possibilities are allowed in the Berkeley UNIX IPC
- interface.
- .pp
- Thus Berkeley UNIX 4.4BSD offers several choices for IPC.
- This paper presents simple examples that illustrate some of
- the choices.
- The reader is presumed to be familiar with the C programming language
- [Kernighan & Ritchie 1978],
- but not necessarily with the system calls of the UNIX system or with
- processes and interprocess communication.
- The paper reviews the notion of a process and the types of
- communication that are supported by Berkeley UNIX 4.4BSD.
- A series of examples is presented that create processes that communicate
- with one another. The programs show different ways of establishing
- channels of communication.
- Finally, the calls that actually transfer data are reviewed.
- To clearly present how communication can take place,
- the example programs have been cleared of anything that
- might be construed as useful work.
- They can, therefore, serve as models
- for the programmer trying to construct programs that are composed of
- cooperating processes.
- .b
- .sh 1 "Processes"
- .pp
- A \fIprogram\fP is both a sequence of statements and a rough way of referring
- to the computation that occurs when the compiled statements are run.
- A \fIprocess\fP can be thought of as a single line of control in a program.
- Most programs execute some statements, go through a few loops, branch in
- various directions and then end. These are single process programs.
- Programs can also have a point where control splits into two independent lines,
- an action called \fIforking.\fP
- In UNIX these lines can never join again. A call to the system routine
- \fIfork()\fP causes a process to split in this way.
- The result of this call is that two independent processes will be
- running, executing exactly the same code.
- Memory values will be the same for all values set before the fork, but,
- subsequently, each version will be able to change only the
- value of its own copy of each variable.
- Initially, the only difference between the two will be the value returned by
- \fIfork().\fP The parent will receive a process id for the child;
- the child will receive a zero.
- Calls to \fIfork(),\fP
- therefore, typically precede, or are included in, an if-statement.
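- .pp
- The return-value convention just described can be sketched as follows.
- This is a minimal illustration under stated assumptions, not one of the
- tutorial's figures: the helper name \fIfork_and_collect()\fP is
- hypothetical, and \fIwaitpid()\fP is assumed to be available for
- collecting the child's exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits at once with `code`
 * and return the exit status the parent collects, or -1 on error.
 */
int
fork_and_collect(int code)
{
    pid_t pid;
    int status;

    pid = fork();
    if (pid == -1)
        return (-1);        /* fork failed; no child was created */
    if (pid == 0)
        _exit(code);        /* child: fork() returned zero */

    /* parent: fork() returned the process id of the child */
    if (waitpid(pid, &status, 0) == -1)
        return (-1);
    return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

- The if-statements following the call are what separate the two lines
- of control; everything before the call is shared by both.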
- .pp
- A process views the rest of the system through a private table of descriptors.
- The descriptors can represent open files or sockets (sockets are communication
- objects that will be discussed below). Descriptors are referred to
- by their index numbers in the table. The first three descriptors are often
- known by special names, \fIstdin, stdout\fP and \fIstderr\fP.
- These are the standard input, output and error.
- When a process forks, its descriptor table is copied to the child.
- Thus, if the parent's standard input is being taken from a terminal
- (devices are also treated as files in UNIX), the child's input will
- be taken from the
- same terminal. Whoever reads first will get the input. If, before forking,
- the parent changes its standard input so that it is reading from a
- new file, the child will take its input from the new file. It is
- also possible to take input from a socket, rather than from a file.
- .b
- .sh 1 "Pipes"
- .r
- .pp
- Most users of UNIX know that they can pipe the output of a
- program ``prog1'' to the input of another, ``prog2,'' by typing the command
- \fI``prog1 | prog2.''\fP
- This is called ``piping'' the output of one program
- to another because the mechanism used to transfer the output is called a
- pipe.
- When the user types a command, the command is read by the shell, which
- decides how to execute it. If the command is simple, for example,
- .i "``prog1,''"
- the shell forks a process, which executes the program, prog1, and then dies.
- The shell waits for this termination and then prompts for the next
- command.
- If the command is a compound command,
- .i "``prog1 | prog2,''"
- the shell creates two processes connected by a pipe. One process
- runs the program, prog1, the other runs prog2. The pipe is an I/O
- mechanism with two ends, or sockets. Data that is written into one socket
- can be read from the other.
- .(z
- .ft CW
- .so pipe.c
- .ft
- .ce 1
- Figure 1\ \ Use of a pipe
- .)z
- .pp
- Since a program specifies its input and output only by the descriptor table
- indices, which appear as variables or constants,
- the input source and output destination can be changed without
- changing the text of the program.
- It is in this way that the shell is able to set up pipes. Before executing
- prog1, the process can close whatever is at \fIstdout\fP
- and replace it with one
- end of a pipe. Similarly, the process that will execute prog2 can substitute
- the opposite end of the pipe for
- \fIstdin.\fP
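- .pp
- A rough sketch of this substitution follows. It assumes that
- \fIdup2()\fP is used to close descriptor 1 and replace it with the write
- end of the pipe in one step, which is only one of several ways to do it.
- The names \fIsay_hello()\fP and \fIcapture_child_stdout()\fP are
- hypothetical; \fIsay_hello()\fP stands in for prog1 and is not changed
- by the redirection.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>

/* Stands in for prog1: it knows only that output goes to descriptor 1. */
int
say_hello(void)
{
    const char msg[] = "hello";

    return (write(STDOUT_FILENO, msg, strlen(msg)) == -1 ? -1 : 0);
}

/*
 * Hypothetical helper: run say_hello() in a child whose standard output
 * has been replaced by the write end of a pipe, and copy what the child
 * wrote into buf. Returns the number of bytes captured, or -1 on error.
 */
ssize_t
capture_child_stdout(char *buf, size_t len)
{
    int fds[2];
    pid_t pid;
    ssize_t n;

    if (pipe(fds) == -1)
        return (-1);
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return (-1);
    }
    if (pid == 0) {
        close(fds[0]);                  /* child does not read */
        if (fds[1] != STDOUT_FILENO) {
            if (dup2(fds[1], STDOUT_FILENO) == -1)
                _exit(1);
            close(fds[1]);
        }
        _exit(say_hello() == 0 ? 0 : 1);
    }
    close(fds[1]);                      /* parent does not write */
    n = read(fds[0], buf, len);
    close(fds[0]);
    (void)waitpid(pid, NULL, 0);
    return (n);
}
```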
- .pp
- Let us now examine a program that creates a pipe for communication between
- its child and itself (Figure 1).
- A pipe is created by a parent process, which then forks.
- When a process forks, the parent's descriptor table is copied into
- the child's.
- .pp
- In Figure 1, the parent process makes a call to the system routine
- \fIpipe().\fP
- This routine creates a pipe and places descriptors for the sockets
- for the two ends of the pipe in the process's descriptor table.
- \fIPipe()\fP
- is passed an array into which it places the index numbers of the
- sockets it created.
- The two ends are not equivalent. The socket whose index is
- returned in the first element of the array is opened for reading only,
- while the socket in the second element is opened only for writing.
- This corresponds to the fact that the standard input is the first
- descriptor of a process's descriptor table and the standard output
- is the second. After creating the pipe, the parent creates the child
- with which it will share the pipe by calling \fIfork().\fP
- Figure 2 illustrates the effect of a fork.
- The parent process's descriptor table points to both ends of the pipe.
- After the fork, both parent's and child's descriptor tables point to
- the pipe.
- The child can then use the pipe to send a message to the parent.
- .(z
- .so fig2.pic
- .ce 2
- Figure 2\ \ Sharing a pipe between parent and child
- .ce 0
- .)z
- .pp
- Just what is a pipe?
- It is a one-way communication mechanism, with one end opened
- for reading and the other end for writing.
- Therefore, parent and child need to agree on which way to turn
- the pipe, from parent to child or the other way around.
- Using the same pipe for communication both from parent to child and
- from child to parent would be possible (since both processes have
- references to both ends), but very complicated.
- If the parent and child are to have a two-way conversation,
- the parent creates two pipes, one for use in each direction.
- (In accordance with their plans, both parent and child in the example above
- close the socket that they will not use. It is not required that unused
- descriptors be closed, but it is good practice.)
- A pipe is also a \fIstream\fP communication mechanism; that
- is, all messages sent through the pipe are placed in order
- and reliably delivered. When the reader asks for a certain
- number of bytes from this
- stream, the reader is given as many bytes as are available, up
- to the amount of the request. Note that these bytes may have come from
- the same call to \fIwrite()\fP or from several calls to \fIwrite()\fP
- which were concatenated.
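- .pp
- The stream behavior described above can be observed with a short
- sketch. It is an illustration only, and the helper name
- \fIread_after_two_writes()\fP is hypothetical. Two separate writes are
- made into a pipe before anything is read; a single read then returns
- bytes from both, limited only by the size of the request.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "abc" and "defg" into a pipe with two
 * separate calls, then read once. Returns the number of bytes that
 * the single read() delivered into buf, or -1 on error.
 */
ssize_t
read_after_two_writes(char *buf, size_t len)
{
    int fds[2];     /* fds[0] is the read end, fds[1] the write end */
    ssize_t n;

    if (pipe(fds) == -1)
        return (-1);
    if (write(fds[1], "abc", 3) != 3 || write(fds[1], "defg", 4) != 4) {
        close(fds[0]);
        close(fds[1]);
        return (-1);
    }
    close(fds[1]);                  /* no more data will be written */
    n = read(fds[0], buf, len);     /* no message boundaries remain */
    close(fds[0]);
    return (n);
}
```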
- .b
- .sh 1 "Socketpairs"
- .r
- .pp
- Berkeley UNIX 4.4BSD provides a slight generalization of pipes. A pipe is a
- pair of connected sockets for one-way stream communication. One may
- obtain a pair of connected sockets for two-way stream communication
- by calling the routine \fIsocketpair().\fP
- The program in Figure 3 calls \fIsocketpair()\fP
- to create such a connection. The program uses the link for
- communication in both directions. Since socketpairs are
- an extension of pipes, their use resembles that of pipes.
- Figure 4 illustrates the result of a fork following a call to
- \fIsocketpair().\fP
- .pp
- \fISocketpair()\fP
- takes as
- arguments a specification of a domain, a style of communication, and a
- protocol.
- These are the parameters shown in the example.
- Domains and protocols will be discussed in the next section.
- Briefly,
- a domain is a space of names that may be bound
- to sockets and implies certain other conventions.
- Currently, socketpairs have only been implemented for one
- domain, called the UNIX domain.
- The UNIX domain uses UNIX path names for naming sockets.
- It only allows communication
- between sockets on the same machine.
- .pp
- Note that the header files
- .i "<sys/socket.h>"
- and
- .i "<sys/types.h>"
- are required in this program.
- The constants AF_UNIX and SOCK_STREAM are defined in
- .i "<sys/socket.h>,"
- which in turn requires the file
- .i "<sys/types.h>"
- for some of its definitions.
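- .pp
- A minimal sketch of two-way use of a socketpair follows. It is not the
- program of Figure 3; the helper name \fIecho_through_child()\fP is
- hypothetical, and the child is assumed simply to send back whatever it
- reads. Each process reads and writes on its single descriptor.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: the parent sends `msg` to a child over a
 * socketpair; the child reads it and writes it back on the same
 * descriptor. Returns the number of reply bytes stored in `reply`,
 * or -1 on error.
 */
ssize_t
echo_through_child(const char *msg, char *reply, size_t len)
{
    int sv[2];
    pid_t pid;
    ssize_t n;
    char buf[256];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return (-1);
    pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return (-1);
    }
    if (pid == 0) {
        close(sv[0]);                       /* child uses sv[1] only */
        n = read(sv[1], buf, sizeof(buf));
        if (n <= 0 || write(sv[1], buf, (size_t)n) != n)
            _exit(1);
        _exit(0);
    }
    close(sv[1]);                           /* parent uses sv[0] only */
    n = -1;
    if (write(sv[0], msg, strlen(msg)) == (ssize_t)strlen(msg))
        n = read(sv[0], reply, len);
    close(sv[0]);
    (void)waitpid(pid, NULL, 0);
    return (n);
}
```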
- .(z
- .ft CW
- .so socketpair.c
- .ft
- .ce 1
- Figure 3\ \ Use of a socketpair
- .)z
- .(z
- .so fig3.pic
- .ce 1
- Figure 4\ \ Sharing a socketpair between parent and child
- .)z
- .b
- .sh 1 "Domains and Protocols"
- .r
- .pp
- Pipes and socketpairs are a simple solution for communicating between
- a parent and child or between child processes.
- What if we wanted to have processes that have no common ancestor
- with whom to set up communication?
- Neither standard UNIX pipes nor socketpairs are
- the answer here, since both mechanisms require a common ancestor to
- set up the communication.
- We would like to have two processes separately create sockets
- and then have messages sent between them. This is often the
- case when providing or using a service in the system. This is
- also the case when the communicating processes are on separate machines.
- In Berkeley UNIX 4.4BSD one can create individual sockets, give them names and
- send messages between them.
- .pp
- Sockets created by different programs use names to refer to one another;
- names generally must be translated into addresses for use.
- The space from which an address is drawn is referred to as a
- .i domain.
- There are several domains for sockets.
- Two that will be used in the examples here are the UNIX domain (or AF_UNIX,
- for Address Family UNIX) and the Internet domain (or AF_INET).
- UNIX domain IPC is an experimental facility in 4.2BSD and 4.3BSD.
- In the UNIX domain, a socket is given a path name within the file system
- name space.
- A file system node is created for the socket and other processes may
- then refer to the socket by giving the proper pathname.
- UNIX domain names, therefore, allow communication between any two processes
- that work in the same file system.
- The Internet domain is the UNIX implementation of the DARPA Internet
- standard protocols IP/TCP/UDP.
- Addresses in the Internet domain consist of a machine network address
- and an identifying number, called a port.
- Internet domain names allow communication between machines.
- .pp
- Communication follows some particular ``style.''
- Currently, communication is either through a \fIstream\fP
- or by \fIdatagram.\fP
- Stream communication implies several things. Communication takes
- place across a connection between two sockets. The communication
- is reliable, error-free, and, as in pipes, no message boundaries are
- kept. Reading from a stream may result in reading the data sent from
- one or several calls to \fIwrite()\fP
- or only part of the data from a single call, if there is not enough room
- for the entire message, or if not all the data from a large message
- has been transferred.
- The protocol implementing such a style will retransmit messages
- received with errors. It will also return error messages if one tries to
- send a message after the connection has been broken.
- Datagram communication does not use connections. Each message is
- addressed individually. If the address is correct, it will generally
- be received, although this is not guaranteed. Often datagrams are
- used for requests that require a response from the
- recipient. If no response
- arrives in a reasonable amount of time, the request is repeated.
- The individual datagrams will be kept separate when they are read, that
- is, message boundaries are preserved.
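- .pp
- The difference in message boundaries can be seen with a small sketch.
- It assumes that \fIsocketpair()\fP accepts both SOCK_STREAM and
- SOCK_DGRAM in the UNIX domain, and the helper name
- \fIfirst_read_size()\fP is hypothetical. Two messages of 3 and 5 bytes
- are written, then one read is made with a large buffer. With a datagram
- socket the read returns the first message alone; with a stream socket
- it may return the bytes of both.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send "one" and "three" as two separate writes
 * over a socketpair of the given style, then read once into a large
 * buffer. Returns the size of that first read, or -1 on error.
 */
ssize_t
first_read_size(int style)
{
    int sv[2];
    char buf[64];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, style, 0, sv) == -1)
        return (-1);
    if (write(sv[0], "one", 3) == 3 && write(sv[0], "three", 5) == 5)
        n = read(sv[1], buf, sizeof(buf));
    close(sv[0]);
    close(sv[1]);
    return (n);
}
```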
- .pp
- The difference in performance between the two styles of communication is
- generally less important than the difference in semantics. The
- performance gain that one might find in using datagrams must be weighed
- against the increased complexity of the program, which must now concern
- itself with lost or out of order messages. If lost messages may simply be
- ignored, the quantity of traffic may be a consideration. The expense
- of setting up a connection is best justified by frequent use of the connection.
- Since the performance of a protocol changes as it is tuned for different
- situations, it is best to seek the most up-to-date information when
- making choices for a program in which performance is crucial.
- .pp
- A protocol is a set of rules, data formats and conventions that regulate the
- transfer of data between participants in the communication.
- In general, there is one protocol for each socket type (stream,
- datagram, etc.) within each domain.
- The code that implements a protocol
- keeps track of the names that are bound to sockets,
- sets up connections and transfers data between sockets,
- perhaps sending the data across a network.
- It is possible for several protocols, differing only in low level
- details, to implement the same style of communication within
- a particular domain. Although it is possible to select
- which protocol should be used, for nearly all uses it is sufficient to
- request the default protocol. This has been done in all of the example
- programs.
- .pp
- One specifies the domain, style and protocol of a socket when
- it is created. For example, in Figure 5a the call to \fIsocket()\fP
- causes the creation of a datagram socket with the default protocol
- in the UNIX domain.
- .b
- .sh 1 "Datagrams in the UNIX Domain"
- .r
- .(z
- .ft CW
- .so udgramread.c
- .ft
- .ce 1
- Figure 5a\ \ Reading UNIX domain datagrams
- .)z
- .pp
- Let us now look at two programs that create sockets separately.
- The programs in Figures 5a and 5b use datagram communication
- rather than a stream.
- The structure used to name UNIX domain sockets is defined
- in the file \fI<sys/un.h>.\fP
- The definition has also been included in the example for clarity.
- .pp
- Each program creates a socket with a call to \fIsocket().\fP
- These sockets are in the UNIX domain.
- Once a name has been decided upon it is attached to a socket by the
- system call \fIbind().\fP
- The program in Figure 5a uses the name ``socket'',
- which it binds to its socket.
- This name will appear in the working directory of the program.
- The program in Figure 5b uses its
- socket only for sending messages. It does not create a name for
- the socket because no other process has to refer to it.
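- .pp
- The naming steps described above can be sketched in a single function.
- This is an illustration, not the code of Figures 5a and 5b: the helper
- name \fIunix_dgram_roundtrip()\fP is hypothetical, and both sockets are
- assumed to live in one process so that the example is self-contained.
- Only the receiving socket is bound to a name; the name is removed with
- \fIunlink()\fP when the work is done.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a datagram socket to `path`, send it `msg`
 * from a second, unnamed socket, and read the datagram back.
 * Returns the number of bytes received, or -1 on error.
 */
ssize_t
unix_dgram_roundtrip(const char *path, const char *msg, char *buf, size_t len)
{
    struct sockaddr_un name;
    int rsock, ssock;
    ssize_t n = -1;

    if (strlen(path) >= sizeof(name.sun_path))
        return (-1);
    memset(&name, 0, sizeof(name));
    name.sun_family = AF_UNIX;
    strcpy(name.sun_path, path);

    rsock = socket(AF_UNIX, SOCK_DGRAM, 0);     /* will be named */
    ssock = socket(AF_UNIX, SOCK_DGRAM, 0);     /* stays unnamed */
    if (rsock != -1 && ssock != -1 &&
        bind(rsock, (struct sockaddr *)&name, sizeof(name)) == 0) {
        if (sendto(ssock, msg, strlen(msg), 0,
            (struct sockaddr *)&name, sizeof(name)) != -1)
            n = read(rsock, buf, len);
        (void)unlink(path);     /* remove the name from the file system */
    }
    if (rsock != -1)
        close(rsock);
    if (ssock != -1)
        close(ssock);
    return (n);
}
```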
- .(z
- .ft CW
- .so udgramsend.c
- .ft
- .ce 1
- Figure 5b\ \ Sending a UNIX domain datagram
- .)z
- .pp
- Names in the UNIX domain are path names. Like file path names they may
- be either absolute (e.g. ``/dev/imaginary'') or relative (e.g. ``socket'').
- Because these names are used to allow processes to rendezvous, relative
- path names can pose difficulties and should be used with care.
- When a name is bound into the name space, a file (inode) is allocated in the
- file system. If
- the inode is not deallocated, the name will continue to exist even after
- the bound socket is closed. This can cause subsequent runs of a program
- to find that a name is unavailable, and can cause
- directories to fill up with these
- objects. The names are removed by calling \fIunlink()\fP or using
- the \fIrm\fP\|(1) command.
- Names in the UNIX domain are only used for rendezvous. They are not used
- for message delivery once a connection is established. Therefore, in
- contrast with the Internet domain, unbound sockets need not be (and are
- not) automatically given addresses when they are connected.
- .pp
- There is no established means of communicating names to interested
- parties. In the example, the program in Figure 5b gets the
- name of the socket to which it will send its message through its
- command line arguments. Once a line of communication has been created,
- one can send the names of additional, perhaps new, sockets over the link.
- Facilities will have to be built that will make the distribution of
- names less of a problem than it now is.
- .b
- .sh 1 "Datagrams in the Internet Domain"
- .r
- .(z
- .ft CW
- .so dgramread.c
- .ft
- .ce 1
- Figure 6a\ \ Reading Internet domain datagrams
- .)z
- .pp
- The examples in Figures 6a and 6b are very close to the previous example
- except that the socket is in the Internet domain.
- The structure of Internet domain addresses is defined in the file
- \fI<netinet/in.h>\fP.
- Internet addresses specify a host address (a 32-bit number)
- and a delivery slot, or port, on that
- machine. These ports are managed by the system routines that implement
- a particular protocol.
- Unlike UNIX domain names, Internet socket names are not entered into
- the file system and, therefore,
- they do not have to be unlinked after the socket has been closed.
- When a message must be sent between machines it is sent to
- the protocol routine on the destination machine, which interprets the
- address to determine to which socket the message should be delivered.
- Several different protocols may be active on
- the same machine, but, in general, they will not communicate with one another.
- As a result, different protocols are allowed to use the same port numbers.
- Thus, implicitly, an Internet address is a triple including a protocol as
- well as the port and machine address.
- An \fIassociation\fP is a temporary or permanent specification
- of a pair of communicating sockets.
- An association is thus identified by the tuple
- <\fIprotocol, local machine address, local port,
- remote machine address, remote port\fP>.
- An association may be transient when using datagram sockets;
- the association actually exists during a \fIsend\fP operation.
- .(z
- .ft CW
- .so dgramsend.c
- .ft
- .ce 1
- Figure 6b\ \ Sending an Internet domain datagram
- .)z
- .pp
- The protocol for a socket is chosen when the socket is created. The
- local machine address for a socket can be any valid network address of the
- machine, if it has more than one, or it can be the wildcard value
- INADDR_ANY.
- The wildcard value is used in the program in Figure 6a.
- If a machine has several network addresses, it is likely
- that messages sent to any of the addresses should be deliverable to
- a socket. This will be the case if the wildcard value has been chosen.
- Note that even if the wildcard value is chosen, a program sending messages
- to the named socket must specify a valid network address. One can be willing
- to receive from ``anywhere,'' but one cannot send a message ``anywhere.''
- The program in Figure 6b is given the destination host name as a command
- line argument.
- To determine a network address to which it can send the message, it looks
- up
- the host address by the call to \fIgethostbyname()\fP.
- The returned structure includes the host's network address,
- which is copied into the structure specifying the
- destination of the message.
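- .pp
- The asymmetry between receiving and sending addresses can be sketched
- as follows. This is not the code of Figures 6a and 6b. The helper name
- \fIudp_loopback_roundtrip()\fP is hypothetical, and the destination is
- assumed to be the loopback address INADDR_LOOPBACK rather than an
- address found with \fIgethostbyname(),\fP so that the sketch needs no
- name lookup. The receiver binds the wildcard; the sender must name a
- specific address.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: receive on the wildcard address, send to the
 * loopback address, and return the number of bytes received (or -1).
 */
ssize_t
udp_loopback_roundtrip(const char *msg, char *buf, size_t len)
{
    struct sockaddr_in name;
    socklen_t length = sizeof(name);
    int rsock, ssock;
    ssize_t n = -1;

    rsock = socket(AF_INET, SOCK_DGRAM, 0);
    ssock = socket(AF_INET, SOCK_DGRAM, 0);
    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_addr.s_addr = htonl(INADDR_ANY);   /* receive on any address */
    name.sin_port = 0;                          /* system picks the port */
    if (rsock != -1 && ssock != -1 &&
        bind(rsock, (struct sockaddr *)&name, sizeof(name)) == 0 &&
        getsockname(rsock, (struct sockaddr *)&name, &length) == 0) {
        /* The sender needs a real address; the wildcard will not do. */
        name.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        if (sendto(ssock, msg, strlen(msg), 0,
            (struct sockaddr *)&name, sizeof(name)) != -1)
            n = read(rsock, buf, len);
    }
    if (rsock != -1)
        close(rsock);
    if (ssock != -1)
        close(ssock);
    return (n);
}
```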
- .pp
- The port number can be thought of as the number of a mailbox, into
- which the protocol places one's messages. Certain daemons, offering
- certain advertised services, have reserved
- or ``well-known'' port numbers. These fall in the range
- from 1 to 1023. Higher numbers are available to general users.
- Only servers need to ask for a particular number.
- The system will assign an unused port number when an address
- is bound to a socket.
- This may happen when an explicit \fIbind\fP
- call is made with a port number of 0, or
- when a \fIconnect\fP or \fIsend\fP
- is performed on an unbound socket.
- Note that port numbers are not automatically reported back to the user.
- After calling \fIbind(),\fP asking for port 0, one may call
- \fIgetsockname()\fP to discover what port was actually assigned.
- The routine \fIgetsockname()\fP
- will not work for names in the UNIX domain.
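- .pp
- Asking for port 0 and then discovering the assigned port might look
- like the following sketch. The helper name \fIbind_any_port()\fP is
- hypothetical; it is assumed that the caller has already created an
- Internet domain socket.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: bind `sock` to the wildcard address with port 0
 * and return the port the system assigned, in host byte order.
 * Returns 0 on error.
 */
unsigned short
bind_any_port(int sock)
{
    struct sockaddr_in name;
    socklen_t length = sizeof(name);

    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_addr.s_addr = htonl(INADDR_ANY);
    name.sin_port = htons(0);           /* let the system choose */
    if (bind(sock, (struct sockaddr *)&name, sizeof(name)) == -1)
        return (0);
    /* The assigned port is not reported by bind(); ask for it. */
    if (getsockname(sock, (struct sockaddr *)&name, &length) == -1)
        return (0);
    return (ntohs(name.sin_port));
}
```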
- .pp
- The format of the socket address is specified in part by standards within the
- Internet domain. The specification includes the order of the bytes in
- the address. Because machines differ in the internal representation
- they ordinarily use
- to represent integers, printing out the port number as returned by
- \fIgetsockname()\fP may result in a misinterpretation. To
- print out the number, it is necessary to use the routine \fIntohs()\fP
- (for \fInetwork to host: short\fP) to convert the number from the
- network representation to the host's representation. On some machines,
- such as 68000-based machines, this is a null operation. On others,
- such as VAXes, this results in a swapping of bytes. Another routine
- exists to convert a short integer from the host format to the network format,
- called \fIhtons()\fP; similar routines exist for long integers.
- For further information, refer to the
- entry for \fIbyteorder\fP in section 3 of the manual.
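- .pp
- The effect of the conversion can be checked with a short sketch. The
- helper name \fIfirst_wire_byte()\fP is hypothetical, and the header
- \fI<arpa/inet.h>\fP is assumed to declare \fIhtons()\fP and
- \fIntohs()\fP; on some systems another header is used. In network
- order the most significant byte is stored first, whatever the host's
- own representation.

```c
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical helper: return the first byte in memory of a 16-bit
 * port number after conversion to network byte order.
 */
unsigned char
first_wire_byte(unsigned short port)
{
    unsigned short net = htons(port);
    unsigned char bytes[sizeof(net)];

    memcpy(bytes, &net, sizeof(net));
    return (bytes[0]);      /* the high-order byte on every host */
}
```

- Converting with \fIhtons()\fP and back with \fIntohs()\fP returns the
- original value on any machine.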
- .b
- .sh 1 "Connections"
- .r
- .pp
- To send data between stream sockets (having communication style SOCK_STREAM),
- the sockets must be connected.
- Figures 7a and 7b show two programs that create such a connection.
- The program in Figure 7a is relatively simple.
- To initiate a connection, this program simply creates
- a stream socket, then calls \fIconnect()\fP,
- specifying the address of the socket to which
- it wishes its socket connected. Provided that the target socket exists and
- is prepared to handle a connection, connection will be complete,
- and the program can begin to send
- messages. Messages will be delivered in order without message
- boundaries, as with pipes. The connection is destroyed when either
- socket is closed (or soon thereafter). If a process persists
- in sending messages after the connection is closed, a SIGPIPE signal
- is sent to the process by the operating system. Unless explicit action
- is taken to handle the signal (see the manual page for \fIsignal\fP
- or \fIsigvec\fP),
- the process will terminate and the shell
- will print the message ``broken pipe.''
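- .pp
- One way of taking explicit action is to ignore the signal and examine
- the error returned by the write, as in the sketch below. It is an
- illustration under stated assumptions: the helper name
- \fIwrite_after_peer_close()\fP is hypothetical, and a UNIX domain
- socketpair stands in for an Internet domain connection so that the
- failure occurs on the first write.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: ignore SIGPIPE, then write to a stream socket
 * whose peer has already been closed. Returns the errno value of the
 * failed write (EPIPE is expected), 0 if the write unexpectedly
 * succeeded, or -1 if the sockets could not be created.
 */
int
write_after_peer_close(void)
{
    int sv[2], err = 0;
    void (*old)(int);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return (-1);
    old = signal(SIGPIPE, SIG_IGN);     /* do not die on a broken pipe */
    close(sv[1]);                       /* destroy the connection */
    if (write(sv[0], "x", 1) == -1)
        err = errno;
    if (old != SIG_ERR)
        (void)signal(SIGPIPE, old);     /* restore previous handling */
    close(sv[0]);
    return (err);
}
```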
- .(z
- .ft CW
- .so streamwrite.c
- .ft
- .ce 1
- Figure 7a\ \ Initiating an Internet domain stream connection
- .)z
- .(z
- .ft CW
- .so streamread.c
- .ft
- .ce 1
- Figure 7b\ \ Accepting an Internet domain stream connection
- .sp 2
- .ft CW
- .so strchkread.c
- .ft
- .ce 1
- Figure 7c\ \ Using select() to check for pending connections
- .)z
- .(z
- .so fig8.pic
- .sp
- .ce 1
- Figure 8\ \ Establishing a stream connection
- .)z
- .pp
- Forming a connection is asymmetrical; one process, such as the
- program in Figure 7a, requests a connection with a particular socket,
- the other process accepts connection requests.
- Before a connection can be accepted a socket must be created and an address
- bound to it. This
- situation is illustrated in the top half of Figure 8. Process 2
- has created a socket and bound a port number to it. Process 1 has created an
- unnamed socket.
- The address bound to Process 2's socket is then made known to Process 1 and,
- perhaps, to several other potential communicants as well.
- If there are several possible communicants,
- this one socket might receive several requests for connections.
- As a result, a new socket is created for each connection. This new socket
- is the endpoint for communication within this process for this connection.
- A connection may be destroyed by closing the corresponding socket.
- .pp
- The program in Figure 7b is a rather trivial example of a server. It
- creates a socket to which it binds a name, which it then advertises.
- (In this case it prints out the port number.) The program then calls
- \fIlisten()\fP for this socket.
- Since several clients may attempt to connect more or less
- simultaneously, a queue of pending connections is maintained in the system
- address space. \fIListen()\fP
- marks the socket as willing to accept connections and initializes the queue.
- When a connection is requested, it is listed in the queue. If the
- queue is full, an error status may be returned to the requester.
- The maximum length of this queue is specified by the second argument of
- \fIlisten()\fP; the maximum length is limited by the system.
- Once the listen call has been completed, the program enters
- an infinite loop. On each pass through the loop, a new connection is
- accepted and removed from the queue, and, hence, a new socket for the
- connection is created. The bottom half of Figure 8 shows the result of
- Process 1 connecting with the named socket of Process 2, and Process 2
- accepting the connection. After the connection is created, the
- service, in this case printing out the messages, is performed and the
- connection socket closed. The \fIaccept()\fP
- call will take a pending connection
- request from the queue if one is available, or block waiting for a request.
- Messages are read from the connection socket.
- Reads from an active connection will normally block until data is available.
- The number of bytes read is returned. When a connection is destroyed,
- the read call returns immediately. The number of bytes returned will
- be zero.
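- .pp
- The sequence of \fIlisten(),\fP \fIaccept()\fP and reading until
- end-of-connection can be sketched as follows. This is not the program
- of Figure 7b. The helper names are hypothetical, the server is assumed
- to bind the loopback address instead of the wildcard, and the client
- side is placed in the same process so that the example is
- self-contained.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a listening stream socket on the loopback
 * address with a system-chosen port. The bound address is stored in
 * *name. Returns the socket, or -1 on error.
 */
int
listen_on_loopback(struct sockaddr_in *name)
{
    socklen_t length = sizeof(*name);
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock == -1)
        return (-1);
    memset(name, 0, sizeof(*name));
    name->sin_family = AF_INET;
    name->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    name->sin_port = 0;
    if (bind(sock, (struct sockaddr *)name, sizeof(*name)) == -1 ||
        getsockname(sock, (struct sockaddr *)name, &length) == -1 ||
        listen(sock, 5) == -1) {        /* queue up to 5 requests */
        close(sock);
        return (-1);
    }
    return (sock);
}

/*
 * Hypothetical helper: connect, send `msg` and close on the client
 * side; accept and read on the server side until read() returns zero.
 * Returns the total number of bytes the server read, or -1 on error.
 */
ssize_t
serve_one_connection(const char *msg)
{
    struct sockaddr_in name;
    char buf[128];
    ssize_t n, total = 0;
    int lsock, csock, msgsock;

    if ((lsock = listen_on_loopback(&name)) == -1)
        return (-1);
    csock = socket(AF_INET, SOCK_STREAM, 0);
    if (csock == -1 ||
        connect(csock, (struct sockaddr *)&name, sizeof(name)) == -1 ||
        write(csock, msg, strlen(msg)) == -1) {
        if (csock != -1)
            close(csock);
        close(lsock);
        return (-1);
    }
    close(csock);                       /* client destroys the connection */

    msgsock = accept(lsock, NULL, NULL);    /* new socket per connection */
    if (msgsock == -1) {
        close(lsock);
        return (-1);
    }
    while ((n = read(msgsock, buf, sizeof(buf))) > 0)
        total += n;                     /* n == 0 marks the end */
    close(msgsock);
    close(lsock);
    return (n == 0 ? total : -1);
}
```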
- .pp
- The program in Figure 7c is a slight variation on the server in Figure 7b.
- It avoids blocking when there are no pending connection requests by
- calling \fIselect()\fP
- to check for pending requests before calling \fIaccept().\fP
- This strategy is useful when connections may be received
- on more than one socket, or when data may arrive on other connected
- sockets before another connection request.
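- .pp
- The check itself might be sketched as below. This is not the code of
- Figure 7c; the helper name \fIready_within()\fP and the use of a
- timeout given in whole seconds are assumptions made for illustration.
- For a listening socket, ``readable'' means that a connection request
- is pending, so \fIaccept()\fP will not block.

```c
#include <sys/types.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to `seconds` for `sock` to become
 * readable. Returns 1 if it is ready, 0 on timeout, -1 on error.
 */
int
ready_within(int sock, long seconds)
{
    fd_set ready;
    struct timeval to;

    FD_ZERO(&ready);
    FD_SET(sock, &ready);
    to.tv_sec = seconds;
    to.tv_usec = 0;
    if (select(sock + 1, &ready, NULL, NULL, &to) == -1)
        return (-1);
    return (FD_ISSET(sock, &ready) ? 1 : 0);
}
```

- A server would call \fIaccept()\fP only when this returns 1, and could
- place further descriptors in the same set to watch several sockets.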
- .pp
- The programs in Figures 9a and 9b use stream communication
- in the UNIX domain. Streams in the UNIX domain can be used for this sort
- of program in exactly the same way as Internet domain streams, except for
- the form of the names and the restriction of the connections to a single
- file system. There are some differences, however, in the functionality of
- streams in the two domains, notably in the handling of
- \fIout-of-band\fP data (discussed briefly below). These differences
- are beyond the scope of this paper.
- .(z
- .ft CW
- .so ustreamwrite.c
- .ft
- .ce 1
- Figure 9a\ \ Initiating a UNIX domain stream connection
- .sp 2
- .ft CW
- .so ustreamread.c
- .ft
- .ce 1
- Figure 9b\ \ Accepting a UNIX domain stream connection
- .)z
- .b
- .sh 1 "Reads, Writes, Recvs, etc."
- .r
- .pp
- UNIX 4.4BSD has several system calls for reading and writing information.
- The simplest calls are \fIread()\fP and \fIwrite().\fP \fIWrite()\fP
- takes as arguments the index of a descriptor, a pointer to a buffer
- containing the data, and the size of the data.
- The descriptor may indicate either a file or a connected socket.
- ``Connected'' can mean either a connected stream socket (as described
- in Section 8) or a datagram socket for which a \fIconnect()\fP
- call has provided a default destination (see the \fIconnect()\fP manual page).
- \fIRead()\fP also takes a descriptor that indicates either a file or a socket.
- \fIWrite()\fP requires a connected socket since no destination is
- specified in the parameters of the system call.
- \fIRead()\fP can be used for either a connected or an unconnected socket.
- These calls are, therefore, quite flexible and may be used to
- write applications that require no assumptions about the source of
- their input or the destination of their output.
- Variations on \fIread()\fP and \fIwrite()\fP
- allow input to be scattered into, and output to be gathered from,
- several separate buffers, while retaining the flexibility to handle
- both files and sockets. These are \fIreadv()\fP and \fIwritev(),\fP
- for read and write \fIvector.\fP
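- .pp
- As a minimal sketch of these vector calls (not one of the tutorial's figures),
- the following C fragment gathers two buffers into one \fIwritev()\fP
- call and scatters incoming bytes into two buffers with \fIreadv().\fP
- The helper names \fIwrite_two()\fP and \fIread_two()\fP are hypothetical,
- and the code assumes a modern POSIX system with ANSI C prototypes.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Gather a header and a body from two separate buffers into one write.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t
write_two(int fd, const char *hdr, const char *body)
{
	struct iovec iov[2];

	assert(hdr != NULL && body != NULL);
	iov[0].iov_base = (void *)hdr;	/* writev() does not modify the data */
	iov[0].iov_len = strlen(hdr);
	iov[1].iov_base = (void *)body;
	iov[1].iov_len = strlen(body);
	return writev(fd, iov, 2);
}

/*
 * Scatter incoming bytes: the first hdrlen bytes go to hdr, the rest
 * (up to bodylen bytes) go to body.  Returns the total number of bytes
 * read, or -1 on error.
 */
ssize_t
read_two(int fd, char *hdr, size_t hdrlen, char *body, size_t bodylen)
{
	struct iovec iov[2];

	assert(hdr != NULL && body != NULL);
	iov[0].iov_base = hdr;
	iov[0].iov_len = hdrlen;
	iov[1].iov_base = body;
	iov[1].iov_len = bodylen;
	return readv(fd, iov, 2);
}
```

- .pp
- Because both helpers take a plain descriptor, they can be applied to a
- pipe, a file, or a connected socket alike.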
- .pp
- It is sometimes necessary to send high priority data over a
- connection that may have unread low priority data at the
- other end. For example, a user interface process may be interpreting
- commands and sending them on to another process through a stream connection.
- The user interface may have filled the stream with as yet unprocessed
- requests when the user types
- a command to cancel all outstanding requests.
- Rather than have the high priority data wait
- to be processed after the low priority data, it is possible to
- send it as \fIout-of-band\fP
- (OOB) data. The notification of pending OOB data results in the generation of
- a SIGURG signal, if this signal has been enabled (see the manual
- page for \fIsignal\fP or \fIsigvec\fP).
- See [Leffler et al. 1986] for a more complete description of the OOB mechanism.
- A pair of calls similar to \fIread()\fP and \fIwrite()\fP
- allows options, including sending
- and receiving OOB information; these are \fIsend()\fP
- and \fIrecv().\fP
- These calls are used only with sockets; specifying a descriptor for a file will
- result in the return of an error status. These calls also allow
- \fIpeeking\fP at data in a stream.
- That is, they allow a process to read data without removing the data from
- the stream. One use of this facility is to read ahead in a stream
- to determine the size of the next item to be read.
- When none of these options is used, these calls behave the same as
- \fIread()\fP and \fIwrite().\fP
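- .pp
- The read-ahead use of peeking can be sketched as follows. The framing
- is an assumption made only for this illustration: each item is taken
- to begin with a single byte giving the length of the payload that follows.
- The function name \fInext_item_size()\fP is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Assumed framing (not from the tutorial): every item starts with one
 * byte holding the number of payload bytes that follow it.
 *
 * Returns that length without removing anything from the stream, or -1
 * on error or if the peer has closed the connection.
 */
int
next_item_size(int sock)
{
	unsigned char len;
	ssize_t cc;

	assert(sock >= 0);
	cc = recv(sock, &len, 1, MSG_PEEK);	/* look, but do not consume */
	if (cc != 1)
		return -1;
	return len;
}
```

- .pp
- A later \fIrecv()\fP or \fIread()\fP on the same socket still returns
- the length byte, since peeking leaves the data in the stream.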
- .pp
- To send datagrams, one must be allowed to specify the destination.
- The call \fIsendto()\fP
- takes a destination address as an argument and is therefore used for
- sending datagrams. The call \fIrecvfrom()\fP
- is often used to read datagrams, since this call returns the address
- of the sender, if it is available, along with the data.
- If the identity of the sender does not matter, one may use \fIread()\fP
- or \fIrecv().\fP
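- .pp
- A minimal sketch of the datagram calls, assuming Internet domain
- datagram sockets on the loopback address: \fIrecvfrom()\fP
- supplies the sender's address, which is then handed to \fIsendto()\fP
- to return a reply. The helper names \fIbound_dgram()\fP and
- \fIecho_once()\fP are hypothetical, and the 1024-byte buffer is an
- arbitrary choice.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a datagram socket bound to the loopback address.  The system
 * picks the port; the resulting name is returned in *name.
 * Returns the socket descriptor, or -1 on error.
 */
int
bound_dgram(struct sockaddr_in *name)
{
	socklen_t len = sizeof(*name);
	int sock;

	assert(name != NULL);
	sock = socket(AF_INET, SOCK_DGRAM, 0);
	if (sock < 0)
		return -1;
	memset(name, 0, sizeof(*name));
	name->sin_family = AF_INET;
	name->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	name->sin_port = 0;		/* let the system choose */
	if (bind(sock, (struct sockaddr *)name, sizeof(*name)) < 0 ||
	    getsockname(sock, (struct sockaddr *)name, &len) < 0) {
		close(sock);
		return -1;
	}
	return sock;
}

/*
 * Wait for one datagram and send the same bytes back to whoever sent it.
 * Returns the number of bytes echoed, or -1 on error.
 */
ssize_t
echo_once(int sock)
{
	struct sockaddr_in from;
	socklen_t fromlen = sizeof(from);
	char buf[1024];
	ssize_t cc;

	cc = recvfrom(sock, buf, sizeof(buf), 0,
	    (struct sockaddr *)&from, &fromlen);
	if (cc < 0)
		return -1;
	return sendto(sock, buf, (size_t)cc, 0,
	    (struct sockaddr *)&from, fromlen);
}
```

- .pp
- A client that does not care who answered can collect the reply with
- \fIrecv()\fP or \fIread()\fP instead of \fIrecvfrom().\fP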
- .pp
- Finally, a pair of calls allows the sending and
- receiving of messages from multiple buffers, when the address of the
- recipient must be specified. These are \fIsendmsg()\fP and
- \fIrecvmsg().\fP
- These calls are actually quite general and have other uses,
- including, in the UNIX domain, the transmission of a file descriptor from one
- process to another.
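- .pp
- The transmission of a descriptor can be sketched as below. This is an
- illustration under stated assumptions rather than the method of the
- original system: it uses the control-message form of \fIstruct msghdr\fP
- (\fImsg_control\fP with SCM_RIGHTS) found on current POSIX systems, and
- the helper names \fIsend_fd()\fP and \fIrecv_fd()\fP are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer large enough for one descriptor, suitably aligned. */
union fd_ctl {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(int))];
};

/*
 * Send the open descriptor fd across the UNIX domain socket sock.
 * One byte of ordinary data accompanies it so the receiver has
 * something to read.  Returns 0 on success, -1 on error.
 */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	union fd_ctl ctl;
	char byte = 0;

	assert(fd >= 0);
	iov.iov_base = &byte;
	iov.iov_len = 1;
	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Receive a descriptor sent by send_fd().  Returns the new descriptor,
 * or -1 if none arrived.
 */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	union fd_ctl ctl;
	char byte;
	int fd;

	iov.iov_base = &byte;
	iov.iov_len = 1;
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

- .pp
- The descriptor returned by \fIrecv_fd()\fP is a new index in the
- receiving process that refers to the same open file as the one sent.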
- .pp
- The various options for reading and writing are shown in Figure 10,
- together with their parameters. The parameters for each system call
- reflect the differences in function of the different calls.
- In the examples given in this paper, the calls \fIread()\fP and
- \fIwrite()\fP have been used whenever possible.
- .(z
- .ft CW
- /*
- * The variable descriptor may be the descriptor of either a file
- * or of a socket.
- */
- cc = read(descriptor, buf, nbytes)
- int cc, descriptor; char *buf; int nbytes;
- /*
- * An array of iovec structures can describe several separate buffers.
- */
- cc = readv(descriptor, iov, iovcnt)
- int cc, descriptor; struct iovec *iov; int iovcnt;
- cc = write(descriptor, buf, nbytes)
- int cc, descriptor; char *buf; int nbytes;
- cc = writev(descriptor, iov, iovcnt)
- int cc, descriptor; struct iovec *iov; int iovcnt;
- /*
- * The variable ``sock'' must be the descriptor of a socket.
- * Flags may include MSG_OOB and MSG_PEEK.
- */
- cc = send(sock, msg, len, flags)
- int cc, sock; char *msg; int len, flags;
- cc = sendto(sock, msg, len, flags, to, tolen)
- int cc, sock; char *msg; int len, flags;
- struct sockaddr *to; int tolen;
- cc = sendmsg(sock, msg, flags)
- int cc, sock; struct msghdr msg[]; int flags;
- cc = recv(sock, buf, len, flags)
- int cc, sock; char *buf; int len, flags;
- cc = recvfrom(sock, buf, len, flags, from, fromlen)
- int cc, sock; char *buf; int len, flags;
- struct sockaddr *from; int *fromlen;
- cc = recvmsg(sock, msg, flags)
- int cc, sock; struct msghdr msg[]; int flags;
- .ft
- .sp 1
- .ce 1
- Figure 10\ \ Varieties of read and write commands
- .)z
- .b
- .sh 1 "Choices"
- .r
- .pp
- This paper has presented examples of some of the forms
- of communication supported by
- Berkeley UNIX 4.4BSD. These have been presented in an order chosen for
- ease of presentation. It is useful to review these options emphasizing the
- factors that make each attractive.
- .pp
- Pipes have the advantage of portability, in that they are supported in all
- UNIX systems. They also are relatively
- simple to use. Socketpairs share this simplicity and have the additional
- advantage of allowing bidirectional communication. The major shortcoming
- of these mechanisms is that they require communicating processes to be
- descendants of a common process. They do not allow intermachine communication.
- .pp
- The two communication domains, UNIX and Internet, allow processes with no common
- ancestor to communicate.
- Of the two, only the Internet domain allows
- communication between machines.
- This makes the Internet domain a necessary
- choice for processes running on separate machines.
- .pp
- The choice between datagrams and stream communication is best made by
- carefully considering the semantic and performance
- requirements of the application.
- Streams can be both advantageous and disadvantageous. One disadvantage
- is that a process is only allowed a limited number of open streams,
- as there are usually only 64 entries available in the open descriptor
- table. This can cause problems if a single server must talk with a large
- number of clients.
- Another is that, for delivering a short message, the stream setup and
- teardown time can be unnecessarily long. Weighed against these
- disadvantages is the reliability built into streams, which will often
- be the deciding factor in their favor.
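- .pp
- The size of the open descriptor table differs from system to system, so
- a server that holds one stream per client may want to ask for the limit
- rather than assume a figure. The following is a minimal sketch using
- \fIgetrlimit()\fP; the helper names and the reserve of four descriptors
- (standard input, output, error, and one listening socket) are
- assumptions made for the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Returns the current (soft) limit on open descriptors for this
 * process, LONG_MAX if there is no limit, or -1 on error.
 */
long
max_open_descriptors(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX)
		return LONG_MAX;
	return (long)rl.rlim_cur;
}

/*
 * Returns nonzero if one stream per client would fit under the limit,
 * keeping four descriptors in reserve (an assumption; see above).
 */
int
streams_fit(long nclients)
{
	long limit = max_open_descriptors();

	assert(nclients >= 0);
	return limit >= 0 && nclients <= limit - 4;
}
```

- .pp
- When the answer is no, datagrams, which need only a single socket for
- any number of correspondents, are one alternative to consider.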
- .b
- .sh 1 "What to do Next"
- .r
- .pp
- Many of the examples presented here can serve as models for multiprocess
- programs and for programs distributed across several machines.
- In developing a new multiprocess program, it is often easiest to
- first write the code to create the processes and communication paths.
- After this code is debugged, the code specific to the application can
- be added.
- .pp
- An introduction to the UNIX system and programming using UNIX system calls
- can be found in [Kernighan and Pike 1984].
- Further documentation of the Berkeley UNIX 4.4BSD IPC mechanisms can be
- found in [Leffler et al. 1986].
- More detailed information about particular calls and protocols
- is provided in sections
- 2, 3 and 4 of the
- UNIX Programmer's Manual [CSRG 1986].
- In particular the following manual pages are relevant:
- .(b
- .TS
- tab(@);
- l l.
- creating and naming sockets@socket(2), bind(2)
- establishing connections@listen(2), accept(2), connect(2)
- transferring data@read(2), write(2), send(2), recv(2)
- addresses@inet(4F)
- protocols@tcp(4P), udp(4P)
- .TE
- .)b
- .(b
- .sp
- .b
- Acknowledgements
- .pp
- I would like to thank Sam Leffler and Mike Karels for their help in
- understanding the IPC mechanisms and all the people whose comments
- have helped in writing and improving this report.
- .pp
- This work was sponsored by the Defense Advanced Research Projects Agency
- (DoD), ARPA Order No. 4031, monitored by the Naval Electronic Systems
- Command under contract No. N00039-C-0235.
- The views and conclusions contained in this document are those of the
- author and should not be interpreted as representing official policies,
- either expressed or implied, of the Defense Advanced Research Projects Agency
- or of the US Government.
- .)b
- .(b
- .sp
- .b
- References
- .r
- .sp
- .ls 1
- B.W. Kernighan & R. Pike, 1984,
- .i "The UNIX Programming Environment."
- Englewood Cliffs, N.J.: Prentice-Hall.
- .sp
- .ls 1
- B.W. Kernighan & D.M. Ritchie, 1978,
- .i "The C Programming Language."
- Englewood Cliffs, N.J.: Prentice-Hall.
- .sp
- .ls 1
- S.J. Leffler, R.S. Fabry, W.N. Joy, P. Lapsley, S. Miller & C. Torek, 1986,
- .i "An Advanced 4.4BSD Interprocess Communication Tutorial."
- Computer Systems Research Group,
- Department of Electrical Engineering and Computer Science,
- University of California, Berkeley.
- .sp
- .ls 1
- Computer Systems Research Group, 1986,
- .i "UNIX Programmer's Manual, 4.4 Berkeley Software Distribution."
- Computer Systems Research Group,
- Department of Electrical Engineering and Computer Science,
- University of California, Berkeley.
- .)b