.\" Copyright (C) Caldera International Inc. 2001-2002.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions are
.\" met:
.\"
.\" Redistributions of source code and documentation must retain the above
.\" copyright notice, this list of conditions and the following
.\" disclaimer.
.\"
.\" Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" All advertising materials mentioning features or use of this software
.\" must display the following acknowledgement:
.\"
.\" This product includes software developed or owned by Caldera
.\" International, Inc.  Neither the name of Caldera International, Inc.
.\" nor the names of other contributors may be used to endorse or promote
.\" products derived from this software without specific prior written
.\" permission.
.\"
.\" USE OF THE SOFTWARE PROVIDED FOR UNDER THIS LICENSE BY CALDERA
.\" INTERNATIONAL, INC.  AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
.\" DISCLAIMED.  IN NO EVENT SHALL CALDERA INTERNATIONAL, INC. BE LIABLE
.\" FOR ANY DIRECT, INDIRECT INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
.\" BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
.\" WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
.\" OR OTHERWISE) RISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
.\" IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.\"	@(#)p4	8.1 (Berkeley) 6/8/93
.\"
.NH
LOW-LEVEL I/O
.PP
This section describes the
bottom level of I/O on the
.UC UNIX
system.
The lowest level of I/O in
.UC UNIX
provides no buffering or any other services;
it is in fact a direct entry into the operating system.
You are entirely on your own,
but on the other hand,
you have the most control over what happens.
And since the calls and usage are quite simple,
this isn't as bad as it sounds.
.NH 2
File Descriptors
.PP
In the
.UC UNIX
operating system,
all input and output is done
by reading or writing files,
because all peripheral devices, even the user's terminal,
are files in the file system.
This means that a single, homogeneous interface
handles all communication between a program and peripheral devices.
.PP
In the most general case,
before reading or writing a file,
it is necessary to inform the system
of your intent to do so,
a process called
``opening'' the file.
If you are going to write on a file,
it may also be necessary to create it.
The system checks your right to do so
(Does the file exist?
Do you have permission to access it?),
and if all is well,
returns a small positive integer
called a
.ul
file descriptor.
Whenever I/O is to be done on the file,
the file descriptor is used instead of the name to identify the file.
(This is roughly analogous to the use of
.UC READ(5,...)
and
.UC WRITE(6,...)
in Fortran.)
All
information about an open file is maintained by the system;
the user program refers to the file
only
by the file descriptor.
.PP
The file pointers discussed in section 3
are similar in spirit to file descriptors,
but file descriptors are more fundamental.
A file pointer is a pointer to a structure that contains,
among other things, the file descriptor for the file in question.
.PP
Since input and output involving the user's terminal
are so common,
special arrangements exist to make this convenient.
When the command interpreter (the
``shell'')
runs a program,
it opens
three files, with file descriptors 0, 1, and 2,
called the standard input,
the standard output, and the standard error output.
All of these are normally connected to the terminal,
so if a program reads file descriptor 0
and writes file descriptors 1 and 2,
it can do terminal I/O
without worrying about opening the files.
.PP
If I/O is redirected
to and from files with
.UL <
and
.UL > ,
as in
.P1
prog <infile >outfile
.P2
the shell changes the default assignments for file descriptors
0 and 1
from the terminal to the named files.
Similar observations hold if the input or output is associated with a pipe.
Normally file descriptor 2 remains attached to the terminal,
so error messages can go there.
In all cases,
the file assignments are changed by the shell,
not by the program.
The program does not need to know where its input
comes from nor where its output goes,
so long as it uses file 0 for input and 1 and 2 for output.
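.PP
As a minimal sketch of this convention,
written in present-day C rather than the dialect used elsewhere in this paper,
the fragment below sends ordinary output to file descriptor 1
and a diagnostic to file descriptor 2.
The helper names
.UL emit
and
.UL demo_streams
are made up for illustration.

```c
#include <string.h>
#include <unistd.h>

/* Write the string s to descriptor fd; returns what write() returns. */
long emit(int fd, const char *s)
{
	return (long)write(fd, s, strlen(s));
}

/* Send normal output to descriptor 1 and a diagnostic to descriptor 2. */
void demo_streams(void)
{
	emit(1, "result: 42\n");
	emit(2, "warning: something looks odd\n");
}
```

If a program calling
.UL demo_streams
is run with its output redirected by
.UL > ,
only the first line should land in the named file;
the warning still goes wherever file descriptor 2 points, normally the terminal.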
.NH 2
Read and Write
.PP
All input and output is done by
two functions called
.UL read
and
.UL write .
For both, the first argument is a file descriptor.
The second argument is a buffer in your program where the data is to
come from or go to.
The third argument is the number of bytes to be transferred.
The calls are
.P1
n_read = read(fd, buf, n);

n_written = write(fd, buf, n);
.P2
Each call returns a byte count
which is the number of bytes actually transferred.
On reading,
the number of bytes returned may be less than
the number asked for,
because fewer than
.UL n
bytes remained to be read.
(When the file is a terminal,
.UL read
normally reads only up to the next newline,
which is generally less than what was requested.)
A return value of zero bytes implies end of file,
and
.UL -1
indicates an error of some sort.
For writing, the returned value is the number of bytes
actually written;
it is generally an error if this isn't equal
to the number supposed to be written.
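.PP
On present-day systems a short count from
.UL write
is not always fatal:
a write to a pipe or a terminal may transfer only part of the buffer,
or be interrupted by a signal.
The sketch below, which assumes a POSIX environment, shows one common way to cope,
by looping until every byte has been accepted.
The name
.UL write_all
is hypothetical and not part of any library.

```c
#include <errno.h>
#include <unistd.h>

/* Keep calling write() until all n bytes are out, or a real error occurs.
   Returns 0 on success and -1 on failure. */
int write_all(int fd, const char *buf, size_t n)
{
	while (n > 0) {
		ssize_t w = write(fd, buf, n);

		if (w < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal: retry */
			return -1;
		}
		buf += w;		/* skip past the part already written */
		n -= (size_t)w;
	}
	return 0;
}
```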
.PP
The number of bytes to be read or written is quite arbitrary.
The two most common values are
1,
which means one character at a time
(``unbuffered''),
and
512,
which corresponds to a physical blocksize on many peripheral devices.
This latter size will be most efficient,
but even character at a time I/O
is not inordinately expensive.
.PP
Putting these facts together,
we can write a simple program to copy
its input to its output.
This program will copy anything to anything,
since the input and output can be redirected to any file or device.
.P1
#define	BUFSIZE	512	/* best size for PDP-11 UNIX */

main()	/* copy input to output */
{
	char	buf[BUFSIZE];
	int	n;

	while ((n = read(0, buf, BUFSIZE)) > 0)
		write(1, buf, n);
	exit(0);
}
.P2
If the file size is not a multiple of
.UL BUFSIZE ,
some
.UL read
will return a smaller number of bytes
to be written by
.UL write ;
the next call to
.UL read
after that
will return zero.
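.PP
The same loop can be packaged as a function that works on any pair of descriptors
and reports how much it copied.
This is only a sketch in present-day C, assuming a POSIX system;
the name
.UL copy_fd
is invented for the example.
Unlike the program above, it also checks the value returned by
.UL write .

```c
#include <unistd.h>

#define BUFSIZE 512	/* same block size as the program above */

/* Copy everything from descriptor in to descriptor out.
   Returns the number of bytes copied, or -1 on a read or write failure. */
long copy_fd(int in, int out)
{
	char buf[BUFSIZE];
	long total = 0;
	ssize_t n;

	while ((n = read(in, buf, sizeof buf)) > 0) {
		if (write(out, buf, (size_t)n) != n)
			return -1;
		total += n;
	}
	return n < 0 ? -1 : total;	/* n is 0 at end of file, -1 on error */
}
```

With 600 bytes of input, the first
.UL read
would be expected to return 512, the second 88, and the third zero.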
.PP
It is instructive to see how
.UL read
and
.UL write
can be used to construct
higher level routines like
.UL getchar ,
.UL putchar ,
etc.
For example,
here is a version of
.UL getchar
which does unbuffered input.
.P1
#define	CMASK	0377	/* for making char's > 0 */

getchar()	/* unbuffered single character input */
{
	char c;

	return((read(0, &c, 1) > 0) ? c & CMASK : EOF);
}
.P2
.UL c
.ul
must
be declared
.UL char ,
because
.UL read
accepts a character pointer.
The character being returned must be masked with
.UL 0377
to ensure that it is positive;
otherwise sign extension may make it negative.
(The constant
.UL 0377
is appropriate for the
.UC PDP -11
but not necessarily for other machines.)
.PP
The second version of
.UL getchar
does input in big chunks,
and hands out the characters one at a time.
.P1
#define	CMASK	0377	/* for making char's > 0 */
#define	BUFSIZE	512

getchar()	/* buffered version */
{
	static char	buf[BUFSIZE];
	static char	*bufp = buf;
	static int	n = 0;

	if (n == 0) {	/* buffer is empty */
		n = read(0, buf, BUFSIZE);
		bufp = buf;
	}
	return((--n >= 0) ? *bufp++ & CMASK : EOF);
}
.P2
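.PP
For comparison, here is a sketch of the same buffering idea in present-day C,
assuming a POSIX system.
It keeps its state in a structure instead of static variables,
so that more than one descriptor can be read at once,
and it stores bytes as
.UL unsigned\ char
so that no mask is needed.
The names
.UL struct\ reader
and
.UL reader_getc
are hypothetical.

```c
#include <unistd.h>

#define BUFSIZE 512

struct reader {
	int fd;				/* descriptor to read from */
	ssize_t n;			/* bytes left in buf */
	unsigned char *p;		/* next byte to hand out */
	unsigned char buf[BUFSIZE];
};

/* Return the next byte as a value 0..255, or -1 at end of file or on error. */
int reader_getc(struct reader *r)
{
	if (r->n <= 0) {		/* buffer is empty: refill it */
		r->n = read(r->fd, r->buf, sizeof r->buf);
		r->p = r->buf;
		if (r->n <= 0)
			return -1;
	}
	r->n--;
	return *r->p++;
}
```

A reader is set up by zeroing the structure and filling in
.UL fd ,
for instance with
.UL "struct reader r = { .fd = 0 };"
for the standard input.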
.NH 2
Open, Creat, Close, Unlink
.PP
Other than the default
standard input, output and error files,
you must explicitly open files in order to
read or write them.
There are two system entry points for this,
.UL open
and
.UL creat
[sic].
.PP
.UL open
is rather like the
.UL fopen
discussed in the previous section,
except that instead of returning a file pointer,
it returns a file descriptor,
which is just an
.UL int .
.P1
int fd;

fd = open(name, rwmode);
.P2
As with
.UL fopen ,
the
.UL name
argument
is a character string corresponding to the external file name.
The access mode argument
is different, however:
.UL rwmode
is 0 for read, 1 for write, and 2 for read and write access.
.UL open
returns
.UL -1
if any error occurs;
otherwise it returns a valid file descriptor.
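.PP
On present-day POSIX systems the access mode is usually spelled with the symbolic constants
.UL O_RDONLY ,
.UL O_WRONLY
and
.UL O_RDWR
from
.UL <fcntl.h> ,
which traditionally have the values 0, 1 and 2.
The following sketch uses the first of them to test whether a file can be opened for reading;
the function name
.UL can_read
is made up for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open name for reading and report whether that worked.
   Returns 1 if the file could be opened, 0 otherwise. */
int can_read(const char *name)
{
	int fd = open(name, O_RDONLY);	/* O_RDONLY plays the role of rwmode 0 */

	if (fd == -1)
		return 0;
	close(fd);
	return 1;
}
```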
.PP
It is an error to
try to
.UL open
a file that does not exist.
The entry point
.UL creat
is provided to create new files,
or to re-write old ones.
.P1
fd = creat(name, pmode);
.P2
returns a file descriptor
if it was able to create the file
called
.UL name ,
and
.UL -1
if not.
If the file
already exists,
.UL creat
will truncate it to zero length;
it is not an error to
.UL creat
a file that already exists.
.PP
If the file is brand new,
.UL creat
creates it with the
.ul
protection mode
specified by
the
.UL pmode
argument.
In the
.UC UNIX
file system,
there are nine bits of protection information
associated with a file,
controlling read, write and execute permission for
the owner of the file,
for the owner's group,
and for all others.
Thus a three-digit octal number
is most convenient for specifying the permissions.
For example,
0755
specifies read, write and execute permission for the owner,
and read and execute permission for the group and everyone else.
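.PP
To make the layout of the nine bits concrete,
the sketch below turns a protection mode into the familiar
.UL rwxrwxrwx
notation, one octal digit per group of three letters.
The function name
.UL mode_string
is invented for this illustration.

```c
/* Render the low nine permission bits of mode as "rwxrwxrwx" style text.
   out must have room for 10 characters, including the terminating NUL. */
void mode_string(unsigned mode, char out[10])
{
	const char *letters = "rwxrwxrwx";
	int i;

	for (i = 0; i < 9; i++)		/* bit 8 is owner-read, bit 0 is other-execute */
		out[i] = (mode & (1u << (8 - i))) ? letters[i] : '-';
	out[9] = '\0';
}
```

Given 0755 this produces
.UL rwxr-xr-x ,
and given 0644 it produces
.UL rw-r--r-- .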
.PP
To illustrate,
here is a simplified version of
the
.UC UNIX
utility
.IT cp ,
a program which copies one file to another.
(The main simplification is that our version
copies only one file,
and does not permit the second argument
to be a directory.)
.P1
#define NULL 0
#define BUFSIZE 512
#define PMODE 0644 /* RW for owner, R for group, others */

main(argc, argv)	/* cp: copy f1 to f2 */
int argc;
char *argv[];
{
	int	f1, f2, n;
	char	buf[BUFSIZE];

	if (argc != 3)
		error("Usage: cp from to", NULL);
	if ((f1 = open(argv[1], 0)) == -1)
		error("cp: can't open %s", argv[1]);
	if ((f2 = creat(argv[2], PMODE)) == -1)
		error("cp: can't create %s", argv[2]);

	while ((n = read(f1, buf, BUFSIZE)) > 0)
		if (write(f2, buf, n) != n)
			error("cp: write error", NULL);
	exit(0);
}
.P2
.P1
error(s1, s2)	/* print error message and die */
char *s1, *s2;
{
	printf(s1, s2);
	printf("\en");
	exit(1);
}
.P2
.PP
As we said earlier,
there is a limit (typically 15-25)
on the number of files which a program
may have open simultaneously.
Accordingly, any program which intends to process
many files must be prepared to re-use
file descriptors.
The routine
.UL close
breaks the connection between a file descriptor
and an open file,
and frees the
file descriptor for use with some other file.
Termination of a program
via
.UL exit
or return from the main program closes all open files.
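.PP
The re-use of descriptors can be observed directly.
POSIX specifies that
.UL open
returns the lowest-numbered descriptor not currently open,
so in a single-threaded program closing a file and opening another
should hand back the same number.
The sketch below checks this;
the function name
.UL descriptor_is_reused
is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open path, close it, and open it again.
   Returns 1 if the second open got the same descriptor number back,
   0 if it got a different one, and -1 if either open failed. */
int descriptor_is_reused(const char *path)
{
	int first, second;

	if ((first = open(path, O_RDONLY)) == -1)
		return -1;
	close(first);			/* frees the descriptor number */
	if ((second = open(path, O_RDONLY)) == -1)
		return -1;
	close(second);
	return first == second;
}
```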
.PP
The function
.UL unlink(filename)
removes the file
.UL filename
from the file system.
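.PP
A small sketch, assuming a POSIX system and a directory in which files may be created,
shows
.UL creat
and
.UL unlink
working as a pair:
the file is created, removed,
and then an attempt to open it is expected to fail.
The function name
.UL create_then_unlink
is made up for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Create an empty file called name, remove it again, and check that it is gone.
   Returns 1 if the file can no longer be opened, 0 if it still can,
   and -1 if it could not be created or removed in the first place. */
int create_then_unlink(const char *name)
{
	int fd = creat(name, 0644);

	if (fd == -1)
		return -1;
	close(fd);
	if (unlink(name) == -1)
		return -1;
	fd = open(name, O_RDONLY);	/* should now fail */
	if (fd != -1) {
		close(fd);
		return 0;
	}
	return 1;
}
```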
.NH 2
Random Access \(em Seek and Lseek
.PP
File I/O is normally sequential:
each
.UL read
or
.UL write
takes place at a position in the file
right after the previous one.
When necessary, however,
a file can be read or written in any arbitrary order.
The
system call
.UL lseek
provides a way to move around in
a file without actually reading
or writing:
.P1
lseek(fd, offset, origin);
.P2
forces the current position in the file
whose descriptor is
.UL fd
to move to position
.UL offset ,
which is taken relative to the location
specified by
.UL origin .
Subsequent reading or writing will begin at that position.
.UL offset
is
a
.UL long ;
.UL fd
and
.UL origin
are
.UL int 's.
.UL origin
can be 0, 1, or 2 to specify that
.UL offset
is to be
measured from
the beginning, from the current position, or from the
end of the file respectively.
For example,
to append to a file,
seek to the end before writing:
.P1
lseek(fd, 0L, 2);
.P2
To get back to the beginning (``rewind''),
.P1
lseek(fd, 0L, 0);
.P2
Notice the
.UL 0L
argument;
it could also be written as
.UL (long)\ 0 .
.PP
With
.UL lseek ,
it is possible to treat files more or less like large arrays,
at the price of slower access.
For example, the following simple function reads any number of bytes
from any arbitrary place in a file.
.P1
get(fd, pos, buf, n) /* read n bytes from position pos */
int fd, n;
long pos;
char *buf;
{
	lseek(fd, pos, 0);	/* get to pos */
	return(read(fd, buf, n));
}
.P2
.PP
In pre-version 7
.UC UNIX ,
the basic entry point to the I/O system
is called
.UL seek .
.UL seek
is identical to
.UL lseek ,
except that its
.UL offset
argument is an
.UL int
rather than a
.UL long .
Accordingly,
since
.UC PDP -11
integers have only 16 bits,
the
.UL offset
specified
for
.UL seek
is limited to 65,535;
for this reason,
.UL origin
values of 3, 4, 5 cause
.UL seek
to multiply the given offset by 512
(the number of bytes in one physical block)
and then interpret
.UL origin
as if it were 0, 1, or 2 respectively.
Thus to get to an arbitrary place in a large file
requires two seeks, first one which selects
the block, then one which
has
.UL origin
equal to 1 and moves to the desired byte within the block.
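.PP
The arithmetic behind the two seeks is a division by the block size.
Since
.UL seek
is not available on present-day systems,
the sketch below shows only the computation of the two offsets;
the name
.UL split_position
is invented for the example.

```c
#define BLOCK 512L	/* bytes per block, as in the text */

/* Split an absolute byte position into the two offsets the text describes:
   a block number for the first seek (origin 3)
   and a byte count within the block for the second (origin 1). */
void split_position(long pos, int *block, int *within)
{
	*block = (int)(pos / BLOCK);
	*within = (int)(pos % BLOCK);
}
```

For byte 100000, for instance, this gives block 195 and byte 160 within it,
since 195 times 512 is 99840;
the two calls would then be
.UL "seek(fd, 195, 3)"
followed by
.UL "seek(fd, 160, 1)" .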
.NH 2
Error Processing
.PP
The routines discussed in this section,
and in fact all the routines which are direct entries into the system
can incur errors.
Usually they indicate an error by returning a value of \-1.
Sometimes it is nice to know what sort of error occurred;
for this purpose all these routines, when appropriate,
leave an error number in the external cell
.UL errno .
The meanings of the various error numbers are
listed
in the introduction to Section II
of the
.I
.UC UNIX
Programmer's Manual,
.R
so your program can, for example, determine if
an attempt to open a file failed because it did not exist
or because the user lacked permission to read it.
Perhaps more commonly,
you may want to print out the
reason for failure.
The routine
.UL perror
will print a message associated with the value
of
.UL errno ;
more generally,
.UL sys\_errlist
is an array of character strings which can be indexed
by
.UL errno
and printed by your program.
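.PP
As a sketch of how a program might tell these failures apart on a present-day POSIX system,
the function below returns the error number left behind by a failed
.UL open .
The name
.UL why_open_fails
is hypothetical;
the symbolic error names such as
.UL ENOENT
(no such file) and
.UL EACCES
(permission denied) come from
.UL <errno.h> .

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to open name for reading.
   Returns 0 on success, or the errno value explaining the failure. */
int why_open_fails(const char *name)
{
	int fd = open(name, O_RDONLY);

	if (fd == -1)
		return errno;	/* save it before any other call can change it */
	close(fd);
	return 0;
}
```

The value returned can be compared against
.UL ENOENT
or
.UL EACCES ,
or turned into a message with the standard C function
.UL strerror .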