13.2 Input and Output Primitives

This section describes the functions for performing primitive input and output operations on file descriptors: read, write, and lseek. These functions are declared in the header file unistd.h.

Data Type: ssize_t

This data type is used to represent the sizes of blocks that can be read or written in a single operation. It is similar to size_t, but must be a signed type.

Function: ssize_t read (int filedes, void *buffer, size_t size)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

The read function reads up to size bytes from the file with descriptor filedes, storing the results in the buffer. (This is not necessarily a character string, and no terminating null character is added.)

The return value is the number of bytes actually read. This might be less than size; for example, if there aren’t that many bytes left in the file or if there aren’t that many bytes immediately available. The exact behavior depends on what kind of file it is. Note that reading less than size bytes is not an error.

A value of zero indicates end-of-file (except if the value of the size argument is also zero). This is not considered an error. If you keep calling read while at end-of-file, it will keep returning zero and doing nothing else.

If read returns at least one character, there is no way you can tell whether end-of-file was reached. But if you did reach the end, the next read will return zero.

In case of an error, read returns -1. The following errno error conditions are defined for this function:

EAGAIN

Normally, when no input is immediately available, read waits for some input. But if the O_NONBLOCK flag is set for the file (see File Status Flags), read returns immediately without reading any data, and reports this error.

Compatibility Note: Most versions of BSD Unix use a different error code for this: EWOULDBLOCK. In the GNU C Library, EWOULDBLOCK is an alias for EAGAIN, so it doesn’t matter which name you use.

On some systems, reading a large amount of data from a character special file can also fail with EAGAIN if the kernel cannot find enough physical memory to lock down the user’s pages. This is limited to devices that transfer with direct memory access into the user’s memory, which means it does not include terminals, since they always use separate buffers inside the kernel. This problem never happens on GNU/Hurd systems.

Any condition that could result in EAGAIN can instead result in a successful read which returns fewer bytes than requested. Calling read again immediately would result in EAGAIN.

EBADF

The filedes argument is not a valid file descriptor, or is not open for reading.

EINTR

read was interrupted by a signal while it was waiting for input. See Primitives Interrupted by Signals. A signal will not necessarily cause read to return EINTR; it may instead result in a successful read which returns fewer bytes than requested.

EIO

For many devices, and for disk files, this error code indicates a hardware error.

EIO also occurs when a background process tries to read from the controlling terminal, and the normal action of stopping the process by sending it a SIGTTIN signal isn’t working. This might happen if the signal is being blocked or ignored, or because the process group is orphaned. See Job Control, for more information about job control, and Signal Handling, for information about signals.

EINVAL

In some systems, when reading from a character or block device, position and size offsets must be aligned to a particular block size. This error indicates that the offsets were not properly aligned.

Please note that there is no function named read64. This is not necessary since this function does not directly modify or handle the possibly wide file offset. Since the kernel handles this state internally, the read function can be used for all cases.

This function is a cancellation point in multi-threaded programs. This is a problem if the thread allocates some resources (like memory, file descriptors, semaphores or whatever) at the time read is called. If the thread gets canceled these resources stay allocated until the program ends. To avoid this, calls to read should be protected using cancellation handlers.

The read function is the underlying primitive for all of the functions that read from streams, such as fgetc.

Function: ssize_t pread (int filedes, void *buffer, size_t size, off_t offset)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

The pread function is similar to the read function. The first three arguments are identical, and the return values and error codes also correspond.

The difference is the fourth argument and its handling. The data block is not read from the current position of the file descriptor filedes. Instead the data is read from the file starting at position offset. The position of the file descriptor itself is not affected by the operation. The value is the same as before the call.

When the source file is compiled with _FILE_OFFSET_BITS == 64 the pread function is in fact pread64 and the type off_t has 64 bits, which makes it possible to handle files up to 2^63 bytes in length.

The return value of pread describes the number of bytes read. In the error case it returns -1 like read does and the error codes are also the same, with these additions:

EINVAL

The value given for offset is negative and therefore illegal.

ESPIPE

The file descriptor filedes is associated with a pipe or a FIFO and this device does not allow positioning of the file pointer.

The function is an extension defined in the Unix Single Specification version 2.

Function: ssize_t pread64 (int filedes, void *buffer, size_t size, off64_t offset)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This function is similar to the pread function. The difference is that the offset parameter is of type off64_t instead of off_t which makes it possible on 32 bit machines to address files larger than 2^31 bytes and up to 2^63 bytes. The file descriptor filedes must be opened using open64 since otherwise the large offsets possible with off64_t will lead to errors with a descriptor in small file mode.

When the source file is compiled with _FILE_OFFSET_BITS == 64 on a 32 bit machine this function is actually available under the name pread and so transparently replaces the 32 bit interface.

Function: ssize_t write (int filedes, const void *buffer, size_t size)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

The write function writes up to size bytes from buffer to the file with descriptor filedes. The data in buffer is not necessarily a character string and a null character is output like any other character.

The return value is the number of bytes actually written. This may be size, but can always be smaller. Your program should always call write in a loop, iterating until all the data is written.

Once write returns, the data is enqueued to be written and can be read back right away, but it is not necessarily written out to permanent storage immediately. You can use fsync when you need to be sure your data has been permanently stored before continuing. (It is more efficient for the system to batch up consecutive writes and do them all at once when convenient. Normally they will always be written to disk within a minute or less.) Modern systems provide another function fdatasync which guarantees integrity only for the file data and is therefore faster. You can use the O_FSYNC open mode to make write always store the data to disk before returning; see I/O Operating Modes.

In the case of an error, write returns -1. The following errno error conditions are defined for this function:

EAGAIN

Normally, write blocks until the write operation is complete. But if the O_NONBLOCK flag is set for the file (see Control Operations on Files), it returns immediately without writing any data and reports this error. An example of a situation that might cause the process to block on output is writing to a terminal device that supports flow control, where output has been suspended by receipt of a STOP character.

Compatibility Note: Most versions of BSD Unix use a different error code for this: EWOULDBLOCK. In the GNU C Library, EWOULDBLOCK is an alias for EAGAIN, so it doesn’t matter which name you use.

On some systems, writing a large amount of data from a character special file can also fail with EAGAIN if the kernel cannot find enough physical memory to lock down the user’s pages. This is limited to devices that transfer with direct memory access into the user’s memory, which means it does not include terminals, since they always use separate buffers inside the kernel. This problem does not arise on GNU/Hurd systems.

EBADF

The filedes argument is not a valid file descriptor, or is not open for writing.

EFBIG

The size of the file would become larger than the implementation can support.

EINTR

The write operation was interrupted by a signal while it was blocked waiting for completion. A signal will not necessarily cause write to return EINTR; it may instead result in a successful write which writes fewer bytes than requested. See Primitives Interrupted by Signals.

EIO

For many devices, and for disk files, this error code indicates a hardware error.

ENOSPC

The device containing the file is full.

EPIPE

This error is returned when you try to write to a pipe or FIFO that isn’t open for reading by any process. When this happens, a SIGPIPE signal is also sent to the process; see Signal Handling.

EINVAL

In some systems, when writing to a character or block device, position and size offsets must be aligned to a particular block size. This error indicates that the offsets were not properly aligned.

Unless you have arranged to prevent EINTR failures, you should check errno after each failing call to write, and if the error was EINTR, you should simply repeat the call. See Primitives Interrupted by Signals. The easy way to do this is with the macro TEMP_FAILURE_RETRY, as follows:

nbytes = TEMP_FAILURE_RETRY (write (desc, buffer, count));

Please note that there is no function named write64. This is not necessary since this function does not directly modify or handle the possibly wide file offset. Since the kernel handles this state internally the write function can be used for all cases.

This function is a cancellation point in multi-threaded programs. This is a problem if the thread allocates some resources (like memory, file descriptors, semaphores or whatever) at the time write is called. If the thread gets canceled these resources stay allocated until the program ends. To avoid this, calls to write should be protected using cancellation handlers.

The write function is the underlying primitive for all of the functions that write to streams, such as fputc.

Function: ssize_t pwrite (int filedes, const void *buffer, size_t size, off_t offset)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

The pwrite function is similar to the write function. The first three arguments are identical, and the return values and error codes also correspond.

The difference is the fourth argument and its handling. The data block is not written to the current position of the file descriptor filedes. Instead the data is written to the file starting at position offset. The position of the file descriptor itself is not affected by the operation. The value is the same as before the call.

However, on Linux, if a file is opened with O_APPEND, pwrite appends data to the end of the file, regardless of the value of offset.

When the source file is compiled with _FILE_OFFSET_BITS == 64 the pwrite function is in fact pwrite64 and the type off_t has 64 bits, which makes it possible to handle files up to 2^63 bytes in length.

The return value of pwrite describes the number of written bytes. In the error case it returns -1 like write does and the error codes are also the same, with these additions:

EINVAL

The value given for offset is negative and therefore illegal.

ESPIPE

The file descriptor filedes is associated with a pipe or a FIFO and this device does not allow positioning of the file pointer.

The function is an extension defined in the Unix Single Specification version 2.

Function: ssize_t pwrite64 (int filedes, const void *buffer, size_t size, off64_t offset)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This function is similar to the pwrite function. The difference is that the offset parameter is of type off64_t instead of off_t which makes it possible on 32 bit machines to address files larger than 2^31 bytes and up to 2^63 bytes. The file descriptor filedes must be opened using open64 since otherwise the large offsets possible with off64_t will lead to errors with a descriptor in small file mode.

When the source file is compiled using _FILE_OFFSET_BITS == 64 on a 32 bit machine this function is actually available under the name pwrite and so transparently replaces the 32 bit interface.