10.2.1 Calling glob

The result of globbing is a vector of file names (strings). To return this vector, glob uses a special data type, glob_t, which is a structure. You pass glob the address of the structure, and it fills in the structure’s fields to tell you about the results.

Data Type: glob_t

This data type holds a pointer to a word vector. More precisely, it records both the address of the word vector and its size. The GNU implementation contains some more fields which are non-standard extensions.

gl_pathc

The number of elements in the vector, excluding the initial null entries if the GLOB_DOOFFS flag is used (see gl_offs below).

gl_pathv

The address of the vector. This field has type char **.

gl_offs

The offset of the first real element of the vector, from its nominal address in the gl_pathv field. Unlike the other fields, this is always an input to glob, rather than an output from it.

If you use a nonzero offset, then that many elements at the beginning of the vector are left empty. (The glob function fills them with null pointers.)

The gl_offs field is meaningful only if you use the GLOB_DOOFFS flag. Otherwise, the offset is always zero regardless of what is in this field, and the first real element comes at the beginning of the vector.

gl_closedir

The address of an alternative implementation of the closedir function. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is void (*) (void *).

This is a GNU extension.

gl_readdir

The address of an alternative implementation of the readdir function used to read the contents of a directory. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is struct dirent *(*) (void *).

An implementation of gl_readdir needs to initialize the following members of the struct dirent object:

d_type

This member should be set to the file type of the entry if it is known. Otherwise, the value DT_UNKNOWN can be used. The glob function may use the specified file type to avoid callbacks in cases where the file type indicates that the data is not required.

d_ino

This member needs to be non-zero, otherwise glob may skip the current entry and call the gl_readdir callback function again to retrieve another entry.

d_name

This member must be set to the name of the entry. It must be null-terminated.

The example below shows how to allocate a struct dirent object containing a given name.


#include <dirent.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct dirent *
mkdirent (const char *name)
{
  size_t dirent_size = offsetof (struct dirent, d_name) + 1;
  size_t name_length = strlen (name);
  size_t total_size = dirent_size + name_length;
  if (total_size < dirent_size)
    {
      errno = ENOMEM;
      return NULL;
    }
  struct dirent *result = malloc (total_size);
  if (result == NULL)
    return NULL;
  result->d_type = DT_UNKNOWN;
  result->d_ino = 1;            /* Do not skip this entry. */
  memcpy (result->d_name, name, name_length + 1);
  return result;
}

The glob function reads the struct dirent members listed above and makes a copy of the file name in the d_name member immediately after the gl_readdir callback function returns. Future invocations of any of the callback functions may deallocate or reuse the buffer. It is the responsibility of the caller of the glob function to allocate and deallocate the buffer, around the call to glob or using the callback functions. For example, an application could allocate the buffer in the gl_readdir callback function, and deallocate it in the gl_closedir callback function.

The gl_readdir member is a GNU extension.

gl_opendir

The address of an alternative implementation of the opendir function. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is void *(*) (const char *).

This is a GNU extension.

gl_stat

The address of an alternative implementation of the stat function to get information about an object in the filesystem. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is int (*) (const char *, struct stat *).

This is a GNU extension.

gl_lstat

The address of an alternative implementation of the lstat function to get information about an object in the filesystems, not following symbolic links. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is int (*) (const char *, struct stat *).

This is a GNU extension.

gl_flags

The flags used when glob was called. In addition, GLOB_MAGCHAR might be set. See Flags for Globbing for more details.

This is a GNU extension.

For use in the glob64 function glob.h contains another definition for a very similar type. glob64_t differs from glob_t only in the types of the members gl_readdir, gl_stat, and gl_lstat.

Data Type: glob64_t

This data type holds a pointer to a word vector. More precisely, it records both the address of the word vector and its size. The GNU implementation contains some more fields which are non-standard extensions.

gl_pathc

The number of elements in the vector, excluding the initial null entries if the GLOB_DOOFFS flag is used (see gl_offs below).

gl_pathv

The address of the vector. This field has type char **.

gl_offs

The offset of the first real element of the vector, from its nominal address in the gl_pathv field. Unlike the other fields, this is always an input to glob, rather than an output from it.

If you use a nonzero offset, then that many elements at the beginning of the vector are left empty. (The glob function fills them with null pointers.)

The gl_offs field is meaningful only if you use the GLOB_DOOFFS flag. Otherwise, the offset is always zero regardless of what is in this field, and the first real element comes at the beginning of the vector.

gl_closedir

The address of an alternative implementation of the closedir function. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is void (*) (void *).

This is a GNU extension.

gl_readdir

The address of an alternative implementation of the readdir64 function used to read the contents of a directory. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is struct dirent64 *(*) (void *).

This is a GNU extension.

gl_opendir

The address of an alternative implementation of the opendir function. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is void *(*) (const char *).

This is a GNU extension.

gl_stat

The address of an alternative implementation of the stat64 function to get information about an object in the filesystem. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is int (*) (const char *, struct stat64 *).

This is a GNU extension.

gl_lstat

The address of an alternative implementation of the lstat64 function to get information about an object in the filesystems, not following symbolic links. It is used if the GLOB_ALTDIRFUNC bit is set in the flag parameter. The type of this field is int (*) (const char *, struct stat64 *).

This is a GNU extension.

gl_flags

The flags used when glob was called. In addition, GLOB_MAGCHAR might be set. See Flags for Globbing for more details.

This is a GNU extension.

Function: int glob (const char *pattern, int flags, int (*errfunc) (const char *filename, int error-code), glob_t *vector-ptr)

Preliminary: | MT-Unsafe race:utent env sig:ALRM timer locale | AS-Unsafe dlopen plugin corrupt heap lock | AC-Unsafe corrupt lock fd mem | See POSIX Safety Concepts.

The function glob does globbing using the pattern pattern in the current directory. It puts the result in a newly allocated vector, and stores the size and address of this vector into *vector-ptr. The argument flags is a combination of bit flags; see Flags for Globbing, for details of the flags.

The result of globbing is a sequence of file names. The function glob allocates a string for each resulting word, then allocates a vector of type char ** to store the addresses of these strings. The last element of the vector is a null pointer. This vector is called the word vector.

To return this vector, glob stores both its address and its length (number of elements, not counting the terminating null pointer) into *vector-ptr.

Normally, glob sorts the file names alphabetically before returning them. You can turn this off with the flag GLOB_NOSORT if you want to get the information as fast as possible. Usually it’s a good idea to let glob sort them—if you process the files in alphabetical order, the users will have a feel for the rate of progress that your application is making.

If glob succeeds, it returns 0. Otherwise, it returns one of these error codes:

GLOB_ABORTED

There was an error opening a directory, and you used the flag GLOB_ERR or your specified errfunc returned a nonzero value. for an explanation of the GLOB_ERR flag and errfunc.

GLOB_NOMATCH

The pattern didn’t match any existing files. If you use the GLOB_NOCHECK flag, then you never get this error code, because that flag tells glob to pretend that the pattern matched at least one file.

GLOB_NOSPACE

It was impossible to allocate memory to hold the result.

In the event of an error, glob stores information in *vector-ptr about all the matches it has found so far.

It is important to notice that the glob function will not fail if it encounters directories or files which cannot be handled without the LFS interfaces. The implementation of glob is supposed to use these functions internally. This at least is the assumption made by the Unix standard. The GNU extension of allowing the user to provide their own directory handling and stat functions complicates things a bit. If these callback functions are used and a large file or directory is encountered glob can fail.

Function: int glob64 (const char *pattern, int flags, int (*errfunc) (const char *filename, int error-code), glob64_t *vector-ptr)

Preliminary: | MT-Unsafe race:utent env sig:ALRM timer locale | AS-Unsafe dlopen corrupt heap lock | AC-Unsafe corrupt lock fd mem | See POSIX Safety Concepts.

The glob64 function was added as part of the Large File Summit extensions but is not part of the original LFS proposal. The reason for this is simple: it is not necessary. The necessity for a glob64 function is added by the extensions of the GNU glob implementation which allows the user to provide their own directory handling and stat functions. The readdir and stat functions do depend on the choice of _FILE_OFFSET_BITS since the definition of the types struct dirent and struct stat will change depending on the choice.

Besides this difference, glob64 works just like glob in all aspects.

This function is a GNU extension.