There are a few anti-patterns to consider when accessing the file system.
This article assumes knowledge of the standard GFile, GInputStream and GOutputStream APIs.
Use asynchronous I/O for file access.
Always use appropriate functions to construct file names and paths.
Validate file paths are in the expected directories before using them.
Use mandatory access control profiles to enforce constraints on file access.
Almost all I/O should be performed asynchronously, that is, without blocking the GLib main context. This can be achieved by always using the *_async() and *_finish() variants of each I/O function. For example, use g_input_stream_read_async() rather than g_input_stream_read().
Synchronous I/O blocks the main loop, which means that other events, such as user input, incoming network packets, timeouts and idle callbacks, are not handled until the blocking function returns.
Synchronous I/O is acceptable in certain circumstances where the overheads of scheduling an asynchronous operation exceed the costs of local synchronous I/O on Linux. For example, when making a small read from a local file or from a virtual file system, g_open(), read() and g_close() should be used rather than GIO.
Files in the user’s home directory do not count as local, as they could be on a networked file system.
Note that the alternative – running synchronous I/O in a separate thread – is highly discouraged; see the threading guidelines for more information.
File names and paths are not normal strings: on some systems, they can use a character encoding other than UTF-8, while normal strings in GLib are guaranteed to always use UTF-8. For this reason, special functions should be used to build and handle file names and paths. (Modern Linux systems almost universally use UTF-8 for filename encoding, so this is not an issue in practice, but the file path functions should still be used for compatibility with systems such as Windows, which use UTF-16 filenames.)
For example, file paths should be built using g_build_filename() rather than g_strconcat().
Doing so makes it clearer what the code is meant to do, and also eliminates duplicate directory separators, so the returned path is canonical (though not necessarily absolute).
As another example, paths should be disassembled using g_path_get_basename() and g_path_get_dirname() rather than g_strrstr() and other manual searching functions.
If a filename or path comes from external input, such as a web page or user input, it should be validated to ensure that putting it into a file path will not produce an arbitrary path. For example, if a filename is constructed from a constant string plus some user input, input containing .. components can make the resulting path point outside the intended directory.
This can be avoided by validating constructed paths before using them: use g_file_resolve_relative_path() to convert any relative paths to absolute ones, then check that the path is beneath a given root sandboxing directory appropriate for the operation. For example, if code downloads a file, it could validate that all paths are beneath the intended download directory using g_file_has_parent().
As a second line of defence, all projects which access the file system should consider providing a mandatory access control profile, using a system such as AppArmor or SELinux, which limits the directories and files they can read from and write to.