Having moved into a new apartment, I'm now back to working on my thesis about Thunar. A few things are solved very differently in GIO than in ThunarVFS. One of them is the way jobs are handled. A job is basically a task which may take a while to run and is therefore executed in a separate thread (so it doesn't block the GUI).

ThunarVFS has a framework called ThunarVfsJob. It lets you create different kinds of jobs, e.g. for changing file permissions recursively or for computing the total number of files and the total size of a directory. These jobs report back to the GUI thread using signals such as "new-files" (when new files are added and need to be picked up by the GUI) or "progress".

GIO has something similar ... but it's not so obvious how it works. When I tried to figure out how to migrate ThunarVfsJob to GIO I thought: hey, GIO must have something like this already! It contains several job-like functions such as g_file_copy_async() after all.

So here's what I found out after spending some time reading gfile.c and glocalfile.c: there is a job framework in GIO ... but it's hidden behind easy-to-use asynchronous functions. It's based on GCancellable, GAsyncResult and a bunch of callback types. It uses GIOScheduler internally to glue everything together into something that is actually pretty convenient (but still kinda tricky).

So, what do you need in order to write your own jobs in the GIO style?

First of all, you need an example task. I picked counting files and computing the total size of a directory to understand how it works. What we want is an asynchronous function which does exactly that and uses a callback to report the progress back to the GUI thread ... just like g_file_copy_async() does.

Next, you define the callback type and two functions for starting the job (a sync and an async version):

The Public API

typedef void (*GFileCountProgressCallback) (goffset  current_num_files,
                                            goffset  current_num_bytes,
                                            gpointer user_data);

static gboolean 
g_file_deep_count (GFile                     *file,
                   GCancellable              *cancellable,
                   GFileCountProgressCallback progress_callback,
                   gpointer                   progress_callback_data,
                   GError                   **error);

static void
g_file_deep_count_async (GFile                     *file,
                         int                        io_priority,
                         GCancellable              *cancellable,
                         GFileCountProgressCallback progress_callback,
                         gpointer                   progress_callback_data,
                         GAsyncReadyCallback        callback,
                         gpointer                   callback_data);

The Implementation

All g_file_deep_count_async() does is create a GSimpleAsyncResult, put the callback information into it and then tell the GIOScheduler to run the job. Here's what that looks like:

static void 
g_file_deep_count_async (GFile                     *file,
                         int                        io_priority,
                         GCancellable              *cancellable,
                         GFileCountProgressCallback progress_callback,
                         gpointer                   progress_callback_data,
                         GAsyncReadyCallback        callback,
                         gpointer                   callback_data)
{
  GSimpleAsyncResult *result;
  DeepCountAsyncData *data;

  g_return_if_fail (G_IS_FILE (file));

  data = g_new0 (DeepCountAsyncData, 1);
  data->file = g_object_ref (file);
  data->progress_cb = progress_callback;
  data->progress_cb_data = progress_callback_data;

  result = g_simple_async_result_new (G_OBJECT (file), 
                                      callback,
                                      callback_data, 
                                      g_file_deep_count_async);
  g_simple_async_result_set_op_res_gpointer (result, 
                                             data, 
                                             (GDestroyNotify) deep_count_async_data_free); 

  g_io_scheduler_push_job (deep_count_async_thread,
                           result, 
                           g_object_unref, 
                           io_priority, 
                           cancellable);
}

DeepCountAsyncData is a simple struct which needs no further explanation, I think. First, the struct carrying the callback and its user data is attached to the GSimpleAsyncResult, then the job is pushed to the GIOScheduler. As you can see, there is another function involved: deep_count_async_thread(). This is the function which runs in a separate thread and does most of the work (well, not quite ... but almost). Here's what it looks like:

static gboolean
deep_count_async_thread (GIOSchedulerJob *job,
                         GCancellable    *cancellable,
                         gpointer         user_data)
{
  GSimpleAsyncResult *res;
  DeepCountAsyncData *data;
  gboolean            result;
  GError             *error = NULL;

  res = user_data;
  data = g_simple_async_result_get_op_res_gpointer (res);

  data->job = job;
  result = g_file_deep_count (data->file, 
                              cancellable, 
                              data->progress_cb != NULL ? deep_count_async_progress_callback : NULL, 
                              data, 
                              &error);

  if (data->progress_cb != NULL)
    g_io_scheduler_job_send_to_mainloop (job, (GSourceFunc) gtk_false, NULL, NULL);

  if (!result && error != NULL)
    {
      g_simple_async_result_set_from_error (res, error);
      g_error_free (error);
    }

  g_simple_async_result_complete_in_idle (res);

  return FALSE;
}

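As you can see, it runs the synchronous function g_file_deep_count(). If a progress callback was given, it then sends one blocking no-op call (gtk_false) to the main loop. g_io_scheduler_job_send_to_mainloop() waits until that call has run, so this acts as a barrier: the progress callbacks queued so far have been delivered before the job completes. It does one more thing though: it passes its own progress callback, deep_count_async_progress_callback(). This is required for the real progress callback to be called inside the GUI thread. This is the code for the internal callback: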

static gboolean
deep_count_async_progress_in_main (gpointer user_data)
{
  ProgressData       *progress = user_data;
  DeepCountAsyncData *data = progress->data;

  data->progress_cb (progress->current_num_files, 
                     progress->current_num_bytes, 
                     data->progress_cb_data);

  return FALSE;
}

static void
deep_count_async_progress_callback (goffset  current_num_files,
                                    goffset  current_num_bytes,
                                    gpointer user_data)
{
  DeepCountAsyncData *data = user_data;
  ProgressData       *progress;

  progress = g_new (ProgressData, 1);
  progress->data = data;
  progress->current_num_files = current_num_files;
  progress->current_num_bytes = current_num_bytes;

  g_io_scheduler_job_send_to_mainloop_async (data->job, 
                                             deep_count_async_progress_in_main, 
                                             progress, 
                                             g_free);
}

deep_count_async_progress_callback() is called from within the job thread. It then tells the scheduler to call deep_count_async_progress_in_main from the GUI thread. And finally deep_count_async_progress_in_main calls the real progress callback e.g. to update the GUI.

Now you still haven't seen any code related to counting files and computing the total file size of a directory ... let's get to that now. Here's the synchronous deep count function which is called from within the job thread:

static gboolean  
g_file_deep_count (GFile                     *file,
                   GCancellable              *cancellable,
                   GFileCountProgressCallback progress_callback,
                   gpointer                   progress_callback_data,
                   GError                   **error)
{
  ProgressData data = {
    .data = NULL,
    .current_num_files = 0,
    .current_num_bytes = 0,
  };

  g_return_val_if_fail (G_IS_FILE (file), FALSE); 

  if (g_cancellable_set_error_if_cancelled (cancellable, error))
    return FALSE;

  return g_file_real_deep_count (file, 
                                 cancellable, 
                                 progress_callback, 
                                 progress_callback_data, 
                                 &data, 
                                 error);
}

Damn ... it still doesn't do any real work! OK, but this time there's no long chain of nested function calls anymore, I promise. There's just one function left: g_file_real_deep_count(). Before we can call it, however, g_file_deep_count() has to initialize the progress data. After that, g_file_real_deep_count() can call itself recursively and do something useful. Here we go:

static gboolean
g_file_real_deep_count (GFile                     *file,
                        GCancellable              *cancellable,
                        GFileCountProgressCallback progress_callback,
                        gpointer                   progress_callback_data,
                        ProgressData              *progress_data,
                        GError                   **error)
{
  GFileEnumerator *enumerator;
  GFileInfo       *info;
  GFileInfo       *child_info;
  GFile           *child;
  gboolean         success = TRUE;
  
  g_return_val_if_fail (G_IS_FILE (file), FALSE);
  if (g_cancellable_set_error_if_cancelled (cancellable, error))
    return FALSE;

  info = g_file_query_info (file, 
                            "standard::*", 
                            G_FILE_QUERY_INFO_NOFOLLOW_SYMLINKS, 
                            cancellable, 
                            error);

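  if (info == NULL)
    return FALSE;

  /* Cancelled right after the query succeeded: don't leak info */
  if (g_cancellable_set_error_if_cancelled (cancellable, error))
    {
      g_object_unref (info);
      return FALSE;
    }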

  progress_data->current_num_files += 1;
  progress_data->current_num_bytes += g_file_info_get_size (info);

  if (progress_callback != NULL)
    {
      /* Here we call the internal callback */
      progress_callback (progress_data->current_num_files, 
                         progress_data->current_num_bytes, 
                         progress_callback_data);
    }

  if (g_file_info_get_file_type (info) == G_FILE_TYPE_DIRECTORY)
    {
      enumerator = g_file_enumerate_children (file, 
                                              "standard::*", 
                                              G_FILE_QUERY_INFO_NOFOLLOW_SYMLINKS, 
                                              cancellable, 
                                              error);
    
      if (!g_cancellable_is_cancelled (cancellable))
        {
          if (enumerator != NULL)
            {
              while (!g_cancellable_is_cancelled (cancellable) && success)
                {
                  child_info = g_file_enumerator_next_file (enumerator, 
                                                            cancellable, 
                                                            error);

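                  if (child_info == NULL)
                    {
                      /* NULL means end of directory or an error; the
                       * caller may have passed error == NULL */
                      if (error != NULL && *error != NULL)
                        success = FALSE;
                      break;
                    }

                  if (g_cancellable_is_cancelled (cancellable))
                    {
                      g_object_unref (child_info);
                      break;
                    }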

                  child = g_file_resolve_relative_path (file, g_file_info_get_name (child_info));
                  success = success && g_file_real_deep_count (child, 
                                                               cancellable, 
                                                               progress_callback, 
                                                               progress_callback_data, 
                                                               progress_data, 
                                                               error);
                  g_object_unref (child);
                  g_object_unref (child_info);
                }

              g_object_unref (enumerator);
            }
          else
            success = FALSE; /* enumerating failed, error is already set */
        }
    }

  g_object_unref (info);

  return !g_cancellable_is_cancelled (cancellable) && success;
}

And that's it. We can now compute the number of files and the total size of a directory recursively using a GCancellable and one or two callbacks. All of this is done using threads, so you don't have to worry about blocking your GUI main loop.

If you want to see this in action, visit the job framework page in my thesis wiki and download deepcount.c and the Makefile.