<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>GData Overview: GData Reference Manual</title>
<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot">
<link rel="home" href="index.html" title="GData Reference Manual">
<link rel="up" href="pt01.html" title="Part I. GData Overview">
<link rel="prev" href="pt01.html" title="Part I. GData Overview">
<link rel="next" href="gdata-running.html" title="Running GData Applications">
<meta name="generator" content="GTK-Doc V1.26.1 (XML mode)">
<link rel="stylesheet" href="style.css" type="text/css">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table class="navigation" id="top" width="100%" summary="Navigation header" cellpadding="2" cellspacing="5"><tr valign="middle">
<td width="100%" align="left" class="shortcuts"></td>
<td><a accesskey="h" href="index.html"><img src="home.png" width="16" height="16" border="0" alt="Home"></a></td>
<td><a accesskey="u" href="pt01.html"><img src="up.png" width="16" height="16" border="0" alt="Up"></a></td>
<td><a accesskey="p" href="pt01.html"><img src="left.png" width="16" height="16" border="0" alt="Prev"></a></td>
<td><a accesskey="n" href="gdata-running.html"><img src="right.png" width="16" height="16" border="0" alt="Next"></a></td>
</tr></table>
<div class="refentry">
<a name="gdata-overview"></a><div class="titlepage"></div>
<div class="refnamediv"><table width="100%"><tr>
<td valign="top">
<h2><span class="refentrytitle"><a name="gdata-overview.top_of_page"></a>GData Overview</span></h2>
<p>GData Overview — overview of libgdata's architecture</p>
</td>
<td class="gallery_image" valign="top" align="right"></td>
</tr></table></div>
<div class="refsect1">
<a name="id-1.2.2.3"></a><h2>Introduction</h2>
<p>libgdata is a library to allow access to web services using the GData protocol from the desktop. The <a class="ulink" href="http://code.google.com/apis/gdata/overview.html" target="_top">GData protocol</a> is a simple protocol for reading and writing
data on the web, designed by Google.</p>
<div class="refsect2">
<a name="id-1.2.2.3.3"></a><h3>Protocols</h3>
<p>Google's services were originally only accessible using an XML-based protocol called <em class="firstterm">GData</em>. However, later additions
to the set of available services use a REST-style JSON protocol. libgdata supports both protocols, although specific
services use exactly one of the two protocols.</p>
<p>The core API in libgdata transparently supports both protocols, so client code need not consider which protocol to use.</p>
</div>
<hr>
<div class="refsect2">
<a name="id-1.2.2.3.4"></a><h3>XML protocol</h3>
<p>The GData XML protocol is designed by Google to allow interaction with their web services. It is based on the Atom Publishing
protocol, with namespaced XML additions. Communication between the client and server is broadly achieved through HTTP
requests with query parameters, and Atom feeds being returned with result entries. Each <em class="firstterm">service</em>
has its own namespaced additions to the GData protocol; for example, the Google Calendar service's API has
specialisations for addresses and time periods.
</p>
<div class="figure">
<a name="id-1.2.2.3.4.2.2"></a><p class="title"><b>Figure 1. </b></p>
<div class="figure-contents">
<div class="mediaobject" align="center"><img src="data-flow.png" align="middle"></div>
<span class="phrase">An overview of the data flow when making a request of a GData service.</span>
</div>
</div>
<p><br class="figure-break">
</p>
<p>Results are always returned in the form of result <em class="firstterm">feeds</em>, containing multiple
<em class="firstterm">entries</em>. How the entries are interpreted depends on what was queried of the service, but when
using libgdata, this is all taken care of transparently.</p>
</div>
<hr>
<div class="refsect2">
<a name="id-1.2.2.3.5"></a><h3>JSON protocol</h3>
<p>The more recent JSON protocol is similar in architecture to the XML protocol: entries are arranged into feeds, and the core
operations available are: listing all entries, getting a specific entry, inserting an entry, updating an entry and deleting an entry.</p>
<p>The key difference between the two protocols, apart from the serialisation format, is that the JSON protocol is not namespaced. Each
service uses a specific JSON format, and there is no formal sharing of data structures between services. For example, every entry
in the XML protocol is required to have a title, ID and update time (as per the Atom specification). Such commonality between
JSON entries is purely ad-hoc.</p>
<p>Differences between the XML and JSON protocols are hidden by the libgdata API. Both protocols are implemented by the standard
<span class="type"><a class="link" href="GDataService.html" title="GDataService">GDataService</a></span>, <span class="type"><a class="link" href="GDataFeed.html" title="GDataFeed">GDataFeed</a></span> and
<span class="type"><a class="link" href="GDataEntry.html" title="GDataEntry">GDataEntry</a></span> classes.</p>
</div>
<hr>
<div class="refsect2">
<a name="id-1.2.2.3.6"></a><h3>Structure</h3>
<p>The basic design of libgdata mirrors the protocol's structure quite closely:
</p>
<div class="figure">
<a name="id-1.2.2.3.6.2.1"></a><p class="title"><b>Figure 2. </b></p>
<div class="figure-contents">
<div class="mediaobject" align="center"><img src="structure.png" align="middle"></div>
<span class="phrase">An overview of the libgdata class structure.</span>
</div>
</div>
<p><br class="figure-break">
</p>
<div class="variablelist"><table border="0" class="variablelist">
<colgroup>
<col align="left" valign="top">
<col>
</colgroup>
<tbody>
<tr>
<td><p><span class="term"><span class="type"><a class="link" href="GDataService.html" title="GDataService">GDataService</a></span></span></p></td>
<td>
<p>Subclassed for each different web service implemented, this class represents a single client's
connection to the relevant web service, holding their authentication state, and making the necessary
requests to read and write data to and from the service. All top-level actions, such as creating a new
object on the server, are carried out through a service.</p>
<p>There should be one <span class="type"><a class="link" href="GDataService.html" title="GDataService">GDataService</a></span> subclass for
each of the services listed <a class="ulink" href="http://code.google.com/apis/gdata/" target="_top">in the GData
documentation</a>.</p>
</td>
</tr>
<tr>
<td><p><span class="term"><span class="type"><a class="link" href="GDataQuery.html" title="GDataQuery">GDataQuery</a></span></span></p></td>
<td>
<p>For queries to have multiple individual parameters, a
<span class="type"><a class="link" href="GDataQuery.html" title="GDataQuery">GDataQuery</a></span> can be used to specify the parameters.</p>
<p>Query objects are optional, and can only be used with queries (not with entry insertions, updates
or deletions). The query object builds the query URI used by the
<span class="type"><a class="link" href="GDataService.html" title="GDataService">GDataService</a></span> when sending the query to the
server.</p>
<p>Services can subclass <span class="type"><a class="link" href="GDataQuery.html" title="GDataQuery">GDataQuery</a></span> if the service
supports non-standard query parameters.</p>
</td>
</tr>
<tr>
<td><p><span class="term"><span class="type"><a class="link" href="GDataFeed.html" title="GDataFeed">GDataFeed</a></span></span></p></td>
<td>
<p>Effectively a list of <span class="type"><a class="link" href="GDataEntry.html" title="GDataEntry">GDataEntry</a></span>s, the
<span class="type"><a class="link" href="GDataFeed.html" title="GDataFeed">GDataFeed</a></span> class is a direct counterpart of the root
<span class="type"><feed></span> element in the Atom feeds which form the GData protocol. It contains the
elements in a query response, as well as general information about the response, such as links to
related feeds and the categories under which the query response falls.</p>
<p><span class="type"><a class="link" href="GDataFeed.html" title="GDataFeed">GDataFeed</a></span> is usually not subclassed by services,
as there are rarely service-specific elements in a feed itself.</p>
</td>
</tr>
<tr>
<td><p><span class="term"><span class="type"><a class="link" href="GDataEntry.html" title="GDataEntry">GDataEntry</a></span></span></p></td>
<td><p>A <span class="type"><a class="link" href="GDataEntry.html" title="GDataEntry">GDataEntry</a></span> is a direct counterpart of the
<span class="type"><entry></span> element in the Atom feeds which form the GData protocol. It represents a
single object of unspecified semantics; an entry could be anything from a calendar event to a video
comment or access control rule. Semantics are given to entries by subclassing
<span class="type"><a class="link" href="GDataEntry.html" title="GDataEntry">GDataEntry</a></span> for the various types of entries returned
by queries to a service. Such subclasses implement useful, relevant and query-specific properties
on the entry (such as the duration of a video, or the recurrence rules of a calendar event).</p></td>
</tr>
</tbody>
</table></div>
</div>
</div>
<div class="refsect1">
<a name="id-1.2.2.4"></a><h2>Development Philosophy</h2>
<p>As the GData protocol (and all the service-specific protocols which extend it) is reasonably young, it is subject to fairly
frequent updates and expansions. While backwards compatibility is maintained, these updates necessitate that libgdata
remains fairly flexible in how it treats data. The sections below detail some of the ways in which libgdata achieves this,
and the reasoning behind them, as well as other major design decisions behind libgdata's API.</p>
<div class="refsect2">
<a name="enumerable-properties"></a><h3>Enumerable Properties</h3>
<p>There are many class properties in libgdata which should, at first glance, be implemented as enumerated types. Function
calls such as <code class="function"><a class="link" href="GDataLink.html#gdata-link-get-relation-type" title="gdata_link_get_relation_type ()">gdata_link_get_relation_type()</a></code>
and <code class="function"><a class="link" href="GDataGDIMAddress.html#gdata-gd-im-address-get-protocol" title="gdata_gd_im_address_get_protocol ()">gdata_gd_im_address_get_protocol()</a></code>
would, in a conventional library, return a value from an enum, which would work well, and be more typesafe and
memory-efficient than using arbitrary strings.</p>
<p>However, such an implementation would not be forwards-compatible. If a protocol addition was made which added another
link relation type, or added supportf or another IM protocol, there would be no way for libgdata to represent some
of the data it retrieved from the server. It could return an “other” value from the enum, but that could lead to
data loss in the common case of GData entries being queried from the server, processed, then updated again.</p>
<p>For this reason – which is made more troublesome by the fact that it is unpredictable when updates to the protocol are
released, or when updated XML/JSON will start coming over the wire – libgdata uses enumerated types sparingly; they are
only used when it is very improbable (or even impossible) for the property in question to be extended or changed in
the future. In any other case, a string value is used instead, with libgdata providing <code class="code">#define</code>d values
for the known values of the property. These values should be used as much as possible by applications which use
libgdata (i.e. they should be treated as if they were enumerated values), but applications are free to use strings
of their own, too. All validation of such pseudo-enums is left to the server.</p>
<p>One situation where it is acceptable to use enumerated types is in API which is only ever used to query the server, and
isn't involved in processing or representing the response at all, i.e. subclasses of
<span class="type"><a class="link" href="GDataQuery.html" title="GDataQuery">GDataQuery</a></span>.</p>
</div>
<hr>
<div class="refsect2">
<a name="id-1.2.2.4.4"></a><h3>String Constants</h3>
<p>As the protocols are XML- or JSON-based, they make extensive use of string constants, typically as
<a class="link" href="gdata-overview.html#enumerable-properties" title="Enumerable Properties">enumerated types</a> or namespaced URIs. To stop the authors of applications
which use libgdata from having to continually look up the correct “magic strings” to use, all such strings should
be <code class="code">#define</code>d in libgdata, and referenced in the appropriate function documentation.</p>
</div>
<hr>
<div class="refsect2">
<a name="id-1.2.2.4.5"></a><h3>New Services</h3>
<p>The API required to implement support for a new service using libgdata is not publicly exposed. This is because doing
so would clutter the API to a large extent; for example, exposing various properties as writeable which are currently
only readable. While the freedom for users of libgdata to write their own services is a good one, it is outweighed by
the muddlement that this would bring to the API.</p>
<p>Furthermore, since it is highly unlikely that anyone except Google will use GData as a basis for communicating with
their service, there is little harm in restricting the implementation of services to libgdata. If someone wants to
implement support for a new GData service, it is for the benefit of everyone if this implementation is done in libgdata
itself, rather than their application.</p>
</div>
<hr>
<div class="refsect2">
<a name="cancellable-support"></a><h3>Cancellable Support</h3>
<p>As libgdata is a network library, it has to be able to deal with operations which take a long (and indeterminate) amount
of time due to network latencies. As well as providing asynchronous operation support, every such operation in libgdata
is cancellable, using <span class="type">GCancellable</span>.</p>
<p>Using <span class="type">GCancellable</span>, any ongoing libgdata operation can be cancelled
from any other thread by calling <code class="function">g_cancellable_cancel</code>.
If the ongoing operation is doing network activity, the operation will be cancelled as safely as possible (although
the server's state cannot be guaranteed when cancelling a non-idempotent operation, such as an insertion or update,
since the server may have already committed the results of the operation, but might not have returned them to libgdata
yet) and the operation will return to its calling function with a
<code class="code">G_IO_ERROR_CANCELLED</code> error. Similarly,
if the operation is yet to do network activity, it will return with the above error before the network activity is
started, leaving the server unchanged.</p>
<p>However, if the operation has finished its network activity, libgdata does not guarantee that it will return with an
error — it may return successfully. There is no way to fix this, as it is an inherent race condition between checking
for cancellation for the last time, and returning the successful result. Rather than reduce the probability of the race
condition occurring, but still have the possibility of it occurring, libgdata will just continue to process an operation
after its network activity is over, and return success.</p>
<p>This may be useful in situations where the user is cancelling an operation due to it taking too long; the application
using libgdata may want to make use of the result of the operation, even if it has previously tried to cancel the
operation after network activity finished.</p>
<p>The behaviour of cancellation in libgdata can be represented as follows:
</p>
<div class="figure">
<a name="id-1.2.2.4.6.6.1"></a><p class="title"><b>Figure 3. </b></p>
<div class="figure-contents">
<div class="mediaobject" align="center"><img src="cancellation.png" align="middle"></div>
<span class="phrase">The behaviour of cancellation in libgdata.</span>
</div>
</div>
<p><br class="figure-break">
</p>
</div>
<hr>
<div class="refsect2">
<a name="id-1.2.2.4.7"></a><h3>Privacy</h3>
<p>Privacy is an important consideration with code such as libgdata's, which handles valuable data such as people's
address books and Google Account login details.</p>
<p>Unfortunately, it's infeasible for libgdata to ensure that no private data is ever leaked from a process. To do this
properly would require almost all the data allocated by libgdata (and all the libraries it depends on, all the way down
to the TLS implementation) to use non-pageable memory for all network requests and responses, and to be careful about
zeroing them before freeing them. There isn't enough support for this level of paranoia in the lower levels of the
stack (such as libsoup).</p>
<p>However, it is feasible to ensure that the user's password and authentication/authorization tokens aren't leaked. This
is done in several ways in libgdata:</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
<p>If libgdata is compiled with libgcr support enabled (using the
<code class="code">--enable-gnome</code> configuration flag), it will use libgcr's support for
non-pageable memory. This will try hard to avoid passwords and auth. tokens being paged out to disk at
any point (although there are circumstances, such as when hibernating, where this is
unavoidable).</p>
<p>Otherwise, libgdata will ensure that passwords and auth. tokens are zeroed out in memory before being
freed, which lowers the chance of them reaching disk at a later stage.</p>
</li>
<li class="listitem"><p>Unless run with <code class="envar">LIBGDATA_DEBUG</code> set to <code class="literal">4</code>, libgdata will attempt to
redact all usernames, passwords and auth. tokens from debug log output. This aims to prevent accidental
disclosure of passwords, etc. in bug reports. Currently, this is implemented using a fixed set of
search patterns, so it's possible that certain bits of private information will not be redacted; any
such occurrence is a bug which should be reported on
<a class="ulink" href="https://bugzilla.gnome.org/enter_bug.cgi?product=libgdata" target="_top">GNOME
Bugzilla</a>.</p></li>
</ul></div>
<p>libgdata universally uses HTTPS rather than HTTP for communicating with servers. The port which is used may be changed
for testing purposes, using the <code class="envar">LIBGDATA_HTTPS_PORT</code> environment variable; but the protocol used will
always be HTTPS.</p>
<p>libgdata provides ways to upload and download files, but does not implement code for loading or saving those files to
or from disk. Since these files will typically be user data (such as their Google Drive documents), it is highly
recommended that they are given restricted permissions, any temporary files are only readable by the current user,
and files are potentially encrypted on disk where appropriate. The aim is to avoid leaking user data to other users
of the system, or to attackers who gain access to the user’s hard drive (which may not be encrypted). libgdata itself
only guarantees that data is encrypted while being sent over the network.</p>
</div>
</div>
</div>
<div class="footer">
<hr>Generated by GTK-Doc V1.26.1</div>
</body>
</html>