<refentry id="gdata-overview">
<refmeta>
<refentrytitle role="top_of_page" id="gdata-overview.top_of_page">GData Overview</refentrytitle>
<manvolnum>3</manvolnum>
<refmiscinfo>GDATA Library</refmiscinfo>
</refmeta>
<refnamediv>
<refname>GData Overview</refname>
<refpurpose>overview of libgdata's architecture</refpurpose>
</refnamediv>

<refsect1>
<title>Introduction</title>
<para>libgdata is a library to allow access to web services using the GData protocol from the desktop. The <ulink type="http"
url="http://code.google.com/apis/gdata/overview.html">GData protocol</ulink> is a simple protocol for reading and writing
data on the web, designed by Google.</para>

<refsect2>
<title>Protocols</title>
<para>Google's services were originally only accessible using an XML-based protocol called <firstterm>GData</firstterm>. However, later additions
to the set of available services use a REST-style JSON protocol. libgdata supports both protocols, although each
service uses exactly one of the two.</para>
<para>The core API in libgdata transparently supports both protocols, so client code need not consider which protocol to use.</para>
</refsect2>

<refsect2>
<title>XML protocol</title>
<para>The GData XML protocol was designed by Google to allow interaction with their web services. It is based on the Atom Publishing
protocol, with namespaced XML additions. Communication between the client and server broadly consists of HTTP
requests with query parameters, answered by Atom feeds containing result entries. Each <firstterm>service</firstterm>
has its own namespaced additions to the GData protocol; for example, the Google Calendar service's API has
specialisations for addresses and time periods.
<figure>
<mediaobject>
<imageobject><imagedata fileref="data-flow.png" format="PNG" align="center"/></imageobject>
</mediaobject>
<textobject><phrase>An overview of the data flow when making a request of a GData service.</phrase></textobject>
</figure>
</para>
<para>Results are always returned in the form of result <firstterm>feeds</firstterm>, containing multiple
<firstterm>entries</firstterm>. How the entries are interpreted depends on what was queried of the service, but when
using libgdata, this is all taken care of transparently.</para>
</refsect2>

<refsect2>
<title>JSON protocol</title>
<para>The more recent JSON protocol is similar in architecture to the XML protocol: entries are arranged into feeds, and the core
operations available are: listing all entries, getting a specific entry, inserting an entry, updating an entry and deleting an entry.</para>
<para>The key difference between the two protocols, apart from the serialisation format, is that the JSON protocol is not namespaced. Each
service uses a specific JSON format, and there is no formal sharing of data structures between services. For example, every entry
in the XML protocol is required to have a title, ID and update time (as per the Atom specification). Such commonality between
JSON entries is purely ad-hoc.</para>
<para>Differences between the XML and JSON protocols are hidden by the libgdata API. Both protocols are implemented by the standard
<type><link linkend="GDataService">GDataService</link></type>, <type><link linkend="GDataFeed">GDataFeed</link></type> and
<type><link linkend="GDataEntry">GDataEntry</link></type> classes.</para>
</refsect2>

<refsect2>
<title>Structure</title>
<para>The basic design of libgdata mirrors the protocol's structure quite closely:
<figure>
<mediaobject>
<imageobject><imagedata fileref="structure.png" format="PNG" align="center"/></imageobject>
</mediaobject>
<textobject><phrase>An overview of the libgdata class structure.</phrase></textobject>
</figure>
</para>
<variablelist>
<varlistentry>
<term><type><link linkend="GDataService">GDataService</link></type></term>
<listitem><para>Subclassed for each different web service implemented, this class represents a single client's
connection to the relevant web service, holding their authentication state, and making the necessary
requests to read and write data to and from the service. All top-level actions, such as creating a new
object on the server, are carried out through a service.</para>
<para>There should be one <type><link linkend="GDataService">GDataService</link></type> subclass for
each of the services listed <ulink type="http" url="http://code.google.com/apis/gdata/">in the GData
documentation</ulink>.</para></listitem>
</varlistentry>
<varlistentry>
<term><type><link linkend="GDataQuery">GDataQuery</link></type></term>
<listitem><para>For queries which take multiple individual parameters, a
<type><link linkend="GDataQuery">GDataQuery</link></type> can be used to specify them.</para>
<para>Query objects are optional, and can only be used with queries (not with entry insertions, updates
or deletions). The query object builds the query URI used by the
<type><link linkend="GDataService">GDataService</link></type> when sending the query to the
server.</para>
<para>Services can subclass <type><link linkend="GDataQuery">GDataQuery</link></type> if the service
supports non-standard query parameters.</para>
</listitem>
</varlistentry>
<varlistentry>
<term><type><link linkend="GDataFeed">GDataFeed</link></type></term>
<listitem><para>Effectively a list of <type><link linkend="GDataEntry">GDataEntry</link></type>s, the
<type><link linkend="GDataFeed">GDataFeed</link></type> class is a direct counterpart of the root
<type>&lt;feed&gt;</type> element in the Atom feeds which form the GData protocol. It contains the
elements in a query response, as well as general information about the response, such as links to
related feeds and the categories under which the query response falls.</para>
<para><type><link linkend="GDataFeed">GDataFeed</link></type> is usually not subclassed by services,
as there are rarely service-specific elements in a feed itself.</para></listitem>
</varlistentry>
<varlistentry>
<term><type><link linkend="GDataEntry">GDataEntry</link></type></term>
<listitem><para>A <type><link linkend="GDataEntry">GDataEntry</link></type> is a direct counterpart of the
<type>&lt;entry&gt;</type> element in the Atom feeds which form the GData protocol. It represents a
single object of unspecified semantics; an entry could be anything from a calendar event to a video
comment or access control rule. Semantics are given to entries by subclassing
<type><link linkend="GDataEntry">GDataEntry</link></type> for the various types of entries returned
by queries to a service. Such subclasses implement useful, relevant and query-specific properties
on the entry (such as the duration of a video, or the recurrence rules of a calendar event).</para>
</listitem>
</varlistentry>
</variablelist>
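<para>As a rough illustration of how a query object might turn its parameters into the query URI, the following
self-contained C sketch appends two standard GData query parameters (<literal>q</literal> and
<literal>max-results</literal>) to a feed URI. <type>ExampleQuery</type>, <function>example_append_param()</function> and
<function>example_query_build_uri()</function> are hypothetical names invented for this sketch and are not part of
libgdata's API; the sketch also assumes that the search text has already been URI-escaped.</para>
<informalexample><programlisting language="C"><![CDATA[
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a query object; this is not libgdata's GDataQuery. */
typedef struct {
	const char *q;            /* search terms, assumed already URI-escaped; may be NULL */
	unsigned int max_results; /* maximum number of entries to return; 0 means unset */
} ExampleQuery;

/* Append one "name=value" pair at offset *used, using '?' for the first
 * parameter and '&' afterwards. Returns 0 on success, -1 if the buffer is too small. */
static int
example_append_param (char *buffer, size_t size, size_t *used, const char *name, const char *value)
{
	const char separator = (strchr (buffer, '?') == NULL) ? '?' : '&';
	int written = snprintf (buffer + *used, size - *used, "%c%s=%s", separator, name, value);

	if (written < 0 || (size_t) written >= size - *used)
		return -1;

	*used += (size_t) written;
	return 0;
}

/* Build the query URI for feed_uri into buffer. Returns 0 on success, -1 on overflow. */
static int
example_query_build_uri (const ExampleQuery *query, const char *feed_uri, char *buffer, size_t size)
{
	char number[16];
	size_t used;
	int written = snprintf (buffer, size, "%s", feed_uri);

	if (written < 0 || (size_t) written >= size)
		return -1;
	used = (size_t) written;

	if (query->q != NULL && example_append_param (buffer, size, &used, "q", query->q) != 0)
		return -1;

	if (query->max_results > 0) {
		snprintf (number, sizeof (number), "%u", query->max_results);
		if (example_append_param (buffer, size, &used, "max-results", number) != 0)
			return -1;
	}

	return 0;
}
]]></programlisting></informalexample>
<para>With a feed URI of <literal>https://example.com/feeds/videos</literal> (a placeholder), a search string of
<literal>libgdata</literal> and a limit of 10 results, the sketch produces
<literal>https://example.com/feeds/videos?q=libgdata&amp;max-results=10</literal>. A query with no parameters set leaves the
feed URI unchanged, which matches the idea that query objects are optional.</para>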
</refsect2>
</refsect1>

<refsect1>
<title>Development Philosophy</title>
<para>As the GData protocol (and all the service-specific protocols which extend it) is reasonably young, it is subject to fairly
frequent updates and expansions. While backwards compatibility is maintained, these updates necessitate that libgdata
remains fairly flexible in how it treats data. The sections below detail some of the ways in which libgdata achieves this,
and the reasoning behind them, as well as other major design decisions behind libgdata's API.</para>

<refsect2 id="enumerable-properties">
<title>Enumerable Properties</title>
<para>There are many class properties in libgdata which should, at first glance, be implemented as enumerated types. Function
calls such as <function><link linkend="gdata-link-get-relation-type">gdata_link_get_relation_type()</link></function>
and <function><link linkend="gdata-gd-im-address-get-protocol">gdata_gd_im_address_get_protocol()</link></function>
would, in a conventional library, return a value from an enum, which would work well, and be more typesafe and
memory-efficient than using arbitrary strings.</para>
<para>However, such an implementation would not be forwards-compatible. If a protocol addition were made which added another
link relation type, or added support for another IM protocol, there would be no way for libgdata to represent some
of the data it retrieved from the server. It could return an “other” value from the enum, but that could lead to
data loss in the common case of GData entries being queried from the server, processed, then updated again.</para>
<para>For this reason (made more troublesome by the fact that it is unpredictable when updates to the protocol are
released, or when updated XML/JSON will start coming over the wire), libgdata uses enumerated types sparingly; they are
only used when it is very improbable (or even impossible) for the property in question to be extended or changed in
the future. In any other case, a string value is used instead, with libgdata providing <literal>#define</literal>d values
for the known values of the property. These values should be used as much as possible by applications which use
libgdata (i.e. they should be treated as if they were enumerated values), but applications are free to use strings
of their own, too. All validation of such pseudo-enums is left to the server.</para>
<para>One situation where it is acceptable to use enumerated types is in API which is only ever used to query the server, and
is not involved in processing or representing the response at all, i.e. subclasses of
<type><link linkend="GDataQuery">GDataQuery</link></type>.</para>
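<para>The pseudo-enum approach can be sketched in plain C as follows. The type, functions and constants below
(<type>ExampleImAddress</type>, <function>example_im_address_set_protocol()</function>,
<literal>EXAMPLE_IM_PROTOCOL_JABBER</literal> and so on) are hypothetical stand-ins invented for this sketch; they are
not libgdata's own definitions. The point illustrated is that a value which the library has never heard of is stored and
returned unchanged, so nothing is lost when an entry is queried, processed and sent back to the server.</para>
<informalexample><programlisting language="C"><![CDATA[
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical known values, standing in for the constants a library would #define. */
#define EXAMPLE_IM_PROTOCOL_JABBER "http://schemas.example.com/im#JABBER"
#define EXAMPLE_IM_PROTOCOL_IRC "http://schemas.example.com/im#IRC"

/* Hypothetical entry type; initialise protocol to NULL before first use. */
typedef struct {
	char *protocol; /* owned copy of whatever string the server or application supplied */
} ExampleImAddress;

/* Store any protocol string, known or unknown. Returns false on allocation failure. */
static bool
example_im_address_set_protocol (ExampleImAddress *address, const char *protocol)
{
	size_t length = strlen (protocol) + 1;
	char *copy = malloc (length);

	if (copy == NULL)
		return false;

	memcpy (copy, protocol, length);
	free (address->protocol);
	address->protocol = copy;
	return true;
}

/* Return the stored string exactly as it was set; no mapping to an "other" value. */
static const char *
example_im_address_get_protocol (const ExampleImAddress *address)
{
	return address->protocol;
}

/* Known values are compared as strings, as if they were enumerated values. */
static bool
example_im_address_is_jabber (const ExampleImAddress *address)
{
	return address->protocol != NULL && strcmp (address->protocol, EXAMPLE_IM_PROTOCOL_JABBER) == 0;
}
]]></programlisting></informalexample>
<para>An enum-based version of the same getter would have to collapse an unrecognised protocol URI into a catch-all
value, and the original string could then not be written back to the server.</para>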
</refsect2>

<refsect2>
<title>String Constants</title>
<para>As the protocols are XML- or JSON-based, they make extensive use of string constants, typically as
<link linkend="enumerable-properties">enumerated types</link> or namespaced URIs. To stop the authors of applications
which use libgdata from having to continually look up the correct “magic strings” to use, all such strings should
be <literal>#define</literal>d in libgdata, and referenced in the appropriate function documentation.</para>
</refsect2>

<refsect2>
<title>New Services</title>
<para>The API required to implement support for a new service using libgdata is not publicly exposed. This is because doing
so would clutter the API to a large extent; for example, by exposing as writeable various properties which are currently
read-only. While the freedom for users of libgdata to write their own services would be valuable, it is outweighed by
the confusion that this would bring to the API.</para>
<para>Furthermore, since it is highly unlikely that anyone except Google will use GData as a basis for communicating with
their service, there is little harm in restricting the implementation of services to libgdata. If someone wants to
implement support for a new GData service, it is for the benefit of everyone if this implementation is done in libgdata
itself, rather than in their application.</para>
</refsect2>

<refsect2 id="cancellable-support">
<title>Cancellable Support</title>
<para>As libgdata is a network library, it has to be able to deal with operations which take a long (and indeterminate) amount
of time due to network latencies. As well as providing asynchronous operation support, every such operation in libgdata
is cancellable, using <type><link linkend="GCancellable">GCancellable</link></type>.</para>
<para>Using <type><link linkend="GCancellable">GCancellable</link></type>, any ongoing libgdata operation can be cancelled
from any other thread by calling <function><link linkend="g-cancellable-cancel">g_cancellable_cancel()</link></function>.
If the ongoing operation is doing network activity, it will be cancelled as safely as possible and will return to its
calling function with a
<link linkend="G-IO-ERROR-CANCELLED:CAPS"><literal>G_IO_ERROR_CANCELLED</literal></link> error. Similarly,
if the operation is yet to do network activity, it will return with the same error before the network activity is
started, leaving the server unchanged. Note, however, that the server's state cannot be guaranteed when cancelling a
non-idempotent operation, such as an insertion or update, while it is doing network activity: the server may have
already committed the results of the operation without having returned them to libgdata yet.</para>
<para>However, if the operation has finished its network activity, libgdata does not guarantee that it will return with an
error; it may return successfully instead. There is no way to fix this, as there is an inherent race condition between checking
for cancellation for the last time and returning the successful result. Rather than merely reducing the probability of the race
condition while leaving it possible, libgdata simply continues to process an operation
after its network activity is over, and returns success.</para>
<para>This may be useful in situations where the user is cancelling an operation due to it taking too long; the application
using libgdata may want to make use of the result of the operation, even if it has previously tried to cancel the
operation after network activity finished.</para>
<para>The behaviour of cancellation in libgdata can be represented as follows:
<figure>
<mediaobject>
<imageobject><imagedata fileref="cancellation.png" format="PNG" align="center"/></imageobject>
</mediaobject>
<textobject><phrase>The behaviour of cancellation in libgdata.</phrase></textobject>
</figure>
</para>
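<para>The three cases described in this section (cancellation before, during and after network activity) can be modelled
with a small, self-contained C sketch. It is only a model of where the cancellation checks sit, not libgdata's
implementation: <type>ExampleCancellable</type> is a hypothetical stand-in for
<type><link linkend="GCancellable">GCancellable</link></type> built on a C11 atomic flag, the network and processing
phases are simulated by callbacks, and a real operation would also interrupt network I/O which is already in progress.</para>
<informalexample><programlisting language="C"><![CDATA[
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
	EXAMPLE_RESULT_SUCCESS,
	EXAMPLE_RESULT_CANCELLED
} ExampleResult;

/* Hypothetical stand-in for GCancellable: a flag which may be set from any thread. */
typedef struct {
	atomic_bool cancelled;
} ExampleCancellable;

/* Callback simulating one phase of work; it may cancel the operation part-way through. */
typedef void (*ExamplePhase) (ExampleCancellable *cancellable);

static void
example_cancellable_init (ExampleCancellable *cancellable)
{
	atomic_init (&cancellable->cancelled, false);
}

static void
example_cancellable_cancel (ExampleCancellable *cancellable)
{
	atomic_store (&cancellable->cancelled, true);
}

static bool
example_cancellable_is_cancelled (ExampleCancellable *cancellable)
{
	return atomic_load (&cancellable->cancelled);
}

static ExampleResult
example_operation_run (ExampleCancellable *cancellable, ExamplePhase network, ExamplePhase processing)
{
	/* Cancelled before any network activity: return early, leaving the server unchanged. */
	if (example_cancellable_is_cancelled (cancellable))
		return EXAMPLE_RESULT_CANCELLED;

	/* Network activity (simulated). */
	if (network != NULL)
		network (cancellable);

	/* Last cancellation check, catching a cancellation made during network activity. */
	if (example_cancellable_is_cancelled (cancellable))
		return EXAMPLE_RESULT_CANCELLED;

	/* Processing after the network activity. Cancellation is not checked
	 * again, so a late cancellation is ignored and the result is returned. */
	if (processing != NULL)
		processing (cancellable);

	return EXAMPLE_RESULT_SUCCESS;
}
]]></programlisting></informalexample>
<para>Passing <function>example_cancellable_cancel()</function> as the <parameter>network</parameter> callback yields
<literal>EXAMPLE_RESULT_CANCELLED</literal>, whereas passing it as the <parameter>processing</parameter> callback yields
<literal>EXAMPLE_RESULT_SUCCESS</literal>, mirroring the late-cancellation case described above.</para>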
</refsect2>

<refsect2>
<title>Privacy</title>
<para>Privacy is an important consideration with code such as libgdata's, which handles valuable data such as people's
address books and Google Account login details.</para>
<para>Unfortunately, it's infeasible for libgdata to ensure that no private data is ever leaked from a process. To do this
properly would require almost all the data allocated by libgdata (and all the libraries it depends on, all the way down
to the TLS implementation) to use non-pageable memory for all network requests and responses, and to be careful about
zeroing them before freeing them. There isn't enough support for this level of paranoia in the lower levels of the
stack (such as libsoup).</para>
<para>However, it is feasible to ensure that the user's password and authentication/authorization tokens aren't leaked. This
is done in several ways in libgdata:</para>
<itemizedlist>
<listitem>
<para>If libgdata is compiled with libgcr support enabled (using the
<literal>--enable-gnome</literal> configuration flag), it will use libgcr's support for
non-pageable memory. This will try hard to avoid passwords and auth. tokens being paged out to disk at
any point (although there are circumstances, such as when hibernating, where this is
unavoidable).</para>
<para>Otherwise, libgdata will ensure that passwords and auth. tokens are zeroed out in memory before being
freed, which lowers the chance of them reaching disk at a later stage.</para>
</listitem>
<listitem>
<para>Unless run with <envar>LIBGDATA_DEBUG</envar> set to <literal>4</literal>, libgdata will attempt to
redact all usernames, passwords and auth. tokens from debug log output. This aims to prevent accidental
disclosure of passwords, etc. in bug reports. Currently, this is implemented using a fixed set of
search patterns, so it's possible that certain bits of private information will not be redacted; any
such occurrence is a bug which should be reported on
<ulink type="http" url="https://bugzilla.gnome.org/enter_bug.cgi?product=libgdata">GNOME
Bugzilla</ulink>.</para>
</listitem>
</itemizedlist>
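<para>Zeroing a secret before freeing it can be sketched in plain C as follows. This is an illustration of the
general technique rather than libgdata's own code: <function>example_zero_memory()</function> and
<function>example_free_secret()</function> are hypothetical names, and the secret is assumed to be a nul-terminated
string allocated with <function>malloc()</function>. The write goes through a <literal>volatile</literal> pointer because
an optimising compiler may remove a plain <function>memset()</function> call on memory which is about to be freed.</para>
<informalexample><programlisting language="C"><![CDATA[
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Overwrite length bytes of buffer with zeroes. The volatile qualifier
 * discourages the compiler from eliding the stores as dead writes. */
static void
example_zero_memory (void *buffer, size_t length)
{
	volatile unsigned char *bytes = buffer;

	while (length-- > 0)
		*bytes++ = 0;
}

/* Zero a heap-allocated, nul-terminated secret, then free it. NULL is ignored. */
static void
example_free_secret (char *secret)
{
	if (secret == NULL)
		return;

	example_zero_memory (secret, strlen (secret));
	free (secret);
}
]]></programlisting></informalexample>
<para>Zeroing on free only lowers the chance of a secret surviving in memory; unlike non-pageable memory, it does
not stop the page from being swapped out while the secret is still in use.</para>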

<para>libgdata universally uses HTTPS rather than HTTP for communicating with servers. The port which is used may be changed
for testing purposes, using the <envar>LIBGDATA_HTTPS_PORT</envar> environment variable; but the protocol used will
always be HTTPS.</para>

<para>libgdata provides ways to upload and download files, but does not implement code for loading those files from, or
saving them to, disk. Since these files will typically be user data (such as their Google Drive documents), it is highly
recommended that applications give them restricted permissions, make any temporary files readable only by the current user,
and consider encrypting files on disk where appropriate. The aim is to avoid leaking user data to other users
of the system, or to attackers who gain access to the user’s hard drive (which may not be encrypted). libgdata itself
only guarantees that data is encrypted while being sent over the network.</para>
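<para>On POSIX systems, one way for an application to follow the recommendation on file permissions is to create
downloaded or temporary files with owner-only permissions from the outset. The following sketch assumes a POSIX
platform; <function>example_create_private_file()</function> and <function>example_save_private_file()</function> are
hypothetical helpers belonging to the application, not functions provided by libgdata.</para>
<informalexample><programlisting language="C"><![CDATA[
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create path for writing, readable and writable by the current user only.
 * O_EXCL makes the call fail rather than reuse a file which already exists.
 * Returns a file descriptor, or -1 on error. */
static int
example_create_private_file (const char *path)
{
	return open (path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}

/* Write length bytes of data to a newly created private file.
 * Returns 0 on success, or -1 on error (removing any partially written file). */
static int
example_save_private_file (const char *path, const void *data, size_t length)
{
	const unsigned char *bytes = data;
	int fd = example_create_private_file (path);

	if (fd < 0)
		return -1;

	while (length > 0) {
		ssize_t written = write (fd, bytes, length);

		if (written <= 0) {
			close (fd);
			unlink (path);
			return -1;
		}

		bytes += written;
		length -= (size_t) written;
	}

	return close (fd);
}
]]></programlisting></informalexample>
<para>Setting the mode at creation time avoids a window in which the file exists with wider permissions. Encrypting
the file contents, where appropriate, is a separate step which this sketch does not cover.</para>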
</refsect2>
</refsect1>
</refentry>