T A B L E   o f   C O N T E N T S
---------------------------------

1   Building Adobe XMPsdk and Samples in Terminal with the ./Generate_XXX_mac.sh scripts
1.1 Amazing Discovery 1    DumpFile is linked to libstdc++.6.dylib
1.2 Amazing Discovery 2    Millions of "weak symbol/visibility" messages

4   Build design for v0.26.1
4.8 Support for MinGW

5   Refactoring the Tiff Code
5.1 Background
5.2 How does Exiv2 decode the ExifData in a JPEG?
5.3 How is metadata organized in Exiv2
5.4 Where are the tags defined?
5.5 How do the MakerNotes get decoded?
5.6 How do the encoders work?

6   Using external XMP SDK via Conan

==========================================================================

4   Build design for v0.26.1

Added   : 2017-08-18
Modified: 2017-08-23

    The purpose of v0.26.1 is to release bug fixes and
    experimental new features which may become defaults in v0.27.

4.8 Support for MinGW
    MinGW msys/1.0 was deprecated when v0.26 was released.
    No support for MinGW msys/1.0 will be provided.
    It's very likely that Exiv2 will still build with MinGW msys/1.0,
    but I will not provide any user support for it in future.

    MinGW msys/2.0 might be supported as "experimental" in Exiv2 v0.26.2.


==========================================================================

5   Refactoring the Tiff Code

Added   : 2017-09-24
Modified: 2017-09-24

5.1 Background
    Tiff parsing is the core code of a metadata engine.

    The Tiff parsing code in Exiv2 is very difficult to understand and has major architectural shortcomings:

    1) It requires the Tiff file to be totally in memory
    2) It cannot handle BigTiff
    3) The parser doesn't know the source of the in-memory tiff image
    4) It uses memory mapping on the tiff file
       - if the network connection is lost, horrible things happen
       - it requires a lot of VM to map the complete file
       - BigTiff files can be 100GB+
       - the memory mapping causes problems with virus detection software on Windows
    5) The parser cannot deal with multi-page tiff files
    6) It requires the total file to be in contiguous memory and defeats 'webready'.
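
    Several of these shortcomings come from needing the whole file in memory.
    As a minimal sketch of the alternative, assuming a seekable std::istream,
    the code below reads only the header bytes.  The names TiffHeaderInfo,
    getUInt, readTiffHeader and headerFromBytes are hypothetical and are not
    part of Exiv2.  The layout used (byte order mark, magic 42 for classic Tiff
    or 43 for BigTiff, then the offset of the first IFD) is that of the Tiff
    and BigTiff file formats.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch; not an Exiv2 class.
struct TiffHeaderInfo {
    bool          littleEndian;  // true for "II", false for "MM"
    bool          bigTiff;       // true for magic 43, false for magic 42
    std::uint64_t firstIfd;      // offset of the first IFD from the start of the data
};

// Assemble an n-byte unsigned integer (1 <= n <= 8) in the given byte order.
std::uint64_t getUInt(const unsigned char* p, int n, bool littleEndian) {
    assert(n >= 1 && n <= 8);
    std::uint64_t v = 0;
    for (int i = 0; i < n; ++i) {
        v = (v << 8) | p[littleEndian ? n - 1 - i : i];
    }
    return v;
}

// Read the header through the stream; at most 16 bytes are read.
// Returns false if the data is not recognised as Tiff or BigTiff.
bool readTiffHeader(std::istream& in, TiffHeaderInfo& out) {
    unsigned char b[16] = {0};
    in.clear();
    in.seekg(0);
    if (!in.read(reinterpret_cast<char*>(b), 8)) return false;

    if (b[0] == 'I' && b[1] == 'I')      out.littleEndian = true;
    else if (b[0] == 'M' && b[1] == 'M') out.littleEndian = false;
    else return false;

    const std::uint64_t magic = getUInt(b + 2, 2, out.littleEndian);
    if (magic == 42) {  // classic Tiff: 4-byte offset in bytes 4..7
        out.bigTiff  = false;
        out.firstIfd = getUInt(b + 4, 4, out.littleEndian);
        return true;
    }
    if (magic == 43) {  // BigTiff: offset size (8), zero, 8-byte offset in bytes 8..15
        if (!in.read(reinterpret_cast<char*>(b + 8), 8)) return false;
        if (getUInt(b + 4, 2, out.littleEndian) != 8) return false;
        if (getUInt(b + 6, 2, out.littleEndian) != 0) return false;
        out.bigTiff  = true;
        out.firstIfd = getUInt(b + 8, 8, out.littleEndian);
        return true;
    }
    return false;
}

// Convenience wrapper for trying the sketch on bytes held in a std::string.
bool headerFromBytes(const std::string& bytes, TiffHeaderInfo& out) {
    std::istringstream in(bytes);
    return readTiffHeader(in, out);
}
```

    The same function would work on a std::ifstream, so a reader built this
    way could seek to each IFD in turn instead of mapping the complete file.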

    The Tiff parsing code in Exiv2 is ingenious.  It's also very robust.  It works well.  It can:

    1) Handle 32-bit Tiff and many Raw formats (which are derived from Tiff)
    2) Read and write manufacturers' MakerNotes which are (mostly) in Tiff format
    3) Probably do other great things that I haven't discovered
       - because the code is so hard to understand, I can't simply browse and read it.
    4) Separate file navigation from data analysis.

    The code in image::printStructure was originally written to understand "what is a tiff?"
    It has problems:
    1) It was intended to be a single-threaded debugging function and has security issues.
    2) It doesn't handle BigTiff
    3) It's messy.  It reads and processes metadata simultaneously.

    The aim of this project is to
    1) Reconsider the Tiff Code.
    2) Keep everything good in the code and address known deficiencies
    3) Establish a Team Exiv2 "Tiff Expert" who knows the code intimately.

5.2 How does Exiv2 decode the ExifData in a JPEG?
    You can get my test file from http://clanmills.com/Stonehenge.jpg

    808 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/build $ exiv2 -pS ~/Stonehenge.jpg
        STRUCTURE OF JPEG FILE: /Users/rmills/Stonehenge.jpg
         address | marker       |  length | data
               0 | 0xffd8 SOI
               2 | 0xffe1 APP1  |   15288 | Exif..II*......................
           15292 | 0xffe1 APP1  |    2610 | http://ns.adobe.com/xap/1.0/.
           17904 | 0xffed APP13 |      96 | Photoshop 3.0.8BIM.......'.....
           18002 | 0xffe2 APP2  |    4094 | MPF.II*...............0100.....
           22098 | 0xffdb DQT   |     132
           22232 | 0xffc0 SOF0  |      17
           22251 | 0xffc4 DHT   |     418
           22671 | 0xffda SOS
    809 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/build $

    Exiv2 calls JpegBase::readMetadata which locates the APP1/Exif segment.
    It invokes the ExifParser:
       ExifParser::decode(exifData_, rawExif.pData_, rawExif.size_);
    This is a thin wrapper over:
       TiffParserWorker::decode(....) in tiffimage.cpp

    What happens then?  I don't know.  The metadata is decoded in:
       tiffvisitor.cpp TiffDecoder::visitEntry()

    The design of the TiffMumble classes is the "Visitor" pattern
    described in "Design Patterns" by Gamma, Helm, Johnson and Vlissides
    (Addison-Wesley).  The aim of the pattern
    is to separate parsing from dealing with the data.

    The data is being stored in ExifData which is a vector.
    Order is important and preserved.
    As the data values are recovered they are stored as Exifdatum in the vector.

    How does the tiff visitor work?  I think the reader and processor
    are connected by this line in TiffParserWorker::parse():
        rootDir->accept(reader);

    The class tree for the decoder is:

    class TiffDecoder : public TiffFinder {
      class TiffReader ,
      class TiffFinder : public TiffVisitor {
        class TiffVisitor {
          public:
          //! Events for the stop/go flag. See setGo().
          enum GoEvent {
              geTraverse       = 0,
              geKnownMakernote = 1
          };

          void setGo(GoEvent event, bool go);
          virtual void visitEntry(TiffEntry* object) =0;
          virtual void visitDataEntry(TiffDataEntry* object) =0;
          virtual void visitImageEntry(TiffImageEntry* object) =0;
          virtual void visitSizeEntry(TiffSizeEntry* object) =0;
          virtual void visitDirectory(TiffDirectory* object) =0;
          virtual void visitSubIfd(TiffSubIfd* object) =0;
          virtual void visitMnEntry(TiffMnEntry* object) =0;
          virtual void visitIfdMakernote(TiffIfdMakernote* object) =0;
          virtual void visitIfdMakernoteEnd(TiffIfdMakernote* object);
          virtual void visitBinaryArray(TiffBinaryArray* object) =0;
          virtual void visitBinaryArrayEnd(TiffBinaryArray* object);
          //! Operation to perform for an element of a binary array
          virtual void visitBinaryElement(TiffBinaryElement* object) =0;

          //! Check if stop flag for \em event is clear, return true if it's clear.
          bool go(GoEvent event) const;
        }
      }
    }

    The reader works by stepping along the Tiff directory and calling the visitor's
    "callbacks" as it reads.

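    To make the mechanism concrete, here is a stripped-down sketch of the
    Visitor pattern with two component types.  It is not the Exiv2 code: the
    classes Visitor, Component, Entry, Directory and NameCollector and the
    function collectDemo are all hypothetical, and the real TiffVisitor has
    many more callbacks.  It only shows how accept() walks the tree while the
    visitor decides what to do with each node.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Entry;
struct Directory;

// Hypothetical visitor interface with one callback per component type.
struct Visitor {
    virtual ~Visitor() {}
    virtual void visitEntry(const Entry& entry) = 0;
    virtual void visitDirectory(const Directory& dir) = 0;
};

// Base class of the tree; accept() routes the visitor to the right callback.
struct Component {
    virtual ~Component() {}
    virtual void accept(Visitor& visitor) const = 0;
};

struct Entry : Component {
    std::string name;
    explicit Entry(const std::string& n) : name(n) {}
    void accept(Visitor& visitor) const override { visitor.visitEntry(*this); }
};

struct Directory : Component {
    std::string name;
    std::vector<std::unique_ptr<Component> > children;
    explicit Directory(const std::string& n) : name(n) {}
    // Navigation lives here: visit this directory, then every child in order.
    void accept(Visitor& visitor) const override {
        visitor.visitDirectory(*this);
        for (const auto& child : children) {
            assert(child != nullptr);
            child->accept(visitor);
        }
    }
};

// Data handling lives here: this visitor just records names in visit order.
struct NameCollector : Visitor {
    std::vector<std::string> names;
    void visitEntry(const Entry& entry) override { names.push_back(entry.name); }
    void visitDirectory(const Directory& dir) override { names.push_back(dir.name + "/"); }
};

// Build a tiny two-level tree and walk it with the collector.
std::vector<std::string> collectDemo() {
    Directory root("IFD0");
    root.children.emplace_back(new Entry("Make"));
    std::unique_ptr<Directory> exif(new Directory("ExifIFD"));
    exif->children.emplace_back(new Entry("ExposureTime"));
    root.children.push_back(std::move(exif));

    NameCollector collector;
    root.accept(collector);
    return collector.names;
}
```

    A second visitor (for example one that writes entries instead of
    collecting them) could be added without touching Entry or Directory,
    which is the separation of parsing from data handling that the pattern
    aims for.
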
    There are 2000 lines of code in tiffcomposite.cpp and, to be honest,
    I don't know what most of it does!

    Set a breakpoint in src/exif.cpp#571.
    That's where the key/value is added to the exifData vector.
    Exactly how did execution get here?  That's a puzzle.

    void ExifData::add(const ExifKey& key, const Value* pValue)
    {
        add(Exifdatum(key, pValue));
    }

5.3 How is metadata organized in Exiv2
    section.group.tag

    section: Exif | Iptc | Xmp
    group:   Photo | Image | MakerNote | Nikon3 ....
    tag:     YResolution etc ...
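
    As a small illustration of this naming scheme, the sketch below splits a
    key such as Exif.Image.Make into its three parts.  KeyParts, splitKey and
    joinKey are hypothetical helpers written for this note; they are not how
    Exiv2 itself parses keys.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the three parts of a key; not an Exiv2 type.
struct KeyParts {
    std::string section;  // e.g. "Exif"
    std::string group;    // e.g. "Image"
    std::string tag;      // e.g. "Make"
};

// Split "section.group.tag" at the first two dots.
// Returns false if any of the three parts is missing.
bool splitKey(const std::string& key, KeyParts& out) {
    const std::string::size_type dot1 = key.find('.');
    if (dot1 == std::string::npos) return false;
    const std::string::size_type dot2 = key.find('.', dot1 + 1);
    if (dot2 == std::string::npos) return false;

    out.section = key.substr(0, dot1);
    out.group   = key.substr(dot1 + 1, dot2 - dot1 - 1);
    out.tag     = key.substr(dot2 + 1);
    return !out.section.empty() && !out.group.empty() && !out.tag.empty();
}

// Rebuild the key text from its parts.
std::string joinKey(const KeyParts& parts) {
    assert(!parts.section.empty() && !parts.group.empty() && !parts.tag.empty());
    return parts.section + "." + parts.group + "." + parts.tag;
}
```

    This mirrors what the cut commands in the shell sessions do: field 1 is
    the section, field 2 the group and field 3 the tag.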

    820 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa ~/Stonehenge.jpg | cut -d' ' -f 1 | cut -d. -f 1 | sort | uniq
    Exif
    Iptc
    Xmp

    821 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Exif ~/Stonehenge.jpg  | cut -d'.' -f 2 | sort | uniq
    GPSInfo
    Image
    Iop
    MakerNote
    Nikon3
    NikonAf2
    NikonCb2b
    NikonFi
    NikonIi
    NikonLd3
    NikonMe
    NikonPc
    NikonVr
    NikonWt
    Photo
    Thumbnail

    822 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Exif ~/Stonehenge.jpg  | cut -d'.' -f 3 | cut -d' ' -f 1 | sort | uniq
    AFAperture
    AFAreaHeight
    AFAreaMode
    ...
    XResolution
    YCbCrPositioning
    YResolution
    823 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $

    The data in IFD0 is Exif.Image:

    826 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pR ~/Stonehenge.jpg  | head -20
    STRUCTURE OF JPEG FILE: /Users/rmills/Stonehenge.jpg
     address | marker       |  length | data
           0 | 0xffd8 SOI
           2 | 0xffe1 APP1  |   15288 | Exif..II*......................
      STRUCTURE OF TIFF FILE (II): MemIo
       address |    tag                              |      type |    count |    offset | value
            10 | 0x010f Make                         |     ASCII |       18 |       146 | NIKON CORPORATION
            22 | 0x0110 Model                        |     ASCII |       12 |       164 | NIKON D5300
            34 | 0x0112 Orientation                  |     SHORT |        1 |           | 1
            46 | 0x011a XResolution                  |  RATIONAL |        1 |       176 | 300/1
            58 | 0x011b YResolution                  |  RATIONAL |        1 |       184 | 300/1
            70 | 0x0128 ResolutionUnit               |     SHORT |        1 |           | 2
            82 | 0x0131 Software                     |     ASCII |       10 |       192 | Ver.1.00
            94 | 0x0132 DateTime                     |     ASCII |       20 |       202 | 2015:07:16 20:25:28
           106 | 0x0213 YCbCrPositioning             |     SHORT |        1 |           | 1
           118 | 0x8769 ExifTag                      |      LONG |        1 |           | 222
        STRUCTURE OF TIFF FILE (II): MemIo
         address |    tag                              |      type |    count |    offset | value
             224 | 0x829a ExposureTime                 |  RATIONAL |        1 |       732 | 10/4000
             236 | 0x829d FNumber                      |  RATIONAL |        1 |       740 | 100/10
    827 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep Image ~/Stonehenge.jpg
    Exif.Image.Make                              Ascii      18  NIKON CORPORATION
    Exif.Image.Model                             Ascii      12  NIKON D5300
    Exif.Image.Orientation                       Short       1  top, left
    Exif.Image.XResolution                       Rational    1  300
    Exif.Image.YResolution                       Rational    1  300
    Exif.Image.ResolutionUnit                    Short       1  inch
    Exif.Image.Software                          Ascii      10  Ver.1.00
    Exif.Image.DateTime                          Ascii      20  2015:07:16 20:25:28
    Exif.Image.YCbCrPositioning                  Short       1  Centered
    Exif.Image.ExifTag                           Long        1  222
    Exif.Nikon3.ImageBoundary                    Short       4  0 0 6000 4000
    Exif.Nikon3.ImageDataSize                    Long        1  6173648
    Exif.NikonAf2.AFImageWidth                   Short       1  0
    Exif.NikonAf2.AFImageHeight                  Short       1  0
    Exif.Photo.ImageUniqueID                     Ascii      33  090caaf2c085f3e102513b24750041aa
    Exif.Image.GPSTag                            Long        1  4060
    828 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $
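
    In the -pR listing the entry addresses advance by 12 (10, 22, 34, ...),
    which matches the 12-byte IFD entry of the Tiff format: 2 bytes tag,
    2 bytes type, 4 bytes count and 4 bytes holding either the value itself
    or the offset to it.  The sketch below decodes one such entry.  IfdEntry,
    getUInt, parseIfdEntry, typeSize and isInline are hypothetical names for
    this note and are not the Exiv2 implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical decoded form of one 12-byte IFD entry; not an Exiv2 type.
struct IfdEntry {
    std::uint16_t tag;       // e.g. 0x010f for Make
    std::uint16_t type;      // Tiff type code, e.g. 2 = ASCII, 3 = SHORT
    std::uint32_t count;     // number of values of that type
    unsigned char value[4];  // raw bytes: the value itself, or an offset to it
};

// Assemble an n-byte unsigned integer (1 <= n <= 4) in the given byte order.
std::uint32_t getUInt(const unsigned char* p, int n, bool littleEndian) {
    assert(n >= 1 && n <= 4);
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i) {
        v = (v << 8) | p[littleEndian ? n - 1 - i : i];
    }
    return v;
}

// Decode the 12 bytes starting at p.
IfdEntry parseIfdEntry(const unsigned char* p, bool littleEndian) {
    IfdEntry e;
    e.tag   = static_cast<std::uint16_t>(getUInt(p, 2, littleEndian));
    e.type  = static_cast<std::uint16_t>(getUInt(p + 2, 2, littleEndian));
    e.count = getUInt(p + 4, 4, littleEndian);
    std::memcpy(e.value, p + 8, 4);
    return e;
}

// Size in bytes of one value of a Tiff type (codes 1..12), or 0 if unknown.
unsigned typeSize(std::uint16_t type) {
    static const unsigned sizes[13] = {0, 1, 1, 2, 4, 8, 1, 1, 2, 4, 8, 4, 8};
    return type <= 12 ? sizes[type] : 0;
}

// True if the data fits in the 4 value bytes, so no offset is involved.
bool isInline(const IfdEntry& e) {
    const unsigned size = typeSize(e.type);
    return size != 0 && static_cast<std::uint64_t>(e.count) * size <= 4;
}
```

    This is why the offset column is empty for Orientation (one SHORT fits in
    the entry) but holds 146 for Make (18 ASCII bytes do not).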

    The data in the Exif IFD (the sub-IFD that ExifTag points to) is Exif.Photo.

    The data in the MakerNote is another embedded TIFF (which contains more embedded tiffs)

    829 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $ exiv2 -pa --grep MakerNote ~/Stonehenge.jpg
    Exif.Photo.MakerNote                         Undefined 3152  (Binary value suppressed)
    Exif.MakerNote.Offset                        Long        1  914
    Exif.MakerNote.ByteOrder                     Ascii       3  II
    830 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2/src $

    Exiv2 decodes the MakerNote into groups such as:

    Exif.Nikon3, Exif.NikonAf2 and so on.  I don't know exactly how it achieves this.
    However it means that tag-numbers can be reused in different IFDs.
    Tag 0x0016 = Nikon GPSSpeed and can mean something different elsewhere.
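
    One way to picture tag-number reuse is a lookup keyed by the pair
    (group, tag number) rather than by the number alone.  The sketch below
    assumes such a table.  TagNames and GroupTag are hypothetical and say
    nothing about how Exiv2 stores its tags.  The two sample names used in the
    note that follows are standard Exif ones: tag 0x0001 is GPSLatitudeRef in
    the GPSInfo group and InteroperabilityIndex in the Iop group.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical lookup key: the group name plus the tag number.
typedef std::pair<std::string, std::uint16_t> GroupTag;

// Hypothetical name table; the same number may appear once per group.
class TagNames {
public:
    void add(const std::string& group, std::uint16_t tag, const std::string& name) {
        assert(!group.empty() && !name.empty());
        names_[GroupTag(group, tag)] = name;
    }

    // Returns the tag name, or an empty string if the pair is unknown.
    std::string find(const std::string& group, std::uint16_t tag) const {
        const std::map<GroupTag, std::string>::const_iterator it =
            names_.find(GroupTag(group, tag));
        return it == names_.end() ? std::string() : it->second;
    }

private:
    std::map<GroupTag, std::string> names_;
};
```

    After add("GPSInfo", 0x0001, "GPSLatitudeRef") and
    add("Iop", 0x0001, "InteroperabilityIndex"), the same number resolves to a
    different name depending on the group it is looked up in.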

5.4 Where are the tags defined?

    There's an array of "TagInfo" data structures in each of the makernote decoders.
    These define the tag (a number) and the tag name, the groupID (e.g. canonId) and the default type.
    There's also a callback to print the value of the tag.  This does the "interpretation"
    that is performed by the -pt option of the exiv2 command-line program.

    TagInfo(0x4001, "ColorData", N_("Color Data"), N_("Color data"), canonId, makerTags, unsignedShort, -1, printValue),
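
    The idea of a table row that carries its own print callback can be shown
    with a much smaller structure.  This is only a sketch: TagRow, PrintFct,
    kRows and interpret are hypothetical, and the real TagInfo has more fields
    and a different callback signature.  The three translations used are the
    ones visible in the -pa output in 5.3 (Orientation 1 is "top, left",
    ResolutionUnit 2 is "inch", YCbCrPositioning 1 is "Centered").

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical print callback: turns a raw value into display text.
typedef std::string (*PrintFct)(long value);

// Hypothetical, much reduced stand-in for a TagInfo row.
struct TagRow {
    std::uint16_t tag;
    const char*   name;
    PrintFct      print;
};

// Fallback: print the number unchanged.
std::string printRaw(long value) {
    std::ostringstream os;
    os << value;
    return os.str();
}

std::string printOrientation(long value) {
    if (value == 1) return "top, left";
    return printRaw(value);
}

std::string printResolutionUnit(long value) {
    if (value == 2) return "inch";
    return printRaw(value);
}

std::string printYCbCrPositioning(long value) {
    if (value == 1) return "Centered";
    return printRaw(value);
}

// The table: each tag number is paired with its name and its print callback.
const TagRow kRows[] = {
    {0x0112, "Orientation",      printOrientation},
    {0x0128, "ResolutionUnit",   printResolutionUnit},
    {0x0213, "YCbCrPositioning", printYCbCrPositioning},
};

// Look the tag up and let its callback interpret the value.
std::string interpret(std::uint16_t tag, long value) {
    for (std::size_t i = 0; i < sizeof(kRows) / sizeof(kRows[0]); ++i) {
        if (kRows[i].tag == tag) {
            assert(kRows[i].print != nullptr);
            return kRows[i].print(value);
        }
    }
    return printRaw(value);  // unknown tag: show the raw number
}
```

    Values that the sketch does not know about fall through to printRaw, so an
    unrecognised value is shown as a number rather than guessed at.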

5.5 How do the MakerNotes get decoded?

    I don't know.  It has something to do with this code in tiffcomposite.cpp#936

    TiffMnEntry::doAccept(TiffVisitor& visitor) { ... }

    Most makernotes are TiffStructures.  So the TiffXXX classes are invoked recursively to decode the maker note.

#0	0x000000010058b4b0 in Exiv2::Internal::TiffDirectory::doAccept(Exiv2::Internal::TiffVisitor&) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffcomposite.cpp:916
    This function iterates over the array of entries

#1	0x000000010058b3c6 in Exiv2::Internal::TiffComponent::accept(Exiv2::Internal::TiffVisitor&) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffcomposite.cpp:891
#2	0x00000001005b5357 in Exiv2::Internal::TiffParserWorker::parse(unsigned char const*, unsigned int, unsigned int, Exiv2::Internal::TiffHeaderBase*) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:2006
    This function creates an array of TiffEntries

#3	0x00000001005a2a60 in Exiv2::Internal::TiffParserWorker::decode(Exiv2::ExifData&, Exiv2::IptcData&, Exiv2::XmpData&, unsigned char const*, unsigned int, unsigned int, void (Exiv2::Internal::TiffDecoder::* (*)(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, Exiv2::Internal::IfdId))(Exiv2::Internal::TiffEntryBase const*), Exiv2::Internal::TiffHeaderBase*) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:1900
#4	0x00000001005a1ae9 in Exiv2::TiffParser::decode(Exiv2::ExifData&, Exiv2::IptcData&, Exiv2::XmpData&, unsigned char const*, unsigned int) at /Users/rmills/gnu/github/exiv2/exiv2/src/tiffimage.cpp:260
#5	0x000000010044d956 in Exiv2::ExifParser::decode(Exiv2::ExifData&, unsigned char const*, unsigned int) at /Users/rmills/gnu/github/exiv2/exiv2/src/exif.cpp:625
#6	0x0000000100498fd7 in Exiv2::JpegBase::readMetadata() at /Users/rmills/gnu/github/exiv2/exiv2/src/jpgimage.cpp:386
#7	0x000000010000bc59 in Action::Print::printList() at /Users/rmills/gnu/github/exiv2/exiv2/src/actions.cpp:530
#8	0x0000000100005835 in Action::Print::run(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) at /Users/rmills/gnu/github/exiv2/exiv2/src/actions.cpp:245


5.6 How do the encoders work?

    I understand writeMetadata() and will document that soon.
    I still have to study how the TiffVisitor writes metadata.


6   Using external XMP SDK via Conan

Section 1 describes how to compile the newer versions of XMP SDK with a bash script. This
approach had a few limitations:

    1) We had to include sources from other projects into the Exiv2 repository: Check the folder
    xmpsdk/third-party.
    2) Different scripts for compiling XMP SDK on Linux, Mac OSX and Windows.
    3) Lots of configuration/compilation issues depending on the system configuration.

Taking into account that over the last few months we have made a big effort to migrate the
handling of 3rd party dependencies to Conan, we have decided to do the same here. A conan recipe
has been written for XmpSdk at:

https://github.com/piponazo/conan-xmpsdk

The recipe and package binaries can be found in piponazo's bintray repository:

https://bintray.com/piponazo/piponazo

This conan recipe provides a custom CMake finder that will be used by our CMake code to
find XMP SDK in the conan cache, so that the CMake variables ${XMPSDK_LIBRARY} and
${XMPSDK_INCLUDE_DIR} can be used.

These are the steps you will need to follow to configure the project with the external XMP support:

    # Add the conan-piponazo remote to your conan configuration (only once)
    conan remote add conan-piponazo https://api.bintray.com/conan/piponazo/piponazo

    mkdir build && cd build

    # Run conan to fetch the dependencies. Note that the XMPSDK is not enabled by default and you will
    # need to enable the xmp option to fetch it.
    conan install .. --options xmp=True

    # Configure the project with support for the external XMP version. Disable the normal XMP version
    cmake -DCMAKE_BUILD_TYPE=Release -DEXIV2_ENABLE_XMP=OFF -DEXIV2_ENABLE_EXTERNAL_XMP=ON -DBUILD_SHARED_LIBS=ON ..

Note that the usage of the newer versions of XMP is experimental and it was included in Exiv2
because a few users have requested it.