collective.pdfpeek integration doctest
======================================

This test is an integration test that uses PloneTestCase. Here, 'self' is
the test class, so we can use 'self.folder', 'portal' and so on. The
setup is done in tests/test_integration_doctests.py

We first test that the low level machinery of the PDF to image transform works,
we then test our event handlers to see if they fire the transform.

Setup:
------

Do some setup for plone.app.testing

    >>> from plone.app.testing import login
    >>> from plone.app.testing import TEST_USER_NAME
    >>> from plone.app.testing import TEST_USER_ID
    >>> from plone.app.testing import setRoles
    >>> portal = layer['portal']
    >>> setRoles(portal, TEST_USER_ID, ['Manager'])

log in as the portal owner:

    >>> login(portal, TEST_USER_NAME)

create a few file objects to work with:

    >>> portal.invokeFactory('File', id='test_pdf', title='Test PDF File')
    'test_pdf'

create another file object we keep empty for later:

    >>> portal.invokeFactory('File', id='test_pdf_2', title='Second Test PDF File')
    'test_pdf_2'

create a file that will be used to test the image sizes:

    >>> portal.invokeFactory('File', id='test_image_size', title='Image Test PDF File')
    'test_image_size'

Testing the inner-workings of the collective.pdfpeek.transforms module:
-----------------------------------------------------------------------

Put some content in the file object (yes, we're basically testing ATFile here, what?):

    >>> portal.test_pdf.setFile('this is a test')
    >>> portal.test_pdf_2.setFile('this is another test for later')
    >>> portal.test_pdf.getFile().get_data()
    'this is a test'

Let's get the current path and pass in the path with the test pdf file in
the tests/ directory called plone.pdf:

    >>> def mydir():
    ...     import os.path, sys
    ...     if __name__ == '__main__':
    ...         filename = sys.argv[0]
    ...     else:
    ...         filename = __file__
    ...     return os.path.abspath(os.path.dirname(filename))
    >>> file_path = mydir() + """/data/plone.pdf"""
    >>> pdf_file = open(file_path, mode='rb')

Ok, now put a PDF file in the file object. Now we store the pdf_file we just opened
on the first ATFile object we created:

    >>> portal.test_pdf.setFile(pdf_file)

Let's check to be sure we've got the PDF in the ATFile object:

    >>> portal.test_pdf.getFile().get_data()
    '%PDF-1.5\r%\xe2\xe3\xcf\xd3\r\n10 0 obj\r...>>stream\r\nh\xdebb\x00\x01&FFCC\x06& \xab\x15D\xf2W\x82\xd9= \x92Q\x16(\xfb\x7f\xbf&X\x84\x81\x11D2\xfd\x07\x91\x8c\x0c\x00\x01\x06\x00\x86.\x05\x1b\rendstream\rendobj\rstartxref\r116\r%%EOF\r'

Get the mime type of the file stored in the ATFile object:

    >>> field = portal.test_pdf.getField('file')
    >>> field.getContentType(portal.test_pdf)
    'application/pdf'

Check that the mime type of the file with no pdf is text/plain:

    >>> field2 = portal.test_pdf_2.getField('file')
    >>> field2.getContentType(portal.test_pdf_2)
    'text/plain'

Now initialize an instance of the transform class which will convert
the pdf stored on the ATFile object to one image per page:

    >>> from collective.pdfpeek.interfaces import IPDFDataExtractor
    >>> converter = IPDFDataExtractor(portal.test_pdf)

Now try the converter with the good data, it should work:

    >>> images = converter.get_thumbnails(0, converter.pages)

And store the list of jpegs on the ATFile object as an annotation.

    >>> from zope.annotation.interfaces import IAnnotations
    >>> from zope.annotation.interfaces import IAttributeAnnotatable
    >>> from zope.interface import alsoProvides
    >>> alsoProvides(portal.test_pdf, IAttributeAnnotatable)
    >>> annotations = IAnnotations(portal.test_pdf)
    >>> annotations['pdfpeek'] = {}
    >>> annotations['pdfpeek']['image_thumbnails'] = images

OK, now let's try to access the annotation on the object:

    >>> portal.test_pdf.__annotations__
    <BTrees.OOBTree.OOBTree object at ...>

Let's put the annotations in a dict:

    >>> dict(portal.test_pdf.__annotations__)
    {'pdfpeek': {'image_thumbnails': {'1_preview': <Image at 1_preview>, '1_thumb': <Image at 1_thumb>}}, 'Archetypes.storage.AnnotationStorage-file': ...}

Let's set the image using our image test file:

    >>> portal.test_image_size.setFile(pdf_file)

Let's check the default image sizes:

    >>> from zope.component import getUtility
    >>> from plone.registry.interfaces import IRegistry
    >>> from collective.pdfpeek.interfaces import IPDFPeekConfiguration
    >>> registry = getUtility(IRegistry)
    >>> config = registry.forInterface(IPDFPeekConfiguration)
    >>> config.preview_width, config.preview_length
    (512, 512)
    >>> config.thumbnail_width, config.thumbnail_length
    (128, 128)

Let's generate some images:

    >>> converter = IPDFDataExtractor(portal.test_image_size)
    >>> images = converter.get_thumbnails(0, converter.pages)

The height of the preview should have been shrunk to `512`, likewise
the thumbnail should have shrunk to `128`:

    >>> images['1_preview'].height
    512
    >>> images['1_thumb'].height
    128

Now if we set the sizes smaller:

    >>> config.preview_width = 9999
    >>> config.preview_length = 256
    >>> config.thumbnail_width = 9999
    >>> config.thumbnail_length = 50

Let's regenerate the images and see what sizes are set:

    >>> converter = IPDFDataExtractor(portal.test_image_size)
    >>> images = converter.get_thumbnails(0, converter.pages)
    >>> images['1_preview'].height
    256
    >>> images['1_thumb'].height
    50

Testing collective.pdfpeek's event handler subsystem:
-----------------------------------------------------

So the converter works, let's try creating an ATFile object, the object
should get the pdfpeek annotation when we add a pdf file to it and fire the proper event.

    >>> portal.invokeFactory('File', id='test_pdf_3', title='Yet Another Test PDF File')
    'test_pdf_3'

OK, we've got another ATFile object, let's input the pdf file:

    >>> portal.test_pdf_3.setFile(pdf_file)

We have the plone.pdf file stored in this third ATFile object, let's notify
our event handler that the object has been edited; the event handler should detect
the event and fire the transform, annotating the results on the ATFile object:

    >>> from Products.Archetypes.event import ObjectEditedEvent
    >>> from zope.event import notify
    >>> import transaction
    >>> notify(ObjectEditedEvent(portal.test_pdf_3))

OK, now we should have a conversion job in the queue:

    >>> from collective.pdfpeek import async
    >>> queue_name = 'collective.pdfpeek.conversion_' + portal.id
    >>> queue = async.get_queue(queue_name)
    >>> queue.pending
    [<collective.pdfpeek.async.Job object at ...>]

Let's process the queue:

    >>> transaction.commit()

Now we need to visit the url that will run the method that processes the queue:

    >>> from plone.testing.z2 import Browser
    >>> portal.error_log._ignored_exceptions = ()
    >>> browser = Browser(layer['app'])
    >>> portal_url = portal.absolute_url()

Let's login as the portal manager:

    >>> from plone.app.testing import SITE_OWNER_NAME, SITE_OWNER_PASSWORD
    >>> browser.open(portal_url + "/login")
    >>> browser.getControl(name='__ac_name').value = SITE_OWNER_NAME
    >>> browser.getControl(name='__ac_password').value = SITE_OWNER_PASSWORD
    >>> browser.getControl(name='submit').click()
    >>> portal.error_log._ignored_exceptions = ()
    >>> queue_process_url = portal_url + "/@@pdfpeek.utils/process_conversion_queue"
    >>> browser.open(queue_process_url)
    >>> "1 Jobs in queue. Processing queue..." in browser.contents
    True

The pending queue should now be empty:

    >>> queue.pending
    []

The job should have been added to the finished queue:

    >>> queue.finished
    [<collective.pdfpeek.async.Job object at ...>]

Now we should have the annotation on the object because the event handler fired:

    >>> portal.test_pdf_3.__annotations__
    <BTrees.OOBTree.OOBTree object at ...>

Ok, so we have the annotations on there, but do they contain what we expect? Let's see:

    >>> annotations = dict(portal.test_pdf_3.__annotations__)
    >>> annotations
    {'pdfpeek': <BTrees.OOBTree.OOBTree object at ...>, 'Archetypes.storage.AnnotationStorage-file': <plone.app.blob.field.BlobWrapper object at ...>, 'Archetypes.storage.AnnotationStorage-title': u'Yet Another Test PDF File'}
    >>> annotations['pdfpeek']['image_thumbnails']
    {'1_preview': <Image at 1_preview>, '1_thumb': <Image at 1_thumb>}
    >>> image = annotations['pdfpeek']['image_thumbnails']['1_preview']
    >>> image
    <Image at 1_preview>
    >>> image.getContentType()
    'image/png'

Check metadata stored into annotations

    >>> 'metadata' in annotations['pdfpeek']
    True
    >>> annotations['pdfpeek']['metadata']
    {'/Title': u'Plone CMS: Open Source Content Management',
     '/CreationDate': u"D:20090416164855-07'00'",
     '/Producer': u'Acrobat Distiller 9.0.0 (Macintosh)',
     '/ModDate': u"D:20090416164855-07'00'",
     '/Author': u'David Brenneman',
     '/Creator': u'firefox-bin: cgpdftops CUPS filter',
     'width': 612.0,
     'height': 792.0,
     'pages': 1}

Hooray, the annotations are there after the event is fired, and they contain what we expect,
the list of images output by the transform!

Now let's try putting some non-pdf content in the file object:

First we open a text file to replace the pdf with:

    >>> text_file_path = mydir() + """/data/plone.txt"""
    >>> text_file = open(text_file_path, mode='rb')
    >>> text_file
    <open file ...>

    >>> portal.test_pdf_3.setFile(text_file)
    >>> portal.test_pdf_3.getContentType()
    'text/plain'

Then let's notify the subscriber / event handler that our file object has changed:

    >>> notify(ObjectEditedEvent(portal.test_pdf_3))

OK, now we should have an image annotation removal job in the queue:

    >>> queue.pending
    [<collective.pdfpeek.async.Job object at ...>]

Let's process the queue:

    >>> transaction.commit()

Now we need to visit the url that will run the method that processes the queue:

Let's login as the portal manager:

    >>> browser.open(portal_url + "/login")
    >>> browser.getControl(name='__ac_name').value = SITE_OWNER_NAME
    >>> browser.getControl(name='__ac_password').value = SITE_OWNER_PASSWORD
    >>> browser.getControl(name='submit').click()
    >>> portal.error_log._ignored_exceptions = ()
    >>> queue_process_url = portal_url + "/@@pdfpeek.utils/process_conversion_queue"
    >>> browser.open(queue_process_url)
    >>> "1 Jobs in queue. Processing queue..." in browser.contents
    True

The pending queue should now be empty:

    >>> queue.pending
    []

The job should have been added to the finished queue:

    >>> queue.finished
    [<collective.pdfpeek.async.Job object at ...>]

Now there shouldn't be any annotations any more:

    >>> dict(portal.test_pdf_3.__annotations__)
    {'Archetypes.storage.AnnotationStorage-file': <plone.app.blob.field.BlobWrapper object at ...>, 'Archetypes.storage.AnnotationStorage-title': u'Yet Another Test PDF File'}
