A calibre plugin containing an OPDS client that can import books into calibre


Calibre OPDS client

What's this?

How do I install it?

How do I use it?

What's this?

This is a calibre plugin that is an OPDS client intended to read the contents of another calibre installation, find the differences to the current calibre, and offer to copy books from the other calibre into the current calibre.

How do I install it?

Requires git and calibre installed:

  • Clone the repository:
    #+BEGIN_EXAMPLE
    git clone https://github.com/steinarb/opds-reader.git
    #+END_EXAMPLE
  • Install the plugin in calibre:
    #+BEGIN_EXAMPLE
    cd opds-reader/calibre_plugin/
    calibre-customize -b .
    #+END_EXAMPLE
  • Start calibre (if calibre was already running, stop calibre and start it again)
  • Click the button "Preferences"
  • In the dialog "calibre - Preferences":
    • Under "Interface", click on the button "Toolbar"
  • In the dialog "calibre - Preferences - Toolbar":
    • In the dropdown, select "The main toolbar"
    • In "Available actions" scroll down to find "OPDS Client" and select it
    • Click the top arrow button (arrow pointing right)
    • Click the "Apply" button
  • Click the "Close" button
How do I use it?

I made this tool to back up my book collection between two PCs in my home LAN, and that is the procedure I will document here:

  • In the calibre you wish to copy from (in this example called calibre1.home.lan):
    • Click "Preferences"
    • In the "calibre - Preferences" dialog:
      • Click "Sharing over the net"
    • In the "calibre - Preferences - Sharing over the net" dialog:
      • Click the "Start Server" button
      • Select the checkbox "Run server automatically when calibre starts"
      • Click the "Apply" button
    • Click the "Close" button
  • In the calibre you wish to copy to:
    • Install this plugin (see the "How do I install it?" section)
    • Click the "OPDS client" button
    • In the "OPDS client" dialog:
      • Edit the "OPDS URL" value, change
        http://localhost:8080/opds
        to
        http://calibre1.home.lan:8080/opds
        and then press the RETURN key on the keyboard
      • Click the "Download OPDS" button
      • Wait until the OPDS feed has finished loading (this may take some time if there is a large number of books to load)
      • Note: if no books appear, try unchecking the "Hide books already in the library" checkbox. If that makes a lot of books appear, it means that the two calibre instances have the same books
      • Select the books you wish to copy into the current calibre and click the "Download selected books" button
    • calibre will start downloading and installing the books:
      • The Jobs counter in calibre's lower right corner will show a decrementing number and the icon will spin
      • The book list will be updated as the books are downloaded
    • The downloaded books will be in approximately the same order as in the original, but the time stamp will be the download time. To fix the time stamp, click on the "Fix timestamps of the selection" button
    • The updated timestamps may not show up immediately, but they will show up after the first update of the display, and the books will be ordered according to the timestamp after stopping and starting calibre

License

This calibre plugin is copyright Steinar Bang, 2015, and licensed under GPL version 3.

    List of TODO items [27/41]

    DONE Create an icon

    DONE Put a list of books into the dialog

    DONE Read RSS from the plugin and put the resulting data into a data structure compatible with the calibre db API

    DONE Populate the list in the GUI with data read from the RSS

    DONE Make the list of books look a little better (resize the dialog to make room for everything)

    DONE Add a checkbox to filter out newspapers

    DONE Add a checkbox to filter out books already present in the library

    DONE Make the book data model be Metadata (add a field to hold the parsed OPDS structure), and parse all available metadata info

    DONE Add a download button to download the selected books

    DONE Fix line height after updates

    DONE Move OPDS reading to the model and use the model refresh instead of setting a new model

    DONE Restore the authors to the OPDS book list

    DONE Make sure all of the books in the library are listed

    DONE Give feedback on the number of OPDS books downloaded

    DONE Get a display value back for "updated"

    DONE Reverse the order of requested downloads

    DONE Set the date/time of the copied book to the date of the original

    See the LICENSE file in the repository root for more detail on the GPLv3 licensing.

  • The plugin API seems to mandate an icon for a plugin that adds an action to the GUI
  • Need an icon that doesn't have any copyright limitations; it doesn't have to be pretty
  • <2015-08-23 søn 08:50> Will use a QTableView
    • import:
      #+BEGIN_SRC python
      from PyQt5.Qt import QTableView
      #+END_SRC
    • Tried using the derived class BookView, but this failed because the parent (OpdsDialog) was missing the field iactions
  • Will try filling the QTableView with BooksModel, which derives from QAbstractTableModel
    #+BEGIN_SRC python
    from calibre.gui2.library.models import BooksModel
    #+END_SRC
  • The argument to BooksModel.setData will be a list of SearchResult instances
    #+BEGIN_SRC python
    from calibre.gui2.store.search_result import SearchResult
    #+END_SRC
  • SearchResult contains the following fields:
    • store_name
    • cover_url
    • cover_data
    • title
    • author
    • price
    • detail_item
    • drm
    • formats
    • downloads (dictionary)
    • affiliate (boolean)
    • plugin_author
    • create_browser
  • SearchResult equality is determined in the following way:
    #+BEGIN_SRC python
    # Equality method of SearchResult: two results are equal when all four fields match
    def __eq__(self, other):
        return (self.title == other.title
                and self.author == other.author
                and self.store_name == other.store_name
                and self.formats == other.formats)
    #+END_SRC
  • The "formats" part of the comparison may be an issue when comparing with the current database. A comparison that excludes formats may be needed, e.g. I may want to keep ORIGINAL_EPUB in the calibre where I did the conversion, but not bother copying it to other calibres
  • An example of creating a list of SearchResult instances is in calibre, in MobileReadStore.deserialize_books
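One way to sidestep the "formats" concern above is a comparison that simply leaves formats out. The following is a minimal sketch and not calibre code: FakeResult is a hypothetical stand-in for SearchResult that carries only the four compared fields, and same_book is a hypothetical helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for calibre's SearchResult, holding only the fields
# that take part in the equality check quoted above.
FakeResult = namedtuple("FakeResult", "title author store_name formats")


def same_book(a, b):
    """Compare two results on title, author and store_name, ignoring formats."""
    return (a.title, a.author, a.store_name) == (b.title, b.author, b.store_name)


local = FakeResult("The Ice King", "Michael Scott Rohan & Allan J. Scott",
                   "calibre", "EPUB, ORIGINAL_EPUB")
remote = FakeResult("The Ice King", "Michael Scott Rohan & Allan J. Scott",
                    "calibre", "EPUB")

print(local == remote)           # False: the formats differ
print(same_book(local, remote))  # True: formats are left out of the comparison
```

With a helper like this, a book that only differs by an extra ORIGINAL_EPUB would still count as already present.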
  • <2015-08-23 søn 10:17> Will try putting the resulting data into SearchResult initially (as mentioned in the previous TODO item)
  • <2015-08-23 søn 11:05> The feedparser is already present in calibre
    • import statement:
      #+BEGIN_SRC python
      from calibre.web.feeds.feedparser import parse
      #+END_SRC
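The feedparser bundled with calibre can only be imported inside calibre, so the sketch below swaps in the standard library's xml.etree.ElementTree to show the same idea: turning the entries of an Atom/OPDS feed into plain Python data. SAMPLE_FEED and read_entries are hypothetical names made up for this illustration, and the second entry is invented.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical two-entry feed, shaped like an Atom/OPDS document.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample catalog</title>
  <entry>
    <title>The Ice King</title>
    <author><name>Michael Scott Rohan &amp; Allan J. Scott</name></author>
    <link href="http://localhost:8080/get/epub/460"
          type="application/epub+zip"
          rel="http://opds-spec.org/acquisition"/>
  </entry>
  <entry>
    <title>Another Book</title>
    <author><name>Some Author</name></author>
    <link href="http://localhost:8080/get/epub/461"
          type="application/epub+zip"
          rel="http://opds-spec.org/acquisition"/>
  </entry>
</feed>
"""


def read_entries(xml_text):
    """Return a list of dicts with the title, author and links of each entry."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall(ATOM + "entry"):
        entries.append({
            "title": entry.findtext(ATOM + "title"),
            "author": entry.findtext(ATOM + "author/" + ATOM + "name"),
            # Each link keeps its href, type and rel attributes as a dict.
            "links": [dict(link.attrib) for link in entry.findall(ATOM + "link")],
        })
    return entries


entries = read_entries(SAMPLE_FEED)
for e in entries:
    print(e["title"], "by", e["author"])
```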
  • <2015-08-23 søn 12:02> The BooksModel used in view.py is directly connected to the database, so that BooksModel can't be used
  • Instead created a new BooksModel patterned on the one in the mobileread store
  • Use a new data structure OpdsBook instead of SearchResult
  • <2015-08-23 søn 10:18> Hopefully populating the list will be as simple as calling BooksModel.setData
  • <2015-09-04 fre 21:44> The download button makes a list of all book download links, with EPUB first if found, and downloads the first URL in the list
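The "EPUB first" ordering can be sketched as a stable sort over the acquisition links of an entry. This is an assumption about how such a list could be built, not the plugin's actual code: order_download_links is a hypothetical name, the links are dicts shaped like the feedparser output shown elsewhere in these notes, and the PDF link is made up for the example.

```python
ACQUISITION = "http://opds-spec.org/acquisition"
EPUB = "application/epub+zip"


def order_download_links(links):
    """Return the acquisition links with EPUB links first, otherwise in feed order."""
    acquisition = [link for link in links if link.get("rel") == ACQUISITION]
    # sorted() is stable and False sorts before True, so EPUB links move to
    # the front while all other links keep their original relative order.
    return sorted(acquisition, key=lambda link: link.get("type") != EPUB)


links = [
    {"href": "http://localhost:8080/get/pdf/460", "type": "application/pdf", "rel": ACQUISITION},
    {"href": "http://localhost:8080/get/epub/460", "type": EPUB, "rel": ACQUISITION},
    {"href": "http://localhost:8080/get/cover/460", "type": "image/jpeg",
     "rel": "http://opds-spec.org/cover"},
]

ordered = order_download_links(links)
print(ordered[0]["href"])  # http://localhost:8080/get/epub/460
```

Cover and thumbnail links are dropped because their rel is not the acquisition relation.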
  • <2015-09-05 lør 10:34> calibre is set up by default to deliver only 30 items at a time
  • <2015-09-05 lør 23:20> Now following the "next" links of the feed until there are no more "next" links
  • <2015-09-06 søn 12:43> Perhaps update the list after each 30 book chunk has been added?
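Following the "next" links can be sketched as a generator that yields one page at a time, which also leaves room for refreshing the list after each chunk. The shape of a page (a dict with "entries" and "links") and the fetch callable are assumptions made for this example; in the plugin the pages would come from the feed parser.

```python
def iter_feed_pages(first_url, fetch):
    """Yield every page of a paged feed by following rel="next" links."""
    url = first_url
    seen = set()  # guards against a feed whose "next" links form a loop
    while url and url not in seen:
        seen.add(url)
        page = fetch(url)
        yield page
        url = next(
            (link["href"] for link in page.get("links", []) if link.get("rel") == "next"),
            None,
        )


# Hypothetical two-page feed standing in for a remote calibre.
PAGES = {
    "/opds/newest": {
        "entries": ["book 1", "book 2"],
        "links": [{"rel": "next", "href": "/opds/newest?offset=2"}],
    },
    "/opds/newest?offset=2": {"entries": ["book 3"], "links": []},
}

books = [entry
         for page in iter_feed_pages("/opds/newest", PAGES.__getitem__)
         for entry in page["entries"]]
print(books)  # ['book 1', 'book 2', 'book 3']
```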
  • <2015-09-06 søn 12:28> The display value for "updated" is back, but now there is also a value for the initial empty lines
  • <2015-09-06 søn 08:12> The idea behind reversing the order of requested downloads is that what's started first will finish first, and that this will give the same book order in the two calibres
  • <2015-09-05 lør 23:08> The date returned by "Updated" was the same for all books: the date/time of the last change to the db of the remote calibre
  • Opened an issue to replace this with the Metadata.timestamp attribute of the book: https://bugs.launchpad.net/bugs/1492651
  • <2015-09-06 søn 09:25> Use the ajax.py REST API to download the metadata for the remote book
  • Endpoints are:
    • /ajax/book/{book_id}/{library_id=None}
      • Return the metadata of the book as a JSON dictionary.
      • Query parameters: ?category_urls=true&id_is_uuid=false&device_for_template=None
    • /ajax/books/{library_id=None}
      • Return the metadata for the books as a JSON dictionary.
      • Query parameters: ?ids=all&category_urls=true&id_is_uuid=false&device_for_template=None
    • /ajax/categories/{library_id=None}
      • Return the list of top-level categories as a list of dictionaries
      • Each category has the form:
        #+BEGIN_SRC json
        {
          "name": "Display Name",
          "url": "URL that gives the JSON object corresponding to all entries in this category",
          "icon": "URL to icon of this category",
          "is_category": "False for the All Books and Newest categories, True for everything else"
        }
        #+END_SRC
    • /ajax/category/{encoded_name}/{library_id=None}
      • Return a dictionary describing the category specified by name
      • Query parameters: ?num=100&offset=0&sort=name&sort_order=asc
      • Response:
        #+BEGIN_SRC json
        {
          "category_name": "Category display name",
          "base_url": "Base URL for this category",
          "total_num": "Total number of items in this category",
          "offset": "The offset for the items returned in this result",
          "num": "The number of items returned in this result",
          "sort": "How the returned items are sorted",
          "sort_order": "asc or desc",
          "subcategories": "List of sub categories of this category.",
          "items": "List of items in this category"
        }
        #+END_SRC
    • /ajax/books_in/{encoded_category}/{encoded_item}/{library_id=None}
      • Return the books (as list of ids) present in the specified category
      • Query parameters: ?num=100&offset=0&sort=title&sort_order=asc&get_additional_fields=
    • /ajax/search/{library_id=None}
      • Return the books (as list of ids) matching the specified search query.
      • Query parameters: ?num=100&offset=0&sort=title&sort_order=asc&query=
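As an illustration of how the endpoint paths and query parameters listed above fit together, here is a small sketch that only builds the URLs with the standard library. The helper names (ajax_url, search_url, book_url) are hypothetical, the optional library_id path segment is left out, and nothing here says whether a given calibre server will actually answer these requests.

```python
from urllib.parse import urlencode


def ajax_url(base_url, path, **params):
    """Join the server base URL, an /ajax/... path and optional query parameters."""
    query = urlencode(params)
    return base_url.rstrip("/") + path + ("?" + query if query else "")


def search_url(base_url, query, num=100, offset=0, sort="title", sort_order="asc"):
    """URL for /ajax/search, using the defaults from the parameter list above."""
    return ajax_url(base_url, "/ajax/search",
                    num=num, offset=offset, sort=sort, sort_order=sort_order, query=query)


def book_url(base_url, book_id):
    """URL for the metadata of a single book."""
    return ajax_url(base_url, "/ajax/book/%d" % book_id)


print(search_url("http://calibre1.home.lan:8080", "ice king"))
print(book_url("http://calibre1.home.lan:8080/", 460))
```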
  • <2015-09-06 søn 10:30> Tried using the API
    • A GET of http://edwards.hjemme.lan:8080/ajax/books returned a 404
    • Added an "Accept: application/json" header to the GET, but still got a 404

    DONE Make the "Fix timestamps of selection" button work for books with more than one author

    DONE Format the book list with different color for alternate lines

    DONE Add licensing information (GPLv3) on the project level

    DONE Add recently used dropdown to the opds_url configuration

    DONE Make the OPDS parsing more robust (hardcoded to the default structure of calibre right now)

    DONE Put the OPDS feed combobox on the main dialog

    DONE Add a catalog selection combobox

    DONE Add search in downloaded results

    TODO Read OPDS feeds other than calibre

  • <2015-09-14 ma 23:56> Timestamps are now received and placed correctly on the book Metadata objects from the OPDS feed, but the metadata of the downloaded books hasn't been updated yet
  • <2015-09-15 ti 22:47> Added a "Fix timestamps of selection" button that can be clicked after download
    • I would have preferred to do this automatically, but that would require a callback that could be called after download, and no such callback system exists
    • The "Fix timestamps of selection" button only works for books that have a single author: the db.find_identical_books() method returns nothing when the metadata has more than one author
  • <2015-09-20 sø 12:13> The catalog selection combobox is non-editable, and is populated from the catalogs in the root catalog when downloading a new feed
  • <2015-09-26 lø 19:55> Search in downloaded results is solved, but I am not entirely satisfied with the results:
    • Only the currently visible items are searchable
    • Only the search matches are shown, instead of scrolling to the matches while showing all
    • Probably the best that can be done, using just built-in Qt functionality?
  • <2015-09-18 fr 19:01> Some examples of OPDS feeds other than calibre:
    • feedbooks:
      http://www.feedbooks.com/books/top.atom?category=FBFIC028000&lang=en
      or
      http://www.feedbooks.com/catalog.atom
    • Internet archive:
      http://bookserver.archive.org/catalog/
    • Pragmatic bookshelf:
      http://pragprog.com/magazines.opds
    • ManyBooks:
      http://www.manybooks.net/opds/index.php
    • Project Gutenberg:
      http://m.gutenberg.org/ebooks/?format=opds
    • O'Reilly:
      http://opds.oreilly.com/opds/
    • Baen ebooks:
      http://www.baenebooks.com/stanza.aspx
  • <2015-10-04 søn 10:57> Need to differentiate between "navigation feeds" and "acquisition feeds"
  • For the calibre OPDS feeds:
    • The top level calibre feed is a navigation feed
    • The "Newest" feed on the second level is an "acquisition feed"
    • The "Authors" feed on the second level is a "navigation feed", with the entries consisting of links to each author with a surname starting with a particular letter
  • A "navigation feed" is identified by
    kind=navigation
    as a parameter of the media type in the "type" attribute of a link element, e.g. like so:
    #+BEGIN_SRC xml
    #+END_SRC
  • An "acquisition feed" is identified by
    kind=acquisition
    as a parameter of the media type in the "type" attribute of a link element, e.g. like so:
    #+BEGIN_SRC xml
    #+END_SRC

    TODO Add auto discovery of calibre instances in the LAN

    TODO Find out why the OPDS reader dialog sometimes disappears after downloading the OPDS

    TODO Add username/password information to saved opds_url values

    TODO Migrate own code from underscore separation to camelCase (Python has a camelCase Modula/Pascal feel to it)

    TODO Find out why some books (in PDF...?) aren't downloaded

    TODO Explore the documentation format to see if it is relevant to this plugin

    TODO Try to keep the line length correct during intermediate model updates

    TODO Get better matching with existing books (the "Maven cookbook" was already present, but it still showed up)

    DONE Add configuration options for defaults for the "hide" checkboxes

    TODO Remove all leftover debug trace

    TODO Copy read marks in calibre's reader from the remote

    TODO Refresh the list as books are downloaded (suppress downloaded books from the list)

    TODO Add cover thumbnails to the list of books

    TODO Add an exclusion list (a list of books that should be permanently hidden from the comparison)
    • Unfortunately, the links in the calibre OPDS feeds don't have a "kind" parameter:
      • Newest (which links to an acquisition feed, where all entries are books):
        #+BEGIN_SRC xml
        #+END_SRC
      • Authors (which links to a navigation feed, where all entries are new OPDS feeds):
        #+BEGIN_SRC xml
        #+END_SRC
    • In other words: there is nothing to distinguish the two in just the link to the feed
    • I.e. it will be necessary to handle all feeds the same way, and just recurse on the links that are feeds
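The "handle all feeds the same way and recurse" idea can be sketched as below. This is an illustration under stated assumptions, not the plugin's implementation: link_kind and walk are hypothetical names, fetch is a hypothetical callable returning the entries of a feed as dicts, and CATALOG is a made-up miniature of a calibre catalog. The type strings follow the OPDS convention of a "kind" parameter and the calibre variant without one.

```python
ATOM_FEED = "application/atom+xml"


def link_kind(link):
    """Return 'navigation' or 'acquisition' when the type says so,
    'feed' for a feed link without a kind, and None for a non-feed link."""
    parts = [p.strip() for p in link.get("type", "").split(";")]
    if parts[0] != ATOM_FEED:
        return None
    params = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
    return params.get("kind", "feed")


def walk(url, fetch, visited=None):
    """Return all book entries reachable from url, recursing on links that are feeds."""
    visited = set() if visited is None else visited
    if url in visited:
        return []
    visited.add(url)
    books = []
    for entry in fetch(url):
        feed_links = [l for l in entry["links"] if link_kind(l) is not None]
        if feed_links:
            for l in feed_links:  # the entry points at further feeds
                books.extend(walk(l["href"], fetch, visited))
        else:
            books.append(entry)   # no feed links: treat the entry as a book
    return books


CALIBRE_FEED = "application/atom+xml;type=feed;profile=opds-catalog"
BOOK = {"title": "The Ice King",
        "links": [{"href": "/get/epub/460", "type": "application/epub+zip"}]}
CATALOG = {
    "/opds": [
        {"title": "Authors", "links": [{"href": "/opds/authors", "type": CALIBRE_FEED}]},
        {"title": "Newest", "links": [{"href": "/opds/newest", "type": CALIBRE_FEED}]},
    ],
    "/opds/authors": [{"title": "R", "links": [{"href": "/opds/authors/r", "type": CALIBRE_FEED}]}],
    "/opds/authors/r": [BOOK],
    "/opds/newest": [BOOK],
}

books = walk("/opds", CATALOG.__getitem__)
print([b["title"] for b in books])
```

Note that the same book is reached through both "Authors" and "Newest", so a real client would want to de-duplicate the result.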
    • <2015-10-04 søn 11:54> Trying to distinguish the entries in "Authors" from the entries in "Newest"
      • "Authors" entry:
        #+BEGIN_SRC python
        {
            u'updated': u'2015-09-28T15:49:33+00:00',
            u'updated_parsed': time.struct_time(tm_year=2015, tm_mon=9, tm_mday=28, tm_hour=15, tm_min=49, tm_sec=33, tm_wday=0, tm_yday=271, tm_isdst=0),
            u'links': [
                {
                    'href': u'http://localhost:8080/opds/categorygroup/617574686f7273/5c',
                    'type': u'application/atom+xml;type=feed;profile=opds-catalog',
                    u'rel': u'alternate'
                }
            ],
            u'title': u'\\',
            u'summary': u'1 items',
            u'content': [
                {
                    u'base': u'http://localhost:8080/opds/navcatalog/4e617574686f7273',
                    u'type': u'text/plain',
                    u'value': u'1 items',
                    u'language': None
                }
            ],
            u'guidislink': True,
            u'title_detail': {
                u'base': u'http://localhost:8080/opds/navcatalog/4e617574686f7273',
                u'type': u'text/plain',
                u'value': u'\\',
                u'language': None
            },
            u'link': u'calibre:category-group:authors:\\',
            u'id': u'calibre:category-group:authors:\\'
        }
        #+END_SRC
      • "Newest" entry:
        #+BEGIN_SRC python
        {
            u'updated': u'2015-09-28T15:49:33+00:00',
            u'updated_parsed': time.struct_time(tm_year=2015, tm_mon=9, tm_mday=28, tm_hour=15, tm_min=49, tm_sec=33, tm_wday=0, tm_yday=271, tm_isdst=0),
            u'links': [
                {
                    'href': u'http://localhost:8080/get/epub/460',
                    'type': u'application/epub+zip',
                    'rel': u'http://opds-spec.org/acquisition'
                },
                {
                    'href': u'http://localhost:8080/get/cover/460',
                    'type': u'image/jpeg',
                    'rel': u'http://opds-spec.org/cover'
                },
                {
                    'href': u'http://localhost:8080/get/thumb/460',
                    'type': u'image/jpeg',
                    'rel': u'http://opds-spec.org/thumbnail'
                }
            ],
            u'author': u'Michael Scott Rohan & Allan J. Scott',
            u'title': u'The Ice King',
            u'summary': u'TAGS: Fiction, General, Science Fiction\nA Viking temple. A Viking ship. Both preserved in the clinging, black mud of the North Yorkshire estuary. Press and TV watch over the archaeologists\' shoulders as past and present merge. And while huge, death-cold creatures stalk and destroy through the blizzards of an eerily early winter, modern computer science and the dark night-knowledge of the old Norse gods disinter a terrible truth about a past that is sleeping, not dead.',
            u'content': [
                {
                    u'base': u'http://localhost:8080/opds/navcatalog/4f6e6577657374',
                    u'type': u'application/xhtml+xml',
                    u'value': u'TAGS: Fiction, General, Science Fiction\nA Viking temple. A Viking ship. Both preserved in the clinging, black mud of the North Yorkshire estuary. Press and TV watch over the archaeologists\' shoulders as past and present merge. And while huge, death-cold creatures stalk and destroy through the blizzards of an eerily early winter, modern computer science and the dark night-knowledge of the old Norse gods disinter a terrible truth about a past that is sleeping, not dead.',
                    u'language': None
                }
            ],
            u'guidislink': True,
            u'title_detail': {
                u'base': u'http://localhost:8080/opds/navcatalog/4f6e6577657374',
                u'type': u'text/plain',
                u'value': u'The Ice King',
                u'language': None
            },
            u'link': u'urn:uuid:c34a0ada-ffe2-4d51-a820-9a84b7664003',
            u'authors': [ { u'name': u'Michael Scott Rohan & Allan J. Scott' } ],
            u'author_detail': { u'name': u'Michael Scott Rohan & Allan J. Scott' },
            u'id': u'urn:uuid:c34a0ada-ffe2-4d51-a820-9a84b7664003'
        }
        #+END_SRC
    • <2015-09-06 søn 11:49> For auto discovery of calibre instances in the LAN, perhaps use the Bonjour protocol? (Is this what FBReaderJ uses?)
    • <2015-09-06 søn 16:10> When updating the book list model after each OPDS chunk, the line heights are wrong
      • They are corrected after the final read, but they look a bit silly during the intermediate chunks