A calibre plugin containing an OPDS client that can import books into calibre

Steinar Bang 6a2ecf64cc Examining the gutenberg feed		9 years ago

-*- coding: utf-8 -*-

Calibre OPDS client

What's this?

This is a calibre plugin containing an OPDS client. It is intended to read the contents of another calibre installation, find the differences from the current calibre, and offer to copy books from the other calibre into the current one.

How do I install it?

Requires git and calibre to be installed.

Clone the repository:

#+BEGIN_EXAMPLE
git clone https://github.com/steinarb/opds-reader.git
#+END_EXAMPLE

Install the plugin in calibre:

#+BEGIN_EXAMPLE
cd opds-reader/calibre_plugin/
calibre-customize -b .
#+END_EXAMPLE

Start calibre (if calibre was already running, stop calibre and start it again)

Click the button "Preferences"

In the dialog "calibre - Preferences":

Under "Interface", click on the button "Toolbar"

In the dialog "calibre - Preferences - Toolbar":

In the dropdown, select "The main toolbar"

In "Available actions" scroll down to find "OPDS Client" and select it

Click the top arrow button (arrow pointing right)

Click the "Apply" button

Click the "Close" button

How do I use it?

I made this tool to back up my book collection between two PCs on my home LAN, and that is the procedure I will document here:

In the calibre you wish to copy from (in this example called calibre1.home.lan):

Click Preferences

In the "calibre - Preferences" dialog:

Click "Sharing over the net"

In the "calibre - Preferences - Sharing over the net" dialog:

Click the "Start Server" button

Select the checkbox "Run server automatically when calibre starts"

Click the "Apply" button

Click the "Close" button

In the calibre you wish to copy to

Install this plugin (see the "How do I install it?" section)

Click the "OPDS client" button

In the "OPDS client" dialog

Edit the "OPDS URL" value, changing it from

http://localhost:8080/opds

to

http://calibre1.home.lan:8080/opds

License

Click the "Download OPDS" button
Wait until the OPDS feed has finished loading (this may take some time if there is a large number of books to load)
Note: if no books appear, try unchecking the "Hide books already in the library" checkbox. If that makes a lot of books appear, it means that the two calibre instances have the same books
Select the books you wish to copy into the current calibre and click the "Download selected books" button
calibre will start downloading and installing the books:
The Jobs counter in calibre's lower right corner will show a decrementing number and the icon will spin
The book list will be updated as the books are downloaded
The downloaded books will be in approximately the same order as in the original, but the time stamp will be the download time. To fix the time stamp, click on the "Fix timestamps of the selection" button
The updated timestamps may not show up immediately, but they will show up after the first update of the display, and the books will be ordered according to the timestamp after stopping and starting calibre

List of TODO items [27/41]

DONE Create an icon

DONE Put a list of books into the dialog

DONE Read RSS from the plugin and put the resulting data into a datastructure compatible with the calibre db API

DONE Populate the list in the GUI with data read from the RSS

DONE Make the list of books look a little better (resize the dialog to make room for everything)

DONE Add a checkbox to filter out newspapers

DONE Add a checkbox to filter out books already present in the library

DONE Make the book datamodel be Metadata (add a field to hold the parsed OPDS structure), and parse all available metadata info

DONE Add a download button to download the selected books

DONE Fix line height after updates

DONE Move OPDS reading to the model and use the model refresh instead of setting a new model

DONE Restore the authors to the OPDS book list

DONE Make sure all of the books in the library are listed

DONE Give feedback on the number of OPDS books downloaded

DONE Get a display value back for "updated"

DONE Reverse the order of requested downloads

DONE Set the date/time of the copied book to the date of the original

This plugin is licensed under GPLv3. See the LICENSE file for more detail.

The plugin API seems to mandate one for a plugin that is to add an action to the GUI

Need an icon that doesn't have any copyright limitations, it doesn't have to be pretty

<2015-08-23 søn 08:50> Will use a QTableView

import:

#+BEGIN_SRC python
from PyQt5.Qt import QTableView
#+END_SRC

Tried using the derived class BookView, but this failed because the parent (OpdsDialog) was missing the field iactions

Will try filling the QTableView with BooksModel, which derives from QAbstractTableModel

#+BEGIN_SRC python
from calibre.gui2.library.models import BooksModel
#+END_SRC

The argument to BooksModel.setData will be a list of SearchResult instances

#+BEGIN_SRC python
from calibre.gui2.store.search_result import SearchResult
#+END_SRC

SearchResult contains the following fields:

store_name

cover_url

cover_data

title

author

price

detail_item

drm

formats

downloads (dictionary)

affiliate (boolean)

plugin_author

create_browser

SearchResult equality is determined in the following way:

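#+BEGIN_SRC python
def __eq__(self, other):
    # Two results are equal when title, author, store name and formats all match
    return (self.title == other.title
            and self.author == other.author
            and self.store_name == other.store_name
            and self.formats == other.formats)
#+END_SRC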

The "formats" part of the comparison may be an issue when comparing with the current database? Could be that a comparison that excludes formats may be needed? E.g. I may want to keep ORIGINAL_EPUB in the calibre where I did the conversion, but not bother copying it to other calibres
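A minimal sketch of what a comparison that excludes formats could look like. The Book tuple and the function name are hypothetical stand-ins for illustration, not part of calibre or of this plugin:

```python
from collections import namedtuple

# Hypothetical stand-in for SearchResult; only the compared fields are modelled.
Book = namedtuple("Book", ["title", "author", "store_name", "formats"])


def same_book_ignoring_formats(a, b):
    """Compare two book records on title, author and store_name only."""
    return (a.title, a.author, a.store_name) == (b.title, b.author, b.store_name)


# Same book, but only the local copy keeps ORIGINAL_EPUB.
local = Book("Dracula", "Bram Stoker", "calibre", "EPUB, ORIGINAL_EPUB")
remote = Book("Dracula", "Bram Stoker", "calibre", "EPUB")
print(same_book_ignoring_formats(local, remote))
```

With a comparison like this, a book that differs only in its list of formats would be treated as already present.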

An example usage of creating a list of SearchResult is in calibre in MobileReadStore.deserialize_books

<2015-08-23 søn 10:17> Will try putting the resulting data into SearchResult initially (as mentioned in the previous TODO item)

<2015-08-23 søn 11:05> The feedparser is already present in calibre

import statement:

#+BEGIN_SRC python
from calibre.web.feeds.feedparser import parse
#+END_SRC

<2015-08-23 søn 12:02> The BooksModel used in view.py is directly connected to the database, i.e. can't use that BooksModel

Instead created a new BooksModel patterned on the one in the mobileread store

Use a new data structure OpdsBook instead of SearchResult

<2015-08-23 søn 10:18> Hopefully this will be as simple as calling BooksModel.setData

<2015-09-04 fre 21:44> Makes a list of all book download links with EPUB first if found and download the first URL in the list
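The link ordering described in this note could be sketched roughly as follows. The function name, the dictionary layout of a link ("href" and "type" keys) and the example paths are assumptions made for the illustration, not taken from the plugin source:

```python
EPUB_TYPE = "application/epub+zip"


def order_download_links(links):
    """Return the hrefs with EPUB links first, keeping the feed order otherwise."""
    epub = [link["href"] for link in links if link.get("type") == EPUB_TYPE]
    other = [link["href"] for link in links if link.get("type") != EPUB_TYPE]
    return epub + other


# Hypothetical links of a single book entry.
links = [
    {"href": "/get/pdf/1", "type": "application/pdf"},
    {"href": "/get/epub/1", "type": EPUB_TYPE},
]
ordered = order_download_links(links)
# The first URL in the list is the one that gets downloaded.
first = ordered[0] if ordered else None
print(first)
```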

<2015-09-05 lør 10:34> Calibre is set up by default to only deliver 30 items

<2015-09-05 lør 23:20> Following the "next" links of the feed until there are no more "next"
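Following the "next" links can be sketched as a simple loop. This is an illustration only: the fetch argument is a hypothetical caller-supplied function returning a dict with "entries" and "links", and the page URLs are made up:

```python
def read_all_entries(first_url, fetch):
    """Collect entries from a paged feed by following rel="next" links."""
    entries = []
    seen = set()  # guards against a feed whose "next" link loops back
    url = first_url
    while url and url not in seen:
        seen.add(url)
        feed = fetch(url)
        entries.extend(feed.get("entries", []))
        # Pick the href of the first link marked rel="next", if any.
        url = next(
            (link["href"] for link in feed.get("links", []) if link.get("rel") == "next"),
            None,
        )
    return entries


# Fake two-page feed standing in for a real server.
pages = {
    "/opds?offset=0": {
        "entries": ["book1", "book2"],
        "links": [{"rel": "next", "href": "/opds?offset=2"}],
    },
    "/opds?offset=2": {"entries": ["book3"], "links": []},
}
all_entries = read_all_entries("/opds?offset=0", pages.__getitem__)
print(all_entries)
```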

<2015-09-06 søn 12:43> Perhaps update the list after each 30 book chunk has been added?

<2015-09-06 søn 12:28> The value is back, but now there is a value for the initial empty lines

<2015-09-06 søn 08:12> The idea is that what's started first will finish first and that this will give the same book order in the two calibres

<2015-09-05 lør 23:08> The date returned by "Updated" was the same for all books and the date/time of the last change to the db of the remote calibre

Opened an issue to replace this with the Metadata.timestamp attribute of the book: https://bugs.launchpad.net/bugs/1492651

<2015-09-06 søn 09:25> Use the ajax.py REST API to download the metadata for the remote book

Endpoints are:

/ajax/book/{book_id}/{library_id=None}

Return the metadata of the book as a JSON dictionary.

Query parameters: ?category_urls=true&id_is_uuid=false&device_for_template=None

/ajax/books/{library_id=None}

Return the metadata for the books as a JSON dictionary.

Query parameters: ?ids=all&category_urls=true&id_is_uuid=false&device_for_template=None

/ajax/categories/{library_id=None}

Return the list of top-level categories as a list of dictionaries

Each category has the form:

#+BEGIN_SRC json
{
  "name": "Display Name",
  "url": "URL that gives the JSON object corresponding to all entries in this category",
  "icon": "URL to icon of this category",
  "is_category": "False for the All Books and Newest categories, True for everything else"
}
#+END_SRC

/ajax/category/{encoded_name}/{library_id=None}

Return a dictionary describing the category specified by name

Query parameters: ?num=100&offset=0&sort=name&sort_order=asc

Response:

#+BEGIN_SRC json
{
  "category_name": "Category display name",
  "base_url": "Base URL for this category",
  "total_num": "Total number of items in this category",
  "offset": "The offset for the items returned in this result",
  "num": "The number of items returned in this result",
  "sort": "How the returned items are sorted",
  "sort_order": "asc or desc",
  "subcategories": "List of sub categories of this category.",
  "items": "List of items in this category"
}
#+END_SRC

/ajax/books_in/{encoded_category}/{encoded_item}/{library_id=None}

Return the books (as list of ids) present in the specified category

Query parameters: ?num=100&offset=0&sort=title&sort_order=asc&get_additional_fields=

/ajax/search/{library_id=None}

Return the books (as list of ids) matching the specified search query.

Query parameters: ?num=100&offset=0&sort=title&sort_order=asc&query=
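As an illustration of how the query parameters listed above combine with an endpoint, here is a small sketch that builds a search URL with the standard library. The helper name is hypothetical, the host is the example host used earlier in this README, and no request is actually sent:

```python
from urllib.parse import urlencode, urljoin


def ajax_search_url(base_url, query, num=100, offset=0, sort="title", sort_order="asc"):
    """Build the URL for /ajax/search using the documented default parameters."""
    params = urlencode({
        "num": num,
        "offset": offset,
        "sort": sort,
        "sort_order": sort_order,
        "query": query,
    })
    return urljoin(base_url, "/ajax/search") + "?" + params


url = ajax_search_url("http://calibre1.home.lan:8080", "dracula")
print(url)
```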

<2015-09-06 søn 10:30> Tried using the API

http://edwards.hjemme.lan:8080/ajax/books

returned a 404

Added an

Accept: application/json

header to the GET, but still got a 404

DONE Make the "Fix timestamps of selection button" work for books with more than one author

DONE Format the book list with different color for alternate lines

DONE Add licensing information (GPLv3) on the project level

DONE Add licensing and copyright information to all files

DONE Make the OPDS parsing more robust (hardcoded to the default structure of calibre right now)

DONE Put the OPDS feed combobox on the main dialog

DONE Add a catalog selection combobox

DONE Add search in downloaded results

TODO Read OPDS feeds other than calibre

<2015-09-14 ma 23:56> Timestamps are now received and placed correctly on the book Metadata objects from the OPDS feed, the downloaded metadata hasn't been updated yet

<2015-09-15 ti 22:47> Added a "Fix timestamps of selection" button that can be clicked after download

I would have preferred to do this automatically, but that would require a callback that could be called after download, and no such callback system exists

The "Fix timestamps of selection" button only works for books that have a single author, the db.find_identical_books() method returns nothing when the metadata has more than one author

<2015-09-20 sø 12:13> Non-editable, populated from the catalogs in the root catalog when downloading a new feed

<2015-09-26 lø 19:55> Solved, but not entirely satisfied with the results:

Only the currently visible items are searchable

Only the search matches are shown, instead of scrolling to the matches while showing all

Probably the best that can be done, using just built-in Qt functionality?

<2015-09-18 fr 19:01> Some examples

feedbooks:

http://www.feedbooks.com/books/top.atom?category=FBFIC028000&lang=en

http://www.feedbooks.com/catalog.atom
Internet archive:
http://bookserver.archive.org/catalog/
Pragmatic bookshelf:
http://pragprog.com/magazines.opds
ManyBooks:
http://www.manybooks.net/opds/index.php
Project Gutenberg:
http://m.gutenberg.org/ebooks/?format=opds
O'Reilly:
http://opds.oreilly.com/opds/
Baen ebooks:
http://www.baenebooks.com/stanza.aspx
<2015-10-04 søn 10:57> Need to differentiate between "navigation feeds" and "acquisition feeds"
For the calibre OPDS feeds:
The top level calibre feed is a navigation feed
The "Newest" feed on the second level is an "acquisition feed"
The "Authors" feed on the second level is a "navigation feed" with the entries consisting of links to each author with a surname starting with a particular letter
A "navigation feed" is identified by:
kind=navigation

in the parameter to the type definition in the "type" attribute of a link element, e.g. like so:

#+BEGIN_SRC xml
#+END_SRC

An "acquisition feed" is identified by:

kind=acquisition
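Reading the "kind" parameter out of a link's "type" attribute could look roughly like this. The function name is hypothetical; the type strings follow the form used by OPDS catalogs, where the parameter is spelled "acquisition":

```python
def feed_kind(link_type):
    """Return the value of the "kind" parameter of a link type, or None if absent."""
    # The first part is the media type itself; the rest are name=value parameters.
    for part in link_type.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "kind":
            return value.strip('"')
    return None


print(feed_kind("application/atom+xml;profile=opds-catalog;kind=navigation"))
print(feed_kind("application/atom+xml;profile=opds-catalog;kind=acquisition"))
print(feed_kind("application/atom+xml"))
```

A link without the parameter yields None, so the caller has to fall back on some other way of telling the feed types apart.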

TODO Add auto discovery of calibre instances in the LAN

TODO Find out why the OPDS reader dialog sometimes disappears after downloading the OPDS

TODO Add username/password information to saved opds_url values

TODO Migrate own code from underscore separation to camelCase (Python has a camelCase Modula/Pascal feel to it)

TODO Find out why some books (in PDF...?) aren't downloaded

TODO Explore the documentation format to see if it is relevant to this plugin

TODO Try to keep the line length correct during intermediate model updates

TODO Get better matching with existing books (the "Maven cookbook" was already present, but it still showed up)

DONE Add configuration options for defaults for the "hide" checkboxes

TODO Remove all leftover debug trace

TODO Copy read marks in calibre's reader from the remote

TODO Refresh the list as books are downloaded (suppress downloaded books from the list)

TODO Add cover thumbnails to the list of books

TODO Add an exclusion list (a list of books that should be permanently hidden from the comparison)

Unfortunately, the links in the calibre OPDS feeds don't have a "kind" parameter:
Newest (which links to an acquisition feed, where all entries are books):
Authors (which links to a navigation feed, where all entries are new OPDS feeds):
In other words: nothing to distinguish the two in just the link to the feed
I.e. it will be necessary to handle all feeds the same way, and just recurse on the links that are feeds
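A rough sketch of handling all feeds the same way and recursing on the links that are feeds. This is not the plugin's actual code: fetch is a hypothetical function returning the entries of a feed, the catalog paths are made up, and treating any link with an Atom media type as a sub-feed is an assumption of the sketch:

```python
def collect_books(url, fetch, seen=None):
    """Walk a feed tree: entries that link to feeds are recursed into, others are books."""
    seen = set() if seen is None else seen
    if url in seen:  # avoid downloading the same feed twice
        return []
    seen.add(url)
    books = []
    for entry in fetch(url):
        feed_links = [
            link["href"]
            for link in entry.get("links", [])
            if link.get("type", "").startswith("application/atom+xml")
        ]
        if feed_links:
            for href in feed_links:
                books.extend(collect_books(href, fetch, seen))
        else:
            books.append(entry["title"])
    return books


# Fake catalog standing in for a calibre server.
catalog = {
    "/opds": [
        {"title": "Newest", "links": [{"href": "/opds/newest", "type": "application/atom+xml"}]},
        {"title": "Authors", "links": [{"href": "/opds/authors", "type": "application/atom+xml"}]},
    ],
    "/opds/newest": [
        {"title": "Book A", "links": [{"href": "/get/epub/1", "type": "application/epub+zip"}]},
    ],
    "/opds/authors": [
        {"title": "A", "links": [{"href": "/opds/authors/A", "type": "application/atom+xml"}]},
    ],
    "/opds/authors/A": [{"title": "Book B", "links": []}],
}
books = collect_books("/opds", catalog.__getitem__)
print(books)
```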
<2015-10-04 søn 11:54> Trying to distinguish the entries in "authors" from the entries in "Newest"
"Authors" entry:
"Newest" entry:

A Viking temple. A Viking ship. Both preserved in the clinging, black mud of the North Yorkshire estuary. Press and TV watch over the archaeologists\' shoulders as past and present merge. And while huge, death-cold creatures stalk and destroy through the blizzards of an eerily early winter, modern computer science and the dark night-knowledge of the old Norse gods disinter a terrible truth about a past that is sleeping, not dead.

<2015-09-06 søn 11:49> Perhaps use the Bonjour protocol? (is this what FBReaderJ uses?)
<2015-09-06 søn 16:10> When updating the book list model after each OPDS chunk, the line heights are wrong
They are corrected after the final read but they look a bit silly during the intermediate chunks
