Changelog¶
Version 1.20¶
Released 2021-07-12
Add Reader.after_entry_update_hooks, which allows running arbitrary actions for updated entries. Thanks to Mirek Długosz for the issue and pull request. (#241)
Raise StorageError when opening / operating on an invalid database, instead of a plain sqlite3.DatabaseError. (#243)
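The error translation described for invalid databases can be sketched with the standard library alone. This is a minimal illustration, not reader's implementation: the StorageError class here is a hypothetical stand-in, and count_tables() and demo() are made-up helper names.

```python
import os
import sqlite3
import tempfile


class StorageError(Exception):
    """Hypothetical stand-in for a library-level storage exception."""


def count_tables(path):
    """Count the tables in the database at path, translating
    low-level sqlite3 errors into StorageError."""
    db = sqlite3.connect(path)
    try:
        # The first statement that reads the file fails if it is not a database.
        (count,) = db.execute("SELECT count(*) FROM sqlite_master").fetchone()
        return count
    except sqlite3.DatabaseError as e:
        raise StorageError(f"invalid database: {e}") from e
    finally:
        db.close()


def demo():
    """Return the type of the underlying error for a file that is not a database."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "garbage.sqlite")
        with open(path, "wb") as f:
            f.write(b"this is not a database " * 256)
        try:
            count_tables(path)
        except StorageError as e:
            return type(e.__cause__)
    return None
```

Callers then only need to handle one exception type, while the original sqlite3 error stays reachable through the exception's __cause__.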
Version 1.19¶
Released 2021-06-16
Drop Python 3.6 support. (#237)
Support PyPy 3.7. (#234)
Skip enclosures with no href/url; previously, they would result in a parse error. (#240)
Stop using Travis CI (only use GitHub Actions). (#199)
Add the new argument to update_feeds() and update_feeds_iter(); new_only is deprecated and will be removed in 2.0. (#217)
Rename UpdatedFeed.updated to modified; for backwards compatibility, the old attribute will be available as a property until version 2.0, when it will be removed. (#241)
Warning
The signature of UpdatedFeed changed from UpdatedFeed(url, new, updated) to UpdatedFeed(url, new, modified).
This is a minor compatibility break, but only affects third-party code that instantiates UpdatedFeed directly with updated as a keyword argument.
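The rename-with-alias pattern described for UpdatedFeed might look roughly like the sketch below. UpdatedFeedSketch is a hypothetical stand-in, the integer field types are an assumption, and whether the real property emits a warning is also an assumption.

```python
import warnings
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdatedFeedSketch:
    """Hypothetical stand-in; not the library's actual class."""

    url: str
    new: int
    modified: int

    @property
    def updated(self) -> int:
        # Old name kept as a read-only alias for the renamed attribute.
        warnings.warn(
            "updated is deprecated, use modified instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.modified


result = UpdatedFeedSketch("https://example.com/feed.xml", new=1, modified=2)
```

Reading result.updated keeps working, and so does positional construction; only UpdatedFeedSketch(url, new, updated=...) fails with a TypeError, which mirrors the compatibility break noted in the warning.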
Version 1.18¶
Released 2021-06-03
Rename Reader feed metadata methods:
For backwards compatibility, the old method signatures will continue to work until version 2.0, when they will be removed. (#183)
Warning
The get_feed_metadata(feed, key[, default]) -> value form is backwards-compatible only when the arguments are positional.
This is a minor compatibility break; the following work in 1.17, but do not in 1.18:
    # raises TypeError
    reader.get_feed_metadata(feed, key, default=None)
    # returns `(key, value), ...` instead of `value`
    reader.get_feed_metadata(feed, key=key)
The pre-1.18 get_feed_metadata() (1.18 get_feed_metadata_item()) is intended to have positional-only arguments, but this cannot be expressed easily until Python 3.8.
Rename MetadataNotFoundError to FeedMetadataNotFoundError. MetadataNotFoundError remains available, and is a superclass of FeedMetadataNotFoundError for backwards compatibility. (#228)
Warning
The signatures of the following exceptions changed:
MetadataError
Takes a new required key argument, instead of no required arguments.
MetadataNotFoundError
Takes only one required argument, key; the url argument has been removed.
Use FeedMetadataNotFoundError instead.
This is a minor compatibility break, but only affects third-party code that instantiates these exceptions directly.
Rename EntryError.url to feed_url; for backwards compatibility, the old attribute will be available as a property until version 2.0, when it will be removed. (#183)
Warning
The signature of EntryError (and its subclasses) changed from EntryError(url, id) to EntryError(feed_url, id).
This is a minor compatibility break, but only affects third-party code that instantiates these exceptions directly with url as a keyword argument.
Rename remove_feed() to delete_feed(). For backwards compatibility, the old method will continue to work until version 2.0, when it will be removed. (#183)
Rename Reader mark_as_... methods:
For backwards compatibility, the old methods will continue to work until version 2.0, when they will be removed. (#183)
Fix feeds with no title sometimes missing from the get_feeds() results when there are more than 256 feeds (Storage.chunk_size). (#203)
When serving the web application with python -m reader serve, don’t set the Referer header for cross-origin requests. (#209)
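The positional-only arguments mentioned for the metadata getter in this release can be written directly from Python 3.8 on, using the / marker. The sketch below uses a hypothetical get_item() function over a plain dict; it illustrates the language feature, not reader's code.

```python
def get_item(store, key, default=None, /):
    """Hypothetical lookup whose arguments are all positional-only (Python 3.8+)."""
    return store.get(key, default)


metadata = {"color": "blue"}
color = get_item(metadata, "color")
fallback = get_item(metadata, "size", "unknown")
```

With this signature, get_item(metadata, "size", default=None) raises a TypeError, so callers cannot come to depend on the parameter names.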
Version 1.17¶
Released 2021-05-06
Reserve tags and metadata keys starting with .reader. and .plugin. for reader- and plugin-specific uses. See the Reserved names user guide section for details. (#186)
Ignore updated when updating feeds; only update the feed if other feed data changed or if any entries were added/updated. (#231)
Prevents spurious updates for feeds whose updated changes excessively (either because the entries’ content changes excessively, or because an RSS feed does not have a dc:date element, and feedparser falls back to lastBuildDate for updated).
The regex_mark_as_read experimental plugin is now built-in. To use it with the CLI / web application, use the plugin name instead of the entry point (reader.mark_as_read).
The config metadata key and format changed; the config will be migrated automatically on the next feed update, during reader version 1.17 only. If you used regex_mark_as_read and are upgrading to a version >1.17, install 1.17 (pip install reader==1.17) and run a full feed update (python -m reader update) before installing the newer version.
The enclosure-tags, preview-feed-list, and sqlite-releases unstable extras are not available anymore. Use the unstable-plugins extra to install dependencies of the unstable plugins instead.
In the web application, allow updating a feed manually. (#195)
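For user code that wants to stay out of the reserved namespaces introduced in this release, a check along these lines may be enough. The is_reserved() helper is hypothetical; only the two prefixes come from the entry above, and the Reserved names user guide section remains the authority on the exact rules.

```python
RESERVED_PREFIXES = (".reader.", ".plugin.")  # prefixes named in the changelog entry


def is_reserved(name):
    """Return True if a tag or metadata key starts with a reserved prefix."""
    return name.startswith(RESERVED_PREFIXES)


checks = {name: is_reserved(name) for name in (".reader.x", ".plugin.y.z", "my-tag")}
```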
Version 1.16¶
Released 2021-03-29
Allow make_reader() to load plugins through the plugins argument. (#229)
Enable the ua_fallback plugin by default.
make_reader() may now raise InvalidPluginError (a ValueError subclass, which it already raises implicitly) for invalid plugin names.
The enclosure_dedupe, feed_entry_dedupe, and ua_fallback plugins are now built-in. (#229)
To use them with the CLI / web application, use the plugin name instead of the entry point:
    reader._plugins.enclosure_dedupe:enclosure_dedupe -> reader.enclosure_dedupe
    reader._plugins.feed_entry_dedupe:feed_entry_dedupe -> reader.entry_dedupe
    reader._plugins.ua_fallback:init -> reader.ua_fallback
Remove the plugins extra; plugin loading machinery does not have additional dependencies anymore.
Mention in the User guide that all reader functions/methods can raise ValueError or TypeError if passed invalid arguments. There is no behavior change; this just documents existing, previously undocumented behavior.
Version 1.15¶
Released 2021-03-21
Update entries whenever their content changes, regardless of their updated date. (#179)
Limit content-only updates (not due to an updated change) to 24 consecutive updates, to prevent spurious updates for entries whose content changes excessively (for example, because it includes the current time). (#225)
Previously, entries would be updated only if the entry updated was newer than the stored one.
Fix bug causing entries that don’t have updated set in the feed to not be updated if the feed is marked as stale. Feed staleness is an internal feature used during storage migrations; this bug could only manifest when migrating from 0.22 to 1.x. (found during #179)
Minor web application improvements.
Minor CLI improvements.
Version 1.14¶
Released 2021-02-22
Add the update_feeds_iter() method, which yields the update status of each feed as it gets updated. (#204)
Change the return type of update_feed() from None to Optional[UpdatedFeed]. (#204)
Add the session_timeout argument to make_reader() to set a timeout for retrieving HTTP(S) feeds. The default (connect timeout, read timeout) is (3.05, 60) seconds; the previous behavior was to never time out.
Use PRAGMA user_version instead of a version table. (#210)
Use PRAGMA application_id to identify reader databases; the id is 0x66656564 – read in ASCII / UTF-8. (#211)
Change the reader update command to show a progress bar and update summary (with colors), instead of plain log output. (#204)
Fix broken Mypy config following 0.800 release. (#213)
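The two SQLite pragmas mentioned in this release store integers in the database header, which is what makes a separate version table unnecessary. A minimal sketch with the standard sqlite3 module follows; the application id is the value quoted above, while SCHEMA_VERSION, stamp(), and read_stamp() are hypothetical names for illustration.

```python
import sqlite3

APPLICATION_ID = 0x66656564  # value quoted in the changelog entry
SCHEMA_VERSION = 1  # hypothetical version number for this sketch


def stamp(db):
    """Write the application id and schema version into the database header."""
    # PRAGMA values cannot be bound as parameters, so they are formatted in;
    # both are trusted integer constants here.
    db.execute(f"PRAGMA application_id = {APPLICATION_ID:d}")
    db.execute(f"PRAGMA user_version = {SCHEMA_VERSION:d}")


def read_stamp(db):
    """Return (application_id, user_version) as stored in the header."""
    (app_id,) = db.execute("PRAGMA application_id").fetchone()
    (version,) = db.execute("PRAGMA user_version").fetchone()
    return app_id, version


db = sqlite3.connect(":memory:")
stamp(db)
stamp_values = read_stamp(db)
db.close()
```

An application can compare the values it reads back against its own constants to decide whether a file is one of its databases and whether a migration is needed.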
Version 1.13¶
Released 2021-01-29
JSON Feed support. (#206)
Split feed retrieval from parsing; should make it easier to add new/custom parsers. (#206)
Prevent any logging output from the reader logger by default. (#207)
In the preview_feed_list plugin, add <link rel=alternative ...> tags as a feed detection heuristic.
In the preview_feed_list plugin, add <a> tags as a fallback feed detection heuristic.
In the web application, fix bug causing the entries page to crash when counts are enabled.
Version 1.12¶
Released 2020-12-13
Add the limit and starting_after arguments to get_feeds(), get_entries(), and search_entries(), allowing them to be used in a paginated fashion. (#196)
Add the object_id property that allows getting the unique identifier of a data object in a uniform way. (#196)
In the web application, add links to toggle feed/entry counts. (#185)
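The limit / starting_after style of pagination introduced in this release can be illustrated over a plain list. This is a sketch of the general technique only: get_page() and iter_pages() are hypothetical helpers, and the real methods work against storage rather than an in-memory list.

```python
def get_page(items, *, limit=None, starting_after=None):
    """Return up to limit items that come after starting_after.

    items is assumed to be ordered and free of duplicates.
    """
    start = 0
    if starting_after is not None:
        # Raises ValueError if the marker is not in the collection.
        start = items.index(starting_after) + 1
    end = None if limit is None else start + limit
    return items[start:end]


def iter_pages(items, limit):
    """Yield successive pages, using the last item of each page as the next marker."""
    last = None
    while True:
        page = get_page(items, limit=limit, starting_after=last)
        if not page:
            return
        yield page
        last = page[-1]


feeds = ["a", "b", "c", "d", "e"]
pages = list(iter_pages(feeds, 2))
```

Passing the last item seen, rather than a numeric offset, keeps pages stable when items are added or removed before the current position.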
Version 1.11¶
Released 2020-11-28
Allow disabling feed updates for specific feeds. (#187)
Add methods to get aggregated feed and entry counts. (#185)
In the web application: allow disabling feed updates for a feed; allow filtering feeds by whether they have updates enabled; do not show feed update errors for feeds that have updates disabled. (#187)
In the web application, show feed and entry counts when ?counts=yes is used. (#185)
In the web application, use YAML instead of JSON for the tags and metadata fields.
Version 1.10¶
Released 2020-11-20
Use indexes for get_entries() (recent order); should make calls 10-30% faster. (#134)
Allow sorting search_entries() results randomly. Allow sorting search results randomly in the web application. (#200)
Reraise unexpected errors caused by parser bugs instead of replacing them with an AssertionError.
Add the sqlite_releases custom parser plugin.
Refactor the HTTP feed sub-parser to allow reuse by custom parsers.
Add a user guide, and improve other parts of the documentation. (#194)
Version 1.9¶
Released 2020-10-28
Support Python 3.9. (#199)
Support Windows (requires Python >= 3.9). (#163)
Use GitHub Actions to do macOS and Windows CI builds. (#199)
Rename the cloudflare_ua_fix plugin to ua_fallback. Retry any feed that gets a 403, not just those served by Cloudflare. (#181)
Fix type annotation to avoid mypy 0.790 errors. (#198)
Version 1.8¶
Released 2020-10-02
Drop feedparser 5.x support (deprecated in 1.7); use feedparser 6.x instead. (#190)
Make the string representation of ReaderError and its subclasses more consistent; add error messages and improve the existing ones. (#173)
Add method change_feed_url() to change the URL of a feed. (#149)
Allow changing the URL of a feed in the web application. (#149)
Add more tag navigation links to the web application. (#184)
In the feed_entry_dedupe plugin, copy the important flag from the old entry to the new one. (#140)
Version 1.7¶
Released 2020-09-19
Add new methods to support feed tags: add_feed_tag(), remove_feed_tag(), and get_feed_tags(). Allow filtering feeds and entries by their feed tags. (#184)
Add the broken argument to get_feeds(), which allows getting only feeds that failed / did not fail during the last update. (#189)
feedparser 5.x support is deprecated in favor of feedparser 6.x. Using feedparser 5.x will raise a deprecation warning in version 1.7, and support will be removed in the following version. (#190)
Tag-related web application features: show tags in the feed list; allow adding/removing tags; allow filtering feeds and entries by their feed tag; add a page that lists all tags. (#184)
In the web application, allow showing only feeds that failed / did not fail. (#189)
In the preview_feed_list plugin, add <meta> tags as a feed detection heuristic.
Add a few property-based tests. (#188)
Version 1.6¶
Released 2020-09-04
Add the feed_root argument to make_reader(), which allows limiting local feed parsing to a specific directory or disabling it altogether. Using it is recommended, since by default reader will access any local feed path (in 2.0, local file parsing will be disabled by default). (#155)
Support loading CLI and web application settings from a configuration file. (#177)
Fail fast for feeds that return HTTP 4xx or 5xx status codes, instead of (likely) failing later with an ambiguous XML parsing error. The cause of the raised ParseError is now an instance of requests.HTTPError. (#182)
Add cloudflare_ua_fix plugin (work around Cloudflare sometimes blocking requests). (#181)
feedparser 6.0 (beta) compatibility fixes.
Internal parser API changes to support alternative parsers, pre-request hooks, and making arbitrary HTTP requests using the same logic Reader uses. (#155)
In the /preview page and the preview_feed_list plugin, use the same plugins the main Reader does. (enabled by #155)
Version 1.5¶
Released 2020-07-30
Use rowid when deleting from the search index, instead of the entry id. Previously, each update_search() call would result in a full scan, even if there was nothing to update/delete. This should reduce the amount of reads significantly (deleting 4 entries from a database with 10k entries resulted in a 1000x decrease in bytes read). (#178)
Require at least SQLite 3.18 (released 2017-03-30) for the current update_search() implementation; all other reader features continue to work with SQLite >= 3.15. (#178)
Run PRAGMA optimize on close(). This should increase the performance of all methods. As an example, in #178 it was found that update_search() resulted in a full scan of the entries table, even if there was nothing to update; this change should prevent this from happening. (#143)
Note
PRAGMA optimize is a no-op in SQLite versions earlier than 3.18. In order to avoid the case described above, you should run ANALYZE regularly (e.g. every few days).
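Both statistics-gathering approaches from this release can be tried with the standard sqlite3 module. The sketch below is an assumption-laden illustration: the entries table, the index, and the helper names close_with_optimize() and analyze() are all made up for the example.

```python
import sqlite3


def close_with_optimize(db):
    """Let SQLite refresh planner statistics if it deems it useful, then close."""
    db.execute("PRAGMA optimize")
    db.close()


def analyze(db):
    """Unconditionally gather statistics (the fallback for older SQLite versions)."""
    db.execute("ANALYZE")


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entries (id TEXT PRIMARY KEY, feed TEXT)")
db.execute("CREATE INDEX entries_by_feed ON entries (feed)")
db.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [(str(i), f"feed-{i % 3}") for i in range(100)],
)
db.commit()

analyze(db)
# ANALYZE stores its results in the sqlite_stat1 table.
(stat_rows,) = db.execute("SELECT count(*) FROM sqlite_stat1").fetchone()
close_with_optimize(db)
```

The query planner consults these statistics when choosing between an index and a full table scan, which is why stale or missing statistics can lead to the kind of scan described above.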
Version 1.4¶
Released 2020-07-13
Work to reduce the likelihood of “database is locked” errors during updates (#175):
Prepare entries to be added to the search index (update_search()) outside transactions.
Fix bug causing duplicate rows in the search index when an entry changes while updating the search index.
Update the search index only when the indexed values change (details below).
Use SQLite WAL (details below).
Update the search index only when the indexed values change. Previously, any change on a feed would result in all its entries being re-indexed, even if the feed title or the entry content didn’t change. This should reduce the update_search() run time significantly.
Use SQLite’s write-ahead logging to increase concurrency. At the moment there is no way to disable WAL. This change may be reverted in the future. (#169)
Require at least click 7.0 for the cli extra.
Do not fail for feeds with incorrectly-declared media types, if feedparser can parse the feed; this is similar to the current behavior for incorrectly-declared encodings. (#171)
Raise ParseError during update for feeds feedparser can’t detect the type of, instead of silently returning an empty feed. (#171)
Add sort argument to search_entries(). Allow sorting search results by recency in addition to relevance (the default). (#176)
In the web application, display a nice error message for invalid search queries instead of returning an HTTP 500 Internal Server Error.
Other minor web application improvements.
Minor CLI logging improvements.
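Write-ahead logging, mentioned in this release, is enabled per database file with a single pragma. The following is a minimal sketch using the standard sqlite3 module; enable_wal() and demo() are hypothetical helpers, and the example assumes a local filesystem that supports WAL.

```python
import os
import sqlite3
import tempfile


def enable_wal(db):
    """Switch the database to write-ahead logging; return the resulting journal mode."""
    (mode,) = db.execute("PRAGMA journal_mode = WAL").fetchone()
    return mode


def demo():
    """Create a database in a temporary directory and enable WAL on it."""
    with tempfile.TemporaryDirectory() as tmp:
        db = sqlite3.connect(os.path.join(tmp, "db.sqlite"))
        try:
            return enable_wal(db)
        finally:
            db.close()


journal_mode = demo()
```

The pragma returns the mode actually in effect, so checking its result is a simple way to detect cases where WAL could not be enabled (an in-memory database, for instance, reports a different mode).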
Version 1.3¶
Released 2020-06-23
If a feed failed to update, provide details about the error in Feed.last_exception. (#68)
Show details about feed update errors in the web application. (#68)
Expose the added and last_updated Feed attributes.
Expose the last_updated Entry attribute.
Raise ParseError / log during update if an entry has no id, instead of unconditionally raising AttributeError. (#170)
Fall back to <link> as entry id if an entry in an RSS feed has no <guid>; previously, feeds like this would fail on update. (#170)
Minor web application improvements (show feed added/updated date).
In the web application, handle previewing an invalid feed nicely instead of returning an HTTP 500 Internal Server Error. (#172)
Internal API changes to support multiple storage implementations in the future. (#168)
Version 1.2¶
Released 2020-05-18
Minor web application improvements.
Remove unneeded additional query in methods that use pagination (for n = len(result) / page size, always do n queries instead of n+1). get_entries() and search_entries() are now 33–7% and 46–36% faster, respectively, for results of size 32–256. (#166)
All queries are now chunked/paginated to avoid locking the SQLite storage for too long, decreasing the chance of concurrent queries timing out; the problem was most visible during update_search(). This should cap memory usage for methods returning an iterable that were not paginated before; previously the whole result set would be read before returning it. (#167)
Version 1.1¶
Released 2020-05-08
Add sort argument to get_entries(). Allow sorting entries randomly in addition to the default most-recent-first order. (#105)
Allow changing the entry sort order in the web application. (#105)
Use a query builder instead of appending strings manually for the more complicated queries in search and storage. (#123)
Make searching entries faster by filtering them before searching; e.g. if 1/5 of the entries are read, searching only read entries is now ~5x faster. (enabled by #123)
Version 1.0.1¶
Released 2020-04-30
Fix bug introduced in 0.20 causing update_feeds() to silently stop updating the remaining feeds after a feed failed. (#164)
Version 1.0¶
Released 2020-04-28
Make all private submodules explicitly private. (#156)
Note
All direct imports from reader continue to work.
The reader.core.* modules moved to reader.* (most of them prefixed by _).
The web application WSGI entry point moved from reader.app.wsgi:app to reader._app.wsgi:app.
The entry points for plugins that ship with reader moved from reader.plugins.* to reader._plugins.*.
Require at least beautifulsoup4 4.5 for the search extra (before, the version was unspecified). (#161)
Rename the web application dependencies extra from web-app to app.
Fix relative link resolution and content sanitization; sgmllib3k is now a required dependency for this reason. (#125, #157)
Version 0.22¶
Released 2020-04-14
Add the Entry.feed_url attribute. (#159)
Rename the EntrySearchResult feed attribute to feed_url. Using feed will raise a deprecation warning in version 0.22, and will be removed in the following version. (#159)
Use executemany() instead of execute() in the SQLite storage. Makes updating feeds (excluding network calls) 5-10% faster. (#144)
In the web app, redirect to the feed’s page after adding a feed. (#119)
In the web app, show highlighted search result snippets. (#122)
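The difference between execute() and executemany() mentioned in this release is easiest to see side by side. This sketch uses a made-up entries table and rows; it shows the two call styles in the standard sqlite3 module, not the library's storage code, and makes no claim about the size of the speedup.

```python
import sqlite3

rows = [("feed-1", "entry-1"), ("feed-1", "entry-2"), ("feed-2", "entry-1")]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entries (feed TEXT, id TEXT)")

# One call per row: the statement is dispatched from Python each time.
for row in rows:
    db.execute("INSERT INTO entries VALUES (?, ?)", row)

# One call for all rows: the loop over parameters runs inside the sqlite3 module.
db.executemany("INSERT INTO entries VALUES (?, ?)", rows)
db.commit()

(total,) = db.execute("SELECT count(*) FROM entries").fetchone()
db.close()
```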
Version 0.21¶
Released 2020-04-04
Minor consistency improvements to the web app search button. (#122)
Add support for web application plugins. (#80)
The enclosure tag proxy is now a plugin, and is disabled by default. See its documentation for details. (#52)
In the web app, the “add feed” button shows a preview before adding the feed. (#145)
In the web app, if the feed to be previewed is not actually a feed, show a list of feeds linked from that URL. This is a plugin, and is disabled by default. (#150)
reader now uses a User-Agent header like python-reader/0.21 when retrieving feeds instead of the default requests one. (#154)
Version 0.20¶
Released 2020-03-31
Fix bug in enable_search() that caused it to fail if search was already enabled and the reader had any entries.
Add an entry argument to get_entries(), for symmetry with search_entries().
Add a feed argument to get_feeds().
Add a key argument to get_feed_metadata().
Require at least requests 2.18 (before, the version was unspecified).
Allow updating feeds concurrently; add a workers argument to update_feeds(). (#152)
Version 0.19¶
Released 2020-03-25
Support PyPy 3.6.
Allow searching for entries. (#122)
Stricter type checking for the core modules.
Various changes to the storage internal API.
Version 0.18¶
Released 2020-01-26
Support Python 3.8.
Increase the get_entries() recent threshold from 3 to 7 days. (#141)
Enforce type checking for the core modules. (#132)
Use dataclasses for the data objects instead of attrs. (#137)
Version 0.17¶
Released 2019-10-12
Remove the which argument of get_entries(). (#136)
Reader objects should now be created using make_reader(). Instantiating Reader directly will raise a deprecation warning.
The resources associated with a reader can now be released explicitly by calling its close() method. (#139)
Make the database schema more strict regarding nulls. (#138)
Tests are now run in a random order. (#142)
Version 0.16¶
Released 2019-09-02
Allow marking entries as important. (#127)
get_entries() and get_feeds() now take only keyword arguments.
The which argument of get_entries() is now deprecated in favor of read. (#136)
Version 0.15¶
Released 2019-08-24
Improve entry page rendering for text/plain content. (#117)
Improve entry page rendering for images and code blocks. (#126)
Show enclosures on the entry page. (#128)
Show the entry author. (#129)
Fix bug causing the enclosure tag proxy to use too much memory. (#133)
Start using mypy on the core modules. (#132)
Version 0.14¶
Released 2019-08-12
Version 0.13¶
Released 2019-07-12
Add entry page. (#117)
get_feed() now raises FeedNotFoundError if the feed does not exist; use get_feed(..., default=None) for the old behavior.
Add get_entry(). (#120)
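The "raise unless a default is given" behavior described for get_feed() is commonly built with a sentinel object, so that default=None can be told apart from no default at all. The sketch below is an assumption about the general pattern, with a hypothetical FeedNotFoundError stand-in and a dict in place of real storage.

```python
class FeedNotFoundError(Exception):
    """Hypothetical stand-in for the library's exception of the same name."""


_MISSING = object()  # sentinel: distinguishes "no default" from default=None


def get_feed(feeds, url, default=_MISSING):
    """Return the feed for url; raise if it is missing and no default was given."""
    try:
        return feeds[url]
    except KeyError:
        if default is _MISSING:
            raise FeedNotFoundError(url) from None
        return default


feeds = {"https://example.com/feed.xml": "Example feed"}
found = get_feed(feeds, "https://example.com/feed.xml")
missing = get_feed(feeds, "https://example.com/other.xml", default=None)
```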
Version 0.12¶
Released 2019-06-22
Version 0.11¶
Released 2019-05-26
Version 0.10¶
Released 2019-05-18
Unify plugin loading and error handling code. (#112)
Minor improvements to CLI error reporting.
Version 0.9¶
Released 2019-05-12
Improve the get_entries() sorting algorithm. Fixes a bug introduced by #106 (entries of new feeds would always show up at the top). (#113)
Version 0.8¶
Released 2019-04-21
Version 0.7¶
Released 2019-04-14
Increase timeout of the button actions from 2 to 10 seconds.
get_entries() now sorts entries by the import date first, and then by published/updated. (#106)
Add enclosure_dedupe plugin (deduplicate enclosures of an entry). (#78)
The serve command now supports loading plugins. (#78)
reader.app.wsgi now supports loading plugins. (#78)
Version 0.6¶
Released 2019-04-13
Version 0.5¶
Released 2019-02-09
Make updating new feeds up to 2 orders of magnitude faster; fixes a problem introduced by #94. (#104)
Move the core modules to a separate subpackage and enforce test coverage (make coverage now fails if the coverage for core modules is less than 100%). (#101)
Support Python 3.8 development branch.
Add dev and docs extras (to install development requirements).
Build HTML documentation when running tox.
Add test-all and docs make targets (to run tox / build HTML docs).
Version 0.4¶
Released 2019-01-02
Support Python 3.7.
Entry content and enclosures now default to an empty tuple instead of None. (#99)
get_feeds() now sorts feeds by user_title or title instead of just title. (#102)
get_feeds() now sorts feeds in a case insensitive way. (#103)
Add sort argument to get_feeds(); allows sorting feeds by title or by when they were added. (#98)
Allow changing the feed sort order in the web application. (#98)
Version 0.3¶
Released on 2018-12-22
get_entries() now prefers sorting by published (if present) to sorting by updated. (#97)
Add regex_mark_as_read plugin (mark new entries as read based on a regex). (#79)
Add feed_entry_dedupe plugin (deduplicate new entries for a feed). (#79)
Plugin loading machinery dependencies are now installed via the plugins extra.
Add a plugins section to the documentation.
Version 0.2¶
Released on 2018-11-25
Version 0.1.1¶
Released on 2018-10-21
Fix broken reader serve command (broken in 0.1).
Raise StorageError for unsupported SQLite configurations at Reader instantiation instead of failing at run-time with a generic StorageError("sqlite3 error"). (#92)
Fix wrong submit button being used when pressing enter in non-button fields. (#69)
Raise StorageError for failed migrations instead of an undocumented exception. (#92)
Use requests-mock in parser tests instead of a web server (test suite run time down by ~35%). (#90)