Bringing the "killed" Reuters RSS feed back from the dead
Reuters silently "killed" its RSS feed 2 days ago. It seems like they did it on purpose. But I think I know a way to replace it with a custom Google News RSS feed.
TL;DR: I’ve been reverse-engineering the query features of Google News RSS in my spare time for the past few months. In this article, I describe how you can make Google News RSS display only the latest Reuters articles.
Link to a working “Reuters” RSS in English: https://news.google.com/rss/search?q=when:24h+allinurl:reuters.com&ceid=US:en&hl=en-US&gl=US
Did it work? Consider subscribing to my newsletter to get more useful content like this. It’s free.
I am a co-founder of NewsCatcherAPI, an ultra-fast API to find news articles by any topic, country, language, website, or keyword.
Guess how we collect the data for our database? Right, mostly through the RSS feeds.
When I saw this post on Hacker News 2 days ago, I was a bit shocked, because it meant that we were no longer fetching news from Reuters.
Fortunately for us, for the past few months, I’ve been working on my Python package that normalizes news data from Google News RSS feeds.
I figured out that Google allows you to perform advanced querying even when you just want to generate a news feed via RSS.
99% of the work was figuring out the syntax. Google has no official documentation for the News RSS feed, so I had to collect the clues piece by piece from around the internet.
The Python package + API will be released on the 1st of July 2020. Subscribe so you don’t miss it. I am offering 80% off until I get my first 100 paid subscribers. For example, my long-reads about how I reverse-engineered Google News RSS will be published only for paid subscribers of CODARIUM. It’s just $11 for a year of subscription!
How does Google News RSS for “Reuters” work?
Once again, here is the link that will generate the reuters.com RSS feed of the best articles of the last 24 hours: https://news.google.com/rss/search?q=when:24h+allinurl:reuters.com&ceid=US:en&hl=en-US&gl=US
Let’s look at it closely:
Base URL:
https://news.google.com/rss/search
Copy and paste it into your browser and you will see the Top Headlines RSS from Google News.
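The feed can also be read from code instead of a browser. Below is a minimal sketch using only the Python standard library. It assumes the response is a plain RSS 2.0 document whose items carry title, link, and pubDate elements; the helper names parse_items and fetch_feed and the sample document are made up for illustration and are not part of any Google API.

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_items(rss_xml):
    """Return a (title, link, pubDate) tuple for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append(
            (
                item.findtext("title", default=""),
                item.findtext("link", default=""),
                item.findtext("pubDate", default=""),
            )
        )
    return items


def fetch_feed(url, timeout=10):
    """Download a feed and return its body as bytes (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Hand-written sample (invented data) so the parser can be tried offline.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/first</link>
      <pubDate>Mon, 22 Jun 2020 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/second</link>
      <pubDate>Mon, 22 Jun 2020 11:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

for title, link, published in parse_items(SAMPLE_RSS):
    print(published, title, link)
```

To try it against the live feed, pass the result of fetch_feed("https://news.google.com/rss/search?q=...") to parse_items; the fields actually present in Google's response may differ from this sample.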
Query Parameter:
q=when:24h+allinurl:reuters.com
That’s how you let Google know what you need.
The when operator limits the results to articles published in the last X hours (24h in this example).
The allinurl operator restricts search results to documents that contain all of the query words in the document URL (reuters.com in this example).
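As a rough sketch, the q value can be assembled in Python. The function names build_query and build_search_url are hypothetical; the only things taken from the article are the base URL and the when / allinurl syntax. Colons are kept unescaped so that the result looks like the link above.

```python
from urllib.parse import urlencode

BASE_URL = "https://news.google.com/rss/search"


def build_query(site, hours=24):
    """Combine the two operators: articles from the last `hours` hours whose URL contains `site`."""
    return f"when:{hours}h allinurl:{site}"


def build_search_url(query):
    """Attach the query to the base URL; spaces become '+', colons stay readable."""
    return BASE_URL + "?" + urlencode({"q": query}, safe=":")


print(build_search_url(build_query("reuters.com")))
# https://news.google.com/rss/search?q=when:24h+allinurl:reuters.com
```

Changing hours to 48 or site to another domain should give the matching feed, assuming Google treats those values the same way it treats the ones shown here.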
Country and language:
ceid=US:en
If you need a Reuters feed for another country, change the site in the query to the matching subdomain (for example, ru.reuters.com) and adjust the country and language parameters (hl, gl, and ceid) to match.
For example, the Russian feed for the past 24 hours will look as follows:
https://news.google.com/rss/search?q=when:24h+allinurl:ru.reuters.com&hl=ru&gl=RU&ceid=RU:ru
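The two links in this article can be generated by one small helper. This is only a sketch: the function name reuters_feed_url is hypothetical, and the rule that ceid is the country code followed by the short language code is an assumption inferred from the two examples here (US:en and RU:ru), not something Google documents.

```python
from urllib.parse import urlencode

BASE_URL = "https://news.google.com/rss/search"


def reuters_feed_url(site="reuters.com", hours=24, language="en-US", country="US"):
    """Build a Google News RSS search URL for one Reuters edition."""
    # Assumption: ceid is "<country>:<short language>", e.g. "US:en" or "RU:ru".
    short_language = language.split("-")[0]
    params = {
        "q": f"when:{hours}h allinurl:{site}",
        "hl": language,   # language of the feed
        "gl": country,    # country of the feed
        "ceid": f"{country}:{short_language}",
    }
    # Keep ':' unescaped so the URL matches the hand-written links.
    return BASE_URL + "?" + urlencode(params, safe=":")


print(reuters_feed_url())
print(reuters_feed_url(site="ru.reuters.com", language="ru", country="RU"))
```

The second call reproduces the Russian link above character for character; the first differs from the English link only in the order of its parameters.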
Does this take all articles from Reuters, or only the top trending ones?
Nice, thanks!