CODARIUM

"Scrape" ALL latest news on any company/person within 3 lines of Python code

Mining news data from Google

Artem Bugara
Jul 6, 2020

Step 1. Install pygooglenews

$ pip install pygooglenews

Step 2. Three lines to get news data

from pygooglenews import GoogleNews

gn = GoogleNews(lang = 'en', country = 'US')

# latest news on Amazon that got published over the last hour
news = gn.search('Amazon', when = '1h')
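
Once you have `news`, you will probably want to loop over the articles. Here is a minimal sketch, assuming the result is a dict shaped like a parsed RSS feed, with an 'entries' list whose items have 'title', 'link' and 'published' fields. The `summarize_entries` helper and the `sample` data are hypothetical and only here for illustration, so check the actual keys of your own result before relying on them.

```python
from typing import Any, Dict, List, Tuple


def summarize_entries(result: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (title, link, published) for each entry in a search result.

    Assumption: `result` has an 'entries' list of dict-like items.
    Missing fields fall back to an empty string.
    """
    rows = []
    for entry in result.get("entries", []):
        rows.append((
            entry.get("title", ""),
            entry.get("link", ""),
            entry.get("published", ""),
        ))
    return rows


# Hypothetical stand-in for the value returned by gn.search(...)
sample = {
    "feed": {"title": "sample feed"},
    "entries": [
        {
            "title": "Example headline",
            "link": "https://example.com/a",
            "published": "Mon, 06 Jul 2020 10:00:00 GMT",
        },
    ],
}

for title, link, published in summarize_entries(sample):
    print(published, title, link)
```

In a real script you would pass `news` instead of `sample`.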

Why is it cool?

  1. You get data from Google — the best search engine

  2. pygooglenews hits the RSS feed URL (not the UI URL), so you do not get blocked for scraping Google

  3. Google’s RSS feed can contain up to 100 articles. However, Google lets you specify that you want only the articles from the past hour.

    Make such a request a few times an hour, and you will not miss a single news article that mentions your company/person of interest.

    (Unless there were significantly more than 100 articles indexed by Google about it within the past hour)
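
If you poll a few times an hour with a one-hour window, the same article will show up in several responses, so you need to drop duplicates. Here is a rough sketch of that step, assuming each entry is dict-like and has a 'link' field that identifies the article. The `collect_new` helper and the two fake polls are hypothetical; they are not part of pygooglenews.

```python
from typing import Any, Dict, Iterable, List, Set


def collect_new(entries: Iterable[Dict[str, Any]], seen_links: Set[str]) -> List[Dict[str, Any]]:
    """Return entries whose link has not been seen yet, and remember them."""
    fresh = []
    for entry in entries:
        link = entry.get("link")
        # Skip entries without a link and entries already collected
        if link and link not in seen_links:
            seen_links.add(link)
            fresh.append(entry)
    return fresh


# Two fake polls with one overlapping article (illustrative data only)
seen: Set[str] = set()
first_poll = [
    {"title": "A", "link": "https://example.com/a"},
    {"title": "B", "link": "https://example.com/b"},
]
second_poll = [
    {"title": "B", "link": "https://example.com/b"},
    {"title": "C", "link": "https://example.com/c"},
]

print([e["title"] for e in collect_new(first_poll, seen)])   # ['A', 'B']
print([e["title"] for e in collect_new(second_poll, seen)])  # ['C']
```

In practice the entries would come from repeated `gn.search(..., when = '1h')` calls on a schedule. If a single response contains the full 100 articles, that is a hint you may be hitting the cap and should poll more often.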


Other reads that you might find interesting:

  1. "Reverse Engineering" Google News RSS Feed. Part I.

  2. "Doing HTTP requests? Always use 'requests' library, not 'urllib'"
