Skip to Main Content
Python logo with a newspaper and the Chinese flag

China News Distributor

Webscraping

E-mail

Docx

Warning! This code relies on external sources, and may require some adjustments in order to work as intended.

Description

Webscraper used to get gather and distribute news articles based on keywords.

This project allows for a list of “collections” based on any number of keywords. The articles are organized in a Word document (.docx) with custom styling, allowing for easy overview. The documents are then sent out via e-mail.

This was a project born from the book and online course AutomateTheBoringStuffWithPython, which then got expanded way beyond it’s initial scope.

The problem is solved was, that a number of people, skimmed Chinese news websites and articles, whose content were interesting for the international context. These articles were copy/pasted into a document, which in turn was sent to be proofread, before being distributed to a large number of subscribers. This was done manually on a daily basis. The entire task took around 5 man-hours per day.

This was my very first project that had a real impact, was solving a real problem.

This project taught me the basics of refactoring, organizing code, Object-Oriented Programming, and scope creep.

I also learned how dynamic the web actually is under the hood, by how many times a correction was necessary when a path or selector had changed for a target HTML element.