IMDBQuerier
This project parses films from IMDB user lists based on a set of attributes. It uses Selenium and BeautifulSoup to fetch and parse the film data.
So far, the project can filter films by their:
- Runtime
- Score
- Year
- Genre
- Type (TV show or film)
IMDB already offers these same queries in the refine section at the bottom of each user list. However, applying the same selections to every list by hand is tedious.
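To make the attribute filters listed above concrete, here is a minimal sketch of how such filtering could work. It is not the project's actual code: the `Film` record, the option keys (`runtime`, `score`, `year`, `genres`, `type`), and the helper names are all assumptions made for illustration, and the real `parse_options` dictionary may be laid out differently.

```python
from dataclasses import dataclass, field


@dataclass
class Film:
    # Hypothetical record; the project's own film class may differ.
    title: str
    runtime: int                      # minutes
    score: float                      # IMDB rating, 0 to 10
    year: int
    genres: set = field(default_factory=set)
    kind: str = "film"                # "film" or "tv"


def in_range(value, bounds):
    """Return True when bounds is None or low <= value <= high."""
    if bounds is None:
        return True
    low, high = bounds
    return low <= value <= high


def matches(film, options):
    """Check one film against every filter present in options."""
    if not in_range(film.runtime, options.get("runtime")):
        return False
    if not in_range(film.score, options.get("score")):
        return False
    if not in_range(film.year, options.get("year")):
        return False
    wanted_genres = options.get("genres")
    if wanted_genres and not (set(wanted_genres) & film.genres):
        return False
    wanted_kind = options.get("type")
    if wanted_kind and film.kind != wanted_kind:
        return False
    return True


def filter_films(films, options):
    """Keep only the films that satisfy all given filters."""
    return [film for film in films if matches(film, options)]


# Made-up sample data, for illustration only.
films = [
    Film("A", 95, 7.8, 1999, {"Drama"}),
    Film("B", 180, 8.5, 2003, {"Fantasy", "Adventure"}),
    Film("C", 45, 8.9, 2011, {"Drama"}, kind="tv"),
]
options = {"runtime": (60, 120), "score": (7.0, 10.0), "type": "film"}
print([film.title for film in filter_films(films, options)])  # ['A']
```

Filters that are missing from the options dictionary are simply skipped, so the same options can be reused across many lists without restating every attribute.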
Check out the original repo for the latest version.
Requirements
The Selenium and BeautifulSoup modules are required. You will also need a WebDriver. The project uses ChromeDriver, but you can easily switch to any other supported browser.
If you change the driver, make sure to update the code below accordingly.
# main.py line 16
driver = webdriver.Chrome()
Here is a link for the Firefox driver.
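As a rough sketch of what switching drivers might look like, the snippet below picks the Selenium driver class by browser name. The helper names `driver_class_name` and `make_driver` are hypothetical and not part of the project, which calls `webdriver.Chrome()` directly; the sketch also assumes the matching driver binary (for example geckodriver for Firefox) is available to Selenium.

```python
# Hypothetical helper, not part of the project: maps a browser name to the
# name of the corresponding class in selenium.webdriver.
DRIVER_CLASSES = {"chrome": "Chrome", "firefox": "Firefox", "edge": "Edge"}


def driver_class_name(browser):
    """Return the selenium.webdriver class name for a browser name."""
    key = browser.strip().lower()
    if key not in DRIVER_CLASSES:
        raise ValueError("Unsupported browser: " + browser)
    return DRIVER_CLASSES[key]


def make_driver(browser="chrome"):
    """Create a WebDriver instance for the requested browser."""
    # Imported here so the name lookup above works without Selenium installed.
    from selenium import webdriver

    return getattr(webdriver, driver_class_name(browser))()
```

With a helper like this, the line in main.py could become `driver = make_driver("firefox")`. Replacing it with `driver = webdriver.Firefox()` directly works just as well.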
Usage
First, change the values in the parse_options dictionary in parser_config.py.
Then, set the list_url variable in main.py to the URL of the list you want to parse.
Run the code; the output HTML will appear in the list_htmls folder.
Common Driver Error
The browser driver version you are using may be outdated. If you run into an error, update the driver to the latest version.