Download page as pdf (#196)

* Download page as PDF.
* Contributor name.
* Pudim page typo.
2024-11-23 20:11:07 +00:00 · 2020-10-24 08:50:19 -03:00
commit 957f7ab45c
parent 21b89e112a
4 changed files with 69 additions and 0 deletions
--- a/Download-page-as-pdf/README.md
+++ b/Download-page-as-pdf/README.md
@@ -0,0 +1,24 @@
+# Download Page as PDF
+
+Download a web page and save it as a PDF file.
+
+ #### Required Modules:
+  - pyppdf
+    ```bash
+      pip3 install pyppdf
+    ```
+  - pyppeteer
+    ```bash
+      pip3 install pyppeteer
+    ```
+
+ #### Usage examples:
+ - Download a page:
+ ```bash
+    python download-page-as-pdf.py -l 'http://www.pudim.com.br'
+ ```
+
+ - Download a page and choose the name of the PDF file:
+ ```bash
+    python download-page-as-pdf.py -l 'http://www.pudim.com.br' -n 'pudim.pdf'
+ ```
--- a/Download-page-as-pdf/download-page-as-pdf.py
+++ b/Download-page-as-pdf/download-page-as-pdf.py
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+# -*- coding: UTF-8 -*-
+
+import argparse
+import pyppdf
+import re
+from pyppeteer.errors import PageError, TimeoutError, NetworkError
+
+
+def main():
+    parser = argparse.ArgumentParser(description='Download a web page as a PDF file.')
+    parser.add_argument('--link', '-l', action='store', dest='link',
+                        required=True, help='URL of the page to download.')
+    parser.add_argument('--name', '-n', action='store', dest='name',
+                        required=False, help='Name of the PDF file to save.')
+
+    arguments = parser.parse_args()
+
+    url = arguments.link
+
+    if not arguments.name:
+        name = re.sub(r'^\w+://', '', url.lower())
+        name = name.replace('/', '-')
+    else:
+        name = arguments.name
+
+    if not name.endswith('.pdf'):
+        name = name + '.pdf'
+
+    print(f'Name of the file: {name}')
+
+    try:
+        pyppdf.save_pdf(name, url)
+    except PageError:
+        print('URL could not be resolved.')
+    except TimeoutError:
+        print('Timeout.')
+    except NetworkError:
+        print('No access to the network.')
+
+if __name__ == '__main__':
+    main()
--- a/Download-page-as-pdf/requirements.txt
+++ b/Download-page-as-pdf/requirements.txt
@@ -0,0 +1,2 @@
+pyppdf==0.1.2
+pyppeteer==0.2.2
--- a/README.md
+++ b/README.md
@@ -165,6 +165,7 @@ So far, the following projects have been integrated to this repo:
 |[IMDBQuerier](IMDBQuerier)|[Burak Bekci](https://github.com/Bekci)
 |[URL shortener](url_shortener)|[Sam Ebison](https://github.com/ebsa491)
 |[2048](https://github.com/hastagAB/Awesome-Python-Scripts/tree/master/2048)|[Krunal](https://github.com/gitkp11)
+|[Download Page as PDF](https://github.com/hastagAB/Awesome-Python-Scripts/tree/master/Download-page-as-pdf)|[Jeremias Gomes](https://github.com/j3r3mias)


 ## How to use :
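
Review note: in `download-page-as-pdf.py`, the output name is derived inline inside `main()`, which makes it awkward to test without launching a browser. The sketch below pulls that logic into a standalone function that mirrors the behavior in the diff (strip the scheme, lowercase, replace `/` with `-`, append `.pdf`). The names `derive_pdf_name` and `normalize_url` are hypothetical and not part of this commit. `normalize_url` is an additional suggestion, assuming that a bare host such as `www.pudim.com.br` may be rejected when the browser tries to navigate to it; that assumption has not been verified against `pyppdf` here.

```python
import re
from typing import Optional

# Matches a leading URL scheme such as "http://" or "https://".
SCHEME_RE = re.compile(r'^\w+://')


def normalize_url(url: str, default_scheme: str = 'http') -> str:
    """Hypothetical helper: prepend a scheme when the URL has none."""
    if SCHEME_RE.match(url):
        return url
    return f'{default_scheme}://{url}'


def derive_pdf_name(url: str, name: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the naming logic in main().

    Without an explicit name, the file name is built from the URL:
    lowercase it, drop the scheme, and replace '/' with '-'.
    The '.pdf' extension is appended when it is missing.
    """
    if not name:
        name = SCHEME_RE.sub('', url.lower())
        name = name.replace('/', '-')
    if not name.endswith('.pdf'):
        name = name + '.pdf'
    return name


if __name__ == '__main__':
    print(normalize_url('www.pudim.com.br'))
    print(derive_pdf_name('http://www.pudim.com.br'))
    print(derive_pdf_name('http://www.pudim.com.br', 'pudim.pdf'))
```

With helpers like these, `main()` could call `pyppdf.save_pdf(derive_pdf_name(url, arguments.name), normalize_url(url))`, and the naming rules could be covered by plain unit tests.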