Added automated web scraper script with README and requirements.

2025-05-17 14:46:43 +00:00 · 2023-10-07 05:27:05 +00:00
commit 4b5cb86d29
parent e9e1cde1a6
3 changed files with 49 additions and 0 deletions
--- a/Automated-Web-Scraper/README.md
+++ b/Automated-Web-Scraper/README.md
@@ -0,0 +1,31 @@
+# Automated Web Scraper
+
+This Python script fetches a web page with `requests` and parses it with `BeautifulSoup`. As shipped, it prints the text of every heading (`h1` to `h6`) on the page.
+
+## Usage
+
+1. Edit the script (`automated-web-scraper.py`): set `url` to the page you want to scrape and adjust the extraction step to select the data you need.
+
+2. Run the script with Python: `python automated-web-scraper.py`.
+
+## Requirements
+
+- Python 3.x
+- `requests` library
+- `beautifulsoup4` library (imported as `bs4`)
+
+## Installation
+
+1. Clone this repository or download the script (`automated-web-scraper.py`).
+
+2. Install the required libraries from the `requirements.txt` file: `pip install -r requirements.txt`.
+
+3. Modify the script and run it to perform automated web scraping.
+
+## Author
+
+sonicdashh
+
+## License
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
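
Usage step 1 in the README asks the reader to "specify the data you want to extract" without showing what that might look like. A minimal sketch follows, assuming the goal is to collect links rather than headings. The helper name `extract_links` and the sample HTML are hypothetical and are not part of this commit.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html: str, base_url: str) -> list[tuple[str, str]]:
    """Return (link text, absolute URL) pairs for every <a> tag with an href.

    `extract_links` is an illustrative helper, not part of the committed script.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True keeps only anchors that actually carry an href attribute.
    for anchor in soup.find_all("a", href=True):
        text = anchor.get_text(" ", strip=True)
        # Resolve relative hrefs such as "/docs" against the page URL.
        links.append((text, urljoin(base_url, anchor["href"])))
    return links
```

Called on `'<a href="/docs">Docs</a>'` with base URL `https://example.com`, this returns `[("Docs", "https://example.com/docs")]`. Swapping the `find_all` arguments is the only change needed to target other elements.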
--- a/Automated-Web-Scraper/automated-web-scraper.py
+++ b/Automated-Web-Scraper/automated-web-scraper.py
@@ -0,0 +1,16 @@
+import requests
+from bs4 import BeautifulSoup
+
+# Specify the URL to scrape
+url = "https://example.com"
+
+# Send an HTTP request; give up after 10 s and raise on 4xx/5xx responses
+response = requests.get(url, timeout=10)
+response.raise_for_status()
+# Parse the HTML content
+soup = BeautifulSoup(response.text, "html.parser")
+
+# Extract and process data (e.g., extract all headings)
+headings = soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"])
+for heading in headings:
+    print(heading.text.strip())
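
The script above runs everything at module level, which makes it hard to reuse or to test without network access. One possible restructuring is sketched below, assuming a 10 second timeout is acceptable. The function names `extract_headings` and `scrape_headings` are hypothetical and do not appear in the commit.

```python
import requests
from bs4 import BeautifulSoup

HEADING_TAGS = ["h1", "h2", "h3", "h4", "h5", "h6"]


def extract_headings(html: str) -> list[str]:
    """Return the stripped text of every h1-h6 element, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    return [h.get_text(" ", strip=True) for h in soup.find_all(HEADING_TAGS)]


def scrape_headings(url: str, timeout: float = 10.0) -> list[str]:
    """Fetch `url` and return its headings.

    Raises requests.RequestException on connection errors, timeouts,
    and 4xx/5xx responses (via raise_for_status).
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return extract_headings(response.text)
```

Keeping the parsing step separate from the HTTP call means `extract_headings` can be exercised on a literal HTML string. A caller would typically wrap `scrape_headings("https://example.com")` in `try`/`except requests.RequestException` and report the failure instead of letting the traceback escape.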
--- a/Automated-Web-Scraper/requirements.txt
+++ b/Automated-Web-Scraper/requirements.txt
@@ -0,0 +1,2 @@
+requests==2.26.0
+beautifulsoup4==4.10.0
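
Because `requirements.txt` pins exact versions, it can be useful to confirm that the active environment matches those pins before running the scraper. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helpers `parse_pins` and `find_mismatches` are hypothetical, and the parser assumes simple `name==version` lines like the two in this file.

```python
from importlib import metadata


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Parse 'name==version' lines into a {name: version} mapping.

    Only exact pins are handled; other specifiers are ignored.
    """
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins: dict[str, str]) -> dict[str, "str | None"]:
    """Return {name: installed version or None} for every pin not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches
```

An empty result from `find_mismatches(parse_pins(open("requirements.txt").read()))` would indicate that both pinned packages are installed at the expected versions.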