Python/web_programming/fetch_jobs.py
Christian Clauss a2fa32c7ad
Lukazlim: Replace dependency requests with httpx (#12744)
* Replace dependency `requests` with `httpx`

Fixes #12742
Signed-off-by: Lim, Lukaz Wei Hwang <lukaz.wei.hwang.lim@intel.com>

* updating DIRECTORY.md

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Lim, Lukaz Wei Hwang <lukaz.wei.hwang.lim@intel.com>
Co-authored-by: Lim, Lukaz Wei Hwang <lukaz.wei.hwang.lim@intel.com>
Co-authored-by: cclauss <cclauss@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-05-14 04:42:11 +03:00

35 lines
1012 B
Python

"""
Scraping jobs given job title and location from indeed website
"""
# /// script
# requires-python = ">=3.13"
# dependencies = [
# "beautifulsoup4",
# "httpx",
# ]
# ///
from __future__ import annotations
from collections.abc import Generator
import httpx
from bs4 import BeautifulSoup
url = "https://www.indeed.co.in/jobs?q=mobile+app+development&l="
def fetch_jobs(location: str = "mumbai") -> Generator[tuple[str, str]]:
soup = BeautifulSoup(httpx.get(url + location, timeout=10).content, "html.parser")
# This attribute finds out all the specifics listed in a job
for job in soup.find_all("div", attrs={"data-tn-component": "organicJob"}):
job_title = job.find("a", attrs={"data-tn-element": "jobTitle"}).text.strip()
company_name = job.find("span", {"class": "company"}).text.strip()
yield job_title, company_name
if __name__ == "__main__":
for i, job in enumerate(fetch_jobs("Bangalore"), 1):
print(f"Job {i:>2} is {job[0]} at {job[1]}")