Add ArchiveBox to Web Content Extracting section

Nick Sweeting 2024-05-04 00:57:16 -07:00 committed by GitHub
parent b5bd4d0ad0
commit 1fa73e3ef7
GPG Key ID: B5690EEEBB952194

@@ -1118,6 +1118,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
*Libraries for extracting web contents.*
* [archivebox](https://github.com/ArchiveBox/ArchiveBox) - Extract text, media, images, git repos, and more.
* [html2text](https://github.com/Alir3z4/html2text) - Convert HTML to Markdown-formatted text.
* [lassie](https://github.com/michaelhelmick/lassie) - Web Content Retrieval for Humans.
* [micawber](https://github.com/coleifer/micawber) - A small library for extracting rich content from URLs.