mirror of
https://gitlab.com/jeancf/twoot.git
synced 2025-01-17 23:07:04 +00:00
Add description of file structure
parent
3d0262f005
commit
b45e60a778
127
structure.md
Normal file
@ -0,0 +1,127 @@
# `main()`

- Start timer
- Parse command line
- Build config object - `build_config()`
- Set up logging
- Open or create database
- Select nitter instance to use
- Get soup of whole page and timeline (list of soup of items) - `get_timeline()`
- Iterate timeline to generate list of dicts with content of each tweet:
    - Extract tweet ID
    - Extract timestamp
    - Skip if timestamp is not within acceptable range - `is_time_valid()`
    - Skip if it is a retweet and retweets are excluded
    - Check database for the tweet and skip it if it already exists
    - Extract author name
    - Extract twitter user name of author
    - Extract full status page URL
    - Add prefix if tweet is reply-to
    - Add prefix if tweet is retweet
    - Process media body - `process_media_body()`
    - Add link to quoted page ("card")
    - Extract image(s) from card - `process_card()`
    - Process video and image attachments - `process_attachments()`
    - Add custom footer
    - Add "Original tweet" footer
    - Add optional timestamp to footer
    - If no media, look for image in linked URL
    - Get filename of downloaded video
- Update user profile if necessary - `update_profile()`
- Log in to Mastodon instance - `login()`
- Check toot character limit
- Iterate list of tweets
    - Check that toot cap has not been reached
    - Upload video if applicable (previously downloaded)
    - If no video and if applicable, download and upload picture
    - Find toot ID of replied_to_tweet in database
    - Post toot + insert in database
- Clean up downloaded video files
- Delete excess records in database

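The three database steps in `main()` (skip tweets already recorded, insert after posting, delete excess records) could look roughly like the sketch below. It is only an illustration: the table name `toots`, its columns, and the helper names `open_db`, `tweet_exists`, `record_toot` and `prune` are assumptions, since the actual schema is not described here.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open or create the database (hypothetical schema, for illustration only)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS toots ("
        "twitter_account TEXT, tweet_id TEXT, toot_id TEXT, "
        "PRIMARY KEY (twitter_account, tweet_id))"
    )
    return db


def tweet_exists(db, account, tweet_id):
    """Return True if this tweet was already recorded for the account."""
    row = db.execute(
        "SELECT 1 FROM toots WHERE twitter_account = ? AND tweet_id = ?",
        (account, tweet_id),
    ).fetchone()
    return row is not None


def record_toot(db, account, tweet_id, toot_id):
    """Remember which toot was posted for which tweet."""
    db.execute(
        "INSERT INTO toots VALUES (?, ?, ?)", (account, tweet_id, toot_id)
    )
    db.commit()


def prune(db, account, keep):
    """Keep only the `keep` most recently inserted records for the account."""
    db.execute(
        "DELETE FROM toots WHERE twitter_account = ? AND rowid NOT IN ("
        "SELECT rowid FROM toots WHERE twitter_account = ? "
        "ORDER BY rowid DESC LIMIT ?)",
        (account, account, keep),
    )
    db.commit()
```

Keeping the tweet-to-toot mapping in the same table is what would make the "find toot ID of replied_to_tweet" step a simple lookup in this sketch.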
# `build_config()`

- Instantiate `global TOML` struct
- Populate TOML with default values
- Load config file and (over)write all valid keys with values read from file
- If no config file, (over)write all valid keys with values read from the command line
- Verify that a minimum valid config is present

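The layering described above (defaults first, then the config file, or the command line when there is no file) can be sketched as follows. This is a simplified illustration under stated assumptions: it returns a dict instead of filling a global, it takes already-parsed values rather than reading a file, and the section names, the keys other than `tweet_delay` and `tweet_max_age`, and the `REQUIRED` list are hypothetical.

```python
import copy

# Hypothetical defaults; only tweet_delay and tweet_max_age are named in this document.
DEFAULTS = {
    "config": {"twitter_account": "", "mastodon_instance": ""},
    "options": {"tweet_max_age": 1.0, "tweet_delay": 0.0, "toot_cap": 0},
}

# Hypothetical minimum config: keys that must end up non-empty.
REQUIRED = ("twitter_account", "mastodon_instance")


def build_config(file_values=None, cli_values=None):
    """Start from defaults, then overwrite valid keys from file or command line."""
    toml = copy.deepcopy(DEFAULTS)

    # The config file takes the place of the command line when it is present.
    source = file_values if file_values is not None else (cli_values or {})

    for section, values in source.items():
        if section not in toml:
            continue  # unknown section: ignore
        for key, value in values.items():
            if key in toml[section]:  # only valid (known) keys are overwritten
                toml[section][key] = value

    missing = [key for key in REQUIRED if not toml["config"][key]]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return toml
```

Unknown keys are dropped silently in this sketch; a real implementation might prefer to warn about them.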
# `get_timeline()`

- Initiate requests session
- Populate headers
- Download nitter page of user
- Make soup
- Build a list with soup of each timeline item
- Iterate list
    - If individual tweet, add to final list
    - If first tweet of thread, get the thread from tweet page - `_get_rest_of_thread()`

# `_get_rest_of_thread()`

- Download page
- Make soup
- Get all items in thread after main tweet
- Build list with references to previous tweet
- Reverse timeline order

# `is_time_valid()`

- Compare timestamp to `tweet_delay` and `tweet_max_age`

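A minimal sketch of such a check is shown below. The units are assumptions (timestamp in POSIX seconds, `tweet_delay` in minutes, `tweet_max_age` in days), as is the extra `now` parameter, which is only there to make the function easy to test.

```python
from datetime import datetime, timedelta, timezone


def is_time_valid(timestamp, tweet_delay, tweet_max_age, now=None):
    """Return True if the tweet is neither too recent nor too old.

    Assumed units: timestamp and now in POSIX seconds,
    tweet_delay in minutes, tweet_max_age in days.
    """
    if now is None:
        now = datetime.now(timezone.utc).timestamp()
    age = timedelta(seconds=now - timestamp)

    if age < timedelta(minutes=tweet_delay):
        return False  # posted too recently: wait before tooting it
    if age > timedelta(days=tweet_max_age):
        return False  # too old: do not toot it at all
    return True
```

For example, with a 15 minute delay and a 1 day maximum age, a tweet posted an hour ago passes, while one posted a minute ago or two days ago is skipped.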
# `process_media_body()`

- Copy plain text
- Convert links starting with @ and # to plain text
- Remove redirection from links - `deredir_url()`
- Substitute source from links - `substitute_source()`
- Remove trackers from fragments - `clean_url()`

# `process_card()`

- Get list of image URLs in card tag

# `process_attachments()`

- Collect URLs of images
- Download nitter video (converted animated GIF) and save it in output directory
- Download twitter video by calling `youtube_dl` and save it in output directory

# `update_profile()`

- Extract banner and avatar picture addresses from soup
- Get the banner and avatar picture addresses from database
- If user record not found in db, create a new one
- If they have changed
    - Download banner and avatar pictures
    - Log in to Mastodon - `login()`
    - Update credentials
    - Record image URLs in database

# `login()`

- Create Mastodon application
- Log in with password if provided
- Log in with token

# `deredir_url()`

- Populate HTTP headers
- Download the page

# `substitute_source()`

- Parse URL
- Substitute domain values from config
- Unparse URL

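The parse, substitute, unparse sequence maps naturally onto `urllib.parse`. The sketch below is one possible shape, not the project's actual code: the `SUBSTITUTIONS` mapping and its example domains are hypothetical stand-ins for whatever the config provides.

```python
from urllib.parse import urlparse, urlunparse

# Hypothetical mapping; in practice these values would come from the config.
SUBSTITUTIONS = {
    "twitter.com": "nitter.example.org",
    "www.youtube.com": "invidious.example.org",
}


def substitute_source(url, substitutions=SUBSTITUTIONS):
    """Replace the domain of `url` if it appears in `substitutions`."""
    parsed = urlparse(url)  # Parse URL
    replacement = substitutions.get(parsed.netloc.lower())
    if replacement is None:
        return url  # domain not listed: leave the URL untouched
    # Substitute the domain, then unparse back into a string
    return urlunparse(parsed._replace(netloc=replacement))
```

Only the network location is replaced, so the path, query and fragment of the original link are preserved.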
# `clean_url()`

- Parse URL
- Remove UTM parameters from query and fragments
- Unparse URL

# `_remove_trackers_query(url_parsed.query)`
# `_remove_trackers_fragment(url_parsed.fragment)`