I've been a TWIT listener ever since Leo started the first episode of This Week in Tech. I've learned a great deal and been entertained by the different netcasts over the years, not to mention the live radio show on the weekend.
So when COVID hit and Leo started the Club TWIT subscription, I signed up immediately. I then wanted to be able to download the ad-free versions of the episodes onto my hard drive so I could watch them on my KODI in the living room. My wife and I enjoy our morning coffee listening to TWIT, TWIG (now Intelligent Machines), Security Now, The Untitled Linux Show, and many others.
The question I then had was, how do I take this XML file, the club twit membership download, and grab the podcast videos onto my KODI hard drive? I decided to write myself a downloader in python that would parse the XML file and extract the videos with their appropriate episode titles for their file names.
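To give an idea of what parsing the feed involves, here is a minimal sketch, not the actual dlclubtwit code. It assumes the Club TWIT file is a standard RSS 2.0 feed in which each episode is an item with a title and an enclosure whose url attribute points at the video. The function names list_episodes and safe_filename, the .mp4 extension, and the sample feed are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET


def list_episodes(feed_xml: str):
    """Return (title, url) pairs for every <item> that has an <enclosure> url."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        title = item.findtext("title", default="untitled").strip()
        enclosure = item.find("enclosure")
        if enclosure is None or not enclosure.get("url"):
            continue  # no downloadable media in this item
        episodes.append((title, enclosure.get("url")))
    return episodes


def safe_filename(title: str, extension: str = ".mp4") -> str:
    """Replace characters that are awkward in file names with underscores."""
    cleaned = re.sub(r'[\\/:*?"<>|]+', "_", title).strip()
    return cleaned + extension


# Hypothetical sample feed, only to show the shape the sketch expects.
SAMPLE_FEED = """<rss><channel>
<item><title>Security Now 1000: Example</title>
<enclosure url="https://example.com/sn1000.mp4" type="video/mp4"/></item>
<item><title>No media here</title></item>
</channel></rss>"""

if __name__ == "__main__":
    for title, url in list_episodes(SAMPLE_FEED):
        print(safe_filename(title), url)
```

Using the episode title as the file name means KODI shows something readable, but titles can contain characters such as a colon or slash, which is why the sketch cleans them first.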
The result is on my GitHub as dlclubtwit. I decided to store three initialisation variables in the environment so that they are not hard coded in the program. This allows me to share the code without having any of my private information in the program.
There are three environment variables, one required and two optional. The required variable is your own private club twit URL. The other two are the destination folder to download into, which defaults to the current working directory, and the block size of the download records which defaults to 1048576 (1024*1024) bytes.
In the .profile file in your home directory, add these three environment variables as follows:
# set up configuration for the twit club downloader
# twitcluburl - the url from the twit club for your shows
export twitcluburl={your private url here}
# twitclubblocksize - the size of the block to read from the stream while downloading
export twitclubblocksize=8388608
# twitclubdestination - the location for the downloaded files
export twitclubdestination=/home/mainmeister/kodi/twit.tv
When you log into your home account, these variables will be set in your environment.
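On the Python side, reading these variables could look something like the sketch below. The variable names and the defaults (current working directory, 1048576 bytes) come from the description above; the function name load_config and the choice to raise ValueError when the URL is missing are my assumptions for the example, not necessarily what the program does.

```python
import os

DEFAULT_BLOCK_SIZE = 1024 * 1024  # 1048576 bytes


def load_config(env=os.environ):
    """Return (url, destination, block_size) from the environment."""
    url = env.get("twitcluburl")
    if not url:
        # The URL is the one required setting.
        raise ValueError("twitcluburl is not set in the environment")
    # The two optional settings fall back to their defaults.
    destination = env.get("twitclubdestination", os.getcwd())
    block_size = int(env.get("twitclubblocksize", DEFAULT_BLOCK_SIZE))
    return url, destination, block_size


if __name__ == "__main__":
    # A plain dict stands in for the real environment here.
    print(load_config({"twitcluburl": "https://example.com/feed"}))
```

Passing the environment in as a parameter keeps the function easy to try out without touching your real .profile.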
The program does have one command line argument in order to initially skip all of the current episodes. This is useful if you just want to start downloading future episodes. This is the -s or --skip argument.
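For anyone curious how such a flag is usually wired up, here is a small sketch using the standard argparse module. Only the -s and --skip names come from the program; the function name parse_args and the help text are assumptions for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Club TWIT downloader (sketch)")
    parser.add_argument(
        "-s", "--skip",
        action="store_true",
        help="mark all current episodes as skipped instead of downloading them",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["--skip"])
    print("skip current episodes:", args.skip)
```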
I have a script I run in tmux as a background session which runs the python program once an hour.
while true; do
    cd /home/mainmeister/kodi/twit.tv
    source .venv/bin/activate
    python3 main.py
    sleep 3600
done
To set up the program, create a virtual environment and install the dependencies:
python3 -m venv .venv
source .venv/bin/activate
pip install requests
git clone https://github.com/renesugar/html2txt.git
The requests module reads data from the internet over HTTP. The html2txt module is used to convert the weird XML episode descriptions into plain text.
To run the script, type:
source .venv/bin/activate
python3 main.py
It will print out the three environment variables and then proceed to download any shows that have not yet been downloaded or skipped. There is a lot of screen entertainment value in the details it prints from the XML file for each episode. While downloading, it shows a running count of the downloaded bytes in human-readable form.
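The block-by-block download with a running byte count can be sketched roughly like this. It is an illustration and not the code from main.py: human_readable and save_stream are hypothetical names, and the unit labels are my choice. The chunks argument can be any iterable of bytes; with the requests module, requests.get(url, stream=True).iter_content(chunk_size=block_size) yields blocks of that kind.

```python
def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units, e.g. 1048576 -> '1.0 MiB'."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


def save_stream(chunks, path, report=print):
    """Write each chunk to path, reporting the running total after every block."""
    total = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
            total += len(chunk)
            report(human_readable(total))
    return total


if __name__ == "__main__":
    # Two fake 1 MiB blocks stand in for a real network stream.
    fake_blocks = [b"\0" * 1048576, b"\0" * 1048576]
    save_stream(fake_blocks, "example.bin")
```

Reading in blocks keeps memory use bounded by the block size, which is what the twitclubblocksize setting controls.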
The program keeps a list of successfully downloaded or skipped episodes in a sqlite database file named dltwit.sqlite. This file is created when you first run the program. If this file is deleted then the next time you run the program it will recreate this file and download all of the episodes again, unless you use the -s or --skip command line argument.
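Here is a rough sketch of how this kind of bookkeeping can be done with Python's built-in sqlite3 module. Only the file name dltwit.sqlite comes from the program; the table name episodes, its columns, and the helper functions are assumptions I made up for the example, so the real schema may well differ.

```python
import sqlite3


def open_db(path="dltwit.sqlite"):
    """Open the database, creating the file and table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS episodes ("
        "url TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def is_done(conn, url):
    """True if the episode was already downloaded or skipped."""
    row = conn.execute("SELECT 1 FROM episodes WHERE url = ?", (url,)).fetchone()
    return row is not None


def mark(conn, url, status):
    """Record an episode as 'downloaded' or 'skipped'."""
    conn.execute(
        "INSERT OR REPLACE INTO episodes (url, status) VALUES (?, ?)",
        (url, status),
    )
    conn.commit()


if __name__ == "__main__":
    conn = open_db(":memory:")  # in-memory database, nothing written to disk
    mark(conn, "https://example.com/ep1.mp4", "skipped")
    print(is_done(conn, "https://example.com/ep1.mp4"))
```

With a scheme like this, the --skip option only needs to mark every episode currently in the feed as skipped, and later runs will pick up just the new ones.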