This is a search engine for transcribed podcasts. The transcripts are automatically generated by WhisperX and Whisper.cpp. The site is updated whenever there's a new episode.
Search is powered by Tantivy and the website by Django. The video player is VideoJS, and the CSS is by Bootstrap.
Labels are generated using spaCy and summaries by OpenHermes.
Share and Download icons by Icons8.
If you have any questions or suggestions, email me, or send me a message on Twitter / Reddit. I'm also on Discord.
The search engine has an API that can be accessed at /search/api.
GET /search/api/shows/
GET /search/api/shows/<id>/
GET /search/api/shows/<id>/episodes/
GET /search/api/shows/<id>/episodes/<id>/
examples:
curl -L https://fight.fudgie.org/search/api/shows/
curl -L https://fight.fudgie.org/search/api/shows/aj/
curl -L https://fight.fudgie.org/search/api/shows/aj/episodes/
curl -L https://fight.fudgie.org/search/api/shows/aj/episodes/20040624_Thu_Alex/
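The same four paths can be built from a script. This is a minimal Python sketch, not an official client: the helper names (`api_url`, `episodes_url`, `fetch_json`) are made up for illustration, and the base URL is taken from the curl examples above. Since the layout of the returned JSON is not described here, the sketch only builds URLs and decodes whatever comes back.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the curl examples above.
BASE_URL = "https://fight.fudgie.org/search/api"


def api_url(show=None, episode=None):
    """Build the URL for the show list, a single show, or a single episode."""
    if episode is not None and show is None:
        raise ValueError("an episode id requires a show id")
    url = f"{BASE_URL}/shows/"
    if show is not None:
        url += f"{quote(show, safe='')}/"
    if episode is not None:
        url += f"episodes/{quote(episode, safe='')}/"
    return url


def episodes_url(show):
    """Build the URL that lists the episodes of one show."""
    return api_url(show) + "episodes/"


def fetch_json(url, timeout=30):
    """Fetch a URL and decode the JSON body (urlopen follows redirects, like curl -L)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request, so it is left commented out):
# shows = fetch_json(api_url())
# episode = fetch_json(api_url("aj", "20040624_Thu_Alex"))
```

The ids are percent-encoded before being placed in the path, so an id containing unusual characters cannot change the shape of the URL.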
GET /search/api/latest/
example:
curl -L https://fight.fudgie.org/search/api/latest/
GET /search/api/search/<query>
query required parameters:
q terms to search for
query optional parameters:
s comma-separated list of show paths to search in (e.g. aj,sr,nn); defaults to all sources
exact on or off, exact/verbatim query, skips stemming of terms
offset offset in the results
limit number of results to return
order episode, recent, or score
invert on or off, inverts the sources to search in
start_date YYYY-MM-DD
end_date YYYY-MM-DD
example:
curl -L "https://fight.fudgie.org/search/api/search/?s=kf&q=knowledge%20fight&exact=on&invert=on&order=recent&limit=10"
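A search URL can also be assembled in code. The sketch below is an illustration, not a reference client: `search_url` is a hypothetical helper, the search terms are sent as `q` because that is the required parameter in the list above, and it assumes that leaving an optional parameter out gives the server default.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the curl example above.
SEARCH_URL = "https://fight.fudgie.org/search/api/search/"

# Values listed for the order parameter above.
VALID_ORDERS = ("episode", "recent", "score")


def search_url(q, shows=None, exact=False, invert=False, order=None,
               offset=None, limit=None, start_date=None, end_date=None):
    """Build a search URL from the documented query parameters."""
    params = {"q": q}
    if shows:
        params["s"] = ",".join(shows)  # comma-separated show paths
    if exact:
        params["exact"] = "on"
    if invert:
        params["invert"] = "on"
    if order is not None:
        if order not in VALID_ORDERS:
            raise ValueError(f"order must be one of {VALID_ORDERS}")
        params["order"] = order
    optional = (("offset", offset), ("limit", limit),
                ("start_date", start_date), ("end_date", end_date))
    for name, value in optional:
        if value is not None:
            params[name] = str(value)  # dates are passed as YYYY-MM-DD strings
    # quote_via=quote encodes spaces as %20, matching the curl example.
    return SEARCH_URL + "?" + urlencode(params, quote_via=quote)


print(search_url("knowledge fight", shows=["kf"], exact=True,
                 order="recent", limit=10))
```

Building the query with `urlencode` keeps spaces and other special characters in the search terms from breaking the URL.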
Each URL returns a JSON object. A single query returns at most 10000 episodes; to get the next set of episodes, append ?offset=10000 to the URL.
curl https://fight.fudgie.org/search/api/shows/ | jq
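For listings longer than the 10000-episode limit, the offsets can be generated in a loop. This is a rough sketch: `with_offset` and `page_urls` are hypothetical helpers, and the total number of episodes is something the caller has to supply, since the response layout is not described here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Maximum number of episodes returned per query, as stated above.
PAGE_SIZE = 10000


def with_offset(url, offset):
    """Return url with its offset query parameter set (dropped when offset is 0)."""
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "offset"]
    if offset:
        query.append(("offset", str(offset)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def page_urls(url, total, page_size=PAGE_SIZE):
    """Yield one URL per page needed to cover `total` episodes."""
    for offset in range(0, total, page_size):
        yield with_offset(url, offset)


# Example: a listing assumed to hold 25000 episodes needs three requests.
for page in page_urls("https://fight.fudgie.org/search/api/shows/aj/episodes/", 25000):
    print(page)
```

Any existing offset in the URL is replaced rather than duplicated, so the same helper works on a URL that already carries other query parameters.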
I provide downloads of the archives I've collected here as torrents.
5,013 episodes, from 2008-04 to 2024-06-18. Most were downloaded as MP4, early videos converted from FLV and WMV.
Resolutions go from 320x240, to 480x270, 720x480, 700x394, 768x432 and finally 1280x720 from 2019-09-06.
9.11 TB, MP4 format
torrent magnet