Merge pull request #177 from Puyodead1/feat/cookies

Merge feat/cookies
Puyodead1 2023-08-12 23:56:15 -04:00 committed by GitHub
commit 84eb17b793
GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 179 additions and 133 deletions

177
README.md

@ -10,9 +10,9 @@
# NOTE # NOTE
- **This tool will not work without decryption keys. Do not bother installing unless you already have keys or can obtain them!** - **This tool will not work without decryption keys. Do not bother installing unless you already have keys or can obtain them!**
- **Downloading courses is against Udemy's Terms of Service, I am NOT held responsible for your account getting suspended as a result from the use of this program!** - **Downloading courses is against Udemy's Terms of Service, I am NOT held responsible for your account getting suspended as a result from the use of this program!**
- This program is WIP, the code is provided as-is and I am not held resposible for any legal issues resulting from the use of this program. - This program is WIP, the code is provided as-is and I am not held resposible for any legal issues resulting from the use of this program.
# Description # Description
@ -25,10 +25,10 @@ The following are a list of required third-party tools, you will need to ensure
_**Note**:_ _These are seperate requirements that are not installed with the pip command! You will need to download and install these manually!_ _**Note**:_ _These are seperate requirements that are not installed with the pip command! You will need to download and install these manually!_
- [ffmpeg](https://www.ffmpeg.org/) - This tool is also available in Linux package repositories - [ffmpeg](https://www.ffmpeg.org/) - This tool is also available in Linux package repositories
- [aria2/aria2c](https://github.com/aria2/aria2/) - This tool is also available in Linux package repositories - [aria2/aria2c](https://github.com/aria2/aria2/) - This tool is also available in Linux package repositories
- [shaka-packager](https://github.com/shaka-project/shaka-packager/releases/latest) - [shaka-packager](https://github.com/shaka-project/shaka-packager/releases/latest)
- [yt-dlp](https://github.com/yt-dlp/yt-dlp/) - This tool is also available in Linux package repositories, but can also be installed using pip if desired (`pip install yt-dlp`) - [yt-dlp](https://github.com/yt-dlp/yt-dlp/) - This tool is also available in Linux package repositories, but can also be installed using pip if desired (`pip install yt-dlp`)
# Usage # Usage
@ -36,44 +36,56 @@ _quick and dirty how-to_
You will need to get a few things before you can use this program: You will need to get a few things before you can use this program:
- Decryption Key ID - Decryption Key ID
- Decryption Key - Decryption Key
- Udemy Course URL - Udemy Course URL
- Udemy Bearer Token (aka acccess token for udemy-dl users) - Udemy Bearer Token (aka acccess token for udemy-dl users)
- Udemy cookies (only required for subscription plans - see [Udemy Subscription Plans](#udemy-subscription-plans)) - Udemy cookies (only required for subscription plans - see [Udemy Subscription Plans](#udemy-subscription-plans))
## Setting up ## Setting up
- rename `.env.sample` to `.env` _(you only need to do this if you plan to use the .env file to store your bearer token)_ - rename `.env.sample` to `.env` _(you only need to do this if you plan to use the .env file to store your bearer token)_
- rename `keyfile.example.json` to `keyfile.json` - rename `keyfile.example.json` to `keyfile.json`
## Acquire Bearer Token ## Acquire Bearer Token
- Firefox: [Udemy-DL Guide](https://github.com/r0oth3x49/udemy-dl/issues/389#issuecomment-491903900) - Firefox: [Udemy-DL Guide](https://github.com/r0oth3x49/udemy-dl/issues/389#issuecomment-491903900)
- Chrome: [Udemy-DL Guide](https://github.com/r0oth3x49/udemy-dl/issues/389#issuecomment-492569372) - Chrome: [Udemy-DL Guide](https://github.com/r0oth3x49/udemy-dl/issues/389#issuecomment-492569372)
- If you want to use the .env file to store your Bearer Token, edit the .env and add your token. - If you want to use the .env file to store your Bearer Token, edit the .env and add your token.
## Key ID and Key ## Key ID and Key
It is up to you to acquire the key and key ID. Please **DO NOT** ask me for help acquiring these, decrypting DRM protected content can be considered piracy. The tool required for this has already been discused in a GitHub issue. It is up to you to acquire the key and key ID. Please **DO NOT** ask me for help acquiring these, decrypting DRM protected content can be considered piracy. The tool required for this has already been discused in a GitHub issue.
- Enter the key and key id in the `keyfile.json` - Enter the key and key id in the `keyfile.json`
- ![keyfile example](https://i.imgur.com/e5aU0ng.png) - ![keyfile example](https://i.imgur.com/e5aU0ng.png)
- ![example key and kid from console](https://i.imgur.com/awgndZA.png) - ![example key and kid from console](https://i.imgur.com/awgndZA.png)
## Start Downloading ## Cookies
To download a course included in a subscription plan that you did not purchase individually, you will need to use cookies. You can also use cookies as an alternative to Bearer Tokens.
The program can automatically extract them from your browser. You can specify what browser to extract cookies from with the `--browser` argument. Supported browsers are:
- chrome
- firefox
- opera
- edge
- brave
- chromium
- vivaldi
- safari
## Ready to go
You can now run the program, see the examples below. The course will download to `out_dir`. You can now run the program, see the examples below. The course will download to `out_dir`.
# Udemy Subscription Plans
You will need to use a different branch of the program, please see [feat/cookies](https://github.com/Puyodead1/udemy-downloader/tree/feat/cookies).
# Advanced Usage # Advanced Usage
``` ```
usage: main.py [-h] -c COURSE_URL [-b BEARER_TOKEN] [-q QUALITY] [-l LANG] [-cd CONCURRENT_DOWNLOADS] [--disable-ipv6] [--skip-lectures] [--download-assets] [--download-captions] [--keep-vtt] [--skip-hls] usage: main.py [-h] -c COURSE_URL [-b BEARER_TOKEN] [-q QUALITY] [-l LANG] [-cd CONCURRENT_DOWNLOADS] [--disable-ipv6] [--skip-lectures] [--download-assets] [--download-captions] [--download-quizzes]
[--info] [--id-as-course-name] [-sc] [--save-to-file] [--load-from-file] [--log-level LOG_LEVEL] [--use-h265] [--h265-crf H265_CRF] [--h265-preset H265_PRESET] [--use-nvenc] [-v] [--keep-vtt] [--skip-hls] [--info] [--id-as-course-name] [-sc] [--save-to-file] [--load-from-file] [--log-level LOG_LEVEL] [--browser {chrome,firefox,opera,edge,brave,chromium,vivaldi,safari}]
[--use-h265] [--h265-crf H265_CRF] [--h265-preset H265_PRESET] [--use-nvenc] [-v]
Udemy Downloader Udemy Downloader
@ -92,6 +104,7 @@ options:
--skip-lectures If specified, lectures won't be downloaded --skip-lectures If specified, lectures won't be downloaded
--download-assets If specified, lecture assets will be downloaded --download-assets If specified, lecture assets will be downloaded
--download-captions If specified, captions will be downloaded --download-captions If specified, captions will be downloaded
--download-quizzes If specified, quizzes will be downloaded
--keep-vtt If specified, .vtt files won't be removed --keep-vtt If specified, .vtt files won't be removed
--skip-hls If specified, hls streams will be skipped (faster fetching) (hls streams usually contain 1080p quality for non-drm lectures) --skip-hls If specified, hls streams will be skipped (faster fetching) (hls streams usually contain 1080p quality for non-drm lectures)
--info If specified, only course information will be printed, nothing will be downloaded --info If specified, only course information will be printed, nothing will be downloaded
@ -104,6 +117,8 @@ options:
time) time)
--log-level LOG_LEVEL --log-level LOG_LEVEL
Logging level: one of DEBUG, INFO, ERROR, WARNING, CRITICAL (Default is INFO) Logging level: one of DEBUG, INFO, ERROR, WARNING, CRITICAL (Default is INFO)
--browser {chrome,firefox,opera,edge,brave,chromium,vivaldi,safari}
The browser to extract cookies from
--use-h265 If specified, videos will be encoded with the H.265 codec --use-h265 If specified, videos will be encoded with the H.265 codec
--h265-crf H265_CRF Set a custom CRF value for H.265 encoding. FFMPEG default is 28 --h265-crf H265_CRF Set a custom CRF value for H.265 encoding. FFMPEG default is 28
--h265-preset H265_PRESET --h265-preset H265_PRESET
@ -112,55 +127,55 @@ options:
-v, --version show program's version number and exit -v, --version show program's version number and exit
``` ```
- Passing a Bearer Token and Course ID as an argument - Passing a Bearer Token and Course ID as an argument
- `python main.py -c <Course URL> -b <Bearer Token>` - `python main.py -c <Course URL> -b <Bearer Token>`
- `python main.py -c https://www.udemy.com/courses/myawesomecourse -b <Bearer Token>` - `python main.py -c https://www.udemy.com/courses/myawesomecourse -b <Bearer Token>`
- Download a specific quality - Download a specific quality
- `python main.py -c <Course URL> -q 720` - `python main.py -c <Course URL> -q 720`
- Download assets along with lectures - Download assets along with lectures
- `python main.py -c <Course URL> --download-assets` - `python main.py -c <Course URL> --download-assets`
- Download assets and specify a quality - Download assets and specify a quality
- `python main.py -c <Course URL> -q 360 --download-assets` - `python main.py -c <Course URL> -q 360 --download-assets`
- Download captions (Defaults to English) - Download captions (Defaults to English)
- `python main.py -c <Course URL> --download-captions` - `python main.py -c <Course URL> --download-captions`
- Download captions with specific language - Download captions with specific language
- `python main.py -c <Course URL> --download-captions -l en` - English subtitles - `python main.py -c <Course URL> --download-captions -l en` - English subtitles
- `python main.py -c <Course URL> --download-captions -l es` - Spanish subtitles - `python main.py -c <Course URL> --download-captions -l es` - Spanish subtitles
- `python main.py -c <Course URL> --download-captions -l it` - Italian subtitles - `python main.py -c <Course URL> --download-captions -l it` - Italian subtitles
- `python main.py -c <Course URL> --download-captions -l pl` - Polish Subtitles - `python main.py -c <Course URL> --download-captions -l pl` - Polish Subtitles
- `python main.py -c <Course URL> --download-captions -l all` - Downloads all subtitles - `python main.py -c <Course URL> --download-captions -l all` - Downloads all subtitles
- etc - etc
- Skip downloading lecture videos - Skip downloading lecture videos
- `python main.py -c <Course URL> --skip-lectures --download-captions` - Downloads only captions - `python main.py -c <Course URL> --skip-lectures --download-captions` - Downloads only captions
- `python main.py -c <Course URL> --skip-lectures --download-assets` - Downloads only assets - `python main.py -c <Course URL> --skip-lectures --download-assets` - Downloads only assets
- Keep .VTT caption files: - Keep .VTT caption files:
- `python main.py -c <Course URL> --download-captions --keep-vtt` - `python main.py -c <Course URL> --download-captions --keep-vtt`
- Skip parsing HLS Streams (HLS streams usually contain 1080p quality for Non-DRM lectures): - Skip parsing HLS Streams (HLS streams usually contain 1080p quality for Non-DRM lectures):
- `python main.py -c <Course URL> --skip-hls` - `python main.py -c <Course URL> --skip-hls`
- Print course information only: - Print course information only:
- `python main.py -c <Course URL> --info` - `python main.py -c <Course URL> --info`
- Specify max number of concurrent downloads: - Specify max number of concurrent downloads:
- `python main.py -c <Course URL> --concurrent-downloads 20` - `python main.py -c <Course URL> --concurrent-downloads 20`
- `python main.py -c <Course URL> -cd 20` - `python main.py -c <Course URL> -cd 20`
- Cache course information: - Cache course information:
- `python main.py -c <Course URL> --save-to-file` - `python main.py -c <Course URL> --save-to-file`
- Load course cache: - Load course cache:
- `python main.py -c <Course URL> --load-from-file` - `python main.py -c <Course URL> --load-from-file`
- Change logging level: - Change logging level:
- `python main.py -c <Course URL> --log-level DEBUG` - `python main.py -c <Course URL> --log-level DEBUG`
- `python main.py -c <Course URL> --log-level WARNING` - `python main.py -c <Course URL> --log-level WARNING`
- `python main.py -c <Course URL> --log-level INFO` - `python main.py -c <Course URL> --log-level INFO`
- `python main.py -c <Course URL> --log-level CRITICAL` - `python main.py -c <Course URL> --log-level CRITICAL`
- Use course ID as the course name: - Use course ID as the course name:
- `python main.py -c <Course URL> --id-as-course-name` - `python main.py -c <Course URL> --id-as-course-name`
- Encode in H.265: - Encode in H.265:
- `python main.py -c <Course URL> --use-h265` - `python main.py -c <Course URL> --use-h265`
- Encode in H.265 with custom CRF: - Encode in H.265 with custom CRF:
- `python main.py -c <Course URL> --use-h265 -h265-crf 20` - `python main.py -c <Course URL> --use-h265 -h265-crf 20`
- Encode in H.265 with custom preset: - Encode in H.265 with custom preset:
- `python main.py -c <Course URL> --use-h265 --h265-preset faster` - `python main.py -c <Course URL> --use-h265 --h265-preset faster`
- Encode in H.265 using NVIDIA hardware transcoding: - Encode in H.265 using NVIDIA hardware transcoding:
- `python main.py -c <Course URL> --use-h265 --use-nvenc` - `python main.py -c <Course URL> --use-h265 --use-nvenc`
If you encounter errors while downloading such as If you encounter errors while downloading such as
@ -178,11 +193,11 @@ if you want help using the program, join my [Discord](https://discord.gg/tMzrSxQ
# Credits # Credits
- https://github.com/Jayapraveen/Drm-Dash-stream-downloader - For the original code which this is based on - https://github.com/Jayapraveen/Drm-Dash-stream-downloader - For the original code which this is based on
- https://github.com/alastairmccormack/pywvpssh - For code related to PSSH extraction - https://github.com/alastairmccormack/pywvpssh - For code related to PSSH extraction
- https://github.com/alastairmccormack/pymp4parse - For code related to mp4 box parsing (used by pywvpssh) - https://github.com/alastairmccormack/pymp4parse - For code related to mp4 box parsing (used by pywvpssh)
- https://github.com/lbrayner/vtt-to-srt - For code related to converting subtitles from vtt to srt format - https://github.com/lbrayner/vtt-to-srt - For code related to converting subtitles from vtt to srt format
- https://github.com/r0oth3x49/udemy-dl - For some of the informaton related to using the udemy api - https://github.com/r0oth3x49/udemy-dl - For some of the informaton related to using the udemy api
## License ## License


@ -1 +1 @@
__version__ = "1.2.10" __version__ = "1.2.10-cookies"

132
main.py

@ -12,6 +12,7 @@ from html.parser import HTMLParser as compat_HTMLParser
from pathlib import Path from pathlib import Path
from typing import IO from typing import IO
import browser_cookie3
import m3u8 import m3u8
import requests import requests
import yt_dlp import yt_dlp
@ -29,7 +30,6 @@ from utils import extract_kid
from vtt_to_srt import convert from vtt_to_srt import convert
retry = 3 retry = 3
cookies = ""
downloader = None downloader = None
logger: logging.Logger = None logger: logging.Logger = None
dl_assets = False dl_assets = False
@ -56,6 +56,8 @@ use_h265 = False
h265_crf = 28 h265_crf = 28
h265_preset = "medium" h265_preset = "medium"
use_nvenc = False use_nvenc = False
browser = None
cj = None
# from https://stackoverflow.com/a/21978778/9785713 # from https://stackoverflow.com/a/21978778/9785713
@ -68,7 +70,7 @@ def log_subprocess_output(prefix: str, pipe: IO[bytes]):
# this is the first function that is called, we parse the arguments, setup the logger, and ensure that required directories exist # this is the first function that is called, we parse the arguments, setup the logger, and ensure that required directories exist
def pre_run(): def pre_run():
global cookies, dl_assets, dl_captions, dl_quizzes, skip_lectures, caption_locale, quality, bearer_token, course_name, keep_vtt, skip_hls, concurrent_downloads, disable_ipv6, load_from_file, save_to_file, bearer_token, course_url, info, logger, keys, id_as_course_name, is_subscription_course, LOG_LEVEL, use_h265, h265_crf, h265_preset, use_nvenc global dl_assets, dl_captions, dl_quizzes, skip_lectures, caption_locale, quality, bearer_token, course_name, keep_vtt, skip_hls, concurrent_downloads, disable_ipv6, load_from_file, save_to_file, bearer_token, course_url, info, logger, keys, id_as_course_name, LOG_LEVEL, use_h265, h265_crf, h265_preset, use_nvenc, browser
# make sure the directory exists # make sure the directory exists
if not os.path.exists(DOWNLOAD_DIR): if not os.path.exists(DOWNLOAD_DIR):
@ -187,6 +189,12 @@ def pre_run():
type=str, type=str,
help="Logging level: one of DEBUG, INFO, ERROR, WARNING, CRITICAL (Default is INFO)", help="Logging level: one of DEBUG, INFO, ERROR, WARNING, CRITICAL (Default is INFO)",
) )
parser.add_argument(
"--browser",
dest="browser",
help="The browser to extract cookies from",
choices=["chrome", "firefox", "opera", "edge", "brave", "chromium", "vivaldi", "safari"],
)
parser.add_argument( parser.add_argument(
"--use-h265", "--use-h265",
dest="use_h265", dest="use_h265",
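Review note: the new `--browser` option relies on argparse's `choices` to reject unsupported browser names before any cookie extraction is attempted. The following is a minimal, self-contained sketch of that behaviour. `build_parser` and `SUPPORTED_BROWSERS` are hypothetical names, only the `-c`, `-b` and `--browser` arguments are reproduced, and the example course URL is a placeholder.

```python
import argparse

# Same list as the `choices` argument added in this hunk.
SUPPORTED_BROWSERS = ["chrome", "firefox", "opera", "edge", "brave", "chromium", "vivaldi", "safari"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: rebuilds only the arguments relevant to this hunk."""
    parser = argparse.ArgumentParser(description="Udemy Downloader")
    parser.add_argument("-c", dest="course_url", required=True)
    parser.add_argument("-b", dest="bearer_token")  # optional, so it defaults to None
    parser.add_argument(
        "--browser",
        dest="browser",
        help="The browser to extract cookies from",
        choices=SUPPORTED_BROWSERS,
    )
    return parser


# Placeholder URL; no bearer token is passed, only a browser name.
args = build_parser().parse_args(["-c", "https://www.udemy.com/course/example", "--browser", "firefox"])
print(args.browser, args.bearer_token)
```

A name outside `choices` makes argparse print a usage error and exit, so later code can assume `browser` is either `None` or one of the listed values.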
@ -304,6 +312,8 @@ def pre_run():
id_as_course_name = args.id_as_course_name id_as_course_name = args.id_as_course_name
if args.is_subscription_course: if args.is_subscription_course:
is_subscription_course = args.is_subscription_course is_subscription_course = args.is_subscription_course
if args.browser:
browser = args.browser
Path(DOWNLOAD_DIR).mkdir(parents=True, exist_ok=True) Path(DOWNLOAD_DIR).mkdir(parents=True, exist_ok=True)
Path(SAVED_DIR).mkdir(parents=True, exist_ok=True) Path(SAVED_DIR).mkdir(parents=True, exist_ok=True)
@ -315,32 +325,40 @@ def pre_run():
else: else:
logger.warning("> Keyfile not found! You won't be able to decrypt videos!") logger.warning("> Keyfile not found! You won't be able to decrypt videos!")
# Read cookies from file
if os.path.exists(COOKIE_FILE_PATH):
with open(COOKIE_FILE_PATH, encoding="utf8", mode="r") as cookiefile:
cookies = cookiefile.read()
cookies = cookies.rstrip()
else:
logger.warning(
"No cookies.txt file was found, you won't be able to download subscription courses! You can ignore ignore this if you don't plan to download a course included in a subscription plan."
)
class Udemy: class Udemy:
def __init__(self, bearer_token): def __init__(self, bearer_token):
global cj
self.session = None self.session = None
self.bearer_token = None self.bearer_token = None
self.auth = UdemyAuth(cache_session=False) self.auth = UdemyAuth(cache_session=False)
if not self.session: if not self.session:
self.session, self.bearer_token = self.auth.authenticate(bearer_token=bearer_token) self.session = self.auth.authenticate(bearer_token=bearer_token)
if self.session and self.bearer_token: if not self.session:
self.session._headers.update({"Authorization": "Bearer {}".format(self.bearer_token)}) if browser == None:
self.session._headers.update({"X-Udemy-Authorization": "Bearer {}".format(self.bearer_token)}) logger.error("No bearer token was provided, and no browser for cookie extraction was specified.")
logger.info("Login Success") sys.exit(1)
else:
logger.fatal("Login Failure! You are probably missing an access token!") logger.warning("No bearer token was provided, attempting to use browser cookies.")
sys.exit(1)
self.session = self.auth._session
if browser == "chrome":
cj = browser_cookie3.chrome()
elif browser == "firefox":
cj = browser_cookie3.firefox()
elif browser == "opera":
cj = browser_cookie3.opera()
elif browser == "edge":
cj = browser_cookie3.edge()
elif browser == "brave":
cj = browser_cookie3.brave()
elif browser == "chromium":
cj = browser_cookie3.chromium()
elif browser == "vivaldi":
cj = browser_cookie3.vivaldi()
def _get_quiz(self, quiz_id): def _get_quiz(self, quiz_id):
print(portal_name) print(portal_name)
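Review note: as displayed in this hunk, the `if`/`elif` chain has no branch for `safari`, although `--browser` accepts it, so `cj` would remain `None` for that choice. One possible alternative is a lookup table that fails loudly on a missing entry. The sketch below is an assumption about how that could look, not the code in this commit. `load_cookie_jar` is a hypothetical helper, and stand-in loaders are used so the example runs without `browser_cookie3` installed.

```python
from http.cookiejar import CookieJar
from typing import Callable, Mapping


def load_cookie_jar(browser: str, loaders: Mapping[str, Callable[[], CookieJar]]) -> CookieJar:
    """Hypothetical helper: call the loader registered for `browser`."""
    try:
        loader = loaders[browser]
    except KeyError:
        # Raise instead of silently leaving the cookie jar unset.
        raise ValueError(f"No cookie loader registered for browser {browser!r}") from None
    return loader()


# Stand-in loaders (assumption): each returns an empty CookieJar.
# In main.py the mapping would presumably hold the browser_cookie3 functions
# used in the hunk above, e.g. {"chrome": browser_cookie3.chrome, ...}.
stub_loaders = {name: CookieJar for name in ("chrome", "firefox")}

jar = load_cookie_jar("firefox", stub_loaders)
print(type(jar).__name__)
```

With a table like this, supporting another browser is a one-line change, and a name without a loader surfaces as an error rather than as unauthenticated requests.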
@ -540,14 +558,15 @@ class Udemy:
for pl in playlists: for pl in playlists:
resolution = pl.stream_info.resolution resolution = pl.stream_info.resolution
codecs = pl.stream_info.codecs codecs = pl.stream_info.codecs
if not resolution: if not resolution:
continue continue
if not codecs: if not codecs:
continue continue
width, height = resolution width, height = resolution
if height in seen: continue if height in seen:
continue
# we need to save the individual playlists to disk also # we need to save the individual playlists to disk also
playlist_path = Path(temp_path, f"index_{asset_id}_{width}x{height}.m3u8") playlist_path = Path(temp_path, f"index_{asset_id}_{width}x{height}.m3u8")
@ -869,9 +888,7 @@ class Udemy:
def _extract_course_info(self, url): def _extract_course_info(self, url):
global portal_name global portal_name
portal_name, course_name = self.extract_course_name(url) portal_name, course_name = self.extract_course_name(url)
course = { course = {"portal_name": portal_name}
"portal_name": portal_name
}
if not is_subscription_course: if not is_subscription_course:
results = self._subscribed_courses(portal_name=portal_name, course_name=course_name) results = self._subscribed_courses(portal_name=portal_name, course_name=course_name)
@ -898,11 +915,11 @@ class Udemy:
"It seems either you are not enrolled or you have to visit the course atleast once while you are logged in.", "It seems either you are not enrolled or you have to visit the course atleast once while you are logged in.",
) )
logger.info( logger.info(
"Trying to logout now...", "Terminating Session...",
) )
self.session.terminate() self.session.terminate()
logger.info( logger.info(
"Logged out successfully.", "Session terminated.",
) )
sys.exit(1) sys.exit(1)
@ -1009,6 +1026,7 @@ class Udemy:
return lecture return lecture
class Session(object): class Session(object):
def __init__(self): def __init__(self):
self._headers = HEADERS self._headers = HEADERS
@ -1023,11 +1041,10 @@ class Session(object):
def _set_auth_headers(self, bearer_token=""): def _set_auth_headers(self, bearer_token=""):
self._headers["Authorization"] = "Bearer {}".format(bearer_token) self._headers["Authorization"] = "Bearer {}".format(bearer_token)
self._headers["X-Udemy-Authorization"] = "Bearer {}".format(bearer_token) self._headers["X-Udemy-Authorization"] = "Bearer {}".format(bearer_token)
self._headers["Cookie"] = cookies
def _get(self, url): def _get(self, url):
for i in range(10): for i in range(10):
session = self._session.get(url, headers=self._headers) session = self._session.get(url, headers=self._headers, cookies=cj)
if session.ok or session.status_code in [502, 503]: if session.ok or session.status_code in [502, 503]:
return session return session
if not session.ok: if not session.ok:
@ -1036,7 +1053,7 @@ class Session(object):
time.sleep(0.8) time.sleep(0.8)
def _post(self, url, data, redirect=True): def _post(self, url, data, redirect=True):
session = self._session.post(url, data, headers=self._headers, allow_redirects=redirect) session = self._session.post(url, data, headers=self._headers, allow_redirects=redirect, cookies=cj)
if session.ok: if session.ok:
return session return session
if not session.ok: if not session.ok:
@ -1140,14 +1157,12 @@ class UdemyAuth(object):
self._cache = cache_session self._cache = cache_session
self._session = Session() self._session = Session()
def authenticate(self, bearer_token=""): def authenticate(self, bearer_token=None):
if bearer_token: if bearer_token:
self._session._set_auth_headers(bearer_token=bearer_token) self._session._set_auth_headers(bearer_token=bearer_token)
self._session._session.cookies.update({"bearer_token": bearer_token}) return self._session
return self._session, bearer_token
else: else:
self._session._set_auth_headers() return None
return None, None
def durationtoseconds(period): def durationtoseconds(period):
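Review note: after this change `authenticate` returns either the session or `None`, and `Udemy.__init__` decides what to do next. The resulting order of precedence (bearer token first, then browser cookies, otherwise abort) can be summarised in a small sketch. `select_auth_mode` is a hypothetical function, and it raises `ValueError` where `main.py` logs an error and calls `sys.exit(1)`.

```python
from typing import Optional


def select_auth_mode(bearer_token: Optional[str], browser: Optional[str]) -> str:
    """Hypothetical summary of the fallback order introduced by this commit."""
    if bearer_token:
        # A bearer token takes priority; the auth headers are set from it.
        return "bearer"
    if browser is None:
        # Neither credential source is available; main.py exits at this point.
        raise ValueError("No bearer token was provided, and no browser for cookie extraction was specified.")
    # No token, but a browser was named: fall back to its cookies.
    return "cookies"


print(select_auth_mode("token-value", None))
print(select_auth_mode(None, "chrome"))
```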
@ -1197,9 +1212,7 @@ def mux_process(video_title, video_filepath, audio_filepath, output_path):
transcode, video_filepath, audio_filepath, codec, h265_crf, h265_preset, video_title, output_path transcode, video_filepath, audio_filepath, codec, h265_crf, h265_preset, video_title, output_path
) )
else: else:
command = 'ffmpeg -y -i "{}" -i "{}" -c:v copy -c:a copy -fflags +bitexact -map_metadata -1 -metadata title="{}" "{}"'.format( command = 'ffmpeg -y -i "{}" -i "{}" -c:v copy -c:a copy -fflags +bitexact -map_metadata -1 -metadata title="{}" "{}"'.format(video_filepath, audio_filepath, video_title, output_path)
video_filepath, audio_filepath, video_title, output_path
)
else: else:
if use_h265: if use_h265:
command = 'nice -n 7 ffmpeg {} -y -i "{}" -i "{}" -c:v libx265 -vtag hvc1 -crf {} -preset {} -c:a copy -fflags +bitexact -map_metadata -1 -metadata title="{}" "{}"'.format( command = 'nice -n 7 ffmpeg {} -y -i "{}" -i "{}" -c:v libx265 -vtag hvc1 -crf {} -preset {} -c:a copy -fflags +bitexact -map_metadata -1 -metadata title="{}" "{}"'.format(
@ -1538,7 +1551,18 @@ def process_lecture(lecture, lecture_path, lecture_file_name, chapter_dir):
source_type = source.get("type") source_type = source.get("type")
if source_type == "hls": if source_type == "hls":
temp_filepath = lecture_path.replace(".mp4", ".%(ext)s") temp_filepath = lecture_path.replace(".mp4", ".%(ext)s")
cmd = ["yt-dlp", "--enable-file-urls", "--force-generic-extractor", "--concurrent-fragments", f"{concurrent_downloads}", "--downloader", "aria2c", "-o", f"{temp_filepath}", f"{url}"] cmd = [
"yt-dlp",
"--enable-file-urls",
"--force-generic-extractor",
"--concurrent-fragments",
f"{concurrent_downloads}",
"--downloader",
"aria2c",
"-o",
f"{temp_filepath}",
f"{url}",
]
if disable_ipv6: if disable_ipv6:
cmd.append("--downloader-args") cmd.append("--downloader-args")
cmd.append('aria2c:"--disable-ipv6"') cmd.append('aria2c:"--disable-ipv6"')
@ -1574,7 +1598,6 @@ def process_lecture(lecture, lecture_path, lecture_file_name, chapter_dir):
logger.error(" > Missing sources for lecture", lecture) logger.error(" > Missing sources for lecture", lecture)
def process_quiz(udemy: Udemy, lecture, chapter_dir): def process_quiz(udemy: Udemy, lecture, chapter_dir):
lecture_title = lecture.get("lecture_title") lecture_title = lecture.get("lecture_title")
lecture_index = lecture.get("lecture_index") lecture_index = lecture.get("lecture_index")
@ -1594,7 +1617,6 @@ def process_quiz(udemy: Udemy, lecture, chapter_dir):
f.write(html) f.write(html)
def parse_new(udemy: Udemy, udemy_object: dict): def parse_new(udemy: Udemy, udemy_object: dict):
total_chapters = udemy_object.get("total_chapters") total_chapters = udemy_object.get("total_chapters")
total_lectures = udemy_object.get("total_lectures") total_lectures = udemy_object.get("total_lectures")
@ -1848,28 +1870,37 @@ def main():
udemy_object["title"] = title udemy_object["title"] = title
udemy_object["course_title"] = course_title udemy_object["course_title"] = course_title
udemy_object["chapters"] = [] udemy_object["chapters"] = []
counter = -1 chapter_index_counter = -1
if resource: if resource:
logger.info("> Trying to logout") logger.info("> Terminating Session...")
udemy.session.terminate() udemy.session.terminate()
logger.info("> Logged out.") logger.info("> Session Terminated.")
if course: if course:
logger.info("> Processing course data, this may take a minute. ") logger.info("> Processing course data, this may take a minute. ")
lecture_counter = 0 lecture_counter = 0
lectures = []
for entry in course: for entry in course:
clazz = entry.get("_class") clazz = entry.get("_class")
if clazz == "chapter": if clazz == "chapter":
# add all lectures for the previous chapter
if len(lectures) > 0:
udemy_object["chapters"][chapter_index_counter]["lectures"] = lectures
udemy_object["chapters"][chapter_index_counter]["lecture_count"] = len(lectures)
# reset lecture tracking
lecture_counter = 0 lecture_counter = 0
lectures = [] lectures = []
chapter_index = entry.get("object_index") chapter_index = entry.get("object_index")
chapter_title = "{0:02d} - ".format(chapter_index) + sanitize_filename(entry.get("title")) chapter_title = "{0:02d} - ".format(chapter_index) + sanitize_filename(entry.get("title"))
if chapter_title not in udemy_object["chapters"]: if chapter_title not in udemy_object["chapters"]:
udemy_object["chapters"].append({"chapter_title": chapter_title, "chapter_id": entry.get("id"), "chapter_index": chapter_index, "lectures": []}) udemy_object["chapters"].append({"chapter_title": chapter_title, "chapter_id": entry.get("id"), "chapter_index": chapter_index, "lectures": []})
counter += 1 chapter_index_counter += 1
elif clazz == "lecture": elif clazz == "lecture":
lecture_counter += 1 lecture_counter += 1
lecture_id = entry.get("id") lecture_id = entry.get("id")
@ -1889,8 +1920,8 @@ def main():
lecture_title = "{0:03d} ".format(lecture_counter) + sanitize_filename(entry.get("title")) lecture_title = "{0:03d} ".format(lecture_counter) + sanitize_filename(entry.get("title"))
lectures.append({"index": lecture_counter, "lecture_index": lecture_index, "lecture_title": lecture_title, "_class": entry.get("_class"), "id": lecture_id, "data": entry}) lectures.append({"index": lecture_counter, "lecture_index": lecture_index, "lecture_title": lecture_title, "_class": entry.get("_class"), "id": lecture_id, "data": entry})
udemy_object["chapters"][counter]["lectures"] = lectures else:
udemy_object["chapters"][counter]["lecture_count"] = len(lectures) logger.debug("Lecture: ID is None, skipping")
elif clazz == "quiz": elif clazz == "quiz":
lecture_counter += 1 lecture_counter += 1
lecture_id = entry.get("id") lecture_id = entry.get("id")
@ -1910,9 +1941,8 @@ def main():
lecture_title = "{0:03d} ".format(lecture_counter) + sanitize_filename(entry.get("title")) lecture_title = "{0:03d} ".format(lecture_counter) + sanitize_filename(entry.get("title"))
lectures.append({"index": lecture_counter, "lecture_index": lecture_index, "lecture_title": lecture_title, "_class": entry.get("_class"), "id": lecture_id, "data": entry}) lectures.append({"index": lecture_counter, "lecture_index": lecture_index, "lecture_title": lecture_title, "_class": entry.get("_class"), "id": lecture_id, "data": entry})
else:
udemy_object["chapters"][counter]["lectures"] = lectures logger.debug("Quiz: ID is None, skipping")
udemy_object["chapters"][counter]["lectures_count"] = len(lectures)
udemy_object["total_chapters"] = len(udemy_object["chapters"]) udemy_object["total_chapters"] = len(udemy_object["chapters"])
udemy_object["total_lectures"] = sum([entry.get("lecture_count", 0) for entry in udemy_object["chapters"] if entry]) udemy_object["total_lectures"] = sum([entry.get("lecture_count", 0) for entry in udemy_object["chapters"] if entry])
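Review note: in the reworked loop, a chapter's lectures are attached only when the next `chapter` entry is encountered, and no flush after the loop is visible in the hunks shown here. If none exists elsewhere, the last chapter would be left with an empty `lectures` list. The sketch below illustrates the accumulate-and-flush pattern with a final flush. `group_by_chapter` and the simplified entry dictionaries are hypothetical and carry far fewer fields than the real course data.

```python
def group_by_chapter(course):
    """Hypothetical helper: group lecture/quiz entries under the preceding chapter."""
    chapters = []
    lectures = []

    def flush():
        # Attach the accumulated lectures to the most recent chapter, if any.
        if chapters and lectures:
            chapters[-1]["lectures"] = list(lectures)
            chapters[-1]["lecture_count"] = len(lectures)

    for entry in course:
        clazz = entry.get("_class")
        if clazz == "chapter":
            flush()  # close out the previous chapter
            lectures = []  # reset lecture tracking
            chapters.append({"chapter_title": entry.get("title"), "lectures": [], "lecture_count": 0})
        elif clazz in ("lecture", "quiz"):
            if entry.get("id") is None:
                continue  # mirrors the "ID is None, skipping" debug branches
            lectures.append({"id": entry["id"], "_class": clazz})

    flush()  # final flush so the last chapter keeps its lectures
    return chapters


sample_course = [
    {"_class": "chapter", "title": "Intro"},
    {"_class": "lecture", "id": 1},
    {"_class": "quiz", "id": 2},
    {"_class": "chapter", "title": "Basics"},
    {"_class": "lecture", "id": 3},
    {"_class": "lecture", "id": None},
]
result = group_by_chapter(sample_course)
print([chapter["lecture_count"] for chapter in result])
```

Summing `lecture_count` over `result` then gives the same kind of total that `total_lectures` computes above.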


@ -15,3 +15,4 @@ lxml
six six
pathvalidate pathvalidate
coloredlogs coloredlogs
browser_cookie3