feat: fix auth, provider health checks, search, and redesign UI
- Fix register/login: dict-style access on UserTable ORM objects - Fix HTMX auth: inject JWT token in all HTMX request headers - Fix FS7 search: use DLE AJAX endpoint /engine/ajax/search.php - Fix ZT search: use ?p=series&search=QUERY (not DLE format) - Fix provider health: load hardcoded providers + domain manager - Add self.id to all anime/series providers - Redesign homepage: Netflix-style horizontal scroll cards (.hc) - Redesign search results: grouped by title, poster + synopsis + 3 buttons - Add Télécharger dropdown: season download + episode picker - Fix navbar CSS: restore .tabs flex layout, remove orphan rules - Fix HTMX spinner: remove inline display:none, use CSS indicator - Add AGENTS.md files across project for developer documentation
This commit is contained in:
@@ -0,0 +1,49 @@
|
||||
# Downloaders (app/downloaders/)
|
||||
|
||||
## OVERVIEW
|
||||
3-tier scraper architecture: anime catalogs → series catalogs → video players. Factory pattern routes URLs through each tier.
|
||||
|
||||
## STRUCTURE
|
||||
```
|
||||
downloaders/
|
||||
├── __init__.py # get_downloader(url) — 3-tier factory + GenericDownloader
|
||||
├── base.py # Legacy BaseDownloader (kept for compat)
|
||||
├── anime_sites/ # Anime streaming catalogs (see anime_sites/AGENTS.md)
|
||||
│ ├── __init__.py # get_anime_site(url) factory
|
||||
│ ├── base.py # BaseAnimeSite abstract class
|
||||
│ └── *.py # 5 anime providers
|
||||
├── series_sites/ # TV series catalogs (see series_sites/AGENTS.md)
|
||||
│ ├── __init__.py # get_series_site(url) factory
|
||||
│ ├── base.py # BaseSeriesSite abstract class
|
||||
│ └── fs7.py # 1 series provider
|
||||
└── video_players/ # File hosting extractors (see video_players/AGENTS.md)
|
||||
├── __init__.py # get_video_player(url) factory
|
||||
├── base.py # BaseVideoPlayer abstract class
|
||||
└── *.py # 13 video player handlers
|
||||
```
|
||||
|
||||
## WHERE TO LOOK
|
||||
|
||||
| Need | File | Notes |
|
||||
|------|------|-------|
|
||||
| Route URL to downloader | `__init__.py:32` | `get_downloader(url)` tries anime→series→video→generic |
|
||||
| Add anime provider | `anime_sites/` | Inherit BaseAnimeSite, register in anime_sites/__init__.py |
|
||||
| Add series provider | `series_sites/` | Inherit BaseSeriesSite, register in series_sites/__init__.py |
|
||||
| Add video player | `video_players/` | Inherit BaseVideoPlayer, register in video_players/__init__.py |
|
||||
| Provider domains/icons | `app/providers.py` | Separate from downloader code |
|
||||
|
||||
## CONVENTIONS
|
||||
|
||||
**URL pipe format**: `video_url|anime_page_url|episode_title` — metadata preserved through tiers. Anime/series sites return player URLs (not direct downloads). Video players extract final download links.
|
||||
|
||||
**Factory chain**: `get_downloader()` → `get_anime_site()` → `get_series_site()` → `get_video_player()` → `GenericDownloader`.
|
||||
|
||||
**New provider checklist**: 1) Create .py inheriting base class, 2) Implement required methods, 3) Add to `__init__.py` factory list, 4) Add to `app/providers.py`.
|
||||
|
||||
## ANTI-PATTERNS
|
||||
|
||||
- Do NOT return None from `get_download_link()` — raise Exception
|
||||
- Do NOT use sync `requests` — always `httpx.AsyncClient`
|
||||
- Do NOT forget `await self.close()` — causes resource leaks
|
||||
- Do NOT skip `sanitize_filename()` on extracted filenames
|
||||
- Do NOT hardcode User-Agent per player — use base class headers
|
||||
@@ -1,4 +1,4 @@
|
||||
# Anime Sites Downloaders
|
||||
# Anime Sites (app/downloaders/anime_sites/)
|
||||
|
||||
## OVERVIEW
|
||||
Handlers for French anime streaming catalogs that provide metadata and episode listings, delegating actual video extraction to video player handlers.
|
||||
@@ -7,8 +7,8 @@ Handlers for French anime streaming catalogs that provide metadata and episode l
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `base.py` | Abstract `BaseAnimeSite` class defining the interface all anime sites implement |
|
||||
| `animesama.py` | Primary provider with dynamic domain switching, multiple video player extraction |
|
||||
| `base.py` | Abstract `BaseAnimeSite` class defining the interface |
|
||||
| `animesama.py` | Primary provider — dynamic domain switching, multiple video player extraction |
|
||||
| `nekosama.py` | Neko-Sama / Gupy integration (metadata-only, no direct downloads) |
|
||||
| `animeultime.py` | Anime-Ultime catalog handler |
|
||||
| `vostfree.py` | Vostfree catalog handler |
|
||||
@@ -16,26 +16,26 @@ Handlers for French anime streaming catalogs that provide metadata and episode l
|
||||
|
||||
## CONVENTIONS
|
||||
|
||||
### Interface Contract
|
||||
Each site must implement four async methods from `BaseAnimeSite`:
|
||||
- `can_handle(url: str) -> bool` — URL pattern matching
|
||||
- `search_anime(query, lang) -> list[dict]` — Returns `{title, url, cover_image}`
|
||||
- `get_episodes(anime_url, lang) -> list[dict]` — Returns `{episode_number, url, title, host}`
|
||||
- `get_anime_metadata(anime_url) -> dict` — Returns `{synopsis, genres, rating, release_year, studio, poster_image, total_episodes, status}`
|
||||
- `get_download_link(url) -> tuple[str, str]` — Returns `(video_player_url, filename)`
|
||||
**Interface contract** — each site implements from `BaseAnimeSite`:
|
||||
- `can_handle(url)` — URL pattern matching
|
||||
- `search_anime(query, lang)` → `[{title, url, cover_image}]`
|
||||
- `get_episodes(anime_url, lang)` → `[{episode_number, url, title, host}]`
|
||||
- `get_anime_metadata(anime_url)` → `{synopsis, genres, rating, release_year, studio, poster_image, total_episodes, status}`
|
||||
- `get_download_link(url)` → `(video_player_url, filename)`
|
||||
|
||||
### Key Patterns
|
||||
- **Pipe-separated URLs**: `video_url|anime_page_url|episode_title` — preserves context across extraction
|
||||
- **Language parameter**: `lang="vostfr"` or `"vf"` — controls which episodes to return
|
||||
- **Video player delegation**: Anime sites return player URLs (vidmoly, sendvid, sibnet, lpayer), not direct downloads
|
||||
- **Filename generation**: `{anime_name} - S{season} - {episode}.mp4` format
|
||||
- **HTTP headers**: Browser UA and referer required to avoid blocking
|
||||
**Key patterns**:
|
||||
- Pipe-separated URLs: `video_url|anime_page_url|episode_title`
|
||||
- Language param: `lang="vostfr"` or `"vf"`
|
||||
- Video player delegation: returns player URLs (vidmoly, sendvid, etc.), NOT direct downloads
|
||||
- Filename format: `{anime_name} - S{season} - {episode}.mp4`
|
||||
- Browser UA + referer headers required
|
||||
|
||||
### Domain Detection
|
||||
- `AnimeSamaDownloader` fetches current domain from `anime-sama.pw` dynamically
|
||||
- Uses fallback chain for video extraction: detected player → cached player → priority list
|
||||
**Domain detection**: `AnimeSamaDownloader` fetches current domain from `anime-sama.pw` dynamically. Uses fallback chain for video extraction.
|
||||
|
||||
### Error Handling
|
||||
- Raise `Exception` with descriptive message on failure
|
||||
- Log at appropriate level (`debug` for expected failures, `error` for unexpected)
|
||||
- Validate extracted URLs with `_test_video_url()` before returning
|
||||
**Error handling**: Raise `Exception` with descriptive message. Log at `debug` for expected failures, `error` for unexpected. Validate URLs with `_test_video_url()` before returning.
|
||||
|
||||
## ANTI-PATTERNS
|
||||
|
||||
- Do NOT return direct download URLs from anime sites — return player URLs
|
||||
- Do NOT skip URL validation — use `_test_video_url()`
|
||||
- 5 empty `except:` blocks in `animesama.py` — known tech debt, silently swallow failures
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -10,6 +10,10 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
|
||||
BASE_DOMAINS = ["anime-ultime.com", "anime-ultime.net", "www.anime-ultime.net"]
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "anime-ultime"
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
|
||||
|
||||
@@ -24,58 +28,79 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
final_url = str(response.url)
|
||||
|
||||
# Parse the page
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
# Method 0: Look for og:video meta tag (most reliable for anime-ultime)
|
||||
og_video = soup.find('meta', property='og:video')
|
||||
if og_video and og_video.get('content'):
|
||||
video_url = og_video['content']
|
||||
if video_url.endswith('.mp4'):
|
||||
og_video = soup.find("meta", property="og:video")
|
||||
if og_video and og_video.get("content"):
|
||||
video_url = og_video["content"]
|
||||
if video_url.endswith(".mp4"):
|
||||
filename = self._generate_filename(final_url)
|
||||
print(f"[ANIME-ULTIME] Found og:video link: {video_url}")
|
||||
return video_url, filename
|
||||
|
||||
# Method 1: Look for direct download links (DDL)
|
||||
# Anime-Ultime often uses links to file hosts
|
||||
download_links = soup.find_all('a', href=True)
|
||||
download_links = soup.find_all("a", href=True)
|
||||
for link in download_links:
|
||||
href = link['href']
|
||||
href = link["href"]
|
||||
text = link.get_text().lower()
|
||||
|
||||
# Look for download buttons/links
|
||||
if any(keyword in text for keyword in ['télécharger', 'download', 'ddl', 'mega', 'google', 'drive']):
|
||||
if any(
|
||||
keyword in text
|
||||
for keyword in [
|
||||
"télécharger",
|
||||
"download",
|
||||
"ddl",
|
||||
"mega",
|
||||
"google",
|
||||
"drive",
|
||||
]
|
||||
):
|
||||
# Check if it's a direct link or to a file host
|
||||
if any(host in href.lower() for host in ['mega.nz', 'drive.google.com', 'uptobox.com', '1fichier.com']):
|
||||
if any(
|
||||
host in href.lower()
|
||||
for host in [
|
||||
"mega.nz",
|
||||
"drive.google.com",
|
||||
"uptobox.com",
|
||||
"1fichier.com",
|
||||
]
|
||||
):
|
||||
filename = self._generate_filename(final_url)
|
||||
return href, filename
|
||||
|
||||
# Method 2: Look for iframe with video player
|
||||
iframes = soup.find_all('iframe')
|
||||
iframes = soup.find_all("iframe")
|
||||
for iframe in iframes:
|
||||
src = iframe.get('src', '')
|
||||
if src and any(provider in src for provider in ['video', 'player', 'stream', 'play']):
|
||||
if src.startswith('http'):
|
||||
src = iframe.get("src", "")
|
||||
if src and any(
|
||||
provider in src
|
||||
for provider in ["video", "player", "stream", "play"]
|
||||
):
|
||||
if src.startswith("http"):
|
||||
filename = self._generate_filename(final_url)
|
||||
return src, filename
|
||||
|
||||
# Method 3: Look for video tags
|
||||
videos = soup.find_all('video')
|
||||
videos = soup.find_all("video")
|
||||
for video in videos:
|
||||
src = video.get('src', '')
|
||||
src = video.get("src", "")
|
||||
if src:
|
||||
filename = self._generate_filename(final_url)
|
||||
return src, filename
|
||||
|
||||
# Check source tags
|
||||
sources = video.find_all('source')
|
||||
sources = video.find_all("source")
|
||||
for source in sources:
|
||||
src = source.get('src', '')
|
||||
src = source.get("src", "")
|
||||
if src:
|
||||
filename = self._generate_filename(final_url)
|
||||
return src, filename
|
||||
|
||||
# Method 4: Look in scripts for video URLs
|
||||
scripts = soup.find_all('script')
|
||||
scripts = soup.find_all("script")
|
||||
for script in scripts:
|
||||
if script.string:
|
||||
# Look for common video patterns
|
||||
@@ -91,26 +116,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
matches = re.findall(pattern, script.string)
|
||||
for match in matches:
|
||||
# Clean up escaped characters
|
||||
match = match.replace('\\/', '/').replace('\\', '')
|
||||
if any(ext in match for ext in ['mp4', 'm3u8', 'mkv']):
|
||||
match = match.replace("\\/", "/").replace("\\", "")
|
||||
if any(ext in match for ext in ["mp4", "m3u8", "mkv"]):
|
||||
filename = self._generate_filename(final_url)
|
||||
return match, filename
|
||||
|
||||
# Look for anime-ultime specific patterns
|
||||
# They sometimes store links in JavaScript variables
|
||||
ddl_match = re.search(r'ddl["\']?\s*:\s*["\']([^"\']+)["\']', script.string)
|
||||
ddl_match = re.search(
|
||||
r'ddl["\']?\s*:\s*["\']([^"\']+)["\']', script.string
|
||||
)
|
||||
if ddl_match:
|
||||
ddl_url = ddl_match.group(1)
|
||||
if ddl_url.startswith('http'):
|
||||
if ddl_url.startswith("http"):
|
||||
filename = self._generate_filename(final_url)
|
||||
return ddl_url, filename
|
||||
|
||||
# Method 5: Look for links with specific classes or IDs
|
||||
# Anime-Ultime might use specific class names for download links
|
||||
potential_links = soup.find_all('a', class_=re.compile(r'download|ddl|episode', re.I))
|
||||
potential_links = soup.find_all(
|
||||
"a", class_=re.compile(r"download|ddl|episode", re.I)
|
||||
)
|
||||
for link in potential_links:
|
||||
href = link.get('href', '')
|
||||
if href and href.startswith('http'):
|
||||
href = link.get("href", "")
|
||||
if href and href.startswith("http"):
|
||||
filename = self._generate_filename(final_url)
|
||||
return href, filename
|
||||
|
||||
@@ -132,36 +161,38 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
episode = "01"
|
||||
|
||||
# Format: info-0-1/EPISODE_ID or info-0-1/EPISODE_ID/NAME-EP-vostfr
|
||||
if 'info-0-1/' in url:
|
||||
if "info-0-1/" in url:
|
||||
# Extract episode ID
|
||||
ep_match = re.search(r'info-0-1/(\d+)', url)
|
||||
ep_match = re.search(r"info-0-1/(\d+)", url)
|
||||
if ep_match:
|
||||
ep_id = ep_match.group(1)
|
||||
|
||||
# Try to get anime name from URL path
|
||||
name_match = re.search(r'info-0-1/\d+/([^/]+)', url)
|
||||
name_match = re.search(r"info-0-1/\d+/([^/]+)", url)
|
||||
if name_match:
|
||||
raw_name = name_match.group(1)
|
||||
# Extract episode number
|
||||
ep_num_match = re.search(r'-(\d+)-vostfr$', raw_name, re.I)
|
||||
ep_num_match = re.search(r"-(\d+)-vostfr$", raw_name, re.I)
|
||||
if ep_num_match:
|
||||
episode = ep_num_match.group(1).zfill(2)
|
||||
# Remove episode number and suffix from name
|
||||
anime_name = re.sub(r'-\d+-vostfr$', '', raw_name, flags=re.I).replace('-', ' ')
|
||||
anime_name = re.sub(
|
||||
r"-\d+-vostfr$", "", raw_name, flags=re.I
|
||||
).replace("-", " ")
|
||||
else:
|
||||
# Just use the ID
|
||||
anime_name = f"Episode {ep_id}"
|
||||
else:
|
||||
anime_name = f"Episode {ep_id}"
|
||||
|
||||
elif 'file-0-1/' in url:
|
||||
elif "file-0-1/" in url:
|
||||
# Extract from file-0-1/ID-NAME format
|
||||
file_match = re.search(r'file-0-1/\d+-(.+)$', url)
|
||||
file_match = re.search(r"file-0-1/\d+-(.+)$", url)
|
||||
if file_match:
|
||||
anime_name = file_match.group(1).replace('-', ' ')
|
||||
anime_name = file_match.group(1).replace("-", " ")
|
||||
|
||||
# Sanitize filename
|
||||
anime_name = anime_name.replace('/', ' ').strip()
|
||||
anime_name = anime_name.replace("/", " ").strip()
|
||||
filename = f"{anime_name} - Episode {episode}.mp4"
|
||||
return filename.title()
|
||||
|
||||
@@ -173,30 +204,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
try:
|
||||
print(f"[ANIME-ULTIME] Extracting metadata from: {anime_url}")
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
metadata = {
|
||||
'synopsis': None,
|
||||
'genres': [],
|
||||
'rating': None,
|
||||
'release_year': None,
|
||||
'studio': None,
|
||||
'poster_image': None,
|
||||
'banner_image': None,
|
||||
'total_episodes': None,
|
||||
'status': None,
|
||||
'alternative_titles': []
|
||||
"synopsis": None,
|
||||
"genres": [],
|
||||
"rating": None,
|
||||
"release_year": None,
|
||||
"studio": None,
|
||||
"poster_image": None,
|
||||
"banner_image": None,
|
||||
"total_episodes": None,
|
||||
"status": None,
|
||||
"alternative_titles": [],
|
||||
}
|
||||
|
||||
# Extract synopsis
|
||||
synopsis_selectors = [
|
||||
'div.synopsis',
|
||||
'div.description',
|
||||
"div.synopsis",
|
||||
"div.description",
|
||||
'div[class*="synopsis"]',
|
||||
'div[class*="synopsis"]',
|
||||
'p.synopsis',
|
||||
'.info',
|
||||
'div.texte'
|
||||
"p.synopsis",
|
||||
".info",
|
||||
"div.texte",
|
||||
]
|
||||
|
||||
for selector in synopsis_selectors:
|
||||
@@ -204,68 +235,73 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
if synopsis_elem:
|
||||
synopsis = synopsis_elem.get_text(strip=True)
|
||||
if len(synopsis) > 50:
|
||||
metadata['synopsis'] = synopsis
|
||||
metadata["synopsis"] = synopsis
|
||||
break
|
||||
|
||||
# Extract genres from meta tags and page content
|
||||
page_text = soup.get_text()
|
||||
|
||||
# Look for genre in meta tags
|
||||
genre_meta = soup.find('meta', property='genre') or soup.find('meta', attrs={'name': 'genre'})
|
||||
genre_meta = soup.find("meta", property="genre") or soup.find(
|
||||
"meta", attrs={"name": "genre"}
|
||||
)
|
||||
if genre_meta:
|
||||
genres_text = genre_meta.get('content', '')
|
||||
genres_text = genre_meta.get("content", "")
|
||||
if genres_text:
|
||||
metadata['genres'] = [g.strip() for g in genres_text.split(',')]
|
||||
metadata["genres"] = [g.strip() for g in genres_text.split(",")]
|
||||
|
||||
# Try to find genre links
|
||||
genre_links = soup.find_all('a', href=re.compile(r'genre|tag|type|cat', re.I))
|
||||
genre_links = soup.find_all(
|
||||
"a", href=re.compile(r"genre|tag|type|cat", re.I)
|
||||
)
|
||||
if genre_links:
|
||||
for link in genre_links[:5]:
|
||||
genre = link.get_text(strip=True)
|
||||
if genre and genre not in metadata['genres']:
|
||||
metadata['genres'].append(genre)
|
||||
if genre and genre not in metadata["genres"]:
|
||||
metadata["genres"].append(genre)
|
||||
|
||||
# Extract rating
|
||||
rating_selectors = [
|
||||
'span.rating',
|
||||
'div.rating',
|
||||
'span.score',
|
||||
'div.note',
|
||||
'.rating'
|
||||
"span.rating",
|
||||
"div.rating",
|
||||
"span.score",
|
||||
"div.note",
|
||||
".rating",
|
||||
]
|
||||
|
||||
for selector in rating_selectors:
|
||||
rating_elem = soup.select_one(selector)
|
||||
if rating_elem:
|
||||
rating_text = rating_elem.get_text(strip=True)
|
||||
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*10', rating_text)
|
||||
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*10", rating_text)
|
||||
if rating_match:
|
||||
metadata['rating'] = f"{rating_match.group(1)}/10"
|
||||
metadata["rating"] = f"{rating_match.group(1)}/10"
|
||||
break
|
||||
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*5', rating_text)
|
||||
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*5", rating_text)
|
||||
if rating_match:
|
||||
rating_val = float(rating_match.group(1)) * 2
|
||||
metadata['rating'] = f"{rating_val:.1f}/10"
|
||||
metadata["rating"] = f"{rating_val:.1f}/10"
|
||||
break
|
||||
|
||||
# Extract release year
|
||||
year_match = re.search(r'\b(19\d{2}|20\d{2})\b', page_text)
|
||||
year_match = re.search(r"\b(19\d{2}|20\d{2})\b", page_text)
|
||||
if year_match:
|
||||
import datetime
|
||||
|
||||
current_year = datetime.datetime.now().year + 2
|
||||
year = int(year_match.group(1))
|
||||
if 1950 <= year <= current_year:
|
||||
metadata['release_year'] = year
|
||||
metadata["release_year"] = year
|
||||
|
||||
# Extract poster image from og:image
|
||||
og_image = soup.find('meta', property='og:image')
|
||||
og_image = soup.find("meta", property="og:image")
|
||||
if og_image:
|
||||
metadata['poster_image'] = og_image.get('content')
|
||||
metadata["poster_image"] = og_image.get("content")
|
||||
|
||||
# Extract total episodes
|
||||
episodes_count = len(await self.get_episodes(anime_url))
|
||||
if episodes_count > 0:
|
||||
metadata['total_episodes'] = episodes_count
|
||||
metadata["total_episodes"] = episodes_count
|
||||
|
||||
print(f"[ANIME-ULTIME] Extracted metadata: {metadata}")
|
||||
return metadata
|
||||
@@ -274,7 +310,9 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
print(f"[ANIME-ULTIME] Error extracting metadata: {e}")
|
||||
return {}
|
||||
|
||||
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
|
||||
async def search_anime(
|
||||
self, query: str, lang: str = "vostfr", include_metadata: bool = False
|
||||
) -> list[dict]:
|
||||
"""
|
||||
Search for anime on anime-ultime
|
||||
Returns list of anime with title, url, and cover image
|
||||
@@ -286,27 +324,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
"""
|
||||
try:
|
||||
import time
|
||||
|
||||
start = time.time()
|
||||
print(f"[ANIME-ULTIME] Searching for '{query}' ({lang})...")
|
||||
|
||||
# Anime-Ultime uses POST for search
|
||||
search_url = "https://www.anime-ultime.net/search-0-1"
|
||||
|
||||
response = await self.client.post(search_url, data={'search': query})
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
response = await self.client.post(search_url, data={"search": query})
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
elapsed = time.time() - start
|
||||
print(f"[ANIME-ULTIME] Got response {response.status_code} in {elapsed:.2f}s")
|
||||
print(
|
||||
f"[ANIME-ULTIME] Got response {response.status_code} in {elapsed:.2f}s"
|
||||
)
|
||||
|
||||
results = []
|
||||
|
||||
# Look for search result links - better parsing
|
||||
# Search results use file-0-1/ pattern, not info-
|
||||
search_results = soup.find_all('a', href=re.compile(r'file-0-1/'))
|
||||
search_results = soup.find_all("a", href=re.compile(r"file-0-1/"))
|
||||
|
||||
seen_urls = set()
|
||||
for result in search_results[:10]: # Limit to 10 results
|
||||
href = result.get('href', '')
|
||||
href = result.get("href", "")
|
||||
raw_title = result.get_text().strip()
|
||||
|
||||
# Skip if no href
|
||||
@@ -322,40 +363,44 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
better_title = raw_title
|
||||
|
||||
# If raw_title is just "Télécharger" or similar, try to find better title
|
||||
if len(raw_title) < 5 or raw_title.lower() in ['télécharger', 'download', 'ddl']:
|
||||
if len(raw_title) < 5 or raw_title.lower() in [
|
||||
"télécharger",
|
||||
"download",
|
||||
"ddl",
|
||||
]:
|
||||
# Try to extract from URL (file-0-1/ID-Title format)
|
||||
url_match = re.search(r'file-0-1/\d+-(.+)$', href)
|
||||
url_match = re.search(r"file-0-1/\d+-(.+)$", href)
|
||||
if url_match:
|
||||
better_title = url_match.group(1).replace('-', ' ').title()
|
||||
better_title = url_match.group(1).replace("-", " ").title()
|
||||
|
||||
# If still no good title, look at parent/row elements
|
||||
if len(better_title) < 5:
|
||||
# Check parent row (table structure)
|
||||
row = result.find_parent(['tr', 'td', 'div'])
|
||||
row = result.find_parent(["tr", "td", "div"])
|
||||
if row:
|
||||
# Look for text in the row that's not the link text
|
||||
row_text = row.get_text().strip()
|
||||
# Remove the link text from row text
|
||||
if raw_title in row_text:
|
||||
row_text = row_text.replace(raw_title, '').strip()
|
||||
row_text = row_text.replace(raw_title, "").strip()
|
||||
if len(row_text) > 5 and len(row_text) < 100:
|
||||
better_title = row_text
|
||||
|
||||
# Make URL absolute
|
||||
if not href.startswith('http'):
|
||||
if not href.startswith("http"):
|
||||
href = urljoin("https://www.anime-ultime.net/", href)
|
||||
|
||||
result_item = {
|
||||
'title': better_title,
|
||||
'url': href,
|
||||
'type': 'search_result',
|
||||
'metadata': None
|
||||
"title": better_title,
|
||||
"url": href,
|
||||
"type": "search_result",
|
||||
"metadata": None,
|
||||
}
|
||||
|
||||
# Fetch metadata if requested
|
||||
if include_metadata:
|
||||
metadata = await self.get_anime_metadata(href)
|
||||
result_item['metadata'] = metadata
|
||||
result_item["metadata"] = metadata
|
||||
|
||||
results.append(result_item)
|
||||
|
||||
@@ -373,27 +418,27 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
"""
|
||||
try:
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
episodes = []
|
||||
|
||||
# Look for episode links - anime-ultime uses info-XXXXX-Name-XX-vostfr format
|
||||
# The URL pattern is info-0-1/ID-Anime-Name-XX-vostfr where XX is episode number
|
||||
episode_links = soup.find_all('a', href=re.compile(r'info-0-1/\d+'))
|
||||
episode_links = soup.find_all("a", href=re.compile(r"info-0-1/\d+"))
|
||||
|
||||
for link in episode_links:
|
||||
href = link.get('href', '')
|
||||
href = link.get("href", "")
|
||||
text = link.get_text().strip()
|
||||
|
||||
# Extract episode number from URL pattern
|
||||
# Matches: info-0-1/30200/Naruto-OAV-01-vostfr
|
||||
match = re.search(r'-(\d+)-vostfr$', href, re.I)
|
||||
match = re.search(r"-(\d+)-vostfr$", href, re.I)
|
||||
if not match:
|
||||
# Try other patterns
|
||||
match = re.search(r'Episode[-\s]?(\d+)', href, re.I)
|
||||
match = re.search(r"Episode[-\s]?(\d+)", href, re.I)
|
||||
if not match:
|
||||
# Try to extract from text
|
||||
match = re.search(r'(\d+)', text)
|
||||
match = re.search(r"(\d+)", text)
|
||||
|
||||
if match:
|
||||
episode_num = match.group(1).zfill(2) # Pad with zero
|
||||
@@ -401,32 +446,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
# Extract the episode ID from href and build correct URL
|
||||
# href might be "info-0-1/30200" or "info-0-1/30200/..."
|
||||
# We need: https://www.anime-ultime.net/info-0-1/30200
|
||||
ep_id_match = re.search(r'info-0-1/(\d+)', href)
|
||||
ep_id_match = re.search(r"info-0-1/(\d+)", href)
|
||||
if ep_id_match:
|
||||
ep_id = ep_id_match.group(1)
|
||||
# Build the correct episode URL
|
||||
episode_url = f"https://www.anime-ultime.net/info-0-1/{ep_id}"
|
||||
else:
|
||||
# Fallback to making URL absolute
|
||||
if not href.startswith('http'):
|
||||
if not href.startswith("http"):
|
||||
href = urljoin(anime_url, href)
|
||||
episode_url = href
|
||||
|
||||
episodes.append({
|
||||
'episode': episode_num,
|
||||
'url': episode_url,
|
||||
'title': text
|
||||
})
|
||||
episodes.append(
|
||||
{"episode": episode_num, "url": episode_url, "title": text}
|
||||
)
|
||||
|
||||
# Remove duplicates and sort
|
||||
seen = set()
|
||||
unique_episodes = []
|
||||
for ep in episodes:
|
||||
if ep['episode'] not in seen:
|
||||
seen.add(ep['episode'])
|
||||
if ep["episode"] not in seen:
|
||||
seen.add(ep["episode"])
|
||||
unique_episodes.append(ep)
|
||||
|
||||
unique_episodes.sort(key=lambda x: int(x['episode']))
|
||||
unique_episodes.sort(key=lambda x: int(x["episode"]))
|
||||
|
||||
return unique_episodes
|
||||
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
"""French-Manga.net anime streaming site downloader"""
|
||||
|
||||
from .base import BaseAnimeSite
|
||||
from bs4 import BeautifulSoup
|
||||
import re
|
||||
@@ -17,11 +18,12 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
"french-manga.net",
|
||||
"w16.french-manga.net",
|
||||
"w15.french-manga.net",
|
||||
"www.french-manga.net"
|
||||
"www.french-manga.net",
|
||||
]
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "french-manga"
|
||||
self.base_url = "https://w16.french-manga.net"
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
@@ -29,9 +31,7 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
|
||||
|
||||
async def search_anime(
|
||||
self,
|
||||
query: str,
|
||||
lang: str = "vostfr"
|
||||
self, query: str, lang: str = "vostfr"
|
||||
) -> List[Dict[str, str]]:
|
||||
"""
|
||||
Search for anime on French-Manga.
|
||||
@@ -47,46 +47,50 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
# French-Manga uses a search endpoint
|
||||
search_url = f"{self.base_url}/index.php?do=search"
|
||||
params = {
|
||||
'do': 'search',
|
||||
'subaction': 'search',
|
||||
'story': query,
|
||||
'x': '0',
|
||||
'y': '0'
|
||||
"do": "search",
|
||||
"subaction": "search",
|
||||
"story": query,
|
||||
"x": "0",
|
||||
"y": "0",
|
||||
}
|
||||
|
||||
response = await self.client.post(search_url, data=params)
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
results = []
|
||||
|
||||
# Look for search results in article or story classes
|
||||
for item in soup.find_all('article', class_=lambda x: x and 'story' in x.lower()):
|
||||
title_elem = item.find(['h2', 'h3', 'h4'])
|
||||
link_elem = item.find('a', href=True)
|
||||
img_elem = item.find('img')
|
||||
for item in soup.find_all(
|
||||
"article", class_=lambda x: x and "story" in x.lower()
|
||||
):
|
||||
title_elem = item.find(["h2", "h3", "h4"])
|
||||
link_elem = item.find("a", href=True)
|
||||
img_elem = item.find("img")
|
||||
|
||||
if title_elem and link_elem:
|
||||
title = title_elem.get_text(strip=True)
|
||||
url = link_elem['href']
|
||||
url = link_elem["href"]
|
||||
|
||||
# Ensure absolute URL
|
||||
if url.startswith('/'):
|
||||
if url.startswith("/"):
|
||||
url = self.base_url + url
|
||||
|
||||
cover_image = ""
|
||||
if img_elem and img_elem.get('src'):
|
||||
cover_image = img_elem['src']
|
||||
if cover_image.startswith('/'):
|
||||
if img_elem and img_elem.get("src"):
|
||||
cover_image = img_elem["src"]
|
||||
if cover_image.startswith("/"):
|
||||
cover_image = self.base_url + cover_image
|
||||
|
||||
results.append({
|
||||
'title': title,
|
||||
'url': url,
|
||||
'cover_image': cover_image,
|
||||
'lang': lang
|
||||
})
|
||||
results.append(
|
||||
{
|
||||
"title": title,
|
||||
"url": url,
|
||||
"cover_image": cover_image,
|
||||
"lang": lang,
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(f"Found {len(results)} anime results for query: {query}")
|
||||
return results
|
||||
@@ -96,9 +100,7 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
return []
|
||||
|
||||
async def get_episodes(
|
||||
self,
|
||||
anime_url: str,
|
||||
lang: str = "vostfr"
|
||||
self, anime_url: str, lang: str = "vostfr"
|
||||
) -> List[Dict[str, str]]:
|
||||
"""
|
||||
Get episode list for an anime.
|
||||
@@ -115,34 +117,36 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
episodes = []
|
||||
|
||||
# Look for episode links (typically in a list or table)
|
||||
# French-Manga usually has episode links in <a> tags with episode numbers
|
||||
for link in soup.find_all('a', href=True):
|
||||
href = link['href']
|
||||
for link in soup.find_all("a", href=True):
|
||||
href = link["href"]
|
||||
text = link.get_text(strip=True)
|
||||
|
||||
# Pattern: Episode links usually contain "episode" or numbers
|
||||
if re.search(r'episode?\s*\d+', text.lower()):
|
||||
episode_num = re.search(r'(\d+)', text)
|
||||
if re.search(r"episode?\s*\d+", text.lower()):
|
||||
episode_num = re.search(r"(\d+)", text)
|
||||
if episode_num:
|
||||
episode_number = int(episode_num.group(1))
|
||||
|
||||
# Ensure absolute URL
|
||||
if href.startswith('/'):
|
||||
if href.startswith("/"):
|
||||
href = self.base_url + href
|
||||
|
||||
episodes.append({
|
||||
'episode_number': episode_number,
|
||||
'url': href,
|
||||
'title': text,
|
||||
'host': 'french-manga'
|
||||
})
|
||||
episodes.append(
|
||||
{
|
||||
"episode_number": episode_number,
|
||||
"url": href,
|
||||
"title": text,
|
||||
"host": "french-manga",
|
||||
}
|
||||
)
|
||||
|
||||
# Sort by episode number
|
||||
episodes.sort(key=lambda x: x['episode_number'])
|
||||
episodes.sort(key=lambda x: x["episode_number"])
|
||||
|
||||
logger.info(f"Found {len(episodes)} episodes for {anime_url}")
|
||||
return episodes
|
||||
@@ -166,31 +170,33 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
# Extract title
|
||||
title = ""
|
||||
title_elem = soup.find('h1') or soup.find('h2', class_='title')
|
||||
title_elem = soup.find("h1") or soup.find("h2", class_="title")
|
||||
if title_elem:
|
||||
title = title_elem.get_text(strip=True)
|
||||
|
||||
# Extract synopsis
|
||||
synopsis = ""
|
||||
synopsis_elem = soup.find('div', class_=lambda x: x and 'story' in x.lower())
|
||||
synopsis_elem = soup.find(
|
||||
"div", class_=lambda x: x and "story" in x.lower()
|
||||
)
|
||||
if synopsis_elem:
|
||||
synopsis = synopsis_elem.get_text(strip=True)
|
||||
|
||||
# Extract cover image
|
||||
poster_image = ""
|
||||
img_elem = soup.find('img', class_=lambda x: x and 'poster' in x.lower())
|
||||
if img_elem and img_elem.get('src'):
|
||||
poster_image = img_elem['src']
|
||||
if poster_image.startswith('/'):
|
||||
img_elem = soup.find("img", class_=lambda x: x and "poster" in x.lower())
|
||||
if img_elem and img_elem.get("src"):
|
||||
poster_image = img_elem["src"]
|
||||
if poster_image.startswith("/"):
|
||||
poster_image = self.base_url + poster_image
|
||||
|
||||
# Extract genres
|
||||
genres = []
|
||||
genre_links = soup.find_all('a', href=re.compile(r'/xfsearch/.*genre/'))
|
||||
genre_links = soup.find_all("a", href=re.compile(r"/xfsearch/.*genre/"))
|
||||
for link in genre_links[:10]: # Limit to 10 genres
|
||||
genre = link.get_text(strip=True)
|
||||
if genre:
|
||||
@@ -198,36 +204,38 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
|
||||
# Extract rating (if available)
|
||||
rating = ""
|
||||
rating_elem = soup.find(['span', 'div'], class_=lambda x: x and 'rating' in x.lower())
|
||||
rating_elem = soup.find(
|
||||
["span", "div"], class_=lambda x: x and "rating" in x.lower()
|
||||
)
|
||||
if rating_elem:
|
||||
rating = rating_elem.get_text(strip=True)
|
||||
|
||||
return {
|
||||
'title': title,
|
||||
'synopsis': synopsis,
|
||||
'genres': genres,
|
||||
'rating': rating,
|
||||
'release_year': '',
|
||||
'studio': '',
|
||||
'poster_image': poster_image,
|
||||
'total_episodes': len(await self.get_episodes(anime_url)),
|
||||
'status': '',
|
||||
'languages': ['vf', 'vostfr']
|
||||
"title": title,
|
||||
"synopsis": synopsis,
|
||||
"genres": genres,
|
||||
"rating": rating,
|
||||
"release_year": "",
|
||||
"studio": "",
|
||||
"poster_image": poster_image,
|
||||
"total_episodes": len(await self.get_episodes(anime_url)),
|
||||
"status": "",
|
||||
"languages": ["vf", "vostfr"],
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error getting anime metadata: {e}")
|
||||
return {
|
||||
'title': '',
|
||||
'synopsis': '',
|
||||
'genres': [],
|
||||
'rating': '',
|
||||
'release_year': '',
|
||||
'studio': '',
|
||||
'poster_image': '',
|
||||
'total_episodes': 0,
|
||||
'status': '',
|
||||
'languages': ['vf', 'vostfr']
|
||||
"title": "",
|
||||
"synopsis": "",
|
||||
"genres": [],
|
||||
"rating": "",
|
||||
"release_year": "",
|
||||
"studio": "",
|
||||
"poster_image": "",
|
||||
"total_episodes": 0,
|
||||
"status": "",
|
||||
"languages": ["vf", "vostfr"],
|
||||
}
|
||||
|
||||
async def get_download_link(self, url: str) -> tuple[str, str]:
|
||||
@@ -248,20 +256,20 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
# Look for iframe or video player
|
||||
iframe = soup.find('iframe', src=True)
|
||||
iframe = soup.find("iframe", src=True)
|
||||
if iframe:
|
||||
video_url = iframe['src']
|
||||
video_url = iframe["src"]
|
||||
else:
|
||||
# Look for video tag directly
|
||||
video = soup.find('video', src=True)
|
||||
video = soup.find("video", src=True)
|
||||
if video:
|
||||
video_url = video['src']
|
||||
video_url = video["src"]
|
||||
else:
|
||||
# Try to find in script tags
|
||||
scripts = soup.find_all('script')
|
||||
scripts = soup.find_all("script")
|
||||
for script in scripts:
|
||||
if script.string:
|
||||
# Look for iframe or video URLs in JavaScript
|
||||
@@ -274,20 +282,20 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
if match:
|
||||
video_url = match.group(1)
|
||||
break
|
||||
if 'video_url' in locals():
|
||||
if "video_url" in locals():
|
||||
break
|
||||
|
||||
if 'video_url' not in locals():
|
||||
if "video_url" not in locals():
|
||||
raise ValueError("Could not find video player URL")
|
||||
|
||||
# Ensure absolute URL
|
||||
if video_url.startswith('//'):
|
||||
video_url = 'https:' + video_url
|
||||
elif video_url.startswith('/'):
|
||||
if video_url.startswith("//"):
|
||||
video_url = "https:" + video_url
|
||||
elif video_url.startswith("/"):
|
||||
video_url = self.base_url + video_url
|
||||
|
||||
# Extract episode title
|
||||
title_elem = soup.find('h1') or soup.find('h2')
|
||||
title_elem = soup.find("h1") or soup.find("h2")
|
||||
episode_title = title_elem.get_text(strip=True) if title_elem else "Episode"
|
||||
episode_title = sanitize_filename(episode_title)
|
||||
|
||||
|
||||
@@ -7,79 +7,100 @@ from urllib.parse import urljoin
|
||||
|
||||
class NekoSamaDownloader(BaseAnimeSite):
|
||||
"""Downloader for neko-sama.org (anime streaming via Gupy)
|
||||
|
||||
|
||||
NOTE: neko-sama.org now redirects to Gupy, which is a legal streaming search engine.
|
||||
It does NOT host video content - it provides metadata about where to watch legally.
|
||||
This provider can search and get metadata but cannot provide direct download links.
|
||||
"""
|
||||
|
||||
BASE_DOMAINS = ["neko-sama.org", "www.neko-sama.org", "neko-sama.fr", "nekosama.fr", "www.gupy.fr", "gupy.fr"]
|
||||
BASE_DOMAINS = [
|
||||
"neko-sama.org",
|
||||
"www.neko-sama.org",
|
||||
"neko-sama.fr",
|
||||
"nekosama.fr",
|
||||
"www.gupy.fr",
|
||||
"gupy.fr",
|
||||
]
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "neko-sama"
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
|
||||
|
||||
async def get_download_link(self, url: str, target_filename: Optional[str] = None) -> tuple[str, str]:
|
||||
async def get_download_link(
|
||||
self, url: str, target_filename: Optional[str] = None
|
||||
) -> tuple[str, str]:
|
||||
"""
|
||||
Extract download link from neko-sama URL.
|
||||
|
||||
|
||||
NOTE: neko-sama.org/Gupy is a legal streaming search engine, NOT a video host.
|
||||
This returns streaming platform information instead of direct video links.
|
||||
"""
|
||||
try:
|
||||
# Check if this is a Gupy URL
|
||||
if 'gupy.fr' in url or 'neko-sama.org' in url:
|
||||
if "gupy.fr" in url or "neko-sama.org" in url:
|
||||
response = await self.client.get(url, follow_redirects=True)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
# Look for streaming platform links
|
||||
streaming_links = []
|
||||
for link in soup.find_all('a', href=True):
|
||||
href = link.get('href', '')
|
||||
if '/out/' in href:
|
||||
for link in soup.find_all("a", href=True):
|
||||
href = link.get("href", "")
|
||||
if "/out/" in href:
|
||||
text = link.get_text(strip=True)
|
||||
if text and 'Regarder' in text:
|
||||
if text and "Regarder" in text:
|
||||
streaming_links.append(f"{text}: {href}")
|
||||
|
||||
|
||||
if streaming_links:
|
||||
title_elem = soup.find('h1') or soup.find('title')
|
||||
title = title_elem.get_text(strip=True).split('|')[0].strip() if title_elem else "Unknown"
|
||||
info = "Available streaming platforms:\n" + "\n".join(streaming_links[:5])
|
||||
title_elem = soup.find("h1") or soup.find("title")
|
||||
title = (
|
||||
title_elem.get_text(strip=True).split("|")[0].strip()
|
||||
if title_elem
|
||||
else "Unknown"
|
||||
)
|
||||
info = "Available streaming platforms:\n" + "\n".join(
|
||||
streaming_links[:5]
|
||||
)
|
||||
filename = target_filename or f"{title}_streaming_info.txt"
|
||||
return info, filename
|
||||
|
||||
raise Exception("No streaming links found - Gupy is a legal streaming search, not a video host")
|
||||
|
||||
|
||||
raise Exception(
|
||||
"No streaming links found - Gupy is a legal streaming search, not a video host"
|
||||
)
|
||||
|
||||
# Legacy: try original method for other URLs
|
||||
response = await self.client.get(url, follow_redirects=True)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
# Method 1: Look for iframes with video
|
||||
iframes = soup.find_all('iframe')
|
||||
iframes = soup.find_all("iframe")
|
||||
for iframe in iframes:
|
||||
src = iframe.get('src', '')
|
||||
if src and any(p in src for p in ['video', 'player', 'stream']):
|
||||
if not src.startswith('http'):
|
||||
src = iframe.get("src", "")
|
||||
if src and any(p in src for p in ["video", "player", "stream"]):
|
||||
if not src.startswith("http"):
|
||||
src = urljoin(str(response.url), src)
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
# Method 2: Look for video tags
|
||||
videos = soup.find_all('video')
|
||||
videos = soup.find_all("video")
|
||||
for video in videos:
|
||||
src = video.get('src') or video.get('data-src')
|
||||
src = video.get("src") or video.get("data-src")
|
||||
if src:
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
sources = video.find_all('source')
|
||||
sources = video.find_all("source")
|
||||
for source in sources:
|
||||
src = source.get('src', '')
|
||||
src = source.get("src", "")
|
||||
if src:
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
# Method 3: Look in scripts
|
||||
scripts = soup.find_all('script')
|
||||
scripts = soup.find_all("script")
|
||||
for script in scripts:
|
||||
if script.string:
|
||||
patterns = [
|
||||
@@ -90,24 +111,26 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
for pattern in patterns:
|
||||
matches = re.findall(pattern, script.string)
|
||||
for match in matches:
|
||||
match = match.replace('\\/', '/')
|
||||
if any(ext in match for ext in ['mp4', 'm3u8']):
|
||||
match = match.replace("\\/", "/")
|
||||
if any(ext in match for ext in ["mp4", "m3u8"]):
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return match, filename
|
||||
|
||||
raise Exception("Could not find video link - Neko-Sama/Gupy does not host video content")
|
||||
raise Exception(
|
||||
"Could not find video link - Neko-Sama/Gupy does not host video content"
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
raise Exception(f"Error extracting NekoSama link: {str(e)}")
|
||||
|
||||
def _generate_filename(self, url: str) -> str:
|
||||
parts = url.split('/')
|
||||
parts = url.split("/")
|
||||
anime_name = "anime"
|
||||
episode = "1"
|
||||
|
||||
for i, part in enumerate(parts):
|
||||
if 'episode' in part.lower():
|
||||
match = re.search(r'episode[-\s]*(\d+)', part, re.I)
|
||||
if "episode" in part.lower():
|
||||
match = re.search(r"episode[-\s]*(\d+)", part, re.I)
|
||||
if match:
|
||||
episode = match.group(1)
|
||||
|
||||
@@ -118,31 +141,31 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
"""Get list of episodes for an anime."""
|
||||
try:
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
episodes = []
|
||||
# Try to find episode links
|
||||
episode_links = soup.find_all('a', href=re.compile(r'episode'))
|
||||
episode_links = soup.find_all("a", href=re.compile(r"episode"))
|
||||
|
||||
for link in episode_links:
|
||||
href = link.get('href', '')
|
||||
match = re.search(r'episode[-\s]*(\d+)', href, re.I)
|
||||
href = link.get("href", "")
|
||||
match = re.search(r"episode[-\s]*(\d+)", href, re.I)
|
||||
if match:
|
||||
episode_num = match.group(1)
|
||||
if not href.startswith('http'):
|
||||
if not href.startswith("http"):
|
||||
href = urljoin(anime_url, href)
|
||||
|
||||
episodes.append({'episode': episode_num, 'url': href})
|
||||
episodes.append({"episode": episode_num, "url": href})
|
||||
|
||||
# Deduplicate and sort
|
||||
seen = set()
|
||||
unique_episodes = []
|
||||
for ep in episodes:
|
||||
if ep['episode'] not in seen:
|
||||
seen.add(ep['episode'])
|
||||
if ep["episode"] not in seen:
|
||||
seen.add(ep["episode"])
|
||||
unique_episodes.append(ep)
|
||||
|
||||
unique_episodes.sort(key=lambda x: int(x['episode']))
|
||||
unique_episodes.sort(key=lambda x: int(x["episode"]))
|
||||
return unique_episodes
|
||||
|
||||
except Exception as e:
|
||||
@@ -153,70 +176,70 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
try:
|
||||
print(f"[NEKO-SAMA] Extracting metadata from: {anime_url}")
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
metadata = {
|
||||
'synopsis': None,
|
||||
'genres': [],
|
||||
'rating': None,
|
||||
'release_year': None,
|
||||
'studio': None,
|
||||
'poster_image': None,
|
||||
'banner_image': None,
|
||||
'total_episodes': None,
|
||||
'status': None,
|
||||
'alternative_titles': []
|
||||
"synopsis": None,
|
||||
"genres": [],
|
||||
"rating": None,
|
||||
"release_year": None,
|
||||
"studio": None,
|
||||
"poster_image": None,
|
||||
"banner_image": None,
|
||||
"total_episodes": None,
|
||||
"status": None,
|
||||
"alternative_titles": [],
|
||||
}
|
||||
|
||||
# Extract title and year from h1
|
||||
title_elem = soup.find('h1')
|
||||
title_elem = soup.find("h1")
|
||||
if title_elem:
|
||||
title_text = title_elem.get_text(strip=True)
|
||||
# Extract year from title like "Naruto (2002)"
|
||||
year_match = re.search(r'\((\d{4})\)', title_text)
|
||||
year_match = re.search(r"\((\d{4})\)", title_text)
|
||||
if year_match:
|
||||
metadata['release_year'] = int(year_match.group(1))
|
||||
|
||||
metadata["release_year"] = int(year_match.group(1))
|
||||
|
||||
# Extract synopsis - Gupy shows it as paragraphs
|
||||
synopsis_elem = soup.find('p')
|
||||
synopsis_elem = soup.find("p")
|
||||
if synopsis_elem:
|
||||
text = synopsis_elem.get_text(strip=True)
|
||||
if len(text) > 50:
|
||||
metadata['synopsis'] = text
|
||||
metadata["synopsis"] = text
|
||||
|
||||
# Extract genres from meta tags or links
|
||||
genre_links = soup.find_all('a', href=re.compile(r'serie-|genre|tag'))
|
||||
genre_links = soup.find_all("a", href=re.compile(r"serie-|genre|tag"))
|
||||
if genre_links:
|
||||
genres = []
|
||||
for link in genre_links[:5]:
|
||||
text = link.get_text(strip=True)
|
||||
if text and '/' not in text and len(text) < 30:
|
||||
if text and "/" not in text and len(text) < 30:
|
||||
genres.append(text)
|
||||
metadata['genres'] = genres
|
||||
metadata["genres"] = genres
|
||||
|
||||
# Extract rating from percentage
|
||||
rating_elem = soup.find(string=re.compile(r'\d+(\.\d+)?%'))
|
||||
rating_elem = soup.find(string=re.compile(r"\d+(\.\d+)?%"))
|
||||
if rating_elem:
|
||||
match = re.search(r'(\d+(\.\d+)?)%', rating_elem)
|
||||
match = re.search(r"(\d+(\.\d+)?)%", rating_elem)
|
||||
if match:
|
||||
rating = float(match.group(1)) / 10
|
||||
metadata['rating'] = f"{rating:.1f}/10"
|
||||
metadata["rating"] = f"{rating:.1f}/10"
|
||||
|
||||
# Extract poster image
|
||||
poster_elem = soup.find('img', src=re.compile(r'poster|poster'))
|
||||
poster_elem = soup.find("img", src=re.compile(r"poster|poster"))
|
||||
if poster_elem:
|
||||
metadata['poster_image'] = poster_elem.get('src')
|
||||
metadata["poster_image"] = poster_elem.get("src")
|
||||
|
||||
# Extract episode count from page text
|
||||
page_text = soup.get_text()
|
||||
ep_match = re.search(r'(\d+)\s*episodes?', page_text, re.I)
|
||||
ep_match = re.search(r"(\d+)\s*episodes?", page_text, re.I)
|
||||
if ep_match:
|
||||
metadata['total_episodes'] = int(ep_match.group(1))
|
||||
metadata["total_episodes"] = int(ep_match.group(1))
|
||||
|
||||
# Extract studio/director
|
||||
director_elem = soup.find('a', href=re.compile(r'person|réalisé'))
|
||||
director_elem = soup.find("a", href=re.compile(r"person|réalisé"))
|
||||
if director_elem:
|
||||
metadata['studio'] = director_elem.get_text(strip=True)
|
||||
metadata["studio"] = director_elem.get_text(strip=True)
|
||||
|
||||
print(f"[NEKO-SAMA] Extracted metadata: {metadata}")
|
||||
return metadata
|
||||
@@ -225,16 +248,19 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
print(f"[NEKO-SAMA] Error extracting metadata: {e}")
|
||||
return {}
|
||||
|
||||
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
|
||||
async def search_anime(
|
||||
self, query: str, lang: str = "vostfr", include_metadata: bool = False
|
||||
) -> list[dict]:
|
||||
"""Search for anime on neko-sama (uses Gupy backend)."""
|
||||
try:
|
||||
import time
|
||||
from html import unescape
|
||||
|
||||
start = time.time()
|
||||
print(f"[NEKO-SAMA] Searching for '{query}' ({lang})...")
|
||||
|
||||
# Neko-Sama now uses Gupy - try the direct URL pattern
|
||||
search_slug = query.lower().replace(' ', '-')
|
||||
search_slug = query.lower().replace(" ", "-")
|
||||
search_urls = [
|
||||
f"https://www.gupy.fr/series/{search_slug}/",
|
||||
f"https://neko-sama.org/series/{search_slug}/",
|
||||
@@ -250,34 +276,40 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
print(f"[NEKO-SAMA] Found anime at {final_url}")
|
||||
|
||||
# Extract title from page
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
title_elem = soup.find('h1') or soup.find('title')
|
||||
title = unescape(title_elem.get_text(strip=True)) if title_elem else query
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
title_elem = soup.find("h1") or soup.find("title")
|
||||
title = (
|
||||
unescape(title_elem.get_text(strip=True))
|
||||
if title_elem
|
||||
else query
|
||||
)
|
||||
# Clean up title
|
||||
title = title.split('|')[0].split('-')[0].strip()
|
||||
title = title.split("|")[0].split("-")[0].strip()
|
||||
|
||||
result = {
|
||||
'title': title,
|
||||
'url': final_url,
|
||||
'cover_image': None,
|
||||
'type': 'direct',
|
||||
'metadata': None
|
||||
"title": title,
|
||||
"url": final_url,
|
||||
"cover_image": None,
|
||||
"type": "direct",
|
||||
"metadata": None,
|
||||
}
|
||||
|
||||
# Try to get poster
|
||||
poster = soup.find('img', src=re.compile(r'poster'))
|
||||
poster = soup.find("img", src=re.compile(r"poster"))
|
||||
if poster:
|
||||
result['cover_image'] = poster.get('src')
|
||||
result["cover_image"] = poster.get("src")
|
||||
|
||||
if include_metadata:
|
||||
metadata = await self.get_anime_metadata(final_url)
|
||||
result['metadata'] = metadata
|
||||
result["metadata"] = metadata
|
||||
|
||||
results.append(result)
|
||||
break
|
||||
|
||||
elapsed = time.time() - start
|
||||
print(f"[NEKO-SAMA] Search completed in {elapsed:.2f}s, found {len(results)} results")
|
||||
print(
|
||||
f"[NEKO-SAMA] Search completed in {elapsed:.2f}s, found {len(results)} results"
|
||||
)
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
|
||||
@@ -9,6 +9,10 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
|
||||
BASE_DOMAINS = ["vostfree.tv", "www.vostfree.tv"]
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "vostfree"
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
|
||||
|
||||
@@ -16,35 +20,35 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
"""Extract download link from vostfree URL"""
|
||||
try:
|
||||
response = await self.client.get(url, follow_redirects=True)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
# Method 1: Look for iframe players
|
||||
iframes = soup.find_all('iframe')
|
||||
iframes = soup.find_all("iframe")
|
||||
for iframe in iframes:
|
||||
src = iframe.get('src', '')
|
||||
if src and any(p in src for p in ['player', 'video', 'stream']):
|
||||
if not src.startswith('http'):
|
||||
src = iframe.get("src", "")
|
||||
if src and any(p in src for p in ["player", "video", "stream"]):
|
||||
if not src.startswith("http"):
|
||||
src = urljoin(str(response.url), src)
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
# Method 2: Look for video tags
|
||||
videos = soup.find_all('video')
|
||||
videos = soup.find_all("video")
|
||||
for video in videos:
|
||||
src = video.get('src')
|
||||
src = video.get("src")
|
||||
if src:
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
sources = video.find_all('source')
|
||||
sources = video.find_all("source")
|
||||
for source in sources:
|
||||
src = source.get('src', '')
|
||||
if src and any(ext in src for ext in ['mp4', 'm3u8']):
|
||||
src = source.get("src", "")
|
||||
if src and any(ext in src for ext in ["mp4", "m3u8"]):
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
# Method 3: Look in scripts
|
||||
scripts = soup.find_all('script')
|
||||
scripts = soup.find_all("script")
|
||||
for script in scripts:
|
||||
if script.string:
|
||||
patterns = [
|
||||
@@ -56,8 +60,8 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
for pattern in patterns:
|
||||
matches = re.findall(pattern, script.string)
|
||||
for match in matches:
|
||||
match = match.replace('\\/', '/')
|
||||
if any(ext in match for ext in ['mp4', 'm3u8']):
|
||||
match = match.replace("\\/", "/")
|
||||
if any(ext in match for ext in ["mp4", "m3u8"]):
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return match, filename
|
||||
|
||||
@@ -67,12 +71,12 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
raise Exception(f"Error extracting Vostfree link: {str(e)}")
|
||||
|
||||
def _generate_filename(self, url: str) -> str:
|
||||
parts = url.split('/')
|
||||
parts = url.split("/")
|
||||
anime_name = "anime"
|
||||
episode = "1"
|
||||
|
||||
for part in parts:
|
||||
match = re.search(r'episode[-\s]*(\d+)', part, re.I)
|
||||
match = re.search(r"episode[-\s]*(\d+)", part, re.I)
|
||||
if match:
|
||||
episode = match.group(1)
|
||||
|
||||
@@ -82,30 +86,30 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
async def get_episodes(self, anime_url: str, lang: str = "vostfr") -> list[dict]:
|
||||
try:
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
episodes = []
|
||||
episode_links = soup.find_all('a', href=re.compile(r'episode', re.I))
|
||||
episode_links = soup.find_all("a", href=re.compile(r"episode", re.I))
|
||||
|
||||
for link in episode_links:
|
||||
href = link.get('href', '')
|
||||
match = re.search(r'episode[-\s]*(\d+)', href, re.I)
|
||||
href = link.get("href", "")
|
||||
match = re.search(r"episode[-\s]*(\d+)", href, re.I)
|
||||
if match:
|
||||
episode_num = match.group(1)
|
||||
if not href.startswith('http'):
|
||||
if not href.startswith("http"):
|
||||
href = urljoin(anime_url, href)
|
||||
|
||||
episodes.append({'episode': episode_num, 'url': href})
|
||||
episodes.append({"episode": episode_num, "url": href})
|
||||
|
||||
# Deduplicate and sort
|
||||
seen = set()
|
||||
unique_episodes = []
|
||||
for ep in episodes:
|
||||
if ep['episode'] not in seen:
|
||||
seen.add(ep['episode'])
|
||||
if ep["episode"] not in seen:
|
||||
seen.add(ep["episode"])
|
||||
unique_episodes.append(ep)
|
||||
|
||||
unique_episodes.sort(key=lambda x: int(x['episode']))
|
||||
unique_episodes.sort(key=lambda x: int(x["episode"]))
|
||||
return unique_episodes
|
||||
|
||||
except Exception as e:
|
||||
@@ -119,29 +123,29 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
try:
|
||||
print(f"[VOSTFREE] Extracting metadata from: {anime_url}")
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
metadata = {
|
||||
'synopsis': None,
|
||||
'genres': [],
|
||||
'rating': None,
|
||||
'release_year': None,
|
||||
'studio': None,
|
||||
'poster_image': None,
|
||||
'banner_image': None,
|
||||
'total_episodes': None,
|
||||
'status': None,
|
||||
'alternative_titles': []
|
||||
"synopsis": None,
|
||||
"genres": [],
|
||||
"rating": None,
|
||||
"release_year": None,
|
||||
"studio": None,
|
||||
"poster_image": None,
|
||||
"banner_image": None,
|
||||
"total_episodes": None,
|
||||
"status": None,
|
||||
"alternative_titles": [],
|
||||
}
|
||||
|
||||
# Extract synopsis
|
||||
synopsis_selectors = [
|
||||
'div.synopsis',
|
||||
'div.description',
|
||||
"div.synopsis",
|
||||
"div.description",
|
||||
'div[class*="synopsis"]',
|
||||
'div[class*="desc"]',
|
||||
'p.synopsis',
|
||||
'.anime-synopsis'
|
||||
"p.synopsis",
|
||||
".anime-synopsis",
|
||||
]
|
||||
|
||||
for selector in synopsis_selectors:
|
||||
@@ -149,57 +153,65 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
if synopsis_elem:
|
||||
synopsis = synopsis_elem.get_text(strip=True)
|
||||
if len(synopsis) > 50:
|
||||
metadata['synopsis'] = synopsis
|
||||
metadata["synopsis"] = synopsis
|
||||
break
|
||||
|
||||
# Extract genres
|
||||
genre_links = soup.find_all('a', href=re.compile(r'genre|tag|type', re.I))
|
||||
genre_links = soup.find_all("a", href=re.compile(r"genre|tag|type", re.I))
|
||||
if genre_links:
|
||||
metadata['genres'] = [link.get_text(strip=True) for link in genre_links[:5]]
|
||||
metadata["genres"] = [
|
||||
link.get_text(strip=True) for link in genre_links[:5]
|
||||
]
|
||||
|
||||
# Extract rating
|
||||
rating_selectors = [
|
||||
'span.rating',
|
||||
'div.rating',
|
||||
'span.score',
|
||||
"span.rating",
|
||||
"div.rating",
|
||||
"span.score",
|
||||
'div[class*="rating"]',
|
||||
'div[class*="score"]'
|
||||
'div[class*="score"]',
|
||||
]
|
||||
|
||||
for selector in rating_selectors:
|
||||
rating_elem = soup.select_one(selector)
|
||||
if rating_elem:
|
||||
rating_text = rating_elem.get_text(strip=True)
|
||||
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*10', rating_text)
|
||||
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*10", rating_text)
|
||||
if rating_match:
|
||||
metadata['rating'] = f"{rating_match.group(1)}/10"
|
||||
metadata["rating"] = f"{rating_match.group(1)}/10"
|
||||
break
|
||||
|
||||
# Extract release year
|
||||
page_text = soup.get_text()
|
||||
year_matches = re.findall(r'\b(19\d{2}|20\d{2})\b', page_text)
|
||||
year_matches = re.findall(r"\b(19\d{2}|20\d{2})\b", page_text)
|
||||
if year_matches:
|
||||
import datetime
|
||||
|
||||
current_year = datetime.datetime.now().year + 2
|
||||
valid_years = [int(y) for y in year_matches if 1950 <= int(y) <= current_year]
|
||||
valid_years = [
|
||||
int(y) for y in year_matches if 1950 <= int(y) <= current_year
|
||||
]
|
||||
if valid_years:
|
||||
from collections import Counter
|
||||
metadata['release_year'] = Counter(valid_years).most_common(1)[0][0]
|
||||
|
||||
metadata["release_year"] = Counter(valid_years).most_common(1)[0][0]
|
||||
|
||||
# Extract poster image
|
||||
poster_elem = soup.select_one('img.poster, img.cover, .anime-poster img')
|
||||
poster_elem = soup.select_one("img.poster, img.cover, .anime-poster img")
|
||||
if poster_elem:
|
||||
metadata['poster_image'] = poster_elem.get('src') or poster_elem.get('data-src')
|
||||
metadata["poster_image"] = poster_elem.get("src") or poster_elem.get(
|
||||
"data-src"
|
||||
)
|
||||
|
||||
# Extract poster from og:image
|
||||
og_image = soup.find('meta', property='og:image')
|
||||
if og_image and not metadata['poster_image']:
|
||||
metadata['poster_image'] = og_image.get('content')
|
||||
og_image = soup.find("meta", property="og:image")
|
||||
if og_image and not metadata["poster_image"]:
|
||||
metadata["poster_image"] = og_image.get("content")
|
||||
|
||||
# Extract total episodes
|
||||
episodes_count = len(await self.get_episodes(anime_url))
|
||||
if episodes_count > 0:
|
||||
metadata['total_episodes'] = episodes_count
|
||||
metadata["total_episodes"] = episodes_count
|
||||
|
||||
print(f"[VOSTFREE] Extracted metadata: {metadata}")
|
||||
return metadata
|
||||
@@ -208,7 +220,9 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
print(f"[VOSTFREE] Error extracting metadata: {e}")
|
||||
return {}
|
||||
|
||||
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
|
||||
async def search_anime(
|
||||
self, query: str, lang: str = "vostfr", include_metadata: bool = False
|
||||
) -> list[dict]:
|
||||
"""
|
||||
Search for anime on vostfree
|
||||
|
||||
@@ -219,6 +233,7 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
"""
|
||||
try:
|
||||
import time
|
||||
|
||||
start = time.time()
|
||||
print(f"[VOSTFREE] Searching for '{query}' ({lang})...")
|
||||
|
||||
@@ -233,15 +248,15 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
if response.status_code == 200:
|
||||
print(f"[VOSTFREE] Found anime at {str(response.url)}")
|
||||
result = {
|
||||
'title': query,
|
||||
'url': str(response.url),
|
||||
'type': 'direct',
|
||||
'metadata': None
|
||||
"title": query,
|
||||
"url": str(response.url),
|
||||
"type": "direct",
|
||||
"metadata": None,
|
||||
}
|
||||
|
||||
if include_metadata:
|
||||
metadata = await self.get_anime_metadata(str(response.url))
|
||||
result['metadata'] = metadata
|
||||
result["metadata"] = metadata
|
||||
|
||||
return [result]
|
||||
|
||||
|
||||
+128
-139
@@ -1,4 +1,5 @@
|
||||
"""FS7 (French Stream) series site downloader"""
|
||||
|
||||
import logging
|
||||
import re
|
||||
from typing import List, Dict, Any, Optional
|
||||
@@ -19,29 +20,46 @@ class FS7Downloader(BaseSeriesSite):
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.base_url = "https://fs7.lol"
|
||||
self.search_url = f"{self.base_url}/"
|
||||
# Update client headers to mimic browser
|
||||
self.client.headers.update({
|
||||
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
|
||||
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
|
||||
'Accept-Language': 'fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7',
|
||||
'Accept-Encoding': 'gzip, deflate',
|
||||
'Connection': 'keep-alive',
|
||||
'Upgrade-Insecure-Requests': '1'
|
||||
})
|
||||
self.id = "fs7"
|
||||
self.provider_id = "fs7"
|
||||
self.default_domain = "fs7.lol"
|
||||
self.test_tlds = ["lol", "com", "net", "org", "tv", "ws", "cc", "co"]
|
||||
self.base_url = f"https://{self.default_domain}"
|
||||
self._domain_checked = False
|
||||
self.client.headers.update(
|
||||
{
|
||||
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
|
||||
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
|
||||
"Accept-Language": "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7",
|
||||
"Accept-Encoding": "gzip, deflate",
|
||||
"Connection": "keep-alive",
|
||||
"Upgrade-Insecure-Requests": "1",
|
||||
}
|
||||
)
|
||||
|
||||
async def _ensure_base_url(self):
|
||||
"""Ensure base_url is set to the current active domain"""
|
||||
if self._domain_checked:
|
||||
return
|
||||
self._domain_checked = True
|
||||
try:
|
||||
from app.utils import DomainManager
|
||||
|
||||
active_domain = await DomainManager.get_active_domain(
|
||||
self.provider_id, self.default_domain, self.test_tlds, test_path="/"
|
||||
)
|
||||
self.base_url = f"https://{active_domain}"
|
||||
logger.info(f"Using active domain for FS7: {self.base_url}")
|
||||
except Exception as e:
|
||||
logger.warning(f"Domain check failed for FS7, using default: {e}")
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
"""Check if this downloader can handle the given URL"""
|
||||
return "fs7.lol" in url.lower() or "french-stream" in url.lower()
|
||||
|
||||
async def search_anime(
|
||||
self,
|
||||
query: str,
|
||||
lang: str = "vf"
|
||||
) -> List[Dict[str, str]]:
|
||||
async def search_anime(self, query: str, lang: str = "vf") -> List[Dict[str, str]]:
|
||||
"""
|
||||
Search for series on FS7.
|
||||
Search for series on FS7 using DLE AJAX search endpoint.
|
||||
|
||||
Args:
|
||||
query: Search query
|
||||
@@ -51,91 +69,61 @@ class FS7Downloader(BaseSeriesSite):
|
||||
List of series with title, url, cover_image
|
||||
"""
|
||||
try:
|
||||
await self._ensure_base_url()
|
||||
logger.info(f"Searching FS7 for: {query}")
|
||||
|
||||
# FS7 uses GET request with query parameters for search
|
||||
response = await self.client.get(
|
||||
self.search_url,
|
||||
params={
|
||||
"do": "search",
|
||||
"subaction": "search",
|
||||
"story": query
|
||||
}
|
||||
ajax_url = f"{self.base_url}/engine/ajax/search.php"
|
||||
response = await self.client.post(
|
||||
ajax_url,
|
||||
data={"query": query, "page": "1"},
|
||||
headers={
|
||||
"Content-Type": "application/x-www-form-urlencoded",
|
||||
"X-Requested-With": "XMLHttpRequest",
|
||||
"Referer": f"{self.base_url}/",
|
||||
},
|
||||
)
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
results = []
|
||||
|
||||
# Look for series items
|
||||
# FS7 usually structure: <div class="movie-item">...<a href="..."><img src="..."></a>...</div>
|
||||
# Or directly <a> tags with images
|
||||
items = soup.find_all('div', class_='movie-item')
|
||||
if not items:
|
||||
# Fallback to the previous method if layout is different
|
||||
items = soup.find_all('a', href=re.compile(r'/s-tv/\d+-.+\.html'))
|
||||
|
||||
for item in items[:24]: # Limit to 24 results
|
||||
# Find the link and image within the item or the item itself
|
||||
if item.name == 'a':
|
||||
link_elem = item
|
||||
else:
|
||||
link_elem = item.find('a', href=re.compile(r'/s-tv/|/films/'))
|
||||
|
||||
if not link_elem:
|
||||
for item in soup.find_all("div", class_="search-item")[:24]:
|
||||
onclick = item.get("onclick", "")
|
||||
url_match = re.search(r"location\.href=['\"]([^'\"]+)['\"]", onclick)
|
||||
if not url_match:
|
||||
continue
|
||||
|
||||
url = link_elem.get('href', '')
|
||||
if not url.startswith('http'):
|
||||
url = url_match.group(1)
|
||||
if not url.startswith("http"):
|
||||
url = urljoin(self.base_url, url)
|
||||
|
||||
# Extract title
|
||||
img_elem = item.find('img')
|
||||
title = ""
|
||||
if img_elem and img_elem.get('alt'):
|
||||
title = img_elem.get('alt').strip()
|
||||
elif link_elem.get('title'):
|
||||
title = link_elem.get('title').strip()
|
||||
else:
|
||||
title = item.get_text(strip=True)
|
||||
title_elem = item.find("div", class_="search-title")
|
||||
title = title_elem.get_text(strip=True) if title_elem else ""
|
||||
title = re.sub(r"\s+", " ", title).strip()
|
||||
|
||||
# Extract cover image
|
||||
img_elem = item.find('img')
|
||||
cover_image = ""
|
||||
if img_elem:
|
||||
# Check for common lazy loading attributes used by various themes
|
||||
cover_image = (
|
||||
img_elem.get('data-src') or
|
||||
img_elem.get('data-original') or
|
||||
img_elem.get('src') or
|
||||
""
|
||||
)
|
||||
|
||||
# If still empty, look for background-style images in inline styles
|
||||
if not cover_image:
|
||||
style = item.get('style', '')
|
||||
if 'background-image' in style:
|
||||
match = re.search(r'url\([\'"]?(.*?)[\'"]?\)', style)
|
||||
if match:
|
||||
cover_image = match.group(1)
|
||||
|
||||
if cover_image and not cover_image.startswith('http'):
|
||||
cover_image = urljoin(self.base_url, cover_image)
|
||||
|
||||
# Clean up title
|
||||
title = re.sub(r'\s+affiche$', '', title, flags=re.IGNORECASE).strip()
|
||||
title = re.sub(r'\s+', ' ', title)
|
||||
poster_elem = item.find("div", class_="search-poster")
|
||||
if poster_elem:
|
||||
img = poster_elem.find("img")
|
||||
if img:
|
||||
cover_image = (
|
||||
img.get("data-src")
|
||||
or img.get("data-original")
|
||||
or img.get("src")
|
||||
or ""
|
||||
)
|
||||
|
||||
if title and len(title) > 2:
|
||||
if not any(r['url'] == url for r in results):
|
||||
results.append({
|
||||
'title': title,
|
||||
'url': url,
|
||||
'cover_image': cover_image
|
||||
})
|
||||
results.append(
|
||||
{
|
||||
"title": title,
|
||||
"url": url,
|
||||
"cover_image": cover_image,
|
||||
"provider_id": self.provider_id,
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(f"Found {len(results)} series on FS7")
|
||||
logger.info(f"Found {len(results)} results on FS7 for '{query}'")
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
@@ -143,9 +131,7 @@ class FS7Downloader(BaseSeriesSite):
|
||||
return []
|
||||
|
||||
async def get_episodes(
|
||||
self,
|
||||
anime_url: str,
|
||||
lang: str = "vf"
|
||||
self, anime_url: str, lang: str = "vf"
|
||||
) -> List[Dict[str, str]]:
|
||||
"""
|
||||
Get episode list for a series.
|
||||
@@ -164,31 +150,33 @@ class FS7Downloader(BaseSeriesSite):
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
episodes = []
|
||||
|
||||
# Get series title for episode naming
|
||||
title_elem = soup.find('h1')
|
||||
title_elem = soup.find("h1")
|
||||
series_title = title_elem.get_text(strip=True) if title_elem else "Series"
|
||||
# Clean up title: remove "affiche" suffix
|
||||
series_title = re.sub(r'\s+affiche$', '', series_title, flags=re.IGNORECASE).strip()
|
||||
series_title = re.sub(
|
||||
r"\s+affiche$", "", series_title, flags=re.IGNORECASE
|
||||
).strip()
|
||||
|
||||
# FS7 stores episode data in JavaScript div elements
|
||||
# Format: <div data-ep="1" data-vidzy="..." data-uqload="..." data-netu="..." data-voe="..."></div>
|
||||
episode_divs = soup.find_all('div', attrs={'data-ep': True})
|
||||
episode_divs = soup.find_all("div", attrs={"data-ep": True})
|
||||
|
||||
for div in episode_divs:
|
||||
ep_num = div.get('data-ep', '').strip()
|
||||
ep_num = div.get("data-ep", "").strip()
|
||||
|
||||
# Try different video players in order of preference
|
||||
video_url = None
|
||||
host_name = None
|
||||
for player in ['data-vidzy', 'data-uqload', 'data-voe', 'data-netu']:
|
||||
player_url = div.get(player, '').strip()
|
||||
for player in ["data-vidzy", "data-uqload", "data-voe", "data-netu"]:
|
||||
player_url = div.get(player, "").strip()
|
||||
if player_url:
|
||||
video_url = player_url
|
||||
# Extract host name from attribute name
|
||||
host_name = player.replace('data-', '').title()
|
||||
host_name = player.replace("data-", "").title()
|
||||
logger.debug(f"Found episode {ep_num} on {host_name}")
|
||||
break
|
||||
|
||||
@@ -199,15 +187,19 @@ class FS7Downloader(BaseSeriesSite):
|
||||
# Use pipe-separated format: video_url|anime_url|episode_title
|
||||
combined_url = f"{video_url}|{anime_url}|{episode_title}"
|
||||
|
||||
episodes.append({
|
||||
'episode': ep_num,
|
||||
'url': combined_url,
|
||||
'title': episode_title,
|
||||
'host': host_name or 'Unknown'
|
||||
})
|
||||
episodes.append(
|
||||
{
|
||||
"episode": ep_num,
|
||||
"url": combined_url,
|
||||
"title": episode_title,
|
||||
"host": host_name or "Unknown",
|
||||
}
|
||||
)
|
||||
|
||||
# Sort by episode number
|
||||
episodes.sort(key=lambda x: int(x['episode']) if x['episode'].isdigit() else 0)
|
||||
episodes.sort(
|
||||
key=lambda x: int(x["episode"]) if x["episode"].isdigit() else 0
|
||||
)
|
||||
|
||||
logger.info(f"Found {len(episodes)} episodes")
|
||||
return episodes
|
||||
@@ -216,10 +208,7 @@ class FS7Downloader(BaseSeriesSite):
|
||||
logger.error(f"Error getting episodes from FS7: {e}")
|
||||
return []
|
||||
|
||||
async def get_anime_metadata(
|
||||
self,
|
||||
anime_url: str
|
||||
) -> Dict[str, Any]:
|
||||
async def get_anime_metadata(self, anime_url: str) -> Dict[str, Any]:
|
||||
"""
|
||||
Get metadata for a series.
|
||||
|
||||
@@ -236,62 +225,62 @@ class FS7Downloader(BaseSeriesSite):
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
# Extract title
|
||||
title = soup.find('h1')
|
||||
title = soup.find("h1")
|
||||
title = title.get_text(strip=True) if title else "Unknown"
|
||||
|
||||
# Clean up title: remove "affiche" suffix
|
||||
title = re.sub(r'\s+affiche$', '', title, flags=re.IGNORECASE).strip()
|
||||
title = re.sub(r"\s+affiche$", "", title, flags=re.IGNORECASE).strip()
|
||||
|
||||
# Extract description/synopsis
|
||||
description_elem = soup.find('div', class_='full-text')
|
||||
description = description_elem.get_text(strip=True) if description_elem else ""
|
||||
description_elem = soup.find("div", class_="full-text")
|
||||
description = (
|
||||
description_elem.get_text(strip=True) if description_elem else ""
|
||||
)
|
||||
|
||||
# Extract cover image
|
||||
img = soup.find('img', class_='poster')
|
||||
poster_image = img.get('src', '') if img else ''
|
||||
img = soup.find("img", class_="poster")
|
||||
poster_image = img.get("src", "") if img else ""
|
||||
|
||||
# Try to get poster from meta tag if not found
|
||||
if not poster_image:
|
||||
meta_img = soup.find('meta', property='og:image')
|
||||
poster_image = meta_img.get('content', '') if meta_img else ''
|
||||
meta_img = soup.find("meta", property="og:image")
|
||||
poster_image = meta_img.get("content", "") if meta_img else ""
|
||||
|
||||
# Extract year
|
||||
year_match = re.search(r'\b(19|20)\d{2}\b', description)
|
||||
year_match = re.search(r"\b(19|20)\d{2}\b", description)
|
||||
release_year = int(year_match.group()) if year_match else None
|
||||
|
||||
return {
|
||||
'title': title,
|
||||
'synopsis': description,
|
||||
'poster_image': poster_image,
|
||||
'release_year': release_year,
|
||||
'genres': [],
|
||||
'rating': None,
|
||||
'studio': None,
|
||||
'total_episodes': None,
|
||||
'status': None
|
||||
"title": title,
|
||||
"synopsis": description,
|
||||
"poster_image": poster_image,
|
||||
"release_year": release_year,
|
||||
"genres": [],
|
||||
"rating": None,
|
||||
"studio": None,
|
||||
"total_episodes": None,
|
||||
"status": None,
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error getting metadata from FS7: {e}")
|
||||
return {
|
||||
'title': "Unknown",
|
||||
'synopsis': "",
|
||||
'poster_image': '',
|
||||
'genres': [],
|
||||
'rating': None,
|
||||
'release_year': None,
|
||||
'studio': None,
|
||||
'total_episodes': None,
|
||||
'status': None
|
||||
"title": "Unknown",
|
||||
"synopsis": "",
|
||||
"poster_image": "",
|
||||
"genres": [],
|
||||
"rating": None,
|
||||
"release_year": None,
|
||||
"studio": None,
|
||||
"total_episodes": None,
|
||||
"status": None,
|
||||
}
|
||||
|
||||
async def get_download_link(
|
||||
self,
|
||||
url: str,
|
||||
target_filename: Optional[str] = None
|
||||
self, url: str, target_filename: Optional[str] = None
|
||||
) -> tuple[str, str]:
|
||||
"""
|
||||
Extract download link from video player URL.
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
"""Zone-Telechargement series site downloader"""
|
||||
|
||||
import logging
|
||||
import re
|
||||
from typing import List, Dict, Any, Optional, Tuple
|
||||
@@ -18,94 +19,106 @@ class ZoneTelechargementDownloader(BaseSeriesSite):
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "zonetelechargement"
|
||||
self.provider_id = "zonetelechargement"
|
||||
self.default_domain = "zone-telechargement.cam"
|
||||
self.test_tlds = ["cam", "net", "org", "blue", "lol", "work"]
|
||||
self.base_url = None # Will be set dynamically
|
||||
|
||||
self.client.headers.update({
|
||||
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
|
||||
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
|
||||
'Accept-Language': 'fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7',
|
||||
})
|
||||
self.default_domain = "zone-telechargement.golf"
|
||||
self.test_tlds = ["golf", "cam", "net", "org", "blue", "lol", "work", "ws"]
|
||||
self.base_url = f"https://{self.default_domain}"
|
||||
self._domain_checked = False
|
||||
|
||||
self.client.headers.update(
|
||||
{
|
||||
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
|
||||
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
|
||||
"Accept-Language": "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7",
|
||||
}
|
||||
)
|
||||
|
||||
async def _ensure_base_url(self):
|
||||
"""Ensure base_url is set to the current active domain"""
|
||||
if not self.base_url:
|
||||
if self._domain_checked:
|
||||
return
|
||||
self._domain_checked = True
|
||||
try:
|
||||
active_domain = await DomainManager.get_active_domain(
|
||||
self.provider_id,
|
||||
self.default_domain,
|
||||
self.test_tlds,
|
||||
test_path="/"
|
||||
self.provider_id, self.default_domain, self.test_tlds, test_path="/"
|
||||
)
|
||||
self.base_url = f"https://{active_domain}"
|
||||
logger.info(f"Using active domain for Zone-Telechargement: {self.base_url}")
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
f"Domain check failed for Zone-Telechargement, using default: {e}"
|
||||
)
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
"""Check if this downloader can handle the given URL"""
|
||||
return "zone-telechargement" in url.lower() or "zt-za" in url.lower()
|
||||
|
||||
async def search_anime(
|
||||
self,
|
||||
query: str,
|
||||
lang: str = "vf"
|
||||
) -> List[Dict[str, str]]:
|
||||
"""Search for series on Zone-Telechargement"""
|
||||
async def search_anime(self, query: str, lang: str = "vf") -> List[Dict[str, str]]:
|
||||
"""Search for series on Zone-Telechargement.
|
||||
|
||||
ZT uses server-side rendered search: GET /?p=series&search=QUERY.
|
||||
Results are in div.cover_global containers with nested cover_infos_title links.
|
||||
"""
|
||||
try:
|
||||
await self._ensure_base_url()
|
||||
logger.info(f"Searching Zone-Telechargement for: {query}")
|
||||
|
||||
# ZT uses POST or GET for search depending on the version
|
||||
# Most modern versions use: /index.php?do=search
|
||||
search_url = f"{self.base_url}/index.php?do=search"
|
||||
|
||||
# Form data for search
|
||||
data = {
|
||||
"do": "search",
|
||||
"subaction": "search",
|
||||
"search_start": "0",
|
||||
"full_search": "0",
|
||||
"result_from": "1",
|
||||
"story": query
|
||||
}
|
||||
|
||||
response = await self.client.post(search_url, data=data)
|
||||
search_url = f"{self.base_url}/"
|
||||
params = {"p": "series", "search": query}
|
||||
|
||||
response = await self.client.get(search_url, params=params)
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
results = []
|
||||
|
||||
# Look for items
|
||||
items = soup.find_all('div', class_='shm-item') or soup.find_all('div', class_='movie-item')
|
||||
|
||||
for item in items[:24]:
|
||||
link_elem = item.find('a', class_='shm-title') or item.find('a')
|
||||
if not link_elem:
|
||||
for cover_div in soup.find_all("div", class_="cover_global")[:24]:
|
||||
link_in_cover = cover_div.find("a", class_="mainimg")
|
||||
if not link_in_cover:
|
||||
link_in_cover = cover_div.find("a")
|
||||
|
||||
if not link_in_cover:
|
||||
continue
|
||||
|
||||
url = link_elem.get('href', '')
|
||||
if not url.startswith('http'):
|
||||
url = link_in_cover.get("href", "")
|
||||
if not url.startswith("http"):
|
||||
url = urljoin(self.base_url, url)
|
||||
|
||||
title = link_elem.get_text(strip=True)
|
||||
|
||||
img_elem = item.find('img')
|
||||
img = cover_div.find("img")
|
||||
cover_image = ""
|
||||
if img_elem:
|
||||
cover_image = img_elem.get('data-src') or img_elem.get('src') or ""
|
||||
|
||||
if cover_image and not cover_image.startswith('http'):
|
||||
cover_image = urljoin(self.base_url, cover_image)
|
||||
if img:
|
||||
cover_image = img.get("data-src") or img.get("src") or ""
|
||||
if cover_image and not cover_image.startswith("http"):
|
||||
cover_image = urljoin(self.base_url, cover_image)
|
||||
|
||||
title = ""
|
||||
info_div = cover_div.find("div", class_="cover_infos_title")
|
||||
if info_div:
|
||||
title_link = info_div.find("a")
|
||||
if title_link:
|
||||
title = title_link.get_text(strip=True)
|
||||
else:
|
||||
title = info_div.get_text(strip=True)
|
||||
else:
|
||||
title = link_in_cover.get("title", "")
|
||||
if not title:
|
||||
title = link_in_cover.get_text(strip=True)
|
||||
|
||||
if title and len(title) > 2:
|
||||
results.append({
|
||||
'title': title,
|
||||
'url': url,
|
||||
'cover_image': cover_image,
|
||||
'provider_id': self.provider_id
|
||||
})
|
||||
results.append(
|
||||
{
|
||||
"title": title,
|
||||
"url": url,
|
||||
"cover_image": cover_image,
|
||||
"provider_id": self.provider_id,
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(
|
||||
f"Zone-Telechargement found {len(results)} results for '{query}'"
|
||||
)
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
@@ -113,39 +126,35 @@ class ZoneTelechargementDownloader(BaseSeriesSite):
|
||||
return []
|
||||
|
||||
async def get_episodes(
|
||||
self,
|
||||
anime_url: str,
|
||||
lang: str = "vf"
|
||||
self, anime_url: str, lang: str = "vf"
|
||||
) -> List[Dict[str, str]]:
|
||||
"""Extract episodes from a series page"""
|
||||
try:
|
||||
await self._ensure_base_url()
|
||||
html = await self._fetch_page(anime_url)
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
episodes = []
|
||||
|
||||
|
||||
# ZT typically lists episodes in a table or list of links
|
||||
# Links often look like: /telecharger-series/.../saison-X-episode-Y.html
|
||||
links = soup.find_all('a', href=re.compile(r'episode-\d+'))
|
||||
|
||||
links = soup.find_all("a", href=re.compile(r"episode-\d+"))
|
||||
|
||||
for i, link in enumerate(links):
|
||||
href = link.get('href', '')
|
||||
if not href.startswith('http'):
|
||||
href = link.get("href", "")
|
||||
if not href.startswith("http"):
|
||||
href = urljoin(self.base_url, href)
|
||||
|
||||
|
||||
title = link.get_text(strip=True)
|
||||
ep_match = re.search(r'episode\s*(\d+)', title.lower())
|
||||
ep_match = re.search(r"episode\s*(\d+)", title.lower())
|
||||
ep_number = int(ep_match.group(1)) if ep_match else i + 1
|
||||
|
||||
episodes.append({
|
||||
'episode_number': ep_number,
|
||||
'url': href,
|
||||
'title': title
|
||||
})
|
||||
|
||||
|
||||
episodes.append(
|
||||
{"episode_number": ep_number, "url": href, "title": title}
|
||||
)
|
||||
|
||||
# Sort by episode number
|
||||
episodes.sort(key=lambda x: x['episode_number'])
|
||||
episodes.sort(key=lambda x: x["episode_number"])
|
||||
return episodes
|
||||
|
||||
except Exception as e:
|
||||
@@ -157,32 +166,40 @@ class ZoneTelechargementDownloader(BaseSeriesSite):
|
||||
try:
|
||||
await self._ensure_base_url()
|
||||
html = await self._fetch_page(anime_url)
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
metadata = {
|
||||
'title': "",
|
||||
'synopsis': "",
|
||||
'genres': [],
|
||||
'poster_image': "",
|
||||
'status': "Unknown"
|
||||
"title": "",
|
||||
"synopsis": "",
|
||||
"genres": [],
|
||||
"poster_image": "",
|
||||
"status": "Unknown",
|
||||
}
|
||||
|
||||
title_elem = soup.find('h1')
|
||||
|
||||
title_elem = soup.find("h1")
|
||||
if title_elem:
|
||||
metadata['title'] = title_elem.get_text(strip=True)
|
||||
|
||||
metadata["title"] = title_elem.get_text(strip=True)
|
||||
|
||||
# Synopsis
|
||||
syn_elem = soup.find('div', class_='shm-description') or soup.find('div', class_='movie-desc')
|
||||
syn_elem = soup.find("div", class_="shm-description") or soup.find(
|
||||
"div", class_="movie-desc"
|
||||
)
|
||||
if syn_elem:
|
||||
metadata['synopsis'] = syn_elem.get_text(strip=True)
|
||||
|
||||
metadata["synopsis"] = syn_elem.get_text(strip=True)
|
||||
|
||||
# Poster
|
||||
img_elem = soup.find('div', class_='shm-img').find('img') if soup.find('div', class_='shm-img') else None
|
||||
img_elem = (
|
||||
soup.find("div", class_="shm-img").find("img")
|
||||
if soup.find("div", class_="shm-img")
|
||||
else None
|
||||
)
|
||||
if img_elem:
|
||||
metadata['poster_image'] = urljoin(self.base_url, img_elem.get('src', ''))
|
||||
|
||||
metadata["poster_image"] = urljoin(
|
||||
self.base_url, img_elem.get("src", "")
|
||||
)
|
||||
|
||||
return metadata
|
||||
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error getting metadata from Zone-Telechargement: {e}")
|
||||
return {}
|
||||
@@ -192,19 +209,25 @@ class ZoneTelechargementDownloader(BaseSeriesSite):
|
||||
try:
|
||||
await self._ensure_base_url()
|
||||
html = await self._fetch_page(url)
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
# Look for video player links (Uptobox, 1fichier, etc.)
|
||||
# ZT often has multiple hosts
|
||||
links = soup.find_all('a', href=re.compile(r'uptobox|1fichier|doodstream|vidmoly'))
|
||||
|
||||
links = soup.find_all(
|
||||
"a", href=re.compile(r"uptobox|1fichier|doodstream|vidmoly")
|
||||
)
|
||||
|
||||
if links:
|
||||
player_url = links[0].get('href', '')
|
||||
title = soup.find('h1').get_text(strip=True) if soup.find('h1') else "Episode"
|
||||
player_url = links[0].get("href", "")
|
||||
title = (
|
||||
soup.find("h1").get_text(strip=True)
|
||||
if soup.find("h1")
|
||||
else "Episode"
|
||||
)
|
||||
return player_url, title
|
||||
|
||||
|
||||
return "", ""
|
||||
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error getting download link from Zone-Telechargement: {e}")
|
||||
return "", ""
|
||||
|
||||
@@ -1,16 +1,26 @@
|
||||
# Video Players (app/downloaders/video_players)
|
||||
# Video Players (app/downloaders/video_players/)
|
||||
|
||||
## OVERVIEW
|
||||
File hosting extractors that extract direct download links from video player pages (Doodstream, Sibnet, VidMoly, etc.).
|
||||
File hosting extractors that extract direct download links from video player pages (Doodstream, Sibnet, VidMoly, Uptobox, etc.).
|
||||
|
||||
## WHERE TO LOOK
|
||||
|
||||
| Need | File |
|
||||
|------|------|
|
||||
| Base class | `base.py` - `BaseVideoPlayer` abstract class |
|
||||
| Add new player | Create new `.py` file, inherit `BaseVideoPlayer`, add to `__init__.py` |
|
||||
| URL detection logic | Each player's `can_handle()` method |
|
||||
| Extract download link | Each player's `get_download_link()` method |
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `base.py` | `BaseVideoPlayer` abstract class |
|
||||
| `unfichier.py` | 1fichier.com |
|
||||
| `doodstream.py` | Doodstream |
|
||||
| `vidmoly.py` | VidMoly (requires Playwright for extraction) |
|
||||
| `uptobox.py` | Uptobox |
|
||||
| `sendvid.py` | SendVid |
|
||||
| `sibnet.py` | Sibnet |
|
||||
| `rapidfile.py` | Rapidfile |
|
||||
| `uqload.py` | Uqload |
|
||||
| `lpayer.py` | Lplayer |
|
||||
| `vidzy.py` | Vidzy |
|
||||
| `luluv.py` | LuLuvid |
|
||||
| `smoothpre.py` | Smoothpre |
|
||||
| `oneupload.py` | OneUpload |
|
||||
|
||||
## CONVENTIONS
|
||||
|
||||
@@ -22,16 +32,17 @@ def can_handle(self, url: str) -> bool: ...
|
||||
async def get_download_link(self, url: str, target_filename: str = None) -> tuple[str, str]: ...
|
||||
```
|
||||
|
||||
**File operation**: Always use `sanitize_filename()` on extracted filenames.
|
||||
|
||||
**HTTP client**: Use `self.client` (AsyncClient from base class). Always close via `await self.close()` when done.
|
||||
|
||||
**Return format**: `(download_url, filename)` tuple.
|
||||
|
||||
**HTTP client**: Use `self.client` (AsyncClient from base class). Always close via `await self.close()`.
|
||||
|
||||
**File operation**: Always `sanitize_filename()` on extracted filenames.
|
||||
|
||||
## ANTI-PATTERNS
|
||||
|
||||
- Do NOT hardcode User-Agent in each player (use base class headers)
|
||||
- Do NOT forget to call `await self.close()` after extraction
|
||||
- Do NOT return None for missing URLs, raise an exception
|
||||
- Do NOT use sync `requests`, use async `httpx`
|
||||
- Do NOT skip the `target_filename` parameter, even if unused
|
||||
- Do NOT hardcode User-Agent per player — use base class headers
|
||||
- Do NOT forget `await self.close()` — resource leak
|
||||
- Do NOT return None for missing URLs — raise an exception
|
||||
- Do NOT use sync `requests` — use async `httpx`
|
||||
- Do NOT skip `target_filename` parameter — required for anime/series site compatibility
|
||||
- 8 empty `except:` blocks across players — known tech debt
|
||||
|
||||
Reference in New Issue
Block a user