feat: fix auth, provider health checks, search, and redesign UI

- Fix register/login: dict-style access on UserTable ORM objects
- Fix HTMX auth: inject JWT token in all HTMX request headers
- Fix FS7 search: use DLE AJAX endpoint /engine/ajax/search.php
- Fix ZT search: use ?p=series&search=QUERY (not DLE format)
- Fix provider health: load hardcoded providers + domain manager
- Add self.id to all anime/series providers
- Redesign homepage: Netflix-style horizontal scroll cards (.hc)
- Redesign search results: grouped by title, poster + synopsis + 3 buttons
- Add Télécharger dropdown: season download + episode picker
- Fix navbar CSS: restore .tabs flex layout, remove orphan rules
- Fix HTMX spinner: remove inline display:none, use CSS indicator
- Add AGENTS.md files across project for developer documentation
root
2026-03-28 00:14:31 +00:00
parent 5d23a3d663
commit 3dc5dd8fe9
36 changed files with 2735 additions and 1989 deletions
+49
@@ -0,0 +1,49 @@
# Downloaders (app/downloaders/)
## OVERVIEW
3-tier scraper architecture: anime catalogs → series catalogs → video players. Factory pattern routes URLs through each tier.
## STRUCTURE
```
downloaders/
├── __init__.py # get_downloader(url) — 3-tier factory + GenericDownloader
├── base.py # Legacy BaseDownloader (kept for compat)
├── anime_sites/ # Anime streaming catalogs (see anime_sites/AGENTS.md)
│ ├── __init__.py # get_anime_site(url) factory
│ ├── base.py # BaseAnimeSite abstract class
│ └── *.py # 5 anime providers
├── series_sites/ # TV series catalogs (see series_sites/AGENTS.md)
│ ├── __init__.py # get_series_site(url) factory
│ ├── base.py # BaseSeriesSite abstract class
│ └── fs7.py # 1 series provider
└── video_players/ # File hosting extractors (see video_players/AGENTS.md)
    ├── __init__.py # get_video_player(url) factory
    ├── base.py # BaseVideoPlayer abstract class
    └── *.py # 13 video player handlers
```
## WHERE TO LOOK
| Need | File | Notes |
|------|------|-------|
| Route URL to downloader | `__init__.py:32` | `get_downloader(url)` tries anime→series→video→generic |
| Add anime provider | `anime_sites/` | Inherit BaseAnimeSite, register in anime_sites/__init__.py |
| Add series provider | `series_sites/` | Inherit BaseSeriesSite, register in series_sites/__init__.py |
| Add video player | `video_players/` | Inherit BaseVideoPlayer, register in video_players/__init__.py |
| Provider domains/icons | `app/providers.py` | Separate from downloader code |
## CONVENTIONS
**URL pipe format**: `video_url|anime_page_url|episode_title` — metadata preserved through tiers. Anime/series sites return player URLs (not direct downloads). Video players extract final download links.
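A minimal sketch of packing and unpacking the pipe format described above. The names `PipedUrl`, `build_piped_url`, and `parse_piped_url` are hypothetical and do not come from the codebase; the padding behaviour for missing fields is also an assumption.

```python
from typing import NamedTuple


class PipedUrl(NamedTuple):
    """Hypothetical container for the three pipe-separated fields."""

    video_url: str
    anime_page_url: str
    episode_title: str


def build_piped_url(video_url: str, anime_page_url: str, episode_title: str) -> str:
    # Assumption: the two URL fields contain no literal "|".
    return "|".join((video_url, anime_page_url, episode_title))


def parse_piped_url(value: str) -> PipedUrl:
    # Split at most twice so a "|" inside the title is kept as part of the title.
    parts = value.split("|", 2)
    # Assumption: missing trailing fields are padded with empty strings.
    parts += [""] * (3 - len(parts))
    return PipedUrl(*parts)
```

Splitting with a maximum of two splits keeps the title intact even if it contains a pipe, which a plain `split("|")` would not.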
**Factory chain**: `get_downloader()` → `get_anime_site()` → `get_series_site()` → `get_video_player()` → `GenericDownloader`.
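The chain can be sketched as an ordered fallback. This is an illustration only: it assumes each tier factory returns `None` when it cannot handle a URL (the real factories may signal this differently), and `route`, `TierFactory`, and the stub `GenericDownloader` class below are hypothetical stand-ins.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a tier factory returns a handler, or None if no match.
TierFactory = Callable[[str], Optional[object]]


class GenericDownloader:
    """Placeholder for the last-resort handler named in the chain."""


def route(url: str, tiers: Sequence[TierFactory]) -> object:
    # Try each tier in order; the first non-None handler wins.
    for factory in tiers:
        handler = factory(url)
        if handler is not None:
            return handler
    # Nothing matched: fall back to the generic downloader.
    return GenericDownloader()
```

Passing the tiers as a sequence keeps the ordering (anime, series, video player) explicit and easy to test with stub factories.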
**New provider checklist**: 1) Create .py inheriting base class, 2) Implement required methods, 3) Add to `__init__.py` factory list, 4) Add to `app/providers.py`.
## ANTI-PATTERNS
- Do NOT return None from `get_download_link()` — raise Exception
- Do NOT use sync `requests` — always `httpx.AsyncClient`
- Do NOT forget `await self.close()` — causes resource leaks
- Do NOT skip `sanitize_filename()` on extracted filenames
- Do NOT hardcode User-Agent per player — use base class headers
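Two of the rules above (raise instead of returning `None`, and always await `close()`) can be sketched without any network access. `FakePlayer` and `fetch_link` are hypothetical names; a real handler would use `httpx.AsyncClient` rather than the stub shown here.

```python
import asyncio


class FakePlayer:
    """Hypothetical stand-in for a video player handler (no network access)."""

    def __init__(self) -> None:
        self.closed = False

    async def get_download_link(self, url: str) -> tuple[str, str]:
        if not url.startswith("http"):
            # Raise instead of returning None so callers never unpack None.
            raise Exception(f"No download link found for {url!r}")
        return url, "episode.mp4"

    async def close(self) -> None:
        self.closed = True


async def fetch_link(player: FakePlayer, url: str) -> tuple[str, str]:
    try:
        return await player.get_download_link(url)
    finally:
        # Runs on success and on failure, so the handler is always released.
        await player.close()


if __name__ == "__main__":
    print(asyncio.run(fetch_link(FakePlayer(), "https://host.example/v.mp4")))
```

Putting `close()` in a `finally` block is one way to make sure the resource is released even when extraction raises.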
+23 -23
@@ -1,4 +1,4 @@
# Anime Sites Downloaders
# Anime Sites (app/downloaders/anime_sites/)
## OVERVIEW
Handlers for French anime streaming catalogs that provide metadata and episode listings, delegating actual video extraction to video player handlers.
@@ -7,8 +7,8 @@ Handlers for French anime streaming catalogs that provide metadata and episode l
| File | Purpose |
|------|---------|
| `base.py` | Abstract `BaseAnimeSite` class defining the interface all anime sites implement |
| `animesama.py` | Primary provider with dynamic domain switching, multiple video player extraction |
| `base.py` | Abstract `BaseAnimeSite` class defining the interface |
| `animesama.py` | Primary provider: dynamic domain switching, multiple video player extraction |
| `nekosama.py` | Neko-Sama / Gupy integration (metadata-only, no direct downloads) |
| `animeultime.py` | Anime-Ultime catalog handler |
| `vostfree.py` | Vostfree catalog handler |
@@ -16,26 +16,26 @@ Handlers for French anime streaming catalogs that provide metadata and episode l
## CONVENTIONS
### Interface Contract
Each site must implement four async methods from `BaseAnimeSite`:
- `can_handle(url: str) -> bool` — URL pattern matching
- `search_anime(query, lang) -> list[dict]` — Returns `{title, url, cover_image}`
- `get_episodes(anime_url, lang) -> list[dict]` — Returns `{episode_number, url, title, host}`
- `get_anime_metadata(anime_url) -> dict` — Returns `{synopsis, genres, rating, release_year, studio, poster_image, total_episodes, status}`
- `get_download_link(url) -> tuple[str, str]` — Returns `(video_player_url, filename)`
**Interface contract** — each site implements from `BaseAnimeSite`:
- `can_handle(url)` — URL pattern matching
- `search_anime(query, lang)` → `[{title, url, cover_image}]`
- `get_episodes(anime_url, lang)` → `[{episode_number, url, title, host}]`
- `get_anime_metadata(anime_url)` → `{synopsis, genres, rating, release_year, studio, poster_image, total_episodes, status}`
- `get_download_link(url)` → `(video_player_url, filename)`
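As a rough illustration of the contract, the sketch below mirrors the five methods with `abc`. `AnimeSiteSketch` is not the real `BaseAnimeSite`, and `ExampleSite`, its domain, and its canned return values are hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod


class AnimeSiteSketch(ABC):
    """Illustrative mirror of the contract; not the real BaseAnimeSite."""

    @abstractmethod
    def can_handle(self, url: str) -> bool: ...

    @abstractmethod
    async def search_anime(self, query: str, lang: str = "vostfr") -> list[dict]: ...

    @abstractmethod
    async def get_episodes(self, anime_url: str, lang: str = "vostfr") -> list[dict]: ...

    @abstractmethod
    async def get_anime_metadata(self, anime_url: str) -> dict: ...

    @abstractmethod
    async def get_download_link(self, url: str) -> tuple[str, str]: ...


class ExampleSite(AnimeSiteSketch):
    """Hypothetical provider for a made-up domain, with canned responses."""

    DOMAINS = ["anime.example"]

    def can_handle(self, url: str) -> bool:
        return any(domain in url.lower() for domain in self.DOMAINS)

    async def search_anime(self, query: str, lang: str = "vostfr") -> list[dict]:
        return [{"title": query.title(), "url": "https://anime.example/show", "cover_image": ""}]

    async def get_episodes(self, anime_url: str, lang: str = "vostfr") -> list[dict]:
        return [{"episode_number": 1, "url": f"{anime_url}/1", "title": "Episode 1", "host": "example"}]

    async def get_anime_metadata(self, anime_url: str) -> dict:
        return {"synopsis": None, "genres": [], "total_episodes": 1}

    async def get_download_link(self, url: str) -> tuple[str, str]:
        # Returns a player URL, not a direct download link.
        return "https://player.example/embed/abc", "Example - S1 - 01.mp4"


if __name__ == "__main__":
    print(asyncio.run(ExampleSite().get_download_link("https://anime.example/show/1")))
```

Because every method is abstract, a provider that forgets one of them fails at instantiation rather than at first use.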
### Key Patterns
- **Pipe-separated URLs**: `video_url|anime_page_url|episode_title` — preserves context across extraction
- **Language parameter**: `lang="vostfr"` or `"vf"` — controls which episodes to return
- **Video player delegation**: Anime sites return player URLs (vidmoly, sendvid, sibnet, lpayer), not direct downloads
- **Filename generation**: `{anime_name} - S{season} - {episode}.mp4` format
- **HTTP headers**: Browser UA and referer required to avoid blocking
**Key patterns**:
- Pipe-separated URLs: `video_url|anime_page_url|episode_title`
- Language param: `lang="vostfr"` or `"vf"`
- Video player delegation: returns player URLs (vidmoly, sendvid, etc.), NOT direct downloads
- Filename format: `{anime_name} - S{season} - {episode}.mp4`
- Browser UA + referer headers required
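The filename template can be illustrated with a small helper. `build_filename` is a hypothetical name, the two-digit episode padding is an assumption (the source only gives the overall template), and the regex cleanup is a stand-in for the project's own `sanitize_filename()`, whose behaviour is not shown here.

```python
import re


def build_filename(anime_name: str, season: int, episode: int) -> str:
    # Stand-in cleanup: replace characters that are unsafe in filenames.
    safe_name = re.sub(r'[\\/:*?"<>|]', " ", anime_name)
    # Collapse the whitespace runs left behind by the replacement.
    safe_name = re.sub(r"\s+", " ", safe_name).strip()
    # Assumption: season is unpadded, episode is zero-padded to two digits.
    return f"{safe_name} - S{season} - {episode:02d}.mp4"
```

For example, `build_filename("Naruto", 1, 5)` yields `Naruto - S1 - 05.mp4` under these assumptions.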
### Domain Detection
- `AnimeSamaDownloader` fetches current domain from `anime-sama.pw` dynamically
- Uses fallback chain for video extraction: detected player → cached player → priority list
**Domain detection**: `AnimeSamaDownloader` fetches current domain from `anime-sama.pw` dynamically. Uses fallback chain for video extraction.
### Error Handling
- Raise `Exception` with descriptive message on failure
- Log at appropriate level (`debug` for expected failures, `error` for unexpected)
- Validate extracted URLs with `_test_video_url()` before returning
**Error handling**: Raise `Exception` with descriptive message. Log at `debug` for expected failures, `error` for unexpected. Validate URLs with `_test_video_url()` before returning.
## ANTI-PATTERNS
- Do NOT return direct download URLs from anime sites — return player URLs
- Do NOT skip URL validation — use `_test_video_url()`
- 5 empty `except:` blocks in `animesama.py` — known tech debt, silently swallow failures
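One possible way to retire the empty `except:` blocks, following the logging levels given under error handling, is sketched below. `first_working`, the logger name, and the choice of `ValueError`/`KeyError` as the "expected" failures are all assumptions, not code from `animesama.py`.

```python
import logging

logger = logging.getLogger("animesama_sketch")  # hypothetical logger name


def first_working(extractors, page: str):
    """Try each extractor in order and return the first non-empty result."""
    for extract in extractors:
        try:
            result = extract(page)
        except (ValueError, KeyError) as exc:
            # Expected failure: note it at debug level and try the next one.
            logger.debug("extractor %s failed: %s", extract.__name__, exc)
            continue
        except Exception:
            # Unexpected failure: keep the traceback instead of hiding it.
            logger.exception("extractor %s crashed", extract.__name__)
            continue
        if result:
            return result
    # No extractor succeeded: raise rather than return None.
    raise Exception("no extractor produced a video URL")
```

The fallback behaviour is unchanged, but each swallowed failure now leaves a trace in the logs.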
File diff suppressed because it is too large.
+147 -104
@@ -10,6 +10,10 @@ class AnimeUltimeDownloader(BaseAnimeSite):
BASE_DOMAINS = ["anime-ultime.com", "anime-ultime.net", "www.anime-ultime.net"]
def __init__(self):
super().__init__()
self.id = "anime-ultime"
def can_handle(self, url: str) -> bool:
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
@@ -24,58 +28,79 @@ class AnimeUltimeDownloader(BaseAnimeSite):
final_url = str(response.url)
# Parse the page
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
# Method 0: Look for og:video meta tag (most reliable for anime-ultime)
og_video = soup.find('meta', property='og:video')
if og_video and og_video.get('content'):
video_url = og_video['content']
if video_url.endswith('.mp4'):
og_video = soup.find("meta", property="og:video")
if og_video and og_video.get("content"):
video_url = og_video["content"]
if video_url.endswith(".mp4"):
filename = self._generate_filename(final_url)
print(f"[ANIME-ULTIME] Found og:video link: {video_url}")
return video_url, filename
# Method 1: Look for direct download links (DDL)
# Anime-Ultime often uses links to file hosts
download_links = soup.find_all('a', href=True)
download_links = soup.find_all("a", href=True)
for link in download_links:
href = link['href']
href = link["href"]
text = link.get_text().lower()
# Look for download buttons/links
if any(keyword in text for keyword in ['télécharger', 'download', 'ddl', 'mega', 'google', 'drive']):
if any(
keyword in text
for keyword in [
"télécharger",
"download",
"ddl",
"mega",
"google",
"drive",
]
):
# Check if it's a direct link or to a file host
if any(host in href.lower() for host in ['mega.nz', 'drive.google.com', 'uptobox.com', '1fichier.com']):
if any(
host in href.lower()
for host in [
"mega.nz",
"drive.google.com",
"uptobox.com",
"1fichier.com",
]
):
filename = self._generate_filename(final_url)
return href, filename
# Method 2: Look for iframe with video player
iframes = soup.find_all('iframe')
iframes = soup.find_all("iframe")
for iframe in iframes:
src = iframe.get('src', '')
if src and any(provider in src for provider in ['video', 'player', 'stream', 'play']):
if src.startswith('http'):
src = iframe.get("src", "")
if src and any(
provider in src
for provider in ["video", "player", "stream", "play"]
):
if src.startswith("http"):
filename = self._generate_filename(final_url)
return src, filename
# Method 3: Look for video tags
videos = soup.find_all('video')
videos = soup.find_all("video")
for video in videos:
src = video.get('src', '')
src = video.get("src", "")
if src:
filename = self._generate_filename(final_url)
return src, filename
# Check source tags
sources = video.find_all('source')
sources = video.find_all("source")
for source in sources:
src = source.get('src', '')
src = source.get("src", "")
if src:
filename = self._generate_filename(final_url)
return src, filename
# Method 4: Look in scripts for video URLs
scripts = soup.find_all('script')
scripts = soup.find_all("script")
for script in scripts:
if script.string:
# Look for common video patterns
@@ -91,26 +116,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
matches = re.findall(pattern, script.string)
for match in matches:
# Clean up escaped characters
match = match.replace('\\/', '/').replace('\\', '')
if any(ext in match for ext in ['mp4', 'm3u8', 'mkv']):
match = match.replace("\\/", "/").replace("\\", "")
if any(ext in match for ext in ["mp4", "m3u8", "mkv"]):
filename = self._generate_filename(final_url)
return match, filename
# Look for anime-ultime specific patterns
# They sometimes store links in JavaScript variables
ddl_match = re.search(r'ddl["\']?\s*:\s*["\']([^"\']+)["\']', script.string)
ddl_match = re.search(
r'ddl["\']?\s*:\s*["\']([^"\']+)["\']', script.string
)
if ddl_match:
ddl_url = ddl_match.group(1)
if ddl_url.startswith('http'):
if ddl_url.startswith("http"):
filename = self._generate_filename(final_url)
return ddl_url, filename
# Method 5: Look for links with specific classes or IDs
# Anime-Ultime might use specific class names for download links
potential_links = soup.find_all('a', class_=re.compile(r'download|ddl|episode', re.I))
potential_links = soup.find_all(
"a", class_=re.compile(r"download|ddl|episode", re.I)
)
for link in potential_links:
href = link.get('href', '')
if href and href.startswith('http'):
href = link.get("href", "")
if href and href.startswith("http"):
filename = self._generate_filename(final_url)
return href, filename
@@ -132,36 +161,38 @@ class AnimeUltimeDownloader(BaseAnimeSite):
episode = "01"
# Format: info-0-1/EPISODE_ID or info-0-1/EPISODE_ID/NAME-EP-vostfr
if 'info-0-1/' in url:
if "info-0-1/" in url:
# Extract episode ID
ep_match = re.search(r'info-0-1/(\d+)', url)
ep_match = re.search(r"info-0-1/(\d+)", url)
if ep_match:
ep_id = ep_match.group(1)
# Try to get anime name from URL path
name_match = re.search(r'info-0-1/\d+/([^/]+)', url)
name_match = re.search(r"info-0-1/\d+/([^/]+)", url)
if name_match:
raw_name = name_match.group(1)
# Extract episode number
ep_num_match = re.search(r'-(\d+)-vostfr$', raw_name, re.I)
ep_num_match = re.search(r"-(\d+)-vostfr$", raw_name, re.I)
if ep_num_match:
episode = ep_num_match.group(1).zfill(2)
# Remove episode number and suffix from name
anime_name = re.sub(r'-\d+-vostfr$', '', raw_name, flags=re.I).replace('-', ' ')
anime_name = re.sub(
r"-\d+-vostfr$", "", raw_name, flags=re.I
).replace("-", " ")
else:
# Just use the ID
anime_name = f"Episode {ep_id}"
else:
anime_name = f"Episode {ep_id}"
elif 'file-0-1/' in url:
elif "file-0-1/" in url:
# Extract from file-0-1/ID-NAME format
file_match = re.search(r'file-0-1/\d+-(.+)$', url)
file_match = re.search(r"file-0-1/\d+-(.+)$", url)
if file_match:
anime_name = file_match.group(1).replace('-', ' ')
anime_name = file_match.group(1).replace("-", " ")
# Sanitize filename
anime_name = anime_name.replace('/', ' ').strip()
anime_name = anime_name.replace("/", " ").strip()
filename = f"{anime_name} - Episode {episode}.mp4"
return filename.title()
@@ -173,30 +204,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
try:
print(f"[ANIME-ULTIME] Extracting metadata from: {anime_url}")
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
metadata = {
'synopsis': None,
'genres': [],
'rating': None,
'release_year': None,
'studio': None,
'poster_image': None,
'banner_image': None,
'total_episodes': None,
'status': None,
'alternative_titles': []
"synopsis": None,
"genres": [],
"rating": None,
"release_year": None,
"studio": None,
"poster_image": None,
"banner_image": None,
"total_episodes": None,
"status": None,
"alternative_titles": [],
}
# Extract synopsis
synopsis_selectors = [
'div.synopsis',
'div.description',
"div.synopsis",
"div.description",
'div[class*="synopsis"]',
'div[class*="synopsis"]',
'p.synopsis',
'.info',
'div.texte'
"p.synopsis",
".info",
"div.texte",
]
for selector in synopsis_selectors:
@@ -204,68 +235,73 @@ class AnimeUltimeDownloader(BaseAnimeSite):
if synopsis_elem:
synopsis = synopsis_elem.get_text(strip=True)
if len(synopsis) > 50:
metadata['synopsis'] = synopsis
metadata["synopsis"] = synopsis
break
# Extract genres from meta tags and page content
page_text = soup.get_text()
# Look for genre in meta tags
genre_meta = soup.find('meta', property='genre') or soup.find('meta', attrs={'name': 'genre'})
genre_meta = soup.find("meta", property="genre") or soup.find(
"meta", attrs={"name": "genre"}
)
if genre_meta:
genres_text = genre_meta.get('content', '')
genres_text = genre_meta.get("content", "")
if genres_text:
metadata['genres'] = [g.strip() for g in genres_text.split(',')]
metadata["genres"] = [g.strip() for g in genres_text.split(",")]
# Try to find genre links
genre_links = soup.find_all('a', href=re.compile(r'genre|tag|type|cat', re.I))
genre_links = soup.find_all(
"a", href=re.compile(r"genre|tag|type|cat", re.I)
)
if genre_links:
for link in genre_links[:5]:
genre = link.get_text(strip=True)
if genre and genre not in metadata['genres']:
metadata['genres'].append(genre)
if genre and genre not in metadata["genres"]:
metadata["genres"].append(genre)
# Extract rating
rating_selectors = [
'span.rating',
'div.rating',
'span.score',
'div.note',
'.rating'
"span.rating",
"div.rating",
"span.score",
"div.note",
".rating",
]
for selector in rating_selectors:
rating_elem = soup.select_one(selector)
if rating_elem:
rating_text = rating_elem.get_text(strip=True)
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*10', rating_text)
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*10", rating_text)
if rating_match:
metadata['rating'] = f"{rating_match.group(1)}/10"
metadata["rating"] = f"{rating_match.group(1)}/10"
break
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*5', rating_text)
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*5", rating_text)
if rating_match:
rating_val = float(rating_match.group(1)) * 2
metadata['rating'] = f"{rating_val:.1f}/10"
metadata["rating"] = f"{rating_val:.1f}/10"
break
# Extract release year
year_match = re.search(r'\b(19\d{2}|20\d{2})\b', page_text)
year_match = re.search(r"\b(19\d{2}|20\d{2})\b", page_text)
if year_match:
import datetime
current_year = datetime.datetime.now().year + 2
year = int(year_match.group(1))
if 1950 <= year <= current_year:
metadata['release_year'] = year
metadata["release_year"] = year
# Extract poster image from og:image
og_image = soup.find('meta', property='og:image')
og_image = soup.find("meta", property="og:image")
if og_image:
metadata['poster_image'] = og_image.get('content')
metadata["poster_image"] = og_image.get("content")
# Extract total episodes
episodes_count = len(await self.get_episodes(anime_url))
if episodes_count > 0:
metadata['total_episodes'] = episodes_count
metadata["total_episodes"] = episodes_count
print(f"[ANIME-ULTIME] Extracted metadata: {metadata}")
return metadata
@@ -274,7 +310,9 @@ class AnimeUltimeDownloader(BaseAnimeSite):
print(f"[ANIME-ULTIME] Error extracting metadata: {e}")
return {}
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
async def search_anime(
self, query: str, lang: str = "vostfr", include_metadata: bool = False
) -> list[dict]:
"""
Search for anime on anime-ultime
Returns list of anime with title, url, and cover image
@@ -286,27 +324,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
"""
try:
import time
start = time.time()
print(f"[ANIME-ULTIME] Searching for '{query}' ({lang})...")
# Anime-Ultime uses POST for search
search_url = "https://www.anime-ultime.net/search-0-1"
response = await self.client.post(search_url, data={'search': query})
soup = BeautifulSoup(response.text, 'lxml')
response = await self.client.post(search_url, data={"search": query})
soup = BeautifulSoup(response.text, "lxml")
elapsed = time.time() - start
print(f"[ANIME-ULTIME] Got response {response.status_code} in {elapsed:.2f}s")
print(
f"[ANIME-ULTIME] Got response {response.status_code} in {elapsed:.2f}s"
)
results = []
# Look for search result links - better parsing
# Search results use file-0-1/ pattern, not info-
search_results = soup.find_all('a', href=re.compile(r'file-0-1/'))
search_results = soup.find_all("a", href=re.compile(r"file-0-1/"))
seen_urls = set()
for result in search_results[:10]: # Limit to 10 results
href = result.get('href', '')
href = result.get("href", "")
raw_title = result.get_text().strip()
# Skip if no href
@@ -322,40 +363,44 @@ class AnimeUltimeDownloader(BaseAnimeSite):
better_title = raw_title
# If raw_title is just "Télécharger" or similar, try to find better title
if len(raw_title) < 5 or raw_title.lower() in ['télécharger', 'download', 'ddl']:
if len(raw_title) < 5 or raw_title.lower() in [
"télécharger",
"download",
"ddl",
]:
# Try to extract from URL (file-0-1/ID-Title format)
url_match = re.search(r'file-0-1/\d+-(.+)$', href)
url_match = re.search(r"file-0-1/\d+-(.+)$", href)
if url_match:
better_title = url_match.group(1).replace('-', ' ').title()
better_title = url_match.group(1).replace("-", " ").title()
# If still no good title, look at parent/row elements
if len(better_title) < 5:
# Check parent row (table structure)
row = result.find_parent(['tr', 'td', 'div'])
row = result.find_parent(["tr", "td", "div"])
if row:
# Look for text in the row that's not the link text
row_text = row.get_text().strip()
# Remove the link text from row text
if raw_title in row_text:
row_text = row_text.replace(raw_title, '').strip()
row_text = row_text.replace(raw_title, "").strip()
if len(row_text) > 5 and len(row_text) < 100:
better_title = row_text
# Make URL absolute
if not href.startswith('http'):
if not href.startswith("http"):
href = urljoin("https://www.anime-ultime.net/", href)
result_item = {
'title': better_title,
'url': href,
'type': 'search_result',
'metadata': None
"title": better_title,
"url": href,
"type": "search_result",
"metadata": None,
}
# Fetch metadata if requested
if include_metadata:
metadata = await self.get_anime_metadata(href)
result_item['metadata'] = metadata
result_item["metadata"] = metadata
results.append(result_item)
@@ -373,27 +418,27 @@ class AnimeUltimeDownloader(BaseAnimeSite):
"""
try:
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
episodes = []
# Look for episode links - anime-ultime uses info-XXXXX-Name-XX-vostfr format
# The URL pattern is info-0-1/ID-Anime-Name-XX-vostfr where XX is episode number
episode_links = soup.find_all('a', href=re.compile(r'info-0-1/\d+'))
episode_links = soup.find_all("a", href=re.compile(r"info-0-1/\d+"))
for link in episode_links:
href = link.get('href', '')
href = link.get("href", "")
text = link.get_text().strip()
# Extract episode number from URL pattern
# Matches: info-0-1/30200/Naruto-OAV-01-vostfr
match = re.search(r'-(\d+)-vostfr$', href, re.I)
match = re.search(r"-(\d+)-vostfr$", href, re.I)
if not match:
# Try other patterns
match = re.search(r'Episode[-\s]?(\d+)', href, re.I)
match = re.search(r"Episode[-\s]?(\d+)", href, re.I)
if not match:
# Try to extract from text
match = re.search(r'(\d+)', text)
match = re.search(r"(\d+)", text)
if match:
episode_num = match.group(1).zfill(2) # Pad with zero
@@ -401,32 +446,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
# Extract the episode ID from href and build correct URL
# href might be "info-0-1/30200" or "info-0-1/30200/..."
# We need: https://www.anime-ultime.net/info-0-1/30200
ep_id_match = re.search(r'info-0-1/(\d+)', href)
ep_id_match = re.search(r"info-0-1/(\d+)", href)
if ep_id_match:
ep_id = ep_id_match.group(1)
# Build the correct episode URL
episode_url = f"https://www.anime-ultime.net/info-0-1/{ep_id}"
else:
# Fallback to making URL absolute
if not href.startswith('http'):
if not href.startswith("http"):
href = urljoin(anime_url, href)
episode_url = href
episodes.append({
'episode': episode_num,
'url': episode_url,
'title': text
})
episodes.append(
{"episode": episode_num, "url": episode_url, "title": text}
)
# Remove duplicates and sort
seen = set()
unique_episodes = []
for ep in episodes:
if ep['episode'] not in seen:
seen.add(ep['episode'])
if ep["episode"] not in seen:
seen.add(ep["episode"])
unique_episodes.append(ep)
unique_episodes.sort(key=lambda x: int(x['episode']))
unique_episodes.sort(key=lambda x: int(x["episode"]))
return unique_episodes
+90 -82
@@ -1,4 +1,5 @@
"""French-Manga.net anime streaming site downloader"""
from .base import BaseAnimeSite
from bs4 import BeautifulSoup
import re
@@ -17,11 +18,12 @@ class FrenchMangaDownloader(BaseAnimeSite):
"french-manga.net",
"w16.french-manga.net",
"w15.french-manga.net",
"www.french-manga.net"
"www.french-manga.net",
]
def __init__(self):
super().__init__()
self.id = "french-manga"
self.base_url = "https://w16.french-manga.net"
def can_handle(self, url: str) -> bool:
@@ -29,9 +31,7 @@ class FrenchMangaDownloader(BaseAnimeSite):
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
async def search_anime(
self,
query: str,
lang: str = "vostfr"
self, query: str, lang: str = "vostfr"
) -> List[Dict[str, str]]:
"""
Search for anime on French-Manga.
@@ -47,46 +47,50 @@ class FrenchMangaDownloader(BaseAnimeSite):
# French-Manga uses a search endpoint
search_url = f"{self.base_url}/index.php?do=search"
params = {
'do': 'search',
'subaction': 'search',
'story': query,
'x': '0',
'y': '0'
"do": "search",
"subaction": "search",
"story": query,
"x": "0",
"y": "0",
}
response = await self.client.post(search_url, data=params)
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
results = []
# Look for search results in article or story classes
for item in soup.find_all('article', class_=lambda x: x and 'story' in x.lower()):
title_elem = item.find(['h2', 'h3', 'h4'])
link_elem = item.find('a', href=True)
img_elem = item.find('img')
for item in soup.find_all(
"article", class_=lambda x: x and "story" in x.lower()
):
title_elem = item.find(["h2", "h3", "h4"])
link_elem = item.find("a", href=True)
img_elem = item.find("img")
if title_elem and link_elem:
title = title_elem.get_text(strip=True)
url = link_elem['href']
url = link_elem["href"]
# Ensure absolute URL
if url.startswith('/'):
if url.startswith("/"):
url = self.base_url + url
cover_image = ""
if img_elem and img_elem.get('src'):
cover_image = img_elem['src']
if cover_image.startswith('/'):
if img_elem and img_elem.get("src"):
cover_image = img_elem["src"]
if cover_image.startswith("/"):
cover_image = self.base_url + cover_image
results.append({
'title': title,
'url': url,
'cover_image': cover_image,
'lang': lang
})
results.append(
{
"title": title,
"url": url,
"cover_image": cover_image,
"lang": lang,
}
)
logger.info(f"Found {len(results)} anime results for query: {query}")
return results
@@ -96,9 +100,7 @@ class FrenchMangaDownloader(BaseAnimeSite):
return []
async def get_episodes(
self,
anime_url: str,
lang: str = "vostfr"
self, anime_url: str, lang: str = "vostfr"
) -> List[Dict[str, str]]:
"""
Get episode list for an anime.
@@ -115,34 +117,36 @@ class FrenchMangaDownloader(BaseAnimeSite):
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
episodes = []
# Look for episode links (typically in a list or table)
# French-Manga usually has episode links in <a> tags with episode numbers
for link in soup.find_all('a', href=True):
href = link['href']
for link in soup.find_all("a", href=True):
href = link["href"]
text = link.get_text(strip=True)
# Pattern: Episode links usually contain "episode" or numbers
if re.search(r'episode?\s*\d+', text.lower()):
episode_num = re.search(r'(\d+)', text)
if re.search(r"episode?\s*\d+", text.lower()):
episode_num = re.search(r"(\d+)", text)
if episode_num:
episode_number = int(episode_num.group(1))
# Ensure absolute URL
if href.startswith('/'):
if href.startswith("/"):
href = self.base_url + href
episodes.append({
'episode_number': episode_number,
'url': href,
'title': text,
'host': 'french-manga'
})
episodes.append(
{
"episode_number": episode_number,
"url": href,
"title": text,
"host": "french-manga",
}
)
# Sort by episode number
episodes.sort(key=lambda x: x['episode_number'])
episodes.sort(key=lambda x: x["episode_number"])
logger.info(f"Found {len(episodes)} episodes for {anime_url}")
return episodes
@@ -166,31 +170,33 @@ class FrenchMangaDownloader(BaseAnimeSite):
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
# Extract title
title = ""
title_elem = soup.find('h1') or soup.find('h2', class_='title')
title_elem = soup.find("h1") or soup.find("h2", class_="title")
if title_elem:
title = title_elem.get_text(strip=True)
# Extract synopsis
synopsis = ""
synopsis_elem = soup.find('div', class_=lambda x: x and 'story' in x.lower())
synopsis_elem = soup.find(
"div", class_=lambda x: x and "story" in x.lower()
)
if synopsis_elem:
synopsis = synopsis_elem.get_text(strip=True)
# Extract cover image
poster_image = ""
img_elem = soup.find('img', class_=lambda x: x and 'poster' in x.lower())
if img_elem and img_elem.get('src'):
poster_image = img_elem['src']
if poster_image.startswith('/'):
img_elem = soup.find("img", class_=lambda x: x and "poster" in x.lower())
if img_elem and img_elem.get("src"):
poster_image = img_elem["src"]
if poster_image.startswith("/"):
poster_image = self.base_url + poster_image
# Extract genres
genres = []
genre_links = soup.find_all('a', href=re.compile(r'/xfsearch/.*genre/'))
genre_links = soup.find_all("a", href=re.compile(r"/xfsearch/.*genre/"))
for link in genre_links[:10]: # Limit to 10 genres
genre = link.get_text(strip=True)
if genre:
@@ -198,36 +204,38 @@ class FrenchMangaDownloader(BaseAnimeSite):
# Extract rating (if available)
rating = ""
rating_elem = soup.find(['span', 'div'], class_=lambda x: x and 'rating' in x.lower())
rating_elem = soup.find(
["span", "div"], class_=lambda x: x and "rating" in x.lower()
)
if rating_elem:
rating = rating_elem.get_text(strip=True)
return {
'title': title,
'synopsis': synopsis,
'genres': genres,
'rating': rating,
'release_year': '',
'studio': '',
'poster_image': poster_image,
'total_episodes': len(await self.get_episodes(anime_url)),
'status': '',
'languages': ['vf', 'vostfr']
"title": title,
"synopsis": synopsis,
"genres": genres,
"rating": rating,
"release_year": "",
"studio": "",
"poster_image": poster_image,
"total_episodes": len(await self.get_episodes(anime_url)),
"status": "",
"languages": ["vf", "vostfr"],
}
except Exception as e:
logger.error(f"Error getting anime metadata: {e}")
return {
'title': '',
'synopsis': '',
'genres': [],
'rating': '',
'release_year': '',
'studio': '',
'poster_image': '',
'total_episodes': 0,
'status': '',
'languages': ['vf', 'vostfr']
"title": "",
"synopsis": "",
"genres": [],
"rating": "",
"release_year": "",
"studio": "",
"poster_image": "",
"total_episodes": 0,
"status": "",
"languages": ["vf", "vostfr"],
}
async def get_download_link(self, url: str) -> tuple[str, str]:
@@ -248,20 +256,20 @@ class FrenchMangaDownloader(BaseAnimeSite):
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
# Look for iframe or video player
iframe = soup.find('iframe', src=True)
iframe = soup.find("iframe", src=True)
if iframe:
video_url = iframe['src']
video_url = iframe["src"]
else:
# Look for video tag directly
video = soup.find('video', src=True)
video = soup.find("video", src=True)
if video:
video_url = video['src']
video_url = video["src"]
else:
# Try to find in script tags
scripts = soup.find_all('script')
scripts = soup.find_all("script")
for script in scripts:
if script.string:
# Look for iframe or video URLs in JavaScript
@@ -274,20 +282,20 @@ class FrenchMangaDownloader(BaseAnimeSite):
if match:
video_url = match.group(1)
break
if 'video_url' in locals():
if "video_url" in locals():
break
if 'video_url' not in locals():
if "video_url" not in locals():
raise ValueError("Could not find video player URL")
# Ensure absolute URL
if video_url.startswith('//'):
video_url = 'https:' + video_url
elif video_url.startswith('/'):
if video_url.startswith("//"):
video_url = "https:" + video_url
elif video_url.startswith("/"):
video_url = self.base_url + video_url
# Extract episode title
title_elem = soup.find('h1') or soup.find('h2')
title_elem = soup.find("h1") or soup.find("h2")
episode_title = title_elem.get_text(strip=True) if title_elem else "Episode"
episode_title = sanitize_filename(episode_title)
+119 -87
@@ -7,79 +7,100 @@ from urllib.parse import urljoin
class NekoSamaDownloader(BaseAnimeSite):
"""Downloader for neko-sama.org (anime streaming via Gupy)
NOTE: neko-sama.org now redirects to Gupy, which is a legal streaming search engine.
It does NOT host video content - it provides metadata about where to watch legally.
This provider can search and get metadata but cannot provide direct download links.
"""
BASE_DOMAINS = ["neko-sama.org", "www.neko-sama.org", "neko-sama.fr", "nekosama.fr", "www.gupy.fr", "gupy.fr"]
BASE_DOMAINS = [
"neko-sama.org",
"www.neko-sama.org",
"neko-sama.fr",
"nekosama.fr",
"www.gupy.fr",
"gupy.fr",
]
def __init__(self):
super().__init__()
self.id = "neko-sama"
def can_handle(self, url: str) -> bool:
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
async def get_download_link(self, url: str, target_filename: Optional[str] = None) -> tuple[str, str]:
async def get_download_link(
self, url: str, target_filename: Optional[str] = None
) -> tuple[str, str]:
"""
Extract download link from neko-sama URL.
NOTE: neko-sama.org/Gupy is a legal streaming search engine, NOT a video host.
This returns streaming platform information instead of direct video links.
"""
try:
# Check if this is a Gupy URL
if 'gupy.fr' in url or 'neko-sama.org' in url:
if "gupy.fr" in url or "neko-sama.org" in url:
response = await self.client.get(url, follow_redirects=True)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
# Look for streaming platform links
streaming_links = []
for link in soup.find_all('a', href=True):
href = link.get('href', '')
if '/out/' in href:
for link in soup.find_all("a", href=True):
href = link.get("href", "")
if "/out/" in href:
text = link.get_text(strip=True)
if text and 'Regarder' in text:
if text and "Regarder" in text:
streaming_links.append(f"{text}: {href}")
if streaming_links:
title_elem = soup.find('h1') or soup.find('title')
title = title_elem.get_text(strip=True).split('|')[0].strip() if title_elem else "Unknown"
info = "Available streaming platforms:\n" + "\n".join(streaming_links[:5])
title_elem = soup.find("h1") or soup.find("title")
title = (
title_elem.get_text(strip=True).split("|")[0].strip()
if title_elem
else "Unknown"
)
info = "Available streaming platforms:\n" + "\n".join(
streaming_links[:5]
)
filename = target_filename or f"{title}_streaming_info.txt"
return info, filename
raise Exception("No streaming links found - Gupy is a legal streaming search, not a video host")
raise Exception(
"No streaming links found - Gupy is a legal streaming search, not a video host"
)
# Legacy: try original method for other URLs
response = await self.client.get(url, follow_redirects=True)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
# Method 1: Look for iframes with video
iframes = soup.find_all('iframe')
iframes = soup.find_all("iframe")
for iframe in iframes:
src = iframe.get('src', '')
if src and any(p in src for p in ['video', 'player', 'stream']):
if not src.startswith('http'):
src = iframe.get("src", "")
if src and any(p in src for p in ["video", "player", "stream"]):
if not src.startswith("http"):
src = urljoin(str(response.url), src)
filename = self._generate_filename(str(response.url))
return src, filename
# Method 2: Look for video tags
videos = soup.find_all('video')
videos = soup.find_all("video")
for video in videos:
src = video.get('src') or video.get('data-src')
src = video.get("src") or video.get("data-src")
if src:
filename = self._generate_filename(str(response.url))
return src, filename
sources = video.find_all('source')
sources = video.find_all("source")
for source in sources:
src = source.get('src', '')
src = source.get("src", "")
if src:
filename = self._generate_filename(str(response.url))
return src, filename
# Method 3: Look in scripts
scripts = soup.find_all('script')
scripts = soup.find_all("script")
for script in scripts:
if script.string:
patterns = [
@@ -90,24 +111,26 @@ class NekoSamaDownloader(BaseAnimeSite):
for pattern in patterns:
matches = re.findall(pattern, script.string)
for match in matches:
match = match.replace('\\/', '/')
if any(ext in match for ext in ['mp4', 'm3u8']):
match = match.replace("\\/", "/")
if any(ext in match for ext in ["mp4", "m3u8"]):
filename = self._generate_filename(str(response.url))
return match, filename
raise Exception("Could not find video link - Neko-Sama/Gupy does not host video content")
raise Exception(
"Could not find video link - Neko-Sama/Gupy does not host video content"
)
except Exception as e:
raise Exception(f"Error extracting NekoSama link: {str(e)}")
def _generate_filename(self, url: str) -> str:
parts = url.split('/')
parts = url.split("/")
anime_name = "anime"
episode = "1"
for i, part in enumerate(parts):
if 'episode' in part.lower():
match = re.search(r'episode[-\s]*(\d+)', part, re.I)
if "episode" in part.lower():
match = re.search(r"episode[-\s]*(\d+)", part, re.I)
if match:
episode = match.group(1)
@@ -118,31 +141,31 @@ class NekoSamaDownloader(BaseAnimeSite):
"""Get list of episodes for an anime."""
try:
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
episodes = []
# Try to find episode links
episode_links = soup.find_all('a', href=re.compile(r'episode'))
episode_links = soup.find_all("a", href=re.compile(r"episode"))
for link in episode_links:
href = link.get('href', '')
match = re.search(r'episode[-\s]*(\d+)', href, re.I)
href = link.get("href", "")
match = re.search(r"episode[-\s]*(\d+)", href, re.I)
if match:
episode_num = match.group(1)
if not href.startswith('http'):
if not href.startswith("http"):
href = urljoin(anime_url, href)
episodes.append({'episode': episode_num, 'url': href})
episodes.append({"episode": episode_num, "url": href})
# Deduplicate and sort
seen = set()
unique_episodes = []
for ep in episodes:
if ep['episode'] not in seen:
seen.add(ep['episode'])
if ep["episode"] not in seen:
seen.add(ep["episode"])
unique_episodes.append(ep)
unique_episodes.sort(key=lambda x: int(x['episode']))
unique_episodes.sort(key=lambda x: int(x["episode"]))
return unique_episodes
except Exception as e:
@@ -153,70 +176,70 @@ class NekoSamaDownloader(BaseAnimeSite):
try:
print(f"[NEKO-SAMA] Extracting metadata from: {anime_url}")
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
metadata = {
'synopsis': None,
'genres': [],
'rating': None,
'release_year': None,
'studio': None,
'poster_image': None,
'banner_image': None,
'total_episodes': None,
'status': None,
'alternative_titles': []
"synopsis": None,
"genres": [],
"rating": None,
"release_year": None,
"studio": None,
"poster_image": None,
"banner_image": None,
"total_episodes": None,
"status": None,
"alternative_titles": [],
}
# Extract title and year from h1
title_elem = soup.find('h1')
title_elem = soup.find("h1")
if title_elem:
title_text = title_elem.get_text(strip=True)
# Extract year from title like "Naruto (2002)"
year_match = re.search(r'\((\d{4})\)', title_text)
year_match = re.search(r"\((\d{4})\)", title_text)
if year_match:
metadata['release_year'] = int(year_match.group(1))
metadata["release_year"] = int(year_match.group(1))
# Extract synopsis - Gupy shows it as paragraphs
synopsis_elem = soup.find('p')
synopsis_elem = soup.find("p")
if synopsis_elem:
text = synopsis_elem.get_text(strip=True)
if len(text) > 50:
metadata['synopsis'] = text
metadata["synopsis"] = text
# Extract genres from meta tags or links
genre_links = soup.find_all('a', href=re.compile(r'serie-|genre|tag'))
genre_links = soup.find_all("a", href=re.compile(r"serie-|genre|tag"))
if genre_links:
genres = []
for link in genre_links[:5]:
text = link.get_text(strip=True)
if text and '/' not in text and len(text) < 30:
if text and "/" not in text and len(text) < 30:
genres.append(text)
metadata['genres'] = genres
metadata["genres"] = genres
# Extract rating from percentage
rating_elem = soup.find(string=re.compile(r'\d+(\.\d+)?%'))
rating_elem = soup.find(string=re.compile(r"\d+(\.\d+)?%"))
if rating_elem:
match = re.search(r'(\d+(\.\d+)?)%', rating_elem)
match = re.search(r"(\d+(\.\d+)?)%", rating_elem)
if match:
rating = float(match.group(1)) / 10
metadata['rating'] = f"{rating:.1f}/10"
metadata["rating"] = f"{rating:.1f}/10"
# Extract poster image
poster_elem = soup.find('img', src=re.compile(r'poster|poster'))
poster_elem = soup.find("img", src=re.compile(r"poster"))
if poster_elem:
metadata['poster_image'] = poster_elem.get('src')
metadata["poster_image"] = poster_elem.get("src")
# Extract episode count from page text
page_text = soup.get_text()
ep_match = re.search(r'(\d+)\s*episodes?', page_text, re.I)
ep_match = re.search(r"(\d+)\s*episodes?", page_text, re.I)
if ep_match:
metadata['total_episodes'] = int(ep_match.group(1))
metadata["total_episodes"] = int(ep_match.group(1))
# Extract studio/director
director_elem = soup.find('a', href=re.compile(r'person|réalisé'))
director_elem = soup.find("a", href=re.compile(r"person|réalisé"))
if director_elem:
metadata['studio'] = director_elem.get_text(strip=True)
metadata["studio"] = director_elem.get_text(strip=True)
print(f"[NEKO-SAMA] Extracted metadata: {metadata}")
return metadata
@@ -225,16 +248,19 @@ class NekoSamaDownloader(BaseAnimeSite):
print(f"[NEKO-SAMA] Error extracting metadata: {e}")
return {}
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
async def search_anime(
self, query: str, lang: str = "vostfr", include_metadata: bool = False
) -> list[dict]:
"""Search for anime on neko-sama (uses Gupy backend)."""
try:
import time
from html import unescape
start = time.time()
print(f"[NEKO-SAMA] Searching for '{query}' ({lang})...")
# Neko-Sama now uses Gupy - try the direct URL pattern
search_slug = query.lower().replace(' ', '-')
search_slug = query.lower().replace(" ", "-")
search_urls = [
f"https://www.gupy.fr/series/{search_slug}/",
f"https://neko-sama.org/series/{search_slug}/",
@@ -250,34 +276,40 @@ class NekoSamaDownloader(BaseAnimeSite):
print(f"[NEKO-SAMA] Found anime at {final_url}")
# Extract title from page
soup = BeautifulSoup(response.text, 'lxml')
title_elem = soup.find('h1') or soup.find('title')
title = unescape(title_elem.get_text(strip=True)) if title_elem else query
soup = BeautifulSoup(response.text, "lxml")
title_elem = soup.find("h1") or soup.find("title")
title = (
unescape(title_elem.get_text(strip=True))
if title_elem
else query
)
# Clean up title
title = title.split('|')[0].split('-')[0].strip()
title = title.split("|")[0].split("-")[0].strip()
result = {
'title': title,
'url': final_url,
'cover_image': None,
'type': 'direct',
'metadata': None
"title": title,
"url": final_url,
"cover_image": None,
"type": "direct",
"metadata": None,
}
# Try to get poster
poster = soup.find('img', src=re.compile(r'poster'))
poster = soup.find("img", src=re.compile(r"poster"))
if poster:
result['cover_image'] = poster.get('src')
result["cover_image"] = poster.get("src")
if include_metadata:
metadata = await self.get_anime_metadata(final_url)
result['metadata'] = metadata
result["metadata"] = metadata
results.append(result)
break
elapsed = time.time() - start
print(f"[NEKO-SAMA] Search completed in {elapsed:.2f}s, found {len(results)} results")
print(
f"[NEKO-SAMA] Search completed in {elapsed:.2f}s, found {len(results)} results"
)
return results
except Exception as e:
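Review note on `get_episodes` in this file: the "Deduplicate and sort" step compares episode numbers as strings, so `"01"` and `"1"` would count as different episodes. A small sketch of one possible variation that normalises to `int` before comparing; the helper name `dedupe_and_sort` is hypothetical and the dict shape is taken from the diff.

```python
def dedupe_and_sort(episodes: list[dict]) -> list[dict]:
    """Keep the first entry per episode number, then sort numerically."""
    seen: set[int] = set()
    unique: list[dict] = []
    for ep in episodes:
        number = int(ep["episode"])  # "01" and "1" collapse to the same key
        if number not in seen:
            seen.add(number)
            unique.append(ep)
    unique.sort(key=lambda ep: int(ep["episode"]))
    return unique
```

This relies on the upstream regex capturing `\d+` only, so `int()` should not raise on values produced by the code shown in the diff.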
+78 -63
@@ -9,6 +9,10 @@ class VostfreeDownloader(BaseAnimeSite):
BASE_DOMAINS = ["vostfree.tv", "www.vostfree.tv"]
def __init__(self):
super().__init__()
self.id = "vostfree"
def can_handle(self, url: str) -> bool:
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
@@ -16,35 +20,35 @@ class VostfreeDownloader(BaseAnimeSite):
"""Extract download link from vostfree URL"""
try:
response = await self.client.get(url, follow_redirects=True)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
# Method 1: Look for iframe players
iframes = soup.find_all('iframe')
iframes = soup.find_all("iframe")
for iframe in iframes:
src = iframe.get('src', '')
if src and any(p in src for p in ['player', 'video', 'stream']):
if not src.startswith('http'):
src = iframe.get("src", "")
if src and any(p in src for p in ["player", "video", "stream"]):
if not src.startswith("http"):
src = urljoin(str(response.url), src)
filename = self._generate_filename(str(response.url))
return src, filename
# Method 2: Look for video tags
videos = soup.find_all('video')
videos = soup.find_all("video")
for video in videos:
src = video.get('src')
src = video.get("src")
if src:
filename = self._generate_filename(str(response.url))
return src, filename
sources = video.find_all('source')
sources = video.find_all("source")
for source in sources:
src = source.get('src', '')
if src and any(ext in src for ext in ['mp4', 'm3u8']):
src = source.get("src", "")
if src and any(ext in src for ext in ["mp4", "m3u8"]):
filename = self._generate_filename(str(response.url))
return src, filename
# Method 3: Look in scripts
scripts = soup.find_all('script')
scripts = soup.find_all("script")
for script in scripts:
if script.string:
patterns = [
@@ -56,8 +60,8 @@ class VostfreeDownloader(BaseAnimeSite):
for pattern in patterns:
matches = re.findall(pattern, script.string)
for match in matches:
match = match.replace('\\/', '/')
if any(ext in match for ext in ['mp4', 'm3u8']):
match = match.replace("\\/", "/")
if any(ext in match for ext in ["mp4", "m3u8"]):
filename = self._generate_filename(str(response.url))
return match, filename
@@ -67,12 +71,12 @@ class VostfreeDownloader(BaseAnimeSite):
raise Exception(f"Error extracting Vostfree link: {str(e)}")
def _generate_filename(self, url: str) -> str:
parts = url.split('/')
parts = url.split("/")
anime_name = "anime"
episode = "1"
for part in parts:
match = re.search(r'episode[-\s]*(\d+)', part, re.I)
match = re.search(r"episode[-\s]*(\d+)", part, re.I)
if match:
episode = match.group(1)
@@ -82,30 +86,30 @@ class VostfreeDownloader(BaseAnimeSite):
async def get_episodes(self, anime_url: str, lang: str = "vostfr") -> list[dict]:
try:
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
episodes = []
episode_links = soup.find_all('a', href=re.compile(r'episode', re.I))
episode_links = soup.find_all("a", href=re.compile(r"episode", re.I))
for link in episode_links:
href = link.get('href', '')
match = re.search(r'episode[-\s]*(\d+)', href, re.I)
href = link.get("href", "")
match = re.search(r"episode[-\s]*(\d+)", href, re.I)
if match:
episode_num = match.group(1)
if not href.startswith('http'):
if not href.startswith("http"):
href = urljoin(anime_url, href)
episodes.append({'episode': episode_num, 'url': href})
episodes.append({"episode": episode_num, "url": href})
# Deduplicate and sort
seen = set()
unique_episodes = []
for ep in episodes:
if ep['episode'] not in seen:
seen.add(ep['episode'])
if ep["episode"] not in seen:
seen.add(ep["episode"])
unique_episodes.append(ep)
unique_episodes.sort(key=lambda x: int(x['episode']))
unique_episodes.sort(key=lambda x: int(x["episode"]))
return unique_episodes
except Exception as e:
@@ -119,29 +123,29 @@ class VostfreeDownloader(BaseAnimeSite):
try:
print(f"[VOSTFREE] Extracting metadata from: {anime_url}")
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
metadata = {
'synopsis': None,
'genres': [],
'rating': None,
'release_year': None,
'studio': None,
'poster_image': None,
'banner_image': None,
'total_episodes': None,
'status': None,
'alternative_titles': []
"synopsis": None,
"genres": [],
"rating": None,
"release_year": None,
"studio": None,
"poster_image": None,
"banner_image": None,
"total_episodes": None,
"status": None,
"alternative_titles": [],
}
# Extract synopsis
synopsis_selectors = [
'div.synopsis',
'div.description',
"div.synopsis",
"div.description",
'div[class*="synopsis"]',
'div[class*="desc"]',
'p.synopsis',
'.anime-synopsis'
"p.synopsis",
".anime-synopsis",
]
for selector in synopsis_selectors:
@@ -149,57 +153,65 @@ class VostfreeDownloader(BaseAnimeSite):
if synopsis_elem:
synopsis = synopsis_elem.get_text(strip=True)
if len(synopsis) > 50:
metadata['synopsis'] = synopsis
metadata["synopsis"] = synopsis
break
# Extract genres
genre_links = soup.find_all('a', href=re.compile(r'genre|tag|type', re.I))
genre_links = soup.find_all("a", href=re.compile(r"genre|tag|type", re.I))
if genre_links:
metadata['genres'] = [link.get_text(strip=True) for link in genre_links[:5]]
metadata["genres"] = [
link.get_text(strip=True) for link in genre_links[:5]
]
# Extract rating
rating_selectors = [
'span.rating',
'div.rating',
'span.score',
"span.rating",
"div.rating",
"span.score",
'div[class*="rating"]',
'div[class*="score"]'
'div[class*="score"]',
]
for selector in rating_selectors:
rating_elem = soup.select_one(selector)
if rating_elem:
rating_text = rating_elem.get_text(strip=True)
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*10', rating_text)
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*10", rating_text)
if rating_match:
metadata['rating'] = f"{rating_match.group(1)}/10"
metadata["rating"] = f"{rating_match.group(1)}/10"
break
# Extract release year
page_text = soup.get_text()
year_matches = re.findall(r'\b(19\d{2}|20\d{2})\b', page_text)
year_matches = re.findall(r"\b(19\d{2}|20\d{2})\b", page_text)
if year_matches:
import datetime
current_year = datetime.datetime.now().year + 2
valid_years = [int(y) for y in year_matches if 1950 <= int(y) <= current_year]
valid_years = [
int(y) for y in year_matches if 1950 <= int(y) <= current_year
]
if valid_years:
from collections import Counter
metadata['release_year'] = Counter(valid_years).most_common(1)[0][0]
metadata["release_year"] = Counter(valid_years).most_common(1)[0][0]
# Extract poster image
poster_elem = soup.select_one('img.poster, img.cover, .anime-poster img')
poster_elem = soup.select_one("img.poster, img.cover, .anime-poster img")
if poster_elem:
metadata['poster_image'] = poster_elem.get('src') or poster_elem.get('data-src')
metadata["poster_image"] = poster_elem.get("src") or poster_elem.get(
"data-src"
)
# Extract poster from og:image
og_image = soup.find('meta', property='og:image')
if og_image and not metadata['poster_image']:
metadata['poster_image'] = og_image.get('content')
og_image = soup.find("meta", property="og:image")
if og_image and not metadata["poster_image"]:
metadata["poster_image"] = og_image.get("content")
# Extract total episodes
episodes_count = len(await self.get_episodes(anime_url))
if episodes_count > 0:
metadata['total_episodes'] = episodes_count
metadata["total_episodes"] = episodes_count
print(f"[VOSTFREE] Extracted metadata: {metadata}")
return metadata
@@ -208,7 +220,9 @@ class VostfreeDownloader(BaseAnimeSite):
print(f"[VOSTFREE] Error extracting metadata: {e}")
return {}
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
async def search_anime(
self, query: str, lang: str = "vostfr", include_metadata: bool = False
) -> list[dict]:
"""
Search for anime on vostfree
@@ -219,6 +233,7 @@ class VostfreeDownloader(BaseAnimeSite):
"""
try:
import time
start = time.time()
print(f"[VOSTFREE] Searching for '{query}' ({lang})...")
@@ -233,15 +248,15 @@ class VostfreeDownloader(BaseAnimeSite):
if response.status_code == 200:
print(f"[VOSTFREE] Found anime at {str(response.url)}")
result = {
'title': query,
'url': str(response.url),
'type': 'direct',
'metadata': None
"title": query,
"url": str(response.url),
"type": "direct",
"metadata": None,
}
if include_metadata:
metadata = await self.get_anime_metadata(str(response.url))
result['metadata'] = metadata
result["metadata"] = metadata
return [result]
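Review note on the release-year heuristic in `get_anime_metadata`: it picks the most frequent four-digit year in the page text, bounded between 1950 and the current year plus two. Pulled out as a standalone function it is easier to test in isolation. A sketch under that reading of the diff; the function name `most_common_year` is hypothetical.

```python
import datetime
import re
from collections import Counter
from typing import Optional


def most_common_year(page_text: str) -> Optional[int]:
    """Return the most frequent plausible year in page_text, or None."""
    max_year = datetime.datetime.now().year + 2
    years = [int(y) for y in re.findall(r"\b(19\d{2}|20\d{2})\b", page_text)]
    valid = [y for y in years if 1950 <= y <= max_year]
    if not valid:
        return None
    return Counter(valid).most_common(1)[0][0]
```

The heuristic can be fooled by pages that mention an unrelated year often (a footer copyright, for instance), so a caller may want to treat the result as a hint rather than a fact.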
+128 -139
@@ -1,4 +1,5 @@
"""FS7 (French Stream) series site downloader"""
import logging
import re
from typing import List, Dict, Any, Optional
@@ -19,29 +20,46 @@ class FS7Downloader(BaseSeriesSite):
def __init__(self):
super().__init__()
self.base_url = "https://fs7.lol"
self.search_url = f"{self.base_url}/"
# Update client headers to mimic browser
self.client.headers.update({
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Accept-Language': 'fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7',
'Accept-Encoding': 'gzip, deflate',
'Connection': 'keep-alive',
'Upgrade-Insecure-Requests': '1'
})
self.id = "fs7"
self.provider_id = "fs7"
self.default_domain = "fs7.lol"
self.test_tlds = ["lol", "com", "net", "org", "tv", "ws", "cc", "co"]
self.base_url = f"https://{self.default_domain}"
self._domain_checked = False
self.client.headers.update(
{
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Accept-Language": "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7",
"Accept-Encoding": "gzip, deflate",
"Connection": "keep-alive",
"Upgrade-Insecure-Requests": "1",
}
)
async def _ensure_base_url(self):
"""Ensure base_url is set to the current active domain"""
if self._domain_checked:
return
self._domain_checked = True
try:
from app.utils import DomainManager
active_domain = await DomainManager.get_active_domain(
self.provider_id, self.default_domain, self.test_tlds, test_path="/"
)
self.base_url = f"https://{active_domain}"
logger.info(f"Using active domain for FS7: {self.base_url}")
except Exception as e:
logger.warning(f"Domain check failed for FS7, using default: {e}")
def can_handle(self, url: str) -> bool:
"""Check if this downloader can handle the given URL"""
return "fs7.lol" in url.lower() or "french-stream" in url.lower()
async def search_anime(
self,
query: str,
lang: str = "vf"
) -> List[Dict[str, str]]:
async def search_anime(self, query: str, lang: str = "vf") -> List[Dict[str, str]]:
"""
Search for series on FS7.
Search for series on FS7 using DLE AJAX search endpoint.
Args:
query: Search query
@@ -51,91 +69,61 @@ class FS7Downloader(BaseSeriesSite):
List of series with title, url, cover_image
"""
try:
await self._ensure_base_url()
logger.info(f"Searching FS7 for: {query}")
# FS7 uses GET request with query parameters for search
response = await self.client.get(
self.search_url,
params={
"do": "search",
"subaction": "search",
"story": query
}
ajax_url = f"{self.base_url}/engine/ajax/search.php"
response = await self.client.post(
ajax_url,
data={"query": query, "page": "1"},
headers={
"Content-Type": "application/x-www-form-urlencoded",
"X-Requested-With": "XMLHttpRequest",
"Referer": f"{self.base_url}/",
},
)
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
results = []
# Look for series items
# FS7 usually structure: <div class="movie-item">...<a href="..."><img src="..."></a>...</div>
# Or directly <a> tags with images
items = soup.find_all('div', class_='movie-item')
if not items:
# Fallback to the previous method if layout is different
items = soup.find_all('a', href=re.compile(r'/s-tv/\d+-.+\.html'))
for item in items[:24]: # Limit to 24 results
# Find the link and image within the item or the item itself
if item.name == 'a':
link_elem = item
else:
link_elem = item.find('a', href=re.compile(r'/s-tv/|/films/'))
if not link_elem:
for item in soup.find_all("div", class_="search-item")[:24]:
onclick = item.get("onclick", "")
url_match = re.search(r"location\.href=['\"]([^'\"]+)['\"]", onclick)
if not url_match:
continue
url = link_elem.get('href', '')
if not url.startswith('http'):
url = url_match.group(1)
if not url.startswith("http"):
url = urljoin(self.base_url, url)
# Extract title
img_elem = item.find('img')
title = ""
if img_elem and img_elem.get('alt'):
title = img_elem.get('alt').strip()
elif link_elem.get('title'):
title = link_elem.get('title').strip()
else:
title = item.get_text(strip=True)
title_elem = item.find("div", class_="search-title")
title = title_elem.get_text(strip=True) if title_elem else ""
title = re.sub(r"\s+", " ", title).strip()
# Extract cover image
img_elem = item.find('img')
cover_image = ""
if img_elem:
# Check for common lazy loading attributes used by various themes
cover_image = (
img_elem.get('data-src') or
img_elem.get('data-original') or
img_elem.get('src') or
""
)
# If still empty, look for background-style images in inline styles
if not cover_image:
style = item.get('style', '')
if 'background-image' in style:
match = re.search(r'url\([\'"]?(.*?)[\'"]?\)', style)
if match:
cover_image = match.group(1)
if cover_image and not cover_image.startswith('http'):
cover_image = urljoin(self.base_url, cover_image)
# Clean up title
title = re.sub(r'\s+affiche$', '', title, flags=re.IGNORECASE).strip()
title = re.sub(r'\s+', ' ', title)
poster_elem = item.find("div", class_="search-poster")
if poster_elem:
img = poster_elem.find("img")
if img:
cover_image = (
img.get("data-src")
or img.get("data-original")
or img.get("src")
or ""
)
if title and len(title) > 2:
if not any(r['url'] == url for r in results):
results.append({
'title': title,
'url': url,
'cover_image': cover_image
})
results.append(
{
"title": title,
"url": url,
"cover_image": cover_image,
"provider_id": self.provider_id,
}
)
logger.info(f"Found {len(results)} series on FS7")
logger.info(f"Found {len(results)} results on FS7 for '{query}'")
return results
except Exception as e:
@@ -143,9 +131,7 @@ class FS7Downloader(BaseSeriesSite):
return []
async def get_episodes(
self,
anime_url: str,
lang: str = "vf"
self, anime_url: str, lang: str = "vf"
) -> List[Dict[str, str]]:
"""
Get episode list for a series.
@@ -164,31 +150,33 @@ class FS7Downloader(BaseSeriesSite):
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
episodes = []
# Get series title for episode naming
title_elem = soup.find('h1')
title_elem = soup.find("h1")
series_title = title_elem.get_text(strip=True) if title_elem else "Series"
# Clean up title: remove "affiche" suffix
series_title = re.sub(r'\s+affiche$', '', series_title, flags=re.IGNORECASE).strip()
series_title = re.sub(
r"\s+affiche$", "", series_title, flags=re.IGNORECASE
).strip()
# FS7 stores episode data in JavaScript div elements
# Format: <div data-ep="1" data-vidzy="..." data-uqload="..." data-netu="..." data-voe="..."></div>
episode_divs = soup.find_all('div', attrs={'data-ep': True})
episode_divs = soup.find_all("div", attrs={"data-ep": True})
for div in episode_divs:
ep_num = div.get('data-ep', '').strip()
ep_num = div.get("data-ep", "").strip()
# Try different video players in order of preference
video_url = None
host_name = None
for player in ['data-vidzy', 'data-uqload', 'data-voe', 'data-netu']:
player_url = div.get(player, '').strip()
for player in ["data-vidzy", "data-uqload", "data-voe", "data-netu"]:
player_url = div.get(player, "").strip()
if player_url:
video_url = player_url
# Extract host name from attribute name
host_name = player.replace('data-', '').title()
host_name = player.replace("data-", "").title()
logger.debug(f"Found episode {ep_num} on {host_name}")
break
@@ -199,15 +187,19 @@ class FS7Downloader(BaseSeriesSite):
# Use pipe-separated format: video_url|anime_url|episode_title
combined_url = f"{video_url}|{anime_url}|{episode_title}"
episodes.append({
'episode': ep_num,
'url': combined_url,
'title': episode_title,
'host': host_name or 'Unknown'
})
episodes.append(
{
"episode": ep_num,
"url": combined_url,
"title": episode_title,
"host": host_name or "Unknown",
}
)
# Sort by episode number
episodes.sort(key=lambda x: int(x['episode']) if x['episode'].isdigit() else 0)
episodes.sort(
key=lambda x: int(x["episode"]) if x["episode"].isdigit() else 0
)
logger.info(f"Found {len(episodes)} episodes")
return episodes
@@ -216,10 +208,7 @@ class FS7Downloader(BaseSeriesSite):
logger.error(f"Error getting episodes from FS7: {e}")
return []
async def get_anime_metadata(
self,
anime_url: str
) -> Dict[str, Any]:
async def get_anime_metadata(self, anime_url: str) -> Dict[str, Any]:
"""
Get metadata for a series.
@@ -236,62 +225,62 @@ class FS7Downloader(BaseSeriesSite):
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
# Extract title
title = soup.find('h1')
title = soup.find("h1")
title = title.get_text(strip=True) if title else "Unknown"
# Clean up title: remove "affiche" suffix
title = re.sub(r'\s+affiche$', '', title, flags=re.IGNORECASE).strip()
title = re.sub(r"\s+affiche$", "", title, flags=re.IGNORECASE).strip()
# Extract description/synopsis
description_elem = soup.find('div', class_='full-text')
description = description_elem.get_text(strip=True) if description_elem else ""
description_elem = soup.find("div", class_="full-text")
description = (
description_elem.get_text(strip=True) if description_elem else ""
)
# Extract cover image
img = soup.find('img', class_='poster')
poster_image = img.get('src', '') if img else ''
img = soup.find("img", class_="poster")
poster_image = img.get("src", "") if img else ""
# Try to get poster from meta tag if not found
if not poster_image:
meta_img = soup.find('meta', property='og:image')
poster_image = meta_img.get('content', '') if meta_img else ''
meta_img = soup.find("meta", property="og:image")
poster_image = meta_img.get("content", "") if meta_img else ""
# Extract year
year_match = re.search(r'\b(19|20)\d{2}\b', description)
year_match = re.search(r"\b(19|20)\d{2}\b", description)
release_year = int(year_match.group()) if year_match else None
return {
'title': title,
'synopsis': description,
'poster_image': poster_image,
'release_year': release_year,
'genres': [],
'rating': None,
'studio': None,
'total_episodes': None,
'status': None
"title": title,
"synopsis": description,
"poster_image": poster_image,
"release_year": release_year,
"genres": [],
"rating": None,
"studio": None,
"total_episodes": None,
"status": None,
}
except Exception as e:
logger.error(f"Error getting metadata from FS7: {e}")
return {
'title': "Unknown",
'synopsis': "",
'poster_image': '',
'genres': [],
'rating': None,
'release_year': None,
'studio': None,
'total_episodes': None,
'status': None
"title": "Unknown",
"synopsis": "",
"poster_image": "",
"genres": [],
"rating": None,
"release_year": None,
"studio": None,
"total_episodes": None,
"status": None,
}
async def get_download_link(
self,
url: str,
target_filename: Optional[str] = None
self, url: str, target_filename: Optional[str] = None
) -> tuple[str, str]:
"""
Extract download link from video player URL.
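Review note on the pipe-separated episode reference built in `get_episodes` (`video_url|anime_url|episode_title`): the consuming side is not visible in this diff, so the following is only a sketch of how packing and unpacking could be kept symmetric. Both function names are hypothetical, and the sketch assumes neither URL contains a literal `|`.

```python
def pack_episode_ref(video_url: str, anime_url: str, episode_title: str) -> str:
    """Build the combined reference in the same order the diff uses."""
    return f"{video_url}|{anime_url}|{episode_title}"


def unpack_episode_ref(combined: str) -> tuple[str, str, str]:
    """Split a combined reference back into its three parts."""
    # maxsplit=2 keeps any "|" inside the episode title intact.
    parts = combined.split("|", 2)
    if len(parts) != 3:
        raise ValueError(f"Expected 3 pipe-separated fields, got {len(parts)}")
    return parts[0], parts[1], parts[2]
```

Limiting the split to two separators matters because titles scraped from an `h1` can contain `|`, while the two URL fields normally do not.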
+127 -104
@@ -1,4 +1,5 @@
"""Zone-Telechargement series site downloader"""
import logging
import re
from typing import List, Dict, Any, Optional, Tuple
@@ -18,94 +19,106 @@ class ZoneTelechargementDownloader(BaseSeriesSite):
def __init__(self):
super().__init__()
self.id = "zonetelechargement"
self.provider_id = "zonetelechargement"
self.default_domain = "zone-telechargement.cam"
self.test_tlds = ["cam", "net", "org", "blue", "lol", "work"]
self.base_url = None # Will be set dynamically
self.client.headers.update({
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Accept-Language': 'fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7',
})
self.default_domain = "zone-telechargement.golf"
self.test_tlds = ["golf", "cam", "net", "org", "blue", "lol", "work", "ws"]
self.base_url = f"https://{self.default_domain}"
self._domain_checked = False
self.client.headers.update(
{
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Accept-Language": "fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7",
}
)
async def _ensure_base_url(self):
"""Ensure base_url is set to the current active domain"""
if not self.base_url:
if self._domain_checked:
return
self._domain_checked = True
try:
active_domain = await DomainManager.get_active_domain(
self.provider_id,
self.default_domain,
self.test_tlds,
test_path="/"
self.provider_id, self.default_domain, self.test_tlds, test_path="/"
)
self.base_url = f"https://{active_domain}"
logger.info(f"Using active domain for Zone-Telechargement: {self.base_url}")
except Exception as e:
logger.warning(
f"Domain check failed for Zone-Telechargement, using default: {e}"
)
def can_handle(self, url: str) -> bool:
"""Check if this downloader can handle the given URL"""
return "zone-telechargement" in url.lower() or "zt-za" in url.lower()
async def search_anime(
self,
query: str,
lang: str = "vf"
) -> List[Dict[str, str]]:
"""Search for series on Zone-Telechargement"""
async def search_anime(self, query: str, lang: str = "vf") -> List[Dict[str, str]]:
"""Search for series on Zone-Telechargement.
ZT uses server-side rendered search: GET /?p=series&search=QUERY.
Results are in div.cover_global containers with nested cover_infos_title links.
"""
try:
await self._ensure_base_url()
logger.info(f"Searching Zone-Telechargement for: {query}")
# ZT uses POST or GET for search depending on the version
# Most modern versions use: /index.php?do=search
search_url = f"{self.base_url}/index.php?do=search"
# Form data for search
data = {
"do": "search",
"subaction": "search",
"search_start": "0",
"full_search": "0",
"result_from": "1",
"story": query
}
response = await self.client.post(search_url, data=data)
search_url = f"{self.base_url}/"
params = {"p": "series", "search": query}
response = await self.client.get(search_url, params=params)
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
results = []
# Look for items
items = soup.find_all('div', class_='shm-item') or soup.find_all('div', class_='movie-item')
for item in items[:24]:
link_elem = item.find('a', class_='shm-title') or item.find('a')
if not link_elem:
for cover_div in soup.find_all("div", class_="cover_global")[:24]:
link_in_cover = cover_div.find("a", class_="mainimg")
if not link_in_cover:
link_in_cover = cover_div.find("a")
if not link_in_cover:
continue
url = link_elem.get('href', '')
if not url.startswith('http'):
url = link_in_cover.get("href", "")
if not url.startswith("http"):
url = urljoin(self.base_url, url)
title = link_elem.get_text(strip=True)
img_elem = item.find('img')
img = cover_div.find("img")
cover_image = ""
if img_elem:
cover_image = img_elem.get('data-src') or img_elem.get('src') or ""
if cover_image and not cover_image.startswith('http'):
cover_image = urljoin(self.base_url, cover_image)
if img:
cover_image = img.get("data-src") or img.get("src") or ""
if cover_image and not cover_image.startswith("http"):
cover_image = urljoin(self.base_url, cover_image)
title = ""
info_div = cover_div.find("div", class_="cover_infos_title")
if info_div:
title_link = info_div.find("a")
if title_link:
title = title_link.get_text(strip=True)
else:
title = info_div.get_text(strip=True)
else:
title = link_in_cover.get("title", "")
if not title:
title = link_in_cover.get_text(strip=True)
if title and len(title) > 2:
results.append({
'title': title,
'url': url,
'cover_image': cover_image,
'provider_id': self.provider_id
})
results.append(
{
"title": title,
"url": url,
"cover_image": cover_image,
"provider_id": self.provider_id,
}
)
logger.info(
f"Zone-Telechargement found {len(results)} results for '{query}'"
)
return results
except Exception as e:
@@ -113,39 +126,35 @@ class ZoneTelechargementDownloader(BaseSeriesSite):
return []
async def get_episodes(
self,
anime_url: str,
lang: str = "vf"
self, anime_url: str, lang: str = "vf"
) -> List[Dict[str, str]]:
"""Extract episodes from a series page"""
try:
await self._ensure_base_url()
html = await self._fetch_page(anime_url)
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
episodes = []
# ZT typically lists episodes in a table or list of links
# Links often look like: /telecharger-series/.../saison-X-episode-Y.html
links = soup.find_all('a', href=re.compile(r'episode-\d+'))
links = soup.find_all("a", href=re.compile(r"episode-\d+"))
for i, link in enumerate(links):
href = link.get('href', '')
if not href.startswith('http'):
href = link.get("href", "")
if not href.startswith("http"):
href = urljoin(self.base_url, href)
title = link.get_text(strip=True)
ep_match = re.search(r'episode\s*(\d+)', title.lower())
ep_match = re.search(r"episode\s*(\d+)", title.lower())
ep_number = int(ep_match.group(1)) if ep_match else i + 1
episodes.append({
'episode_number': ep_number,
'url': href,
'title': title
})
episodes.append(
{"episode_number": ep_number, "url": href, "title": title}
)
# Sort by episode number
episodes.sort(key=lambda x: x['episode_number'])
episodes.sort(key=lambda x: x["episode_number"])
return episodes
except Exception as e:
@@ -157,32 +166,40 @@ class ZoneTelechargementDownloader(BaseSeriesSite):
try:
await self._ensure_base_url()
html = await self._fetch_page(anime_url)
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
metadata = {
'title': "",
'synopsis': "",
'genres': [],
'poster_image': "",
'status': "Unknown"
"title": "",
"synopsis": "",
"genres": [],
"poster_image": "",
"status": "Unknown",
}
title_elem = soup.find('h1')
title_elem = soup.find("h1")
if title_elem:
metadata['title'] = title_elem.get_text(strip=True)
metadata["title"] = title_elem.get_text(strip=True)
# Synopsis
syn_elem = soup.find('div', class_='shm-description') or soup.find('div', class_='movie-desc')
syn_elem = soup.find("div", class_="shm-description") or soup.find(
"div", class_="movie-desc"
)
if syn_elem:
metadata['synopsis'] = syn_elem.get_text(strip=True)
metadata["synopsis"] = syn_elem.get_text(strip=True)
# Poster
img_elem = soup.find('div', class_='shm-img').find('img') if soup.find('div', class_='shm-img') else None
img_elem = (
soup.find("div", class_="shm-img").find("img")
if soup.find("div", class_="shm-img")
else None
)
if img_elem:
metadata['poster_image'] = urljoin(self.base_url, img_elem.get('src', ''))
metadata["poster_image"] = urljoin(
self.base_url, img_elem.get("src", "")
)
return metadata
except Exception as e:
logger.error(f"Error getting metadata from Zone-Telechargement: {e}")
return {}
@@ -192,19 +209,25 @@ class ZoneTelechargementDownloader(BaseSeriesSite):
try:
await self._ensure_base_url()
html = await self._fetch_page(url)
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
# Look for video player links (Uptobox, 1fichier, etc.)
# ZT often has multiple hosts
links = soup.find_all('a', href=re.compile(r'uptobox|1fichier|doodstream|vidmoly'))
links = soup.find_all(
"a", href=re.compile(r"uptobox|1fichier|doodstream|vidmoly")
)
if links:
player_url = links[0].get('href', '')
title = soup.find('h1').get_text(strip=True) if soup.find('h1') else "Episode"
player_url = links[0].get("href", "")
title = (
soup.find("h1").get_text(strip=True)
if soup.find("h1")
else "Episode"
)
return player_url, title
return "", ""
except Exception as e:
logger.error(f"Error getting download link from Zone-Telechargement: {e}")
return "", ""
+28 -17
@@ -1,16 +1,26 @@
# Video Players (app/downloaders/video_players)
# Video Players (app/downloaders/video_players/)
## OVERVIEW
File hosting extractors that extract direct download links from video player pages (Doodstream, Sibnet, VidMoly, etc.).
File hosting extractors that resolve direct download links from video player pages (Doodstream, Sibnet, VidMoly, Uptobox, etc.).
## WHERE TO LOOK
| Need | File |
|------|------|
| Base class | `base.py` - `BaseVideoPlayer` abstract class |
| Add new player | Create new `.py` file, inherit `BaseVideoPlayer`, add to `__init__.py` |
| URL detection logic | Each player's `can_handle()` method |
| Extract download link | Each player's `get_download_link()` method |
| File | Purpose |
|------|---------|
| `base.py` | `BaseVideoPlayer` abstract class |
| `unfichier.py` | 1fichier.com |
| `doodstream.py` | Doodstream |
| `vidmoly.py` | VidMoly (requires Playwright for extraction) |
| `uptobox.py` | Uptobox |
| `sendvid.py` | SendVid |
| `sibnet.py` | Sibnet |
| `rapidfile.py` | Rapidfile |
| `uqload.py` | Uqload |
| `lpayer.py` | Lplayer |
| `vidzy.py` | Vidzy |
| `luluv.py` | LuLuvid |
| `smoothpre.py` | Smoothpre |
| `oneupload.py` | OneUpload |
## CONVENTIONS
@@ -22,16 +32,17 @@ def can_handle(self, url: str) -> bool: ...
async def get_download_link(self, url: str, target_filename: str = None) -> tuple[str, str]: ...
```
**File operation**: Always use `sanitize_filename()` on extracted filenames.
**HTTP client**: Use `self.client` (AsyncClient from base class). Always close via `await self.close()` when done.
**Return format**: `(download_url, filename)` tuple.
**HTTP client**: Use `self.client` (AsyncClient from base class). Always close via `await self.close()`.
**File operation**: Always call `sanitize_filename()` on extracted filenames.
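
A minimal sketch of a player that follows the conventions above. `BaseVideoPlayer` and `sanitize_filename()` are simplified stand-ins here so the snippet runs on its own; `ExamplePlayer`, `extract()`, and the `example-host.invalid` host are hypothetical. A real player would fetch and parse the page through `self.client` rather than derive the link from the URL.

```python
import asyncio
import re
from typing import Optional
from urllib.parse import urlparse


def sanitize_filename(name: str) -> str:
    """Stand-in for the project's sanitize_filename(); the real helper may differ."""
    return re.sub(r'[\\/:*?"<>|]+', "_", name).strip() or "video"


class BaseVideoPlayer:
    """Stand-in for the abstract class in base.py (assumed shape)."""

    def __init__(self) -> None:
        # The real base class is described as holding an AsyncClient in self.client.
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class ExamplePlayer(BaseVideoPlayer):
    """Hypothetical player for a made-up host."""

    def can_handle(self, url: str) -> bool:
        return "example-host.invalid" in url.lower()

    async def get_download_link(
        self, url: str, target_filename: Optional[str] = None
    ) -> tuple[str, str]:
        if not self.can_handle(url):
            raise ValueError(f"Unsupported URL: {url}")
        # A real player would request the page via self.client and parse it here.
        video_id = urlparse(url).path.rstrip("/").split("/")[-1]
        if not video_id:
            # Raise instead of returning None for a missing URL.
            raise ValueError(f"No video id in URL: {url}")
        download_url = f"https://cdn.example-host.invalid/{video_id}.mp4"
        filename = sanitize_filename(target_filename or f"{video_id}.mp4")
        return download_url, filename


async def extract(url: str, target_filename: Optional[str] = None) -> tuple[str, str]:
    player = ExamplePlayer()
    try:
        return await player.get_download_link(url, target_filename)
    finally:
        # Close even when extraction raises, so the client is not leaked.
        await player.close()
```

Wrapping the call in `try`/`finally` is one way to guarantee `close()` runs on both the success and the failure path.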
## ANTI-PATTERNS
- Do NOT hardcode User-Agent in each player (use base class headers)
- Do NOT forget to call `await self.close()` after extraction
- Do NOT return None for missing URLs, raise an exception
- Do NOT use sync `requests`, use async `httpx`
- Do NOT skip the `target_filename` parameter, even if unused
- Do NOT hardcode User-Agent per player; use base class headers
- Do NOT forget `await self.close()` — resource leak
- Do NOT return None for missing URLs; raise an exception
- Do NOT use sync `requests`; use async `httpx`
- Do NOT skip `target_filename` parameter — required for anime/series site compatibility
- 8 empty `except:` blocks across players — known tech debt
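
As an illustration of two of the points above (raise rather than return `None`, and avoid empty `except:` blocks), here is a hedged sketch. `ExtractionError`, `parse_video_url()`, `try_parse()`, and the `file: "..."` page pattern are assumptions made for the example, not names or formats taken from the project.

```python
import logging
import re

logger = logging.getLogger(__name__)


class ExtractionError(Exception):
    """Hypothetical error type; the project may define its own."""


def parse_video_url(html: str) -> str:
    """Return the first direct video URL found in a player page, or raise."""
    match = re.search(r'file:\s*"(https?://[^"]+)"', html)
    if match is None:
        # Raising makes the failure impossible for callers to ignore silently.
        raise ExtractionError("no video URL found in player page")
    return match.group(1)


def try_parse(html: str) -> str:
    try:
        return parse_video_url(html)
    except ExtractionError as exc:
        # Catch a named exception and log it, instead of a bare `except:`.
        logger.warning("extraction failed: %s", exc)
        raise
```

Catching a specific exception type keeps unrelated errors, such as cancellation or programming mistakes, visible to the caller.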