feat: fix auth, provider health checks, search, and redesign UI

- Fix register/login: dict-style access on UserTable ORM objects
- Fix HTMX auth: inject JWT token in all HTMX request headers
- Fix FS7 search: use DLE AJAX endpoint /engine/ajax/search.php
- Fix ZT search: use ?p=series&search=QUERY (not DLE format)
- Fix provider health: load hardcoded providers + domain manager
- Add self.id to all anime/series providers
- Redesign homepage: Netflix-style horizontal scroll cards (.hc)
- Redesign search results: grouped by title, poster + synopsis + 3 buttons
- Add Télécharger dropdown: season download + episode picker
- Fix navbar CSS: restore .tabs flex layout, remove orphan rules
- Fix HTMX spinner: remove inline display:none, use CSS indicator
- Add AGENTS.md files across project for developer documentation
root
2026-03-28 00:14:31 +00:00
parent 5d23a3d663
commit 3dc5dd8fe9
36 changed files with 2735 additions and 1989 deletions
+23 -23
@@ -1,4 +1,4 @@
# Anime Sites Downloaders
# Anime Sites (app/downloaders/anime_sites/)
## OVERVIEW
Handlers for French anime streaming catalogs that provide metadata and episode listings, delegating actual video extraction to video player handlers.
@@ -7,8 +7,8 @@ Handlers for French anime streaming catalogs that provide metadata and episode l
| File | Purpose |
|------|---------|
| `base.py` | Abstract `BaseAnimeSite` class defining the interface all anime sites implement |
| `animesama.py` | Primary provider with dynamic domain switching, multiple video player extraction |
| `base.py` | Abstract `BaseAnimeSite` class defining the interface |
| `animesama.py` | Primary provider: dynamic domain switching, multiple video player extraction |
| `nekosama.py` | Neko-Sama / Gupy integration (metadata-only, no direct downloads) |
| `animeultime.py` | Anime-Ultime catalog handler |
| `vostfree.py` | Vostfree catalog handler |
@@ -16,26 +16,26 @@ Handlers for French anime streaming catalogs that provide metadata and episode l
## CONVENTIONS
### Interface Contract
Each site must implement four async methods from `BaseAnimeSite`:
- `can_handle(url: str) -> bool` — URL pattern matching
- `search_anime(query, lang) -> list[dict]` — Returns `{title, url, cover_image}`
- `get_episodes(anime_url, lang) -> list[dict]` — Returns `{episode_number, url, title, host}`
- `get_anime_metadata(anime_url) -> dict` — Returns `{synopsis, genres, rating, release_year, studio, poster_image, total_episodes, status}`
- `get_download_link(url) -> tuple[str, str]` — Returns `(video_player_url, filename)`
**Interface contract** — each site implements from `BaseAnimeSite`:
- `can_handle(url)` — URL pattern matching
- `search_anime(query, lang)` → `[{title, url, cover_image}]`
- `get_episodes(anime_url, lang)` → `[{episode_number, url, title, host}]`
- `get_anime_metadata(anime_url)` → `{synopsis, genres, rating, release_year, studio, poster_image, total_episodes, status}`
- `get_download_link(url)` → `(video_player_url, filename)`
### Key Patterns
- **Pipe-separated URLs**: `video_url|anime_page_url|episode_title` — preserves context across extraction
- **Language parameter**: `lang="vostfr"` or `"vf"` — controls which episodes to return
- **Video player delegation**: Anime sites return player URLs (vidmoly, sendvid, sibnet, lpayer), not direct downloads
- **Filename generation**: `{anime_name} - S{season} - {episode}.mp4` format
- **HTTP headers**: Browser UA and referer required to avoid blocking
**Key patterns**:
- Pipe-separated URLs: `video_url|anime_page_url|episode_title`
- Language param: `lang="vostfr"` or `"vf"`
- Video player delegation: returns player URLs (vidmoly, sendvid, etc.), NOT direct downloads
- Filename format: `{anime_name} - S{season} - {episode}.mp4`
- Browser UA + referer headers required
### Domain Detection
- `AnimeSamaDownloader` fetches current domain from `anime-sama.pw` dynamically
- Uses fallback chain for video extraction: detected player → cached player → priority list
**Domain detection**: `AnimeSamaDownloader` fetches current domain from `anime-sama.pw` dynamically. Uses fallback chain for video extraction.
### Error Handling
- Raise `Exception` with descriptive message on failure
- Log at appropriate level (`debug` for expected failures, `error` for unexpected)
- Validate extracted URLs with `_test_video_url()` before returning
**Error handling**: Raise `Exception` with descriptive message. Log at `debug` for expected failures, `error` for unexpected. Validate URLs with `_test_video_url()` before returning.
## ANTI-PATTERNS
- Do NOT return direct download URLs from anime sites — return player URLs
- Do NOT skip URL validation — use `_test_video_url()`
- 5 empty `except:` blocks in `animesama.py` — known tech debt, silently swallow failures
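
Note on the key patterns above: the pipe-separated URL convention and the filename format could be sketched roughly as follows. This is a minimal illustration, not code from the commit. `EpisodeContext`, `pack_context`, `unpack_context`, and `build_filename` are hypothetical names, and the rule that a field may not contain `|` is an assumption.

```python
from typing import NamedTuple


class EpisodeContext(NamedTuple):
    """Hypothetical container for the three pipe-separated fields."""

    video_url: str
    anime_page_url: str
    episode_title: str


def pack_context(video_url: str, anime_page_url: str, episode_title: str) -> str:
    """Join the three fields as video_url|anime_page_url|episode_title."""
    fields = (video_url, anime_page_url, episode_title)
    # Assumption: a literal '|' inside a field would break the round trip.
    if any("|" in field for field in fields):
        raise ValueError("fields must not contain '|'")
    return "|".join(fields)


def unpack_context(packed: str) -> EpisodeContext:
    """Split a packed string back into its three fields."""
    parts = packed.split("|")
    if len(parts) != 3:
        raise ValueError(f"expected 3 pipe-separated fields, got {len(parts)}")
    return EpisodeContext(*parts)


def build_filename(anime_name: str, season: int, episode: int) -> str:
    """Format a filename as '{anime_name} - S{season} - {episode}.mp4'."""
    return f"{anime_name} - S{season} - {episode}.mp4"
```

Rejecting `|` up front keeps a malformed title from silently shifting fields during unpacking. Whether the real handlers escape or reject such input is not shown in this diff.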
File diff suppressed because it is too large
+147 -104
@@ -10,6 +10,10 @@ class AnimeUltimeDownloader(BaseAnimeSite):
BASE_DOMAINS = ["anime-ultime.com", "anime-ultime.net", "www.anime-ultime.net"]
def __init__(self):
super().__init__()
self.id = "anime-ultime"
def can_handle(self, url: str) -> bool:
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
@@ -24,58 +28,79 @@ class AnimeUltimeDownloader(BaseAnimeSite):
final_url = str(response.url)
# Parse the page
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
# Method 0: Look for og:video meta tag (most reliable for anime-ultime)
og_video = soup.find('meta', property='og:video')
if og_video and og_video.get('content'):
video_url = og_video['content']
if video_url.endswith('.mp4'):
og_video = soup.find("meta", property="og:video")
if og_video and og_video.get("content"):
video_url = og_video["content"]
if video_url.endswith(".mp4"):
filename = self._generate_filename(final_url)
print(f"[ANIME-ULTIME] Found og:video link: {video_url}")
return video_url, filename
# Method 1: Look for direct download links (DDL)
# Anime-Ultime often uses links to file hosts
download_links = soup.find_all('a', href=True)
download_links = soup.find_all("a", href=True)
for link in download_links:
href = link['href']
href = link["href"]
text = link.get_text().lower()
# Look for download buttons/links
if any(keyword in text for keyword in ['télécharger', 'download', 'ddl', 'mega', 'google', 'drive']):
if any(
keyword in text
for keyword in [
"télécharger",
"download",
"ddl",
"mega",
"google",
"drive",
]
):
# Check if it's a direct link or to a file host
if any(host in href.lower() for host in ['mega.nz', 'drive.google.com', 'uptobox.com', '1fichier.com']):
if any(
host in href.lower()
for host in [
"mega.nz",
"drive.google.com",
"uptobox.com",
"1fichier.com",
]
):
filename = self._generate_filename(final_url)
return href, filename
# Method 2: Look for iframe with video player
iframes = soup.find_all('iframe')
iframes = soup.find_all("iframe")
for iframe in iframes:
src = iframe.get('src', '')
if src and any(provider in src for provider in ['video', 'player', 'stream', 'play']):
if src.startswith('http'):
src = iframe.get("src", "")
if src and any(
provider in src
for provider in ["video", "player", "stream", "play"]
):
if src.startswith("http"):
filename = self._generate_filename(final_url)
return src, filename
# Method 3: Look for video tags
videos = soup.find_all('video')
videos = soup.find_all("video")
for video in videos:
src = video.get('src', '')
src = video.get("src", "")
if src:
filename = self._generate_filename(final_url)
return src, filename
# Check source tags
sources = video.find_all('source')
sources = video.find_all("source")
for source in sources:
src = source.get('src', '')
src = source.get("src", "")
if src:
filename = self._generate_filename(final_url)
return src, filename
# Method 4: Look in scripts for video URLs
scripts = soup.find_all('script')
scripts = soup.find_all("script")
for script in scripts:
if script.string:
# Look for common video patterns
@@ -91,26 +116,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
matches = re.findall(pattern, script.string)
for match in matches:
# Clean up escaped characters
match = match.replace('\\/', '/').replace('\\', '')
if any(ext in match for ext in ['mp4', 'm3u8', 'mkv']):
match = match.replace("\\/", "/").replace("\\", "")
if any(ext in match for ext in ["mp4", "m3u8", "mkv"]):
filename = self._generate_filename(final_url)
return match, filename
# Look for anime-ultime specific patterns
# They sometimes store links in JavaScript variables
ddl_match = re.search(r'ddl["\']?\s*:\s*["\']([^"\']+)["\']', script.string)
ddl_match = re.search(
r'ddl["\']?\s*:\s*["\']([^"\']+)["\']', script.string
)
if ddl_match:
ddl_url = ddl_match.group(1)
if ddl_url.startswith('http'):
if ddl_url.startswith("http"):
filename = self._generate_filename(final_url)
return ddl_url, filename
# Method 5: Look for links with specific classes or IDs
# Anime-Ultime might use specific class names for download links
potential_links = soup.find_all('a', class_=re.compile(r'download|ddl|episode', re.I))
potential_links = soup.find_all(
"a", class_=re.compile(r"download|ddl|episode", re.I)
)
for link in potential_links:
href = link.get('href', '')
if href and href.startswith('http'):
href = link.get("href", "")
if href and href.startswith("http"):
filename = self._generate_filename(final_url)
return href, filename
@@ -132,36 +161,38 @@ class AnimeUltimeDownloader(BaseAnimeSite):
episode = "01"
# Format: info-0-1/EPISODE_ID or info-0-1/EPISODE_ID/NAME-EP-vostfr
if 'info-0-1/' in url:
if "info-0-1/" in url:
# Extract episode ID
ep_match = re.search(r'info-0-1/(\d+)', url)
ep_match = re.search(r"info-0-1/(\d+)", url)
if ep_match:
ep_id = ep_match.group(1)
# Try to get anime name from URL path
name_match = re.search(r'info-0-1/\d+/([^/]+)', url)
name_match = re.search(r"info-0-1/\d+/([^/]+)", url)
if name_match:
raw_name = name_match.group(1)
# Extract episode number
ep_num_match = re.search(r'-(\d+)-vostfr$', raw_name, re.I)
ep_num_match = re.search(r"-(\d+)-vostfr$", raw_name, re.I)
if ep_num_match:
episode = ep_num_match.group(1).zfill(2)
# Remove episode number and suffix from name
anime_name = re.sub(r'-\d+-vostfr$', '', raw_name, flags=re.I).replace('-', ' ')
anime_name = re.sub(
r"-\d+-vostfr$", "", raw_name, flags=re.I
).replace("-", " ")
else:
# Just use the ID
anime_name = f"Episode {ep_id}"
else:
anime_name = f"Episode {ep_id}"
elif 'file-0-1/' in url:
elif "file-0-1/" in url:
# Extract from file-0-1/ID-NAME format
file_match = re.search(r'file-0-1/\d+-(.+)$', url)
file_match = re.search(r"file-0-1/\d+-(.+)$", url)
if file_match:
anime_name = file_match.group(1).replace('-', ' ')
anime_name = file_match.group(1).replace("-", " ")
# Sanitize filename
anime_name = anime_name.replace('/', ' ').strip()
anime_name = anime_name.replace("/", " ").strip()
filename = f"{anime_name} - Episode {episode}.mp4"
return filename.title()
@@ -173,30 +204,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
try:
print(f"[ANIME-ULTIME] Extracting metadata from: {anime_url}")
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
metadata = {
'synopsis': None,
'genres': [],
'rating': None,
'release_year': None,
'studio': None,
'poster_image': None,
'banner_image': None,
'total_episodes': None,
'status': None,
'alternative_titles': []
"synopsis": None,
"genres": [],
"rating": None,
"release_year": None,
"studio": None,
"poster_image": None,
"banner_image": None,
"total_episodes": None,
"status": None,
"alternative_titles": [],
}
# Extract synopsis
synopsis_selectors = [
'div.synopsis',
'div.description',
"div.synopsis",
"div.description",
'div[class*="synopsis"]',
'div[class*="synopsis"]',
'p.synopsis',
'.info',
'div.texte'
"p.synopsis",
".info",
"div.texte",
]
for selector in synopsis_selectors:
@@ -204,68 +235,73 @@ class AnimeUltimeDownloader(BaseAnimeSite):
if synopsis_elem:
synopsis = synopsis_elem.get_text(strip=True)
if len(synopsis) > 50:
metadata['synopsis'] = synopsis
metadata["synopsis"] = synopsis
break
# Extract genres from meta tags and page content
page_text = soup.get_text()
# Look for genre in meta tags
genre_meta = soup.find('meta', property='genre') or soup.find('meta', attrs={'name': 'genre'})
genre_meta = soup.find("meta", property="genre") or soup.find(
"meta", attrs={"name": "genre"}
)
if genre_meta:
genres_text = genre_meta.get('content', '')
genres_text = genre_meta.get("content", "")
if genres_text:
metadata['genres'] = [g.strip() for g in genres_text.split(',')]
metadata["genres"] = [g.strip() for g in genres_text.split(",")]
# Try to find genre links
genre_links = soup.find_all('a', href=re.compile(r'genre|tag|type|cat', re.I))
genre_links = soup.find_all(
"a", href=re.compile(r"genre|tag|type|cat", re.I)
)
if genre_links:
for link in genre_links[:5]:
genre = link.get_text(strip=True)
if genre and genre not in metadata['genres']:
metadata['genres'].append(genre)
if genre and genre not in metadata["genres"]:
metadata["genres"].append(genre)
# Extract rating
rating_selectors = [
'span.rating',
'div.rating',
'span.score',
'div.note',
'.rating'
"span.rating",
"div.rating",
"span.score",
"div.note",
".rating",
]
for selector in rating_selectors:
rating_elem = soup.select_one(selector)
if rating_elem:
rating_text = rating_elem.get_text(strip=True)
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*10', rating_text)
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*10", rating_text)
if rating_match:
metadata['rating'] = f"{rating_match.group(1)}/10"
metadata["rating"] = f"{rating_match.group(1)}/10"
break
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*5', rating_text)
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*5", rating_text)
if rating_match:
rating_val = float(rating_match.group(1)) * 2
metadata['rating'] = f"{rating_val:.1f}/10"
metadata["rating"] = f"{rating_val:.1f}/10"
break
# Extract release year
year_match = re.search(r'\b(19\d{2}|20\d{2})\b', page_text)
year_match = re.search(r"\b(19\d{2}|20\d{2})\b", page_text)
if year_match:
import datetime
current_year = datetime.datetime.now().year + 2
year = int(year_match.group(1))
if 1950 <= year <= current_year:
metadata['release_year'] = year
metadata["release_year"] = year
# Extract poster image from og:image
og_image = soup.find('meta', property='og:image')
og_image = soup.find("meta", property="og:image")
if og_image:
metadata['poster_image'] = og_image.get('content')
metadata["poster_image"] = og_image.get("content")
# Extract total episodes
episodes_count = len(await self.get_episodes(anime_url))
if episodes_count > 0:
metadata['total_episodes'] = episodes_count
metadata["total_episodes"] = episodes_count
print(f"[ANIME-ULTIME] Extracted metadata: {metadata}")
return metadata
@@ -274,7 +310,9 @@ class AnimeUltimeDownloader(BaseAnimeSite):
print(f"[ANIME-ULTIME] Error extracting metadata: {e}")
return {}
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
async def search_anime(
self, query: str, lang: str = "vostfr", include_metadata: bool = False
) -> list[dict]:
"""
Search for anime on anime-ultime
Returns list of anime with title, url, and cover image
@@ -286,27 +324,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
"""
try:
import time
start = time.time()
print(f"[ANIME-ULTIME] Searching for '{query}' ({lang})...")
# Anime-Ultime uses POST for search
search_url = "https://www.anime-ultime.net/search-0-1"
response = await self.client.post(search_url, data={'search': query})
soup = BeautifulSoup(response.text, 'lxml')
response = await self.client.post(search_url, data={"search": query})
soup = BeautifulSoup(response.text, "lxml")
elapsed = time.time() - start
print(f"[ANIME-ULTIME] Got response {response.status_code} in {elapsed:.2f}s")
print(
f"[ANIME-ULTIME] Got response {response.status_code} in {elapsed:.2f}s"
)
results = []
# Look for search result links - better parsing
# Search results use file-0-1/ pattern, not info-
search_results = soup.find_all('a', href=re.compile(r'file-0-1/'))
search_results = soup.find_all("a", href=re.compile(r"file-0-1/"))
seen_urls = set()
for result in search_results[:10]: # Limit to 10 results
href = result.get('href', '')
href = result.get("href", "")
raw_title = result.get_text().strip()
# Skip if no href
@@ -322,40 +363,44 @@ class AnimeUltimeDownloader(BaseAnimeSite):
better_title = raw_title
# If raw_title is just "Télécharger" or similar, try to find better title
if len(raw_title) < 5 or raw_title.lower() in ['télécharger', 'download', 'ddl']:
if len(raw_title) < 5 or raw_title.lower() in [
"télécharger",
"download",
"ddl",
]:
# Try to extract from URL (file-0-1/ID-Title format)
url_match = re.search(r'file-0-1/\d+-(.+)$', href)
url_match = re.search(r"file-0-1/\d+-(.+)$", href)
if url_match:
better_title = url_match.group(1).replace('-', ' ').title()
better_title = url_match.group(1).replace("-", " ").title()
# If still no good title, look at parent/row elements
if len(better_title) < 5:
# Check parent row (table structure)
row = result.find_parent(['tr', 'td', 'div'])
row = result.find_parent(["tr", "td", "div"])
if row:
# Look for text in the row that's not the link text
row_text = row.get_text().strip()
# Remove the link text from row text
if raw_title in row_text:
row_text = row_text.replace(raw_title, '').strip()
row_text = row_text.replace(raw_title, "").strip()
if len(row_text) > 5 and len(row_text) < 100:
better_title = row_text
# Make URL absolute
if not href.startswith('http'):
if not href.startswith("http"):
href = urljoin("https://www.anime-ultime.net/", href)
result_item = {
'title': better_title,
'url': href,
'type': 'search_result',
'metadata': None
"title": better_title,
"url": href,
"type": "search_result",
"metadata": None,
}
# Fetch metadata if requested
if include_metadata:
metadata = await self.get_anime_metadata(href)
result_item['metadata'] = metadata
result_item["metadata"] = metadata
results.append(result_item)
@@ -373,27 +418,27 @@ class AnimeUltimeDownloader(BaseAnimeSite):
"""
try:
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
episodes = []
# Look for episode links - anime-ultime uses info-XXXXX-Name-XX-vostfr format
# The URL pattern is info-0-1/ID-Anime-Name-XX-vostfr where XX is episode number
episode_links = soup.find_all('a', href=re.compile(r'info-0-1/\d+'))
episode_links = soup.find_all("a", href=re.compile(r"info-0-1/\d+"))
for link in episode_links:
href = link.get('href', '')
href = link.get("href", "")
text = link.get_text().strip()
# Extract episode number from URL pattern
# Matches: info-0-1/30200/Naruto-OAV-01-vostfr
match = re.search(r'-(\d+)-vostfr$', href, re.I)
match = re.search(r"-(\d+)-vostfr$", href, re.I)
if not match:
# Try other patterns
match = re.search(r'Episode[-\s]?(\d+)', href, re.I)
match = re.search(r"Episode[-\s]?(\d+)", href, re.I)
if not match:
# Try to extract from text
match = re.search(r'(\d+)', text)
match = re.search(r"(\d+)", text)
if match:
episode_num = match.group(1).zfill(2) # Pad with zero
@@ -401,32 +446,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
# Extract the episode ID from href and build correct URL
# href might be "info-0-1/30200" or "info-0-1/30200/..."
# We need: https://www.anime-ultime.net/info-0-1/30200
ep_id_match = re.search(r'info-0-1/(\d+)', href)
ep_id_match = re.search(r"info-0-1/(\d+)", href)
if ep_id_match:
ep_id = ep_id_match.group(1)
# Build the correct episode URL
episode_url = f"https://www.anime-ultime.net/info-0-1/{ep_id}"
else:
# Fallback to making URL absolute
if not href.startswith('http'):
if not href.startswith("http"):
href = urljoin(anime_url, href)
episode_url = href
episodes.append({
'episode': episode_num,
'url': episode_url,
'title': text
})
episodes.append(
{"episode": episode_num, "url": episode_url, "title": text}
)
# Remove duplicates and sort
seen = set()
unique_episodes = []
for ep in episodes:
if ep['episode'] not in seen:
seen.add(ep['episode'])
if ep["episode"] not in seen:
seen.add(ep["episode"])
unique_episodes.append(ep)
unique_episodes.sort(key=lambda x: int(x['episode']))
unique_episodes.sort(key=lambda x: int(x["episode"]))
return unique_episodes
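
Note on `get_episodes` above: the episode-number fallback chain and the dedupe-then-sort step can be read in isolation as the sketch below. It reuses the three regular expressions from the diff. The standalone helpers `extract_episode_number` and `dedupe_and_sort` are hypothetical and only illustrate the logic.

```python
import re
from typing import Optional


def extract_episode_number(href: str, text: str = "") -> Optional[str]:
    """Try the URL suffix, then an 'Episode N' URL pattern, then the link text."""
    match = (
        re.search(r"-(\d+)-vostfr$", href, re.I)
        or re.search(r"Episode[-\s]?(\d+)", href, re.I)
        or re.search(r"(\d+)", text)
    )
    # Pad to two digits, as the diff does with zfill(2).
    return match.group(1).zfill(2) if match else None


def dedupe_and_sort(episodes: list[dict]) -> list[dict]:
    """Keep the first entry per episode number, then sort numerically."""
    seen = set()
    unique = []
    for ep in episodes:
        if ep["episode"] not in seen:
            seen.add(ep["episode"])
            unique.append(ep)
    unique.sort(key=lambda ep: int(ep["episode"]))
    return unique
```

The `or` chain short-circuits, so a later pattern is tried only when the earlier ones fail, which mirrors the nested `if not match` checks in the diff.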
+90 -82
@@ -1,4 +1,5 @@
"""French-Manga.net anime streaming site downloader"""
from .base import BaseAnimeSite
from bs4 import BeautifulSoup
import re
@@ -17,11 +18,12 @@ class FrenchMangaDownloader(BaseAnimeSite):
"french-manga.net",
"w16.french-manga.net",
"w15.french-manga.net",
"www.french-manga.net"
"www.french-manga.net",
]
def __init__(self):
super().__init__()
self.id = "french-manga"
self.base_url = "https://w16.french-manga.net"
def can_handle(self, url: str) -> bool:
@@ -29,9 +31,7 @@ class FrenchMangaDownloader(BaseAnimeSite):
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
async def search_anime(
self,
query: str,
lang: str = "vostfr"
self, query: str, lang: str = "vostfr"
) -> List[Dict[str, str]]:
"""
Search for anime on French-Manga.
@@ -47,46 +47,50 @@ class FrenchMangaDownloader(BaseAnimeSite):
# French-Manga uses a search endpoint
search_url = f"{self.base_url}/index.php?do=search"
params = {
'do': 'search',
'subaction': 'search',
'story': query,
'x': '0',
'y': '0'
"do": "search",
"subaction": "search",
"story": query,
"x": "0",
"y": "0",
}
response = await self.client.post(search_url, data=params)
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
results = []
# Look for search results in article or story classes
for item in soup.find_all('article', class_=lambda x: x and 'story' in x.lower()):
title_elem = item.find(['h2', 'h3', 'h4'])
link_elem = item.find('a', href=True)
img_elem = item.find('img')
for item in soup.find_all(
"article", class_=lambda x: x and "story" in x.lower()
):
title_elem = item.find(["h2", "h3", "h4"])
link_elem = item.find("a", href=True)
img_elem = item.find("img")
if title_elem and link_elem:
title = title_elem.get_text(strip=True)
url = link_elem['href']
url = link_elem["href"]
# Ensure absolute URL
if url.startswith('/'):
if url.startswith("/"):
url = self.base_url + url
cover_image = ""
if img_elem and img_elem.get('src'):
cover_image = img_elem['src']
if cover_image.startswith('/'):
if img_elem and img_elem.get("src"):
cover_image = img_elem["src"]
if cover_image.startswith("/"):
cover_image = self.base_url + cover_image
results.append({
'title': title,
'url': url,
'cover_image': cover_image,
'lang': lang
})
results.append(
{
"title": title,
"url": url,
"cover_image": cover_image,
"lang": lang,
}
)
logger.info(f"Found {len(results)} anime results for query: {query}")
return results
@@ -96,9 +100,7 @@ class FrenchMangaDownloader(BaseAnimeSite):
return []
async def get_episodes(
self,
anime_url: str,
lang: str = "vostfr"
self, anime_url: str, lang: str = "vostfr"
) -> List[Dict[str, str]]:
"""
Get episode list for an anime.
@@ -115,34 +117,36 @@ class FrenchMangaDownloader(BaseAnimeSite):
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
episodes = []
# Look for episode links (typically in a list or table)
# French-Manga usually has episode links in <a> tags with episode numbers
for link in soup.find_all('a', href=True):
href = link['href']
for link in soup.find_all("a", href=True):
href = link["href"]
text = link.get_text(strip=True)
# Pattern: Episode links usually contain "episode" or numbers
if re.search(r'episode?\s*\d+', text.lower()):
episode_num = re.search(r'(\d+)', text)
if re.search(r"episode?\s*\d+", text.lower()):
episode_num = re.search(r"(\d+)", text)
if episode_num:
episode_number = int(episode_num.group(1))
# Ensure absolute URL
if href.startswith('/'):
if href.startswith("/"):
href = self.base_url + href
episodes.append({
'episode_number': episode_number,
'url': href,
'title': text,
'host': 'french-manga'
})
episodes.append(
{
"episode_number": episode_number,
"url": href,
"title": text,
"host": "french-manga",
}
)
# Sort by episode number
episodes.sort(key=lambda x: x['episode_number'])
episodes.sort(key=lambda x: x["episode_number"])
logger.info(f"Found {len(episodes)} episodes for {anime_url}")
return episodes
@@ -166,31 +170,33 @@ class FrenchMangaDownloader(BaseAnimeSite):
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
# Extract title
title = ""
title_elem = soup.find('h1') or soup.find('h2', class_='title')
title_elem = soup.find("h1") or soup.find("h2", class_="title")
if title_elem:
title = title_elem.get_text(strip=True)
# Extract synopsis
synopsis = ""
synopsis_elem = soup.find('div', class_=lambda x: x and 'story' in x.lower())
synopsis_elem = soup.find(
"div", class_=lambda x: x and "story" in x.lower()
)
if synopsis_elem:
synopsis = synopsis_elem.get_text(strip=True)
# Extract cover image
poster_image = ""
img_elem = soup.find('img', class_=lambda x: x and 'poster' in x.lower())
if img_elem and img_elem.get('src'):
poster_image = img_elem['src']
if poster_image.startswith('/'):
img_elem = soup.find("img", class_=lambda x: x and "poster" in x.lower())
if img_elem and img_elem.get("src"):
poster_image = img_elem["src"]
if poster_image.startswith("/"):
poster_image = self.base_url + poster_image
# Extract genres
genres = []
genre_links = soup.find_all('a', href=re.compile(r'/xfsearch/.*genre/'))
genre_links = soup.find_all("a", href=re.compile(r"/xfsearch/.*genre/"))
for link in genre_links[:10]: # Limit to 10 genres
genre = link.get_text(strip=True)
if genre:
@@ -198,36 +204,38 @@ class FrenchMangaDownloader(BaseAnimeSite):
# Extract rating (if available)
rating = ""
rating_elem = soup.find(['span', 'div'], class_=lambda x: x and 'rating' in x.lower())
rating_elem = soup.find(
["span", "div"], class_=lambda x: x and "rating" in x.lower()
)
if rating_elem:
rating = rating_elem.get_text(strip=True)
return {
'title': title,
'synopsis': synopsis,
'genres': genres,
'rating': rating,
'release_year': '',
'studio': '',
'poster_image': poster_image,
'total_episodes': len(await self.get_episodes(anime_url)),
'status': '',
'languages': ['vf', 'vostfr']
"title": title,
"synopsis": synopsis,
"genres": genres,
"rating": rating,
"release_year": "",
"studio": "",
"poster_image": poster_image,
"total_episodes": len(await self.get_episodes(anime_url)),
"status": "",
"languages": ["vf", "vostfr"],
}
except Exception as e:
logger.error(f"Error getting anime metadata: {e}")
return {
'title': '',
'synopsis': '',
'genres': [],
'rating': '',
'release_year': '',
'studio': '',
'poster_image': '',
'total_episodes': 0,
'status': '',
'languages': ['vf', 'vostfr']
"title": "",
"synopsis": "",
"genres": [],
"rating": "",
"release_year": "",
"studio": "",
"poster_image": "",
"total_episodes": 0,
"status": "",
"languages": ["vf", "vostfr"],
}
async def get_download_link(self, url: str) -> tuple[str, str]:
@@ -248,20 +256,20 @@ class FrenchMangaDownloader(BaseAnimeSite):
response.raise_for_status()
html = response.text
soup = BeautifulSoup(html, 'lxml')
soup = BeautifulSoup(html, "lxml")
# Look for iframe or video player
iframe = soup.find('iframe', src=True)
iframe = soup.find("iframe", src=True)
if iframe:
video_url = iframe['src']
video_url = iframe["src"]
else:
# Look for video tag directly
video = soup.find('video', src=True)
video = soup.find("video", src=True)
if video:
video_url = video['src']
video_url = video["src"]
else:
# Try to find in script tags
scripts = soup.find_all('script')
scripts = soup.find_all("script")
for script in scripts:
if script.string:
# Look for iframe or video URLs in JavaScript
@@ -274,20 +282,20 @@ class FrenchMangaDownloader(BaseAnimeSite):
if match:
video_url = match.group(1)
break
if 'video_url' in locals():
if "video_url" in locals():
break
if 'video_url' not in locals():
if "video_url" not in locals():
raise ValueError("Could not find video player URL")
# Ensure absolute URL
if video_url.startswith('//'):
video_url = 'https:' + video_url
elif video_url.startswith('/'):
if video_url.startswith("//"):
video_url = "https:" + video_url
elif video_url.startswith("/"):
video_url = self.base_url + video_url
# Extract episode title
title_elem = soup.find('h1') or soup.find('h2')
title_elem = soup.find("h1") or soup.find("h2")
episode_title = title_elem.get_text(strip=True) if title_elem else "Episode"
episode_title = sanitize_filename(episode_title)
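
Note on the "Ensure absolute URL" step above: the two prefix checks can be factored into one small helper, sketched below. `absolutize` is a hypothetical name. The default scheme of `https` follows the diff, and the example base URL is the `base_url` value set in `__init__`.

```python
def absolutize(url: str, base_url: str) -> str:
    """Turn a protocol-relative or root-relative URL into an absolute one."""
    if url.startswith("//"):
        # Protocol-relative: assume https, as the diff does.
        return "https:" + url
    if url.startswith("/"):
        # Root-relative: prefix the site base URL (no trailing slash expected).
        return base_url + url
    # Already absolute (or a form this sketch does not handle): return unchanged.
    return url
```

For these two cases the standard library's `urllib.parse.urljoin(base_url, url)` gives the same result when `base_url` is an `https` URL without a path, so it may be a reasonable substitute. Whether it fits every URL shape the site emits is not established by this diff.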
+119 -87
@@ -7,79 +7,100 @@ from urllib.parse import urljoin
class NekoSamaDownloader(BaseAnimeSite):
"""Downloader for neko-sama.org (anime streaming via Gupy)
NOTE: neko-sama.org now redirects to Gupy, which is a legal streaming search engine.
It does NOT host video content - it provides metadata about where to watch legally.
This provider can search and get metadata but cannot provide direct download links.
"""
BASE_DOMAINS = ["neko-sama.org", "www.neko-sama.org", "neko-sama.fr", "nekosama.fr", "www.gupy.fr", "gupy.fr"]
BASE_DOMAINS = [
"neko-sama.org",
"www.neko-sama.org",
"neko-sama.fr",
"nekosama.fr",
"www.gupy.fr",
"gupy.fr",
]
def __init__(self):
super().__init__()
self.id = "neko-sama"
def can_handle(self, url: str) -> bool:
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
async def get_download_link(self, url: str, target_filename: Optional[str] = None) -> tuple[str, str]:
async def get_download_link(
self, url: str, target_filename: Optional[str] = None
) -> tuple[str, str]:
"""
Extract download link from neko-sama URL.
NOTE: neko-sama.org/Gupy is a legal streaming search engine, NOT a video host.
This returns streaming platform information instead of direct video links.
"""
try:
# Check if this is a Gupy URL
if 'gupy.fr' in url or 'neko-sama.org' in url:
if "gupy.fr" in url or "neko-sama.org" in url:
response = await self.client.get(url, follow_redirects=True)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
# Look for streaming platform links
streaming_links = []
for link in soup.find_all('a', href=True):
href = link.get('href', '')
if '/out/' in href:
for link in soup.find_all("a", href=True):
href = link.get("href", "")
if "/out/" in href:
text = link.get_text(strip=True)
if text and 'Regarder' in text:
if text and "Regarder" in text:
streaming_links.append(f"{text}: {href}")
if streaming_links:
title_elem = soup.find('h1') or soup.find('title')
title = title_elem.get_text(strip=True).split('|')[0].strip() if title_elem else "Unknown"
info = "Available streaming platforms:\n" + "\n".join(streaming_links[:5])
title_elem = soup.find("h1") or soup.find("title")
title = (
title_elem.get_text(strip=True).split("|")[0].strip()
if title_elem
else "Unknown"
)
info = "Available streaming platforms:\n" + "\n".join(
streaming_links[:5]
)
filename = target_filename or f"{title}_streaming_info.txt"
return info, filename
raise Exception("No streaming links found - Gupy is a legal streaming search, not a video host")
raise Exception(
"No streaming links found - Gupy is a legal streaming search, not a video host"
)
# Legacy: try original method for other URLs
response = await self.client.get(url, follow_redirects=True)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
# Method 1: Look for iframes with video
iframes = soup.find_all('iframe')
iframes = soup.find_all("iframe")
for iframe in iframes:
src = iframe.get('src', '')
if src and any(p in src for p in ['video', 'player', 'stream']):
if not src.startswith('http'):
src = iframe.get("src", "")
if src and any(p in src for p in ["video", "player", "stream"]):
if not src.startswith("http"):
src = urljoin(str(response.url), src)
filename = self._generate_filename(str(response.url))
return src, filename
# Method 2: Look for video tags
videos = soup.find_all('video')
videos = soup.find_all("video")
for video in videos:
src = video.get('src') or video.get('data-src')
src = video.get("src") or video.get("data-src")
if src:
filename = self._generate_filename(str(response.url))
return src, filename
sources = video.find_all('source')
sources = video.find_all("source")
for source in sources:
src = source.get('src', '')
src = source.get("src", "")
if src:
filename = self._generate_filename(str(response.url))
return src, filename
# Method 3: Look in scripts
scripts = soup.find_all('script')
scripts = soup.find_all("script")
for script in scripts:
if script.string:
patterns = [
@@ -90,24 +111,26 @@ class NekoSamaDownloader(BaseAnimeSite):
for pattern in patterns:
matches = re.findall(pattern, script.string)
for match in matches:
match = match.replace('\\/', '/')
if any(ext in match for ext in ['mp4', 'm3u8']):
match = match.replace("\\/", "/")
if any(ext in match for ext in ["mp4", "m3u8"]):
filename = self._generate_filename(str(response.url))
return match, filename
raise Exception("Could not find video link - Neko-Sama/Gupy does not host video content")
raise Exception(
"Could not find video link - Neko-Sama/Gupy does not host video content"
)
except Exception as e:
raise Exception(f"Error extracting NekoSama link: {str(e)}")
def _generate_filename(self, url: str) -> str:
parts = url.split('/')
parts = url.split("/")
anime_name = "anime"
episode = "1"
for i, part in enumerate(parts):
if 'episode' in part.lower():
match = re.search(r'episode[-\s]*(\d+)', part, re.I)
if "episode" in part.lower():
match = re.search(r"episode[-\s]*(\d+)", part, re.I)
if match:
episode = match.group(1)
@@ -118,31 +141,31 @@ class NekoSamaDownloader(BaseAnimeSite):
"""Get list of episodes for an anime."""
try:
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
episodes = []
# Try to find episode links
episode_links = soup.find_all('a', href=re.compile(r'episode'))
episode_links = soup.find_all("a", href=re.compile(r"episode"))
for link in episode_links:
href = link.get('href', '')
match = re.search(r'episode[-\s]*(\d+)', href, re.I)
href = link.get("href", "")
match = re.search(r"episode[-\s]*(\d+)", href, re.I)
if match:
episode_num = match.group(1)
if not href.startswith('http'):
if not href.startswith("http"):
href = urljoin(anime_url, href)
episodes.append({'episode': episode_num, 'url': href})
episodes.append({"episode": episode_num, "url": href})
# Deduplicate and sort
seen = set()
unique_episodes = []
for ep in episodes:
if ep['episode'] not in seen:
seen.add(ep['episode'])
if ep["episode"] not in seen:
seen.add(ep["episode"])
unique_episodes.append(ep)
unique_episodes.sort(key=lambda x: int(x['episode']))
unique_episodes.sort(key=lambda x: int(x["episode"]))
return unique_episodes
except Exception as e:
@@ -153,70 +176,70 @@ class NekoSamaDownloader(BaseAnimeSite):
try:
print(f"[NEKO-SAMA] Extracting metadata from: {anime_url}")
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
metadata = {
'synopsis': None,
'genres': [],
'rating': None,
'release_year': None,
'studio': None,
'poster_image': None,
'banner_image': None,
'total_episodes': None,
'status': None,
'alternative_titles': []
"synopsis": None,
"genres": [],
"rating": None,
"release_year": None,
"studio": None,
"poster_image": None,
"banner_image": None,
"total_episodes": None,
"status": None,
"alternative_titles": [],
}
# Extract title and year from h1
title_elem = soup.find('h1')
title_elem = soup.find("h1")
if title_elem:
title_text = title_elem.get_text(strip=True)
# Extract year from title like "Naruto (2002)"
year_match = re.search(r'\((\d{4})\)', title_text)
year_match = re.search(r"\((\d{4})\)", title_text)
if year_match:
metadata['release_year'] = int(year_match.group(1))
metadata["release_year"] = int(year_match.group(1))
# Extract synopsis - Gupy shows it as paragraphs
synopsis_elem = soup.find('p')
synopsis_elem = soup.find("p")
if synopsis_elem:
text = synopsis_elem.get_text(strip=True)
if len(text) > 50:
metadata['synopsis'] = text
metadata["synopsis"] = text
# Extract genres from meta tags or links
genre_links = soup.find_all('a', href=re.compile(r'serie-|genre|tag'))
genre_links = soup.find_all("a", href=re.compile(r"serie-|genre|tag"))
if genre_links:
genres = []
for link in genre_links[:5]:
text = link.get_text(strip=True)
if text and '/' not in text and len(text) < 30:
if text and "/" not in text and len(text) < 30:
genres.append(text)
metadata['genres'] = genres
metadata["genres"] = genres
# Extract rating from percentage
rating_elem = soup.find(string=re.compile(r'\d+(\.\d+)?%'))
rating_elem = soup.find(string=re.compile(r"\d+(\.\d+)?%"))
if rating_elem:
match = re.search(r'(\d+(\.\d+)?)%', rating_elem)
match = re.search(r"(\d+(\.\d+)?)%", rating_elem)
if match:
rating = float(match.group(1)) / 10
metadata['rating'] = f"{rating:.1f}/10"
metadata["rating"] = f"{rating:.1f}/10"
# Extract poster image
poster_elem = soup.find('img', src=re.compile(r'poster|poster'))
poster_elem = soup.find("img", src=re.compile(r"poster"))
if poster_elem:
metadata['poster_image'] = poster_elem.get('src')
metadata["poster_image"] = poster_elem.get("src")
# Extract episode count from page text
page_text = soup.get_text()
ep_match = re.search(r'(\d+)\s*episodes?', page_text, re.I)
ep_match = re.search(r"(\d+)\s*episodes?", page_text, re.I)
if ep_match:
metadata['total_episodes'] = int(ep_match.group(1))
metadata["total_episodes"] = int(ep_match.group(1))
# Extract studio/director
director_elem = soup.find('a', href=re.compile(r'person|réalisé'))
director_elem = soup.find("a", href=re.compile(r"person|réalisé"))
if director_elem:
metadata['studio'] = director_elem.get_text(strip=True)
metadata["studio"] = director_elem.get_text(strip=True)
print(f"[NEKO-SAMA] Extracted metadata: {metadata}")
return metadata
@@ -225,16 +248,19 @@ class NekoSamaDownloader(BaseAnimeSite):
print(f"[NEKO-SAMA] Error extracting metadata: {e}")
return {}
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
async def search_anime(
self, query: str, lang: str = "vostfr", include_metadata: bool = False
) -> list[dict]:
"""Search for anime on neko-sama (uses Gupy backend)."""
try:
import time
from html import unescape
start = time.time()
print(f"[NEKO-SAMA] Searching for '{query}' ({lang})...")
# Neko-Sama now uses Gupy - try the direct URL pattern
search_slug = query.lower().replace(' ', '-')
search_slug = query.lower().replace(" ", "-")
search_urls = [
f"https://www.gupy.fr/series/{search_slug}/",
f"https://neko-sama.org/series/{search_slug}/",
@@ -250,34 +276,40 @@ class NekoSamaDownloader(BaseAnimeSite):
print(f"[NEKO-SAMA] Found anime at {final_url}")
# Extract title from page
soup = BeautifulSoup(response.text, 'lxml')
title_elem = soup.find('h1') or soup.find('title')
title = unescape(title_elem.get_text(strip=True)) if title_elem else query
soup = BeautifulSoup(response.text, "lxml")
title_elem = soup.find("h1") or soup.find("title")
title = (
unescape(title_elem.get_text(strip=True))
if title_elem
else query
)
# Clean up title
title = title.split('|')[0].split('-')[0].strip()
title = title.split("|")[0].split("-")[0].strip()
result = {
'title': title,
'url': final_url,
'cover_image': None,
'type': 'direct',
'metadata': None
"title": title,
"url": final_url,
"cover_image": None,
"type": "direct",
"metadata": None,
}
# Try to get poster
poster = soup.find('img', src=re.compile(r'poster'))
poster = soup.find("img", src=re.compile(r"poster"))
if poster:
result['cover_image'] = poster.get('src')
result["cover_image"] = poster.get("src")
if include_metadata:
metadata = await self.get_anime_metadata(final_url)
result['metadata'] = metadata
result["metadata"] = metadata
results.append(result)
break
elapsed = time.time() - start
print(f"[NEKO-SAMA] Search completed in {elapsed:.2f}s, found {len(results)} results")
print(
f"[NEKO-SAMA] Search completed in {elapsed:.2f}s, found {len(results)} results"
)
return results
except Exception as e:
+78 -63
@@ -9,6 +9,10 @@ class VostfreeDownloader(BaseAnimeSite):
BASE_DOMAINS = ["vostfree.tv", "www.vostfree.tv"]
def __init__(self):
super().__init__()
self.id = "vostfree"
def can_handle(self, url: str) -> bool:
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
@@ -16,35 +20,35 @@ class VostfreeDownloader(BaseAnimeSite):
"""Extract download link from vostfree URL"""
try:
response = await self.client.get(url, follow_redirects=True)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
# Method 1: Look for iframe players
iframes = soup.find_all('iframe')
iframes = soup.find_all("iframe")
for iframe in iframes:
src = iframe.get('src', '')
if src and any(p in src for p in ['player', 'video', 'stream']):
if not src.startswith('http'):
src = iframe.get("src", "")
if src and any(p in src for p in ["player", "video", "stream"]):
if not src.startswith("http"):
src = urljoin(str(response.url), src)
filename = self._generate_filename(str(response.url))
return src, filename
# Method 2: Look for video tags
videos = soup.find_all('video')
videos = soup.find_all("video")
for video in videos:
src = video.get('src')
src = video.get("src")
if src:
filename = self._generate_filename(str(response.url))
return src, filename
sources = video.find_all('source')
sources = video.find_all("source")
for source in sources:
src = source.get('src', '')
if src and any(ext in src for ext in ['mp4', 'm3u8']):
src = source.get("src", "")
if src and any(ext in src for ext in ["mp4", "m3u8"]):
filename = self._generate_filename(str(response.url))
return src, filename
# Method 3: Look in scripts
scripts = soup.find_all('script')
scripts = soup.find_all("script")
for script in scripts:
if script.string:
patterns = [
@@ -56,8 +60,8 @@ class VostfreeDownloader(BaseAnimeSite):
for pattern in patterns:
matches = re.findall(pattern, script.string)
for match in matches:
match = match.replace('\\/', '/')
if any(ext in match for ext in ['mp4', 'm3u8']):
match = match.replace("\\/", "/")
if any(ext in match for ext in ["mp4", "m3u8"]):
filename = self._generate_filename(str(response.url))
return match, filename
@@ -67,12 +71,12 @@ class VostfreeDownloader(BaseAnimeSite):
raise Exception(f"Error extracting Vostfree link: {str(e)}")
def _generate_filename(self, url: str) -> str:
parts = url.split('/')
parts = url.split("/")
anime_name = "anime"
episode = "1"
for part in parts:
match = re.search(r'episode[-\s]*(\d+)', part, re.I)
match = re.search(r"episode[-\s]*(\d+)", part, re.I)
if match:
episode = match.group(1)
@@ -82,30 +86,30 @@ class VostfreeDownloader(BaseAnimeSite):
async def get_episodes(self, anime_url: str, lang: str = "vostfr") -> list[dict]:
try:
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
episodes = []
episode_links = soup.find_all('a', href=re.compile(r'episode', re.I))
episode_links = soup.find_all("a", href=re.compile(r"episode", re.I))
for link in episode_links:
href = link.get('href', '')
match = re.search(r'episode[-\s]*(\d+)', href, re.I)
href = link.get("href", "")
match = re.search(r"episode[-\s]*(\d+)", href, re.I)
if match:
episode_num = match.group(1)
if not href.startswith('http'):
if not href.startswith("http"):
href = urljoin(anime_url, href)
episodes.append({'episode': episode_num, 'url': href})
episodes.append({"episode": episode_num, "url": href})
# Deduplicate and sort
seen = set()
unique_episodes = []
for ep in episodes:
if ep['episode'] not in seen:
seen.add(ep['episode'])
if ep["episode"] not in seen:
seen.add(ep["episode"])
unique_episodes.append(ep)
unique_episodes.sort(key=lambda x: int(x['episode']))
unique_episodes.sort(key=lambda x: int(x["episode"]))
return unique_episodes
except Exception as e:
@@ -119,29 +123,29 @@ class VostfreeDownloader(BaseAnimeSite):
try:
print(f"[VOSTFREE] Extracting metadata from: {anime_url}")
response = await self.client.get(anime_url)
soup = BeautifulSoup(response.text, 'lxml')
soup = BeautifulSoup(response.text, "lxml")
metadata = {
'synopsis': None,
'genres': [],
'rating': None,
'release_year': None,
'studio': None,
'poster_image': None,
'banner_image': None,
'total_episodes': None,
'status': None,
'alternative_titles': []
"synopsis": None,
"genres": [],
"rating": None,
"release_year": None,
"studio": None,
"poster_image": None,
"banner_image": None,
"total_episodes": None,
"status": None,
"alternative_titles": [],
}
# Extract synopsis
synopsis_selectors = [
'div.synopsis',
'div.description',
"div.synopsis",
"div.description",
'div[class*="synopsis"]',
'div[class*="desc"]',
'p.synopsis',
'.anime-synopsis'
"p.synopsis",
".anime-synopsis",
]
for selector in synopsis_selectors:
@@ -149,57 +153,65 @@ class VostfreeDownloader(BaseAnimeSite):
if synopsis_elem:
synopsis = synopsis_elem.get_text(strip=True)
if len(synopsis) > 50:
metadata['synopsis'] = synopsis
metadata["synopsis"] = synopsis
break
# Extract genres
genre_links = soup.find_all('a', href=re.compile(r'genre|tag|type', re.I))
genre_links = soup.find_all("a", href=re.compile(r"genre|tag|type", re.I))
if genre_links:
metadata['genres'] = [link.get_text(strip=True) for link in genre_links[:5]]
metadata["genres"] = [
link.get_text(strip=True) for link in genre_links[:5]
]
# Extract rating
rating_selectors = [
'span.rating',
'div.rating',
'span.score',
"span.rating",
"div.rating",
"span.score",
'div[class*="rating"]',
'div[class*="score"]'
'div[class*="score"]',
]
for selector in rating_selectors:
rating_elem = soup.select_one(selector)
if rating_elem:
rating_text = rating_elem.get_text(strip=True)
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*10', rating_text)
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*10", rating_text)
if rating_match:
metadata['rating'] = f"{rating_match.group(1)}/10"
metadata["rating"] = f"{rating_match.group(1)}/10"
break
# Extract release year
page_text = soup.get_text()
year_matches = re.findall(r'\b(19\d{2}|20\d{2})\b', page_text)
year_matches = re.findall(r"\b(19\d{2}|20\d{2})\b", page_text)
if year_matches:
import datetime
current_year = datetime.datetime.now().year + 2
valid_years = [int(y) for y in year_matches if 1950 <= int(y) <= current_year]
valid_years = [
int(y) for y in year_matches if 1950 <= int(y) <= current_year
]
if valid_years:
from collections import Counter
metadata['release_year'] = Counter(valid_years).most_common(1)[0][0]
metadata["release_year"] = Counter(valid_years).most_common(1)[0][0]
# Extract poster image
poster_elem = soup.select_one('img.poster, img.cover, .anime-poster img')
poster_elem = soup.select_one("img.poster, img.cover, .anime-poster img")
if poster_elem:
metadata['poster_image'] = poster_elem.get('src') or poster_elem.get('data-src')
metadata["poster_image"] = poster_elem.get("src") or poster_elem.get(
"data-src"
)
# Extract poster from og:image
og_image = soup.find('meta', property='og:image')
if og_image and not metadata['poster_image']:
metadata['poster_image'] = og_image.get('content')
og_image = soup.find("meta", property="og:image")
if og_image and not metadata["poster_image"]:
metadata["poster_image"] = og_image.get("content")
# Extract total episodes
episodes_count = len(await self.get_episodes(anime_url))
if episodes_count > 0:
metadata['total_episodes'] = episodes_count
metadata["total_episodes"] = episodes_count
print(f"[VOSTFREE] Extracted metadata: {metadata}")
return metadata
@@ -208,7 +220,9 @@ class VostfreeDownloader(BaseAnimeSite):
print(f"[VOSTFREE] Error extracting metadata: {e}")
return {}
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
async def search_anime(
self, query: str, lang: str = "vostfr", include_metadata: bool = False
) -> list[dict]:
"""
Search for anime on vostfree
@@ -219,6 +233,7 @@ class VostfreeDownloader(BaseAnimeSite):
"""
try:
import time
start = time.time()
print(f"[VOSTFREE] Searching for '{query}' ({lang})...")
@@ -233,15 +248,15 @@ class VostfreeDownloader(BaseAnimeSite):
if response.status_code == 200:
print(f"[VOSTFREE] Found anime at {str(response.url)}")
result = {
'title': query,
'url': str(response.url),
'type': 'direct',
'metadata': None
"title": query,
"url": str(response.url),
"type": "direct",
"metadata": None,
}
if include_metadata:
metadata = await self.get_anime_metadata(str(response.url))
result['metadata'] = metadata
result["metadata"] = metadata
return [result]