feat: fix auth, provider health checks, search, and redesign UI
- Fix register/login: dict-style access on UserTable ORM objects - Fix HTMX auth: inject JWT token in all HTMX request headers - Fix FS7 search: use DLE AJAX endpoint /engine/ajax/search.php - Fix ZT search: use ?p=series&search=QUERY (not DLE format) - Fix provider health: load hardcoded providers + domain manager - Add self.id to all anime/series providers - Redesign homepage: Netflix-style horizontal scroll cards (.hc) - Redesign search results: grouped by title, poster + synopsis + 3 buttons - Add Télécharger dropdown: season download + episode picker - Fix navbar CSS: restore .tabs flex layout, remove orphan rules - Fix HTMX spinner: remove inline display:none, use CSS indicator - Add AGENTS.md files across project for developer documentation
This commit is contained in:
@@ -1,4 +1,4 @@
|
||||
# Anime Sites Downloaders
|
||||
# Anime Sites (app/downloaders/anime_sites/)
|
||||
|
||||
## OVERVIEW
|
||||
Handlers for French anime streaming catalogs that provide metadata and episode listings, delegating actual video extraction to video player handlers.
|
||||
@@ -7,8 +7,8 @@ Handlers for French anime streaming catalogs that provide metadata and episode l
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `base.py` | Abstract `BaseAnimeSite` class defining the interface all anime sites implement |
|
||||
| `animesama.py` | Primary provider with dynamic domain switching, multiple video player extraction |
|
||||
| `base.py` | Abstract `BaseAnimeSite` class defining the interface |
|
||||
| `animesama.py` | Primary provider — dynamic domain switching, multiple video player extraction |
|
||||
| `nekosama.py` | Neko-Sama / Gupy integration (metadata-only, no direct downloads) |
|
||||
| `animeultime.py` | Anime-Ultime catalog handler |
|
||||
| `vostfree.py` | Vostfree catalog handler |
|
||||
@@ -16,26 +16,26 @@ Handlers for French anime streaming catalogs that provide metadata and episode l
|
||||
|
||||
## CONVENTIONS
|
||||
|
||||
### Interface Contract
|
||||
Each site must implement four async methods from `BaseAnimeSite`:
|
||||
- `can_handle(url: str) -> bool` — URL pattern matching
|
||||
- `search_anime(query, lang) -> list[dict]` — Returns `{title, url, cover_image}`
|
||||
- `get_episodes(anime_url, lang) -> list[dict]` — Returns `{episode_number, url, title, host}`
|
||||
- `get_anime_metadata(anime_url) -> dict` — Returns `{synopsis, genres, rating, release_year, studio, poster_image, total_episodes, status}`
|
||||
- `get_download_link(url) -> tuple[str, str]` — Returns `(video_player_url, filename)`
|
||||
**Interface contract** — each site implements from `BaseAnimeSite`:
|
||||
- `can_handle(url)` — URL pattern matching
|
||||
- `search_anime(query, lang)` → `[{title, url, cover_image}]`
|
||||
- `get_episodes(anime_url, lang)` → `[{episode_number, url, title, host}]`
|
||||
- `get_anime_metadata(anime_url)` → `{synopsis, genres, rating, release_year, studio, poster_image, total_episodes, status}`
|
||||
- `get_download_link(url)` → `(video_player_url, filename)`
|
||||
|
||||
### Key Patterns
|
||||
- **Pipe-separated URLs**: `video_url|anime_page_url|episode_title` — preserves context across extraction
|
||||
- **Language parameter**: `lang="vostfr"` or `"vf"` — controls which episodes to return
|
||||
- **Video player delegation**: Anime sites return player URLs (vidmoly, sendvid, sibnet, lpayer), not direct downloads
|
||||
- **Filename generation**: `{anime_name} - S{season} - {episode}.mp4` format
|
||||
- **HTTP headers**: Browser UA and referer required to avoid blocking
|
||||
**Key patterns**:
|
||||
- Pipe-separated URLs: `video_url|anime_page_url|episode_title`
|
||||
- Language param: `lang="vostfr"` or `"vf"`
|
||||
- Video player delegation: returns player URLs (vidmoly, sendvid, etc.), NOT direct downloads
|
||||
- Filename format: `{anime_name} - S{season} - {episode}.mp4`
|
||||
- Browser UA + referer headers required
|
||||
|
||||
### Domain Detection
|
||||
- `AnimeSamaDownloader` fetches current domain from `anime-sama.pw` dynamically
|
||||
- Uses fallback chain for video extraction: detected player → cached player → priority list
|
||||
**Domain detection**: `AnimeSamaDownloader` fetches current domain from `anime-sama.pw` dynamically. Uses fallback chain for video extraction.
|
||||
|
||||
### Error Handling
|
||||
- Raise `Exception` with descriptive message on failure
|
||||
- Log at appropriate level (`debug` for expected failures, `error` for unexpected)
|
||||
- Validate extracted URLs with `_test_video_url()` before returning
|
||||
**Error handling**: Raise `Exception` with descriptive message. Log at `debug` for expected failures, `error` for unexpected. Validate URLs with `_test_video_url()` before returning.
|
||||
|
||||
## ANTI-PATTERNS
|
||||
|
||||
- Do NOT return direct download URLs from anime sites — return player URLs
|
||||
- Do NOT skip URL validation — use `_test_video_url()`
|
||||
- 5 empty `except:` blocks in `animesama.py` — known tech debt, silently swallow failures
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -10,6 +10,10 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
|
||||
BASE_DOMAINS = ["anime-ultime.com", "anime-ultime.net", "www.anime-ultime.net"]
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "anime-ultime"
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
|
||||
|
||||
@@ -24,58 +28,79 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
final_url = str(response.url)
|
||||
|
||||
# Parse the page
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
# Method 0: Look for og:video meta tag (most reliable for anime-ultime)
|
||||
og_video = soup.find('meta', property='og:video')
|
||||
if og_video and og_video.get('content'):
|
||||
video_url = og_video['content']
|
||||
if video_url.endswith('.mp4'):
|
||||
og_video = soup.find("meta", property="og:video")
|
||||
if og_video and og_video.get("content"):
|
||||
video_url = og_video["content"]
|
||||
if video_url.endswith(".mp4"):
|
||||
filename = self._generate_filename(final_url)
|
||||
print(f"[ANIME-ULTIME] Found og:video link: {video_url}")
|
||||
return video_url, filename
|
||||
|
||||
# Method 1: Look for direct download links (DDL)
|
||||
# Anime-Ultime often uses links to file hosts
|
||||
download_links = soup.find_all('a', href=True)
|
||||
download_links = soup.find_all("a", href=True)
|
||||
for link in download_links:
|
||||
href = link['href']
|
||||
href = link["href"]
|
||||
text = link.get_text().lower()
|
||||
|
||||
# Look for download buttons/links
|
||||
if any(keyword in text for keyword in ['télécharger', 'download', 'ddl', 'mega', 'google', 'drive']):
|
||||
if any(
|
||||
keyword in text
|
||||
for keyword in [
|
||||
"télécharger",
|
||||
"download",
|
||||
"ddl",
|
||||
"mega",
|
||||
"google",
|
||||
"drive",
|
||||
]
|
||||
):
|
||||
# Check if it's a direct link or to a file host
|
||||
if any(host in href.lower() for host in ['mega.nz', 'drive.google.com', 'uptobox.com', '1fichier.com']):
|
||||
if any(
|
||||
host in href.lower()
|
||||
for host in [
|
||||
"mega.nz",
|
||||
"drive.google.com",
|
||||
"uptobox.com",
|
||||
"1fichier.com",
|
||||
]
|
||||
):
|
||||
filename = self._generate_filename(final_url)
|
||||
return href, filename
|
||||
|
||||
# Method 2: Look for iframe with video player
|
||||
iframes = soup.find_all('iframe')
|
||||
iframes = soup.find_all("iframe")
|
||||
for iframe in iframes:
|
||||
src = iframe.get('src', '')
|
||||
if src and any(provider in src for provider in ['video', 'player', 'stream', 'play']):
|
||||
if src.startswith('http'):
|
||||
src = iframe.get("src", "")
|
||||
if src and any(
|
||||
provider in src
|
||||
for provider in ["video", "player", "stream", "play"]
|
||||
):
|
||||
if src.startswith("http"):
|
||||
filename = self._generate_filename(final_url)
|
||||
return src, filename
|
||||
|
||||
# Method 3: Look for video tags
|
||||
videos = soup.find_all('video')
|
||||
videos = soup.find_all("video")
|
||||
for video in videos:
|
||||
src = video.get('src', '')
|
||||
src = video.get("src", "")
|
||||
if src:
|
||||
filename = self._generate_filename(final_url)
|
||||
return src, filename
|
||||
|
||||
# Check source tags
|
||||
sources = video.find_all('source')
|
||||
sources = video.find_all("source")
|
||||
for source in sources:
|
||||
src = source.get('src', '')
|
||||
src = source.get("src", "")
|
||||
if src:
|
||||
filename = self._generate_filename(final_url)
|
||||
return src, filename
|
||||
|
||||
# Method 4: Look in scripts for video URLs
|
||||
scripts = soup.find_all('script')
|
||||
scripts = soup.find_all("script")
|
||||
for script in scripts:
|
||||
if script.string:
|
||||
# Look for common video patterns
|
||||
@@ -91,26 +116,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
matches = re.findall(pattern, script.string)
|
||||
for match in matches:
|
||||
# Clean up escaped characters
|
||||
match = match.replace('\\/', '/').replace('\\', '')
|
||||
if any(ext in match for ext in ['mp4', 'm3u8', 'mkv']):
|
||||
match = match.replace("\\/", "/").replace("\\", "")
|
||||
if any(ext in match for ext in ["mp4", "m3u8", "mkv"]):
|
||||
filename = self._generate_filename(final_url)
|
||||
return match, filename
|
||||
|
||||
# Look for anime-ultime specific patterns
|
||||
# They sometimes store links in JavaScript variables
|
||||
ddl_match = re.search(r'ddl["\']?\s*:\s*["\']([^"\']+)["\']', script.string)
|
||||
ddl_match = re.search(
|
||||
r'ddl["\']?\s*:\s*["\']([^"\']+)["\']', script.string
|
||||
)
|
||||
if ddl_match:
|
||||
ddl_url = ddl_match.group(1)
|
||||
if ddl_url.startswith('http'):
|
||||
if ddl_url.startswith("http"):
|
||||
filename = self._generate_filename(final_url)
|
||||
return ddl_url, filename
|
||||
|
||||
# Method 5: Look for links with specific classes or IDs
|
||||
# Anime-Ultime might use specific class names for download links
|
||||
potential_links = soup.find_all('a', class_=re.compile(r'download|ddl|episode', re.I))
|
||||
potential_links = soup.find_all(
|
||||
"a", class_=re.compile(r"download|ddl|episode", re.I)
|
||||
)
|
||||
for link in potential_links:
|
||||
href = link.get('href', '')
|
||||
if href and href.startswith('http'):
|
||||
href = link.get("href", "")
|
||||
if href and href.startswith("http"):
|
||||
filename = self._generate_filename(final_url)
|
||||
return href, filename
|
||||
|
||||
@@ -132,36 +161,38 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
episode = "01"
|
||||
|
||||
# Format: info-0-1/EPISODE_ID or info-0-1/EPISODE_ID/NAME-EP-vostfr
|
||||
if 'info-0-1/' in url:
|
||||
if "info-0-1/" in url:
|
||||
# Extract episode ID
|
||||
ep_match = re.search(r'info-0-1/(\d+)', url)
|
||||
ep_match = re.search(r"info-0-1/(\d+)", url)
|
||||
if ep_match:
|
||||
ep_id = ep_match.group(1)
|
||||
|
||||
# Try to get anime name from URL path
|
||||
name_match = re.search(r'info-0-1/\d+/([^/]+)', url)
|
||||
name_match = re.search(r"info-0-1/\d+/([^/]+)", url)
|
||||
if name_match:
|
||||
raw_name = name_match.group(1)
|
||||
# Extract episode number
|
||||
ep_num_match = re.search(r'-(\d+)-vostfr$', raw_name, re.I)
|
||||
ep_num_match = re.search(r"-(\d+)-vostfr$", raw_name, re.I)
|
||||
if ep_num_match:
|
||||
episode = ep_num_match.group(1).zfill(2)
|
||||
# Remove episode number and suffix from name
|
||||
anime_name = re.sub(r'-\d+-vostfr$', '', raw_name, flags=re.I).replace('-', ' ')
|
||||
anime_name = re.sub(
|
||||
r"-\d+-vostfr$", "", raw_name, flags=re.I
|
||||
).replace("-", " ")
|
||||
else:
|
||||
# Just use the ID
|
||||
anime_name = f"Episode {ep_id}"
|
||||
else:
|
||||
anime_name = f"Episode {ep_id}"
|
||||
|
||||
elif 'file-0-1/' in url:
|
||||
elif "file-0-1/" in url:
|
||||
# Extract from file-0-1/ID-NAME format
|
||||
file_match = re.search(r'file-0-1/\d+-(.+)$', url)
|
||||
file_match = re.search(r"file-0-1/\d+-(.+)$", url)
|
||||
if file_match:
|
||||
anime_name = file_match.group(1).replace('-', ' ')
|
||||
anime_name = file_match.group(1).replace("-", " ")
|
||||
|
||||
# Sanitize filename
|
||||
anime_name = anime_name.replace('/', ' ').strip()
|
||||
anime_name = anime_name.replace("/", " ").strip()
|
||||
filename = f"{anime_name} - Episode {episode}.mp4"
|
||||
return filename.title()
|
||||
|
||||
@@ -173,30 +204,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
try:
|
||||
print(f"[ANIME-ULTIME] Extracting metadata from: {anime_url}")
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
metadata = {
|
||||
'synopsis': None,
|
||||
'genres': [],
|
||||
'rating': None,
|
||||
'release_year': None,
|
||||
'studio': None,
|
||||
'poster_image': None,
|
||||
'banner_image': None,
|
||||
'total_episodes': None,
|
||||
'status': None,
|
||||
'alternative_titles': []
|
||||
"synopsis": None,
|
||||
"genres": [],
|
||||
"rating": None,
|
||||
"release_year": None,
|
||||
"studio": None,
|
||||
"poster_image": None,
|
||||
"banner_image": None,
|
||||
"total_episodes": None,
|
||||
"status": None,
|
||||
"alternative_titles": [],
|
||||
}
|
||||
|
||||
# Extract synopsis
|
||||
synopsis_selectors = [
|
||||
'div.synopsis',
|
||||
'div.description',
|
||||
"div.synopsis",
|
||||
"div.description",
|
||||
'div[class*="synopsis"]',
|
||||
'div[class*="synopsis"]',
|
||||
'p.synopsis',
|
||||
'.info',
|
||||
'div.texte'
|
||||
"p.synopsis",
|
||||
".info",
|
||||
"div.texte",
|
||||
]
|
||||
|
||||
for selector in synopsis_selectors:
|
||||
@@ -204,68 +235,73 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
if synopsis_elem:
|
||||
synopsis = synopsis_elem.get_text(strip=True)
|
||||
if len(synopsis) > 50:
|
||||
metadata['synopsis'] = synopsis
|
||||
metadata["synopsis"] = synopsis
|
||||
break
|
||||
|
||||
# Extract genres from meta tags and page content
|
||||
page_text = soup.get_text()
|
||||
|
||||
# Look for genre in meta tags
|
||||
genre_meta = soup.find('meta', property='genre') or soup.find('meta', attrs={'name': 'genre'})
|
||||
genre_meta = soup.find("meta", property="genre") or soup.find(
|
||||
"meta", attrs={"name": "genre"}
|
||||
)
|
||||
if genre_meta:
|
||||
genres_text = genre_meta.get('content', '')
|
||||
genres_text = genre_meta.get("content", "")
|
||||
if genres_text:
|
||||
metadata['genres'] = [g.strip() for g in genres_text.split(',')]
|
||||
metadata["genres"] = [g.strip() for g in genres_text.split(",")]
|
||||
|
||||
# Try to find genre links
|
||||
genre_links = soup.find_all('a', href=re.compile(r'genre|tag|type|cat', re.I))
|
||||
genre_links = soup.find_all(
|
||||
"a", href=re.compile(r"genre|tag|type|cat", re.I)
|
||||
)
|
||||
if genre_links:
|
||||
for link in genre_links[:5]:
|
||||
genre = link.get_text(strip=True)
|
||||
if genre and genre not in metadata['genres']:
|
||||
metadata['genres'].append(genre)
|
||||
if genre and genre not in metadata["genres"]:
|
||||
metadata["genres"].append(genre)
|
||||
|
||||
# Extract rating
|
||||
rating_selectors = [
|
||||
'span.rating',
|
||||
'div.rating',
|
||||
'span.score',
|
||||
'div.note',
|
||||
'.rating'
|
||||
"span.rating",
|
||||
"div.rating",
|
||||
"span.score",
|
||||
"div.note",
|
||||
".rating",
|
||||
]
|
||||
|
||||
for selector in rating_selectors:
|
||||
rating_elem = soup.select_one(selector)
|
||||
if rating_elem:
|
||||
rating_text = rating_elem.get_text(strip=True)
|
||||
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*10', rating_text)
|
||||
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*10", rating_text)
|
||||
if rating_match:
|
||||
metadata['rating'] = f"{rating_match.group(1)}/10"
|
||||
metadata["rating"] = f"{rating_match.group(1)}/10"
|
||||
break
|
||||
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*5', rating_text)
|
||||
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*5", rating_text)
|
||||
if rating_match:
|
||||
rating_val = float(rating_match.group(1)) * 2
|
||||
metadata['rating'] = f"{rating_val:.1f}/10"
|
||||
metadata["rating"] = f"{rating_val:.1f}/10"
|
||||
break
|
||||
|
||||
# Extract release year
|
||||
year_match = re.search(r'\b(19\d{2}|20\d{2})\b', page_text)
|
||||
year_match = re.search(r"\b(19\d{2}|20\d{2})\b", page_text)
|
||||
if year_match:
|
||||
import datetime
|
||||
|
||||
current_year = datetime.datetime.now().year + 2
|
||||
year = int(year_match.group(1))
|
||||
if 1950 <= year <= current_year:
|
||||
metadata['release_year'] = year
|
||||
metadata["release_year"] = year
|
||||
|
||||
# Extract poster image from og:image
|
||||
og_image = soup.find('meta', property='og:image')
|
||||
og_image = soup.find("meta", property="og:image")
|
||||
if og_image:
|
||||
metadata['poster_image'] = og_image.get('content')
|
||||
metadata["poster_image"] = og_image.get("content")
|
||||
|
||||
# Extract total episodes
|
||||
episodes_count = len(await self.get_episodes(anime_url))
|
||||
if episodes_count > 0:
|
||||
metadata['total_episodes'] = episodes_count
|
||||
metadata["total_episodes"] = episodes_count
|
||||
|
||||
print(f"[ANIME-ULTIME] Extracted metadata: {metadata}")
|
||||
return metadata
|
||||
@@ -274,7 +310,9 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
print(f"[ANIME-ULTIME] Error extracting metadata: {e}")
|
||||
return {}
|
||||
|
||||
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
|
||||
async def search_anime(
|
||||
self, query: str, lang: str = "vostfr", include_metadata: bool = False
|
||||
) -> list[dict]:
|
||||
"""
|
||||
Search for anime on anime-ultime
|
||||
Returns list of anime with title, url, and cover image
|
||||
@@ -286,27 +324,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
"""
|
||||
try:
|
||||
import time
|
||||
|
||||
start = time.time()
|
||||
print(f"[ANIME-ULTIME] Searching for '{query}' ({lang})...")
|
||||
|
||||
# Anime-Ultime uses POST for search
|
||||
search_url = "https://www.anime-ultime.net/search-0-1"
|
||||
|
||||
response = await self.client.post(search_url, data={'search': query})
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
response = await self.client.post(search_url, data={"search": query})
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
elapsed = time.time() - start
|
||||
print(f"[ANIME-ULTIME] Got response {response.status_code} in {elapsed:.2f}s")
|
||||
print(
|
||||
f"[ANIME-ULTIME] Got response {response.status_code} in {elapsed:.2f}s"
|
||||
)
|
||||
|
||||
results = []
|
||||
|
||||
# Look for search result links - better parsing
|
||||
# Search results use file-0-1/ pattern, not info-
|
||||
search_results = soup.find_all('a', href=re.compile(r'file-0-1/'))
|
||||
search_results = soup.find_all("a", href=re.compile(r"file-0-1/"))
|
||||
|
||||
seen_urls = set()
|
||||
for result in search_results[:10]: # Limit to 10 results
|
||||
href = result.get('href', '')
|
||||
href = result.get("href", "")
|
||||
raw_title = result.get_text().strip()
|
||||
|
||||
# Skip if no href
|
||||
@@ -322,40 +363,44 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
better_title = raw_title
|
||||
|
||||
# If raw_title is just "Télécharger" or similar, try to find better title
|
||||
if len(raw_title) < 5 or raw_title.lower() in ['télécharger', 'download', 'ddl']:
|
||||
if len(raw_title) < 5 or raw_title.lower() in [
|
||||
"télécharger",
|
||||
"download",
|
||||
"ddl",
|
||||
]:
|
||||
# Try to extract from URL (file-0-1/ID-Title format)
|
||||
url_match = re.search(r'file-0-1/\d+-(.+)$', href)
|
||||
url_match = re.search(r"file-0-1/\d+-(.+)$", href)
|
||||
if url_match:
|
||||
better_title = url_match.group(1).replace('-', ' ').title()
|
||||
better_title = url_match.group(1).replace("-", " ").title()
|
||||
|
||||
# If still no good title, look at parent/row elements
|
||||
if len(better_title) < 5:
|
||||
# Check parent row (table structure)
|
||||
row = result.find_parent(['tr', 'td', 'div'])
|
||||
row = result.find_parent(["tr", "td", "div"])
|
||||
if row:
|
||||
# Look for text in the row that's not the link text
|
||||
row_text = row.get_text().strip()
|
||||
# Remove the link text from row text
|
||||
if raw_title in row_text:
|
||||
row_text = row_text.replace(raw_title, '').strip()
|
||||
row_text = row_text.replace(raw_title, "").strip()
|
||||
if len(row_text) > 5 and len(row_text) < 100:
|
||||
better_title = row_text
|
||||
|
||||
# Make URL absolute
|
||||
if not href.startswith('http'):
|
||||
if not href.startswith("http"):
|
||||
href = urljoin("https://www.anime-ultime.net/", href)
|
||||
|
||||
result_item = {
|
||||
'title': better_title,
|
||||
'url': href,
|
||||
'type': 'search_result',
|
||||
'metadata': None
|
||||
"title": better_title,
|
||||
"url": href,
|
||||
"type": "search_result",
|
||||
"metadata": None,
|
||||
}
|
||||
|
||||
# Fetch metadata if requested
|
||||
if include_metadata:
|
||||
metadata = await self.get_anime_metadata(href)
|
||||
result_item['metadata'] = metadata
|
||||
result_item["metadata"] = metadata
|
||||
|
||||
results.append(result_item)
|
||||
|
||||
@@ -373,27 +418,27 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
"""
|
||||
try:
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
episodes = []
|
||||
|
||||
# Look for episode links - anime-ultime uses info-XXXXX-Name-XX-vostfr format
|
||||
# The URL pattern is info-0-1/ID-Anime-Name-XX-vostfr where XX is episode number
|
||||
episode_links = soup.find_all('a', href=re.compile(r'info-0-1/\d+'))
|
||||
episode_links = soup.find_all("a", href=re.compile(r"info-0-1/\d+"))
|
||||
|
||||
for link in episode_links:
|
||||
href = link.get('href', '')
|
||||
href = link.get("href", "")
|
||||
text = link.get_text().strip()
|
||||
|
||||
# Extract episode number from URL pattern
|
||||
# Matches: info-0-1/30200/Naruto-OAV-01-vostfr
|
||||
match = re.search(r'-(\d+)-vostfr$', href, re.I)
|
||||
match = re.search(r"-(\d+)-vostfr$", href, re.I)
|
||||
if not match:
|
||||
# Try other patterns
|
||||
match = re.search(r'Episode[-\s]?(\d+)', href, re.I)
|
||||
match = re.search(r"Episode[-\s]?(\d+)", href, re.I)
|
||||
if not match:
|
||||
# Try to extract from text
|
||||
match = re.search(r'(\d+)', text)
|
||||
match = re.search(r"(\d+)", text)
|
||||
|
||||
if match:
|
||||
episode_num = match.group(1).zfill(2) # Pad with zero
|
||||
@@ -401,32 +446,30 @@ class AnimeUltimeDownloader(BaseAnimeSite):
|
||||
# Extract the episode ID from href and build correct URL
|
||||
# href might be "info-0-1/30200" or "info-0-1/30200/..."
|
||||
# We need: https://www.anime-ultime.net/info-0-1/30200
|
||||
ep_id_match = re.search(r'info-0-1/(\d+)', href)
|
||||
ep_id_match = re.search(r"info-0-1/(\d+)", href)
|
||||
if ep_id_match:
|
||||
ep_id = ep_id_match.group(1)
|
||||
# Build the correct episode URL
|
||||
episode_url = f"https://www.anime-ultime.net/info-0-1/{ep_id}"
|
||||
else:
|
||||
# Fallback to making URL absolute
|
||||
if not href.startswith('http'):
|
||||
if not href.startswith("http"):
|
||||
href = urljoin(anime_url, href)
|
||||
episode_url = href
|
||||
|
||||
episodes.append({
|
||||
'episode': episode_num,
|
||||
'url': episode_url,
|
||||
'title': text
|
||||
})
|
||||
episodes.append(
|
||||
{"episode": episode_num, "url": episode_url, "title": text}
|
||||
)
|
||||
|
||||
# Remove duplicates and sort
|
||||
seen = set()
|
||||
unique_episodes = []
|
||||
for ep in episodes:
|
||||
if ep['episode'] not in seen:
|
||||
seen.add(ep['episode'])
|
||||
if ep["episode"] not in seen:
|
||||
seen.add(ep["episode"])
|
||||
unique_episodes.append(ep)
|
||||
|
||||
unique_episodes.sort(key=lambda x: int(x['episode']))
|
||||
unique_episodes.sort(key=lambda x: int(x["episode"]))
|
||||
|
||||
return unique_episodes
|
||||
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
"""French-Manga.net anime streaming site downloader"""
|
||||
|
||||
from .base import BaseAnimeSite
|
||||
from bs4 import BeautifulSoup
|
||||
import re
|
||||
@@ -17,11 +18,12 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
"french-manga.net",
|
||||
"w16.french-manga.net",
|
||||
"w15.french-manga.net",
|
||||
"www.french-manga.net"
|
||||
"www.french-manga.net",
|
||||
]
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "french-manga"
|
||||
self.base_url = "https://w16.french-manga.net"
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
@@ -29,9 +31,7 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
|
||||
|
||||
async def search_anime(
|
||||
self,
|
||||
query: str,
|
||||
lang: str = "vostfr"
|
||||
self, query: str, lang: str = "vostfr"
|
||||
) -> List[Dict[str, str]]:
|
||||
"""
|
||||
Search for anime on French-Manga.
|
||||
@@ -47,46 +47,50 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
# French-Manga uses a search endpoint
|
||||
search_url = f"{self.base_url}/index.php?do=search"
|
||||
params = {
|
||||
'do': 'search',
|
||||
'subaction': 'search',
|
||||
'story': query,
|
||||
'x': '0',
|
||||
'y': '0'
|
||||
"do": "search",
|
||||
"subaction": "search",
|
||||
"story": query,
|
||||
"x": "0",
|
||||
"y": "0",
|
||||
}
|
||||
|
||||
response = await self.client.post(search_url, data=params)
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
results = []
|
||||
|
||||
# Look for search results in article or story classes
|
||||
for item in soup.find_all('article', class_=lambda x: x and 'story' in x.lower()):
|
||||
title_elem = item.find(['h2', 'h3', 'h4'])
|
||||
link_elem = item.find('a', href=True)
|
||||
img_elem = item.find('img')
|
||||
for item in soup.find_all(
|
||||
"article", class_=lambda x: x and "story" in x.lower()
|
||||
):
|
||||
title_elem = item.find(["h2", "h3", "h4"])
|
||||
link_elem = item.find("a", href=True)
|
||||
img_elem = item.find("img")
|
||||
|
||||
if title_elem and link_elem:
|
||||
title = title_elem.get_text(strip=True)
|
||||
url = link_elem['href']
|
||||
url = link_elem["href"]
|
||||
|
||||
# Ensure absolute URL
|
||||
if url.startswith('/'):
|
||||
if url.startswith("/"):
|
||||
url = self.base_url + url
|
||||
|
||||
cover_image = ""
|
||||
if img_elem and img_elem.get('src'):
|
||||
cover_image = img_elem['src']
|
||||
if cover_image.startswith('/'):
|
||||
if img_elem and img_elem.get("src"):
|
||||
cover_image = img_elem["src"]
|
||||
if cover_image.startswith("/"):
|
||||
cover_image = self.base_url + cover_image
|
||||
|
||||
results.append({
|
||||
'title': title,
|
||||
'url': url,
|
||||
'cover_image': cover_image,
|
||||
'lang': lang
|
||||
})
|
||||
results.append(
|
||||
{
|
||||
"title": title,
|
||||
"url": url,
|
||||
"cover_image": cover_image,
|
||||
"lang": lang,
|
||||
}
|
||||
)
|
||||
|
||||
logger.info(f"Found {len(results)} anime results for query: {query}")
|
||||
return results
|
||||
@@ -96,9 +100,7 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
return []
|
||||
|
||||
async def get_episodes(
|
||||
self,
|
||||
anime_url: str,
|
||||
lang: str = "vostfr"
|
||||
self, anime_url: str, lang: str = "vostfr"
|
||||
) -> List[Dict[str, str]]:
|
||||
"""
|
||||
Get episode list for an anime.
|
||||
@@ -115,34 +117,36 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
episodes = []
|
||||
|
||||
# Look for episode links (typically in a list or table)
|
||||
# French-Manga usually has episode links in <a> tags with episode numbers
|
||||
for link in soup.find_all('a', href=True):
|
||||
href = link['href']
|
||||
for link in soup.find_all("a", href=True):
|
||||
href = link["href"]
|
||||
text = link.get_text(strip=True)
|
||||
|
||||
# Pattern: Episode links usually contain "episode" or numbers
|
||||
if re.search(r'episode?\s*\d+', text.lower()):
|
||||
episode_num = re.search(r'(\d+)', text)
|
||||
if re.search(r"episode?\s*\d+", text.lower()):
|
||||
episode_num = re.search(r"(\d+)", text)
|
||||
if episode_num:
|
||||
episode_number = int(episode_num.group(1))
|
||||
|
||||
# Ensure absolute URL
|
||||
if href.startswith('/'):
|
||||
if href.startswith("/"):
|
||||
href = self.base_url + href
|
||||
|
||||
episodes.append({
|
||||
'episode_number': episode_number,
|
||||
'url': href,
|
||||
'title': text,
|
||||
'host': 'french-manga'
|
||||
})
|
||||
episodes.append(
|
||||
{
|
||||
"episode_number": episode_number,
|
||||
"url": href,
|
||||
"title": text,
|
||||
"host": "french-manga",
|
||||
}
|
||||
)
|
||||
|
||||
# Sort by episode number
|
||||
episodes.sort(key=lambda x: x['episode_number'])
|
||||
episodes.sort(key=lambda x: x["episode_number"])
|
||||
|
||||
logger.info(f"Found {len(episodes)} episodes for {anime_url}")
|
||||
return episodes
|
||||
@@ -166,31 +170,33 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
# Extract title
|
||||
title = ""
|
||||
title_elem = soup.find('h1') or soup.find('h2', class_='title')
|
||||
title_elem = soup.find("h1") or soup.find("h2", class_="title")
|
||||
if title_elem:
|
||||
title = title_elem.get_text(strip=True)
|
||||
|
||||
# Extract synopsis
|
||||
synopsis = ""
|
||||
synopsis_elem = soup.find('div', class_=lambda x: x and 'story' in x.lower())
|
||||
synopsis_elem = soup.find(
|
||||
"div", class_=lambda x: x and "story" in x.lower()
|
||||
)
|
||||
if synopsis_elem:
|
||||
synopsis = synopsis_elem.get_text(strip=True)
|
||||
|
||||
# Extract cover image
|
||||
poster_image = ""
|
||||
img_elem = soup.find('img', class_=lambda x: x and 'poster' in x.lower())
|
||||
if img_elem and img_elem.get('src'):
|
||||
poster_image = img_elem['src']
|
||||
if poster_image.startswith('/'):
|
||||
img_elem = soup.find("img", class_=lambda x: x and "poster" in x.lower())
|
||||
if img_elem and img_elem.get("src"):
|
||||
poster_image = img_elem["src"]
|
||||
if poster_image.startswith("/"):
|
||||
poster_image = self.base_url + poster_image
|
||||
|
||||
# Extract genres
|
||||
genres = []
|
||||
genre_links = soup.find_all('a', href=re.compile(r'/xfsearch/.*genre/'))
|
||||
genre_links = soup.find_all("a", href=re.compile(r"/xfsearch/.*genre/"))
|
||||
for link in genre_links[:10]: # Limit to 10 genres
|
||||
genre = link.get_text(strip=True)
|
||||
if genre:
|
||||
@@ -198,36 +204,38 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
|
||||
# Extract rating (if available)
|
||||
rating = ""
|
||||
rating_elem = soup.find(['span', 'div'], class_=lambda x: x and 'rating' in x.lower())
|
||||
rating_elem = soup.find(
|
||||
["span", "div"], class_=lambda x: x and "rating" in x.lower()
|
||||
)
|
||||
if rating_elem:
|
||||
rating = rating_elem.get_text(strip=True)
|
||||
|
||||
return {
|
||||
'title': title,
|
||||
'synopsis': synopsis,
|
||||
'genres': genres,
|
||||
'rating': rating,
|
||||
'release_year': '',
|
||||
'studio': '',
|
||||
'poster_image': poster_image,
|
||||
'total_episodes': len(await self.get_episodes(anime_url)),
|
||||
'status': '',
|
||||
'languages': ['vf', 'vostfr']
|
||||
"title": title,
|
||||
"synopsis": synopsis,
|
||||
"genres": genres,
|
||||
"rating": rating,
|
||||
"release_year": "",
|
||||
"studio": "",
|
||||
"poster_image": poster_image,
|
||||
"total_episodes": len(await self.get_episodes(anime_url)),
|
||||
"status": "",
|
||||
"languages": ["vf", "vostfr"],
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error getting anime metadata: {e}")
|
||||
return {
|
||||
'title': '',
|
||||
'synopsis': '',
|
||||
'genres': [],
|
||||
'rating': '',
|
||||
'release_year': '',
|
||||
'studio': '',
|
||||
'poster_image': '',
|
||||
'total_episodes': 0,
|
||||
'status': '',
|
||||
'languages': ['vf', 'vostfr']
|
||||
"title": "",
|
||||
"synopsis": "",
|
||||
"genres": [],
|
||||
"rating": "",
|
||||
"release_year": "",
|
||||
"studio": "",
|
||||
"poster_image": "",
|
||||
"total_episodes": 0,
|
||||
"status": "",
|
||||
"languages": ["vf", "vostfr"],
|
||||
}
|
||||
|
||||
async def get_download_link(self, url: str) -> tuple[str, str]:
|
||||
@@ -248,20 +256,20 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
response.raise_for_status()
|
||||
html = response.text
|
||||
|
||||
soup = BeautifulSoup(html, 'lxml')
|
||||
soup = BeautifulSoup(html, "lxml")
|
||||
|
||||
# Look for iframe or video player
|
||||
iframe = soup.find('iframe', src=True)
|
||||
iframe = soup.find("iframe", src=True)
|
||||
if iframe:
|
||||
video_url = iframe['src']
|
||||
video_url = iframe["src"]
|
||||
else:
|
||||
# Look for video tag directly
|
||||
video = soup.find('video', src=True)
|
||||
video = soup.find("video", src=True)
|
||||
if video:
|
||||
video_url = video['src']
|
||||
video_url = video["src"]
|
||||
else:
|
||||
# Try to find in script tags
|
||||
scripts = soup.find_all('script')
|
||||
scripts = soup.find_all("script")
|
||||
for script in scripts:
|
||||
if script.string:
|
||||
# Look for iframe or video URLs in JavaScript
|
||||
@@ -274,20 +282,20 @@ class FrenchMangaDownloader(BaseAnimeSite):
|
||||
if match:
|
||||
video_url = match.group(1)
|
||||
break
|
||||
if 'video_url' in locals():
|
||||
if "video_url" in locals():
|
||||
break
|
||||
|
||||
if 'video_url' not in locals():
|
||||
if "video_url" not in locals():
|
||||
raise ValueError("Could not find video player URL")
|
||||
|
||||
# Ensure absolute URL
|
||||
if video_url.startswith('//'):
|
||||
video_url = 'https:' + video_url
|
||||
elif video_url.startswith('/'):
|
||||
if video_url.startswith("//"):
|
||||
video_url = "https:" + video_url
|
||||
elif video_url.startswith("/"):
|
||||
video_url = self.base_url + video_url
|
||||
|
||||
# Extract episode title
|
||||
title_elem = soup.find('h1') or soup.find('h2')
|
||||
title_elem = soup.find("h1") or soup.find("h2")
|
||||
episode_title = title_elem.get_text(strip=True) if title_elem else "Episode"
|
||||
episode_title = sanitize_filename(episode_title)
|
||||
|
||||
|
||||
@@ -7,79 +7,100 @@ from urllib.parse import urljoin
|
||||
|
||||
class NekoSamaDownloader(BaseAnimeSite):
|
||||
"""Downloader for neko-sama.org (anime streaming via Gupy)
|
||||
|
||||
|
||||
NOTE: neko-sama.org now redirects to Gupy, which is a legal streaming search engine.
|
||||
It does NOT host video content - it provides metadata about where to watch legally.
|
||||
This provider can search and get metadata but cannot provide direct download links.
|
||||
"""
|
||||
|
||||
BASE_DOMAINS = ["neko-sama.org", "www.neko-sama.org", "neko-sama.fr", "nekosama.fr", "www.gupy.fr", "gupy.fr"]
|
||||
BASE_DOMAINS = [
|
||||
"neko-sama.org",
|
||||
"www.neko-sama.org",
|
||||
"neko-sama.fr",
|
||||
"nekosama.fr",
|
||||
"www.gupy.fr",
|
||||
"gupy.fr",
|
||||
]
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "neko-sama"
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
|
||||
|
||||
async def get_download_link(self, url: str, target_filename: Optional[str] = None) -> tuple[str, str]:
|
||||
async def get_download_link(
|
||||
self, url: str, target_filename: Optional[str] = None
|
||||
) -> tuple[str, str]:
|
||||
"""
|
||||
Extract download link from neko-sama URL.
|
||||
|
||||
|
||||
NOTE: neko-sama.org/Gupy is a legal streaming search engine, NOT a video host.
|
||||
This returns streaming platform information instead of direct video links.
|
||||
"""
|
||||
try:
|
||||
# Check if this is a Gupy URL
|
||||
if 'gupy.fr' in url or 'neko-sama.org' in url:
|
||||
if "gupy.fr" in url or "neko-sama.org" in url:
|
||||
response = await self.client.get(url, follow_redirects=True)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
# Look for streaming platform links
|
||||
streaming_links = []
|
||||
for link in soup.find_all('a', href=True):
|
||||
href = link.get('href', '')
|
||||
if '/out/' in href:
|
||||
for link in soup.find_all("a", href=True):
|
||||
href = link.get("href", "")
|
||||
if "/out/" in href:
|
||||
text = link.get_text(strip=True)
|
||||
if text and 'Regarder' in text:
|
||||
if text and "Regarder" in text:
|
||||
streaming_links.append(f"{text}: {href}")
|
||||
|
||||
|
||||
if streaming_links:
|
||||
title_elem = soup.find('h1') or soup.find('title')
|
||||
title = title_elem.get_text(strip=True).split('|')[0].strip() if title_elem else "Unknown"
|
||||
info = "Available streaming platforms:\n" + "\n".join(streaming_links[:5])
|
||||
title_elem = soup.find("h1") or soup.find("title")
|
||||
title = (
|
||||
title_elem.get_text(strip=True).split("|")[0].strip()
|
||||
if title_elem
|
||||
else "Unknown"
|
||||
)
|
||||
info = "Available streaming platforms:\n" + "\n".join(
|
||||
streaming_links[:5]
|
||||
)
|
||||
filename = target_filename or f"{title}_streaming_info.txt"
|
||||
return info, filename
|
||||
|
||||
raise Exception("No streaming links found - Gupy is a legal streaming search, not a video host")
|
||||
|
||||
|
||||
raise Exception(
|
||||
"No streaming links found - Gupy is a legal streaming search, not a video host"
|
||||
)
|
||||
|
||||
# Legacy: try original method for other URLs
|
||||
response = await self.client.get(url, follow_redirects=True)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
# Method 1: Look for iframes with video
|
||||
iframes = soup.find_all('iframe')
|
||||
iframes = soup.find_all("iframe")
|
||||
for iframe in iframes:
|
||||
src = iframe.get('src', '')
|
||||
if src and any(p in src for p in ['video', 'player', 'stream']):
|
||||
if not src.startswith('http'):
|
||||
src = iframe.get("src", "")
|
||||
if src and any(p in src for p in ["video", "player", "stream"]):
|
||||
if not src.startswith("http"):
|
||||
src = urljoin(str(response.url), src)
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
# Method 2: Look for video tags
|
||||
videos = soup.find_all('video')
|
||||
videos = soup.find_all("video")
|
||||
for video in videos:
|
||||
src = video.get('src') or video.get('data-src')
|
||||
src = video.get("src") or video.get("data-src")
|
||||
if src:
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
sources = video.find_all('source')
|
||||
sources = video.find_all("source")
|
||||
for source in sources:
|
||||
src = source.get('src', '')
|
||||
src = source.get("src", "")
|
||||
if src:
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
# Method 3: Look in scripts
|
||||
scripts = soup.find_all('script')
|
||||
scripts = soup.find_all("script")
|
||||
for script in scripts:
|
||||
if script.string:
|
||||
patterns = [
|
||||
@@ -90,24 +111,26 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
for pattern in patterns:
|
||||
matches = re.findall(pattern, script.string)
|
||||
for match in matches:
|
||||
match = match.replace('\\/', '/')
|
||||
if any(ext in match for ext in ['mp4', 'm3u8']):
|
||||
match = match.replace("\\/", "/")
|
||||
if any(ext in match for ext in ["mp4", "m3u8"]):
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return match, filename
|
||||
|
||||
raise Exception("Could not find video link - Neko-Sama/Gupy does not host video content")
|
||||
raise Exception(
|
||||
"Could not find video link - Neko-Sama/Gupy does not host video content"
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
raise Exception(f"Error extracting NekoSama link: {str(e)}")
|
||||
|
||||
def _generate_filename(self, url: str) -> str:
|
||||
parts = url.split('/')
|
||||
parts = url.split("/")
|
||||
anime_name = "anime"
|
||||
episode = "1"
|
||||
|
||||
for i, part in enumerate(parts):
|
||||
if 'episode' in part.lower():
|
||||
match = re.search(r'episode[-\s]*(\d+)', part, re.I)
|
||||
if "episode" in part.lower():
|
||||
match = re.search(r"episode[-\s]*(\d+)", part, re.I)
|
||||
if match:
|
||||
episode = match.group(1)
|
||||
|
||||
@@ -118,31 +141,31 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
"""Get list of episodes for an anime."""
|
||||
try:
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
episodes = []
|
||||
# Try to find episode links
|
||||
episode_links = soup.find_all('a', href=re.compile(r'episode'))
|
||||
episode_links = soup.find_all("a", href=re.compile(r"episode"))
|
||||
|
||||
for link in episode_links:
|
||||
href = link.get('href', '')
|
||||
match = re.search(r'episode[-\s]*(\d+)', href, re.I)
|
||||
href = link.get("href", "")
|
||||
match = re.search(r"episode[-\s]*(\d+)", href, re.I)
|
||||
if match:
|
||||
episode_num = match.group(1)
|
||||
if not href.startswith('http'):
|
||||
if not href.startswith("http"):
|
||||
href = urljoin(anime_url, href)
|
||||
|
||||
episodes.append({'episode': episode_num, 'url': href})
|
||||
episodes.append({"episode": episode_num, "url": href})
|
||||
|
||||
# Deduplicate and sort
|
||||
seen = set()
|
||||
unique_episodes = []
|
||||
for ep in episodes:
|
||||
if ep['episode'] not in seen:
|
||||
seen.add(ep['episode'])
|
||||
if ep["episode"] not in seen:
|
||||
seen.add(ep["episode"])
|
||||
unique_episodes.append(ep)
|
||||
|
||||
unique_episodes.sort(key=lambda x: int(x['episode']))
|
||||
unique_episodes.sort(key=lambda x: int(x["episode"]))
|
||||
return unique_episodes
|
||||
|
||||
except Exception as e:
|
||||
@@ -153,70 +176,70 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
try:
|
||||
print(f"[NEKO-SAMA] Extracting metadata from: {anime_url}")
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
metadata = {
|
||||
'synopsis': None,
|
||||
'genres': [],
|
||||
'rating': None,
|
||||
'release_year': None,
|
||||
'studio': None,
|
||||
'poster_image': None,
|
||||
'banner_image': None,
|
||||
'total_episodes': None,
|
||||
'status': None,
|
||||
'alternative_titles': []
|
||||
"synopsis": None,
|
||||
"genres": [],
|
||||
"rating": None,
|
||||
"release_year": None,
|
||||
"studio": None,
|
||||
"poster_image": None,
|
||||
"banner_image": None,
|
||||
"total_episodes": None,
|
||||
"status": None,
|
||||
"alternative_titles": [],
|
||||
}
|
||||
|
||||
# Extract title and year from h1
|
||||
title_elem = soup.find('h1')
|
||||
title_elem = soup.find("h1")
|
||||
if title_elem:
|
||||
title_text = title_elem.get_text(strip=True)
|
||||
# Extract year from title like "Naruto (2002)"
|
||||
year_match = re.search(r'\((\d{4})\)', title_text)
|
||||
year_match = re.search(r"\((\d{4})\)", title_text)
|
||||
if year_match:
|
||||
metadata['release_year'] = int(year_match.group(1))
|
||||
|
||||
metadata["release_year"] = int(year_match.group(1))
|
||||
|
||||
# Extract synopsis - Gupy shows it as paragraphs
|
||||
synopsis_elem = soup.find('p')
|
||||
synopsis_elem = soup.find("p")
|
||||
if synopsis_elem:
|
||||
text = synopsis_elem.get_text(strip=True)
|
||||
if len(text) > 50:
|
||||
metadata['synopsis'] = text
|
||||
metadata["synopsis"] = text
|
||||
|
||||
# Extract genres from meta tags or links
|
||||
genre_links = soup.find_all('a', href=re.compile(r'serie-|genre|tag'))
|
||||
genre_links = soup.find_all("a", href=re.compile(r"serie-|genre|tag"))
|
||||
if genre_links:
|
||||
genres = []
|
||||
for link in genre_links[:5]:
|
||||
text = link.get_text(strip=True)
|
||||
if text and '/' not in text and len(text) < 30:
|
||||
if text and "/" not in text and len(text) < 30:
|
||||
genres.append(text)
|
||||
metadata['genres'] = genres
|
||||
metadata["genres"] = genres
|
||||
|
||||
# Extract rating from percentage
|
||||
rating_elem = soup.find(string=re.compile(r'\d+(\.\d+)?%'))
|
||||
rating_elem = soup.find(string=re.compile(r"\d+(\.\d+)?%"))
|
||||
if rating_elem:
|
||||
match = re.search(r'(\d+(\.\d+)?)%', rating_elem)
|
||||
match = re.search(r"(\d+(\.\d+)?)%", rating_elem)
|
||||
if match:
|
||||
rating = float(match.group(1)) / 10
|
||||
metadata['rating'] = f"{rating:.1f}/10"
|
||||
metadata["rating"] = f"{rating:.1f}/10"
|
||||
|
||||
# Extract poster image
|
||||
poster_elem = soup.find('img', src=re.compile(r'poster|poster'))
|
||||
poster_elem = soup.find("img", src=re.compile(r"poster|poster"))
|
||||
if poster_elem:
|
||||
metadata['poster_image'] = poster_elem.get('src')
|
||||
metadata["poster_image"] = poster_elem.get("src")
|
||||
|
||||
# Extract episode count from page text
|
||||
page_text = soup.get_text()
|
||||
ep_match = re.search(r'(\d+)\s*episodes?', page_text, re.I)
|
||||
ep_match = re.search(r"(\d+)\s*episodes?", page_text, re.I)
|
||||
if ep_match:
|
||||
metadata['total_episodes'] = int(ep_match.group(1))
|
||||
metadata["total_episodes"] = int(ep_match.group(1))
|
||||
|
||||
# Extract studio/director
|
||||
director_elem = soup.find('a', href=re.compile(r'person|réalisé'))
|
||||
director_elem = soup.find("a", href=re.compile(r"person|réalisé"))
|
||||
if director_elem:
|
||||
metadata['studio'] = director_elem.get_text(strip=True)
|
||||
metadata["studio"] = director_elem.get_text(strip=True)
|
||||
|
||||
print(f"[NEKO-SAMA] Extracted metadata: {metadata}")
|
||||
return metadata
|
||||
@@ -225,16 +248,19 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
print(f"[NEKO-SAMA] Error extracting metadata: {e}")
|
||||
return {}
|
||||
|
||||
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
|
||||
async def search_anime(
|
||||
self, query: str, lang: str = "vostfr", include_metadata: bool = False
|
||||
) -> list[dict]:
|
||||
"""Search for anime on neko-sama (uses Gupy backend)."""
|
||||
try:
|
||||
import time
|
||||
from html import unescape
|
||||
|
||||
start = time.time()
|
||||
print(f"[NEKO-SAMA] Searching for '{query}' ({lang})...")
|
||||
|
||||
# Neko-Sama now uses Gupy - try the direct URL pattern
|
||||
search_slug = query.lower().replace(' ', '-')
|
||||
search_slug = query.lower().replace(" ", "-")
|
||||
search_urls = [
|
||||
f"https://www.gupy.fr/series/{search_slug}/",
|
||||
f"https://neko-sama.org/series/{search_slug}/",
|
||||
@@ -250,34 +276,40 @@ class NekoSamaDownloader(BaseAnimeSite):
|
||||
print(f"[NEKO-SAMA] Found anime at {final_url}")
|
||||
|
||||
# Extract title from page
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
title_elem = soup.find('h1') or soup.find('title')
|
||||
title = unescape(title_elem.get_text(strip=True)) if title_elem else query
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
title_elem = soup.find("h1") or soup.find("title")
|
||||
title = (
|
||||
unescape(title_elem.get_text(strip=True))
|
||||
if title_elem
|
||||
else query
|
||||
)
|
||||
# Clean up title
|
||||
title = title.split('|')[0].split('-')[0].strip()
|
||||
title = title.split("|")[0].split("-")[0].strip()
|
||||
|
||||
result = {
|
||||
'title': title,
|
||||
'url': final_url,
|
||||
'cover_image': None,
|
||||
'type': 'direct',
|
||||
'metadata': None
|
||||
"title": title,
|
||||
"url": final_url,
|
||||
"cover_image": None,
|
||||
"type": "direct",
|
||||
"metadata": None,
|
||||
}
|
||||
|
||||
# Try to get poster
|
||||
poster = soup.find('img', src=re.compile(r'poster'))
|
||||
poster = soup.find("img", src=re.compile(r"poster"))
|
||||
if poster:
|
||||
result['cover_image'] = poster.get('src')
|
||||
result["cover_image"] = poster.get("src")
|
||||
|
||||
if include_metadata:
|
||||
metadata = await self.get_anime_metadata(final_url)
|
||||
result['metadata'] = metadata
|
||||
result["metadata"] = metadata
|
||||
|
||||
results.append(result)
|
||||
break
|
||||
|
||||
elapsed = time.time() - start
|
||||
print(f"[NEKO-SAMA] Search completed in {elapsed:.2f}s, found {len(results)} results")
|
||||
print(
|
||||
f"[NEKO-SAMA] Search completed in {elapsed:.2f}s, found {len(results)} results"
|
||||
)
|
||||
return results
|
||||
|
||||
except Exception as e:
|
||||
|
||||
@@ -9,6 +9,10 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
|
||||
BASE_DOMAINS = ["vostfree.tv", "www.vostfree.tv"]
|
||||
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.id = "vostfree"
|
||||
|
||||
def can_handle(self, url: str) -> bool:
|
||||
return any(domain in url.lower() for domain in self.BASE_DOMAINS)
|
||||
|
||||
@@ -16,35 +20,35 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
"""Extract download link from vostfree URL"""
|
||||
try:
|
||||
response = await self.client.get(url, follow_redirects=True)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
# Method 1: Look for iframe players
|
||||
iframes = soup.find_all('iframe')
|
||||
iframes = soup.find_all("iframe")
|
||||
for iframe in iframes:
|
||||
src = iframe.get('src', '')
|
||||
if src and any(p in src for p in ['player', 'video', 'stream']):
|
||||
if not src.startswith('http'):
|
||||
src = iframe.get("src", "")
|
||||
if src and any(p in src for p in ["player", "video", "stream"]):
|
||||
if not src.startswith("http"):
|
||||
src = urljoin(str(response.url), src)
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
# Method 2: Look for video tags
|
||||
videos = soup.find_all('video')
|
||||
videos = soup.find_all("video")
|
||||
for video in videos:
|
||||
src = video.get('src')
|
||||
src = video.get("src")
|
||||
if src:
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
sources = video.find_all('source')
|
||||
sources = video.find_all("source")
|
||||
for source in sources:
|
||||
src = source.get('src', '')
|
||||
if src and any(ext in src for ext in ['mp4', 'm3u8']):
|
||||
src = source.get("src", "")
|
||||
if src and any(ext in src for ext in ["mp4", "m3u8"]):
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return src, filename
|
||||
|
||||
# Method 3: Look in scripts
|
||||
scripts = soup.find_all('script')
|
||||
scripts = soup.find_all("script")
|
||||
for script in scripts:
|
||||
if script.string:
|
||||
patterns = [
|
||||
@@ -56,8 +60,8 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
for pattern in patterns:
|
||||
matches = re.findall(pattern, script.string)
|
||||
for match in matches:
|
||||
match = match.replace('\\/', '/')
|
||||
if any(ext in match for ext in ['mp4', 'm3u8']):
|
||||
match = match.replace("\\/", "/")
|
||||
if any(ext in match for ext in ["mp4", "m3u8"]):
|
||||
filename = self._generate_filename(str(response.url))
|
||||
return match, filename
|
||||
|
||||
@@ -67,12 +71,12 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
raise Exception(f"Error extracting Vostfree link: {str(e)}")
|
||||
|
||||
def _generate_filename(self, url: str) -> str:
|
||||
parts = url.split('/')
|
||||
parts = url.split("/")
|
||||
anime_name = "anime"
|
||||
episode = "1"
|
||||
|
||||
for part in parts:
|
||||
match = re.search(r'episode[-\s]*(\d+)', part, re.I)
|
||||
match = re.search(r"episode[-\s]*(\d+)", part, re.I)
|
||||
if match:
|
||||
episode = match.group(1)
|
||||
|
||||
@@ -82,30 +86,30 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
async def get_episodes(self, anime_url: str, lang: str = "vostfr") -> list[dict]:
|
||||
try:
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
episodes = []
|
||||
episode_links = soup.find_all('a', href=re.compile(r'episode', re.I))
|
||||
episode_links = soup.find_all("a", href=re.compile(r"episode", re.I))
|
||||
|
||||
for link in episode_links:
|
||||
href = link.get('href', '')
|
||||
match = re.search(r'episode[-\s]*(\d+)', href, re.I)
|
||||
href = link.get("href", "")
|
||||
match = re.search(r"episode[-\s]*(\d+)", href, re.I)
|
||||
if match:
|
||||
episode_num = match.group(1)
|
||||
if not href.startswith('http'):
|
||||
if not href.startswith("http"):
|
||||
href = urljoin(anime_url, href)
|
||||
|
||||
episodes.append({'episode': episode_num, 'url': href})
|
||||
episodes.append({"episode": episode_num, "url": href})
|
||||
|
||||
# Deduplicate and sort
|
||||
seen = set()
|
||||
unique_episodes = []
|
||||
for ep in episodes:
|
||||
if ep['episode'] not in seen:
|
||||
seen.add(ep['episode'])
|
||||
if ep["episode"] not in seen:
|
||||
seen.add(ep["episode"])
|
||||
unique_episodes.append(ep)
|
||||
|
||||
unique_episodes.sort(key=lambda x: int(x['episode']))
|
||||
unique_episodes.sort(key=lambda x: int(x["episode"]))
|
||||
return unique_episodes
|
||||
|
||||
except Exception as e:
|
||||
@@ -119,29 +123,29 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
try:
|
||||
print(f"[VOSTFREE] Extracting metadata from: {anime_url}")
|
||||
response = await self.client.get(anime_url)
|
||||
soup = BeautifulSoup(response.text, 'lxml')
|
||||
soup = BeautifulSoup(response.text, "lxml")
|
||||
|
||||
metadata = {
|
||||
'synopsis': None,
|
||||
'genres': [],
|
||||
'rating': None,
|
||||
'release_year': None,
|
||||
'studio': None,
|
||||
'poster_image': None,
|
||||
'banner_image': None,
|
||||
'total_episodes': None,
|
||||
'status': None,
|
||||
'alternative_titles': []
|
||||
"synopsis": None,
|
||||
"genres": [],
|
||||
"rating": None,
|
||||
"release_year": None,
|
||||
"studio": None,
|
||||
"poster_image": None,
|
||||
"banner_image": None,
|
||||
"total_episodes": None,
|
||||
"status": None,
|
||||
"alternative_titles": [],
|
||||
}
|
||||
|
||||
# Extract synopsis
|
||||
synopsis_selectors = [
|
||||
'div.synopsis',
|
||||
'div.description',
|
||||
"div.synopsis",
|
||||
"div.description",
|
||||
'div[class*="synopsis"]',
|
||||
'div[class*="desc"]',
|
||||
'p.synopsis',
|
||||
'.anime-synopsis'
|
||||
"p.synopsis",
|
||||
".anime-synopsis",
|
||||
]
|
||||
|
||||
for selector in synopsis_selectors:
|
||||
@@ -149,57 +153,65 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
if synopsis_elem:
|
||||
synopsis = synopsis_elem.get_text(strip=True)
|
||||
if len(synopsis) > 50:
|
||||
metadata['synopsis'] = synopsis
|
||||
metadata["synopsis"] = synopsis
|
||||
break
|
||||
|
||||
# Extract genres
|
||||
genre_links = soup.find_all('a', href=re.compile(r'genre|tag|type', re.I))
|
||||
genre_links = soup.find_all("a", href=re.compile(r"genre|tag|type", re.I))
|
||||
if genre_links:
|
||||
metadata['genres'] = [link.get_text(strip=True) for link in genre_links[:5]]
|
||||
metadata["genres"] = [
|
||||
link.get_text(strip=True) for link in genre_links[:5]
|
||||
]
|
||||
|
||||
# Extract rating
|
||||
rating_selectors = [
|
||||
'span.rating',
|
||||
'div.rating',
|
||||
'span.score',
|
||||
"span.rating",
|
||||
"div.rating",
|
||||
"span.score",
|
||||
'div[class*="rating"]',
|
||||
'div[class*="score"]'
|
||||
'div[class*="score"]',
|
||||
]
|
||||
|
||||
for selector in rating_selectors:
|
||||
rating_elem = soup.select_one(selector)
|
||||
if rating_elem:
|
||||
rating_text = rating_elem.get_text(strip=True)
|
||||
rating_match = re.search(r'(\d+\.?\d*)\s*/\s*10', rating_text)
|
||||
rating_match = re.search(r"(\d+\.?\d*)\s*/\s*10", rating_text)
|
||||
if rating_match:
|
||||
metadata['rating'] = f"{rating_match.group(1)}/10"
|
||||
metadata["rating"] = f"{rating_match.group(1)}/10"
|
||||
break
|
||||
|
||||
# Extract release year
|
||||
page_text = soup.get_text()
|
||||
year_matches = re.findall(r'\b(19\d{2}|20\d{2})\b', page_text)
|
||||
year_matches = re.findall(r"\b(19\d{2}|20\d{2})\b", page_text)
|
||||
if year_matches:
|
||||
import datetime
|
||||
|
||||
current_year = datetime.datetime.now().year + 2
|
||||
valid_years = [int(y) for y in year_matches if 1950 <= int(y) <= current_year]
|
||||
valid_years = [
|
||||
int(y) for y in year_matches if 1950 <= int(y) <= current_year
|
||||
]
|
||||
if valid_years:
|
||||
from collections import Counter
|
||||
metadata['release_year'] = Counter(valid_years).most_common(1)[0][0]
|
||||
|
||||
metadata["release_year"] = Counter(valid_years).most_common(1)[0][0]
|
||||
|
||||
# Extract poster image
|
||||
poster_elem = soup.select_one('img.poster, img.cover, .anime-poster img')
|
||||
poster_elem = soup.select_one("img.poster, img.cover, .anime-poster img")
|
||||
if poster_elem:
|
||||
metadata['poster_image'] = poster_elem.get('src') or poster_elem.get('data-src')
|
||||
metadata["poster_image"] = poster_elem.get("src") or poster_elem.get(
|
||||
"data-src"
|
||||
)
|
||||
|
||||
# Extract poster from og:image
|
||||
og_image = soup.find('meta', property='og:image')
|
||||
if og_image and not metadata['poster_image']:
|
||||
metadata['poster_image'] = og_image.get('content')
|
||||
og_image = soup.find("meta", property="og:image")
|
||||
if og_image and not metadata["poster_image"]:
|
||||
metadata["poster_image"] = og_image.get("content")
|
||||
|
||||
# Extract total episodes
|
||||
episodes_count = len(await self.get_episodes(anime_url))
|
||||
if episodes_count > 0:
|
||||
metadata['total_episodes'] = episodes_count
|
||||
metadata["total_episodes"] = episodes_count
|
||||
|
||||
print(f"[VOSTFREE] Extracted metadata: {metadata}")
|
||||
return metadata
|
||||
@@ -208,7 +220,9 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
print(f"[VOSTFREE] Error extracting metadata: {e}")
|
||||
return {}
|
||||
|
||||
async def search_anime(self, query: str, lang: str = "vostfr", include_metadata: bool = False) -> list[dict]:
|
||||
async def search_anime(
|
||||
self, query: str, lang: str = "vostfr", include_metadata: bool = False
|
||||
) -> list[dict]:
|
||||
"""
|
||||
Search for anime on vostfree
|
||||
|
||||
@@ -219,6 +233,7 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
"""
|
||||
try:
|
||||
import time
|
||||
|
||||
start = time.time()
|
||||
print(f"[VOSTFREE] Searching for '{query}' ({lang})...")
|
||||
|
||||
@@ -233,15 +248,15 @@ class VostfreeDownloader(BaseAnimeSite):
|
||||
if response.status_code == 200:
|
||||
print(f"[VOSTFREE] Found anime at {str(response.url)}")
|
||||
result = {
|
||||
'title': query,
|
||||
'url': str(response.url),
|
||||
'type': 'direct',
|
||||
'metadata': None
|
||||
"title": query,
|
||||
"url": str(response.url),
|
||||
"type": "direct",
|
||||
"metadata": None,
|
||||
}
|
||||
|
||||
if include_metadata:
|
||||
metadata = await self.get_anime_metadata(str(response.url))
|
||||
result['metadata'] = metadata
|
||||
result["metadata"] = metadata
|
||||
|
||||
return [result]
|
||||
|
||||
|
||||
Reference in New Issue
Block a user