SONE-162-JAVHD-TODAY-04192024-JAVHD-TODAY02-23-...

    # Detect duplicate JAVHD-TODAY pattern
    if filename.count("JAVHD-TODAY") > 1:
        features["is_duplicate_tag"] = True

    import re
    from datetime import datetime

    # Extract date (MMDDYYYY)
    date_match = re.search(r'(\d{2})(\d{2})(\d{4})', filename)
    if date_match:
        try:
            date_str = f"{date_match.group(1)}/{date_match.group(2)}/{date_match.group(3)}"
            features["release_date"] = datetime.strptime(date_str, "%m/%d/%Y").date().isoformat()
        except ValueError:
            # The eight digits do not form a valid calendar date; skip them
            pass

    # Extract movie ID (e.g., SONE-162)
    movie_match = re.search(r'([A-Z]+-\d+)', filename)
    if movie_match:
        features["movie_id"] = movie_match.group(1)

It looks like you're referencing a filename pattern from a JAV (Japanese Adult Video) source, possibly an MP4 file naming convention that includes a code (SONE-162), a site label (JAVHD), and dates.
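The three pieces of that naming convention can be pulled out in a single pass. What follows is a minimal sketch, not a definitive parser: the function name `extract_features` is hypothetical, and it assumes the movie ID is written in uppercase and that the first eight consecutive digits in the name are a date in MMDDYYYY order.

```python
import re
from datetime import datetime


def extract_features(filename: str) -> dict:
    """Collect the movie ID, release date, and duplicate-tag flag from a filename.

    Hypothetical helper; the keys mirror the ones used in the snippets above.
    """
    features = {}

    # Movie ID: uppercase letters, a hyphen, then digits (e.g. SONE-162).
    movie_match = re.search(r"([A-Z]+-\d+)", filename)
    if movie_match:
        features["movie_id"] = movie_match.group(1)

    # Release date: first eight consecutive digits, assumed to be MMDDYYYY.
    date_match = re.search(r"(\d{2})(\d{2})(\d{4})", filename)
    if date_match:
        try:
            date_str = "/".join(date_match.groups())
            parsed = datetime.strptime(date_str, "%m/%d/%Y")
            features["release_date"] = parsed.date().isoformat()
        except ValueError:
            pass  # eight digits that do not form a valid calendar date

    # Flag names in which the site tag appears more than once.
    if filename.count("JAVHD-TODAY") > 1:
        features["is_duplicate_tag"] = True

    return features


print(extract_features("SONE-162-JAVHD-TODAY-04192024-JAVHD-TODAY02-23-..."))
# → {'movie_id': 'SONE-162', 'release_date': '2024-04-19', 'is_duplicate_tag': True}
```

Keys are only set when a pattern is found, so a name that matches nothing yields an empty dict. A lowercase variant of the name would not produce a `movie_id` unless the input is uppercased first or `re.IGNORECASE` is passed to the search.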