# 🏭 THE FACTORY — RunPod Implementation

## Serverless Deployment Guide v1.0

**SFE/HST Enterprise System v5.0**

---

## Table of Contents

1. [Overview](#1-overview)
2. [RunPod Architecture](#2-runpod-architecture)
3. [Project Structure](#3-project-structure)
4. [Serverless Handler](#4-serverless-handler)
5. [Specialized Workers](#5-specialized-workers)
6. [Dockerfile](#6-dockerfile)
7. [Model Configuration](#7-model-configuration)
8. [Deployment](#8-deployment)
9. [Local Testing](#9-local-testing)
10. [Monitoring and Logs](#10-monitoring-and-logs)
11. [Estimated Costs](#11-estimated-costs)

---

## 1. Overview

### 1.1 Why RunPod?

RunPod Serverless lets The Factory run with:

| Advantage | Description |
|-----------|-------------|
| **GPU on demand** | You only pay while there is work |
| **Auto-scaling** | Scales from 0 to N workers automatically |
| **Zero idle cost** | No cost when there are no Jobs |
| **Fast cold start** | ~30 s to start a worker |
| **GPU variety** | From RTX 4000 up to H100 |

### 1.2 Execution Model

```
┌──────────────────────────────────────────────────────────────┐
│                      RUNPOD SERVERLESS                       │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│   THE CORP (any client)                                      │
│        │                                                     │
│        │ POST /run                                           │
│        ▼                                                     │
│   ┌─────────────────┐                                        │
│   │ RunPod Endpoint │ ◄─── Auto-scaling                      │
│   └────────┬────────┘                                        │
│            │                                                 │
│            ▼                                                 │
│   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐        │
│   │ Worker GPU  │   │ Worker GPU  │   │ Worker GPU  │        │
│   │ THE FACTORY │   │ THE FACTORY │   │ THE FACTORY │        │
│   └──────┬──────┘   └─────────────┘   └─────────────┘        │
│          │                                                   │
│          ▼                                                   │
│   ┌──────────────────┐                                       │
│   │ External Storage │ (S3, Hostinger, etc.)                 │
│   └──────────────────┘                                       │
└──────────────────────────────────────────────────────────────┘
```

### 1.3 Life of a Job on RunPod

1. The **client** submits a Job to the RunPod endpoint
2. **RunPod** assigns an available worker (or starts a new one)
3. The **worker** runs the handler with the Job
4. The **handler** runs The Factory's iterative cycle
5. The **worker** returns the result
6. **RunPod** scales down when there is no more work
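Step 1 of this flow is a single HTTP call. A minimal client-side sketch with `httpx` (the endpoint ID, the `RUNPOD_API_KEY` variable, and the `s3://` refs are placeholders; sections 7 and 8 cover the real values):

```python
"""Submit a Job to The Factory's RunPod endpoint (client-side sketch)."""
import os

import httpx

ENDPOINT_ID = "abc123xyz"  # placeholder: your endpoint ID (section 8.3)
RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]

payload = {
    "input": {
        "seed": "Write a haiku about Python",
        "objective": "A valid 5-7-5 haiku about Python",
        "function": "TEXT_GENERATION",
        "output_ref": "s3://the-factory-artifacts/demo/out.md",  # placeholder refs
        "log_ref": "s3://the-factory-artifacts/demo/log.json",
        "limits": {"max_cycles": 3, "budget_usd": 0.5},
    }
}

# POST /run queues the Job; RunPod answers immediately with a job id (steps 1-2 above).
response = httpx.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/run",
    headers={"Authorization": f"Bearer {RUNPOD_API_KEY}"},
    json=payload,
    timeout=30.0,
)
response.raise_for_status()
print(response.json())  # e.g. {"id": "...", "status": "IN_QUEUE"}
```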
---

## 2. RunPod Architecture

### 2.1 Components

```
the-factory-runpod/
├── src/
│   ├── handler.py              # RunPod entry point
│   ├── factory/
│   │   ├── __init__.py
│   │   ├── job_manager.py      # Main orchestrator
│   │   ├── executor.py         # Artifact producer
│   │   ├── evaluator.py        # Quality evaluator
│   │   └── logger.py           # Logging system
│   ├── workers/
│   │   ├── __init__.py
│   │   ├── text_worker.py      # Text/code worker
│   │   ├── image_worker.py     # Image worker
│   │   ├── audio_worker.py     # Audio worker
│   │   └── video_worker.py     # Video worker
│   ├── models/
│   │   ├── __init__.py
│   │   ├── schemas.py          # Pydantic models
│   │   └── enums.py            # Enumerations
│   └── utils/
│       ├── __init__.py
│       ├── storage.py          # External storage access
│       └── contracts.py        # Adapter for Contract v1.2
├── Dockerfile
├── requirements.txt
├── test_input.json
└── README.md
```

### 2.2 GPU Choice per Function

| Function | Minimum GPU | Recommended GPU | VRAM |
|----------|-------------|-----------------|------|
| TEXT_GENERATION | CPU / RTX 4000 | RTX 4000 | 4 GB+ |
| IMAGE_GENERATION | RTX 3090 | RTX 4090 / A40 | 24 GB |
| AUDIO_GENERATION | RTX 3060 | RTX 4070 | 8 GB |
| VIDEO_GENERATION | A40 | A100 | 48 GB+ |
| CODE_GENERATION | CPU / RTX 4000 | RTX 4000 | 4 GB+ |

### 2.3 Separate Endpoints vs. a Single One

**Option A: single endpoint (recommended to start)**
- One endpoint handles every function
- The handler picks a worker based on `function`
- Simplest to maintain

**Option B: one endpoint per function**
- One endpoint per type (text, image, audio, video)
- Allows GPUs optimized per type
- Better for high volume

---

## 3. Project Structure

### 3.1 Schemas (models/schemas.py)

```python
"""
Pydantic schemas for The Factory
"""
from datetime import datetime
from typing import Optional, List, Dict, Any
from pydantic import BaseModel, Field
from uuid import uuid4

from .enums import JobFunction, JobStatus, StopReason


class Limits(BaseModel):
    """Job constraints"""
    max_cycles: int = Field(default=10, ge=1, le=50)
    budget_usd: float = Field(default=10.0, ge=0.1, le=100.0)
    timeout_minutes: int = Field(default=60, ge=1, le=480)
    min_confidence: float = Field(default=0.8, ge=0.5, le=1.0)


class JobInput(BaseModel):
    """Input of a Job"""
    job_id: str = Field(default_factory=lambda: str(uuid4()))
    seed: str = Field(..., min_length=1, max_length=50000)
    objective: str = Field(..., min_length=1, max_length=5000)
    function: JobFunction = Field(default=JobFunction.TEXT_GENERATION)
    input_refs: List[str] = Field(default_factory=list)
    output_ref: str = Field(...)
    log_ref: str = Field(...)
    limits: Limits = Field(default_factory=Limits)

    # Optional: model configuration
    executor_model: Optional[str] = None
    evaluator_model: Optional[str] = None


class JobState(BaseModel):
    """State of a running Job"""
    job_id: str
    status: JobStatus = JobStatus.PENDING
    current_iteration: int = 0
    spent_usd: float = 0.0
    stop_reason: Optional[StopReason] = None
    final_artifact_ref: Optional[str] = None
    last_functional_iteration: Optional[int] = None
    error_message: Optional[str] = None
    created_at: datetime = Field(default_factory=datetime.utcnow)
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None


class JobResult(BaseModel):
    """Final result of a Job"""
    job_id: str
    status: JobStatus
    stop_reason: StopReason
    converged: bool
    final_artifact_ref: Optional[str]
    last_functional_iteration: Optional[int]
    total_iterations: int
    total_cost_usd: float
    total_duration_ms: int
    created_at: datetime
    started_at: datetime
    completed_at: datetime

    # Additional metadata
    executor_model_used: Optional[str] = None
    evaluator_model_used: Optional[str] = None
    iterations_detail: List[Dict[str, Any]] = Field(default_factory=list)


class IterationInput(BaseModel):
    """Input for one iteration"""
    iteration_number: int
    seed: str
    previous_feedback: Optional[str] = None
    accumulated_context: Dict[str, Any] = Field(default_factory=dict)


class IterationOutput(BaseModel):
    """Output of one iteration"""
    artifact_ref: str
    artifact_hash: str
    executor_model: str
    cost_usd: float
    tokens_used: Optional[int] = None
    metadata: Dict[str, Any] = Field(default_factory=dict)


class EvaluationInput(BaseModel):
    """Input for an evaluation"""
    job_objective: str
    iteration_number: int
    artifact_ref: str
    artifact_content: Optional[str] = None  # For text
    previous_evaluations: List[Dict[str, Any]] = Field(default_factory=list)


class EvaluationResult(BaseModel):
    """Evaluation result"""
    converged: bool
    confidence: float = Field(ge=0.0, le=1.0)
    strengths: List[str] = Field(default_factory=list)
    weaknesses: List[str] = Field(default_factory=list)
    feedback_for_executor: Optional[str] = None
    evaluator_model: str
    cost_usd: float
```

### 3.2 Enums (models/enums.py)

```python
"""
Enumerations for The Factory
"""
from enum import Enum


class JobFunction(str, Enum):
    TEXT_GENERATION = "TEXT_GENERATION"
    IMAGE_GENERATION = "IMAGE_GENERATION"
    AUDIO_GENERATION = "AUDIO_GENERATION"
    VIDEO_GENERATION = "VIDEO_GENERATION"
    CODE_GENERATION = "CODE_GENERATION"
    DOCUMENT_GENERATION = "DOCUMENT_GENERATION"
    MULTIMODAL = "MULTIMODAL"


class JobStatus(str, Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    CONVERGED = "CONVERGED"
    EXHAUSTED = "EXHAUSTED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


class StopReason(str, Enum):
    CONFORMITY = "CONFORMITY"
    MAX_CYCLES = "MAX_CYCLES"
    BUDGET = "BUDGET"
    TIMEOUT = "TIMEOUT"
    ERROR = "ERROR"
    CANCELLED = "CANCELLED"


class EventType(str, Enum):
    JOB_CREATED = "JOB_CREATED"
    JOB_STARTED = "JOB_STARTED"
    ITERATION_START = "ITERATION_START"
    EXECUTOR_CALL = "EXECUTOR_CALL"
    EXECUTOR_RESPONSE = "EXECUTOR_RESPONSE"
    EVALUATOR_CALL = "EVALUATOR_CALL"
    EVALUATOR_RESPONSE = "EVALUATOR_RESPONSE"
    ITERATION_END = "ITERATION_END"
    JOB_CONVERGED = "JOB_CONVERGED"
    JOB_MAX_ITERATIONS = "JOB_MAX_ITERATIONS"
    JOB_BUDGET_EXHAUSTED = "JOB_BUDGET_EXHAUSTED"
    JOB_ERROR = "JOB_ERROR"
    JOB_COMPLETED = "JOB_COMPLETED"
```
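### 3.3 Schema Usage (illustrative)

The schemas double as the endpoint's input contract: whatever the client sends under `input` must validate as a `JobInput` before any GPU time is spent. A quick sketch of that behavior (field values are illustrative):

```python
from pydantic import ValidationError

from src.models.schemas import JobInput

# Valid payload: job_id, function and limits are filled in by defaults.
job = JobInput(
    seed="Write a haiku about Python",
    objective="A valid 5-7-5 haiku",
    output_ref="/tmp/artifacts/haiku.md",
    log_ref="/tmp/logs/job.json",
)
assert job.limits.max_cycles == 10              # default from Limits
assert job.function.value == "TEXT_GENERATION"  # default function

# Invalid payload: missing output_ref/log_ref is rejected up front.
try:
    JobInput(seed="x", objective="y")
except ValidationError as exc:
    print(exc.error_count(), "validation errors")  # 2 validation errors
```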
---

## 4. Serverless Handler

### 4.1 Main Handler (handler.py)

```python
"""
═══════════════════════════════════════════════════════════════════════════════
THE FACTORY — RunPod Serverless Handler
═══════════════════════════════════════════════════════════════════════════════
"""
import asyncio
import traceback

import runpod

from src.factory.job_manager import JobManager
from src.models.schemas import JobInput, JobResult


def handler(job: dict) -> dict:
    """
    RunPod serverless handler.

    Receives a Job, runs The Factory's iterative cycle, and returns the result.

    Args:
        job: Dict shaped as {"id": "...", "input": {...}}

    Returns:
        Dict with the Job result
    """
    job_id = job.get("id", "unknown")
    job_input_raw = job.get("input", {})

    try:
        # Validate input
        job_input = JobInput(**job_input_raw)

        # Run the Factory cycle (sync wrapper around async code);
        # asyncio.run creates a fresh event loop per invocation
        result = asyncio.run(process_job(job_input))

        return {
            "status": "success",
            # mode="json" makes datetimes/enums JSON-serializable
            "result": result.model_dump(mode="json")
        }

    except Exception as e:
        error_trace = traceback.format_exc()
        return {
            "status": "error",
            "error": str(e),
            "traceback": error_trace,
            "job_id": job_input_raw.get("job_id", job_id)
        }


async def process_job(job_input: JobInput) -> JobResult:
    """
    Processes a complete Job through the iterative cycle.

    Args:
        job_input: validated JobInput

    Returns:
        JobResult with the final outcome
    """
    manager = JobManager()
    try:
        # Initialize workers
        await manager.initialize()
        # Process the Job (full iterative cycle)
        return await manager.process_job(job_input)
    finally:
        # Cleanup
        await manager.shutdown()


# Progress updates for long-running jobs
def handler_with_progress(job: dict) -> dict:
    """
    Handler with progress updates. Useful for jobs that take a long time.
""" job_id = job.get("id", "unknown") job_input_raw = job.get("input", {}) try: job_input = JobInput(**job_input_raw) # Callback para progreso def progress_callback(iteration: int, total: int, message: str): runpod.serverless.progress_update( job, f"Iteration {iteration}/{total}: {message}" ) result = asyncio.get_event_loop().run_until_complete( process_job_with_progress(job_input, progress_callback) ) return { "status": "success", "result": result.model_dump() } except Exception as e: return { "status": "error", "error": str(e), "job_id": job_input_raw.get("job_id", job_id) } async def process_job_with_progress( job_input: JobInput, progress_callback ) -> JobResult: """VersiΓ³n con callbacks de progreso""" manager = JobManager(progress_callback=progress_callback) try: await manager.initialize() result = await manager.process_job(job_input) return result finally: await manager.shutdown() # Iniciar serverless if __name__ == "__main__": runpod.serverless.start({ "handler": handler, # "handler": handler_with_progress, # Alternativa con progreso "return_aggregate_stream": False }) ``` ### 4.2 Job Manager (factory/job_manager.py) ```python """ ═══════════════════════════════════════════════════════════════════════════════ THE FACTORY β€” Job Manager Orquestador del ciclo iterativo ═══════════════════════════════════════════════════════════════════════════════ """ import logging from datetime import datetime from typing import Optional, Callable from src.models.schemas import ( JobInput, JobResult, JobState, IterationInput, IterationOutput, EvaluationInput, EvaluationResult, ) from src.models.enums import JobStatus, StopReason, EventType from src.factory.executor import Executor from src.factory.evaluator import Evaluator from src.factory.logger import FactoryLogger logger = logging.getLogger("factory.job_manager") class JobManager: """ Gestiona el ciclo de vida de Jobs. Flujo: Job ──► EJECUTOR ──► Artefacto ──► EVALUADOR ──► ΒΏConverge? β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β”‚ SΓ­β”‚No β”‚ β”‚ β”‚ β”‚ β–Ό └──────────────┐ β”‚ FIN β”‚ β”‚ β–Ό └─────────── Nueva iteraciΓ³n """ def __init__(self, progress_callback: Optional[Callable] = None): self.executor = Executor() self.evaluator = Evaluator() self.progress_callback = progress_callback self._initialized = False async def initialize(self): """Inicializa workers.""" if self._initialized: return await self.executor.initialize() await self.evaluator.initialize() self._initialized = True logger.info("JobManager initialized") async def shutdown(self): """Cleanup.""" if not self._initialized: return await self.executor.shutdown() await self.evaluator.shutdown() self._initialized = False logger.info("JobManager shutdown") async def process_job(self, job_input: JobInput) -> JobResult: """ Procesa un Job hasta convergencia o lΓ­mite. CICLO PRINCIPAL: 1. Ejecutor genera artefacto 2. Evaluador evalΓΊa 3. Si converge β†’ FIN 4. Si no converge β†’ feedback β†’ nueva iteraciΓ³n 5. 
        5. If a limit is reached → END with a report
        """
        state = JobState(job_id=job_input.job_id)
        job_logger = FactoryLogger(job_input.job_id, job_input.log_ref)
        iterations_detail = []

        previous_feedback: Optional[str] = None
        accumulated_context: dict = {}
        # Initialized before the try so the final JobResult can always reference them
        executor_model_used = None
        evaluator_model_used = None

        try:
            state.status = JobStatus.RUNNING
            state.started_at = datetime.utcnow()

            await job_logger.log(EventType.JOB_STARTED, {
                "seed_length": len(job_input.seed),
                "objective_length": len(job_input.objective),
                "function": job_input.function.value,
                "limits": job_input.limits.model_dump()
            })

            # ─────────────────────────────────────────────────────────────
            # ITERATIVE CYCLE
            # ─────────────────────────────────────────────────────────────
            while True:
                state.current_iteration += 1
                iteration = state.current_iteration
                iteration_start = datetime.utcnow()

                # Progress callback
                if self.progress_callback:
                    self.progress_callback(
                        iteration,
                        job_input.limits.max_cycles,
                        "Starting iteration"
                    )

                await job_logger.log(EventType.ITERATION_START, {
                    "iteration": iteration
                })

                # ─────────────────────────────────────────────────────────
                # CHECK: cycle limit reached?
                # ─────────────────────────────────────────────────────────
                if iteration > job_input.limits.max_cycles:
                    state.stop_reason = StopReason.MAX_CYCLES
                    await job_logger.log(EventType.JOB_MAX_ITERATIONS, {
                        "last_iteration": iteration - 1
                    })
                    break

                # ─────────────────────────────────────────────────────────
                # CHECK: budget limit reached?
                # ─────────────────────────────────────────────────────────
                if state.spent_usd >= job_input.limits.budget_usd:
                    state.stop_reason = StopReason.BUDGET
                    await job_logger.log(EventType.JOB_BUDGET_EXHAUSTED, {
                        "spent_usd": state.spent_usd,
                        "budget_usd": job_input.limits.budget_usd
                    })
                    break

                # ─────────────────────────────────────────────────────────
                # STEP 1: EXECUTOR
                # ─────────────────────────────────────────────────────────
                iter_input = IterationInput(
                    iteration_number=iteration,
                    seed=job_input.seed,
                    previous_feedback=previous_feedback,
                    accumulated_context=accumulated_context,
                )

                await job_logger.log(EventType.EXECUTOR_CALL, {
                    "iteration": iteration,
                    "has_feedback": previous_feedback is not None
                })

                try:
                    iter_output = await self.executor.execute(
                        job_input=job_input,
                        iter_input=iter_input,
                    )
                    executor_model_used = iter_output.executor_model
                except Exception as e:
                    logger.error(f"Executor error: {e}")
                    await job_logger.log(EventType.JOB_ERROR, {
                        "error": str(e),
                        "phase": "executor",
                        "iteration": iteration
                    })
                    state.stop_reason = StopReason.ERROR
                    state.error_message = str(e)
                    break

                await job_logger.log(EventType.EXECUTOR_RESPONSE, {
                    "artifact_ref": iter_output.artifact_ref,
                    "model": iter_output.executor_model,
                    "cost_usd": iter_output.cost_usd
                })

                state.spent_usd += iter_output.cost_usd

                # ─────────────────────────────────────────────────────────
                # STEP 2: EVALUATOR
                # ─────────────────────────────────────────────────────────
                eval_input = EvaluationInput(
                    job_objective=job_input.objective,
                    iteration_number=iteration,
                    artifact_ref=iter_output.artifact_ref,
                )

                await job_logger.log(EventType.EVALUATOR_CALL, {
                    "iteration": iteration,
                    "artifact_ref": iter_output.artifact_ref
                })

                try:
                    eval_result = await self.evaluator.evaluate(
                        job_input=job_input,
                        eval_input=eval_input,
                    )
                    evaluator_model_used = eval_result.evaluator_model
                except Exception as e:
                    logger.error(f"Evaluator error: {e}")
                    await job_logger.log(EventType.JOB_ERROR, {
                        "error": str(e),
                        "phase": "evaluator",
                        "iteration": iteration
                    })
                    state.stop_reason = StopReason.ERROR
                    state.error_message = str(e)
                    break
                await job_logger.log(EventType.EVALUATOR_RESPONSE, {
                    "converged": eval_result.converged,
                    "confidence": eval_result.confidence,
                    "model": eval_result.evaluator_model,
                    "cost_usd": eval_result.cost_usd
                })

                state.spent_usd += eval_result.cost_usd

                # Record per-iteration detail
                iteration_end = datetime.utcnow()
                iterations_detail.append({
                    "iteration": iteration,
                    "duration_ms": int((iteration_end - iteration_start).total_seconds() * 1000),
                    "converged": eval_result.converged,
                    "confidence": eval_result.confidence,
                    "cost_usd": iter_output.cost_usd + eval_result.cost_usd,
                    "artifact_ref": iter_output.artifact_ref
                })

                # Remember the last functional iteration
                if not eval_result.weaknesses:
                    state.last_functional_iteration = iteration
                    state.final_artifact_ref = iter_output.artifact_ref

                await job_logger.log(EventType.ITERATION_END, {
                    "iteration": iteration,
                    "converged": eval_result.converged,
                    "total_spent": state.spent_usd
                })

                # ─────────────────────────────────────────────────────────
                # STEP 3: CONVERGED?
                # ─────────────────────────────────────────────────────────
                if eval_result.converged and eval_result.confidence >= job_input.limits.min_confidence:
                    state.stop_reason = StopReason.CONFORMITY
                    state.final_artifact_ref = iter_output.artifact_ref
                    await job_logger.log(EventType.JOB_CONVERGED, {
                        "iteration": iteration,
                        "confidence": eval_result.confidence
                    })
                    break

                # Did not converge → prepare the next iteration
                previous_feedback = eval_result.feedback_for_executor
                accumulated_context["last_strengths"] = eval_result.strengths
                accumulated_context["last_weaknesses"] = eval_result.weaknesses
                accumulated_context["last_confidence"] = eval_result.confidence

            # ─────────────────────────────────────────────────────────────
            # FINALIZATION
            # ─────────────────────────────────────────────────────────────
            state.completed_at = datetime.utcnow()

            if state.stop_reason == StopReason.CONFORMITY:
                state.status = JobStatus.CONVERGED
            elif state.stop_reason == StopReason.ERROR:
                state.status = JobStatus.FAILED
            else:
                state.status = JobStatus.EXHAUSTED

            await job_logger.log(EventType.JOB_COMPLETED, {
                "status": state.status.value,
                "stop_reason": state.stop_reason.value if state.stop_reason else None,
                "total_iterations": state.current_iteration,
                "total_cost_usd": state.spent_usd
            })

            # Persist the final log
            await job_logger.save()

        except Exception as e:
            logger.exception(f"Fatal error processing job {job_input.job_id}")
            state.status = JobStatus.FAILED
            state.stop_reason = StopReason.ERROR
            state.error_message = str(e)
            state.completed_at = datetime.utcnow()

        # Build the result
        return JobResult(
            job_id=state.job_id,
            status=state.status,
            stop_reason=state.stop_reason or StopReason.ERROR,
            converged=state.stop_reason == StopReason.CONFORMITY,
            final_artifact_ref=state.final_artifact_ref,
            last_functional_iteration=state.last_functional_iteration,
            total_iterations=state.current_iteration,
            total_cost_usd=state.spent_usd,
            total_duration_ms=int(
                (state.completed_at - state.started_at).total_seconds() * 1000
            ) if state.started_at and state.completed_at else 0,
            created_at=state.created_at,
            started_at=state.started_at or state.created_at,
            completed_at=state.completed_at or datetime.utcnow(),
            executor_model_used=executor_model_used,
            evaluator_model_used=evaluator_model_used,
            iterations_detail=iterations_detail
        )
```
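### 4.3 Logger Sketch (factory/logger.py)

`FactoryLogger` is called throughout the cycle above, but its file is not listed in this guide. A minimal sketch of the two methods the Job Manager relies on, `log()` and `save()`, matching the log structure shown in section 10.2. Persisting to a local path is an assumption; a real deployment would route through the external storage of section 7.

```python
"""
Minimal sketch of factory/logger.py (illustrative, not the canonical implementation).
"""
import json
from datetime import datetime, timezone
from typing import Any

from src.models.enums import EventType


class FactoryLogger:
    """Buffers structured Job events and persists them as one JSON document."""

    def __init__(self, job_id: str, log_ref: str):
        self.job_id = job_id
        self.log_ref = log_ref  # destination of the log (local path in this sketch)
        self.events: list[dict[str, Any]] = []

    async def log(self, event: EventType, data: dict[str, Any]) -> None:
        # Event shape matches the example log in section 10.2
        self.events.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event": event.value,
            "data": data,
        })

    async def save(self) -> None:
        # Assumption: log_ref is a writable local path; swap in the external
        # storage helper (utils/storage.py) to ship logs to S3/Hostinger.
        with open(self.log_ref, "w", encoding="utf-8") as f:
            json.dump({"job_id": self.job_id, "events": self.events}, f, indent=2)
```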
---

## 5. Specialized Workers

### 5.1 Executor (factory/executor.py)

```python
"""
═══════════════════════════════════════════════════════════════════════════════
THE FACTORY — Executor
Artifact producer
═══════════════════════════════════════════════════════════════════════════════
"""
import hashlib
import logging

from src.models.schemas import JobInput, IterationInput, IterationOutput
from src.models.enums import JobFunction
from src.workers.text_worker import TextWorker
from src.workers.image_worker import ImageWorker
from src.workers.audio_worker import AudioWorker
from src.workers.video_worker import VideoWorker
from src.utils.storage import Storage

logger = logging.getLogger("factory.executor")


class Executor:
    """
    Function-agnostic executor that delegates to specialized workers.
    """

    def __init__(self):
        self.text_worker = TextWorker()
        self.image_worker = ImageWorker()
        self.audio_worker = AudioWorker()
        self.video_worker = VideoWorker()
        self.storage = Storage()
        self._initialized = False

    async def initialize(self):
        """Initializes workers according to availability."""
        if self._initialized:
            return
        await self.text_worker.initialize()
        await self.image_worker.initialize()
        await self.audio_worker.initialize()
        await self.video_worker.initialize()
        await self.storage.initialize()
        self._initialized = True
        logger.info("Executor initialized")

    async def shutdown(self):
        """Cleanup."""
        await self.text_worker.shutdown()
        await self.image_worker.shutdown()
        await self.audio_worker.shutdown()
        await self.video_worker.shutdown()
        self._initialized = False

    async def execute(
        self,
        job_input: JobInput,
        iter_input: IterationInput
    ) -> IterationOutput:
        """
        Runs one generation iteration.

        Delegates to the appropriate worker based on the Job's function.
""" function = job_input.function # Construir prompt con feedback si existe prompt = self._build_prompt(iter_input) # Delegar a worker apropiado if function in (JobFunction.TEXT_GENERATION, JobFunction.CODE_GENERATION): result = await self.text_worker.generate( prompt=prompt, model=job_input.executor_model, is_code=function == JobFunction.CODE_GENERATION ) elif function == JobFunction.IMAGE_GENERATION: result = await self.image_worker.generate( prompt=prompt, model=job_input.executor_model ) elif function == JobFunction.AUDIO_GENERATION: result = await self.audio_worker.generate( prompt=prompt, model=job_input.executor_model ) elif function == JobFunction.VIDEO_GENERATION: result = await self.video_worker.generate( prompt=prompt, model=job_input.executor_model ) else: # Default a texto result = await self.text_worker.generate( prompt=prompt, model=job_input.executor_model ) # Guardar artefacto en storage artifact_ref = await self.storage.save_artifact( content=result["content"], output_ref=job_input.output_ref, iteration=iter_input.iteration_number, extension=result.get("extension", "txt") ) # Calcular hash artifact_hash = hashlib.sha256( result["content"] if isinstance(result["content"], bytes) else result["content"].encode() ).hexdigest() return IterationOutput( artifact_ref=artifact_ref, artifact_hash=artifact_hash, executor_model=result["model"], cost_usd=result["cost_usd"], tokens_used=result.get("tokens_used"), metadata=result.get("metadata", {}) ) def _build_prompt(self, iter_input: IterationInput) -> str: """Construye prompt incluyendo feedback si existe.""" prompt = iter_input.seed if iter_input.previous_feedback: prompt = f"""INSTRUCCIΓ“N ORIGINAL: {iter_input.seed} FEEDBACK DE LA ITERACIΓ“N ANTERIOR: {iter_input.previous_feedback} Por favor, genera una nueva versiΓ³n que incorpore las mejoras sugeridas.""" return prompt ``` ### 5.2 Text Worker (workers/text_worker.py) ```python """ ═══════════════════════════════════════════════════════════════════════════════ THE FACTORY β€” Text Worker Worker para generaciΓ³n de texto/cΓ³digo via LLM ═══════════════════════════════════════════════════════════════════════════════ """ import os import logging from typing import Optional, Dict, Any import anthropic import openai logger = logging.getLogger("factory.workers.text") # Precios por 1M tokens (input/output) MODEL_PRICING = { "claude-sonnet-4-20250514": {"input": 3.0, "output": 15.0}, "claude-haiku-3-5-20241022": {"input": 0.25, "output": 1.25}, "gpt-4o": {"input": 2.5, "output": 10.0}, "gpt-4o-mini": {"input": 0.15, "output": 0.6}, } DEFAULT_MODEL = "claude-sonnet-4-20250514" class TextWorker: """Worker para generaciΓ³n de texto usando LLMs.""" def __init__(self): self.anthropic_client = None self.openai_client = None self._initialized = False async def initialize(self): """Inicializa clientes de API.""" if self._initialized: return # Anthropic anthropic_key = os.getenv("ANTHROPIC_API_KEY") if anthropic_key: self.anthropic_client = anthropic.Anthropic(api_key=anthropic_key) logger.info("Anthropic client initialized") # OpenAI openai_key = os.getenv("OPENAI_API_KEY") if openai_key: self.openai_client = openai.OpenAI(api_key=openai_key) logger.info("OpenAI client initialized") if not self.anthropic_client and not self.openai_client: raise RuntimeError("No LLM API keys configured") self._initialized = True async def shutdown(self): """Cleanup.""" self._initialized = False async def generate( self, prompt: str, model: Optional[str] = None, is_code: bool = False, max_tokens: int = 4096 ) -> 
        """
        Generates text with the specified model.

        Returns:
            Dict with content, model, cost_usd, tokens_used
        """
        model = model or DEFAULT_MODEL

        # Pick a client
        if model.startswith("claude") and self.anthropic_client:
            return await self._generate_anthropic(prompt, model, is_code, max_tokens)
        elif model.startswith("gpt") and self.openai_client:
            return await self._generate_openai(prompt, model, is_code, max_tokens)
        else:
            # Fallback
            if self.anthropic_client:
                return await self._generate_anthropic(prompt, DEFAULT_MODEL, is_code, max_tokens)
            else:
                return await self._generate_openai(prompt, "gpt-4o-mini", is_code, max_tokens)

    async def _generate_anthropic(
        self,
        prompt: str,
        model: str,
        is_code: bool,
        max_tokens: int
    ) -> Dict[str, Any]:
        """Generates using Anthropic Claude."""
        system = "You are an expert assistant for generating high-quality content."
        if is_code:
            system = "You are an expert programmer. Generate clean, efficient, well-documented code."

        # Note: this SDK call is synchronous and blocks the event loop;
        # anthropic.AsyncAnthropic is the drop-in alternative for real async use.
        response = self.anthropic_client.messages.create(
            model=model,
            max_tokens=max_tokens,
            system=system,
            messages=[{"role": "user", "content": prompt}]
        )

        content = response.content[0].text
        input_tokens = response.usage.input_tokens
        output_tokens = response.usage.output_tokens

        # Compute cost
        pricing = MODEL_PRICING.get(model, MODEL_PRICING[DEFAULT_MODEL])
        cost_usd = (
            (input_tokens / 1_000_000) * pricing["input"] +
            (output_tokens / 1_000_000) * pricing["output"]
        )

        return {
            "content": content,
            "model": model,
            "cost_usd": cost_usd,
            "tokens_used": input_tokens + output_tokens,
            "extension": "py" if is_code else "md",
            "metadata": {
                "input_tokens": input_tokens,
                "output_tokens": output_tokens
            }
        }

    async def _generate_openai(
        self,
        prompt: str,
        model: str,
        is_code: bool,
        max_tokens: int
    ) -> Dict[str, Any]:
        """Generates using OpenAI."""
        system = "You are an expert assistant for generating high-quality content."
        if is_code:
            system = "You are an expert programmer. Generate clean, efficient, well-documented code."
        response = self.openai_client.chat.completions.create(
            model=model,
            max_tokens=max_tokens,
            messages=[
                {"role": "system", "content": system},
                {"role": "user", "content": prompt}
            ]
        )

        content = response.choices[0].message.content
        input_tokens = response.usage.prompt_tokens
        output_tokens = response.usage.completion_tokens

        # Compute cost
        pricing = MODEL_PRICING.get(model, {"input": 2.5, "output": 10.0})
        cost_usd = (
            (input_tokens / 1_000_000) * pricing["input"] +
            (output_tokens / 1_000_000) * pricing["output"]
        )

        return {
            "content": content,
            "model": model,
            "cost_usd": cost_usd,
            "tokens_used": input_tokens + output_tokens,
            "extension": "py" if is_code else "md",
            "metadata": {
                "input_tokens": input_tokens,
                "output_tokens": output_tokens
            }
        }
```

### 5.3 Image Worker (workers/image_worker.py)

```python
"""
═══════════════════════════════════════════════════════════════════════════════
THE FACTORY — Image Worker
Worker for image generation
═══════════════════════════════════════════════════════════════════════════════
"""
import os
import base64
import logging
from typing import Optional, Dict, Any

import httpx

logger = logging.getLogger("factory.workers.image")

# Prices per image
IMAGE_PRICING = {
    "flux-1.1-pro": 0.04,
    "flux-schnell": 0.003,
    "dall-e-3": 0.04,
    "dall-e-3-hd": 0.08,
}

DEFAULT_MODEL = "flux-1.1-pro"


class ImageWorker:
    """Worker for image generation."""

    def __init__(self):
        self.fal_api_key = None
        self.openai_api_key = None
        self.replicate_api_key = None
        self._initialized = False

    async def initialize(self):
        """Initializes clients."""
        if self._initialized:
            return

        self.fal_api_key = os.getenv("FAL_API_KEY")
        self.openai_api_key = os.getenv("OPENAI_API_KEY")
        self.replicate_api_key = os.getenv("REPLICATE_API_KEY")

        if not any([self.fal_api_key, self.openai_api_key, self.replicate_api_key]):
            logger.warning("No image API keys configured - image generation disabled")

        self._initialized = True

    async def shutdown(self):
        """Cleanup."""
        self._initialized = False

    async def generate(
        self,
        prompt: str,
        model: Optional[str] = None,
        width: int = 1024,
        height: int = 1024
    ) -> Dict[str, Any]:
        """
        Generates an image with the specified model.
        Returns:
            Dict with content (bytes), model, cost_usd
        """
        model = model or DEFAULT_MODEL

        if model.startswith("flux") and self.fal_api_key:
            return await self._generate_fal(prompt, model, width, height)
        elif model.startswith("dall-e") and self.openai_api_key:
            return await self._generate_dalle(prompt, model, width, height)
        else:
            # Fallback
            if self.fal_api_key:
                return await self._generate_fal(prompt, "flux-schnell", width, height)
            elif self.openai_api_key:
                return await self._generate_dalle(prompt, "dall-e-3", width, height)
            else:
                raise RuntimeError("No image generation API configured")

    async def _generate_fal(
        self,
        prompt: str,
        model: str,
        width: int,
        height: int
    ) -> Dict[str, Any]:
        """Generates using Fal.ai (Flux)."""
        # Note: fal's published model paths may not map one-to-one to these
        # short names (e.g. "fal-ai/flux/schnell"); adjust the mapping if needed.
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"https://fal.run/fal-ai/{model}",
                headers={"Authorization": f"Key {self.fal_api_key}"},
                json={
                    "prompt": prompt,
                    "image_size": {"width": width, "height": height},
                    "num_images": 1
                },
                timeout=120.0
            )
            response.raise_for_status()
            data = response.json()

        # Download the image
        image_url = data["images"][0]["url"]
        async with httpx.AsyncClient() as client:
            img_response = await client.get(image_url)
            image_bytes = img_response.content

        cost_usd = IMAGE_PRICING.get(model, 0.04)

        return {
            "content": image_bytes,
            "model": model,
            "cost_usd": cost_usd,
            "extension": "png",
            "metadata": {
                "width": width,
                "height": height,
                "url": image_url
            }
        }

    async def _generate_dalle(
        self,
        prompt: str,
        model: str,
        width: int,
        height: int
    ) -> Dict[str, Any]:
        """Generates using OpenAI DALL-E."""
        import openai

        client = openai.OpenAI(api_key=self.openai_api_key)

        # DALL-E supports fixed sizes only
        size = "1024x1024"
        if width == 1792 or height == 1792:
            size = "1792x1024" if width > height else "1024x1792"

        # The API model id is always "dall-e-3"; the "-hd" variant maps to quality
        quality = "hd" if "hd" in model else "standard"

        response = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
            size=size,
            quality=quality,
            response_format="b64_json",
            n=1
        )

        image_bytes = base64.b64decode(response.data[0].b64_json)

        cost_usd = IMAGE_PRICING.get(model, 0.04)

        return {
            "content": image_bytes,
            "model": model,
            "cost_usd": cost_usd,
            "extension": "png",
            "metadata": {
                "size": size,
                "quality": quality
            }
        }
```

### 5.4 Evaluator (factory/evaluator.py)

```python
"""
═══════════════════════════════════════════════════════════════════════════════
THE FACTORY — Evaluator
Quality evaluator for artifacts
═══════════════════════════════════════════════════════════════════════════════
"""
import os
import json
import logging

import anthropic

from src.models.schemas import JobInput, EvaluationInput, EvaluationResult
from src.models.enums import JobFunction
from src.utils.storage import Storage

logger = logging.getLogger("factory.evaluator")

EVALUATION_PROMPT = """You are an expert evaluator. Your task is to decide whether an artifact meets the specified objective.
OBJECTIVE:
{objective}

ARTIFACT TO EVALUATE:
{artifact}

Evaluate the artifact and respond in JSON with exactly this structure:
{{
    "converged": true/false,
    "confidence": 0.0-1.0,
    "strengths": ["strength 1", "strength 2"],
    "weaknesses": ["weakness 1", "weakness 2"],
    "feedback_for_executor": "Specific instructions for improvement (if not converged)"
}}

CRITERIA:
- converged=true only if the artifact COMPLETELY meets the objective
- confidence indicates how sure you are of your evaluation
- If there are significant weaknesses, converged must be false
- Feedback must be specific and actionable

Respond ONLY with the JSON, no additional text."""


class Evaluator:
    """Artifact evaluator backed by an LLM."""

    def __init__(self):
        self.client = None
        self.provider = None
        self.storage = Storage()
        self._initialized = False

    async def initialize(self):
        """Initializes the client."""
        if self._initialized:
            return

        api_key = os.getenv("ANTHROPIC_API_KEY") or os.getenv("OPENAI_API_KEY")
        if not api_key:
            raise RuntimeError("No evaluator API key configured")

        if os.getenv("ANTHROPIC_API_KEY"):
            self.client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
            self.provider = "anthropic"
        else:
            import openai
            self.client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
            self.provider = "openai"

        await self.storage.initialize()
        self._initialized = True
        logger.info(f"Evaluator initialized with {self.provider}")

    async def shutdown(self):
        """Cleanup."""
        self._initialized = False

    async def evaluate(
        self,
        job_input: JobInput,
        eval_input: EvaluationInput
    ) -> EvaluationResult:
        """
        Evaluates an artifact against the objective.
        """
        # Load the artifact content
        artifact_content = await self._load_artifact(
            eval_input.artifact_ref,
            job_input.function
        )

        # Build the evaluation prompt
        prompt = EVALUATION_PROMPT.format(
            objective=eval_input.job_objective,
            artifact=artifact_content[:10000]  # Cap the size
        )

        # Call the evaluator
        if self.provider == "anthropic":
            result, cost_usd, model = await self._evaluate_anthropic(prompt)
        else:
            result, cost_usd, model = await self._evaluate_openai(prompt)

        return EvaluationResult(
            converged=result["converged"],
            confidence=result["confidence"],
            strengths=result.get("strengths", []),
            weaknesses=result.get("weaknesses", []),
            feedback_for_executor=result.get("feedback_for_executor"),
            evaluator_model=model,
            cost_usd=cost_usd
        )

    async def _load_artifact(self, artifact_ref: str, function: JobFunction) -> str:
        """Loads the artifact content for evaluation."""
        content = await self.storage.load_artifact(artifact_ref)

        if function in (JobFunction.TEXT_GENERATION, JobFunction.CODE_GENERATION):
            if isinstance(content, bytes):
                return content.decode("utf-8")
            return content
        elif function == JobFunction.IMAGE_GENERATION:
            return "[GENERATED IMAGE - evaluate from the prompt description]"
        elif function == JobFunction.AUDIO_GENERATION:
            return "[GENERATED AUDIO - evaluate from the prompt description]"
        elif function == JobFunction.VIDEO_GENERATION:
            return "[GENERATED VIDEO - evaluate from the prompt description]"
        else:
            if isinstance(content, bytes):
                return content.decode("utf-8", errors="ignore")
            return str(content)

    async def _evaluate_anthropic(self, prompt: str):
        """Evaluates using Claude."""
        model = "claude-sonnet-4-20250514"

        response = self.client.messages.create(
            model=model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}]
        )

        content = response.content[0].text

        # Parse the JSON
        try:
            # Strip possible code-fence wrappers
            content = content.strip()
            if content.startswith("```"):
                content = content.split("```")[1]
                if content.startswith("json"):
                    content = content[4:]
content.startswith("json"): content = content[4:] result = json.loads(content) except json.JSONDecodeError: logger.error(f"Failed to parse evaluation response: {content}") result = { "converged": False, "confidence": 0.5, "strengths": [], "weaknesses": ["Error parsing evaluation"], "feedback_for_executor": "Please try again with clearer output" } # Calcular costo input_tokens = response.usage.input_tokens output_tokens = response.usage.output_tokens cost_usd = (input_tokens / 1_000_000) * 3.0 + (output_tokens / 1_000_000) * 15.0 return result, cost_usd, model async def _evaluate_openai(self, prompt: str): """EvalΓΊa usando GPT-4.""" model = "gpt-4o" response = self.client.chat.completions.create( model=model, max_tokens=1024, response_format={"type": "json_object"}, messages=[{"role": "user", "content": prompt}] ) content = response.choices[0].message.content result = json.loads(content) # Calcular costo input_tokens = response.usage.prompt_tokens output_tokens = response.usage.completion_tokens cost_usd = (input_tokens / 1_000_000) * 2.5 + (output_tokens / 1_000_000) * 10.0 return result, cost_usd, model ``` --- ## 6. Dockerfile ### 6.1 Dockerfile Principal ```dockerfile # ═══════════════════════════════════════════════════════════════════════════════ # THE FACTORY β€” Dockerfile para RunPod Serverless # ═══════════════════════════════════════════════════════════════════════════════ # Base: PyTorch con CUDA para GPU FROM runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04 # Metadata LABEL maintainer="your-email@example.com" LABEL version="1.0.0" LABEL description="The Factory - Incubador Iterativo para RunPod Serverless" # Variables de entorno ENV PYTHONUNBUFFERED=1 ENV PYTHONDONTWRITEBYTECODE=1 ENV PIP_NO_CACHE_DIR=1 # Directorio de trabajo WORKDIR /app # Copiar requirements primero (para cache de Docker) COPY requirements.txt . # Instalar dependencias Python RUN pip install --upgrade pip && \ pip install -r requirements.txt # Copiar cΓ³digo fuente COPY src/ ./src/ COPY handler.py . # Crear directorio para artefactos temporales RUN mkdir -p /tmp/artifacts # Puerto (no necesario para serverless, pero ΓΊtil para debug) EXPOSE 8000 # Healthcheck HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \ CMD python -c "import runpod; print('OK')" || exit 1 # Entry point para RunPod Serverless CMD ["python", "-u", "handler.py"] ``` ### 6.2 requirements.txt ```txt # RunPod SDK runpod>=1.7.0 # LLM APIs anthropic>=0.40.0 openai>=1.50.0 # HTTP httpx>=0.27.0 aiohttp>=3.9.0 # Data validation pydantic>=2.5.0 # Storage boto3>=1.34.0 # Para S3 aiofiles>=23.2.0 # Utilities python-dotenv>=1.0.0 tenacity>=8.2.0 # Retries # Image processing (opcional, para image worker local) # pillow>=10.0.0 # Audio processing (opcional, para audio worker local) # pydub>=0.25.0 # TTS>=0.22.0 ``` ### 6.3 Dockerfile Ligero (Solo APIs) ```dockerfile # ═══════════════════════════════════════════════════════════════════════════════ # THE FACTORY β€” Dockerfile Ligero (Solo APIs, sin GPU local) # ═══════════════════════════════════════════════════════════════════════════════ FROM python:3.11-slim ENV PYTHONUNBUFFERED=1 WORKDIR /app COPY requirements-light.txt requirements.txt RUN pip install --no-cache-dir -r requirements.txt COPY src/ ./src/ COPY handler.py . CMD ["python", "-u", "handler.py"] ``` --- ## 7. 
---

## 7. Model Configuration

### 7.1 Environment Variables

```bash
# ═══════════════════════════════════════════════════════════════════════════════
# THE FACTORY — Environment Variables
# ═══════════════════════════════════════════════════════════════════════════════

# --- API Keys ---
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
FAL_API_KEY=...
REPLICATE_API_KEY=r8_...

# --- Default models ---
DEFAULT_TEXT_MODEL=claude-sonnet-4-20250514
DEFAULT_IMAGE_MODEL=flux-1.1-pro
DEFAULT_AUDIO_MODEL=elevenlabs-multilingual-v2
DEFAULT_EVALUATOR_MODEL=claude-sonnet-4-20250514

# --- Storage ---
STORAGE_TYPE=s3              # s3, local, hostinger
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=eu-west-1
S3_BUCKET=the-factory-artifacts

# --- Configuration ---
LOG_LEVEL=INFO
MAX_ARTIFACT_SIZE_MB=100
```

### 7.2 Configuration in RunPod

When creating the endpoint in RunPod, set these variables:

| Variable | Value | Description |
|----------|-------|-------------|
| `ANTHROPIC_API_KEY` | `sk-ant-...` | Anthropic API key |
| `OPENAI_API_KEY` | `sk-...` | OpenAI API key |
| `S3_BUCKET` | `your-bucket` | Bucket for artifacts |
| `AWS_ACCESS_KEY_ID` | `...` | S3 credentials |
| `AWS_SECRET_ACCESS_KEY` | `...` | S3 credentials |

---

## 8. Deployment

### 8.1 Build and Push the Docker Image

```bash
# Log in to Docker Hub
docker login

# Build
docker build -t yourusername/the-factory:latest .

# Local test
docker run --rm \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  yourusername/the-factory:latest \
  python handler.py --test_input '{"input": {"seed": "test", "objective": "test", "output_ref": "/tmp/out", "log_ref": "/tmp/log"}}'

# Push
docker push yourusername/the-factory:latest
```

### 8.2 Create the Endpoint in RunPod

1. **Go to the RunPod Console** → Serverless → Endpoints
2. **New Endpoint**
3. **Configure:**
   - **Name**: `the-factory`
   - **Docker Image**: `yourusername/the-factory:latest`
   - **GPU**: RTX 4000 (for text) or A40 (for images)
   - **Min Workers**: 0 (scale to zero)
   - **Max Workers**: 5
   - **Idle Timeout**: 5 seconds
   - **Environment Variables**: (add all the API keys)
4. **Deploy**

### 8.3 Get the Endpoint URL

Once deployed you get:
- **Endpoint ID**: `abc123xyz`
- **URL**: `https://api.runpod.ai/v2/abc123xyz/run`

---

## 9. Local Testing

### 9.1 test_input.json

```json
{
  "input": {
    "seed": "Write a haiku about programming in Python",
    "objective": "A valid haiku with 5-7-5 syllable structure about programming in Python",
    "function": "TEXT_GENERATION",
    "output_ref": "/tmp/artifacts/haiku.md",
    "log_ref": "/tmp/logs/job.json",
    "limits": {
      "max_cycles": 3,
      "budget_usd": 0.5,
      "min_confidence": 0.85
    }
  }
}
```

### 9.2 Run a Local Test

```bash
# With the test_input.json file
python handler.py

# With inline input
python handler.py --test_input '{
  "input": {
    "seed": "Write a haiku about Python",
    "objective": "Valid 5-7-5 haiku",
    "function": "TEXT_GENERATION",
    "output_ref": "/tmp/out.md",
    "log_ref": "/tmp/log.json"
  }
}'
```

### 9.3 Test with Docker

```bash
docker run --rm \
  -v $(pwd)/test_input.json:/app/test_input.json \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  yourusername/the-factory:latest
```
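### 9.4 Remote Endpoint Test

Once the endpoint from section 8 is live, the same `test_input.json` can be fired at it remotely. A minimal sketch using the `/run` and `/status` routes from sections 8.3 and 10.1 (the endpoint ID is a placeholder and `RUNPOD_API_KEY` must be set):

```python
"""Queue a Job on the deployed endpoint and poll until it finishes (sketch)."""
import json
import os
import time

import httpx

ENDPOINT_ID = "abc123xyz"  # placeholder: your endpoint ID (section 8.3)
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

with open("test_input.json") as f:
    payload = json.load(f)

# Queue the job asynchronously via /run
job = httpx.post(f"{BASE_URL}/run", headers=HEADERS, json=payload, timeout=30.0).json()
job_id = job["id"]

# Poll /status until RunPod reports a terminal state
while True:
    status = httpx.get(f"{BASE_URL}/status/{job_id}", headers=HEADERS, timeout=30.0).json()
    if status["status"] in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
        break
    time.sleep(5)

print(status["status"])
print(status.get("output"))  # the handler's {"status": ..., "result": ...} dict
```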
---

## 10. Monitoring and Logs

### 10.1 Logs in RunPod

Logs are available at:
- RunPod Console → Endpoints → your endpoint → Logs
- Or via the API: `GET /v2/{endpoint_id}/status/{job_id}`

### 10.2 Log Structure

```json
{
  "job_id": "uuid",
  "events": [
    {
      "timestamp": "2025-12-16T10:30:00Z",
      "event": "JOB_STARTED",
      "data": {"seed_length": 150}
    },
    {
      "timestamp": "2025-12-16T10:30:01Z",
      "event": "ITERATION_START",
      "data": {"iteration": 1}
    },
    {
      "timestamp": "2025-12-16T10:30:05Z",
      "event": "EXECUTOR_RESPONSE",
      "data": {"model": "claude-sonnet-4", "cost_usd": 0.015}
    }
  ]
}
```

---

## 11. Estimated Costs

### 11.1 RunPod (GPU) Cost

| GPU | $/hour | Typical use |
|-----|--------|-------------|
| RTX 4000 Ada | $0.20 | Text/code |
| RTX 4090 | $0.69 | Images |
| A40 | $0.79 | Images/video |
| A100 80GB | $1.99 | Video/local LLM |

### 11.2 Cost per Job (example)

**Text job (3 iterations):**
- RTX 4000 GPU: ~30 s = $0.002
- Claude API: ~$0.05
- **Total: ~$0.05**

**Image job (2 iterations):**
- A40 GPU: ~60 s = $0.013
- Flux API: 2 × $0.04 = $0.08
- Claude evaluation: ~$0.03
- **Total: ~$0.12**

### 11.3 Monthly Estimate

| Volume | Text jobs | Image jobs | Estimated cost |
|--------|-----------|------------|----------------|
| Low | 100 | 20 | ~$10 |
| Medium | 500 | 100 | ~$50 |
| High | 2000 | 500 | ~$200 |

---

## Next Steps

### Done ✅
- [x] Serverless handler
- [x] Job Manager with iterative cycle
- [x] Workers: Text, Image
- [x] Evaluator
- [x] Dockerfile
- [x] Model configuration
- [x] Local testing

### Pending for Phase 3 (Alfred Integration)
- [ ] n8n workflow to invoke the Factory
- [ ] Callback webhooks
- [ ] Adapter for Contract v1.2
- [ ] Logging to SYS_LOG
- [ ] Persistence on Hostinger

---

**SFE/HST Enterprise System v5.0**
*Document generated: December 2025*