Complete audit and implementation plan for TZZR

- ARCHITECTURE.md: Actual state of 23 repos
- IMPLEMENTATION_PLAN.md: 7 implementation phases
- PHASES/: Detailed scripts for each phase

Audit result:
- 5 repos implemented
- 4 repos partial
- 14 repos documentation only
2025-12-24 08:59:14 +00:00
parent 1638a8cf85
commit 73ae91d337
7 changed files with 2089 additions and 2 deletions
--- a/PHASES/FASE_2_PROCESAMIENTO_IA.md
+++ b/PHASES/FASE_2_PROCESAMIENTO_IA.md
@@ -0,0 +1,430 @@
+# PHASE 2: AI PROCESSING
+
+**Complexity:** Complex
+**Estimated duration:** 3-5 days
+**Priority:** HIGH
+
+---
+
+## OBJECTIVE
+
+Deploy GRACE on RunPod to process:
+- ASR (Speech-to-Text)
+- OCR (Image-to-Text)
+- TTS (Text-to-Speech)
+- Embeddings (semantic vectorization)
+- Face Detection
+- Avatar Generation
+
+---
+
+## PREREQUISITES
+
+- [x] PHASE 1 completed
+- [ ] RunPod account with credits
+- [ ] RunPod API key
+- [ ] Docker Hub or a private registry for images
+
+---
+
+## STEP 2.1: Prepare the Docker image for RunPod
+
+### Handler structure
+
+The file `grace/runpod/handler.py` is already implemented and supports:
+
+| Module | Model | VRAM |
+|--------|--------|------|
+| ASR_ENGINE | Faster Whisper Large V3 | ~4GB |
+| OCR_CORE | GOT-OCR 2.0 | ~8GB |
+| TTS | XTTS-v2 | ~4GB |
+| FACE_VECTOR | InsightFace Buffalo L | ~2GB |
+| EMBEDDINGS | BGE-Large | ~2GB |
+| AVATAR_GEN | SDXL Base 1.0 | ~8GB |
+
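The VRAM figures in the table above can be summed to check whether every module fits on one GPU at the same time. The sketch below is illustrative only: it assumes all selected models stay resident simultaneously and that the approximate figures from the table hold. The 24 GB default matches the RTX 4090 this plan targets, and `total_vram` and `fits_on_gpu` are hypothetical helpers, not part of `handler.py`.

```python
# Approximate VRAM per module in GB, copied from the table above.
VRAM_GB = {
    "ASR_ENGINE": 4,
    "OCR_CORE": 8,
    "TTS": 4,
    "FACE_VECTOR": 2,
    "EMBEDDINGS": 2,
    "AVATAR_GEN": 8,
}


def total_vram(modules):
    """Sum the approximate VRAM of the given modules, in GB."""
    return sum(VRAM_GB[m] for m in modules)


def fits_on_gpu(modules, gpu_vram_gb=24):
    """True if the modules fit together within the GPU's VRAM budget."""
    return total_vram(modules) <= gpu_vram_gb


all_modules = list(VRAM_GB)
print(total_vram(all_modules))   # 28
print(fits_on_gpu(all_modules))  # False
```

Under these assumptions the six models total about 28 GB, which is more than a 24 GB card holds, so a handler would probably need to load models on demand or split modules across endpoints.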
+### Optimized Dockerfile
+
+```dockerfile
+# grace/runpod/Dockerfile
+FROM runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04
+
+WORKDIR /app
+
+# Environment variables
+ENV PYTHONUNBUFFERED=1
+ENV TRANSFORMERS_CACHE=/app/models
+ENV HF_HOME=/app/models
+
+# System dependencies
+RUN apt-get update && apt-get install -y \
+    ffmpeg \
+    libsm6 \
+    libxext6 \
+    libgl1-mesa-glx \
+    && rm -rf /var/lib/apt/lists/*
+
+# Base Python dependencies
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Preload the ASR and embedding models to reduce cold start.
+# Note: a volume mounted over /app/models at runtime (as in the step 2.3 template) may hide these preloaded files.
+RUN python -c "from faster_whisper import WhisperModel; WhisperModel('large-v3', device='cpu', compute_type='int8')"
+RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('BAAI/bge-large-en-v1.5')"
+
+# Handler code
+COPY handler.py .
+
+# Start the RunPod handler
+CMD ["python", "-u", "handler.py"]
+```
+
+### requirements.txt
+
+```
+runpod>=1.3.0
+torch>=2.1.0
+transformers>=4.36.0
+faster-whisper>=0.10.0
+TTS>=0.22.0
+sentence-transformers>=2.2.2
+insightface>=0.7.3
+onnxruntime-gpu>=1.16.0
+diffusers>=0.24.0
+accelerate>=0.25.0
+safetensors>=0.4.0
+Pillow>=10.0.0
+opencv-python-headless>=4.8.0
+numpy>=1.24.0
+boto3>=1.34.0
+```
+
+---
+
+## STEP 2.2: Build and push the image
+
+### Option A: Docker Hub
+
+```bash
+# On a machine with Docker
+cd grace/runpod
+
+# Build
+docker build -t tzzr/grace-gpu:v1.0 .
+
+# Log in to Docker Hub
+docker login
+
+# Push
+docker push tzzr/grace-gpu:v1.0
+```
+
+### Option B: RunPod Registry
+
+```bash
+# Use the RunPod CLI (confirm this subcommand with `runpodctl --help` for your installed version)
+runpodctl build --name grace-gpu --tag v1.0 .
+```
+
+---
+
+## STEP 2.3: Create a template in RunPod
+
+### Via Dashboard
+
+1. Go to RunPod → Templates → Create Template
+2. Configure:
+   - **Name**: GRACE-GPU
+   - **Container Image**: tzzr/grace-gpu:v1.0
+   - **Container Disk**: 50GB
+   - **Volume Disk**: 100GB (for models)
+   - **Volume Mount Path**: /app/models
+   - **Expose HTTP Ports**: 8000
+   - **Expose TCP Ports**: (empty)
+
+### Via API
+
+```bash
+RUNPOD_API_KEY="<tu_api_key>"
+
+curl -X POST "https://api.runpod.io/graphql" \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $RUNPOD_API_KEY" \
+  -d '{
+    "query": "mutation { saveTemplate(input: { name: \"GRACE-GPU\", imageName: \"tzzr/grace-gpu:v1.0\", dockerArgs: \"\", containerDiskInGb: 50, volumeInGb: 100, volumeMountPath: \"/app/models\", ports: \"8000/http\", isServerless: true }) { id name } }"
+  }'
+```
+
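Escaping the quotes inside the GraphQL mutation by hand is error-prone. A minimal sketch of building the same request body in Python follows. The field names and values are copied from the curl example above; `build_template_mutation` is a hypothetical helper that only constructs the JSON body and does not call the API.

```python
import json


def build_template_mutation(name, image, container_disk_gb=50,
                            volume_gb=100, mount_path="/app/models",
                            ports="8000/http", serverless=True):
    """Return the JSON request body for the saveTemplate mutation shown above."""
    # json.dumps produces a double-quoted, escaped string literal, which is
    # valid GraphQL string syntax for plain ASCII values like these.
    fields = [
        f"name: {json.dumps(name)}",
        f"imageName: {json.dumps(image)}",
        'dockerArgs: ""',
        f"containerDiskInGb: {container_disk_gb}",
        f"volumeInGb: {volume_gb}",
        f"volumeMountPath: {json.dumps(mount_path)}",
        f"ports: {json.dumps(ports)}",
        f"isServerless: {'true' if serverless else 'false'}",
    ]
    query = ("mutation { saveTemplate(input: { "
             + ", ".join(fields)
             + " }) { id name } }")
    # The outer json.dumps handles the second level of escaping for the HTTP body.
    return json.dumps({"query": query})


body = build_template_mutation("GRACE-GPU", "tzzr/grace-gpu:v1.0")
print(body)
```

The returned string can be passed as the `-d` argument to curl or as the `data=` argument of an HTTP client.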
+---
+
+## STEP 2.4: Create a serverless endpoint
+
+### Via Dashboard
+
+1. Go to Serverless → Create Endpoint
+2. Configure:
+   - **Name**: GRACE-Endpoint
+   - **Template**: GRACE-GPU
+   - **GPU Type**: RTX 4090 (24GB VRAM)
+   - **Min Workers**: 0
+   - **Max Workers**: 3
+   - **Idle Timeout**: 5 seconds
+   - **Flash Boot**: Enabled
+
+### Recommended configuration per module
+
+| Module | Minimum GPU | Recommended GPU |
+|--------|------------|-----------------|
+| ASR_ENGINE | RTX 3080 (10GB) | RTX 4090 (24GB) |
+| OCR_CORE | RTX 3090 (24GB) | RTX 4090 (24GB) |
+| TTS | RTX 3080 (10GB) | RTX 4090 (24GB) |
+| FACE_VECTOR | RTX 3060 (8GB) | RTX 4090 (24GB) |
+| EMBEDDINGS | RTX 3060 (8GB) | RTX 4090 (24GB) |
+| AVATAR_GEN | RTX 3090 (24GB) | RTX 4090 (24GB) |
+
+---
+
+## STEP 2.5: Test the endpoint
+
+### Test ASR_ENGINE
+
+```bash
+ENDPOINT_ID="<tu_endpoint_id>"
+RUNPOD_API_KEY="<tu_api_key>"
+
+# Create a test audio file (or use an existing one)
+# ffmpeg -f lavfi -i "sine=frequency=440:duration=3" -ar 16000 test.wav
+
+# Encode as base64
+AUDIO_B64=$(base64 -w0 test.wav)
+
+# Send the request
+curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/runsync" \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $RUNPOD_API_KEY" \
+  -d '{
+    "input": {
+      "contract_version": "2.1",
+      "profile": "LITE",
+      "envelope": {
+        "trace_id": "test-asr-001"
+      },
+      "routing": {
+        "module": "ASR_ENGINE"
+      },
+      "payload": {
+        "type": "audio",
+        "encoding": "base64",
+        "content": "'$AUDIO_B64'"
+      },
+      "context": {
+        "lang": "es"
+      }
+    }
+  }'
+```
+
+### Expected response
+
+```json
+{
+  "id": "...",
+  "status": "COMPLETED",
+  "output": {
+    "contract_version": "2.1",
+    "status": {"code": "SUCCESS"},
+    "result": {
+      "schema": "asr_output_v1",
+      "data": {
+        "text": "...",
+        "language_detected": "es",
+        "duration_seconds": 3.0,
+        "segments": [...]
+      }
+    },
+    "quality": {
+      "confidence": 0.95
+    }
+  }
+}
+```
+
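A caller usually needs to check both status levels in this response before reading the transcript: the RunPod job status at the top level and the GRACE contract status inside `output`. A minimal sketch, assuming the response shape shown above; `extract_asr_text` is a hypothetical helper and the sample dictionary is made-up test data, not real endpoint output.

```python
def extract_asr_text(response: dict) -> str:
    """Return the transcript, or raise if either status level reports a problem."""
    # Level 1: RunPod job status.
    if response.get("status") != "COMPLETED":
        raise RuntimeError(f"RunPod job not completed: {response.get('status')}")
    # Level 2: GRACE contract status inside the job output.
    output = response.get("output") or {}
    code = (output.get("status") or {}).get("code")
    if code != "SUCCESS":
        raise RuntimeError(f"GRACE module failed: {code}")
    return output["result"]["data"]["text"]


# Made-up sample that mirrors the documented response shape.
sample = {
    "id": "job-123",
    "status": "COMPLETED",
    "output": {
        "contract_version": "2.1",
        "status": {"code": "SUCCESS"},
        "result": {"schema": "asr_output_v1", "data": {"text": "hola mundo"}},
    },
}
print(extract_asr_text(sample))  # hola mundo
```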
+### Test OCR_CORE
+
+```bash
+# Test image
+IMAGE_B64=$(base64 -w0 test_image.png)
+
+curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/runsync" \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $RUNPOD_API_KEY" \
+  -d '{
+    "input": {
+      "contract_version": "2.1",
+      "routing": {"module": "OCR_CORE"},
+      "payload": {
+        "type": "image",
+        "encoding": "base64",
+        "content": "'$IMAGE_B64'"
+      }
+    }
+  }'
+```
+
+---
+
+## STEP 2.6: Document the endpoint
+
+### Save in credentials
+
+````bash
+# On ARCHITECT, update the credentials repo
+cd /tmp && rm -rf credentials
+GIT_SSH_COMMAND="ssh -i /home/orchestrator/.ssh/tzzr -p 2222" \
+  git clone ssh://git@localhost:2222/tzzr/credentials.git
+
+cd credentials
+
+cat >> inventario/08-gpu-runpod.md << 'EOF'
+
+## GRACE Endpoint (Updated 2025-12-24)
+
+| Parameter | Value |
+|-----------|-------|
+| Endpoint ID | <endpoint_id> |
+| Template | GRACE-GPU v1.0 |
+| GPU | RTX 4090 |
+| Max Workers | 3 |
+| Idle Timeout | 5s |
+
+### Available modules
+
+- ASR_ENGINE (Whisper Large V3)
+- OCR_CORE (GOT-OCR 2.0)
+- TTS (XTTS-v2)
+- FACE_VECTOR (InsightFace)
+- EMBEDDINGS (BGE-Large)
+- AVATAR_GEN (SDXL)
+
+### Usage example
+
+```bash
+curl -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/runsync" \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer ${RUNPOD_API_KEY}" \
+  -d '{"input": {...}}'
+```
+EOF
+
+git add -A
+git commit -m "Document GRACE RunPod endpoint"
+GIT_SSH_COMMAND="ssh -i /home/orchestrator/.ssh/tzzr -p 2222" git push origin main
+````
+
+---
+
+## STEP 2.7: Integrate with DECK
+
+### Create the GRACE client in DECK
+
+```python
+# /opt/deck/grace_client.py
+
+import base64
+import os
+from typing import Any, Dict, Optional
+
+import requests
+
+
+class GraceClient:
+    """Thin HTTP client for the GRACE serverless endpoint on RunPod."""
+
+    def __init__(self):
+        self.endpoint_id = os.getenv("GRACE_ENDPOINT_ID")
+        self.api_key = os.getenv("RUNPOD_API_KEY")
+        # Fail early instead of building a URL that contains "None"
+        if not self.endpoint_id or not self.api_key:
+            raise RuntimeError("GRACE_ENDPOINT_ID and RUNPOD_API_KEY must be set")
+        self.base_url = f"https://api.runpod.ai/v2/{self.endpoint_id}"
+
+    def call(self, module: str, content: bytes,
+             context: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
+        """Call a GRACE module and return the decoded JSON response."""
+        payload = {
+            "input": {
+                "contract_version": "2.1",
+                "routing": {"module": module},
+                "payload": {
+                    "type": self._get_type(module),
+                    "encoding": "base64",
+                    "content": base64.b64encode(content).decode()
+                },
+                "context": context or {}
+            }
+        }
+
+        response = requests.post(
+            f"{self.base_url}/runsync",
+            headers={"Authorization": f"Bearer {self.api_key}"},
+            json=payload,
+            timeout=120
+        )
+        # Raise on HTTP errors (e.g. 401, 5xx) rather than parsing an error body
+        response.raise_for_status()
+        return response.json()
+
+    def _get_type(self, module: str) -> str:
+        """Map a module name to the payload type it expects."""
+        types = {
+            "ASR_ENGINE": "audio",
+            "OCR_CORE": "image",
+            "TTS": "text",
+            "FACE_VECTOR": "image",
+            "EMBEDDINGS": "text",
+            "AVATAR_GEN": "text"
+        }
+        return types.get(module, "binary")
+
+    @staticmethod
+    def _data(result: Dict[str, Any]) -> Dict[str, Any]:
+        """Return output.result.data, or an empty dict if any level is missing."""
+        return result.get("output", {}).get("result", {}).get("data", {})
+
+    def transcribe(self, audio_bytes: bytes, lang: str = "es") -> str:
+        """Convenience method for ASR."""
+        result = self.call("ASR_ENGINE", audio_bytes, {"lang": lang})
+        return self._data(result).get("text", "")
+
+    def ocr(self, image_bytes: bytes) -> str:
+        """Convenience method for OCR."""
+        result = self.call("OCR_CORE", image_bytes)
+        return self._data(result).get("text", "")
+
+    def embed(self, text: str) -> list:
+        """Convenience method for embeddings."""
+        result = self.call("EMBEDDINGS", text.encode(), {})
+        return self._data(result).get("vector", [])
+```
+
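A `/runsync` call holds the HTTP connection open while the job runs, so a slow cold start can exceed the client timeout. RunPod serverless endpoints also expose `/run` and `/status/{job_id}` for asynchronous use. The sketch below polls a status function until the job reaches a terminal state; `wait_for_job` and `fetch_status` are hypothetical names, and the fake status sequence stands in for real HTTP calls so the example runs offline.

```python
import time
from typing import Any, Callable, Dict

# Job states after which polling can stop.
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def wait_for_job(fetch_status: Callable[[], Dict[str, Any]],
                 interval: float = 1.0, max_polls: int = 60) -> Dict[str, Any]:
    """Poll fetch_status() until the job is terminal or max_polls is used up."""
    for _ in range(max_polls):
        job = fetch_status()
        if job.get("status") in TERMINAL:
            return job
        time.sleep(interval)
    raise TimeoutError("job did not finish within the polling budget")


# Offline stand-in for repeated GET {base_url}/status/{job_id} calls.
_states = iter([
    {"status": "IN_QUEUE"},
    {"status": "IN_PROGRESS"},
    {"status": "COMPLETED", "output": {"status": {"code": "SUCCESS"}}},
])
result = wait_for_job(lambda: next(_states), interval=0.0)
print(result["status"])  # COMPLETED
```

In the client above, `fetch_status` could wrap a `requests.get` against `{base_url}/status/{job_id}` with the same `Authorization` header; that wiring is left out here and would need testing against the real endpoint.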
+---
+
+## FINAL CHECKLIST PHASE 2
+
+- [ ] 2.1 - Dockerfile prepared
+- [ ] 2.2 - Image pushed to registry
+- [ ] 2.3 - Template created in RunPod
+- [ ] 2.4 - Serverless endpoint configured
+- [ ] 2.5 - Tests passing (ASR, OCR, etc.)
+- [ ] 2.6 - Credentials documented
+- [ ] 2.7 - Client integrated in DECK
+
+---
+
+## SUCCESS METRICS
+
+| Metric | Expected value |
+|---------|----------------|
+| Cold start | < 60s |
+| Warm ASR (30s audio) | < 10s |
+| Warm OCR (image) | < 5s |
+| Availability | > 99% |
+
+---
+
+## ESTIMATED COSTS
+
+| Usage | GPU | Cost/hour | Monthly estimate |
+|-----|-----|------------|------------------|
+| Low (10 req/day) | RTX 4090 | $0.69 | ~$5-10 |
+| Medium (100 req/day) | RTX 4090 | $0.69 | ~$30-50 |
+| High (1000 req/day) | RTX 4090 | $0.69 | ~$100-200 |
+
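The monthly figures above depend on how many GPU-seconds each request bills. A rough sketch of that arithmetic follows, assuming the $0.69/hour rate from the table, a 30-day month, and a hypothetical average of 15 billed seconds per request (for example 10 s of execution plus the 5 s idle timeout). Cold-start time is ignored, so the results are closer to a lower bound than a forecast.

```python
def monthly_cost(requests_per_day: int, billed_seconds_per_request: float,
                 rate_per_hour: float = 0.69, days: int = 30) -> float:
    """Estimate monthly cost in USD from request volume and billed time."""
    gpu_hours = requests_per_day * billed_seconds_per_request * days / 3600
    return round(gpu_hours * rate_per_hour, 2)


# The three usage tiers from the table; 15 s per request is an assumption.
for label, per_day in [("low", 10), ("medium", 100), ("high", 1000)]:
    print(label, monthly_cost(per_day, billed_seconds_per_request=15))
```

Changing `billed_seconds_per_request` shows how sensitive the estimate is to job length and idle time.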
+---
+
+## NEXT PHASE
+
+Continue with [FASE_3_FLUJO_EMPRESARIAL.md](FASE_3_FLUJO_EMPRESARIAL.md)