Files
system-docs/05_OPERACIONES/backup-recovery.md

360 lines
7.7 KiB
Markdown
Raw Normal View History

# Backup y Recovery TZZR
**Versión:** 5.0
**Fecha:** 2024-12-24
---
## Estado Actual
### Backups Existentes
| Sistema | Backup | Destino | Frecuencia | Estado |
|---------|--------|---------|------------|--------|
| Gitea | Sí | R2 | Manual | Operativo |
| PostgreSQL ARCHITECT | No | - | - | **CRÍTICO** |
| PostgreSQL DECK | No | - | - | **CRÍTICO** |
| PostgreSQL CORP | No | - | - | **CRÍTICO** |
| PostgreSQL HST | No | - | - | **CRÍTICO** |
| R2 buckets | Built-in | R2 | Automático | Operativo |
---
## Plan de Backup Propuesto
### PostgreSQL - Backup Diario
```bash
#!/bin/bash
# /opt/scripts/backup_postgres.sh
set -e
DATE=$(date +%F)
BACKUP_DIR="/tmp/pg_backup"
# Cargar credenciales R2
source /home/orchestrator/orchestrator/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
R2_ENDPOINT="https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com"
mkdir -p $BACKUP_DIR
# Backup ARCHITECT
echo "Backing up ARCHITECT..."
sudo -u postgres pg_dump architect | gzip > $BACKUP_DIR/architect_$DATE.sql.gz
aws s3 cp $BACKUP_DIR/architect_$DATE.sql.gz s3://architect/backups/postgres/ \
--endpoint-url $R2_ENDPOINT
# Cleanup local
rm -rf $BACKUP_DIR
echo "Backup completado: $DATE"
```
### Cron Configuration
```bash
# /etc/cron.d/tzzr-backup
# Backup diario a las 3:00 AM
0 3 * * * orchestrator /opt/scripts/backup_postgres.sh >> /var/log/tzzr-backup.log 2>&1
```
---
## Backup por Servidor
### ARCHITECT (69.62.126.110)
```bash
# Base de datos: architect
sudo -u postgres pg_dump architect | gzip > architect_$(date +%F).sql.gz
# Subir a R2
aws s3 cp architect_$(date +%F).sql.gz s3://architect/backups/postgres/ \
--endpoint-url $R2_ENDPOINT
```
### DECK (72.62.1.113)
```bash
# Base de datos: tzzr
ssh deck 'sudo -u postgres pg_dump tzzr | gzip' > deck_tzzr_$(date +%F).sql.gz
# Subir a R2
aws s3 cp deck_tzzr_$(date +%F).sql.gz s3://architect/backups/deck/ \
--endpoint-url $R2_ENDPOINT
```
### CORP (92.112.181.188)
```bash
# Base de datos: corp
ssh corp 'sudo -u postgres pg_dump corp | gzip' > corp_$(date +%F).sql.gz
# Subir a R2
aws s3 cp corp_$(date +%F).sql.gz s3://architect/backups/corp/ \
--endpoint-url $R2_ENDPOINT
```
### HST (72.62.2.84)
```bash
# Base de datos: hst_images
ssh hst 'sudo -u postgres pg_dump hst_images | gzip' > hst_$(date +%F).sql.gz
# Subir a R2
aws s3 cp hst_$(date +%F).sql.gz s3://architect/backups/hst/ \
--endpoint-url $R2_ENDPOINT
```
---
## Gitea Backup
### Backup Manual
```bash
# En ARCHITECT
docker exec -t gitea bash -c 'gitea dump -c /data/gitea/conf/app.ini'
docker cp gitea:/app/gitea/gitea-dump-*.zip ./
# Subir a R2
aws s3 cp gitea-dump-*.zip s3://architect/backups/gitea/ \
--endpoint-url $R2_ENDPOINT
```
### Backup Automático
```bash
#!/bin/bash
# /opt/scripts/backup_gitea.sh
DATE=$(date +%F_%H%M)
# Crear dump
docker exec -t gitea bash -c "gitea dump -c /data/gitea/conf/app.ini -f /tmp/gitea-dump-$DATE.zip"
# Copiar fuera del container
docker cp gitea:/tmp/gitea-dump-$DATE.zip /tmp/
# Subir a R2
source /home/orchestrator/orchestrator/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
aws s3 cp /tmp/gitea-dump-$DATE.zip s3://architect/backups/gitea/ \
--endpoint-url https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com
# Cleanup
rm /tmp/gitea-dump-$DATE.zip
docker exec gitea rm /tmp/gitea-dump-$DATE.zip
```
---
## Estructura de Backups en R2
```
s3://architect/backups/
├── postgres/
│ ├── architect_2024-12-24.sql.gz
│ ├── architect_2024-12-23.sql.gz
│ └── ...
├── deck/
│ ├── deck_tzzr_2024-12-24.sql.gz
│ └── ...
├── corp/
│ ├── corp_2024-12-24.sql.gz
│ └── ...
├── hst/
│ ├── hst_2024-12-24.sql.gz
│ └── ...
└── gitea/
├── gitea-dump-2024-12-24_0300.zip
└── ...
```
---
## Retención de Backups
### Política Propuesta
| Tipo | Retención | Notas |
|------|-----------|-------|
| Diario | 7 días | Últimos 7 backups |
| Semanal | 4 semanas | Domingos |
| Mensual | 12 meses | Primer día del mes |
### Script de Limpieza
```bash
#!/bin/bash
# /opt/scripts/cleanup_backups.sh
source /home/orchestrator/orchestrator/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
R2_ENDPOINT="https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com"
# Eliminar backups más antiguos de 30 días
# (Implementar con lifecycle rules de R2 preferentemente)
aws s3 ls s3://architect/backups/postgres/ --endpoint-url $R2_ENDPOINT | \
while read -r line; do
createDate=$(echo $line | awk '{print $1}')
fileName=$(echo $line | awk '{print $4}')
# Comparar fechas y eliminar si > 30 días
# ...
done
```
---
## Recovery
### PostgreSQL - Restaurar Base de Datos
```bash
# Descargar backup
aws s3 cp s3://architect/backups/postgres/architect_2024-12-24.sql.gz . \
--endpoint-url $R2_ENDPOINT
# Descomprimir
gunzip architect_2024-12-24.sql.gz
# Restaurar (¡CUIDADO! Sobrescribe datos existentes)
sudo -u postgres psql -d architect < architect_2024-12-24.sql
```
### PostgreSQL - Restaurar a Nueva Base
```bash
# Crear nueva base de datos
sudo -u postgres createdb architect_restored
# Restaurar
gunzip -c architect_2024-12-24.sql.gz | sudo -u postgres psql -d architect_restored
```
### Gitea - Restaurar
```bash
# Descargar backup
aws s3 cp s3://architect/backups/gitea/gitea-dump-2024-12-24_0300.zip . \
--endpoint-url $R2_ENDPOINT
# Detener Gitea
docker stop gitea
# Copiar al container
docker cp gitea-dump-2024-12-24_0300.zip gitea:/tmp/
# Restaurar
docker exec gitea bash -c "cd /tmp && unzip gitea-dump-2024-12-24_0300.zip"
# Seguir instrucciones de Gitea para restore
# Iniciar Gitea
docker start gitea
```
---
## Disaster Recovery Plan
### Escenario 1: Pérdida de ARCHITECT
1. Provisionar nuevo VPS con misma IP (si posible)
2. Instalar Ubuntu 22.04
3. Configurar usuario orchestrator
4. Restaurar PostgreSQL desde R2
5. Restaurar Gitea desde R2
6. Reinstalar Docker y servicios
7. Verificar conectividad con DECK/CORP/HST
### Escenario 2: Pérdida de DECK
1. Provisionar nuevo VPS
2. Restaurar PostgreSQL (tzzr) desde backup
3. Reinstalar CLARA, ALFRED
4. Reinstalar Mailcow (requiere backup separado)
5. Actualizar DNS si IP cambió
### Escenario 3: Pérdida de CORP
1. Provisionar nuevo VPS
2. Restaurar PostgreSQL (corp) desde backup
3. Reinstalar MARGARET, JARED, MASON, FELDMAN
4. Reinstalar Odoo, Nextcloud
5. Activar UFW (nuevo servidor)
### Escenario 4: Pérdida de R2
**IMPROBABLE** - Cloudflare tiene redundancia multi-región.
Mitigación: Backup mensual a segundo proveedor (AWS S3 Glacier).
---
## Verificación de Backups
### Test Mensual
```bash
#!/bin/bash
# /opt/scripts/verify_backup.sh
# Descargar último backup
LATEST=$(aws s3 ls s3://architect/backups/postgres/ --endpoint-url $R2_ENDPOINT | \
sort | tail -1 | awk '{print $4}')
aws s3 cp s3://architect/backups/postgres/$LATEST /tmp/ \
--endpoint-url $R2_ENDPOINT
# Verificar integridad
gunzip -t /tmp/$LATEST
if [ $? -eq 0 ]; then
echo "Backup válido: $LATEST"
else
echo "ERROR: Backup corrupto: $LATEST"
# Enviar alerta
fi
rm /tmp/$LATEST
```
### Checklist de Verificación
- [ ] Backup PostgreSQL ARCHITECT existe (< 24h)
- [ ] Backup PostgreSQL DECK existe (< 24h)
- [ ] Backup PostgreSQL CORP existe (< 24h)
- [ ] Backup Gitea existe (< 7d)
- [ ] Integridad verificada (gunzip -t)
- [ ] Restore test exitoso (mensual)
---
## Alertas
### Configuración ntfy
```bash
# Notificar si backup falla
if [ $? -ne 0 ]; then
curl -d "Backup FALLIDO: $DATE" ntfy.sh/tzzr-alerts
fi
```
### Monitoreo
```bash
# Verificar último backup
aws s3 ls s3://architect/backups/postgres/ --endpoint-url $R2_ENDPOINT | \
sort | tail -5
```