# Network-Restricted Setup Guide (China Firewall)

## The Problem
Servers in mainland China (or behind similarly restrictive firewalls) often cannot reliably reach:
- `huggingface.co` — DNS blocked
- `ollama.com` — Install script 504s
- `github.com` — Releases may 504; API usually works via token
- `pypi.org` — Usually fine, but some CDN assets are slow

## Solutions

### 1. HuggingFace Model Downloads

**Before** (will fail or time out):
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')  # hangs trying to reach huggingface.co
```

**After** (works):
```bash
export HF_ENDPOINT=https://hf-mirror.com
# Then run your script
```

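Or set it in Python, before importing `sentence_transformers` or `huggingface_hub` (the endpoint is read when the library is first imported):
```python
import os

os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
```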

**Model**: `all-MiniLM-L6-v2` (384-dim, ~80MB) is the recommended default. It's small enough to download over slow links (~5-10 min) and works well for general-purpose text retrieval.
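
Because the endpoint is read at import time, ordering matters when it is set from Python. The following is a minimal sketch of one way to guard against that; `use_hf_mirror` and `load_default_model` are hypothetical helper names, not part of any library.

```python
import os
import sys


def use_hf_mirror(endpoint: str = "https://hf-mirror.com") -> bool:
    """Point Hugging Face downloads at a mirror.

    Returns False when huggingface_hub is already imported, because the
    library reads HF_ENDPOINT at import time and the change arrives too late.
    """
    too_late = "huggingface_hub" in sys.modules
    os.environ["HF_ENDPOINT"] = endpoint
    return not too_late


def load_default_model():
    """Configure the mirror first, then import and load the model."""
    if not use_hf_mirror():
        print("warning: huggingface_hub was imported before HF_ENDPOINT was set")
    # Imported lazily so the endpoint is already configured at import time.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer("all-MiniLM-L6-v2")
```

Calling `load_default_model()` at the top of a script, before any other Hugging Face import, keeps the mirror setting and the model load in one place.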

### 2. Ollama → Sentence-Transformers Fallback

When the Ollama install script (`curl -fsSL https://ollama.com/install.sh | sh`) returns a 504:

**Don't fight it** — use the pure Python stack instead:

| Component | Ollama (blocked) | Pure Python (works) |
|-----------|-----------------|-------------------|
| Embedding model | nomic-embed-text | all-MiniLM-L6-v2 |
| Vector DB | ChromaDB | ChromaDB (same) |
| Installation | `curl \| sh` → 504 | `pip install chromadb sentence-transformers` ✅ |
| Download | automatic | via hf-mirror ✅ |

Install commands:
```bash
pip install chromadb sentence-transformers huggingface_hub
export HF_ENDPOINT=https://hf-mirror.com
python3 -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
```
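
Once the packages and the model are in place, the two components can be wired together. The sketch below is one possible arrangement, assuming `HF_ENDPOINT` is already exported in the shell; the helper names (`make_minilm_embedder`, `index_documents`, `demo`), the collection name `notes`, and the sample texts are illustrative assumptions.

```python
from typing import Callable, List, Sequence

Embedder = Callable[[Sequence[str]], List[List[float]]]


def make_minilm_embedder() -> Embedder:
    """Wrap all-MiniLM-L6-v2 as a plain function (imported lazily)."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return lambda texts: model.encode(list(texts)).tolist()


def index_documents(collection, docs: Sequence[str], embed: Embedder) -> List[str]:
    """Embed docs and store them; `collection` is a ChromaDB collection."""
    ids = [f"doc-{i}" for i in range(len(docs))]
    collection.add(ids=ids, documents=list(docs), embeddings=embed(docs))
    return ids


def demo() -> None:
    """End-to-end run; needs chromadb, sentence-transformers, and a reachable mirror."""
    import chromadb

    embed = make_minilm_embedder()
    collection = chromadb.Client().get_or_create_collection("notes")
    index_documents(
        collection,
        ["Use hf-mirror for model downloads.", "Switch the git remote to SSH."],
        embed,
    )
    hits = collection.query(
        query_embeddings=embed(["How do I download models?"]), n_results=1
    )
    print(hits["documents"])
```

Passing the embedder in as a function keeps the indexing step independent of where the vectors come from, so the same code would work if Ollama becomes reachable later.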

### 3. GitHub Operations

- **`gh repo create`** — Fine-grained PATs (`github_pat_...`) need the "Repository creation: write" permission; use a classic PAT with the `repo` scope instead
- **Cloning** — `https://github.com/...` works fine with a token
- **Releases** — Large binaries may return a 504; download them from a mirror or compile from source
- **Git operations** — `git clone` and `git push` over HTTPS work with a PAT stored in `~/.git-credentials`
- **When HTTPS is blocked but SSH works** — On some networks in China, `github.com:443` is filtered while `api.github.com` and SSH (port 22) remain reachable. Switch the git remote to SSH:
  ```bash
  # Generate a key on the server
  ssh-keygen -t ed25519 -f ~/.ssh/github_hermes -N '' -C 'server-name'
  # Add the public key to GitHub manually (web UI), since fine-grained PATs can't manage SSH keys
  cat ~/.ssh/github_hermes.pub
  # The key has a non-default name, so tell SSH to use it for github.com
  printf 'Host github.com\n  IdentityFile ~/.ssh/github_hermes\n  IdentitiesOnly yes\n' >> ~/.ssh/config
  # Update the remote
  git remote set-url origin git@github.com:user/repo.git
  # Push should now work
  git push -u origin main
  ```
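
To decide between HTTPS and SSH without guessing, the two ports can be probed first. This is a rough sketch using only the standard library; the function names are hypothetical, and a successful TCP connection is only a hint, since a firewall may still interfere after the connection opens.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_remote(repo: str, https_ok: bool, ssh_ok: bool) -> str:
    """Pick a remote URL for `repo` ("user/repo"), preferring HTTPS."""
    if https_ok:
        return f"https://github.com/{repo}.git"
    if ssh_ok:
        return f"git@github.com:{repo}.git"
    raise RuntimeError("neither HTTPS (443) nor SSH (22) is reachable")


def probe_and_choose(repo: str) -> str:
    """Probe github.com on ports 443 and 22, then choose a remote URL."""
    return choose_remote(
        repo,
        https_ok=can_connect("github.com", 443),
        ssh_ok=can_connect("github.com", 22),
    )
```

The result of `probe_and_choose("user/repo")` can be passed straight to `git remote set-url origin`.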

### 4. pip / PyPI

PyPI generally works. If downloads are slow, point pip at a mirror such as the Tsinghua (TUNA) index:
```bash
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple <package>
```
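
When installs are driven from a script, the same mirror can be passed programmatically. A small sketch, assuming a hypothetical helper `pip_install_command`; the timeout and retry values are illustrative, not recommendations from this guide.

```python
import sys
from typing import List

TUNA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def pip_install_command(
    packages: List[str],
    index_url: str = TUNA_INDEX,
    timeout_s: int = 60,
    retries: int = 10,
) -> List[str]:
    """Build a `pip install` command that targets a mirror index."""
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", index_url,     # same as -i
        "--timeout", str(timeout_s),  # socket timeout in seconds
        "--retries", str(retries),    # retries per connection
        *packages,
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; using `sys.executable -m pip` ensures the packages land in the interpreter that runs the script.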

### 5. General Timeout Tuning

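```python
import os
import socket

# Set these before importing sentence_transformers or huggingface_hub
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
os.environ['SENTENCE_TRANSFORMERS_HOME'] = '/home/ubuntu/.cache/torch/sentence-transformers'  # adjust to your user

# Default for sockets created without an explicit timeout (e.g. urllib);
# libraries that set their own timeout, such as requests, may override it
socket.setdefaulttimeout(300)  # 5 minutes
```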
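
Longer timeouts help, but slow links also drop connections, so retrying is often worth adding. The sketch below sets the `huggingface_hub` timeout variables (`HF_HUB_DOWNLOAD_TIMEOUT` and `HF_HUB_ETAG_TIMEOUT`, in seconds) and defines a hypothetical `retry` helper with exponential backoff; the specific values are assumptions to adjust for your link.

```python
import os
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Read by huggingface_hub at import time, so set them before importing it.
os.environ.setdefault("HF_HUB_DOWNLOAD_TIMEOUT", "300")
os.environ.setdefault("HF_HUB_ETAG_TIMEOUT", "60")


def retry(
    fn: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except OSError:
            # OSError covers socket timeouts and connection errors;
            # widen the except clause if your library raises something else.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A model load could then be wrapped as `retry(lambda: SentenceTransformer('all-MiniLM-L6-v2'))`, which waits 2, 4, and then 8 seconds between the four attempts.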
