Date: April 28, 2026
Hours Worked: 6
Status: On Track
- Option Analyzer: Rule-based solver with problem-type classification
- Status: Implemented
- IoU: 7.03% (close to the 7.4% baseline)
- Files: track_a_option_analyzer.py, test_track_a_analyzer.py
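The IoU above is set overlap between predicted and ground-truth answer cells. A minimal sketch of that metric (function and cell names are illustrative, not taken from the repo):

```python
def cell_iou(predicted, actual):
    """Intersection-over-union between two sets of answer cells."""
    pred, act = set(predicted), set(actual)
    if not pred and not act:
        return 1.0  # both empty: treat as a perfect match by convention
    return len(pred & act) / len(pred | act)

# Example: one of two predicted cells is correct, three cells in the truth set
score = cell_iou({"A1", "B2"}, {"A1", "C3", "D4"})  # 1 / 4 = 0.25
```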
- Pattern Analysis: Analyzed 100 training samples
- All problems are "coverage" type
- Answers contain 1-5 cells (avg 2.3)
- Action keywords: tilt, azimuth, power for coverage
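The keyword findings above suggest a simple rule-based classifier. A sketch under the assumption that each question mentions its action keywords (the function name and fallback label are illustrative):

```python
# Action keywords observed for "coverage" problems in the training sample
COVERAGE_KEYWORDS = ("tilt", "azimuth", "power")

def classify_problem(question: str) -> str:
    """Tag a question as 'coverage' if it mentions any coverage keyword."""
    text = question.lower()
    if any(kw in text for kw in COVERAGE_KEYWORDS):
        return "coverage"
    return "unknown"

label = classify_problem("Adjust the antenna tilt to improve signal")  # 'coverage'
```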
- Detection Logic: Fixed path query vs. link restore classification
- Excluded "topology" from link restore (Q5 fix)
- Added path phrases: "topology", "plan the", "links for"
- Status: 50/50 questions correctly detected
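The Q5 fix above (excluding "topology" from link restore) amounts to checking path phrases before restore phrases. A sketch; the restore-phrase list is an assumption, only the path phrases are from the log:

```python
# Path phrases from the log; restore phrases are illustrative placeholders
PATH_PHRASES = ("topology", "plan the", "links for")
RESTORE_PHRASES = ("restore", "link down")

def detect_question_type(question: str) -> str:
    """Path-query phrases take priority, so 'topology' questions are not
    misclassified as link restore (the Q5 fix described above)."""
    text = question.lower()
    if any(p in text for p in PATH_PHRASES):
        return "path_query"
    if any(p in text for p in RESTORE_PHRASES):
        return "link_restore"
    return "other"
```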
- Solver Implementation:
- Fault Localization: 24 questions (NodeName;Destination;Reason format)
- ARP Link Restore: 4 questions (Q30-33 Prime nodes)
- Path Queries: 15 questions (needs server testing)
- Other Link Restore: 7 questions (needs interface desc fallback)
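Fault-localization answers use the semicolon-separated format noted above. A minimal sketch of assembling it (the field values are hypothetical examples, not real answers):

```python
def format_fault_answer(node: str, destination: str, reason: str) -> str:
    """Join fields into the NodeName;Destination;Reason answer format."""
    return ";".join((node, destination, reason))

# Hypothetical example answer
ans = format_fault_answer("NodeA", "NodeB", "fiber_cut")  # 'NodeA;NodeB;fiber_cut'
```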
- Coverage Analysis:
| Category | Questions | Share |
|---|---|---|
| Definitely Solvable | 28 | 56% |
| Potentially Solvable | 22 | 44% |
| Total | 50 | 100% |
- Data Preparation: 10,000 augmented training examples
- Source: 2,000 original → 10,000 with paraphrasing
- Files: track_a_train_augmented.jsonl, track_a_val.jsonl
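The 2,000 → 10,000 expansion implies roughly five variants per original. A sketch of template-based paraphrasing (the templates and field names are hypothetical; the actual augmentation may use different paraphrases):

```python
# Five illustrative paraphrase templates, giving a 5x expansion
TEMPLATES = [
    "{q}",
    "Question: {q}",
    "Please answer: {q}",
    "{q} Explain briefly.",
    "In this telco scenario: {q}",
]

def augment(example: dict) -> list[dict]:
    """Produce one augmented copy of the example per template."""
    return [{**example, "question": t.format(q=example["question"])}
            for t in TEMPLATES]

rows = augment({"question": "Which cell needs a tilt change?", "answer": "A1"})
```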
- Training Notebook: kaggle_qlora_training.ipynb
- Model: Qwen/Qwen2.5-32B-Instruct (or 14B fallback)
- Method: QLoRA (4-bit quantization)
- LoRA rank: 64, Alpha: 128
- Epochs: 3, Batch: 1, Gradient Accumulation: 16
- Status: Running on Kaggle T4x2 GPU
- Target: 7.4% → 40-50% IoU
- ETA: 6-8 hours (complete by tomorrow morning)
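With batch size 1 and gradient accumulation of 16, the optimizer sees an effective batch of 16 per update. A sketch collecting the hyperparameters above into a plain config dict (this is not the notebook's actual code, just the stated values):

```python
# Hyperparameters from the QLoRA run described above, as a plain dict
config = {
    "base_model": "Qwen/Qwen2.5-32B-Instruct",  # 14B fallback if memory-bound
    "quantization_bits": 4,                     # QLoRA 4-bit quantization
    "lora_rank": 64,
    "lora_alpha": 128,
    "epochs": 3,
    "per_device_batch_size": 1,
    "gradient_accumulation_steps": 16,
}

# Effective batch size seen by the optimizer per update step
effective_batch = (config["per_device_batch_size"]
                   * config["gradient_accumulation_steps"])  # 16
```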
- Ensemble Implementation: track_a_ensemble.py
- Combines: Option Analyzer (30%) + KNN (50%) + Heuristics (20%)
- Problem-type aware cell count (1-3 cells)
- Weighted voting for final selection
- Status: Implemented, awaiting data for evaluation
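Weighted voting with the 30/50/20 split above can be sketched as summing each candidate cell's solver weights and keeping the top scorers (solver outputs and the cell-count cap are illustrative):

```python
from collections import defaultdict

# Weights from the ensemble description above
WEIGHTS = {"option_analyzer": 0.3, "knn": 0.5, "heuristics": 0.2}

def ensemble_vote(candidates: dict, max_cells: int = 3) -> list:
    """Sum each cell's solver weights and return the top-scoring cells."""
    scores = defaultdict(float)
    for solver, cells in candidates.items():
        for cell in cells:
            scores[cell] += WEIGHTS[solver]
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:max_cells]

# Hypothetical per-solver candidate lists
picked = ensemble_vote({
    "option_analyzer": ["A1", "B2"],
    "knn": ["A1", "C3"],
    "heuristics": ["C3"],
})  # A1 scores 0.8, C3 scores 0.7, B2 scores 0.3
```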
Commits: 8 commits pushed
Total Changes: 1.6+ MiB of training data + code
Repository: https://github.com/okech-christopher/Competitive-Data-Science
Files Created:
projects/telco-troubleshooting/
├── src/telco_agent/
│ ├── track_a_option_analyzer.py (Rule-based solver)
│ ├── track_b_agent.py (ARP link restore + fault localization)
│ └── track_a_ensemble.py (Ensemble solver)
├── prepare_track_a_data.py (Data augmentation)
├── kaggle_qlora_training.ipynb (Kaggle training notebook)
├── test_track_b_solvers.py (Solver testing)
├── track_a_train_augmented.jsonl (10,000 examples)
└── track_a_val.jsonl (200 validation)
| Metric | Target | Actual | Status |
|---|---|---|---|
| Hours Worked | 14 | 6 | 8 hours remaining |
| Git Commits | Regular | 8 commits | |
| Track A IoU | 30% | 7% (→40% with fine-tuning) | In progress |
| Track B Coverage | 90% | 56% analyzed | Partial |
| Kaggle Training | Started | Running | |
Remaining Today:
- Track A: Monitor Kaggle training (runs automatically)
- Track B: Create mock server or test with real CLI
- Generate submission.csv with current solvers
- Document ensemble weights tuning
Tomorrow:
- Kaggle training completes (6-8 hours)
- Download fine-tuned model
- Test model inference
- Evaluate fine-tuned model IoU
- Compare: Baseline vs Ensemble vs Fine-tuned LLM
- Select best solver for Track A
- Integrate with Track B submission
| Risk | Impact | Mitigation |
|---|---|---|
| Kaggle training fails | High | Monitor logs, have 14B fallback ready |
| Track B server unavailable | Medium | Create mock server or submit partial |
| Data not loading (git-lfs) | Low | Use sample data for testing |
Best Case:
- Fine-tuned LLM: 40-50% IoU
- Track B: 90%+ coverage with working solvers
- Submission: Top 10% on leaderboard
Expected Case:
- Fine-tuned LLM: 25-35% IoU
- Track B: 70-80% coverage
- Submission: Top 20% on leaderboard
Worst Case:
- Fine-tuned LLM: 15-20% IoU
- Track B: 56% coverage (current)
- Submission: Baseline + some improvements
- GitHub: https://github.com/okech-christopher/Competitive-Data-Science
- Zindi Challenge: https://zindi.africa/competitions/telco-troubleshooting-agentic-challenge
- Kaggle: https://www.kaggle.com/code (check training status)
Last Updated: Hour 6 of Day 1
Next Check-in: Hour 8 (after Kaggle training progress check)