Honua-GIS
open-weights GIS evals.
Honua-GIS is the public track for a GIS-focused coding and workflow assistant: a benchmark dataset, deterministic eval harness, model-card scaffold, GGUF/Ollama packaging workflow, and NIM serving gate. The source track is open; released model quality and serving commands stay pending until their artifacts publish.
Source snapshot: honua-gis-llm commit 8ddf9331417d28644cb650c3eb67f92193ac337f.
What is published, and what is not.
The public page follows the same claims rule as the rest of Honua: shipped source evidence is linked, and missing model artifacts remain visibly pending.
The gis-workflow-eval benchmark is published with four task classes, deterministic graders, a report schema, and baseline runbooks. Live baseline reports are not published yet.
The model card names Qwen/Qwen2.5-Coder-32B-Instruct as the planned base model. No fine-tuned weights are published by the source repo yet.
GGUF conversion scripts and Ollama Modelfiles exist in source. The Ollama namespace URL, latest tag, GGUF SHA-256, and live smoke evidence are still null in the state file.
The NIM wrapper and smoke harness are in source. A hosted endpoint, entitlement requirements, and a live smoke report are not yet published; they are tracked in the open NIM evidence ticket.
License boundaries for tooling, benchmark data, the base model, the corpus, and distributed artifacts are tracked separately. Site copy is not a model-weight license grant.
This page may claim an open eval/model track. It must not claim production readiness, quality gains, or released serving tags before source artifacts publish them.
Evaluation evidence currently available.
No model-quality numbers are restated here because the source baseline reports and fine-tune comparison are still pending live runs.
| Artifact | Run / commit | Task classes / scope | Result fields | Evidence |
|---|---|---|---|---|
| gis-workflow-eval dataset and harness | 8ddf9331417d28644cb650c3eb67f92193ac337f | query 13, geoprocessing 12, styling 13, raster_ops 12 | 50 prompts; deterministic graders; report schema published. | benchmark README |
| Baseline runbooks | 8ddf9331417d28644cb650c3eb67f92193ac337f | Qwen 2.5 Coder 32B, GPT-4o, Claude Opus 4.1 | Configs and commands are published; `report.json` files are pending provider credentials and Qwen endpoint access. | baseline README · live baseline gate |
| Honua-GIS-32B model-card eval table | Last updated 2026-05-22 | overall, query, geoprocessing, styling, raster_ops | Honua-GIS, Qwen base, GPT-4o, and Claude comparison pass rates are all TBD in source. | eval table |
| Ollama / GGUF smoke path | pending live run | serving smoke only | Tokens/sec, p50 latency, GGUF SHA-256, and namespace tag are null in the state file. | Ollama state |
| NIM smoke path | wrapper merged; live smoke pending honua-gis-llm#19 | serving smoke only | NIM wrapper and dry-run harness are published; live evidence for endpoint, entitlement, throughput, and latency is not published. | NIM smoke README · live smoke gate |
NIM or GGUF smoke evidence proves serving plumbing only; it is not a substitute for gis-workflow-eval model-quality results.
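The per-class prompt counts listed for the gis-workflow-eval dataset in the table above can be checked against the stated 50-prompt total. The snippet below is only an arithmetic sanity check; the dictionary and function names are illustrative and are not taken from the source harness.

```python
# Per-class prompt counts as listed on this page for gis-workflow-eval.
# The variable and function names are illustrative, not the harness's own.
PROMPTS_PER_CLASS = {
    "query": 13,
    "geoprocessing": 12,
    "styling": 13,
    "raster_ops": 12,
}


def total_prompts(counts: dict[str, int]) -> int:
    """Sum the prompt counts across all task classes."""
    return sum(counts.values())


if __name__ == "__main__":
    print(total_prompts(PROMPTS_PER_CLASS))  # 50
```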
Run only what the source artifacts support.
The source repo publishes an eval harness and local GGUF/Ollama workflow. It does not yet publish a pullable Ollama tag, Hugging Face weight repo, or NIM image.
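Because the GGUF SHA-256 is one of the fields still pending in the state file, anyone building a GGUF locally may want to record its digest so it can be compared once a value is published. The following is a minimal sketch using only the Python standard library; the function names and the idea of passing the recorded digest as an argument are assumptions for illustration, not part of the source workflow.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, reading it in chunks to bound memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded(path: Path, recorded: Optional[str]) -> bool:
    """Compare a local file against a recorded digest.

    A recorded value of None (for example, a null field in a state file)
    is treated as "nothing to compare against" and returns False.
    """
    if recorded is None:
        return False
    return sha256_of_file(path) == recorded.lower()
```

While the recorded digest is null, `matches_recorded` returns `False`, which keeps a missing value from being mistaken for a successful verification.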
Artifact links and caveats.
Use this page as a status surface, not a release note.
Until gis-workflow-eval reports publish exact pass rates, Honua-GIS should not be described as outperforming any base model or closed model.
The model-card scaffold says outputs require human review and must not be used for emergency, legal, surveying, permitting, environmental compliance, or safety decisions.
The unprefixed and namespaced Ollama pull commands must wait for the source state file to record a namespace URL, tag, and GGUF SHA-256.
The site will add exact NIM commands only after the source repo publishes entitlement requirements, endpoint or image/SKU, and smoke evidence.
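The Ollama rule above (pull commands wait until the state file records a namespace URL, tag, and GGUF SHA-256) can be expressed as a simple gate. This is a sketch under stated assumptions: the field names `namespace_url`, `tag`, and `gguf_sha256` are hypothetical, and the real state file schema in the source repo may differ.

```python
import json

# Hypothetical field names; the actual state file schema may differ.
REQUIRED_FIELDS = ("namespace_url", "tag", "gguf_sha256")


def missing_fields(state: dict) -> list[str]:
    """List required fields that are absent or null in the state mapping."""
    return [name for name in REQUIRED_FIELDS if state.get(name) is None]


def may_publish_pull_command(state: dict) -> bool:
    """Allow a pull command only when every required field is recorded."""
    return not missing_fields(state)


if __name__ == "__main__":
    # Example state with all three fields still null (illustrative only).
    pending = json.loads(
        '{"namespace_url": null, "tag": null, "gguf_sha256": null}'
    )
    print(may_publish_pull_command(pending))  # False
    print(missing_fields(pending))
```

Returning the list of missing fields, rather than only a boolean, makes it easier for a status page to show exactly which values are still pending.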