kordoc by chrisryugj
38 score
An MCP server that parses South Korean document formats like HWP, HWPX, and PDF into Markdown. It features specialized table reconstruction and security-hardened extraction optimized for administrative and public institution files.
Ranked #2437 out of 6690 indexed tools. Actively maintained, with commits in the last week.
Signal Breakdown
Installs 0
Freshness 3d ago
Issue Health 50%
Stars 228
Platform Breadth 1 platform
Contributors 1
Description Detailed
How to Improve
Contributors medium impact
Platforms medium impact
Supported Platforms
From the README
# kordoc

**모두 파싱해버리겠다** ("I'll parse them all"): The Korean Document Platform.

> *Parse, compare, extract, and generate Korean documents. HWP, HWPX, PDF: all of them.*

[한국어 (Korean)](./README-KR.md)

---

## What's New in v1.6.0

- **Cluster-Based Table Detection (PDF)**: Detects borderless tables by analyzing text alignment patterns. Baseline grouping plus X-coordinate clustering identifies tables with 2+ columns that line-based detection misses. Sort-and-split clustering makes the results order-independent.
- **Korean Special Table Detection**: Automatically detects `구분/항목/종류`-style (category/item/type) key-value patterns common in Korean government documents and converts them to structured 2-column tables.
- **Korean Word-Break Recovery**: Improved merging of broken Korean words in PDF table cells. Handles character-level PDF rendering (micro-gaps between Hangul characters) and cell line-break artifacts up to 8 characters.
- **Empty Table Filtering**: Tables with all-empty cells (from line detection of decorative borders) are now automatically ...
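
The cluster-based detection and empty-table filtering described in the release notes can be illustrated with a small sketch. This is not kordoc's code: `TextItem`, `sort_and_split`, `group_by_baseline`, `detect_borderless_table`, `drop_empty_tables`, and the thresholds `x_gap` and `y_tol` are all hypothetical names and values chosen for illustration, and the sketch assumes each text fragment carries a left-edge `x` and a baseline `y` that grows downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextItem:
    """Hypothetical text fragment: left edge x, baseline y (y grows downward)."""
    text: str
    x: float
    y: float


def sort_and_split(values, gap):
    """Cluster 1-D values: sort, then split wherever neighbours differ by more than gap."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters


def group_by_baseline(items, y_tol=2.0):
    """Group fragments into rows whose baselines lie within y_tol of the row's first item."""
    rows = []
    for item in sorted(items, key=lambda i: (i.y, i.x)):
        if rows and abs(item.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(item)
        else:
            rows.append([item])
    return rows


def detect_borderless_table(items, x_gap=10.0, y_tol=2.0, min_rows=2):
    """Return a list of rows (lists of cell strings), or None if no 2+ column grid is found."""
    rows = group_by_baseline(items, y_tol)
    columns = sort_and_split([i.x for i in items], x_gap)
    centers = [sum(c) / len(c) for c in columns]
    if len(centers) < 2 or len(rows) < min_rows:
        return None
    table = []
    for row in rows:
        cells = [""] * len(centers)
        for item in row:
            # Assign each fragment to the nearest column center.
            col = min(range(len(centers)), key=lambda k: abs(centers[k] - item.x))
            cells[col] = (cells[col] + " " + item.text).strip()
        table.append(cells)
    return table


def drop_empty_tables(tables):
    """Keep only tables that contain at least one non-blank cell."""
    return [t for t in tables if any(cell.strip() for row in t for cell in row)]
```

Because both the rows and the column clusters are derived from sorted values, feeding the same fragments in a different order yields the same table, which is the order-independence the notes attribute to sort-and-split clustering. A real implementation would likely need adaptive thresholds and handling for right-aligned or centered columns.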