A product selection frontend that only an engineer could love.
Industrial spec data — drives, motors, gearheads, contactors, actuators — indexed, filtered, and exportable. No marketing copy on the rows. No "request a quote" gates. The number you need, with the datasheet that produced it.
-
TM-01
Filter chips, not facets
Every spec on every record is a chip. Click to constrain, click again to drop. No nested accordions, no "show more". The filter set is the data.
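The click-to-constrain, click-again-to-drop behavior amounts to toggling membership in a set of active chips. A minimal sketch of that logic, assuming a chip is a `(field, value)` pair and records are plain dicts (both are illustrative choices, not the frontend's actual types):

```python
def toggle(active: frozenset, chip: tuple) -> frozenset:
    """Clicking a chip adds it; clicking it again removes it."""
    return active ^ {chip}


def matches(record: dict, active: frozenset) -> bool:
    """A record passes when it satisfies every active (field, value) chip."""
    return all(record.get(field) == value for field, value in active)


# Hypothetical records and chip, for illustration only.
records = [
    {"part_number": "X100", "ip_rating": "IP65"},
    {"part_number": "X200", "ip_rating": "IP20"},
]
chips = toggle(frozenset(), ("ip_rating", "IP65"))
shortlist = [r for r in records if matches(r, chips)]
```

With no chips active, `matches` accepts every record, so dropping the last chip restores the full list.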
-
TM-02
Metric ↔ imperial, header toggle
Display-layer conversion across the whole catalog — torque, force, length, temperature. The underlying value never moves; the unit you read does.
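One way to keep the stored value fixed while the displayed unit changes is to convert only at render time. The sketch below assumes metric is the canonical system and uses illustrative unit labels; `Quantity` and `display` are hypothetical names, not part of the project:

```python
from dataclasses import dataclass

# Scale factors from assumed canonical (metric) units to imperial display units.
_TO_IMPERIAL = {
    "Nm": ("lbf·ft", 0.7375621493),
    "N": ("lbf", 0.2248089431),
    "mm": ("in", 1 / 25.4),
}


@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str


def display(q: Quantity, system: str = "metric") -> Quantity:
    """Return the quantity as it should be shown; the stored one is untouched."""
    if system == "metric":
        return q
    if q.unit == "°C":  # temperature needs an offset, not just a scale factor
        return Quantity(q.value * 9 / 5 + 32, "°F")
    if q.unit in _TO_IMPERIAL:
        unit, factor = _TO_IMPERIAL[q.unit]
        return Quantity(q.value * factor, unit)
    return q  # no imperial counterpart (e.g. rpm, V): show as stored


stored = Quantity(10.0, "Nm")
shown = display(stored, "imperial")
```

Because `Quantity` is frozen and `display` returns a new object, toggling the header back to metric shows the original value unchanged.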
-
TM-03
Datasheet links on every row
The PDF that produced the row is one click away. Verify a number, check a derate curve, copy a part code straight from the source.
-
TM-04
Rows export like a BOM
Filter to a shortlist, export to CSV. Tabular numerics, canonical units, manufacturer + part number — drop it into a spec sheet without massage.
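A shortlist export of this kind can be sketched with the standard `csv` module. The column names below are hypothetical, since the real export schema is not shown here; the point is a fixed header with manufacturer and part number first and units carried in the column name:

```python
import csv
import io

# Hypothetical column order; the actual export columns may differ.
COLUMNS = ["manufacturer", "part_number", "rated_torque_Nm", "rated_speed_rpm"]


def export_csv(rows: list) -> str:
    """Serialize a filtered shortlist to CSV text with a fixed header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=COLUMNS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)  # keys outside COLUMNS are dropped, missing ones left blank
    return buf.getvalue()


text = export_csv(
    [{"manufacturer": "Acme", "part_number": "X100",
      "rated_torque_Nm": 1.2, "rated_speed_rpm": 3000, "notes": "internal"}]
)
```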
-
DS-01
PDF catalogs
`specodex` CLI: page-finder identifies spec tables (free, no LLM call), Gemini extracts structured rows, Pydantic validates, DynamoDB stores. Never feed it a 600-page raw catalog; page filtering is mandatory.
-
DS-02
Product webpages
`web-scraper` CLI: Playwright renders JS-heavy product pages, pulls JSON-LD + HTML, and runs the same extraction pipeline. From the database's perspective, the result behaves the same as a PDF-sourced row.
-
DS-03
Manual entry
Admin-mode UI: presigned-URL upload, full CRUD, per-row edits. For when a vendor ships a one-off spec note that no PDF will ever show.
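The page-filtering step described for PDF catalogs can be approximated with a plain text heuristic. The sketch below is an assumption about how such a filter might work (counting "number + unit" tokens per page), not the project's actual page-finder; the unit list and threshold are illustrative:

```python
import re

# Assumed signal: spec-table pages contain many "number + unit" tokens.
_UNIT_TOKEN = re.compile(r"\b\d+(?:\.\d+)?\s?(?:Nm|rpm|kW|W|V|A|mm|kg)\b")


def find_spec_pages(pages: list, min_hits: int = 5) -> list:
    """Return 0-based indices of pages whose text looks like a spec table."""
    return [
        i for i, text in enumerate(pages)
        if len(_UNIT_TOKEN.findall(text)) >= min_hits
    ]


# Page text would come from a PDF text extractor; these strings are made up.
pages = [
    "Founded in 1987, Acme builds motors for demanding applications.",
    "X100  1.2 Nm  3000 rpm  400 V  2.5 A  0.75 kW",
]
spec_pages = find_spec_pages(pages)
```

Only the pages returned here would be passed on to the LLM extraction step, which is what keeps the filter itself free of LLM calls.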
-
PT-01
Motors
Brushless DC, AC servo, AC induction. Voltage, current, power, torque, speed, encoder type, rotor inertia, IP rating.
-
PT-02
Drives
Servo and VFD drives. Input/output voltage, power, switching frequency, I/O counts, fieldbus protocols, safety ratings.
-
PT-03
Gearheads
Planetary, harmonic, cycloidal. Ratio, backlash, continuous and peak torque, input speed, torsional rigidity, service life.
-
PT-04
Electric cylinders
Stroke, push/pull force, continuous force, linear speed, positioning repeatability, lead-screw pitch.
-
PT-05
Linear actuators
Stroke, force class, lead, screw type, duty cycle, IP rating — the screw-driven side, kept separate from servo electric cylinders.
-
PT-06
Robot arms
Payload, reach, pose repeatability, max TCP speed, axis count, per-axis torque and speed.
-
PT-07
Contactors
AC-1/AC-3 ratings, coil voltages, auxiliary contact counts, short-circuit ratings — switching gear that lives one rack over from the drives.
-
PT-08
Extensible
Run `./Quickstart schemagen <pdf>... --type <name>` with 3–5 vendor catalogs to scaffold a new Pydantic model. The new type is auto-discovered by every CLI; the TS allow-lists are documented in `CLAUDE.md`.
Everything goes through `./Quickstart <command>`, a single bash entry point that delegates to `cli/quickstart.py`. Run it with no command to start the local dev servers.

    # Clone and install
    git clone https://github.com/JimothyJohn/specodex.git
    cd specodex
    uv sync

    # Local dev (backend :3001, frontend Vite :5173)
    ./Quickstart dev

    # Pre-push gate, mirrors CI exactly: lint + tests + build
    ./Quickstart verify

    # Extract specs from a PDF (page-finder + Gemini + Pydantic)
    uv run specodex \
      --url "https://example.com/motor-catalog.pdf" \
      --type motor --manufacturer "Acme" --product-name "X100"

    # Scrape a product webpage (JS-rendered, same pipeline)
    uv run web-scraper \
      --url "https://shop.example.com/products/X100" \
      --type motor --manufacturer "Acme" --product-name "X100"

    # Query the database
    uv run dsm find --type motor \
      --where "rated_power>=1000" --sort "rated_torque:desc"

    # Propose a new Pydantic product model from 3-5 vendors
    ./Quickstart schemagen abb.pdf siemens.pdf schneider.pdf --type valve

    # Benchmark the ingress pipeline against control datasheets
    ./Quickstart bench

| Endpoint | Description |
|---|---|
| `GET /health` | Status, mode, environment, timestamp |
| `GET /api/v1/search` | Full-text + filtered + sorted search |
| `GET /api/products` | List products (filter `?type=motor`) |
| `GET /api/products/summary` | Counts by product type |
| `GET /api/products/categories` | Available product types with counts |
| `GET /api/products/manufacturers` | Unique manufacturer list |
| `GET /api/datasheets` | Datasheet source entries |
| `POST /api/upload` | Queue a datasheet for processing (admin) |
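As a small illustration of the `?type=` filter on `GET /api/products`, the helper below builds the request URL with the standard library. The base URL assumes the local dev backend on port 3001 from the quickstart; `products_url` is a hypothetical helper, and the response shape is not shown because it is not documented here:

```python
from typing import Optional
from urllib.parse import urlencode, urljoin

BASE_URL = "http://localhost:3001"  # assumed: local dev backend from the quickstart


def products_url(product_type: Optional[str] = None, base: str = BASE_URL) -> str:
    """Build the URL for GET /api/products, optionally filtered by type."""
    url = urljoin(base, "/api/products")
    if product_type:
        url += "?" + urlencode({"type": product_type})
    return url


url = products_url("motor")
```

The resulting URL can be passed to any HTTP client, for example `urllib.request.urlopen(url)` while the dev server is running.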
-
ARCH-01
Extraction (Python)
Page-finder text heuristic strips a 600-page catalog to ~20 spec pages before any LLM call. Gemini emits structured JSON; `specodex` validators map it to canonical `value;unit` compact strings. Each row is quality-scored, then written.
-
ARCH-02
Frontend
React + TypeScript + Vite. Deployed to S3 behind CloudFront. Two modes: admin (full CRUD) and public (read-only search and filter).
-
ARCH-03
Backend
Express on AWS Lambda via API Gateway. REST: search, product CRUD, datasheet management, upload pipeline.
-
ARCH-04
Data
DynamoDB single-table: `PK=PRODUCT#TYPE`, `SK=PRODUCT#UUID`. S3 for PDFs. Deterministic UUIDs deduplicate across sources.
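The storage conventions above (compact `value;unit` strings, `PRODUCT#` keys, deterministic UUIDs) can be sketched together. Everything specific here is an assumption made for illustration: the UUID namespace, the choice of manufacturer plus part number as the identity, the upper-cased type segment, and the attribute names on the item:

```python
import uuid

# Hypothetical namespace: any fixed UUID works, as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "specodex.example")


def to_compact(value: float, unit: str) -> str:
    """Format a reading as a 'value;unit' compact string (format assumed)."""
    return f"{value:.10g};{unit}"


def from_compact(text: str) -> tuple:
    """Split a 'value;unit' compact string back into a number and a unit."""
    raw_value, sep, unit = text.partition(";")
    unit = unit.strip()
    if not sep or not unit:
        raise ValueError(f"expected 'value;unit', got {text!r}")
    return float(raw_value), unit


def product_keys(product_type: str, manufacturer: str, part_number: str) -> dict:
    """Derive identical PK/SK for the same product, whichever source it came from."""
    # Assumption: identity is manufacturer + part number, ignoring case and padding.
    identity = f"{manufacturer.strip().lower()}|{part_number.strip().lower()}"
    product_id = uuid.uuid5(NAMESPACE, identity)
    # Assumption: the TYPE segment is the upper-cased product type.
    return {"PK": f"PRODUCT#{product_type.upper()}", "SK": f"PRODUCT#{product_id}"}


item = {
    **product_keys("motor", "Acme", "X100"),
    "rated_torque": to_compact(1.5, "Nm"),
    "rated_speed": to_compact(3000, "rpm"),
}
```

Because `uuid.uuid5` is a pure function of its inputs, a row extracted from a PDF and the same product scraped from a webpage produce the same `SK`, so the second write overwrites the first instead of creating a duplicate.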