AI projects that actually work, not demos.

Production RAG, agents, document processing and audio analytics. Local or cloud, GDPR and EU AI Act compliant. See what AI looks like when it ships to a real client — instead of to a conference slide.

12+ Shipped AI projects
200k+ PDF pages processed
500+ Audio hours analyzed
100% On-prem RAG deploys
Live demo

What it looks like when AI extracts structured data from a contract.

The system pulls 8 fields out of one contract (customer, dates, amounts, payment terms, risk clause and more) with automatic source references, the same way it works on your documents.

PDF • 12 pages • received: 2026-04-12 09:14

Service Agreement — 2026/04 — extract

The undersigned parties hereby enter into the following service agreement on the date set forth below. The customer: Example Trading Ltd., registered office: 1051 Budapest, Sample street 12, tax ID: 12345678-2-41.

Date of execution: April 12, 2026. Performance schedule per the attached annex. The contractor's fee is due on May 15, 2026.

The net consideration for the contracted services is HUF 1,250,000, to which 27% value-added tax (HUF 337,500) applies. The gross consideration is HUF 1,587,500.

In the event of delayed performance, the contractor shall pay a penalty of 0.5%/day, up to 10% of the contract value. The parties shall first attempt to resolve any disputes through negotiation; failing that, the competent Hungarian court shall have jurisdiction.
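One way to picture the extraction result for the sample above is a small Python sketch. It is an illustration under stated assumptions, not the production pipeline: the names `ExtractedField`, `is_grounded`, `check_amounts` and `capped_penalty` are hypothetical, and the penalty is assumed to be calculated on the net contract value, which the sample text does not specify.

```python
from dataclasses import dataclass
from decimal import Decimal

# Excerpt of the sample contract shown above.
CONTRACT_TEXT = (
    "The net consideration for the services contracted is HUF 1,250,000. "
    "The gross consideration is HUF 1,587,500. "
    "The contractor's fee is due on May 15, 2026."
)


@dataclass(frozen=True)
class ExtractedField:
    """One extracted value plus the quote it was taken from (hypothetical shape)."""
    name: str
    value: str
    source_quote: str


def is_grounded(field: ExtractedField, document_text: str) -> bool:
    """A field counts as grounded only if its quote appears verbatim in the document."""
    return field.source_quote in document_text


def check_amounts(net: Decimal, vat_rate: Decimal, vat: Decimal, gross: Decimal) -> list[str]:
    """Return a list of arithmetic inconsistencies between the extracted amounts."""
    problems = []
    if net * vat_rate != vat:
        problems.append(f"VAT mismatch: expected {net * vat_rate}, got {vat}")
    if net + vat != gross:
        problems.append(f"Gross mismatch: expected {net + vat}, got {gross}")
    return problems


def capped_penalty(
    contract_value: Decimal,
    days_late: int,
    daily_rate: Decimal = Decimal("0.005"),  # 0.5% per day, from the sample clause
    cap_rate: Decimal = Decimal("0.10"),     # capped at 10% of the contract value
) -> Decimal:
    """Late-performance penalty, assuming it is based on the net contract value."""
    return min(contract_value * daily_rate * days_late, contract_value * cap_rate)


net_fee = ExtractedField("net_amount", "1250000", "HUF 1,250,000")
print(is_grounded(net_fee, CONTRACT_TEXT))
print(check_amounts(Decimal("1250000"), Decimal("0.27"), Decimal("337500"), Decimal("1587500")))
print(capped_penalty(Decimal("1250000"), days_late=30))
```

With these numbers the amounts are consistent, and the penalty reaches its cap after 20 days of delay.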

Case studies

Six real projects we actually shipped.

Behind every card: a client, a concrete business problem and measurable results. Open one to see the details.

Models & pipeline

What we run in production.

Frontier APIs and open-source models — always picked for the job at hand. No vendor lock-in, on-prem deployment available.

Pipeline & infrastructure

LangChain, LangGraph, LangSmith, Langfuse, LlamaIndex, Pydantic, FastAPI, Temporal, Celery, Qdrant, Pinecone, ChromaDB, Redis, PostgreSQL, Docker, Kubernetes

Process

What an AppForge AI project looks like.

We don't promise a 6-month POC. We ship in 2–6 weeks, then iterate based on real data and user feedback.

1. Use case & ROI workshop

After a 30-minute call we walk through your data and pick the 1–2 use cases where AI pays back fastest. Concrete numbers — not a deck.

  • Data audit
  • Risk and GDPR / EU AI Act check
  • Measurable success metric

2. Data pipeline & eval

Garbage in, garbage out is the most expensive AI mistake. We build the data pipeline, create an eval set, and only move on once we can objectively measure the model's progress against it.

  • Eval set + golden standard
  • Embedding / chunking strategy
  • Langfuse / LangSmith observability
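As a rough illustration of what "eval set + golden standard" can mean in practice, here is a minimal Python sketch. The names (`EvalCase`, `accuracy`, `passes_gate`), the exact-match scoring and the two stand-in "models" are assumptions made for the example; a real eval would call the actual system and usually needs a softer scoring rule.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    question: str
    expected: str


# A tiny golden set based on the sample contract above.
GOLDEN_SET = [
    EvalCase("Who is the customer?", "Example Trading Ltd."),
    EvalCase("What is the net fee?", "HUF 1,250,000"),
    EvalCase("What is the gross fee?", "HUF 1,587,500"),
    EvalCase("When is the fee due?", "May 15, 2026"),
]


def accuracy(predict: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Share of cases where the prediction matches the golden answer (case-insensitive)."""
    if not cases:
        raise ValueError("eval set must not be empty")
    hits = sum(
        1 for case in cases
        if predict(case.question).strip().lower() == case.expected.strip().lower()
    )
    return hits / len(cases)


def passes_gate(candidate: float, baseline: float, min_gain: float = 0.0) -> bool:
    """Only move on if the candidate beats the baseline by at least min_gain."""
    return candidate >= baseline + min_gain


# Stand-in "models": plain lookups, used here instead of real LLM calls.
baseline_answers = {
    "Who is the customer?": "Example Trading Ltd.",
    "What is the net fee?": "HUF 1,250,000",
}
candidate_answers = {**baseline_answers, "What is the gross fee?": "HUF 1,587,500"}

baseline_score = accuracy(lambda q: baseline_answers.get(q, ""), GOLDEN_SET)
candidate_score = accuracy(lambda q: candidate_answers.get(q, ""), GOLDEN_SET)
print(baseline_score, candidate_score, passes_gate(candidate_score, baseline_score, 0.1))
```

The point of the gate is that a prompt or model change is accepted on a measured score, not on an impression from a handful of manual tries.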

3. MVP in production

We ship to a small user group in 2–6 weeks. Minimum production setup: monitoring, tracing, cost tracking, fallback paths — you see real metrics, not a demo video.

  • Monitoring & tracing
  • Token & latency budget
  • Resilient fallbacks
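A fallback path with a latency budget could be sketched roughly as follows. This is a simplified illustration: `complete_with_fallback`, `AllProvidersFailed` and the two stand-in providers are hypothetical names, and each provider is assumed to be a callable that accepts a timeout in seconds and raises an exception on failure.

```python
import time
from typing import Callable, Sequence

# A provider takes (prompt, timeout_seconds) and returns a completion or raises.
Provider = Callable[[str, float], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider failed or the latency budget ran out."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Provider]],
    budget_s: float,
) -> tuple[str, str]:
    """Try providers in order within one shared latency budget.

    Returns (provider_name, completion) from the first provider that succeeds.
    """
    deadline = time.monotonic() + budget_s
    errors = []
    for name, call in providers:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            errors.append(f"{name}: skipped, latency budget exhausted")
            continue
        try:
            # Pass the remaining budget on so the provider can enforce its own timeout.
            return name, call(prompt, remaining)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{name}: {exc!r}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for illustration only.
def flaky_primary(prompt: str, timeout_s: float) -> str:
    raise TimeoutError("primary provider timed out")


def local_backup(prompt: str, timeout_s: float) -> str:
    return f"backup answer to: {prompt}"


used, answer = complete_with_fallback(
    "Summarize the contract.",
    [("primary", flaky_primary), ("backup", local_backup)],
    budget_s=5.0,
)
print(used, answer)
```

In this sketch the collected error messages are what would feed monitoring, so a silent switch to the backup still shows up in the metrics.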

4. Iterate & scale

The real tuning starts when people use it. Eval-driven iteration, prompt and model versioning, A/B testing. Development starts on launch day — it doesn't end there.

  • Prompt & model versioning
  • A/B testing on traffic
  • Cost optimization
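For A/B testing on live traffic, one common approach is deterministic bucketing, so the same user always sees the same prompt or model variant. The sketch below assumes a hypothetical `assign_variant` helper and an experiment name chosen for illustration; it is not a description of any specific tool.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically assign a user to variant "A" (control) or "B" (treatment).

    Hashing experiment and user ID together keeps the assignment stable per user
    and independent between experiments.
    """
    if not 0.0 <= treatment_share <= 1.0:
        raise ValueError("treatment_share must be between 0 and 1")
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # bucket in 0..9999
    threshold = round(treatment_share * 10_000)
    return "B" if bucket < threshold else "A"


# Route 10% of users to a new prompt version (hypothetical experiment name).
variant = assign_variant("user-42", "prompt-v2-rollout", treatment_share=0.1)
print(variant)
```

Because the assignment needs no stored state, it can be recomputed later when joining traffic logs with eval results.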

What "production AI" actually means

The portfolio above shows projects that ship to a real client and stay running. That is the bar. A demo on a trade-show screen running against three handpicked documents is not on this page. Production AI means: the system handles real traffic, fails gracefully when an LLM provider has a bad day, logs every call for the EU AI Act Article 12 audit trail, and produces a measurable business result the client can point at — fewer hours of human work, faster decision-making, less revenue leakage from missed signals.

Our default stack is Python with FastAPI, LangChain or LangGraph for the agentic orchestration, Postgres for state, Redis for cache, Sentry for errors, OpenTelemetry for traces. The LLM layer is provider-neutral — Anthropic Claude, OpenAI GPT, Google Gemini, or local models (Llama 3.3, Qwen 2.5, Mistral) on Ollama or vLLM for sensitive data. We can swap models behind a config change, and every prompt has an eval suite so a model upgrade does not silently regress behaviour. For RAG we lean on Postgres + pgvector or Qdrant; document parsing uses unstructured.io or a custom pipeline depending on file shape.
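To make "swap models behind a config change" concrete, here is a minimal Python sketch of a provider-neutral layer. Everything in it is an assumption for illustration: `ModelConfig`, `make_client`, the backend names and the model identifiers are hypothetical, and the backends are stand-ins rather than real SDK calls.

```python
from dataclasses import dataclass
from typing import Callable

# A backend takes (model_name, prompt) and returns a completion.
Backend = Callable[[str, str], str]


@dataclass(frozen=True)
class ModelConfig:
    provider: str  # key into the backend registry
    model: str     # model identifier passed through to the backend


def make_client(config: ModelConfig, backends: dict[str, Backend]) -> Callable[[str], str]:
    """Build a prompt -> completion function from configuration only."""
    try:
        backend = backends[config.provider]
    except KeyError:
        raise ValueError(f"unknown provider: {config.provider!r}") from None
    return lambda prompt: backend(config.model, prompt)


# Stand-in backends; real ones would wrap a hosted API or a local inference server.
BACKENDS: dict[str, Backend] = {
    "hosted": lambda model, prompt: f"[hosted/{model}] {prompt}",
    "local": lambda model, prompt: f"[local/{model}] {prompt}",
}

# Application code depends only on `ask`; changing the config changes the model.
ask = make_client(ModelConfig(provider="local", model="example-model"), BACKENDS)
print(ask("Classify this support ticket."))
```

Application code that only depends on the returned function does not change when the configuration points at a different provider, which is what lets the same eval suite run before and after a swap.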

Privacy and compliance are first-class. For GDPR data residency we host in EU regions (Hetzner Falkenstein, Cloudflare EU, AWS Frankfurt). For sensitive verticals (banking, healthcare, public sector) we run the full stack on your hardware or in your VPC, with no data leaving your perimeter. From August 2, 2026 the EU AI Act high-risk obligations apply. For relevant systems we ship Article 11 technical documentation, Article 12 logging, Article 13 transparency notices, and Article 50 chatbot and deepfake disclosure, all built into the application. This is not a compliance-theatre add-on; it is baked into the architecture from the start.

How an AI engagement starts

The first conversation is a 30-minute scoping call. We ask three questions: what business outcome do you want, what data do you have access to, and what does "done" look like in numerical terms. From that we draft a one-page proposal with a fixed-price pilot — usually EUR 8,000-25,000, 4-8 weeks. The pilot ships a working prototype against your real data, with measurable results (precision, recall, latency, cost per call), so you can decide before signing on for the full build whether the use case earns its keep.
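The pilot metrics named above can be computed in a few lines. The following Python sketch is illustrative only: the function names are hypothetical, and the token prices are made-up placeholders, not any provider's actual pricing.

```python
def precision_recall(predicted: set[str], relevant: set[str]) -> tuple[float, float]:
    """Precision: share of returned items that are correct.
    Recall: share of correct items that were returned."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


def cost_per_call(
    input_tokens: int,
    output_tokens: int,
    price_in_per_mtok: float,   # price per million input tokens (placeholder value)
    price_out_per_mtok: float,  # price per million output tokens (placeholder value)
) -> float:
    """Cost of one LLM call in the currency of the given prices."""
    return (input_tokens * price_in_per_mtok + output_tokens * price_out_per_mtok) / 1_000_000


# Example: the system flagged 4 clauses, 2 of which are among the 3 truly risky ones.
p, r = precision_recall({"c1", "c2", "c3", "c4"}, {"c1", "c2", "c5"})
print(p, r)
print(cost_per_call(2_000, 500, price_in_per_mtok=3.0, price_out_per_mtok=15.0))
```

In this example precision is 0.5 and recall is two thirds, which is the kind of number a pilot report would put next to latency and cost per call.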

After the pilot, full engagements run EUR 25,000-150,000 over 3-9 months depending on integration depth and infrastructure choices. Maintenance is 15-25% per year, plus prompt eval regression checks on every model update. Source code is yours from day one (Git repository, full ownership), with documentation, a prompt registry, and an onboarding session for your engineers if they want to take over operations later.

Let's talk

Want to see where AI pays back fastest in your business?

30-minute free consultation. We'll walk through your processes, point at the 1–2 use cases worth starting with — and show you our references.

Start a project