
Building a Node.js document intelligence pipeline for under $10/day

Dev.to AI · by Sébastien LOPEZ · April 6, 2026 · 6 min read


You've got 10,000 support tickets, blog posts, or product reviews to process. You need summaries and keywords for each. What does that actually cost?

This post walks through a real Node.js pipeline that processes documents in parallel with rate limiting, error handling, and retry logic — and calculates exactly what you'll pay.

The economics first

Using a pay-per-use API (1 USDC = 1,000 credits):

| Operation | Credits | Cost per call | 10,000 docs |
|-----------|---------|---------------|-------------|
| Summarize | 10      | $0.01         | $100        |
| Keywords  | 5       | $0.005        | $50         |
| Both      | 15      | $0.015        | $150        |

No monthly fee. No minimum. Idle months cost $0.
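Since 1 USDC buys 1,000 credits, batch cost is just credits divided by 1,000. A tiny helper (an illustration of the math above, not part of the API) makes the conversion explicit:

```javascript
// 1 USDC = 1,000 credits, so cost in dollars is credits / 1,000
function batchCostUSD(docCount, creditsPerDoc) {
  return (docCount * creditsPerDoc) / 1000;
}

console.log(batchCostUSD(10000, 15)); // summarize + keywords for 10k docs → 150
console.log(batchCostUSD(10000, 5));  // keywords only → 50
```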

Setting up

```shell
npm init -y
npm install node-fetch p-limit
```


Get a free API key (100 credits, no card needed):

```shell
curl -s -X POST https://textai-api.overtek.deno.net/keys/create \
  -H "Content-Type: application/json" \
  -d '{"label":"node-pipeline"}'

# {"apiKey":"sk_...","credits":100}
```


The pipeline

```javascript
// pipeline.js
import fetch from 'node-fetch';
import pLimit from 'p-limit';

const API_BASE = 'https://textai-api.overtek.deno.net';
const API_KEY = process.env.TEXTAI_API_KEY;

// Rate limit: 10 concurrent requests max
const limit = pLimit(10);

async function processDoc(doc, retries = 3) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      const [summaryRes, keywordsRes] = await Promise.all([
        fetch(`${API_BASE}/summarize`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json', 'X-API-Key': API_KEY },
          body: JSON.stringify({ text: doc.text, max_sentences: 3 })
        }),
        fetch(`${API_BASE}/keywords`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json', 'X-API-Key': API_KEY },
          body: JSON.stringify({ text: doc.text, top_n: 5 })
        })
      ]);

      if (!summaryRes.ok || !keywordsRes.ok) {
        throw new Error(`API error: ${summaryRes.status} / ${keywordsRes.status}`);
      }

      const [summary, keywords] = await Promise.all([
        summaryRes.json(),
        keywordsRes.json()
      ]);

      return {
        id: doc.id,
        summary: summary.summary,
        keywords: keywords.keywords,
        creditsUsed: (summary.creditsUsed || 0) + (keywords.creditsUsed || 0)
      };
    } catch (err) {
      if (attempt === retries) throw err;
      // Exponential backoff: 1s, 2s, 4s
      await new Promise(r => setTimeout(r, 1000 * 2 ** (attempt - 1)));
    }
  }
}

async function processBatch(documents) {
  console.log(`Processing ${documents.length} documents...`);
  const start = Date.now();
  let totalCredits = 0;
  let errors = 0;

  const tasks = documents.map(doc =>
    limit(() => processDoc(doc)
      .then(result => {
        totalCredits += result.creditsUsed;
        return result;
      })
      .catch(err => {
        errors++;
        console.error(`Failed doc ${doc.id}: ${err.message}`);
        return { id: doc.id, error: err.message };
      })
    )
  );

  const results = await Promise.all(tasks);
  const elapsed = ((Date.now() - start) / 1000).toFixed(1);

  console.log(`Done in ${elapsed}s`);
  // 1,000 credits = 1 USDC, so $0.001 per credit
  console.log(`Credits used: ${totalCredits} ($${(totalCredits * 0.001).toFixed(4)})`);
  console.log(`Errors: ${errors}/${documents.length}`);

  return results;
}

// Example usage
const docs = Array.from({ length: 50 }, (_, i) => ({
  id: `doc_${i}`,
  text: `Document ${i}: The quick brown fox jumps over the lazy dog. This is sample content that needs to be summarized and have keywords extracted from it for demonstration purposes.`
}));

processBatch(docs).then(results => {
  console.log('Sample result:', JSON.stringify(results[0], null, 2));
});
```


Running it

```shell
$ TEXTAI_API_KEY=sk_your_key node pipeline.js
Processing 50 documents...
Done in 8.3s
Credits used: 750 ($0.7500)
Errors: 0/50
```


50 documents, 75 cents, 8 seconds.

Scaling to 10,000 documents

The same code handles 10,000 docs — just change the input array. With 10 concurrent requests and ~200ms average latency per pair of calls:

  • Throughput: ~50 docs/second

  • Time for 10k docs: ~3.3 minutes

  • Cost: 150,000 credits = 150 USDC = $150

For most pipelines, you'd run this overnight or as a scheduled job. Total infrastructure: one Node.js script and a few dollars of USDC.
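Those back-of-envelope numbers can be checked with a few lines. The 10-way concurrency and ~200 ms per summarize+keywords pair are the assumptions stated above, not measured values:

```javascript
// Rough wall-clock estimate for a batch: concurrency * (1000 / latencyMs)
// requests complete per second, so docs/second equals that figure when each
// doc takes one latency window.
function estimateSeconds(docCount, concurrency = 10, latencyMs = 200) {
  const docsPerSecond = concurrency * (1000 / latencyMs); // 10 * 5 = 50
  return docCount / docsPerSecond;
}

console.log(estimateSeconds(10000)); // 200 seconds, i.e. ~3.3 minutes
```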

Cost optimization tips

  1. Only extract what you need. Keywords alone is 5 credits vs 15 for both. If you just need tagging, skip summarization.

  2. Chunk large documents. The API processes up to 50,000 characters. For longer docs, split into sections and summarize each, then summarize the summaries.

```javascript
function chunkText(text, maxChars = 40000) {
  const chunks = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```

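The "summarize each, then summarize the summaries" step can be sketched as a small map-reduce on top of that chunker. This is a sketch, not the article's pipeline code: `summarize` is injected as any `async (text) => string` function (for example, a wrapper around the `/summarize` request), and `chunkText` is repeated so the block is self-contained:

```javascript
// Split a long document into chunks under the API's character limit.
function chunkText(text, maxChars = 40000) {
  const chunks = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}

// Map-reduce summarization: summarize each chunk in parallel, then
// summarize the concatenated partial summaries. `summarize` is any
// async (text) => summaryString function supplied by the caller.
async function summarizeLong(text, summarize, maxChars = 40000) {
  if (text.length <= maxChars) return summarize(text);
  const partials = await Promise.all(
    chunkText(text, maxChars).map(chunk => summarize(chunk))
  );
  return summarize(partials.join('\n'));
}
```

Injecting `summarize` keeps the control flow testable with a stub and leaves the per-chunk credit cost visible: a 90,000-character document costs three summarize calls plus one more for the reduce step.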

  3. Cache results. If you're re-processing documents that haven't changed, cache by content hash to avoid re-spending credits.

```javascript
import crypto from 'crypto';

const cache = new Map();

// Content-addressed cache key: first 16 hex chars of the SHA-256 digest
function hashDoc(text) {
  return crypto.createHash('sha256').update(text).digest('hex').slice(0, 16);
}

async function processDocCached(doc) {
  const key = hashDoc(doc.text);
  if (cache.has(key)) return { ...cache.get(key), id: doc.id, cached: true };
  const result = await processDoc(doc);
  cache.set(key, result);
  return result;
}
```


Error handling patterns

The pipeline above uses retry with exponential backoff. For production, also handle:

  • Credit exhaustion: Check creditsRemaining in each response and top up proactively

  • Rate limits: The 429 response means slow down — p-limit handles this via concurrency control

  • Partial failures: Log failed doc IDs and re-run just those after fixing the issue
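The first and third points can be sketched in a few lines. Note the assumptions: `creditsRemaining` as a response field and the 100-credit threshold are illustrative guesses about the response shape, not documented API behavior:

```javascript
// Hypothetical guard rails around each API response. The field name
// `creditsRemaining` and the threshold are assumptions for illustration.
const LOW_CREDIT_THRESHOLD = 100;

function checkResponse(res, body) {
  if (res.status === 429) {
    // Rate limited: surface it so the retry/backoff loop slows down
    throw new Error('rate limited (429): slow down');
  }
  if (typeof body.creditsRemaining === 'number' &&
      body.creditsRemaining < LOW_CREDIT_THRESHOLD) {
    console.warn(`Low credits: ${body.creditsRemaining} left, top up soon`);
  }
  return body;
}

// Partial failures: collect the IDs that errored so only those get re-run
function failedIds(results) {
  return results.filter(r => r.error).map(r => r.id);
}
```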

What this is good for

  • Content pipelines: Summarize RSS feeds, news articles, research papers automatically

  • SEO tools: Extract keywords from competitor content at scale

  • Support ticket triage: Auto-tag and summarize incoming tickets before routing

  • Newsletter curation: Summarize 100 articles to pick the best 10

The pay-per-use model means you pay for actual processing, not capacity. Process 50 docs or 50,000 — the math is linear.

TextAI API is launching on Product Hunt today. Try it free (100 credits, no card): https://textai-api.overtek.deno.net
