Show HN: I built a human rights evaluator for HN (content vs. site behavior) https://ift.tt/tzdylps

My health challenges limit how much I can work. I've come to think of Claude Code as an accommodation engine, not in the medical-paperwork sense but in the literal one: it gives me the capacity to finish things that a normal work environment doesn't. Observatory was built in eight days because that kind of collaboration became possible for me. (I even used Claude Code to write this post, but I'm only posting what resonates with me.)

Two companion posts: one on the recursive methodology (https://ift.tt/rgLafkS...) and one on what 806 evaluated stories reveal (https://ift.tt/C2KJ4G0...).

I built Observatory to automatically evaluate Hacker News front-page stories against all 31 provisions of the UN Universal Declaration of Human Rights, starting with HN because its human-curated front page is one of the few feeds where a story's presence signals something about quality, not just virality. It runs every minute: https://ift.tt/soYVEfd. Claude Haiku 4.5 handles full evaluations; Llama 4 Scout and Llama 3.3 70B on Workers AI run a lighter free-tier pass.

The observation that shaped the design: rights violations rarely announce themselves. An article about a company's "privacy-first approach" might appear on a site running twelve trackers. The interesting signal isn't whether an article mentions privacy; it's whether the site's infrastructure matches its words.

Each evaluation runs two parallel channels. The editorial channel scores what the content says about rights: which provisions it touches, direction, evidence strength. The structural channel scores what the site infrastructure does: tracking, paywalls, accessibility, authorship disclosure, funding transparency. The divergence between the two, SETL (Structural-Editorial Tension Level), is often the most revealing number: "says one thing, does another," quantified.

Every evaluation separates observable facts from interpretive conclusions (the Fair Witness layer, the same concept as fairwitness.bot: https://ift.tt/MVox3Wf). You get a facts-to-inferences ratio and can read exactly what evidence the model cited. If a score looks wrong, follow the chain and tell me where the inference fails.

Across 805 evaluated stories: only 65% identify their author, meaning roughly one in three HN stories runs without a named author. Just 18% disclose conflicts of interest. 44% assume expert knowledge (a structural note on Article 26). Tech coverage runs nearly 10× more retrospective than prospective: past harm is documented extensively; prevention is discussed rarely.

One story illustrates SETL best: "Half of Americans now believe that news organizations deliberately mislead them" (fortune.com, 652 HN points). Editorial: +0.30. Structural: −0.63 (paywall, tracking, no funding disclosure). SETL: 0.84. A story about why people don't trust media, from an outlet whose own infrastructure demonstrates the pattern.

The structural channel for the free Llama models is noisy: 86% of scores cluster on two integers. The direction I'm exploring is TQ (Transparency Quotient), built from binary, countable indicators that don't need LLM interpretation (author named? sources cited? funding disclosed?).
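To make that direction concrete, here's a minimal sketch of what a TQ pass could look like: binary checks averaged into a score. The indicator names below are my guesses at plausible checks, not Observatory's actual list.

```typescript
// Hypothetical TQ (Transparency Quotient) pass: binary, countable
// indicators that need no LLM interpretation. Field names are
// illustrative guesses, not Observatory's actual schema.
type TransparencyIndicators = {
  authorNamed: boolean;        // does the page name its author?
  sourcesCited: boolean;       // does it link or cite its sources?
  fundingDisclosed: boolean;   // is funding or ownership disclosed?
  datePublished: boolean;      // is a publication date visible?
  conflictsDisclosed: boolean; // are conflicts of interest declared?
};

// TQ = fraction of indicators satisfied, in [0, 1].
function transparencyQuotient(t: TransparencyIndicators): number {
  const checks = Object.values(t);
  return checks.filter(Boolean).length / checks.length;
}

// Example: author named, sources cited, date shown; nothing else disclosed.
console.log(transparencyQuotient({
  authorNamed: true,
  sourcesCited: true,
  fundingDisclosed: false,
  datePublished: true,
  conflictsDisclosed: false,
})); // 0.6
```

The appeal over an LLM-scored channel is that every input is deterministic and countable, so two runs over the same page can't disagree.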
Code is open source: https://ift.tt/agke86A (the .claude/ directory has the cognitive architecture behind the build). Find a story whose score looks wrong, open the detail page, follow the evidence chain. The most useful feedback: places where the chain reaches a defensible conclusion from defensible evidence and still gets the normative call wrong. That's the failure mode I haven't solved.

My background is math and psychology (undergrad) plus a decade in software: enough to build this, not enough to be confident the methodology is sound. Expertise in psychometrics, NLP, or human rights scholarship is especially welcome. Methodology, prompts, and a 15-story calibration set are on the About page. Thanks!
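P.S. For anyone who wants the two-channel comparison pinned down, here's a minimal sketch. The post doesn't spell out the exact SETL formula, and the fortune.com example (editorial +0.30, structural −0.63, SETL 0.84) shows it isn't a plain difference, so the divergence function below is a placeholder for illustration only.

```typescript
// Sketch of a two-channel evaluation record. All names are
// illustrative; this is not Observatory's actual schema.
interface ChannelScore {
  score: number;      // in [-1, +1]: -1 undermines rights, +1 supports them
  evidence: string[]; // what the model cited for the score
}

interface Evaluation {
  storyUrl: string;
  editorial: ChannelScore;  // what the content says about rights
  structural: ChannelScore; // what the site's infrastructure does
}

// Placeholder divergence: absolute gap between channels, capped at 1.
// Observatory's real SETL formula differs: it reports 0.84 for the
// example below, where a plain gap gives 0.93.
function setl(e: Evaluation): number {
  return Math.min(1, Math.abs(e.editorial.score - e.structural.score));
}

const mediaTrustStory: Evaluation = {
  storyUrl: "https://fortune.com/...", // the media-trust story from the post
  editorial: { score: 0.30, evidence: ["documents collapsing trust in news"] },
  structural: { score: -0.63, evidence: ["paywall", "tracking", "no funding disclosure"] },
};

console.log(setl(mediaTrustStory).toFixed(2)); // "0.93" with this placeholder
```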
