17 stories tagged with #nlp, in publish-time order across the WeSearch catalog. Tag pages update as new stories ingest.
I spent a week on regex before realizing AI agent was the answer for data extraction
A couple of…
How I Built a 7-Layer NL2SQL Guardrail Stack for a Fortune 500 Enterprise
Practical NLP in the Browser with Transformers.js
This tutorial covers three NLP tasks: text classification, zero-shot labelling, and question answering using Transformers.js's pipeline() API.…
Show HN: Trelk – Read, Think, Connect
Your personal knowledge base. Save articles, papers, and notes. AI connects your ideas.…
Dual Encoder vs Cross-Encoder: Why Your RAG Pipeline Needs Both
My RAG pipeline looked fine on paper. Fast retrieval. Decent cosine scores. But when I tested it with…
LELA: An End-to-end LLM-based Entity Linking Framework with Zero-shot Domain Adaptation
Entity linking is a key component of many downstream NLP systems, yet existing approaches are often tied to the specific target knowledge bases and domains, limiting their real wor…
Already 11 000 submissions for EMNLP? [D]
Uncertainty Decomposition via Cyclical SG-MCMC and Soft-label Learning for Subjective NLP
Annotator disagreement in emotion classification reflects ambiguity intrinsic to emotion concepts and is essential for predictor-quality assessment in subjective NLP. Yet no prior …
I Built a Multilingual Spam Detection Dataset with 149K+ Messages Across 23 Languages
Spam detection datasets are surprisingly bad once you move outside English. Most public datasets…
DreamerNLplus: Interpretable Modeling of Mental Health Dynamics from Social Media Timelines using Hybrid Rule-Based and RAG Methods
We present DreamerNLplus, a hybrid framework for modeling mental health dynamics from social media timelines in the CLPsych 2026 shared task. Our system addresses three tasks: psyc…
LMR-BENCH: Can LLM Agents Reproduce NLP Research Code? (EMNLP 2025)
LMR-BENCH (EMNLP 2025) benchmarks LLM agents on reproducing code from 23 NLP papers. This PoC explains the masking methodology, evaluation axes, and what the results mean for AI-as…
Retrieval-Augmented Long-Context Translation for Cultural Image Captioning: Gators submission for AmericasNLP 2026 shared task
We present the University of Florida Gators submission to the AmericasNLP 2026 shared task on cultural image captioning for Indigenous languages. Our two-stage pipeline generates a…
92. BERT: The Model That Reads in Both Directions
GPT generates text by predicting the next word. It reads left to right. BERT does something…
Python Sentiment Analysis: From Basics to BERT
Imagine opening your laptop and seeing 5,000 product reviews, hundreds of support tickets, and a…
LLMs as Linguistic Probes: A Graduate Student's Guide to Advanced Syntax, Semantics, and Efficient Fine-Tuning
The intersection of large language models (LLMs) and advanced linguistics has moved beyond…
How I keep LLMs on a tight leash and stopped hand-creating 30 GitHub issues in the process
The way I keep LLMs on a tight leash is through structured issue breakdowns. In this post you'll see…
I Added Three Rules to Gemma 4. The MoE Searched. The Dense Model Refused.
I ran Gemma 4 26B (MoE, 4B active) and Gemma 4 31B (dense) against GPT-4o and GPT-4o mini on a real Arabic e-commerce chatbot. Then I added three Gemma-only prompt rules. The MoE v…