8 stories tagged with #pretraining, in publish-time order across the WeSearch catalog. Tag pages update as new stories are ingested.
Karpathy Joined Anthropic to Train Claude Using Claude
Andrej Karpathy joined Anthropic's pretraining team in May 2026. The specific job: use Claude to accelerate the research that makes Claude better.…
Pretraining Data Exposure in Large Language Models: A Survey of Membership Inference, Data Contamination, and Security Implications
Large Language Models (LLMs) have become the predominant paradigm in NLP, advancing both research and industry. As model sizes and pretraining data grow, concerns about Pretraining…
Memorization Dynamics of Fill-in-the-Middle Pretraining
Fill-in-the-middle (FIM) is a pretraining objective widely used to equip causal language models with infilling ability, yet its effect on verbatim memorization remains underexplore…
LLM Pretraining Shapes a Generalizable Manifold: Insights into Cross-Modal Transfer to Time Series
Can language-pretrained transformers become effective time-series forecasters, and why? In this paper, we show that cross-modal transfer arises because language pretraining precond…
OpenAI co-founder Andrej Karpathy joins Anthropic's pretraining team
Andrej Karpathy has joined Anthropic to work on pre-training. He previously co-founded and worked at OpenAI and led computer vision and AI at Tesla.…
Alignment pretraining: AI discourse creates self-fulfilling (mis)alignment
Pretraining corpora contain extensive discourse about AI systems, yet the causal influence of this discourse on downstream alignment remains poorly understood. If prevailing descri…
Pretraining Objective Matters in Extreme Low-Data FGVC: A Backbone-Controlled Study
Extreme low-data fine-grained classification is common in expert domains where labeling is expensive, yet practitioners still need principled guidance for selecting pretrained enco…