WeSearch
TAG · #DATASETS

Datasets coverage.

Every story in the WeSearch catalog tagged with #datasets, listed in publish-time order with view counts. 15 stories currently carry this tag, and the page updates as new stories are ingested. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.


RELATED TAGS
#ai (1) · #collaboration (1) · #ml (1) · #data-science (1)
ARXIV CS.AI

Do Real-World Datasets Contain Natural Experiments? An Empirical Study Using Causal Feature Selection

In nature, events that affect some individuals or groups but not others constitute an implicit intervention and are known as natural experiments. For example, the COVID-19 pandemic…

14 views
#artificial intelligence · #machine learning · #causal inference
R/MACHINELEARNING

Before we spend months processing open-source robotics datasets, tell us why this is a bad idea [D]

10 views
R/MACHINELEARNING

noisekit - CLI for generating realistic degraded speech datasets for ASR benchmarking [P]

13 views
GIZMODO

Silicon Valley VC Backs Startup That Gathers AI Datasets From Head-Mounted Cameras on Workers in India

Human Archive believes its technology "will become foundational infrastructure for automating manual labor."…

22 views
#artificial-intelligence · #automation · #technology
R/PROMPTENGINEERING

How are people doing prompt optimization with datasets safely?

17 views
KDNUGGETS

Auditing Model Bias with Balanced Datasets with Mimesis

Learn how to use the Mimesis library to generate a balanced, counterfactual dataset that helps analyze potential bias in your models.…

13 views
#machine learning · #bias · #data science
R/ARTIFICIAL

Testing a Cold War-Era AI on Satellite Image Datasets

14 views
ALGORHYTHM

Data Fundamentals Primer for Learning LLM

The minimum data plumbing every ML pipeline needs — samples, features and labels, the train/val/test split, text encoding (ASCII and UTF-8), and preprocessing.…

12 views
#machine learning · #data science
ARXIV CS.AI

Less Data, Faster Training: repeating smaller datasets speeds up learning via sampling biases

This work investigates the "small-vs-large gap", where repeating on fewer samples can lead to compute saving during training compared to using a larger dataset. This is observed …

14 views
#machine learning · #artificial intelligence · #data science
R/OPENAI

I benchmarked my AI agent runtime firewall against 3 public academic datasets — here are the honest results including where it fails

17 views
PHYS.ORG

AI tool fuses five satellite datasets to help track harmful algal blooms

19 views
ARXIV CS.AI

GroupAffect-4: A Multimodal Dataset of Four-Person Collaborative Interaction

Existing affective-computing, social-signal-processing, and meeting corpora capture important parts of human interaction, but they rarely support analysis of affect in co-located g…

13 views
#artificial intelligence · #collaboration
R/DATABASE

Built an address-level Calgary civic data explorer by connecting multiple public datasets

13 views
PC GAMER

Take-Two's CEO says AI's not in the business of making hits, 'datasets by their very nature are backward looking', but that doesn't mean AI can't be 'super helpful'

"Clones don't sell".…

13 views
#gaming · #technology · #ai
R/MACHINELEARNING

How are you handling training data when public datasets don't match your use case? [D]

15 views