How to avoid a human-LLM turf war — and enhance your work along the way
Towards Data Science

The Variable

LLMs for Data Science: What You Need to Know

More and more data professionals now encounter large language models as integral components in core data science workflows. From basic data analysis to complex extraction processes, LLMs play an increasingly visible role in areas where they used to have a minimal footprint (if any).

How should you navigate this rapid change? What can data scientists do to avoid human-LLM turf wars and instead leverage LLMs' strengths to produce better, more streamlined results? This week, we zoom in on data-specific use cases with articles that show how agents, prompts, and other LLM-powered tools can enhance, rather than jeopardize, the value of your work.

Before we jump in: in case you missed it, we recently published our latest Author Spotlight, an insight-filled Q&A with longtime TDS contributor Egor Howell discussing his career journey and offering advice for aspiring ML engineers.

LangChain for EDA: Build a CSV Sanity-Check Agent in Python

Tired of performing the same exploratory data-analysis chores time after time? Sarah Schürch walks us through an automation project — powered by Python and LangChain — that produces agents with the ability to display columns, detect missing values, and retrieve descriptive statistics, among other time-saving benefits.

Read More

The End-to-End Data Scientist’s Prompt Playbook

If you're skeptical about LLMs' place in the data scientist's toolkit, Sara Nobrega's latest exploration of prompting techniques, especially in the area of stakeholder communication, might just change your mind.

Read More

Extracting Structured Data with LangExtract: A Deep Dive into LLM-Orchestrated Workflows

Subha Ganapathi offers a hands-on guide to building modular workflows for structured intelligence, ensuring schema alignment and fact completeness.

Read More
TDS Weekly Mention

From our partner, Tonic

Keep PII Out of AI

Keeping sensitive data out of AI inputs and embeddings is paramount for privacy. Make the process easy with Tonic Textual.

Demo Today

This Week's Most-Read Stories

The articles our community has been buzzing about in recent days cover MCP, the future of data generalists, and more:

  • Using LangGraph and MCP Servers to Create My Own Voice Assistant, by Benjamin Lee

    Built over 14 days, all locally run, no API keys, cloud services, or subscription fees.

  • The Generalist: The New All-Around Type of Data Professional?, by Loizos Loizou

    Is over-specialization ending and are data generalists on the rise?

  • The Machine Learning Lessons I’ve Learned This Month, by Pascal Janetzky

    August 2025: logging, lab notebooks, overnight runs.

Other Recommended Reads

From climate data to Python essentials, here are a few more recent must-reads we wanted to highlight:

  • AI FOMO, Shadow AI, and Other Business Problems, by Stephanie Kirmer

    What’s the state of AI in business these days, and how much does it cost us?

  • Stochastic Differential Equations and Temperature: NASA Climate Data pt. 2, by Marco Hening Tallarico

    The Ornstein-Uhlenbeck process in Python.

  • What Being a Data Scientist at a Startup Really Looks Like, by Yu Dong

    What I learned about growth, visibility, and chaos over the past five years.

  • MobileNetV1 Paper Walkthrough: The Tiny Giant, by Muhammad Ardi

    Understanding and implementing MobileNetV1 from scratch with PyTorch.

  • Implementing the Coffee Machine in Python, by Mahnoor Javed

    A beginner-friendly step-by-step guide to coding a Coffee Maker in Python.

Contribute to TDS

We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, why not share it with us?

Submit Your Article

Meet Our New Authors

Explore excellent work from some of our recently added contributors:

  • James Gibbins is a data scientist with a multidisciplinary background who has been publishing a popular series on hyperparameter tuning.

  • Erika G. Gonçalves joins us with expertise in applied math and statistics, as well as deep industry experience; her first article looks under the hood of AI applications.

  • Sean Moran has led multiple AI/ML initiatives at large-scale enterprises. His TDS debut looks into a potential future where scientific innovation might be heavily AI-assisted.

Want more insights, industry trends, and exclusive content? Follow us on social media for real-time updates and expert discussions!

Towards Data Science

1111 6th Ave Ste 550, PMB 50938, San Diego, CA 92101-5211
