Data Science for Social Sciences using AI
Data Search and Collection
Exploratory Data Analysis
Very data centric!
AI literacy
Is this what they call vibe coding?
Data enrichment: semantic analysis, topic modeling, named entity recognition, combining datasets, etc.
Qualitative and Quantitative Analysis: stats, segmentation, insights, leveraging domain knowledge, etc.
Data visualization: What’s the story, what’s the narrative?
The only way to learn is to do!
It goes way beyond writing emails or summarizing some text.
But it’s hard. There’s no “mode d’emploi” (no user manual).
This is not: “investigating with data science”
It is: investigating with AI
AI actors: LLMs, agents, prompting, vibe coding, … going meta
Hands-on: projects, practice, learning by doing
AI Literacy: Web, APIs, NLP, ML, DL, coding in python
Practice, practice, practice
Project
Hands on practice
Collaborative Work
Practical Focus
and questions, questions, and more questions
Office hours, support: discord!
We have 12 classes, 2h each. Tuesday Mornings :).
AI:
Tech knowledge & culture:
All course material is available on the skatai.com/inwai website and on the discord channel
You are a team of data journalists. Choose your topic, ask the questions, exploit the data.
Goal: design an engaging platform to showcase your report.
I’d like to know a bit more about you
All questions are optional (except your email)
Please fill out this form
LLM: large language model; the engine behind today’s chatbots and AI agents.
Language models are trained to mimic human behavioral data.
This mimicry makes it tempting to anthropomorphize a system—to think of it like a person. However, not only is the model not a person, it is not even trained to mimic a person.
Instead, the model has been trained to mimic a group of people with individual capacities, predilections, and biases. […] but we also see the enormous advantage of training on data from a diverse set of people: often, it is possible to outperform any individual member of that group.
The capacity of a generalist model to exceed individual ability is evident in a chatbot that can converse with equal competence about cryptography, international law, and the work of Dostoevsky. Our goal is to describe the circumstances in which a model, trained to mimic multiple people, is capable of transcending its sources by outperforming each individual.
Where LLMs compete: the LMArena.ai leaderboard
Total models: 229. Total votes: 4M
How do we evaluate and compare LLMs?
LLM benchmarks are tests that measure how well large language models perform on different tasks, like answering questions, solving problems, or writing text. They give a way to compare models side by side.
Challenges: Benchmarks don’t always reflect real-world use, can become outdated quickly, and models often “train to the test,” meaning high scores don’t always equal better usefulness.
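At its core, a benchmark is just a list of questions with reference answers, scored the same way for every model. A minimal sketch in Python (the questions, answers, and model names here are invented for illustration):

```python
# Toy benchmark: fixed questions with reference answers,
# scored identically for each model so results are comparable.
questions = [
    {"q": "2 + 2 = ?", "answer": "4"},
    {"q": "Capital of France?", "answer": "Paris"},
    {"q": "H2O is commonly called?", "answer": "water"},
]

# Hypothetical model outputs, one answer per question.
model_outputs = {
    "model_a": ["4", "Paris", "ice"],
    "model_b": ["4", "Lyon", "water"],
}

def accuracy(outputs, items):
    """Fraction of questions answered exactly right."""
    correct = sum(out == item["answer"] for out, item in zip(outputs, items))
    return correct / len(items)

scores = {name: accuracy(outs, questions) for name, outs in model_outputs.items()}
print(scores)  # both toy models score 2/3 here
```

Real benchmarks differ mainly in scale and in how answers are graded (exact match, multiple choice, code execution, human preference votes as on LMArena), but the side-by-side comparison logic is the same.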
Traditional Benchmarks:
Problem: Models quickly saturate these tests
The New Frontier:
These represent humanity’s cognitive boundaries
The hardest science questions humans can answer
Graduate-Level Google-Proof Q&A
Why it matters: Tests deep scientific reasoning, not memorization
Real programming challenges from competitive coding platforms
Human vs AI Performance (2025)
The question: What happens when AI exceeds human performance on every cognitive benchmark?
Google A.I. System Wins Gold Medal in International Math Olympiad
AI achieves gold medal solving International Mathematical Olympiad problems
Two 4.5-hour exam sessions, no tools or internet, reading the official problem statements, and writing natural-language proofs.
Artificial Analysis Intelligence Index combines performance across seven evaluations: MMLU-Pro, GPQA Diamond, Humanity’s Last Exam, LiveCodeBench, SciCode, AIME 2025, and IFBench.
“When GPT-4 came out it was around $50 per million tokens; now GPT-5 nano costs around $0.14 per million tokens. GPT-5 nano is a much more capable model than the original GPT-4.”
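To make the price drop concrete, a quick back-of-the-envelope calculation using the two figures from the quote (the token count is an arbitrary example):

```python
# Prices are in dollars per million tokens ("Mtokens"), per the quote above.
gpt4_price = 50.00   # $/Mtokens at GPT-4 launch
nano_price = 0.14    # $/Mtokens for GPT-5 nano

tokens = 10_000_000  # e.g. processing a large document collection

cost_gpt4 = tokens / 1_000_000 * gpt4_price
cost_nano = tokens / 1_000_000 * nano_price
print(f"GPT-4: ${cost_gpt4:.2f}, GPT-5 nano: ${cost_nano:.2f}")
print(f"Roughly {gpt4_price / nano_price:.0f}x cheaper per token")
```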
See “Measuring the environmental impact of delivering AI at Google”.
“Google has reported that energy efficiency per prompt has improved by 33x in the last year alone.”
The marginal energy used by a standard prompt from a modern LLM in 2025 is relatively established at this point, from both independent tests and official announcements.
It is roughly 0.0003 kWh, the same energy use as 8-10 seconds of streaming Netflix or the equivalent of a Google search in 2008.
Image creation seems to use a similar amount of energy as a text prompt.
How much water these models use per prompt is less clear, but estimates range from a few drops to a fifth of a shot glass (0.25 mL to 5 mL+).
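A quick sanity check on the Netflix comparison above, assuming a streaming energy rate of roughly 0.12 kWh per hour (that rate is my assumption, not from the slide):

```python
# Per-prompt energy figure from the slide; streaming rate is an assumption.
prompt_kwh = 0.0003          # energy per LLM prompt (2025 estimate)
netflix_kwh_per_hour = 0.12  # assumed energy per hour of streaming

# How many seconds of streaming use the same energy as one prompt?
seconds = prompt_kwh / netflix_kwh_per_hour * 3600
print(f"One prompt is about {seconds:.0f} seconds of streaming")
```

The result lands at about 9 seconds, consistent with the 8-10 second figure quoted above.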
Like Google Docs, but for executing code
Shareable, collaborative work
A notebook is a series of executable cells
What you write
Simple Syntax
# Header 1
**this is bold**, not bold
[a link](https://sciencespo.fr)
markdown and code in colab
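In Colab, text cells hold Markdown like the snippet above, while code cells hold Python that runs when you press the play button, with the output appearing below. A minimal first code cell might look like this:

```python
# A typical first code cell: define a function, call it,
# and the printed result appears directly under the cell.
def greet(name):
    return f"Hello, {name}!"

print(greet("Sciences Po"))  # prints: Hello, Sciences Po!
```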
The dataset
midjourney: a panda is typing on a laptop; a penguin is looking over the shoulder. Realistic, natural colors, low sun, the panda wears a knitted hat and a warm jacket. Background: polar station, ice, snow
In this demo, I will
Simple exercise on Google Colab with a similar dataset: the Titanic!
The Titanic dataset is a classic in machine learning.
A short exercise to explore Google Colab step by step.
You will write text in Markdown and run a simple Python function (with Gemini’s help).
Then load some data, explore it, and try to understand the code.
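In Colab you would load the real Titanic CSV, but the kind of exploration involved can be sketched offline with pandas and a hand-made miniature of the data (the six rows below are invented; only the column names match the classic dataset):

```python
import pandas as pd

# Miniature stand-in for the Titanic dataset: invented rows, real column names.
df = pd.DataFrame({
    "pclass":   [1, 3, 3, 2, 1, 3],
    "sex":      ["female", "male", "female", "male", "male", "female"],
    "age":      [29, 22, 26, 35, 54, 4],
    "survived": [1, 0, 1, 0, 0, 1],
})

# First look: how big is the table, and what does it contain?
print(df.shape)
print(df.head())

# A classic exploratory question: survival rate by sex.
print(df.groupby("sex")["survived"].mean())
```

On the real dataset the same `groupby` reveals the well-known gap in survival rates between women and men; here the toy rows only illustrate the mechanics.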
Ask Gemini
Gemini probably knows this file by heart. It’s a classic!
Take one of the code cells generated by Gemini and ask Gemini to explain the code.
Dive deep to really understand what’s going on.
Gemini never tires, never judges
Please post your questions in the course channel #sciencespo-dhum25a43
not in private messages, so everyone can contribute.