API

Application Programming Interface

What we saw last time

  • open source vs closed source
  • data sources
  • data formats
  • Project reviews

Any questions?

Today

  • news review
  • The web as a gigantic API
  • the REST protocol
  • inspecting a website (hack 101)
  • building datasets from APIs
  • Project reviews

  • Guided practice on the wikipedia API

At the end of this class

You

  • understand what an API is
  • know the 4 operations in the REST protocol
  • can get data out of a public API

In the News

What caught your attention this week?

API

An API defines

  • the methods and data formats
  • that applications can use
  • to request and exchange information.

API

In other words: an API allows apps to talk to each other.

API

  1. You send a request to a web address: a URL
  2. The server answers the request
  3. You get back some data
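
The three steps above can be sketched in Python without touching the internet. This is only a toy: the HelloHandler class, the /greeting path and the JSON payload are made up for the illustration, and the "server" runs on your own machine.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class HelloHandler(BaseHTTPRequestHandler):
    """Toy server side (hypothetical): answers every GET with a small JSON document."""

    def do_GET(self):
        body = json.dumps({"message": "hello", "path": self.path}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def demo_request():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    opener = build_opener(ProxyHandler({}))  # ignore any system proxy for this local demo
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/greeting"
        # 1. send a request to a URL
        with opener.open(url, timeout=5) as response:
            # 2. the server answers the request
            status = response.status
            # 3. we get back some data
            data = json.loads(response.read().decode("utf-8"))
    finally:
        server.shutdown()
        server.server_close()
    return status, data


status, data = demo_request()
print(status, data)
```

A real API works the same way, except that the server is someone else's machine and the URL is a public one.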

API

Attention! - Attention!

In safari and other browsers, the full URL is hidden.

In Safari, go to Settings » Advanced and
enable the “Show full website address” option.



Full URL in safari

What’s a URL ?

A URL (Uniform Resource Locator) is the address of a unique resource on the internet.

domain name + everything else to specify the data you requested

https://{domain name}/{endpoint}?{params}

Example :

https://skatai.com//inwai/api/#slide-9
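
A minimal sketch of the template above, using Python's standard library. The domain example.com, the /books/search endpoint and the parameters are placeholders chosen for the illustration, not a real API.

```python
from urllib.parse import urlencode, urlunsplit


def build_url(domain, endpoint, params=None):
    """Assemble https://{domain name}/{endpoint}?{params} from its parts."""
    query = urlencode(params or {})  # turns {"q": "Dune"} into "q=Dune"
    return urlunsplit(("https", domain, endpoint, query, ""))


# Hypothetical domain, endpoint and parameters
url = build_url("example.com", "/books/search", {"q": "Dune", "page": 2})
print(url)
```

Letting urlencode build the part after the ? avoids mistakes with special characters such as spaces or accents.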

REST protocol

REST (Representational State Transfer), often loosely called a protocol, is a set of rules that define how applications can interact with each other.

Four verbs to rule the world

  • GET : read the data
  • POST : create the data
  • PUT : update the data
  • DELETE : delete the data

The whole digital economy is based on these 4 words!
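
To make the four verbs concrete, here is a toy sketch with no network at all: a Python dictionary plays the role of the server's data, and a hypothetical handle() function maps each verb to read, create, update or delete. Real servers do much more, but the mapping is the same idea.

```python
posts = {}  # pretend server-side storage: post id -> text


def handle(method, post_id, text=None):
    """Map the four verbs to read / create / update / delete (toy version)."""
    if method == "GET":
        return posts.get(post_id)        # read the data
    if method == "POST":
        posts[post_id] = text            # create the data
        return text
    if method == "PUT":
        if post_id not in posts:
            raise KeyError(f"No post with id {post_id}")
        posts[post_id] = text            # update the data
        return text
    if method == "DELETE":
        return posts.pop(post_id, None)  # delete the data
    raise ValueError(f"Unsupported method: {method}")


handle("POST", 1, "First post")
handle("PUT", 1, "First post (edited)")
print(handle("GET", 1))
handle("DELETE", 1)
print(handle("GET", 1))
```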

Example

  • you read your feed: GET
  • you like a post : POST
  • you update your profile : PUT
  • you remove a reel : DELETE

Examples on Instagram, Bluesky, X, Facebook, TikTok, etc.

and every other website

The Web is One BIG API + GET requests

  1. in your browser you go to a URL. This is the initial GET request
  2. that request triggers a call to the server.
  3. the server sends you back the content as the response: most often HTML for a web page, or JSON for a data API

The Web is an API

Let’s illustrate

The web is one gigantic API

It uses URLs to send requests to a server

The server sends the html page back

  • Go on goodreads.com
  • Search for Dune
  • Click on the author’s name Frank Herbert

You should end up on this URL:

https://www.goodreads.com/author/show/58.Frank_Herbert

https://www.goodreads.com/author/**show**/58.Frank_Herbert

which can be read as: show an author, with label 58.Frank_Herbert

/list instead of /show

Now scroll down and click on “More Books by Frank Herbert”

The URL is now https://www.goodreads.com/author/list/58.Frank_Herbert

The verb “/show” is replaced with the verb “/list”.

Screenshot showing Books by Frank Herbert page on Goodreads

Parameters: ?page=2&per_page=30

Now click on page 2, the URL becomes

https://www.goodreads.com/author/list/58.Frank_Herbert?page=2&per_page=30

which reads

  • list all the works of author 58.Frank_Herbert
  • show page 2
  • and show only 30 works per page
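
The same reading can be done by code. A small sketch with Python's standard library that splits the Goodreads URL above into its parts (nothing is sent to the server, the URL is only taken apart as text):

```python
from urllib.parse import parse_qs, urlsplit

url = "https://www.goodreads.com/author/list/58.Frank_Herbert?page=2&per_page=30"

parts = urlsplit(url)
# The path has three segments: resource, action and identifier
resource, action, identifier = parts.path.strip("/").split("/")
# parse_qs returns each parameter as a list of strings
params = parse_qs(parts.query)

print(parts.netloc)
print(resource, action, identifier)
print(params)
```

Note that the parameter values come back as strings ("2", "30"), so convert them with int() before doing arithmetic.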

REST is the building block of the internet

An endpoint: a URL and a path

some optional parameters: ?page=2&per_page=30

A method : GET the content, PUT or POST new content, DELETE the content

The data in JSON format as the server response, or just plain text, html, pdfs, csv, audio, video etc
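
When the response is JSON, Python can turn it into dictionaries and lists with the json module. In this sketch the response text is hard-coded and entirely made up for the illustration (it is not what Goodreads returns).

```python
import json

# Made-up example of what a JSON response body could look like
raw_response = """
{
  "author": "Frank Herbert",
  "page": 2,
  "per_page": 30,
  "works": ["Dune", "Dune Messiah"]
}
"""

data = json.loads(raw_response)  # JSON text -> Python dict

print(data["author"])
for title in data["works"]:
    print("-", title)
```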

Hack 101

DEV tools - under the hood

go on a social network or a website

  • right-click and open the developer tools: Inspect
  • click on the Network tab
  • like a post: you should see a request with method POST
  • click on a post: you should see a request with method GET

inspect devtools

Exercise

Grab a screenshot of the Devtools screen, network tab

Paste it into an LLM like ChatGPT or Claude or …

ask: explain in simple terms what I’m seeing

dev tools inspect

Wikipedia API

wikipedia

The wikipedia API

We can use the API front end (sandbox) to play with the API but as we can see it’s not trivial

So we need to read :

Best to check out

#  Do a Wikipedia search for query.
wikipedia.search(query, results=10, suggestion=False)

read the docs

https://pypi.org/project/Wikipedia-API/

Install the library

First install the library : !pip install wikipedia-api

Note the ! before pip: in a notebook such as Google Colab, it runs the line as a shell command instead of Python code.

Then we look at some code

import wikipediaapi

# Initialize Wikipedia API (English)
wiki_wiki = wikipediaapi.Wikipedia(user_agent="[email protected]", language='en')

# Get the page : the actual request to the API
page = wiki_wiki.page("Paris")

# Check if the page exists
if page.exists():
    print(f"Title: {page.title}\n")
    print(f"Summary: {page.summary[:500]}...")  # print first 500 chars of summary
else:
    print("Page not found.")

Instantiate the object

Pass the parameters that specify how you want to interact with the API

wiki_wiki = wikipediaapi.Wikipedia(user_agent="[email protected]", language='en')

wiki_wiki is the object that we use to interact with the API. It has now been initialized, or instantiated.

Also pass all the required identification parameters (login, password, API key, …). Not needed for wikipedia API.

strings

  • Simple, direct
print("Hello world")
  • With a variable
my_var = "Hello world"
print(my_var)
  • interpolation
my_var = "Hello world"
print(f"Title: {my_var}\n")

notice :

  • the f before the string
  • the {} around the variable
  • the \n at the end of the string.
  • \n is the newline character

Practice

  • Print the string "My name is Spiderman"
  • Create a variable called hero with the value "Spiderman"
  • Print the variable hero
  • Print "Hero:" followed by the variable hero (concatenation)
  • Create another variable quote = "With great power comes great responsibility"
  • Print using an f-string: "hero says …"

Subtleties

A method call : notice the ()

page.exists()

A property on the object page: no ()

page.title
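
To see the difference outside of the Wikipedia library, here is a toy class written only for this illustration. It is not how wikipediaapi is implemented; the names simply mirror the ones used above.

```python
class Page:
    """Toy page object (hypothetical), just to contrast methods and properties."""

    def __init__(self, title, text):
        self.title = title  # plain attribute: read it without ()
        self._text = text

    def exists(self):
        """A method: it must be called with ()."""
        return bool(self._text)

    @property
    def summary(self):
        """A property: computed on demand, but read without ()."""
        return self._text[:20]


page = Page("Paris", "Paris is the capital and largest city of France.")
print(page.title)     # attribute, no ()
print(page.summary)   # property, no ()
print(page.exists())  # method, with ()
```

Forgetting the () on a method does not raise an error: page.exists gives you the method itself rather than its result, which is a common source of confusing output.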

practice

Now in google colab

  • use the wikipedia API to get a page (for instance a city, a person, a country)
  • explore what available elements the page object has besides title and summary
  • create a list of similar pages (["Paris", "New York", "Tokyo", "London", "Berlin"])
  • use the wikipedia API to get the summary of each page
  • print the summary of each page
  • store the pages in a pandas dataframe with columns : url, title, summary
  • store the dataframe in a csv file on your local machine

Next time

  • more python
  • NYT API
  • NLP
  • spaCy

new data source: aifray.com

In-depth reporting and analytical commentary on artificial intelligence regulation.
