Pokemon Case

As you have likely noticed, some APIs return many records at once, while others return only one record per request. In the one-record case, the challenge is often that the JSON results are complex, with multiple layers of nesting. No preset JSON orientation (such as the `orient` options of `pd.read_json`) can load that kind of data into a pandas DataFrame automatically, so you have to pick out the values you want yourself.
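
To make the nesting concrete, here is a minimal sketch that uses a hand-written, heavily trimmed record shaped like a single-Pokemon response. The `sample` dictionary and its values are illustrative assumptions, not actual API output; a real response has many more keys.

```python
# Hand-written, trimmed stand-in for one parsed API record (illustrative values only)
sample = {
    "id": 1,
    "name": "bulbasaur",
    "species": {"name": "bulbasaur"},
    "moves": [
        {"move": {"name": "razor-wind"}},
        {"move": {"name": "cut"}},
    ],
}

# Top-level values are a single key away
name = sample["name"]

# Nested values need one key (or one loop) per layer of nesting
species = sample["species"]["name"]
move_names = [entry["move"]["name"] for entry in sample["moves"]]

print(name, species, move_names)
```

Each value sits at a different depth, which is why every field ends up being extracted with its own expression rather than with one generic loader.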

This case study example will show you how to parse through those results to load them into a DataFrame. We will use the endpoints provided by https://pokeapi.co/ to do this. Follow along with the video below and try to recreate the code in your own notebook:

      import requests as r, json, pandas as pd, time
        
      # Create a blank DataFrame to store the results that you're interested in
      df = pd.DataFrame(columns=['id', 'name', 'base_experience', 'height', 
                                 'weight', 'species', 'razor-wind', 'cut'])
      df.set_index('id', inplace=True)
        
      # Assuming you know or can guess the ID value required for the endpoint request, 
      # create a loop to make requests one-at-a-time for each value
      for i in range(1, 6):
        url = f"https://pokeapi.co/api/v2/pokemon/{i}/" # Generate the dynamic URL
        res = r.get(url)                                # Make the request
        res.raise_for_status()                          # Stop if the request failed (e.g., 404)
        res_json = json.loads(res.text)                 # Parse the JSON text into a Python dict
        
        # Store each of the values you need based on their individual unique nesting
        poke_id = res_json['id']  # Not named `id`, which would shadow the Python built-in
        name = res_json['name']
        base = res_json['base_experience']
        height = res_json['height']
        weight = res_json['weight']
        species = res_json['species']['name']  # Nested one level down
        moves = res_json['moves']              # A list of dicts, one per move
        
        # Collect the move names, which are nested inside each item of the list
        moves_list = [move['move']['name'] for move in moves]
        
        # Flag whether each move of interest is available to this Pokemon
        razor_wind = 'razor-wind' in moves_list
        cut = 'cut' in moves_list
        
        # Add a new record into the DataFrame with the results for this iteration
        df.loc[poke_id] = [name, base, height, weight, species, razor_wind, cut]
        
        time.sleep(1) # Be courteous to the provider by adding an iteration delay
        
      # Store your results (mode='a' appends to the file; header=False omits the column names)
      # path = "your path to your Google Drive"
      # df.to_csv(f'{path}/data.csv', mode='a', header=False)
      df.head()
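
Adding rows with `df.loc` inside the loop works well for a handful of records. One common alternative is to reduce each response to a flat dictionary and build the DataFrame once at the end. The sketch below assumes a hypothetical helper named `flatten_pokemon` and uses hand-written stand-in records (illustrative values, not live API output) in place of real requests, so it runs without a network connection.

```python
import pandas as pd

def flatten_pokemon(record, tracked_moves=("razor-wind", "cut")):
    """Hypothetical helper: reduce one nested record to a flat dict."""
    move_names = {entry["move"]["name"] for entry in record["moves"]}
    row = {
        "id": record["id"],
        "name": record["name"],
        "base_experience": record["base_experience"],
        "height": record["height"],
        "weight": record["weight"],
        "species": record["species"]["name"],
    }
    # One True/False column per move of interest
    for move in tracked_moves:
        row[move] = move in move_names
    return row

# Hand-written stand-ins for parsed API responses (illustrative values only)
records = [
    {"id": 1, "name": "bulbasaur", "base_experience": 64, "height": 7, "weight": 69,
     "species": {"name": "bulbasaur"},
     "moves": [{"move": {"name": "razor-wind"}}, {"move": {"name": "cut"}}]},
    {"id": 4, "name": "charmander", "base_experience": 62, "height": 6, "weight": 85,
     "species": {"name": "charmander"},
     "moves": [{"move": {"name": "cut"}}]},
]

# Flatten every record, then build the DataFrame in one step
rows = [flatten_pokemon(rec) for rec in records]
df_alt = pd.DataFrame(rows).set_index("id")
print(df_alt)
```

In the real loop, each `res_json` would be passed to the helper and the resulting dict appended to `rows`. Keeping the parsing in one function also makes it easy to track a different set of moves by changing `tracked_moves`.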