Issue
I am trying to run the two loops simultaneously. The second loop depends on the output of the first one and needs to fetch its input from the ids list, so it should not have to wait for the first loop to finish. I tried multiple libraries and methods but failed to find the optimal structure for this.
import pandas as pd
import requests
import json

API_KEY = ''
df = pd.read_csv('lat_long file')

# get the name and information of each place
id = df['id']
lat = df['latitude']
lon = df['longitude']

ids = []
loc = []
unit = []

print('First API now running')

def get_details(lat, lon):
    try:
        url = "https://maps.googleapis.com/maps/api/geocode/json?latlng=" + str(lat) + ',' + str(lon) + '&key=' + API_KEY
        response = requests.get(url)
        data = json.loads(response.text)
        ids.append(data['results'][0]['place_id'])
    except Exception as e:
        print('This code is NOT running because of', e)
    return data

def get_deta(ids):
    url1 = "https://maps.googleapis.com/maps/api/place/details/json?language=en-US&placeid=" + str(ids) + "&key=" + API_KEY
    responsedata = requests.get(url1)
    data2 = json.loads(responsedata.text)
    if 'business_status' in data2['result'].keys():
        loc.append(data2['result']['business_status'])
    else:
        loc.append('0')
    flag = False
    if data2['result']:
        for level in data2['result']['address_components']:
            #if len(level['types']) > 1:
            if level['types'][0] == 'premise':
                flag = True
                unit.append(level['long_name'][4:])
    else:
        print(data2)
    if not flag:
        unit.append('0')
    return data2

def loop1():
    for i in range(len(id)):
        get_details(lat[i], lon[i])

print('Second API now running')

def loop2():
    # printing and appending addresses to use them with the next API
    for i in range(50):
        get_deta(ids[i])

loop1()
loop2()
Solution
It is not very clear what you are trying to achieve here. How exactly does the second API depend on the first?
To achieve concurrency you could use the AsyncIO library, which is designed to perform concurrent network requests efficiently. However, the requests library you are using is synchronous, so you must switch to an asynchronous one such as aiohttp.
Given that, you can communicate between two concurrent tasks using asyncio.Queue. Here is a draft of what your program could look like:
import asyncio
import aiohttp

API_KEY = ''

async def get_details(lat, lon, session: aiohttp.ClientSession, id_queue: asyncio.Queue):
    url: str = f"https://maps.googleapis.com/maps/api/geocode/json?latlng={lat},{lon}&key={API_KEY}"
    async with session.get(url) as response:
        data = await response.json()
        await id_queue.put(data['results'][0]['place_id'])

async def get_data(id, session: aiohttp.ClientSession, loc_queue: asyncio.Queue):
    # Network request and JSON decoding
    ...
    await loc_queue.put(data['result']['business_status'])

async def loop_1(coords, session: aiohttp.ClientSession, id_queue: asyncio.Queue):
    await asyncio.gather(
        *[get_details(lat, lon, session, id_queue) for lat, lon in coords]
    )

async def loop_2(session: aiohttp.ClientSession, id_queue: asyncio.Queue, loc_queue: asyncio.Queue):
    while True:
        id = await id_queue.get()
        await get_data(id, session, loc_queue)

async def main():
    id_queue = asyncio.Queue(maxsize=100)
    loc_queue = asyncio.Queue(maxsize=100)
    coords = ...  # (lat, lon) pairs read from the CSV
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            loop_1(coords, session, id_queue),
            loop_2(session, id_queue, loc_queue),
        )

if __name__ == "__main__":
    asyncio.run(main())
I simplified your example for the purpose of the example. If you look at the main() function, the two loops are executed concurrently with asyncio.gather(). The first loop fetches the details of all places concurrently (again with asyncio.gather) and feeds a shared queue, id_queue. The second loop waits for new ids to appear in the queue and processes them with the second API as soon as they are available. It then enqueues the results in a last queue, loc_queue.
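To make the producer/consumer pattern concrete without any network calls, here is a minimal runnable sketch of the same idea: one task feeds a queue, a second task drains it as items arrive, and a None sentinel tells the consumer when the producer is done. The item names and the "details-for-" transformation are placeholders, not part of the Google APIs above.

```python
import asyncio

async def producer(items, queue):
    # Stand-in for loop_1: put each "place id" on the queue as soon as it is ready
    for item in items:
        await asyncio.sleep(0)   # simulates an async network call yielding control
        await queue.put(item)
    await queue.put(None)        # sentinel: tells the consumer we are done

async def consumer(queue, results):
    # Stand-in for loop_2: process ids as soon as they appear in the queue
    while True:
        item = await queue.get()
        if item is None:
            break
        results.append(f"details-for-{item}")

async def main():
    queue = asyncio.Queue(maxsize=100)
    results = []
    # Both tasks run concurrently; the consumer starts before the producer finishes
    await asyncio.gather(producer(["a", "b", "c"], queue), consumer(queue, results))
    return results

print(asyncio.run(main()))  # → ['details-for-a', 'details-for-b', 'details-for-c']
```

Note the sentinel: without it, the consumer's while True loop would wait on the queue forever after the producer finishes, which is also why the draft above never terminates loop_2 on its own.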
You could extend this program by adding a third API plugged into this last queue and continue processing.
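As a sketch of that extension, each stage can read from the queue of the previous stage and write to its own output queue, forming a pipeline. The stage names and string transformation below are illustrative only; in your case each stage would be an API call.

```python
import asyncio

async def stage(name, in_queue, out_queue):
    # Generic pipeline stage: read from in_queue, transform, write to out_queue
    while True:
        item = await in_queue.get()
        if item is None:                 # propagate the sentinel downstream
            await out_queue.put(None)
            break
        await out_queue.put(f"{name}({item})")

async def main():
    q1, q2, q3 = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    for item in ["x", "y", None]:        # seed the first queue; None terminates
        await q1.put(item)
    # Both stages run concurrently; q2 links them, q3 collects final results
    await asyncio.gather(stage("api2", q1, q2), stage("api3", q2, q3))
    out = []
    while not q3.empty():
        item = q3.get_nowait()
        if item is not None:
            out.append(item)
    return out

print(asyncio.run(main()))  # → ['api3(api2(x))', 'api3(api2(y))']
```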
Answered By - Louis Lac Answer Checked By - Senaida (PHPFixing Volunteer)