Issue
I get an error when I try to add 'Time' and 'y_corrected' to a new dataframe.
I need to calculate a variable, 'y_corrected', and store it in a new dataframe. To calculate it, I use groupby to loop through the dataset by two criteria: file name and treatment. The final dataframe should contain the columns File name, Treatment, Time and y_corrected.
import pandas as pd

file = pd.read_excel(r'C:.....xlsx')
grouped = file.groupby(['File name', 'Treatment'])

# output dataframe
new = pd.DataFrame(columns=['File name','Treatment', 'Time', 'y_corrected'])
new.columns = ['File name', 'Treatment', 'Time', 'y_corrected']

# correction
for key, g in grouped:
    a = g['y'].max()
    b = g['y'].min()
    y_corrected = (g['y'] - b) / a
    row = {'File name': key[0], 'Treatment': key[1], 'Time': time[2], 'y_corrected': y_corrected[3]}
    new = new.append(row, ignore_index=True)

print(new)
This is the error: result = self.index.get_value(self, key)
Solution
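You do not have to loop through the groups or build the new dataframe row by row. The traceback line most likely comes from label-based Series indexing: y_corrected[3] looks up the index label 3, not the fourth element, and with the default index that label belongs to at most one group. Instead, compute the correction per group and write it straight back to the original dataframe with groupby and transform, which returns a result aligned with the original rows:
file = pd.read_excel(r'C:.....xlsx')
file['y_corrected'] = file.groupby(['File name', 'Treatment'])['y'].transform(lambda x: (x - x.min()) / x.max())
new = file[['File name', 'Treatment', 'Time', 'y_corrected']]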
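As a minimal sketch of the per-group correction, the example below swaps the Excel file for a small made-up dataframe (the sample values and group labels are assumptions, not data from the question). It uses groupby(...).transform, which returns one value per original row, so the result can be assigned directly as a new column.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel file from the question
file = pd.DataFrame({
    'File name': ['a', 'a', 'a', 'b', 'b', 'b'],
    'Treatment': ['ctrl', 'ctrl', 'ctrl', 'drug', 'drug', 'drug'],
    'Time': [0, 1, 2, 0, 1, 2],
    'y': [2.0, 4.0, 6.0, 10.0, 15.0, 20.0],
})

# Select the 'y' column within each (File name, Treatment) group
grouped_y = file.groupby(['File name', 'Treatment'])['y']

# Same formula as in the question: (y - group minimum) / group maximum.
# transform keeps the original index, so the assignment lines up row by row.
file['y_corrected'] = grouped_y.transform(lambda x: (x - x.min()) / x.max())

# Final dataframe with the four requested columns
new = file[['File name', 'Treatment', 'Time', 'y_corrected']]
print(new)
```

Because every row keeps its own Time value, there is no need to pick a single element per group, as the loop in the question tried to do.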
Answered By - Cylldby Answer Checked By - Gilberto Lyons (PHPFixing Admin)