Creating a list while iterating over a DataFrame: how to iterate over a DataFrame and build a new DataFrame

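This article shows how to check whether column 'L' of a DataFrame contains a value and, if it does, build a new DataFrame from the values in columns 'L' and 'P'. When 'L' holds several values, one row is created per value. The solution splits column 'L', stacks the pieces, joins them back onto the original data, and finally drops the rows that contain NaN.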


I have a DataFrame that looks like this:

P  Q  L
1  2  3
2  3
4  5  6,7

The objective is to check whether there is any value in L and, if there is, to extract the values of the L and P columns:

P  L
1  3
4  6
4  7

Note that L may contain more than one value. In that case I need one row per value, so two values in L produce two rows.

Below is my current script, which does not produce the expected result.

df2 = []
ego
other
newrow = []
for item in data_DF.iterrows():
    if item[1]["L"] is not None:
        ego = item[1]['P']
        other = item[1]['L']
        newrow = ego + other + "\n"
        df2.append(newrow)
data_DF2 = pd.DataFrame(df2)
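
Two things probably stop the loop above from working: an empty cell in a pandas column is usually NaN rather than None, so `is not None` is true for every row, and `ego + other` tries to add a number to a string. The following is only a minimal loop-based sketch of the intended logic. It assumes the sample data from the question, with the empty L cell modelled as a missing value and L stored as strings; the name `rows` is introduced here for illustration.

```python
import pandas as pd

# Sample data assumed from the question; the empty L cell is a missing value.
data_DF = pd.DataFrame({"P": [1, 2, 4], "Q": [2, 3, 5], "L": ["3", None, "6,7"]})

rows = []  # hypothetical helper list: one dict per output row
for _, row in data_DF.iterrows():
    # pd.notna is False for both None and NaN, unlike `is not None`.
    if pd.notna(row["L"]):
        # Emit one output row per comma-separated value in L.
        for value in str(row["L"]).split(","):
            rows.append({"P": row["P"], "L": value.strip()})

data_DF2 = pd.DataFrame(rows, columns=["P", "L"])
print(data_DF2)
```

Collecting dictionaries and building the DataFrame once at the end avoids growing a DataFrame row by row, which tends to be slow.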

Solution

First, split the multiple values in column L into a new Series s, whose index repeats the original row index once per value. Next, drop the columns L and Q from df, since they are no longer needed. Finally, join s back onto df and drop the rows that contain NaN.

import pandas as pd

print(df)
   P  Q    L
0  1  2    3
1  2  3  NaN
2  4  5  6,7

# split "6,7" into separate columns, then stack them into one long Series
s = df['L'].str.split(',').apply(pd.Series).stack()
s.index = s.index.droplevel(-1)  # to line up with df's index
s.name = 'L'
print(s)
0    3
2    6
2    7
Name: L, dtype: object

df = df.drop(['L', 'Q'], axis=1)
df = df.join(s)  # rows with several values in L are repeated
print(df)
   P    L
0  1    3
1  2  NaN
2  4    6
2  4    7

df = df.dropna().reset_index(drop=True)
print(df)
   P  L
0  1  3
1  4  6
2  4  7
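
On pandas 0.25 or later, the same split-and-repeat steps can probably be written more compactly with `DataFrame.explode`. This is an alternative sketch rather than the method used in the answer above. It assumes the sample data from the question, with L stored as strings and the empty cell as a missing value; the name `result` is introduced here for illustration.

```python
import pandas as pd

# Sample data assumed from the question; the empty L cell is a missing value.
df = pd.DataFrame({"P": [1, 2, 4], "Q": [2, 3, 5], "L": ["3", None, "6,7"]})

result = (
    df[["P", "L"]]                               # Q is not needed in the output
    .dropna(subset=["L"])                        # keep only rows that have a value in L
    .assign(L=lambda d: d["L"].str.split(","))   # "6,7" becomes ["6", "7"]
    .explode("L")                                # one row per list element
    .reset_index(drop=True)                      # renumber rows from 0
)
print(result)
```

Dropping the empty rows before splitting means no NaN rows have to be removed afterwards, and the original `df` is left unchanged.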
