Pandas - Groupby with conditional formula(Pandas - 带条件公式的 Groupby)
问题描述
Survived SibSp Parch
0 0 1 0
1 1 1 0
2 1 0 0
3 1 1 0
4 0 0 1
Given the above dataframe, is there an elegant way to groupby
with a condition?
I want to split the data into two groups based on the following conditions:
(df['SibSp'] > 0) | (df['Parch'] > 0) = New Group -"Has Family"
(df['SibSp'] == 0) & (df['Parch'] == 0) = New Group - "No Family"
then take the means of both of these groups and end up with an output like this:
SurvivedMean
Has Family Mean
No Family Mean
Can it be done using groupby or would I have to append a new column using the above conditional statement?
An easy way to group that is to use the sum of those two columns. If either of them is positive, the result will be greater than 1. And groupby accepts an arbitrary array as long as the length is the same as the DataFrame's length so you don't need to add a new column.
family = np.where((df['SibSp'] + df['Parch']) >= 1 , 'Has Family', 'No Family')
df.groupby(family)['Survived'].mean()
Out:
Has Family 0.5
No Family 1.0
Name: Survived, dtype: float64
这篇关于Pandas - 带条件公式的 Groupby的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:Pandas - 带条件公式的 Groupby
- 我如何透明地重定向一个Python导入? 2022-01-01
- 计算测试数量的Python单元测试 2022-01-01
- ";find_element_by_name(';name';)";和&QOOT;FIND_ELEMENT(BY NAME,';NAME';)";之间有什么区别? 2022-01-01
- CTR 中的 AES 如何用于 Python 和 PyCrypto? 2022-01-01
- 如何使用PYSPARK从Spark获得批次行 2022-01-01
- YouTube API v3 返回截断的观看记录 2022-01-01
- 使用公司代理使Python3.x Slack(松弛客户端) 2022-01-01
- 检查具有纬度和经度的地理点是否在 shapefile 中 2022-01-01
- 使用 Cython 将 Python 链接到共享库 2022-01-01
- 我如何卸载 PyTorch? 2022-01-01