pandas || df.dropna(): Code for Dropping Missing Values

Author: 袖梨 2022-06-25

This article walks through pandas' df.dropna() and the code for dropping missing values. Each example is shown with its output, so it can serve as a quick reference.

The df.dropna() function removes missing data from a DataFrame: it drops the rows (or columns) that contain missing values such as NaN.
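Before dropping anything, it can help to check how many values are actually missing. The following is a minimal sketch, assuming a small made-up frame named `sample` (not part of the original article). It shows that None, np.nan and pd.NaT all count as missing, while an empty string does not.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration.
# None, np.nan and pd.NaT are treated as missing; "" is an ordinary value.
sample = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, ""],
    "c": [pd.Timestamp("2022-06-25"), pd.NaT, pd.Timestamp("2022-06-26")],
})

# Count the missing values in each column.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# With the default arguments, every row that has at least one missing value is dropped.
cleaned = sample.dropna()
print(cleaned)
```

Only the middle row contains missing values, so it is the only row removed. The last row keeps its empty string.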

Official function documentation:

DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
 Remove missing values.
 See the User Guide for more on which values are considered missing, 
 and how to work with missing data.
Returns
 DataFrame
 DataFrame with NA entries dropped from it.

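Parameters:

 axis : 0 or 'index' drops rows that contain missing values; 1 or 'columns' drops columns.
 how : 'any' drops a row or column if at least one value is NA; 'all' drops it only if all values are NA.
 thresh : the minimum number of non-NA values a row or column needs in order to be kept.
 subset : labels along the other axis to consider, for example the columns to check when dropping rows.
 inplace : if True, modify the DataFrame itself and return None instead of a new DataFrame.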

Test:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
...                    "toy": [np.nan, 'Batmobile', 'Bullwhip'],
...                    "born": [pd.NaT, pd.Timestamp("1940-04-25"),
...                             pd.NaT]})
>>> df
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT

Drop the rows where at least one element is missing:

>>> df.dropna()
     name        toy       born
1  Batman  Batmobile 1940-04-25

Drop the columns where at least one element is missing:

>>> df.dropna(axis=1)
       name
0    Alfred
1    Batman
2  Catwoman

Drop the rows where all elements are missing:

>>> df.dropna(how='all')
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
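In the example above no row is entirely empty, so how='all' returns all three rows unchanged. To see the option actually remove something, here is a minimal sketch with a hypothetical frame named `records` (an assumption for illustration, not data from the article) whose middle row is missing in every column.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 is missing in every column, row 0 only in "toy".
records = pd.DataFrame({
    "name": ["Alfred", None, "Catwoman"],
    "toy": [np.nan, np.nan, "Bullwhip"],
})

# how="all" drops a row only when all of its values are missing.
dropped_if_all = records.dropna(how="all")
print(dropped_if_all)

# how="any" (the default) drops a row as soon as one value is missing.
dropped_if_any = records.dropna(how="any")
print(dropped_if_any)
```

With how="all" only the fully empty row disappears, whereas how="any" also removes the row for Alfred because its toy is missing.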

Keep only the rows with at least 2 non-NA values:

>>> df.dropna(thresh=2)
       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
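One way to read thresh is as a filter on the number of non-NA values per row. The sketch below rebuilds the article's df and compares dropna(thresh=2) with an explicit boolean mask; the names `non_na_counts`, `by_thresh` and `by_mask` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                   "toy": [np.nan, "Batmobile", "Bullwhip"],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})

# Number of non-missing values in each row.
non_na_counts = df.notna().sum(axis=1)
print(non_na_counts)

# Keep rows that have at least 2 non-NA values, in two equivalent ways.
by_thresh = df.dropna(thresh=2)
by_mask = df[non_na_counts >= 2]

print(by_thresh.equals(by_mask))
```

Alfred's row has a single non-NA value (the name), so it falls below the threshold and is dropped in both versions.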

Look for missing values only in specific columns:

>>> df.dropna(subset=['name', 'born'])
     name        toy       born
1  Batman  Batmobile 1940-04-25
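The choice of columns in subset changes which rows survive, because missing values outside the listed columns are ignored. As a rough illustration on the same df (the variable names below are assumptions made for this sketch):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                   "toy": [np.nan, "Batmobile", "Bullwhip"],
                   "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})

# Only "name" and "born" are checked: Catwoman is dropped because born is NaT.
checked_name_born = df.dropna(subset=["name", "born"])
print(checked_name_born)

# Only "name" and "toy" are checked: Catwoman is kept, her missing "born" is ignored.
checked_name_toy = df.dropna(subset=["name", "toy"])
print(checked_name_toy)
```

This is useful when only some columns are required for later processing and gaps in the remaining columns are acceptable.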

Modify the original data in place:

>>> df.dropna(inplace=True)
>>> df
     name        toy       born
1  Batman  Batmobile 1940-04-25
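A point that is easy to miss: with inplace=True the call returns None, so its result should not be assigned back to a variable. The sketch below contrasts the in-place call with plain reassignment; `original`, `working`, `returned` and `cleaned` are names chosen for this illustration.

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"],
                         "toy": [np.nan, "Batmobile", "Bullwhip"],
                         "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT]})

# In-place: the frame itself is modified and the call returns None.
working = original.copy()
returned = working.dropna(inplace=True)
print(returned)
print(working)

# Reassignment: the source frame is untouched and a new frame is returned.
cleaned = original.dropna()
print(len(original), len(cleaned))
```

Both approaches end with the same single row. Reassignment leaves the source data available for later steps, which is often the safer default.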
