1. apply()

pandas.DataFrame.apply

apply() is available on both Series and DataFrame; the signature below is the DataFrame version.

DataFrame.apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds)

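  • func: the function to apply to each column or row
  • axis: 0 (or 'index') applies func to each column; 1 (or 'columns') applies func to each row
  • raw: False (default) passes each row or column to func as a Series; True passes it as an ndarray
  • result_type: controls the return shape when axis=1; 'expand' turns list-like results into columns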

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
>>> df
   A  B
0  4  9
1  4  9
2  4  9
>>> df.apply(np.sum, axis=0)  # one result per column
A    12
B    27
dtype: int64
>>> df.apply(np.sum, axis=1)  # one result per row
0    13
1    13
2    13
dtype: int64

>>> df.apply(lambda x: [1, 2], axis=1)
0    [1, 2]
1    [1, 2]
2    [1, 2]
dtype: object

>>> df.apply(lambda x: [1, 2], axis=1, result_type='expand')
   0  1
0  1  2
1  1  2
2  1  2
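
The raw parameter is described above but not demonstrated. The following is a minimal sketch, assuming the same small df as in the examples above, that checks what type func actually receives with raw=False and raw=True. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])

# raw=False (default): func receives each column as a pandas Series
is_ndarray_default = df.apply(lambda col: isinstance(col, np.ndarray))

# raw=True: func receives each column as a NumPy ndarray
is_ndarray_raw = df.apply(lambda col: isinstance(col, np.ndarray), raw=True)

print(is_ndarray_default.tolist())  # [False, False]
print(is_ndarray_raw.tolist())      # [True, True]

# For a pure NumPy reduction such as np.sum, both modes give the same totals
sums_raw = df.apply(np.sum, raw=True)
print(sums_raw.tolist())            # [12, 27]
```

raw=True can be worth trying when func is a NumPy reduction, because it skips building a Series for every column or row.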

# Advanced usage
>>> import numpy as np
>>> df = pd.DataFrame(np.random.randn(6, 4), index=list('abcdef'), columns=list('ABCD'))
>>> df
          A         B         C         D
a  1.728403  0.837979  0.802319  0.290152
b -0.975462  1.083493 -0.233180 -0.431483
c -0.142911 -0.277469 -0.249412  0.152072
d -0.398880 -0.441956  0.374148  0.984257
e -0.042556  0.156787 -0.689351  1.551284
f  0.741325  0.557786 -2.353452  0.042494

>>> df['E'] = df['D'].apply(lambda x: x * 2)
>>> df
          A         B         C         D         E
a  1.728403  0.837979  0.802319  0.290152  0.580305
b -0.975462  1.083493 -0.233180 -0.431483 -0.862965
c -0.142911 -0.277469 -0.249412  0.152072  0.304143
d -0.398880 -0.441956  0.374148  0.984257  1.968514
e -0.042556  0.156787 -0.689351  1.551284  3.102569
f  0.741325  0.557786 -2.353452  0.042494  0.084989

>>> df['F'] = df.apply(lambda row: max(row['B'], row['C']), axis=1)
>>> df
          A         B         C         D         E         F
a  1.728403  0.837979  0.802319  0.290152  0.580305  0.837979
b -0.975462  1.083493 -0.233180 -0.431483 -0.862965  1.083493
c -0.142911 -0.277469 -0.249412  0.152072  0.304143 -0.249412
d -0.398880 -0.441956  0.374148  0.984257  1.968514  0.374148
e -0.042556  0.156787 -0.689351  1.551284  3.102569  0.156787
f  0.741325  0.557786 -2.353452  0.042494  0.084989  0.557786
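
For simple arithmetic like the E and F columns above, the same values can usually be computed without apply(). The sketch below compares both forms on freshly generated random data; the seed and variable names are assumptions made for the illustration, so the numbers differ from the table above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(rng.standard_normal((6, 4)),
                  index=list('abcdef'), columns=list('ABCD'))

# apply() versions, as in the examples above
e_apply = df['D'].apply(lambda x: x * 2)
f_apply = df.apply(lambda row: max(row['B'], row['C']), axis=1)

# Vectorized equivalents: no Python-level call per element or per row
e_vec = df['D'] * 2
f_vec = df[['B', 'C']].max(axis=1)

print(np.allclose(e_apply, e_vec))
print(np.allclose(f_apply, f_vec))
```

Row-wise apply() with axis=1 calls the function once per row in Python, so the vectorized form tends to be faster on large frames. apply() remains useful when the per-row logic has no vectorized counterpart.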


>>> def get_res(a, b):
...     print([a * 3, b - 2])  # debug output, printed once per row
...     return a * 3, b - 2
...
>>> df = pd.concat([df, df.apply(lambda row: pd.Series(get_res(row['A'], row['B']), index=['H', 'I']), axis=1, result_type='expand')], axis=1)
>>> df
          A         B         C  ...         F         H         I
a  1.728403  0.837979  0.802319  ...  0.837979  5.185209 -1.162021
b -0.975462  1.083493 -0.233180  ...  1.083493 -2.926385 -0.916507
c -0.142911 -0.277469 -0.249412  ... -0.249412 -0.428732 -2.277469
d -0.398880 -0.441956  0.374148  ...  0.374148 -1.196640 -2.441956
e -0.042556  0.156787 -0.689351  ...  0.156787 -0.127668 -1.843213
f  0.741325  0.557786 -2.353452  ...  0.557786  2.223976 -1.442214
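
The same multi-column pattern can be written without wrapping the result in pd.Series. This is a sketch under the assumption that the function returns a plain tuple; the small input frame is made up for the example, and get_res is the function from above with the print removed.

```python
import pandas as pd

def get_res(a, b):
    """Return two derived values for one row."""
    return a * 3, b - 2

df = pd.DataFrame({'A': [1.0, 2.0, 3.0], 'B': [10.0, 20.0, 30.0]},
                  index=list('abc'))

# result_type='expand' turns each returned tuple into columns 0 and 1
expanded = df.apply(lambda row: get_res(row['A'], row['B']),
                    axis=1, result_type='expand')
expanded.columns = ['H', 'I']  # give the new columns readable names

df = pd.concat([df, expanded], axis=1)
print(df)
```

Returning a tuple and renaming the columns once avoids constructing a Series for every row.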

2. map()

pandas.Series.map

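Series.map() substitutes each value of a Series using arg, which can be a function, a dict, or another Series. DataFrame has no map() method before pandas 2.1.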

Series.map(self, arg, na_action=None)

>>> s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])
>>> s
0       cat
1       dog
2       NaN
3    rabbit
dtype: object

>>> s.map({'cat': 'kitten', 'dog': 'puppy'})
0    kitten
1     puppy
2       NaN
3       NaN
dtype: object

>>> s.map('I am a {}'.format, na_action='ignore')
0       I am a cat
1       I am a dog
2              NaN
3    I am a rabbit
dtype: object
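
As the dict example above shows, values with no matching key ('rabbit') become NaN. One possible way to keep the original value instead is to fill the gaps from the source Series. This is a sketch of that idea, not the only approach, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])
mapping = {'cat': 'kitten', 'dog': 'puppy'}

# Keys missing from the dict map to NaN
mapped = s.map(mapping)

# Fall back to the original value wherever the lookup produced NaN;
# values that were already NaN in s stay NaN
mapped_with_fallback = mapped.fillna(s)

print(mapped_with_fallback.tolist())
```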

3. applymap()

pandas.DataFrame.applymap

DataFrame.applymap(self, func)

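applymap() exists only on DataFrame and applies func to every element. Since pandas 2.1 it is deprecated in favor of DataFrame.map().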

>>> df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])

>>> df
       0      1
0  1.000  2.120
1  3.356  4.567

>>> df.applymap(lambda x: len(str(x)))
   0  1
0  3  4
1  5  5
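
In pandas 2.1 and later, applymap() is deprecated and DataFrame.map() is the replacement. The sketch below shows one way to write code that runs on both older and newer versions; elementwise is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])

def elementwise(frame, func):
    """Hypothetical helper: apply func to every element of a DataFrame."""
    # DataFrame.map is available from pandas 2.1; older versions only have applymap
    if hasattr(pd.DataFrame, 'map'):
        return frame.map(func)
    return frame.applymap(func)

lengths = elementwise(df, lambda x: len(str(x)))
print(lengths)
```

The result matches the applymap() output shown above.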
