ML / DS: Apply the same transformation to every value in a column with a single line of code (for example, convert all values to str, or square them all)
Output
name object
ID object
age object
age02 int64
age03 object
born datetime64[ns]
sex object
hobbey object
money float64
weight float64
test01 float64
test02 float64
dtype: object
   name    ID  age  age02 age03       born   sex hobbey  money  weight
0   Bob     1  NaN     14    14        NaT     男    打篮球  200.0   140.5
1  LiSa     2   28     26    26 1990-01-01     女   打羽毛球  240.0   120.8
2  Mary         38     24    24 1980-01-01     女   打乒乓球  290.0   169.4
3  Alan  None           6     6        NaT  None         300.0   155.6

     test01    test02
0  1.000000  1.000000
1  2.123457  2.123457
2  3.123457  3.123457
3  4.123457  4.123457
   name    ID  age  age02 age03       born   sex hobbey  money  weight
0   Bob     1  NaN     14    14        NaT     男    打篮球  200.0   140.5
1  LiSa     2   28     26    26 1990-01-01     女   打羽毛球  240.0   120.8
2  Mary         38     24    24 1980-01-01     女   打乒乓球  290.0   169.4
3  Alan  None           6     6        NaT  None         300.0   155.6

     test01             test02  age02_Square
0  1.000000                1.0           196
1  2.123457        2.123456789           676
2  3.123457  3.123456781011126           576
3  4.123457  4.123456789109999            36
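The dtype listing above reports age and age03 as object because those columns mix numbers with strings. The following is a minimal sketch, not part of the original code, of converting such a column to numeric values in one line with pd.to_numeric. The variable names age03_raw, age_raw, age03_num and age_num are assumptions made for illustration.

```python
import pandas as pd

# Same values as the "age03" and "age" columns in the post, rebuilt as Series.
age03_raw = pd.Series([14, "26", "24", "6"], name="age03")   # object dtype
age_raw = pd.Series([float("nan"), 28, 38, ""], name="age")  # object dtype

# One line per column: parse every value as a number.
age03_num = pd.to_numeric(age03_raw)

# errors="coerce" turns values that cannot be parsed into NaN
# instead of raising an exception.
age_num = pd.to_numeric(age_raw, errors="coerce")

print(age03_num.dtype, age_num.dtype)
print(age_num)
```

Once a column is numeric, arithmetic such as squaring can be applied to the whole column at once.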
Implementation code
import pandas as pd
import numpy as np
contents = {
    "name": ['Bob', 'LiSa', 'Mary', 'Alan'],
    "ID": [1, 2, ' ', None],             # mixed int/str/None, stored as object
    "age": [np.nan, 28, 38, ''],         # np.nan prints as NaN, '' prints as blank
    "age02": [14, 26, 24, 6],            # all ints, stored as int64
    "age03": [14, '26', '24', '6'],      # ints mixed with strings, stored as object
    "born": [pd.NaT, pd.Timestamp("1990-01-01"), pd.Timestamp("1980-01-01"), ''],  # '' is shown as NaT in the output above
    "sex": ['男', '女', '女', None],       # None prints as None
    "hobbey": ['打篮球', '打羽毛球', '打乒乓球', ''],  # '' prints as blank
    "money": [200.0, 240.0, 290.0, 300.0],
    "weight": [140.5, 120.8, 169.4, 155.6],
    "test01": [1, 2.123456789, 3.123456781011126, 4.123456789109999],  # float64, printed with 6 decimals
    "test02": [1, 2.123456789, 3.123456781011126, 4.123456789109999],  # same values, converted to str below
}
data_frame = pd.DataFrame(contents)
# data_frame.to_excel("data_Frame.xls")
print(data_frame.dtypes)
print(data_frame)
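# Apply the same transformation to every value in a column with a single line
# (for example, convert everything to str, or square everything).
col = 'test02'
# astype returns a new Series and does not modify the column in place,
# so this line on its own leaves data_frame unchanged.
data_frame[col].astype("string")
# Convert every value to a Python str and assign the result back to the column.
data_frame[col] = data_frame[col].apply(str)

def ChangeSquare(x):
    return x * x

col = 'age02'
# Square every value and store the result in a new column.
data_frame[col + '_Square'] = data_frame[col].apply(ChangeSquare)
print(data_frame)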
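The two conversions in the code above can also be written without a helper function. The following is a minimal sketch rather than the author's code: it rebuilds only the two relevant columns in a small DataFrame named df (a name assumed here), then uses Series.astype(str) with assignment and the ** operator. The column names test02_str and age02_Square_map are hypothetical and exist only for illustration.

```python
import pandas as pd

# Only the two columns that the post transforms, with the same values.
df = pd.DataFrame({
    "age02": [14, 26, 24, 6],
    "test02": [1, 2.123456789, 3.123456781011126, 4.123456789109999],
})

# astype returns a new Series, so the result has to be assigned back.
df["test02_str"] = df["test02"].astype(str)

# Vectorized arithmetic squares the whole column without a per-element
# Python function call.
df["age02_Square"] = df["age02"] ** 2

# Series.map with a lambda is an element-wise alternative to apply.
df["age02_Square_map"] = df["age02"].map(lambda x: x * x)

print(df.dtypes)
print(df)
```

For simple arithmetic the vectorized form is usually the shorter and faster choice. apply or map remains useful when the per-value logic cannot be expressed as a column operation.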