
Example Analysis of dff in Pandas

Published: 2025-02-24 Author: 千家信息网 editor

千家信息网, last updated 2025-02-24


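This article walks through worked pandas examples on the Titanic dataset and on a small demo DataFrame named dff. The examples are detailed and should serve as a useful reference.

A library for data analysis and processing

import pandas as pd
df = pd.read_csv("./pandas/data/titanic.csv")

df.head(N): return the first N rows of the data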

df.head(6)

df.info(): print a concise summary of the DataFrame

df.info()

RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   PassengerId  891 non-null    int64
 1   Survived     891 non-null    int64
 2   Pclass       891 non-null    int64
 3   Name         891 non-null    object
 4   Sex          891 non-null    object
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64
 7   Parch        891 non-null    int64
 8   Ticket       891 non-null    object
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object
 11  Embarked     889 non-null    object
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB

df.index: view the index
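
The Non-Null Count column above implies missing values: Age has 891 - 714 = 177, Cabin has 687, and Embarked has 2. The same counts can be computed directly. The following is a minimal sketch on a small made-up frame (the variable names and values are illustrative, not taken from the Titanic file):

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the three Titanic columns that contain gaps.
sample = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0],
    "Cabin": [None, "C85", None],
    "Embarked": ["S", "C", "S"],
})

# isna() marks missing cells; summing the booleans counts them per column.
missing_per_column = sample.isna().sum()
print(missing_per_column)
```

Running the same expression on df should reproduce the gaps that df.info() reports.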

df.index

RangeIndex(start=0, stop=891, step=1)

df.columns: view all column names

df.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

df.dtypes: view the data type of each column

df.dtypes

PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object

df.values: view all the data as an array

df.values

array([[1, 0, 3, ..., 7.25, nan, 'S'],
       [2, 1, 1, ..., 71.2833, 'C85', 'C'],
       [3, 1, 3, ..., 7.925, nan, 'S'],
       ...,
       [889, 0, 3, ..., 23.45, nan, 'S'],
       [890, 1, 1, ..., 30.0, 'C148', 'C'],
       [891, 0, 3, ..., 7.75, nan, 'Q']], dtype=object)
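
The array above has dtype=object because the frame mixes numbers and strings. A short sketch of how the column mix affects the resulting array, using DataFrame.to_numpy() (the method the pandas documentation recommends over .values) on a made-up three-column frame:

```python
import pandas as pd

# Illustrative frame with one int, one float, and one string column.
sample = pd.DataFrame({
    "Pclass": [3, 1],
    "Fare": [7.25, 71.2833],
    "Embarked": ["S", "C"],
})

mixed = sample.to_numpy()                        # numbers + strings -> object array
numeric = sample[["Pclass", "Fare"]].to_numpy()  # int + float -> float64 array
print(mixed.dtype, numeric.dtype)
```

Selecting only the numeric columns first keeps the array in a numeric dtype, which is usually what downstream NumPy code expects.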

df['Name']

0                                Braund, Mr. Owen Harris
1      Cumings, Mrs. John Bradley (Florence Briggs Th...
2                                 Heikkinen, Miss. Laina
3           Futrelle, Mrs. Jacques Heath (Lily May Peel)
4                               Allen, Mr. William Henry
                             ...
886                                Montvila, Rev. Juozas
887                         Graham, Miss. Margaret Edith
888             Johnston, Miss. Catherine Helen "Carrie"
889                                Behr, Mr. Karl Howell
890                                  Dooley, Mr. Patrick
Name: Name, Length: 891, dtype: object

# Use the Name column as the index
df = df.set_index('Name')
df

Query the first 8 rows of the Age column

df['Age'][:8]

Name
Braund, Mr. Owen Harris                                22.0
Cumings, Mrs. John Bradley (Florence Briggs Thayer)    38.0
Heikkinen, Miss. Laina                                 26.0
Futrelle, Mrs. Jacques Heath (Lily May Peel)           35.0
Allen, Mr. William Henry                               35.0
Moran, Mr. James                                        NaN
McCarthy, Mr. Timothy J                                54.0
Palsson, Master. Gosta Leonard                          2.0
Name: Age, dtype: float64

Operations on a single column

age = df['Age']
age

Name
Braund, Mr. Owen Harris                                22.0
Cumings, Mrs. John Bradley (Florence Briggs Thayer)    38.0
Heikkinen, Miss. Laina                                 26.0
Futrelle, Mrs. Jacques Heath (Lily May Peel)           35.0
Allen, Mr. William Henry                               35.0
                                                       ...
Montvila, Rev. Juozas                                  27.0
Graham, Miss. Margaret Edith                           19.0
Johnston, Miss. Catherine Helen "Carrie"                NaN
Behr, Mr. Karl Howell                                  26.0
Dooley, Mr. Patrick                                    32.0
Name: Age, Length: 891, dtype: float64

# Add 10 to every Age value
age = age + 10
age

Name
Braund, Mr. Owen Harris                                32.0
Cumings, Mrs. John Bradley (Florence Briggs Thayer)    48.0
Heikkinen, Miss. Laina                                 36.0
Futrelle, Mrs. Jacques Heath (Lily May Peel)           45.0
Allen, Mr. William Henry                               45.0
                                                       ...
Montvila, Rev. Juozas                                  37.0
Graham, Miss. Margaret Edith                           29.0
Johnston, Miss. Catherine Helen "Carrie"                NaN
Behr, Mr. Karl Howell                                  36.0
Dooley, Mr. Patrick                                    42.0
Name: Age, Length: 891, dtype: float64

# Maximum of age (after the +10 shift above)
age.max()

90.0

# Minimum of age (after the +10 shift above)
age.min()

10.42

# Mean of age (after the +10 shift above)
age.mean()

39.69911764705882
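
The Age column contains NaN values, yet max(), min(), and mean() above still return numbers. That is because pandas reductions skip missing values by default (skipna=True), while element-wise arithmetic such as age + 10 leaves NaN as NaN. A minimal sketch with made-up ages:

```python
import numpy as np
import pandas as pd

# Illustrative ages with one missing entry.
ages = pd.Series([22.0, 38.0, np.nan, 35.0])

mean_skipping_nan = ages.mean()          # averages only the 3 present values
mean_with_nan = ages.mean(skipna=False)  # NaN, because one value is missing
shifted = ages + 10                      # the missing entry stays NaN

print(mean_skipping_nan, mean_with_nan)
print(shifted)
```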

describe(): basic summary statistics of the data

df.describe()

Select only a few columns

df[['Age','Fare']][:5]

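Query data by position or by label

# View one row by position
df.iloc[0]
# First 5 rows (the stop position is excluded)
df.iloc[0:5]
# First 5 rows, columns at positions 1 and 2
df.iloc[0:5, 1:3]

# Read one row by its index label
df.loc['Futrelle, Mrs. Jacques Heath (Lily May Peel)']
# Read a single value by row label and column name
df.loc['Futrelle, Mrs. Jacques Heath (Lily May Peel)', 'Age']
# Read a range of rows and a range of columns (label slices include both ends)
df.loc['Braund, Mr. Owen Harris':'Graham, Miss. Margaret Edith', 'Sex':'Age']
# Modify a value
df.loc['Heikkinen, Miss. Laina', 'Age'] = 2000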
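
One detail that is easy to miss in the queries above: iloc slices by position and excludes the stop position, whereas loc slices by label and includes both end labels. A small sketch, using shortened made-up index labels rather than the full passenger names:

```python
import pandas as pd

frame = pd.DataFrame(
    {"Sex": ["male", "female", "female"], "Age": [22.0, 38.0, 26.0]},
    index=["Braund", "Cumings", "Heikkinen"],  # shortened, illustrative labels
)

by_position = frame.iloc[0:2]               # positions 0 and 1; position 2 is excluded
by_label = frame.loc["Braund":"Heikkinen"]  # both end labels are included

print(len(by_position), len(by_label))
```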

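Boolean operations

# First 5 rows where Age is greater than 50
df[df['Age'] > 50][:5]
# Rows where Sex is female
df[df['Sex'] == 'female']
# Mean Age of the rows where Sex is male
df.loc[df['Sex'] == 'male', 'Age'].mean()
# Number of rows where Age is greater than 50 (summing a boolean mask counts the True values)
(df['Age'] > 50).sum()

DataFrame groupby: grouping data

dff = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
                    'value': [0, 5, 10, 5, 10, 15, 10, 15, 20]})
dff

Group by key and sum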
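
Summing a boolean mask counts the matching rows; it does not add up the ages themselves. The sketch below contrasts the two and also combines two conditions with &. The frame and its values are invented for illustration:

```python
import pandas as pd

frame = pd.DataFrame({
    "Sex": ["male", "female", "male", "female"],
    "Age": [54.0, 58.0, 22.0, 35.0],
})

over_50 = frame["Age"] > 50                            # boolean Series
count_over_50 = over_50.sum()                          # True counts as 1
sum_of_ages_over_50 = frame.loc[over_50, "Age"].sum()  # adds the ages themselves

# Combine conditions with & (and) or | (or); each side needs parentheses.
males_over_50 = frame[(frame["Sex"] == "male") & over_50]

print(count_over_50, sum_of_ages_over_50, len(males_over_50))
```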

dff.groupby('key').sum()

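# Group by key and take the mean. The string 'mean' is used here because recent
# pandas versions emit a FutureWarning when np.mean is passed to aggregate.
dff.groupby('key').aggregate('mean')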

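# Group by Sex and compute the mean Age
# (the female mean below is inflated by the Age = 2000 edit made earlier)
df.groupby('Sex')['Age'].mean()

Sex
female    35.478927
male      30.726645
Name: Age, dtype: float64

Numeric operations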
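
Several aggregations can be computed in one pass by giving agg a list of function names. A minimal sketch that reuses the dff data defined earlier in this article (the variable name summary is introduced here for illustration):

```python
import pandas as pd

dff = pd.DataFrame({
    "key": ["A", "B", "C", "A", "B", "C", "A", "B", "C"],
    "value": [0, 5, 10, 5, 10, 15, 10, 15, 20],
})

# One row per key, one column per aggregation.
summary = dff.groupby("key")["value"].agg(["sum", "mean", "max"])
print(summary)
```

Each name in the list becomes a column in the result, so the output is a DataFrame indexed by key.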

df1 = pd.DataFrame([[1, 2, 3, 4], [3, 4, 5, 6]], index=['a', 'b'], columns=['A', 'B', 'C', 'D'])
df1

# Sum of each column (axis=0 is the default, so both calls are equivalent)
df1.sum()
df1.sum(axis=0)

A     4
B     6
C     8
D    10
dtype: int64

# Sum of each row
df1.sum(axis=1)

a    10
b    18
dtype: int64

# Mean of each column
df1.mean(axis=0)

A    2.0
B    3.0
C    4.0
D    5.0
dtype: float64

# Mean of each row
df1.mean(axis=1)

a    2.5
b    4.5
dtype: float64

df

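# Covariance (numeric_only=True skips text columns, which pandas 2.x no longer drops silently)
df.cov(numeric_only=True)

# Correlation
df.corr(numeric_only=True)

# Count how many times each value occurs in a column
df['Age'].value_counts()

24.00    30
22.00    27
18.00    26
28.00    25
19.00    25
         ..
53.00     1
55.50     1
70.50     1
23.50     1
0.42      1
Name: Age, Length: 89, dtype: int64

# Count how many times each value occurs, sorted from least to most frequent
df['Age'].value_counts(ascending=True)

0.42      1
23.50     1
70.50     1
55.50     1
53.00     1
         ..
19.00    25
28.00    25
18.00    26
22.00    27
24.00    30
Name: Age, Length: 89, dtype: int64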
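
Covariance and correlation are only defined for numeric columns, so a frame that also holds text (such as Sex or Ticket) needs to be narrowed first. A minimal sketch, assuming a made-up frame in which Fare is a linear function of Age so that the expected correlation is exactly 1:

```python
import pandas as pd

frame = pd.DataFrame({
    "Name": ["a", "b", "c", "d"],       # text column, must be excluded
    "Age": [20.0, 30.0, 40.0, 50.0],
    "Fare": [10.0, 20.0, 30.0, 40.0],   # Fare = Age - 10
})

# Keep only the numeric columns before computing pairwise statistics.
numeric = frame.select_dtypes(include="number")
cov = numeric.cov()    # sample covariance (divides by n - 1)
corr = numeric.corr()  # Pearson correlation by default

print(cov)
print(corr)
```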
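
value_counts drops missing values by default and returns raw counts. Two optional parameters change that: normalize=True returns shares instead of counts, and dropna=False keeps NaN as its own category. A short sketch with made-up ages:

```python
import numpy as np
import pandas as pd

ages = pd.Series([24.0, 24.0, 22.0, np.nan])

counts = ages.value_counts()                    # NaN is excluded
shares = ages.value_counts(normalize=True)      # fractions of the non-missing values
with_missing = ages.value_counts(dropna=False)  # NaN counted as its own entry

print(counts)
print(shares)
print(with_missing)
```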

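Series operations (a Series is a single row or a single column)

data = [1, 2, 3, 4]
index = ['a', 'b', 'c', 'd']
s = pd.Series(index=index, data=data)
# Get the first row by position
s.iloc[0]
# Get the rows at positions 1 and 2 (the stop position 3 is excluded)
s.iloc[1:3]
# Boolean mask: show only rows a and c
mask = [True, False, True, False]
s[mask]
# Modify a value
s['a'] = 200
# Replace the value 3 with 300
s.replace(to_replace=3, value=300, inplace=True)
# Rename an index label
s.rename(index={'a': 'A'}, inplace=True)
# Append data (Series.append was removed in pandas 2.0; use pd.concat instead)
s1 = pd.Series(index=['e', 'f'], data=[5, 6])
s3 = pd.concat([s, s1])
# Delete row A
del s3['A']
# Delete several rows at once
s3.drop(['c', 'd'], inplace=True)
s3

b    2
e    5
f    6
dtype: int64

Creating, reading, updating, and deleting DataFrame data

# Build a DataFrame
data = [[1, 2, 3, 4], [5, 6, 7, 8]]
index = ['a', 'b']
columns = ['A', 'B', 'C', 'D']
dff = pd.DataFrame(data=data, index=index, columns=columns)
dff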

	A	B	C	D
a	1	2	3	4
b	5	6	7	8

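# Query with iloc (integer position) and loc (index label)
dff1 = dff.iloc[1]
dff1 = dff.loc['a']
dff1

A    1
B    2
C    3
D    4
Name: a, dtype: int64

# Modify a value (a single .loc call, not chained indexing, so the write reaches dff itself)
dff.loc['a', 'A'] = 1000
dff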

	A	B	C	D
a	1000	2	3	4
b	5	6	7	8
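
Writing through a single .loc[row, column] call is the reliable way to modify a cell. Chained indexing such as dff.loc['a']['A'] = 1000 first builds an intermediate object and may assign to a temporary copy, which pandas flags with a warning and which has no effect under copy-on-write. A minimal sketch on the same two-row frame:

```python
import pandas as pd

dff = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    index=["a", "b"],
    columns=["A", "B", "C", "D"],
)

# One indexing call: row label first, then column label.
dff.loc["a", "A"] = 1000

# Several cells in one row can be set the same way.
dff.loc["b", ["C", "D"]] = [70, 80]

print(dff)
```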

# Replace the index labels
dff.index = ['m', 'n']
dff

	A	B	C	D
m	1000	2	3	4
n	5	6	7	8

# Add a row
dff.loc['o'] = [10, 11, 12, 13]
dff

	A	B	C	D
m	1000	2	3	4
n	5	6	7	8
o	10	11	12	13

# Add a column
dff['E'] = [5, 9, 14]
dff

	A	B	C	D	E
m	1000	2	3	4	5
n	5	6	7	8	9
o	10	11	12	13	14

# Add several columns at once by concatenating along the column axis
df4 = pd.DataFrame([[6, 10, 15], [7, 11, 16], [8, 12, 17]], index=['m', 'n', 'o'], columns=['F', 'M', 'N'])
df5 = pd.concat([dff, df4], axis=1)
df5

	A	B	C	D	E	F	M	N
m	1000	2	3	4	5	6	10	15
n	5	6	7	8	9	7	11	16
o	10	11	12	13	14	8	12	17
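
pd.concat with axis=1 matches rows by index label, not by position, which is why df4 was built with the same m, n, o index. When the labels differ, the default outer join fills the gaps with NaN. A short sketch with two made-up frames (left, right, and the column values are illustrative):

```python
import pandas as pd

left = pd.DataFrame({"A": [1, 2, 3]}, index=["m", "n", "o"])
right = pd.DataFrame({"F": [6, 7]}, index=["m", "n"])  # no row labelled "o"

combined = pd.concat([left, right], axis=1)             # outer join: "o" gets NaN in F
inner = pd.concat([left, right], axis=1, join="inner")  # keep only shared labels

print(combined)
print(inner)
```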

# Delete a row
df5.drop(['o'], axis=0, inplace=True)
df5

	A	B	C	D	E	F	M	N
m	1000	2	3	4	5	6	10	15
n	5	6	7	8	9	7	11	16

# Delete columns
df5.drop(['E', 'F'], axis=1, inplace=True)
df5

	A	B	C	D	M	N
m	1000	2	3	4	10	15
n	5	6	7	8	11	16
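
The drop calls above use inplace=True, which modifies df5 directly. Without it, drop returns a new frame and leaves the original untouched, and the index= and columns= keywords can replace the axis argument. A minimal sketch on a made-up frame:

```python
import pandas as pd

frame = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["m", "n"],
    columns=["A", "B", "C"],
)

without_b = frame.drop(columns=["B"])  # new frame; frame itself is unchanged
without_n = frame.drop(index=["n"])    # same idea for rows

print(without_b)
print(without_n)
```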

That concludes this walkthrough of pandas examples, including the dff DataFrame. Thanks for reading, and I hope it proves useful.
