Learning Pandas: Series

智汇君2025-01-17

Series

Creating a Series

From a list

A Series created from a Python list is a copy of the data: changing one does not affect the other.

>>> from pandas import Series,DataFrame
>>> s=Series([2,1,5,4,3])
>>> s
0    2
1    1
2    5
3    4
4    3
dtype: int64
>>> type(s)
<class 'pandas.core.series.Series'>
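To check the copy behaviour for lists, here is a minimal sketch. The names `values` and `s_from_list` are made up for illustration.

```python
import pandas as pd

values = [2, 1, 5, 4, 3]
s_from_list = pd.Series(values)

# Change the list after the Series has been built
values[0] = 99

print(values[0])            # 99
print(s_from_list.iloc[0])  # 2, the Series kept its own copy

# Changing the Series does not touch the list either
s_from_list.iloc[1] = -1
print(values[1])            # 1
```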

By default, the index of a Series runs from 0 to N-1.

Use .keys(), .values and .index to get the labels, the underlying array, and the index object.

>>> s.keys()  # keys() is a method; it returns the index
RangeIndex(start=0, stop=5, step=1)
>>> s.values
array([2, 1, 5, 4, 3])
>>> s.index
RangeIndex(start=0, stop=5, step=1)

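From a NumPy array

A Series created from a NumPy array is not a copy by default: in the session below, changing the original array changes the Series as well. (With Copy-on-Write enabled, which is the default from pandas 3.0, the constructor should copy the array instead.)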


>>> import numpy as np
>>> import pandas as pd
>>> data = np.array([1, 2, 3, 4, 5])
>>> series = pd.Series(data)
>>> series
0    1
1    2
2    3
3    4
4    5
dtype: int64
>>> data[1]=10
>>> series
0     1
1    10
2     3
3     4
4     5
dtype: int64
>>> data
array([ 1, 10,  3,  4,  5])
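If the shared memory is not wanted, one option is to ask the constructor for a copy. The sketch below assumes the `copy` parameter of `pd.Series`. Whether the default shares memory depends on the pandas version and on Copy-on-Write, so that check is only printed, not relied on.

```python
import numpy as np
import pandas as pd

data = np.array([1, 2, 3, 4, 5])

shared = pd.Series(data)                  # default: may reuse the array's memory
independent = pd.Series(data, copy=True)  # always gets its own buffer

data[1] = 10

# Version-dependent: True when the Series is a view on `data`
print(np.shares_memory(data, shared.to_numpy()))

print(independent.iloc[1])   # 2, unaffected by the change to `data`
```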

>>> s1 = Series(np.random.randint(1, 10, size=5))  # create a Series from an array

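From a dict

A Series created from a dict is a copy: the keys become the index and the values are copied into a new array. Modifying the original dict afterwards does not change the Series.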
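A quick way to see whether a Series follows later changes to the dict it was built from. This is a minimal sketch; `prices` and `s_from_dict` are hypothetical names.

```python
import pandas as pd

prices = {"a": 1, "b": 2, "c": 3}   # hypothetical sample data
s_from_dict = pd.Series(prices)

# Mutate the dict after the Series exists
prices["a"] = 100
prices["d"] = 4

print(s_from_dict["a"])     # 1, the old value
print("d" in s_from_dict)   # False, the new key never reached the Series
print(len(s_from_dict))     # 3
```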

The in operator

>>> s2 = Series([2,1,5,4,3])
>>> s2.index = ['a','b','c','d','e']
>>> s2
a    2
b    1
c    5
d    4
e    3
dtype: int64
>>> 'a' in s2
True
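Note that `in` tests the index labels, not the values. A short sketch of the difference, plus two ways to search the values instead:

```python
import pandas as pd

s = pd.Series([2, 1, 5, 4, 3], index=["a", "b", "c", "d", "e"])

print("a" in s)            # True: "a" is an index label
print(5 in s)              # False: 5 is a value, not a label
print(5 in s.values)       # True: search the underlying array instead
print(s.isin([5]).any())   # True: the same check with a pandas method
```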

Modifying a value

>>> s2['c']=55
>>> s2
a     2
b     1
c    55
d     4
e     3
dtype: int64

Creating a Series by passing in a dict

>>> s3=Series({'5':1,'6':2,'7':3})
>>> s3
5    1
6    2
7    3
dtype: int64
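A dict can also be combined with an explicit `index`. As a sketch of what that does: the labels are looked up in the dict in the order given, and any label the dict lacks becomes NaN (which turns the dtype into float). The label "x" is a made-up missing key.

```python
import pandas as pd

d = {"5": 1, "6": 2, "7": 3}

# Keys listed in `index` are looked up in the dict, in the order given;
# labels missing from the dict get NaN
s = pd.Series(d, index=["7", "5", "x"])

print(s)
print(s.isna().sum())   # 1 missing value, for the label "x"
```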

Specifying the index

>>> s2=Series([2,1,5,4,3],index=['a','b','c','d','e'])
>>> s2
a    2
b    1
c    5
d    4
e    3
dtype: int64

The index can be specified when the Series is created, or reassigned afterwards.

>>> s2.index=[0,1,2,3,4]
>>> s2
0    2
1    1
2    5
3    4
4    3
dtype: int64
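Reassigning the index only relabels the existing values, so the new index has to have the same length as the Series. A minimal sketch, in which `mismatch_rejected` is just a flag for the demonstration:

```python
import pandas as pd

s = pd.Series([2, 1, 5, 4, 3], index=["a", "b", "c", "d", "e"])

# Replacing the labels keeps the values in place
s.index = [10, 20, 30, 40, 50]
print(s[30])   # 5

# A new index of the wrong length is rejected
try:
    s.index = [0, 1, 2]
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print("ValueError:", err)
```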

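Selecting values

Selecting from a Series: use [], iloc, or loc.
[]:
- Accepts explicit labels. If no index was set, the labels are the same as the implicit positions; once set, the labels can be numbers or strings. The easiest check is to evaluate the variable and look at the index it displays.
- In a slice [x:y], integer bounds are half-open (start included, end excluded); string-label bounds include both ends.
- With a string index, an integer in [] still falls back to the implicit position, although recent pandas versions deprecate this fallback in favor of iloc. With an integer index, an integer in [] is always treated as a label, so positions cannot be used this way.

iloc uses the implicit positions 0 to n-1; slices are half-open.
loc uses the explicit labels (the same as the implicit positions if no index was set, otherwise numbers or strings; evaluate the variable to see them). Slices include both ends, whether the labels are numbers or strings.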
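The difference between positions and labels is easiest to see when the integer labels do not match the positions. A minimal sketch with made-up labels 10 to 50:

```python
import pandas as pd

# Integer labels that differ from the positions 0..4
s = pd.Series([2, 1, 5, 4, 3], index=[10, 20, 30, 40, 50])

print(s[20])       # 1: an integer in [] is looked up as a label here
print(s.iloc[1])   # 1: position 1
print(s.loc[20])   # 1: label 20

by_position = s.iloc[1:3]   # positions 1 and 2, end excluded
by_label = s.loc[20:40]     # labels 20, 30 and 40, end included

print(by_position.tolist())   # [1, 5]
print(by_label.tolist())      # [1, 5, 4]
```

Using iloc and loc explicitly avoids having to remember which rule [] applies.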

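Selecting from a DataFrame:
Rows: use iloc or loc.
iloc accepts only implicit positions, 0 to len-1; slices are half-open.
loc accepts only explicit labels, which can be strings or numbers depending on the actual index; slices include both ends.

Columns: use
.col or ['col']
loc or iloc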
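The DataFrame rules can be sketched as follows. The frame, its column names and its row labels are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cid"], "age": [30, 25, 41]},
    index=["r1", "r2", "r3"],
)

# Rows
first_two_by_pos = df.iloc[0:2]          # positions 0 and 1, end excluded
first_two_by_label = df.loc["r1":"r2"]   # labels r1 and r2, end included

# Columns: four ways to reach the same data
ages = df["age"]
ages_attr = df.age            # attribute access; only for names that are valid identifiers
ages_loc = df.loc[:, "age"]
ages_iloc = df.iloc[:, 1]

print(first_two_by_pos.equals(first_two_by_label))   # True
print(ages.tolist())                                  # [30, 25, 41]
```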

By index

Use [index], just as with a list.

>>> s2[3]
np.int64(4)
>>> type(s2[3])
<class 'numpy.int64'>

# If the row labels are strings, ['xx'] works as well

Slicing

Use slice notation.

>>> s2[2:4]  # start included, end excluded
2    5
3    4
dtype: int64

>>> type(s2[2:4])
<class 'pandas.core.series.Series'>

Selecting multiple values

Pass a list of index labels to select several values.
>>> s2[[0,4,2]]
0    2
4    3
2    5
dtype: int64
>>> type(s2[[0,4,2]])
<class 'pandas.core.series.Series'>

Both of the results above are Series.

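iloc

Works with implicit (positional) indexing; this is the officially recommended access mechanism.

import numpy as np
from pandas import Series
s1 = Series(np.random.randint(1, 10, size=5), index=["A", "B", "C", "D", "E"])
# The index was set explicitly to string labels, so loc selects by label (loc also works
# with numeric labels and includes both endpoints); iloc selects by position (end excluded).
print(s1.iloc[0])       # first element by position; s1[0] on a string index is deprecated
print(s1.iloc[0:3])     # positions 0, 1 and 2
print(s1.iloc[[0, 1]])  # positions 0 and 1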

loc

Works with explicit (label) indexing; this is the officially recommended access mechanism.

import numpy as np
from pandas import Series
s1 = Series(np.random.randint(1, 10, size=5), index=["A", "B", "C", "D", "E"])  # create a Series from an array
print(s1)
print(s1.loc["A":"B"])  # both endpoints included

Arithmetic

>>> s2+2
0    4
1    3
2    7
3    6
4    5
dtype: int64
>>> s2*3
0     6
1     3
2    15
3    12
4     9
dtype: int64

Comparing a Series with a scalar also returns a Series, this time of booleans.

>>> s2>3
0    False
1    False
2     True
3     True
4    False
dtype: bool

Filtering

>>> s2[s2>3]
2    5
3    4
dtype: int64
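Several conditions can be combined in one filter. As a sketch: each comparison goes in its own parentheses, and the element-wise operators `&`, `|` and `~` are used in place of `and`, `or` and `not`.

```python
import pandas as pd

s = pd.Series([2, 1, 5, 4, 3])

# Each comparison needs its own parentheses; use & and |, not `and` / `or`
middle = s[(s > 1) & (s < 5)]
extremes = s[(s == 1) | (s == 5)]
not_big = s[~(s > 3)]

print(middle.tolist())     # [2, 4, 3]
print(extremes.tolist())   # [1, 5]
print(not_big.tolist())    # [2, 1, 3]
```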