Pandas Series: A Collection of Common Usages

智汇君 2025-01-17

series.between

tweets[tweets["tweet_date"].between("2024-02-01", "2024-02-29")]
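A minimal sketch of the date filter, assuming a hypothetical `tweets` DataFrame whose `tweet_date` column is already a datetime column (the sample rows are made up for illustration). `between` includes both endpoints by default.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the note above.
tweets = pd.DataFrame({
    "tweet_id": [1, 2, 3, 4],
    "tweet_date": pd.to_datetime(
        ["2024-01-31", "2024-02-01", "2024-02-29", "2024-03-01"]
    ),
})

# between() returns a boolean mask; both bounds are inclusive by default.
mask = tweets["tweet_date"].between("2024-02-01", "2024-02-29")
tweets_feb_2024 = tweets[mask]
print(tweets_feb_2024)
```

If the column also carries a time of day, an upper bound of "2024-02-29" only matches midnight of that day, so a half-open comparison against "2024-03-01" may be the safer choice.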

series.isin([])

order[['dishes_id','dishes_name']][order['dishes_name'].isin(['内蒙古烤羊腿','xxx'])]
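The same selection can be sketched with `.loc`, which picks rows and columns in a single step instead of chaining two indexing operations. The `order` table below is hypothetical sample data.

```python
import pandas as pd

# Hypothetical order table; column names follow the note above.
order = pd.DataFrame({
    "dishes_id": [101, 102, 103],
    "dishes_name": ["内蒙古烤羊腿", "白饭", "烤鸭"],
    "amounts": [1, 2, 1],
})

wanted = ["内蒙古烤羊腿", "烤鸭"]

# isin() marks rows whose value appears in the list.
mask = order["dishes_name"].isin(wanted)
selected = order.loc[mask, ["dishes_id", "dishes_name"]]

# Negate the mask with ~ to keep everything NOT in the list.
others = order.loc[~mask, ["dishes_id", "dishes_name"]]
print(selected)
```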

series.str

https://zhuanlan.zhihu.com/p/30894133

https://blog.csdn.net/weixin_43750377/article/details/107979607

.str.cat

.str.split
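A short sketch of `.str.split` and `.str.cat` on made-up data, since the notes list them without an example.

```python
import pandas as pd

names = pd.Series(["a,b", "c,d", "e,f"])

# split() returns a Series of lists; expand=True returns a DataFrame instead.
parts = names.str.split(",")
table = names.str.split(",", expand=True)

# cat() without another Series joins all elements into one string.
joined = names.str.cat(sep="|")

# cat() with another Series concatenates element by element.
pairs = table[0].str.cat(table[1], sep="-")

print(parts.tolist())
print(joined)
print(pairs.tolist())
```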

.str.findall(r"#\w+")

hashtags = tweets_feb_2024["tweet"].str.findall(r"#\w+")
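`findall` returns one list per row, so a common follow-up is to flatten the lists. A sketch on hypothetical tweets, using `explode` (my addition, not part of the original notes):

```python
import pandas as pd

# Hypothetical sample tweets.
tweets_feb_2024 = pd.DataFrame({
    "tweet": ["Loving #pandas and #python", "no tags here", "#data rocks"],
})

# Each element is a list of all matches in that row (possibly empty).
hashtags = tweets_feb_2024["tweet"].str.findall(r"#\w+")

# explode() gives one row per list item; empty lists become NaN, so drop them.
flat = hashtags.explode().dropna()

print(hashtags.tolist())
print(flat.tolist())
```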

.str.contains('xxx')

order[['dishes_id','dishes_name']][order['dishes_name'].str.contains('烤')]
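One thing to watch with `.str.contains` is missing values: without `na=False` the mask may contain missing entries and cannot be used directly for filtering. A sketch on hypothetical data:

```python
import pandas as pd

# Hypothetical order table with one missing dish name.
order = pd.DataFrame({
    "dishes_id": [101, 102, 103, 104],
    "dishes_name": ["内蒙古烤羊腿", "白饭", None, "烤鸭"],
})

# na=False treats missing names as "no match".
# regex=False matches the text literally instead of as a regular expression.
mask = order["dishes_name"].str.contains("烤", na=False, regex=False)
roasted = order.loc[mask, ["dishes_id", "dishes_name"]]
print(roasted)
```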

.str.extract

tweets["hashtag"] = "#" + tweets["tweet"].str.extract(r"#(\w+)")
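Unlike `findall`, `extract` keeps only the first match of each capture group. The sketch below, on hypothetical tweets, passes `expand=False` so that a single capture group yields a Series rather than a one-column DataFrame; that argument is my suggestion, not something the original snippet uses.

```python
import pandas as pd

# Hypothetical sample tweets.
tweets = pd.DataFrame({
    "tweet": ["Loving #pandas and #python", "no tags here"],
})

# The capture group (\w+) excludes the leading '#'; rows without a match give NaN.
first_tag = tweets["tweet"].str.extract(r"#(\w+)", expand=False)

# Put the '#' back; rows without a match stay missing.
tweets["hashtag"] = "#" + first_tag
print(tweets)
```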

Series.dt

.dt.strftime

tweets = tweets[tweets["tweet_date"].dt.strftime("%Y%m") == "202402"]
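As a sketch, the `strftime` filter can be compared with an equivalent filter built from `.dt.year` and `.dt.month`, which avoids formatting every date as a string. The data is hypothetical and `tweet_date` is assumed to be a datetime column.

```python
import pandas as pd

# Hypothetical sample data with a datetime column.
tweets = pd.DataFrame({
    "tweet_date": pd.to_datetime(["2024-01-31", "2024-02-15", "2024-03-01"]),
})

# String-based filter, as in the note above.
by_strftime = tweets[tweets["tweet_date"].dt.strftime("%Y%m") == "202402"]

# Equivalent filter using the numeric date parts.
dates = tweets["tweet_date"]
by_parts = tweets[(dates.dt.year == 2024) & (dates.dt.month == 2)]

print(by_strftime)
```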

A Series can be iterated

Looping over a Series visits its elements one by one, so each row can be printed in turn.

hashtags = tweets_feb_2024["tweet"].str.findall(r"#\w+")
for i in hashtags:
    print(i)
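Plain iteration yields only the values. When the index label is needed as well, `Series.items()` can be used; a small sketch on made-up data:

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Iterating a Series directly yields its values.
values = [v for v in s]

# items() yields (index label, value) pairs.
pairs = list(s.items())

for label, value in s.items():
    print(label, value)
```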

.value_counts()

Counts how many times each distinct value occurs in the Series.
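A minimal sketch of `value_counts` on made-up data. The result is itself a Series, indexed by the distinct values and sorted by count in descending order.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", "c", "a", "b"])

# Absolute counts, most frequent value first.
counts = s.value_counts()

# normalize=True returns relative frequencies instead of counts.
shares = s.value_counts(normalize=True)

print(counts)
print(shares)
```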

.reset_index()
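On a Series, `reset_index()` moves the index into a regular column and returns a DataFrame, while `reset_index(drop=True)` discards the old index and keeps a Series. A sketch with made-up names (`letter` and `n` are illustrative):

```python
import pandas as pd

s = pd.Series([3, 2], index=["a", "b"], name="n")
s.index.name = "letter"

# The index becomes a column; the result is a DataFrame.
df = s.reset_index()

# drop=True throws the old index away and renumbers from 0.
renumbered = s.reset_index(drop=True)

print(df)
print(renumbered)
```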

any(series)
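The built-in `any()` works because a Series is iterable; `Series.any()` is the equivalent pandas method. A small sketch on made-up data:

```python
import pandas as pd

flags = pd.Series([False, True, False])

# Built-in any() iterates over the values.
builtin_result = any(flags)

# Series.any() performs the same check inside pandas.
method_result = bool(flags.any())

# Typical use: does at least one element satisfy a condition?
has_large = bool((pd.Series([1, 5, 9]) > 8).any())

print(builtin_result, method_result, has_large)
```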

.max()
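A short sketch of `.max()` on made-up data, together with `.idxmax()`, which returns the index label of the maximum rather than its value. Missing values are skipped by default.

```python
import pandas as pd

s = pd.Series([3.0, 9.0, None, 4.0], index=["a", "b", "c", "d"])

# Largest value; NaN entries are ignored by default (skipna=True).
largest = s.max()

# Index label at which the largest value occurs.
where = s.idxmax()

print(largest, where)
```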

.apply(xx)

# Add new columns:
df['salary_level'] = df['salary'].apply(lambda x: 'high' if x > 12000 else 'medium' if x > 9000 else 'low')
df['age_group'] = pd.cut(df['age'], bins=[0, 30, 35, 100], labels=['young', 'middle-aged', 'senior'])
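A runnable sketch of the two derived columns on a hypothetical `df`, with the nested lambda unrolled into a named function for readability. The English labels are illustrative. Note that `pd.cut` bins are right-inclusive by default, so an age of exactly 30 lands in the first bin.

```python
import pandas as pd

# Hypothetical employee data.
df = pd.DataFrame({
    "salary": [8000, 9500, 12000, 15000],
    "age": [25, 30, 33, 40],
})

def salary_level(x):
    """Map a salary to a coarse level using the thresholds from the note."""
    if x > 12000:
        return "high"
    if x > 9000:
        return "medium"
    return "low"

# apply() calls the function once per element.
df["salary_level"] = df["salary"].apply(salary_level)

# cut() assigns each age to a bin: (0, 30], (30, 35], (35, 100].
df["age_group"] = pd.cut(
    df["age"], bins=[0, 30, 35, 100], labels=["young", "middle-aged", "senior"]
)
print(df)
```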

.pct_change()

Computes the rate of change of a numeric Series: (current row - previous row) / previous row.
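A small sketch on made-up prices, checking `pct_change` against the manual formula with `shift`. The denominator is the previous row, and the first row has no predecessor, so it comes out as NaN.

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0])

# Built-in rate of change relative to the previous row.
change = prices.pct_change()

# The same thing by hand: (current - previous) / previous.
previous = prices.shift(1)
manual = (prices - previous) / previous

# Roughly NaN, 0.10, -0.10 (up to floating-point rounding).
print(change)
print(np.allclose(change, manual, equal_nan=True))
```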