Pandas Series: A Collection of Common Usage Patterns

series.between

tweets["tweet_date"].between("2024-02-01", "2024-02-29")
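
A minimal sketch of `between` used as a row filter. The `tweets` DataFrame and its sample rows are invented for illustration; only the column name `tweet_date` comes from the note above. Both bounds are inclusive by default.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the note above.
tweets = pd.DataFrame({
    "tweet_id": [1, 2, 3, 4],
    "tweet_date": pd.to_datetime(
        ["2024-01-31", "2024-02-01", "2024-02-29", "2024-03-01"]
    ),
})

# between() returns a boolean Series; both endpoints are included by default.
in_feb = tweets["tweet_date"].between("2024-02-01", "2024-02-29")
tweets_feb_2024 = tweets[in_feb]
print(tweets_feb_2024["tweet_id"].tolist())
```

If the column carries a time of day, a timestamp later on 2024-02-29 than midnight falls outside the upper bound, so a bound of "2024-03-01" with a strict comparison may be the safer choice in that case.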

series.isin([])

order[['dishes_id','dishes_name']][order['dishes_name'].isin(['内蒙古烤羊腿','xxx'])]
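
The same filter can be sketched with `.loc`, which selects rows and columns in one step. The `order` table below is hypothetical sample data; the column names follow the snippet above.

```python
import pandas as pd

# Hypothetical order table; column names follow the snippet above.
order = pd.DataFrame({
    "dishes_id": [101, 102, 103],
    "dishes_name": ["内蒙古烤羊腿", "清蒸鱼", "烤鸭"],
})

wanted = ["内蒙古烤羊腿", "烤鸭"]
mask = order["dishes_name"].isin(wanted)

# Rows whose name is in the list, restricted to two columns.
selected = order.loc[mask, ["dishes_id", "dishes_name"]]

# ~mask inverts the filter: rows whose name is NOT in the list.
others = order.loc[~mask, "dishes_name"]
```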

series.str

https://zhuanlan.zhihu.com/p/30894133

https://blog.csdn.net/weixin_43750377/article/details/107979607

.str.cat
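
A small sketch of the two common ways to call `.str.cat`, using made-up Series for illustration:

```python
import pandas as pd

s = pd.Series(["a", "b", "c"])

# With no other Series, cat() joins all elements into a single string.
joined = s.str.cat(sep="-")

# With another Series, cat() concatenates element by element.
other = pd.Series(["1", "2", "3"])
paired = s.str.cat(other, sep="_")
```

Here `joined` is the string `"a-b-c"`, while `paired` is a Series holding `a_1`, `b_2`, `c_3`.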

.str.split
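
A short sketch of `.str.split` on made-up date strings. By default each element becomes a list; `expand=True` spreads the parts into DataFrame columns. The column names assigned below are illustrative.

```python
import pandas as pd

s = pd.Series(["2024-02-01", "2024-02-29"])

# Default: each element becomes a list of parts.
parts = s.str.split("-")

# expand=True returns a DataFrame with one column per part.
cols = s.str.split("-", expand=True)
cols.columns = ["year", "month", "day"]  # illustrative names
```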

.str.findall(r"#\w+")

hashtags = tweets_feb_2024["tweet"].str.findall(r"#\w+")

.str.contains('xxx')

order[['dishes_id','dishes_name']][order['dishes_name'].str.contains('烤')]
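
One detail worth sketching: when the column contains missing values, `.str.contains` returns a missing value for those rows, and such a mask cannot be used directly for boolean indexing. Passing `na=False` treats them as non-matches. The `names` Series is invented sample data.

```python
import pandas as pd

# Hypothetical dish names, including one missing value.
names = pd.Series(["烤鸭", None, "清蒸鱼"])

# na=False makes missing entries count as "no match".
mask = names.str.contains("烤", na=False)
matched = names[mask]
```

The pattern is interpreted as a regular expression by default; `regex=False` switches to a literal substring match.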

.str.extract

tweets["hashtag"] = "#" + tweets["tweet"].str.extract(r"#(\w+)", expand=False)
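
To make the difference between `.str.findall` and `.str.extract` concrete, here is a sketch on two invented tweets: `findall` collects every match into a list, while `extract` keeps only the first match of the capture group and yields NaN when nothing matches.

```python
import pandas as pd

# Hypothetical sample tweets.
tweets = pd.DataFrame({"tweet": ["#pandas is fun #python", "no tags here"]})

# Every match per row, as a list (empty list when there is none).
all_tags = tweets["tweet"].str.findall(r"#\w+")

# First match of the capture group only; expand=False returns a Series.
first_tag = tweets["tweet"].str.extract(r"#(\w+)", expand=False)
```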

Series.dt

.dt.strftime

tweets = tweets[tweets["tweet_date"].dt.strftime("%Y%m") == "202402"]
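
A sketch showing that `.dt.strftime` produces strings, alongside an equivalent filter built from the `.dt.year` and `.dt.month` accessors. The sample dates are invented.

```python
import pandas as pd

# Hypothetical sample data.
tweets = pd.DataFrame({
    "tweet_date": pd.to_datetime(["2024-01-15", "2024-02-10", "2024-02-28"]),
})

# strftime formats each timestamp as a string such as "202402".
ym = tweets["tweet_date"].dt.strftime("%Y%m")
by_string = tweets[ym == "202402"]

# The same rows selected by comparing integer date parts.
by_parts = tweets[
    (tweets["tweet_date"].dt.year == 2024)
    & (tweets["tweet_date"].dt.month == 2)
]
```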

A Series can be iterated

Looping over a Series visits one element at a time, so each row's value can be printed:
hashtags = tweets_feb_2024["tweet"].str.findall(r"#\w+")
for i in hashtags:
    print(i)
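
A small sketch, on a made-up Series, of what iteration yields: a plain `for` loop sees the values only, while `.items()` also provides the index label.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Plain iteration yields the values only.
values = [v for v in s]

# items() yields (index label, value) pairs.
pairs = [(label, value) for label, value in s.items()]
```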

.value_counts()

Counts how many times each distinct value occurs in the Series.
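
A minimal sketch on invented data. The result is itself a Series, indexed by the distinct values and sorted by count in descending order; `normalize=True` returns proportions instead of counts.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", "c", "a", "b"])

# Counts per distinct value, most frequent first.
counts = s.value_counts()

# Relative frequencies instead of absolute counts.
shares = s.value_counts(normalize=True)
```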

.reset_index()
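
A sketch of the two typical uses, on a made-up Series: turning the index into a regular column (giving a DataFrame), or discarding the index and renumbering from zero. The names `letter` and `count` are illustrative.

```python
import pandas as pd

s = pd.Series([3, 2, 1], index=["a", "b", "c"])

# Name the index and the values explicitly so the resulting
# column names do not depend on defaults.
df = s.rename_axis("letter").reset_index(name="count")

# drop=True discards the old index and returns a Series indexed 0..n-1.
renumbered = s.reset_index(drop=True)
```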

any(series)
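
A sketch on an invented boolean Series. Python's built-in `any` works because a Series is iterable; the `.any()` and `.all()` methods give the same kind of answer without a Python-level loop.

```python
import pandas as pd

mask = pd.Series([False, True, False])

# Built-in any() iterates over the values.
builtin_result = any(mask)

# Series methods for the same question.
has_true = mask.any()
all_true = mask.all()
```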

.max()
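
A short sketch on made-up numbers. Missing values are skipped by default, and `.idxmax()` returns the index label of the maximum rather than the value itself.

```python
import pandas as pd

s = pd.Series([3.0, 7.5, None, 1.2], index=["a", "b", "c", "d"])

# NaN is ignored by default (skipna=True).
largest = s.max()

# Index label at which the maximum occurs.
where = s.idxmax()
```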

.apply()

# Add new columns:
df['salary_level'] = df['salary'].apply(lambda x: 'high' if x > 12000 else 'mid' if x > 9000 else 'low')
df['age_group'] = pd.cut(df['age'], bins=[0, 30, 35, 100], labels=['young', 'middle-aged', 'senior'])
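
For readability, the nested conditional can be moved into a named function. The sketch below assumes a hypothetical `df` with `salary` and `age` columns; the sample rows and the English labels are invented for illustration. Note that `pd.cut` bins are right-inclusive by default, so an age of 30 falls into the first bin.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"salary": [8000, 9500, 15000], "age": [25, 33, 40]})

def salary_level(x):
    """Map a salary to a coarse level using the thresholds above."""
    if x > 12000:
        return "high"
    if x > 9000:
        return "mid"
    return "low"

# apply() calls the function once per element.
df["salary_level"] = df["salary"].apply(salary_level)

# pd.cut bins numeric values into labeled intervals: (0, 30], (30, 35], (35, 100].
df["age_group"] = pd.cut(
    df["age"], bins=[0, 30, 35, 100], labels=["young", "middle-aged", "senior"]
)
```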