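1. Introduction to requests and sending requests
1. Introducing the requests module and sending requests
2. requests request parameters and handling response data
3. Handling cookies and sessions with the requests module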
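1.1 Introduction to the requests module
requests is a third-party Python module for sending network requests. It is commonly used in web crawlers and covers what is needed for testing HTTP-based interfaces.
Installing the requests module:
# Install:
pip install requests
# Verify:
pip show requests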
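1.2 Sending requests with the requests module
(1). Sending a simple GET request
# Import the module
import requests
# Send the request
response = requests.get("http://www.baidu.com")
# Inspect the response
print("Original encoding of the data:", response.encoding)  # encoding of the response data
print("Response data before setting the encoding:", response.text)  # response data as text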
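(2). Notes on the GET request
# Notes
1. The first argument to get() is the URL, i.e. the address the request is sent to.
2. requests.get() sends a GET request. To send another kind of request, replace get with the corresponding method, e.g. requests.post().
3. response is the object returned by the request. Because it is an object, its contents are read through its attributes and methods (for example response.text or response.json()).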
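As a quick offline illustration of note 2, the sketch below builds requests for several HTTP methods without sending them. requests.Request(...).prepare() is used here only so that the example runs without network access, and the helper name build_request is made up for this illustration.

```python
import requests


def build_request(method, url):
    """Build (but do not send) a request, so no network access is needed."""
    return requests.Request(method=method, url=url).prepare()


# requests.get(url), requests.post(url), ... are shortcuts for these methods.
for method in ("GET", "POST", "PUT", "DELETE"):
    prepared = build_request(method, "http://www.baidu.com")
    print(prepared.method, prepared.url)
```

Only the method name changes between the calls; the URL argument is handled the same way for each of them.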
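(3). Practice exercise
# Task:
Request the URL http://you.163.com/item/list?categoryId=1013001&_stat_area=nav_5&_stat_referer=index, get the response data, and extract the product information from it.
Hint: the product information is embedded in the response data. Convert the response to text and extract it with a regular expression. A fragment of the response looks like this: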
var json_Data={"currentCategory":{"showIndex":4,"superCategoryId":0,"level":"L1","bannerUrl":"https://yanxuan.nosdn.127.net/25121bcedec1443f80caa7b9abcbe889.png","name":"个护清洁","subCateGroupList":[],"iconUrl":"https://yanxuan.nosdn.127.net/a6dcd39065e12767ec099cf37b65f000.png","id":1013001,"frontDesc":"亲肤之物,严选天然","subCateList":[],"frontName":""},"deliveryAreaList":[{"count":293,"name":"香港","id":1},{"count":293,"name":"澳门","id":2},{"count":251,"name":"台湾","id":3},{"count":2,"name":"美洲","id":7}],"focusList":[],"categoryItemList":[{"itemList":[{"layawayList":[],"promId":0,"scenePicUrl":"https://yanxuan-item.nosdn.127.net/f3af9b6eaf15e6e3be1f14e14da90930.jpg","itemTagList":[],"rank":0,"id":1036003,"sellVolume":993,"primaryPicUrl":"https://yanxuan-item.nosdn.127.net/bbdbfc9d084e0188af383ff9255a4cc1.png","categoryRank":0,"soldOut":false,"onSaleTime":1466647757483,"picMode":1,"underShelf":false,"status":2,"couponConflict":true,"forbiddenBuy":false,"promotionDesc":"品类狂欢","limitedFlag":204,"pieceNum":0,"hasMoreCouponsShow":false,"itemSizeTableDetailFlag":false,"forbidExclusiveCal":false,"updateTime":1621423082740,"pieceUnitDesc":"件","specialPromTag":"","retailPrice":79.00,"primarySkuPreSellPrice":0,"preLimitFlag":0,"itemPromValid":true,"promTag":"品类狂欢","source":0,"remarkTitle":"","primarySkuPreSellStatus":0,"extraServiceFlag":0,"flashPageLink":"","autoOnsaleTimeLeft":0,"innerData":{},"saleCenterSkuId":0,"pointsStatus":0,"colorNum":0,"showTime":0,"autoOnsaleTime":0,"preemptionStatus":1,"isPreemption":0,"promLogo":{"width":63,"logoUrl":"http://yanxuan.nosdn.127.net/deade6672ad7d23a9024cd31cb684173.png","height":60},"name":"别致养生设计 
猪鬃气垫按摩梳","appExclusiveFlag":false,"itemType":1,"listPicUrl":"https://yanxuan-item.nosdn.127.net/52eaa046dcf297a35767fa35c94da7d6.png","pointsPrice":0,"collectionedByUser":false,"simpleDesc":"猪鬃顺发,按摩减乏","seoTitle":"","newItemFlag":false,"remarkTargetUrl":"","buttonType":0,"primarySkuId":1035006,"displaySkuId":1035006,"productPlace":"","itemSizeTableFlag":false},{"layawayList":[],"promId":0,"scenePicUrl":"https://yanxuan-item.nosdn.127.net/f3c020b5af6221d531c9269e5039f8c1.jpg","itemTagList":[{"itemId":1046008,"tagId":128565465,"freshmanExclusive":false,"name":"品类狂欢","appFreshmanBargain":false,"subType":204,"forbidJump":false,"type":2}],"listPromBanner":...};
# Reference example code:
# Import the modules
import re
import json
import requests
# Send the request
url = 'http://you.163.com/item/list?categoryId=1013001&_stat_area=nav_5&_stat_referer=index'
res = requests.get(url=url)
# Process the response data: capture the JSON object assigned to json_Data
data_json = re.findall(r'var json_Data=({"currentCategory":.*?}]});', res.text)
if data_json:
    # Parse the captured JSON text into a Python dict
    data = json.loads(data_json[0])
else:
    print('No data found!')
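The extraction step can be tried offline on a small stand-in string. The sketch below is not the real page: sample_html is a hypothetical, heavily shortened fragment, and extract_items is a helper name made up here. It also assumes that `};` does not occur inside the embedded JSON itself, which may not hold for every page.

```python
import json
import re

# Hypothetical, heavily shortened stand-in for the page source.
sample_html = (
    '<script>var json_Data={"currentCategory":{"id":1013001,"name":"个护清洁"},'
    '"categoryItemList":[{"itemList":[{"id":1036003,"retailPrice":79.00}]}]};</script>'
)


def extract_items(html):
    """Return the item dicts embedded in `var json_Data=...;`, or [] if absent."""
    match = re.search(r'var json_Data=(\{.*?\});', html, re.S)
    if match is None:
        return []
    data = json.loads(match.group(1))
    # Flatten the items of every category group into one list.
    return [
        item
        for group in data.get("categoryItemList", [])
        for item in group.get("itemList", [])
    ]


items = extract_items(sample_html)
print(items)
```

Returning an empty list when nothing matches keeps the caller free of special cases, at the cost of not distinguishing "no products" from "page layout changed".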
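2. requests request parameters and responses
2.1 The URL parameter and query parameters
The URL parameter is the address to request, given as a string.
For a GET request, the URL can carry query parameters. With the requests module you can write the query parameters directly in the URL, or pass them through the params argument, as shown below:
# Put the query parameters directly in the URL
import requests
url = 'http://www.baidu.com/s?wd=python'
res = requests.get(url=url)
print(res.text)
# Pass the query parameters through params
import requests
url = 'http://www.baidu.com/s'
params = {
    'wd': 'python'
}
res = requests.get(url=url, params=params)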
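To check that the two styles lead to the same request, the sketch below prepares both without sending them and compares the resulting URLs. prepared_url is a helper name made up for this illustration.

```python
import requests


def prepared_url(url, params=None):
    """Return the final URL requests would use, without sending anything."""
    return requests.Request("GET", url, params=params).prepare().url


inline = prepared_url("http://www.baidu.com/s?wd=python")
via_params = prepared_url("http://www.baidu.com/s", params={"wd": "python"})
print(inline)
print(via_params)

# Non-ASCII values are percent-encoded automatically when passed via params.
print(prepared_url("http://www.baidu.com/s", params={"wd": "爬虫"}))
```

Passing a dict through params is usually the safer choice when values contain spaces or non-ASCII characters, since the encoding is handled by the library.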
2.2 The data and json parameters for POST requests
A POST request usually has to send form data or JSON data to the server, so we need to know how each kind of data is passed.
# Send form data
import requests
url = "http://www.laoyu.com/login/"
data = {
    'username': 'laoyu',
    'password': '123456'
}
res = requests.post(url=url, data=data)
# Send JSON data
import requests
url = "http://www.laoyu.com/login/"
data = {
    'username': 'laoyu',
    'password': '123456'
}
res = requests.post(url=url, json=data)
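The difference between data and json shows up in the request body and the Content-Type header. The sketch below prepares both requests without sending them, so it runs offline. The variable names form_req and json_req are illustrative only.

```python
import json

import requests

url = "http://www.laoyu.com/login/"
payload = {"username": "laoyu", "password": "123456"}

# Prepare (but do not send) one request of each kind.
form_req = requests.Request("POST", url, data=payload).prepare()
json_req = requests.Request("POST", url, json=payload).prepare()

print(form_req.headers["Content-Type"])  # form-encoded body
print(form_req.body)
print(json_req.headers["Content-Type"])  # JSON body
print(json.loads(json_req.body))
```

Which of the two a server expects depends on the interface, so it is worth checking the Content-Type that a browser or API document shows for the same call.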
2.3 Setting request headers
import requests
url = 'https://www.baidu.com/'
# Request header information
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.77 Safari/537.36'
}
res = requests.get(url=url, headers=headers)
print(res.text)
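One common reason to set headers is that requests otherwise identifies itself with its own User-Agent, which some sites treat differently from a browser. The sketch below compares the library default with a custom value, without sending anything. The custom string is a shortened placeholder, not a real browser signature.

```python
import requests

url = "https://www.baidu.com/"
custom_headers = {"User-Agent": "Mozilla/5.0 (placeholder browser string)"}

# The User-Agent that requests sends when none is given.
print("default:", requests.utils.default_user_agent())

# Prepare (but do not send) a request carrying the custom header.
prepared = requests.Request("GET", url, headers=custom_headers).prepare()
print("custom: ", prepared.headers["User-Agent"])
```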
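2.4 The three forms of the response and other attributes
# Three ways to read a requests response:
- Text form: response.text
- JSON form: response.json(). Note the parentheses: it is a method, and it raises an error if the body is not valid JSON.
- Binary stream: response.content
# Other attributes of a requests response:
- response.status_code: status code
- response.url: URL of the request
- response.encoding: character encoding taken from the response headers, used to decode response.text
- response.headers: header information
- response.cookies: cookie information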
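The three forms can be compared side by side on a hand-built response object. This is an offline stand-in only: in real use the object comes from requests.get() or similar, and _content is a private attribute that requests normally fills in itself.

```python
import requests

# Hand-built response standing in for a real server reply (illustration only).
response = requests.models.Response()
response.status_code = 200
response.encoding = "utf-8"
response._content = '{"name": "个护清洁", "id": 1013001}'.encode("utf-8")

print(type(response.text), response.text)        # str, decoded with response.encoding
print(type(response.json()), response.json())    # dict, parsed from the body
print(type(response.content), response.content)  # bytes, the raw body
print(response.status_code, response.ok)
```

response.content is the form to use for images and other binary downloads, since decoding such data as text would corrupt it.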
3. Handling cookies and sessions
3.2 How the requests module handles cookies
Because HTTP is stateless, we use cookies to keep state between two requests.
See the example below:
# Cookie handling
import requests
url = 'http://www.baidu.com/'
cookies = "BIDUPSID=6C53432BC579B9F60F5EBDB41E16CEBD; PSTM=1606314482; BAIDUID=6C53432BC579B9F6AF5843643A61C1AD:FG=1; BD_UPN=123253; __yjs_duid=1_4a5b93be115661dc9239f249eabf438b1619354479792; BDUSS=NkZlpsNGRhNUVxaDBRUWs1TE1JMmZ1YkJ3RWdqdi1mMEFsUEctMHRKdHNkTmxnRVFBQUFBJCQAAAAAAAAAAAEAAABwfMtW09rQodPjMDgyMGZyZWUAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGznsWBs57FgSz; BDUSS_BFESS=NkZlpsNGRhNUVxaDBRUWs1TE1JMmZ1YkJ3RWdqdi1mMEFsUEctMHRKdHNkTmxnRVFBQUFBJCQAAAAAAAAAAAEAAABwfMtW09rQodPjMDgyMGZyZWUAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGznsWBs57FgSz; BDORZ=B490B5EBF6F3CD402E515D22BCDA1598; H_PS_PSSID=31660_33848_33607_22158; BAIDUID_BFESS=6C53432BC579B9F6AF5843643A61C1AD:FG=1; delPer=0; BD_CK_SAM=1; PSINO=2; COOKIE_SESSION=2_1_5_4_7_17_0_1_5_2_5_0_64105_0_8_0_1623114683_1623050552_1623114675%7C9%23881404_20_1623050545%7C9; BD_HOME=1; sugstore=0; Hm_lvt_aec699bb6442ba076c8981c6dc490771=1622390556,1622455056,1622774170,1623133058; Hm_lpvt_aec699bb6442ba076c8981c6dc490771=1623133058; BDSVRTM=0"
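# The cookies parameter expects a dict or CookieJar, not a raw header string,
# so split the string into name/value pairs first
cookies = dict(pair.split("=", 1) for pair in cookies.split("; "))
res = requests.get(url=url, cookies=cookies)
print(res.text)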
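The cookies parameter of requests takes a dict (or a CookieJar), whereas browsers show cookies as one "name=value; name=value" string. A minimal sketch of the conversion follows. parse_cookie_string is a helper name made up here, and the cookie values are short placeholders rather than real ones.

```python
import requests


def parse_cookie_string(cookie_string):
    """Turn 'a=1; b=2' (the format of a browser's Cookie header) into a dict."""
    pairs = (
        item.strip().split("=", 1)
        for item in cookie_string.split(";")
        if "=" in item
    )
    return {name: value for name, value in pairs}


cookies = parse_cookie_string("BD_HOME=1; sugstore=0; BAIDUID=EXAMPLE:FG=1")
print(cookies)

# Prepare (but do not send) a request to see the Cookie header it would carry.
prepared = requests.Request("GET", "http://www.baidu.com/", cookies=cookies).prepare()
print(prepared.headers["Cookie"])
```

Splitting on the first "=" only matters because cookie values may themselves contain "=" characters.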
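3.3 Keeping cookies automatically with the Session class
When several interfaces need to share state, passing the cookies parameter on every call is tedious. In that case we can use the Session class provided by requests, which stores cookies automatically and keeps state across all requests sent through it. Note, however, that once you use a Session, every request must go through it: a request sent directly with requests.get() or requests.post() does not carry the session's cookies, so the state is lost.
# Session state persistence example: http://www.2552.com.cn/e/member/doaction.php http://www.2552.com.cn/e/member/login/
from requests import Session
session = Session()
# Log in through the session so that it stores the cookies set by the server
url_login = 'http://www.2552.com.cn/e/member/doaction.php'
data = {
    'ecmsfrom': '',
    'enews': 'login',
    'tobind': '0',
    'username': 'Jeremy',
    'password': '123456',
    'lifetime': '0',
    'Submit': '登 录'
}
session.post(url=url_login, data=data)
# Request a page that requires login, using the same session
url_mine = 'http://www.2552.com.cn/e/member/cp/'
res = session.get(url=url_mine)
# Save the returned page to a local file
with open('./mypage.html', 'w', encoding='utf8') as f:
    f.write(res.text)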
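Why requests sent outside the session lose the state can be seen offline with the sketch below. The cookie name sessionid and its value are hypothetical, set by hand here in place of a real login response, and nothing is actually sent.

```python
import requests

url_mine = "http://www.2552.com.cn/e/member/cp/"

session = requests.Session()
# Stand-in for a cookie the server would set after a successful login.
session.cookies.set("sessionid", "abc123")

# A request prepared through the session carries the stored cookie ...
with_session = session.prepare_request(requests.Request("GET", url_mine))
# ... while a request built on its own knows nothing about it.
standalone = requests.Request("GET", url_mine).prepare()

print(with_session.headers.get("Cookie"))
print(standalone.headers.get("Cookie"))
```

The second request has no Cookie header at all, which is what happens when requests.get() is called directly in the middle of a session-based flow.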