The httpx Library

Introduction

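  • httpx is a next-generation HTTP client library for Python (see the official documentation)
    • Broadly compatible with the Requests API
    • A fully featured HTTP client for Python 3
    • Can send both synchronous and asynchronous requests
    • Supports HTTP/1.1 and HTTP/2
    • Can send requests directly to WSGI or ASGI applications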

Installation

# Install with pip (the quotes keep shells such as zsh from expanding the brackets)
pip install "httpx[http2]"
# Install with python -m pip
python -m pip install "httpx[http2]"

Usage

  • Sending requests

# Import the library
import httpx

# Enable HTTP/2 support (only available when requests go through a Client)
client = httpx.Client(http2=True)
# Send the requests through the client so that HTTP/2 is actually used
r = client.get('https://httpbin.org/get')
r = client.post('https://httpbin.org/post', data={'key': 'value'})
r = client.put('https://httpbin.org/put', data={'key': 'value'})
r = client.delete('https://httpbin.org/delete')
r = client.head('https://httpbin.org/get')
r = client.options('https://httpbin.org/get')
# Release the connection pool when finished
client.close()
  • Requests with headers and query parameters

import httpx

headers = {'user-agent': 'my-app/1.0.0'}
params = {'key1': 'value1', 'key2': 'value2'}
url = 'https://httpbin.org/get'
r = httpx.get(url, headers=headers, params=params)
print(r)
print(r.status_code)  # HTTP status code
print(r.encoding)  # text encoding used to decode the body
print(r.text)
print(r.json())
  • Requests with cookies

import httpx

url = 'http://httpbin.org/cookies'
cookies = {'color': 'green'}
r = httpx.get(url, cookies=cookies)
print(r.json())  # {'cookies': {'color': 'green'}}
  • Setting a timeout

import httpx

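try:
    # A 1 ms timeout is almost certain to expire before the server responds
    r = httpx.get('http://httpbin.org', timeout=0.001)
    print(r)
except httpx.TimeoutException as exc:
    print(f'Request timed out: {exc!r}')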
  • Sending requests with a Client

import httpx

with httpx.Client() as client:
    headers = {'X-Custom': 'value'}
    r = client.get('https://example.com', headers=headers)
    print(r.text)
  • HTTP proxies

import httpx

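# Note: the `proxies` argument was deprecated in httpx 0.26 and removed in 0.28;
# newer versions use `proxy=` or `mounts=` instead.
proxies = {
    'http://': 'http://localhost:8080',  # proxy for plain HTTP requests
    'https://': 'http://localhost:8081',  # proxy for HTTPS requests
}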
url = 'https://example.com'
with httpx.Client(proxies=proxies) as client:
    r1 = client.get(url)
    print(r1)

The Parsel Library

Introduction

Installation

# Install with pip
pip install parsel
# Install with python -m pip
python -m pip install parsel

Usage

  • Initialization

# Import the libraries
import httpx
from parsel import Selector

# Fetch the page
html = httpx.get('https://ssr1.scrape.center').text

# Create a Selector object from the HTML document
selector = Selector(text=html)
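  • CSS selectors

# Assumption: each entry on the page is wrapped in an element with class "item"
items = selector.css('.item')
for item in items:
    # First text node nested inside the element with class "name"
    text = item.css('.name ::text').get()
    print(text)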
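  • XPath selectors

# Assumption: each entry on the page is wrapped in an element with class "item"
items = selector.css('.item')
for item in items:
    # The leading "." keeps the search inside the current item
    text = item.xpath('.//a[contains(@class, "name")]//text()').get()
    print(text)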

The xlwings Library

Introduction

Installation

# Install with pip
pip install xlwings
# Install with python -m pip
python -m pip install xlwings