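Beginner question about scraping an asynchronously loaded page: I found the URL of one of the page's XHR requests and it opens in a browser, but I cannot fetch it with requests.get, which returns a 404 error. Is it my request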
2020-11-20
import requests

# Send a desktop Chrome user-agent so the request looks like a normal browser.
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36'}
# XHR URL copied from the browser's network panel.
url = 'https://www.amazon.com/profilewidget/timeline/visitor?nextPageToken=&filteredContributionTypes=productreview%2Cglimpse%2Cideas&directedId=amzn1.account.AHA5SZDDFWIEZQTBKI2AYSUNEZGQ&token=eyJhbGciOiJIUzI1NiJ9.eyJzZXNzaW9uSWQiOiIxNDYtMTE0NDk0OS0wMzAxNTU1IiwicmVxdWVzdGVyRGlyZWN0ZWRJZCI6ImFtem4xLmFjY291bnQuQUVaN0hFSVBBMlFUUExIR0dZTFM1RUY1WE9MUSIsImV4cCI6MTYwNTg2OTQ4MCwiZGlyZWN0ZWRJZCI6ImFtem4xLmFjY291bnQuQUhBNVNaRERGV0lFWlFUQktJMkFZU1VORVpHUSJ9.6kpdCXyNh52Yy_WfZHHSO4U7VYVBTkxLX5Fc2clkWXQ'
r = requests.get(url, headers=headers)
print(r.status_code)
荞荞
Posted on 2020-12-17 10:07:54
wang910129
Posted on 2020-12-17 10:16:46
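Bro, is this Amazon? Take a look at your URL: it doesn't even open in a browser. Post the link to the page you want to scrape so everyone can take a look.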
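
One possible explanation for why the URL no longer opens, not confirmed anywhere in this thread: the token parameter in the posted URL has the three dot-separated parts of a JWT, and a JWT payload usually carries an exp claim (expiry as a Unix timestamp). The sketch below decodes that payload with the standard library only and does not verify the signature. The helper name decode_jwt_payload is made up for illustration, and whether the server really answers an expired token with 404 is an assumption.

```python
import base64
import json
from datetime import datetime, timezone

# Token copied from the `token` query parameter of the URL in the question.
TOKEN = (
    "eyJhbGciOiJIUzI1NiJ9."
    "eyJzZXNzaW9uSWQiOiIxNDYtMTE0NDk0OS0wMzAxNTU1IiwicmVxdWVzdGVyRGlyZWN0ZWRJZCI6"
    "ImFtem4xLmFjY291bnQuQUVaN0hFSVBBMlFUUExIR0dZTFM1RUY1WE9MUSIs"
    "ImV4cCI6MTYwNTg2OTQ4MCwiZGlyZWN0ZWRJZCI6"
    "ImFtem4xLmFjY291bnQuQUhBNVNaRERGV0lFWlFUQktJMkFZU1VORVpHUSJ9"
    ".6kpdCXyNh52Yy_WfZHHSO4U7VYVBTkxLX5Fc2clkWXQ"
)


def decode_jwt_payload(token):
    """Return the JWT payload as a dict. The signature is NOT verified."""
    payload_b64 = token.split(".")[1]
    # base64url encoding drops the trailing '=' padding; restore it first.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


claims = decode_jwt_payload(TOKEN)
# Assumption: `exp` is a Unix timestamp in seconds, as is conventional for JWTs.
expires_at = datetime.fromtimestamp(claims["exp"], tz=timezone.utc)
print("token expires at:", expires_at.isoformat())
print("already expired:", expires_at < datetime.now(timezone.utc))
```

If the token has expired, copying the XHR URL again from a fresh browser session (and sending the same cookies the browser sent) would be the next thing to try before changing anything else in the request.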