新手的异步爬虫的问题:爬取一个异步加载的网页,找到了其中的XHR文件的链接,可以在浏览器打开,但是无法用requests.get爬取,返回404错误,是我的请求头有问题吗,我有添加简单的user-agent和cookies参数 无合适标签 未标注

zzzm 6天前 39

import requests 


headers = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36'    } url='https://www.amazon.com/profilewidget/timeline/visitor?nextPageToken=&filteredContributionTypes=productreview%2Cglimpse%2Cideas&directedId=amzn1.account.AHA5SZDDFWIEZQTBKI2AYSUNEZGQ&token=eyJhbGciOiJIUzI1NiJ9.eyJzZXNzaW9uSWQiOiIxNDYtMTE0NDk0OS0wMzAxNTU1IiwicmVxdWVzdGVyRGlyZWN0ZWRJZCI6ImFtem4xLmFjY291bnQuQUVaN0hFSVBBMlFUUExIR0dZTFM1RUY1WE9MUSIsImV4cCI6MTYwNTg2OTQ4MCwiZGlyZWN0ZWRJZCI6ImFtem4xLmFjY291bnQuQUhBNVNaRERGV0lFWlFUQktJMkFZU1VORVpHUSJ9.6kpdCXyNh52Yy_WfZHHSO4U7VYVBTkxLX5Fc2clkWXQ' 


r = requests.get(url, headers=headers) print(r.status_code)


最新回复 (0)
返回