[Scrapy-6] A pitfall when using XPath

鸢公子 posted on 2020-2-26 12:12:45

I came across this in a blog post last night. It all seems fine, but it locks you into the author's way of thinking:

https://www.jianshu.com/p/e56e94e387f9
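import scrapy
from scrapy.selector import Selector


class QuoteSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        "http://quotes.toscrape.com/"
    ]

    # Three alternative versions of parse() follow for comparison. Python keeps
    # only the last definition in a class body, so keep just one in a real spider.

    # Version 1 (the pitfall): "//" searches the whole document, not the current quote.
    def parse(self, response):
        quotes = response.xpath("//div[@class='quote']")  # split into a list, one entry per quote
        for quote in quotes:
            print(quote.xpath("//span[@class='text']/text()").extract_first())  # but this does not come from the current segment

    # Version 2: extract each quote as an HTML string, then re-parse it.
    def parse(self, response):
        quotes = response.xpath("//div[@class='quote']").extract()  # pull the segments out as strings
        for quote in quotes:
            print(Selector(text=quote).xpath("//span[@class='text']/text()").extract_first())  # then query each one again

    # Version 3: relative XPath with a leading ".".
    def parse(self, response):
        for div in response.xpath("//div[@class='quote']"):  # I prefer it this way
            print(div.xpath(".//span[@class='text']/text()").get())  # both simple and comfortable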
The idea of extracting segment by segment is the same as always; XPath syntax is just a real hassle.
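
For anyone who wants to see the scoping difference without running a spider, here is a minimal sketch using only the standard library's xml.etree.ElementTree instead of Scrapy's selectors. The sample markup is made up for illustration. ElementTree implements only a subset of XPath, so the document-wide lookup from the root here merely imitates the effect of calling `//span[...]` on a sub-selector in Scrapy; it is not Scrapy's own API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, loosely modelled on quotes.toscrape.com.
HTML = """
<html><body>
  <div class="quote"><span class="text">first quote</span></div>
  <div class="quote"><span class="text">second quote</span></div>
  <div class="quote"><span class="text">third quote</span></div>
</body></html>
""".strip()

root = ET.fromstring(HTML)
quotes = root.findall(".//div[@class='quote']")

# Document-wide search repeated on every pass: this imitates the pitfall,
# where the query ignores the current div and starts from the whole page.
document_wide = [root.find(".//span[@class='text']").text for _ in quotes]

# Search scoped to the current div: the leading "." makes the path relative.
scoped = [div.find(".//span[@class='text']").text for div in quotes]

print(document_wide)
print(scoped)
```

The first list repeats the same quote three times, while the second has one distinct quote per div, which is the behaviour the `.//` version of the spider relies on.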


夜幕爬虫安全论坛|爬虫论坛's Archiver
