Q:

How to click the Read more link in the first review on tripadvisor using Selenium and Python

I'm using SelectorGadget to get the XPath of the "Read more" button in the first review on this website.

This is the XPath it gives me:

//*[contains(concat( " ", @class, " " ), concat( " ", "Z", " " ))]

This is the first part of the code I'm using:

import selenium
import csv #This package lets us save data to a csv file
from selenium import webdriver #The Selenium package we'll need
import time #This package lets us pause execution for a bit
from selenium.webdriver.common.by import By

path_to_file = "/Users/user/Desktop/HotelReviews.csv"

pages_to_scrape = 3

url = "https://www.tripadvisor.com/Hotel_Review-g60982-d209422-Reviews-Hilton_Waikiki_Beach-Honolulu_Oahu_Hawaii.html"

# open the file to save the review
csvFile = open(path_to_file, 'a', encoding="utf-8")
csvWriter = csv.writer(csvFile)

for i in range(0, pages_to_scrape):
    
    driver = webdriver.Chrome()
    driver.get("url")
    # give the DOM time to load
    time.sleep(2) 
    driver.find_element_by_xpath("//*[contains(concat( " ", @class, " " ), 
    concat( " ", "Z", " " ))], 'Read more')]").click()

This is the error I get:

File "/var/folders/6c/jpl964752rv_72zjclrp_8ym0000gn/T/ipykernel_24978/2812702568.py", line 8
    driver.find_element_by_xpath("//*[contains(concat( " ", @class, " " ), concat( " ", "Z", " " ))], 'Read more')]").click()
                                                                                         ^
SyntaxError: invalid syntax

It looks like the quotation marks are causing the problem.

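So I followed this suggestion. I tried putting the code in a variable, but it produced the same error. I tried removing the extra quotes: same error. I tried removing the spaces between the quotes: same error.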

I tried a different XPath, one for the whole review, //*[contains(concat( " ", @class, " " ), concat( " ", "F1", " " ))], and got the same error.

Then I tried adjusting the quotes on the first XPath:

driver.find_element_by_xpath("//*[contains(concat( " ", @class, " " ), 
    concat( " ", "Z", " " ))]", "Read more")]).click()

That resulted in the same error.

A:

To click() on the Read more link in the first review on the tripadvisor website, you need to induce WebDriverWait for element_to_be_clickable(), and you can use the following locator strategy:

  • Using XPATH:

    driver.get('https://www.tripadvisor.com/Hotel_Review-g60982-d209422-Reviews-Hilton_Waikiki_Beach-Honolulu_Oahu_Hawaii.html')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@placeholder='Search reviews']//following::div[@data-test-target='HR_CC_CARD']//span[text()='Read more']"))).click()
    
  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Browser snapshot:

<img alt="Read less" src="https://i.stack.imgur.com/yfaAF.png"/>

A:

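The basic problem is that while, for example, a[x="3"] is a valid XPath expression, you can't put it into a Python string literal as "a[x="3"]" without escaping the quotes. I'm not a Python user, but in most languages you would write "a[x=\"3\"]"; alternatively, single and double quotes are interchangeable in XPath, so you can write "a[x='3']".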
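In Python specifically, the quoting options can be sketched as follows, using the XPath from the question. The variable names are hypothetical. The first two forms produce exactly the same string; the third is a different string that XPath treats as an equivalent expression.

```python
# Option 1: delimit the Python string with single quotes, so the double
# quotes inside the XPath need no escaping.
xpath_single = '//*[contains(concat( " ", @class, " " ), concat( " ", "Z", " " ))]'

# Option 2: keep double-quote delimiters and escape every inner double quote.
xpath_escaped = "//*[contains(concat( \" \", @class, \" \" ), concat( \" \", \"Z\", \" \" ))]"

# Option 3: switch the XPath itself to single quotes, which XPath accepts
# interchangeably with double quotes for string literals.
xpath_inner_single = "//*[contains(concat( ' ', @class, ' ' ), concat( ' ', 'Z', ' ' ))]"

# Options 1 and 2 are the same Python string.
assert xpath_single == xpath_escaped

# Option 3 differs only in which quote character appears inside the XPath.
assert xpath_inner_single.replace("'", '"') == xpath_single
```

Any of these strings can then be passed to the locator call. In recent Selenium 4 releases that would be `driver.find_element(By.XPATH, xpath_single)`, since the `find_element_by_xpath` helper used in the question is no longer available there.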
