SCRIPT & CODE EXAMPLE
 

PYTHON

web crawler using python

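import requests
from bs4 import BeautifulSoup  # the 'lxml' parser must be installed, but needs no explicit import

url = "https://www.rottentomatoes.com/top/bestofrt/"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36 QIHU 360SE'
}

# Fetch the listing page and collect every link inside the results table.
# The tag and class names below depend on the site's markup and may need updating.
f = requests.get(url, headers=headers)
movies_lst = []
soup = BeautifulSoup(f.content, 'lxml')
movies = soup.find('table', {'class': 'table'}).find_all('a')

num = 0
for anchor in movies:
    urls = 'https://www.rottentomatoes.com' + anchor['href']
    movies_lst.append(urls)
    num += 1
    movie_url = urls
    # Fetch each movie page and look up its synopsis block
    movie_f = requests.get(movie_url, headers=headers)
    movie_soup = BeautifulSoup(movie_f.content, 'lxml')
    movie_content = movie_soup.find('div', {
        'class': 'movie_synopsis clamp clamp-6 js-clamp'
    })
    # get_text() avoids the None that .string returns for nested tags;
    # the synopsis div may be missing, so guard against that as well
    info = movie_content.get_text(strip=True) if movie_content else 'N/A'
    print(num, urls, '\n', 'Movie:' + anchor.get_text(strip=True))
    print('Movie info:' + info)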
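
The snippet above builds each movie URL by concatenating the site root with the link's href, which only works when every href is a root-relative path. A minimal sketch of a more tolerant alternative, using the standard library's urljoin, is shown here. The helper name absolute_url and the sample href values are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.rottentomatoes.com"


def absolute_url(href, base=BASE_URL):
    """Resolve an href (relative or already absolute) against the base URL."""
    return urljoin(base, href)


# Hypothetical href values, for illustration only
print(absolute_url("/m/example_movie"))
print(absolute_url("https://example.com/other"))
```

In the crawler loop, absolute_url(anchor['href']) could stand in for the string concatenation. An href that is already a full URL is then left unchanged rather than having the site root prepended to it.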

python web crawler

import scrapy

class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://blog.scrapinghub.com']

    def parse(self, response):
        for title in response.css('.post-header>h2'):
            yield {'title': title.css('a ::text').get()}

        for next_page in response.css('a.next-posts-link'):
            yield response.follow(next_page, self.parse)