import requests
from bs4 import BeautifulSoup  # the 'lxml' parser used below also requires the lxml package to be installed

# Scrape the Rotten Tomatoes top movies list.
url = "https://www.rottentomatoes.com/top/bestofrt/"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36 QIHU 360SE'
}

f = requests.get(url, headers=headers)
movies_lst = []
soup = BeautifulSoup(f.content, 'lxml')

# Each ranked movie is linked from an <a> tag inside the ranking table.
movies = soup.find('table', {'class': 'table'}).find_all('a')

num = 0
for anchor in movies:
    # Build the absolute URL of the movie's detail page.
    urls = 'https://www.rottentomatoes.com' + anchor['href']
    movies_lst.append(urls)
    num += 1

    # Fetch the detail page and locate the synopsis block.
    movie_url = urls
    movie_f = requests.get(movie_url, headers=headers)
    movie_soup = BeautifulSoup(movie_f.content, 'lxml')
    movie_content = movie_soup.find('div', {'class': 'movie_synopsis clamp clamp-6 js-clamp'})

    print(num, urls, '\n', 'Movie: ' + anchor.string.strip())
    # Guard against pages where the synopsis element is missing or has nested markup.
    if movie_content and movie_content.string:
        print('Movie info: ' + movie_content.string.strip())
import scrapy

# A minimal Scrapy spider: it collects blog post titles and follows the
# "next posts" pagination link so the whole blog gets crawled.
class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://blog.scrapinghub.com']

    def parse(self, response):
        # Extract the title text of every post header on the page.
        for title in response.css('.post-header>h2'):
            yield {'title': title.css('a ::text').get()}

        # Queue a request for the next page and parse it with this same method.
        for next_page in response.css('a.next-posts-link'):
            yield response.follow(next_page, self.parse)
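
# A quick way to try the spider above without creating a full Scrapy project
# (assuming Scrapy is installed and this code is saved as, say, blogspider.py)
# is Scrapy's runspider command, which runs a single-file spider and can write
# the yielded items to a feed file:
#
#   scrapy runspider blogspider.py -o titles.json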