How to send data from a Scrapy pipeline to MongoDB

# In your pipeline

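import dj_mongo_database_url
from scrapy.utils.project import get_project_settings

settings = get_project_settings()

class EPGD_pipeline(object):
    def __init__(self):
        # One collection per spider name, built from the URLs in settings.py.
        # setup_db_connection is not shown in this snippet; it is expected to
        # return a collection object for the parsed URL.
        self.collections = {
            spider_name: self.setup_db_connection(dj_mongo_database_url.parse(url))
            for spider_name, url in settings['MONGODB_PIPELINE_SETTINGS'].items()
        }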

    def process_item(self, item, spider):
        collection = self.collections[spider.name]
        ...


# In settings.py

MONGODB_PIPELINE_SETTINGS = {
    "GenDis": "mongodb://myhost:29297/test_db/collection",
    "EPGD": "mongodb://myhost:29297/test_db/collection2",
}
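
The pipeline above relies on a `setup_db_connection` helper that the snippet does not show. As a rough sketch of what such a helper would need from each URL, the following splits a URL of the `mongodb://host:port/database/collection` shape used in the settings into its parts, using only the standard library. The name `split_mongo_url` and the returned keys are assumptions made for illustration; they are not part of `dj_mongo_database_url` or pymongo.

```python
from urllib.parse import urlsplit


def split_mongo_url(url):
    """Split 'mongodb://host:port/database/collection' into its parts.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme != "mongodb":
        raise ValueError("expected a mongodb:// URL, got %r" % url)
    # The path looks like '/test_db/collection'; drop empty segments.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2:
        raise ValueError("expected /database/collection in %r" % url)
    database, collection = segments
    return {
        "host": parts.hostname,
        "port": parts.port or 27017,  # fall back to MongoDB's default port
        "database": database,
        "collection": collection,
    }


if __name__ == "__main__":
    print(split_mongo_url("mongodb://myhost:29297/test_db/collection"))
```

With these parts in hand, a real helper could plausibly return `pymongo.MongoClient(host, port)[database][collection]`, though how the original author wired it up is not stated.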

# In settings.py (single-database setup)

    BOT_NAME = 'capstone'

    SPIDER_MODULES = ['capstone.spiders']
    NEWSPIDER_MODULE = 'capstone.spiders'

    ITEM_PIPELINES = {'capstone.pipelines.MongoDBPipeline': 300,}
    MONGO_URI = 'mongodb://localhost:27017'
    MONGO_DATABASE = 'congress'
    ROBOTSTXT_OBEY = True
    DOWNLOAD_DELAY = 10

# In pipelines.py

    import pymongo

    class MongoDBPipeline(object):
        # Collection that receives every scraped item
        collection_name = 'members'

        def __init__(self, mongo_uri, mongo_db):
            self.mongo_uri = mongo_uri
            self.mongo_db = mongo_db

        @classmethod
        def from_crawler(cls, crawler):
            # Read the connection details from settings.py
            return cls(
                mongo_uri=crawler.settings.get('MONGO_URI'),
                mongo_db=crawler.settings.get('MONGO_DATABASE', 'items')
            )

        def open_spider(self, spider):
            # Connect once when the spider starts
            self.client = pymongo.MongoClient(self.mongo_uri)
            self.db = self.client[self.mongo_db]

        def close_spider(self, spider):
            self.client.close()

        def process_item(self, item, spider):
            # insert_one replaces the deprecated Collection.insert
            self.db[self.collection_name].insert_one(dict(item))
            return item
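
To see the order in which Scrapy calls the hooks of a pipeline like `MongoDBPipeline`, here is a minimal sketch that runs the same lifecycle by hand: `from_crawler`, then `open_spider`, then `process_item` for each item, then `close_spider`. It runs without Scrapy or a MongoDB server. `FakeSettings`, `FakeCrawler`, `InMemoryPipeline`, and `run_lifecycle` are all hypothetical stand-ins invented for this example, and a plain dict takes the place of the database.

```python
class FakeSettings:
    """Stand-in for crawler.settings; only .get() is needed here."""

    def __init__(self, values):
        self._values = values

    def get(self, name, default=None):
        return self._values.get(name, default)


class FakeCrawler:
    """Stand-in for the crawler object passed to from_crawler."""

    def __init__(self, values):
        self.settings = FakeSettings(values)


class InMemoryPipeline:
    """Same hook methods as MongoDBPipeline, but items go into a dict."""

    collection_name = "members"

    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

    @classmethod
    def from_crawler(cls, crawler):
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI"),
            mongo_db=crawler.settings.get("MONGO_DATABASE", "items"),
        )

    def open_spider(self, spider):
        self.db = {}  # a real pipeline would open the MongoDB client here

    def close_spider(self, spider):
        self.closed = True  # a real pipeline would close the client here

    def process_item(self, item, spider):
        self.db.setdefault(self.collection_name, []).append(dict(item))
        return item


def run_lifecycle(items):
    """Call the hooks in the order Scrapy does for one spider run."""
    crawler = FakeCrawler({
        "MONGO_URI": "mongodb://localhost:27017",
        "MONGO_DATABASE": "congress",
    })
    pipeline = InMemoryPipeline.from_crawler(crawler)
    pipeline.open_spider(spider=None)
    for item in items:
        pipeline.process_item(item, spider=None)
    pipeline.close_spider(spider=None)
    return pipeline
```

Calling `run_lifecycle([{"name": "A"}])` leaves the item under the `members` key of `pipeline.db`, which mirrors where the real pipeline would write it in the `congress` database.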