Verify SPIDER_MANAGER_CLASS interface while loading it in CrawlerRunner by curita · Pull Request #1148 · scrapy/scrapy · GitHub


Merged · 1 commit into scrapy:master · Apr 13, 2015

Conversation

@curita (Member) commented Apr 10, 2015

Discussed in #873.

pablohoffman added a commit that referenced this pull request Apr 13, 2015
Verify SPIDER_MANAGER_CLASS interface while loading it in CrawlerRunner
@pablohoffman pablohoffman merged commit 71c0afa into scrapy:master Apr 13, 2015
@curita curita added this to the Scrapy 1.0 milestone Apr 14, 2015
@curita curita deleted the verify-spidermanager-interface branch April 14, 2015 14:11
@kmike (Member) commented Apr 16, 2015

This change broke some SpiderManager implementations that don't implement the find_by_request method. find_by_request is only useful for the scrapy fetch, scrapy parse and scrapy shell commands; crawling should work fine without it.
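To illustrate kmike's point that crawling never touches find_by_request, here is a minimal sketch (not Scrapy's actual code; all class names are illustrative): a manager that only resolves spiders by name is enough for `scrapy crawl`, which goes through load().

```python
# Sketch (not Scrapy's real SpiderManager): crawling only exercises the
# name-based lookup path, so find_by_request is never called.

class MinimalSpiderManager:
    """Implements just enough of the spider-manager interface to crawl."""

    def __init__(self, spider_classes):
        self._spiders = {cls.name: cls for cls in spider_classes}

    @classmethod
    def from_settings(cls, settings):
        # Real implementations discover spiders from SPIDER_MODULES;
        # we start empty here for illustration.
        return cls([])

    def load(self, spider_name):
        # `scrapy crawl <name>` resolves the spider class via load().
        try:
            return self._spiders[spider_name]
        except KeyError:
            raise KeyError(f"Spider not found: {spider_name}")

    def list(self):
        # Needed by `scrapy list`.
        return list(self._spiders)

    # No find_by_request(): only `scrapy fetch`, `scrapy parse` and
    # `scrapy shell` need to map a Request back to a spider class.


class DemoSpider:
    name = "demo"


manager = MinimalSpiderManager([DemoSpider])
assert manager.load("demo") is DemoSpider
assert manager.list() == ["demo"]
```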

```diff
@@ -42,3 +43,19 @@ class CustomSettingsSpider(DefaultSpider):

         self.assertFalse(settings.frozen)
         self.assertTrue(crawler.settings.frozen)
+
+
+def SpiderManagerWithWrongInterface(object):
```
A Member commented on this line:
why def?

@curita (Member Author) replied:

Oh god, why did we merge this? (It should be 'class'; I got a little too functional there.) I'm not even sure how the test is passing.

@curita (Member Author) commented Apr 16, 2015

About find_by_request: I kept it in the interface on purpose (it was there before my changes in the Crawler API). I'd like every SpiderManager to implement all the functions needed to run all commands. We check the interface up front instead of letting an exception be raised later when some method turns out not to be implemented, which will happen eventually if we remove find_by_request from the interface. (Though I grant there's another, probably more important use case for checking the interface: verifying that the SpiderManager isn't using the old interface. That could be done without find_by_request, since it's present in both the old and the new interface.)

I still prefer this check being done over the complete interface, but we could change it.
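The up-front check curita describes can be sketched in plain Python (Scrapy's actual implementation verifies a zope.interface declaration; the function and class names below are illustrative): a misconfigured SPIDER_MANAGER_CLASS fails at startup instead of surfacing as an AttributeError mid-crawl.

```python
# Hedged sketch of verifying the complete spider-manager interface at
# load time. Not Scrapy's code; names are illustrative.

REQUIRED_METHODS = ("from_settings", "load", "list", "find_by_request")


def verify_spider_manager_interface(manager_cls):
    """Raise TypeError if manager_cls is missing any interface method."""
    missing = [name for name in REQUIRED_METHODS
               if not callable(getattr(manager_cls, name, None))]
    if missing:
        raise TypeError(
            f"{manager_cls.__name__} does not fully implement the spider "
            f"manager interface; missing: {', '.join(missing)}")


class OldStyleManager:
    """Implements everything except find_by_request."""

    @classmethod
    def from_settings(cls, settings):
        return cls()

    def load(self, spider_name):
        raise KeyError(spider_name)

    def list(self):
        return []


try:
    verify_spider_manager_interface(OldStyleManager)
except TypeError as exc:
    print(exc)  # reports that find_by_request is missing
```

Checking eagerly like this is what turns a latent "attribute was not provided" failure into an immediate, explainable configuration error.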

@curita (Member Author) commented Apr 16, 2015

Now that I think of it again, maybe we could issue a warning instead of raising an exception when the SpiderManager (soon to be SpiderLoader) doesn't comply with the interface; that seems like the more tolerant alternative.

Another option would be to raise an exception if the absolutely required methods are not implemented (which I think are from_settings and load; list seems to be needed only for scrapy list) and then issue a warning if the other two methods are missing.

I like both options (but the first one slightly more).
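The second option above can be sketched as a split check: raise for the methods a crawl cannot run without, warn for the rest. This is a hypothetical illustration (function and class names are made up), not what Scrapy ended up shipping.

```python
# Sketch of the "raise for essential, warn for optional" option.
# Names are illustrative, not Scrapy's.
import warnings

ESSENTIAL = ("from_settings", "load")          # crawling needs these
OPTIONAL = ("list", "find_by_request")         # only some commands do


def check_spider_loader(loader_cls):
    for name in ESSENTIAL:
        if not callable(getattr(loader_cls, name, None)):
            raise TypeError(f"{loader_cls.__name__} must implement {name}()")
    for name in OPTIONAL:
        if not callable(getattr(loader_cls, name, None)):
            warnings.warn(
                f"{loader_cls.__name__} does not implement {name}(); "
                f"some scrapy commands will not work", UserWarning)


class PartialLoader:
    """Implements only the essentials: crawling works, with warnings."""

    @classmethod
    def from_settings(cls, settings):
        return cls()

    def load(self, spider_name):
        raise KeyError(spider_name)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_spider_loader(PartialLoader)  # warns twice, does not raise

print([str(w.message) for w in caught])
```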

@MojoJolo commented
@kmike pointed me to this issue. Looks like my problem is related.

I'm getting the error message "The find_by_request attribute was not provided." when executing scrapy crawl or even scrapy version.

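One possible workaround for errors like MojoJolo's, while the project decides between warning and raising: a thin subclass of the custom manager can satisfy the interface check by supplying the missing method. Everything here is hypothetical; `LegacySpiderManager` stands in for a project's own pre-interface class and the find_by_request heuristic is purely illustrative.

```python
# Hypothetical workaround: if a custom SPIDER_MANAGER_CLASS predates
# find_by_request, a small subclass can add it to pass the interface
# check. LegacySpiderManager is a stand-in, not a real Scrapy name.

class LegacySpiderManager:
    @classmethod
    def from_settings(cls, settings):
        return cls()

    def __init__(self):
        self._spiders = {}

    def load(self, spider_name):
        return self._spiders[spider_name]

    def list(self):
        return list(self._spiders)


class PatchedSpiderManager(LegacySpiderManager):
    def find_by_request(self, request):
        # Illustrative only: report spiders whose name appears in the
        # request URL. fetch/parse/shell just need *a* candidate list.
        return [name for name in self.list() if name in request.url]
```

Pointing SPIDER_MANAGER_CLASS at the subclass would then avoid the "attribute was not provided" failure without touching the legacy class itself.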