Update cheatsheets

2026-03-13 07:59:15 +00:00 · 2024-02-21 11:19:49 +00:00
parent 4e88a1b42f
commit 3d653cc7e6
4803 changed files with 127002 additions and 0 deletions
--- a/37
+++ b/37
@@ -0,0 +1,37 @@
+---
+syntax: markdown
+tags: [tldr, common]
+source: https://github.com/tldr-pages/tldr.git
+---
+# scrapy
+
+> Web-crawling framework.
+> More information: <https://scrapy.org>.
+
+- Create a project:
+
+`scrapy startproject {{project_name}}`
+
+- Create a spider (in project directory):
+
+`scrapy genspider {{spider_name}} {{website_domain}}`
+
+- Edit a spider (in project directory):
+
+`scrapy edit {{spider_name}}`
+
+- Run a spider (in project directory):
+
+`scrapy crawl {{spider_name}}`
+
+- Fetch a webpage as Scrapy sees it and print the source to `stdout`:
+
+`scrapy fetch {{url}}`
+
+- Open a webpage in the default browser as Scrapy sees it (disable JavaScript for extra fidelity):
+
+`scrapy view {{url}}`
+
+- Open a Scrapy shell for a URL, which allows interaction with the page source in a Python shell (or IPython if available):
+
+`scrapy shell {{url}}`