-
Robots.txt File
I read a few posts on the forum and ended up writing this robots.txt:
User-agent: Googlebot
Disallow:
User-agent: Googlebot-Image
Disallow:
User-agent: MSNBot
Disallow:
User-agent: Slurp
Disallow:
User-agent: Teoma
Disallow:
User-agent: Gigabot
Disallow:
User-agent: Scrubby
Disallow:
User-agent: Robozilla
Disallow:
User-agent: BecomeBot
Disallow:
User-agent: Nutch
Disallow:
User-agent: Fast
Disallow:
User-agent: Scooter
Disallow:
User-agent: Mercator
Disallow:
User-agent: Ask Jeeves
Disallow:
User-agent: teoma_agent
Disallow:
User-agent: ia_archiver
Disallow:
User-agent: BizBot04 kirk.overleaf.com
Disallow:
User-agent: HappyBot (gserver.kw.net)
Disallow:
User-agent: CaliforniaBrownSpider
Disallow:
User-agent: EINet/0.1 libwww/0.1
Disallow:
User-agent: Ibot/1.0 libwww-perl/0.40
Disallow:
User-agent: Merritt/1.0
Disallow:
User-agent: StatFetcher/1.0
Disallow:
User-agent: TeacherSoft/1.0 libwww/2.17
Disallow:
User-agent: WWW Collector
Disallow:
User-agent: processor/0.0ALPHA libwww-perl/0.20
Disallow:
User-agent: wobot/1.0 from 206.214.202.45
Disallow:
User-agent: WhoWhere Robot
Disallow:
User-agent: ITI Spider
Disallow:
User-agent: w3index
Disallow:
User-agent: MyCNNSpider
Disallow:
User-agent: SummyCrawler
Disallow:
User-agent: OGspider
Disallow:
User-agent: linklooker
Disallow:
User-agent: CyberSpyder
Disallow:
User-agent: SlowBot
Disallow:
User-agent: heraSpider
Disallow:
User-agent: Surfbot
Disallow:
User-agent: Bizbot003
Disallow:
User-agent: WebWalker
Disallow:
User-agent: SandBot
Disallow:
User-agent: EnigmaBot
Disallow:
User-agent: spyder3.microsys.com
Disallow:
User-agent: 205.252.60.71
Disallow:
User-agent: 194.20.32.131
Disallow:
User-agent: 198.5.209.201
Disallow:
User-agent: acke.dc.luth.se
Disallow:
User-agent: dallas.mt.cs.cmu.edu
Disallow:
User-agent: darkwing.cadvision.com
Disallow:
User-agent: waldec.com
Disallow:
User-agent: www2000.ogsm.vanderbilt.edu
Disallow:
User-agent: unet.ca
Disallow:
User-agent: murph.cais.net
Disallow:
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /?*
Disallow: /*?
Sitemap: link to your sitemap
Is this the optimal setup, in your opinion?
-
Listing that whole series of spiders for no reason doesn't seem very useful to me: they aren't being blocked, so why define a rule for them at all?
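A quick way to check what those empty groups actually do is to feed a shortened version of the file to Python's standard-library urllib.robotparser. This is only a sketch: the bot name SomeOtherBot and the helper parse_rules are made up for illustration, and real crawlers may differ from this parser in details. It suggests the per-bot groups are not just redundant: a bot that finds its own group uses that group and skips the catch-all one, so an empty Disallow ends up allowing it into /wp-admin as well.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text: str) -> RobotFileParser:
    """Build a parser from robots.txt text (hypothetical helper, no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Only the catch-all group, shortened from the file in the first post.
CATCH_ALL = (
    "User-agent: *\n"
    "Disallow: /wp-admin\n"
    "Disallow: /wp-includes\n"
)

# The same rules preceded by one of the empty per-bot groups.
WITH_EMPTY_GROUP = (
    "User-agent: Googlebot\n"
    "Disallow:\n"
) + CATCH_ALL

only_star = parse_rules(CATCH_ALL)
with_group = parse_rules(WITH_EMPTY_GROUP)

# Without its own group, Googlebot falls under "*" and is kept out of /wp-admin.
print(only_star.can_fetch("Googlebot", "/wp-admin/"))      # False

# With an empty "Disallow:" group of its own, Googlebot may fetch everything.
print(with_group.can_fetch("Googlebot", "/wp-admin/"))     # True

# A bot that is not listed still gets the catch-all rules.
print(with_group.can_fetch("SomeOtherBot", "/wp-admin/"))  # False
```

So if the goal is for every bot to follow the same rules, the single User-agent: * group should be enough.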
-
The only part that might be fine is
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
As for the last 3 lines, I'm not sure. I don't know how the sitemap works in robots.txt, and the two lines with /? risk blocking any page that contains a question mark in its title, which may not be what you want.
Disallow: /*?*
Disallow: /*?
Sitemap: link to your sitemap
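To see which URLs the question-mark rules would catch, here is a minimal matcher, assuming the wildcard syntax that the major crawlers support ("*" matches any run of characters, a trailing "$" anchors the end, everything else is literal, including "?"). The function rule_matches and the sample URLs are hypothetical, not taken from any crawler's code. Note that Python's own urllib.robotparser does plain prefix matching and would not expand the "*".

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query string."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for every "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, like a robots.txt path rule.
    return re.match(regex, url_path) is not None


# Sample WordPress-style URLs (made up for illustration).
urls = ["/?p=123", "/page/2/?s=robots", "/2010/05/hello-world/"]

for pattern in ["/?*", "/*?", "/*?*"]:
    blocked = [u for u in urls if rule_matches(pattern, u)]
    print(pattern, "->", blocked)
```

Under these assumptions, /*? and /*?* behave the same and block every URL that has a query string, while /?* only blocks URLs whose query string comes right after the site root. A plain permalink without "?" is untouched by all three.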