|
I have the domain www.mydomain.com and I configured Apache mod_rewrite so that the application is served at www.mydomain.com/myappl. Where should I place the robots.txt file? Thanks!
Started by Angelo, 3 posts by 3 people.
Answer Snippets (Read the full thread at stackoverflow):
So if you don't know guess where.
How ;-) But you can use the robots checker to find if your file is accessible.
The robots.txt must be accessible to clients as http://www.mydomain.com/robots.txt.
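To make the rule in that answer concrete, here is a minimal Python sketch that derives the robots.txt location from any page URL on the same host. The function name `robots_url` is made up for this illustration; the point is only that the path is always `/robots.txt` at the root of the host, regardless of where the application itself is mounted.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL for the host that serves page_url.

    Hypothetical helper for illustration: crawlers look for robots.txt
    at the root of the host, not under the application's sub-path.
    """
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("http://www.mydomain.com/myappl/index.html"))
```

So even if every page lives under /myappl, the file has to be reachable at the host root.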
|
|
I would like to check whether a remote website contains certain files, e.g. robots.txt or favicon.ico. Of course, the files should be accessible (readable).
So if the website is http://www.myweb.com/, I would like to check if http://www.myweb.com/robots...
Started by Granit, 4 posts by 4 people.
Answer Snippets (Read the full thread at stackoverflow):
Naturally....
For favicons it.
The header that it's a custom error page instead of a robots.txt (which should be served as text/plain).
Containing stuff that robots.txt is allowed to contain and favicon.ico should be an image file.
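The snippets above hint at checking both the status and the content type, since some sites answer a missing file with a custom error page. A rough Python sketch of that idea follows. The names `looks_genuine` and `remote_file_exists` are invented for this example, and it assumes the server answers HEAD requests sensibly, which not every server does.

```python
import urllib.error
import urllib.request


def looks_genuine(status, content_type, expected_prefix):
    """True if the status is 200 and the Content-Type starts with the prefix."""
    return status == 200 and content_type.lower().startswith(expected_prefix)


def remote_file_exists(url, expected_prefix, timeout=10):
    """Send a HEAD request and check status plus Content-Type.

    Hypothetical helper: a custom error page served as text/html will
    fail the check when we expect text/plain or an image type.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            return looks_genuine(response.status, content_type, expected_prefix)
    except urllib.error.URLError:
        # Covers HTTP errors (404, 403, ...) as well as connection failures.
        return False
```

For example, `remote_file_exists("http://www.myweb.com/robots.txt", "text/plain")` or `remote_file_exists("http://www.myweb.com/favicon.ico", "image/")`. If a server rejects HEAD, falling back to a GET request would be the obvious variation.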
|
|
I'm reviewing a couple of my web sites to make sure my SEO bases are covered. There are no private pages on the site in question, and we want all pages to be indexed. Does including a robots.txt file anyway, even if it isn't needed, make my site look ...
Started by Stuart B, 6 posts by 6 people.
Answer Snippets (Read the full thread at stackoverflow):
Google has....
If it's not there, the search engine... No, it won't.
Search engines will look for a robots.txt file.
Their instructions / guidelines are actually pretty thorough.
As per Google, it's a good idea.
robots.txt file.
|
|
On Kongregate we use one robots.txt for the main site, which isn't really very funny at all, and another one for subdomains, which is pretty funny.
# we don't serve your kind here
User-Agent: *
Disallow: /
Any other good ones out there?
Started by Duncan Beevers, 4 posts by 3 people.
Answer Snippets (Read the full thread at superuser):
User-Agent: zombies
Disallow: /brains
http://www.hilton.com/robots.txt
# Do not visit Hilton.com during the day!
It's a few years old now, but see Andrew Wooster's robots.txt Adventure, where he spelunks through various sites finding funny....
|
|
I've got a Plone site that I administer and I'd like to add some pages to the Disallow of a robots.txt.
It appears that Plone automatically generates a robots.txt file. I can't find any way to modify that. I've also tried adding a 'robots.txt' file to...
Started by Jimmy Z, 3 posts by 3 people.
Answer Snippets (Read the full thread at stackoverflow):
Using the ZMI....
Create a robots.txt on your desktop. Go into the ZMI of the Plone site -> add a new file -> call...
As has been noted already, Plone 3.x already includes a robots.txt file so the preferred one.
|
|
Can anyone please tell me what a robots.txt file is, and how to implement one on my site? Is it necessary for every site?
Started by pratap295, 11 posts by 10 people.
Answer Snippets (Read the full thread at digitalpoint):
Robots.txt file is to improve site Robots....
This is Called Robots.txt. Files Robots.txt file and public for sharing in the search engine indexes and what is not.
Be indexed and which Web pages should be ignored.
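As a small illustration of what the answers describe, the Python standard library ships a parser, `urllib.robotparser`, that reads robots.txt rules and reports whether a given crawler may fetch a URL. The rules and URLs below are invented for the example, and `is_allowed` is just a convenience wrapper written for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only: every crawler may fetch
# anything except URLs whose path starts with /private/.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Parse the robots.txt text and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://www.example.com/index.html",
                "http://www.example.com/private/data.html"):
        print(url, is_allowed(EXAMPLE_ROBOTS_TXT, "AnyBot", url))
```

Keep in mind that robots.txt is only a request to well-behaved crawlers; it does not protect the listed pages from being accessed.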
|
|
I would like to know which parts of a website an SEO expert should disallow in the robots.txt file. Which pages are better not shown to search engines?
Started by jesicawillss, 11 posts by 11 people.
Answer Snippets (Read the full thread at affiliateseeking):
I agree not want to share with....
If you want to attach robots.txt file then don't include private pages or info of your website in robot file....
It is better not to show the security pages in robots.txt file like https.
|
|
I have a bunch of files at www.example.com/A/B/C/NAME (A,B,C change around, NAME is static) and I basically want to add a command in robots.txt so crawlers don't follow any such links that have NAME at the end.
What's the best command to use in robots...
Started by Mike F, 7 posts by 6 people.
Answer Snippets (Read the full thread at serverfault):
To my knowledge there is no pattern matching routine supported by the robots.txt file parsers... [Keep] in mind that listing those files in the robots.txt file will give out a list of those links to anyone who so it executes....
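A note on the snippet above: the original robots.txt convention indeed had no pattern matching, but the major search engine crawlers have long accepted `*` (any sequence of characters) and `$` (end of URL) in paths, and RFC 9309 now specifies both. Under those rules something like `Disallow: /*/NAME$` would cover the question. The Python sketch below shows how such a pattern could be matched; the function names are invented for this example, and as far as I know Python's own `urllib.robotparser` does not implement these wildcards, which is why the sketch uses a hand-written regular expression instead.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern using * and $ into a regex.

    Illustrative sketch only: '*' matches any run of characters, and a
    trailing '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def path_matches(pattern, path):
    """True if the URL path is covered by the robots.txt pattern."""
    return robots_pattern_to_regex(pattern).search(path) is not None


if __name__ == "__main__":
    for path in ("/A/B/C/NAME", "/X/Y/Z/NAME", "/A/B/C/NAME2", "/A/B/C/other"):
        print(path, path_matches("/*/NAME$", path))
```

Crawlers that only implement the original convention may ignore or misread such a rule, so it is worth testing against the crawlers you care about.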
|
|
I am using .htaccess on my site so that all requests are redirected to the index page in my root directory. No other file on my site can be accessed because my .htaccess restricts it. My question is, when I use a robots.txt file, will the...
Started by Goysar, 3 posts by 3 people.
Answer Snippets (Read the full thread at stackoverflow):
That is, http://yoursite....
Best wishes,
Fabian
How about trying if you can yourself access the robots.txt via a web browser? If you can, then the search or such).
Replace _%{REQUEST_FILENAME} with robots.txt and you should be fine.
Further.
|
|
I have a site with the following robots.txt in the root:
User-agent: *
Disabled: /
User-agent: Googlebot
Disabled: /
User-agent: Googlebot-Image
Disallow: /
And pages within this site are getting scanned by Googlebots all day long. Is there something ...
Started by Tim Scott, 4 posts by 4 people.
Answer Snippets (Read the full thread at stackoverflow):
Maybe give the Google robots.txt checker a try.
Google have an analysis tool for checking robots.txt... if they really are owned by Google.
It should be Disallow:, not Disabled:.
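To see why the misspelling matters, here is a short Python sketch using the standard library's `urllib.robotparser`. It uses a simplified, single-group version of the file from the question (not the full three-group file), and `is_allowed` is a helper written for this example. Other parsers, including Google's, may differ in details, but unrecognised directives are generally just ignored.

```python
from urllib.robotparser import RobotFileParser

# Simplified, single-group versions of the file from the question.
BROKEN = """\
User-agent: *
Disabled: /
"""

FIXED = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Parse the robots.txt text and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/page.html"  # hypothetical URL
    print("Disabled:", is_allowed(BROKEN, "Googlebot", url))
    print("Disallow:", is_allowed(FIXED, "Googlebot", url))
```

With `Disabled:` the parser finds no rule at all, so crawling is permitted; with `Disallow:` the same URL is blocked.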
|