Our site is receiving requests for pages that do not exist. Usually the address is just slightly misspelled. For example, the correct link is /Info/SiteMap.aspx, but the bad request is for /Inf/SiteMap.aspx, with the "o" missing. The requests usually come in bursts, such as 20 in a row within one minute. All of them return 404.
Is this caused by some indexing robot? Or is someone probing our site for weaknesses?
Does anyone have experience with this, or any tips?
Answers: 1
Yes. Consider the following things.
The IP address of the host sending the requests. If the requests come from several distinct IPs, we can suspect a distributed attack carried out using zombies (compromised machines).
If the requests come from the same source, check the delay between requests. Crawlers generally do not use extremely short intervals between requests.
Indexing robots (crawlers) do not perform "brute force" indexing. They just retrieve the links from one page and recursively traverse the site page by page. So the cause is unlikely to be an indexing robot.
Check for any patterns, such as sequential naming.
Example: /Inf/SiteMap.aspx, /Infa/SiteMap.aspx, /Infb/SiteMap.aspx
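
The checks above could be roughed out in a small script. This is only a sketch under assumptions: the sample entries, the `KNOWN_PATHS` list, the function name `summarize_404s`, and the 0.8 similarity cutoff are all made up for illustration, and you would need to parse your own server log into (IP, timestamp, path) tuples first. For each client IP it reports how many 404s were seen, the average gap between them, and which real pages the bad paths resemble.

```python
from collections import defaultdict
from datetime import datetime
from difflib import get_close_matches

# Hypothetical 404 entries: (client IP, ISO timestamp, requested path).
SAMPLE_404S = [
    ("203.0.113.5", "2024-01-01T10:00:00", "/Inf/SiteMap.aspx"),
    ("203.0.113.5", "2024-01-01T10:00:02", "/Infa/SiteMap.aspx"),
    ("203.0.113.5", "2024-01-01T10:00:04", "/Infb/SiteMap.aspx"),
    ("198.51.100.7", "2024-01-01T11:30:00", "/Info/SiteMap.asp"),
]

# Hypothetical list of paths that really exist on the site.
KNOWN_PATHS = ["/Info/SiteMap.aspx", "/Info/Contact.aspx"]


def summarize_404s(entries, known_paths, cutoff=0.8):
    """Group 404s by client IP; report count, average gap, and near-miss targets."""
    by_ip = defaultdict(list)
    for ip, stamp, path in entries:
        by_ip[ip].append((datetime.fromisoformat(stamp), path))

    report = {}
    for ip, hits in by_ip.items():
        hits.sort()  # order each client's requests by time
        times = [t for t, _ in hits]
        # Delay in seconds between consecutive requests from the same IP.
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        avg_gap = sum(gaps) / len(gaps) if gaps else None

        # Which real pages do the bad paths look like (assumed cutoff)?
        near_misses = set()
        for _, path in hits:
            match = get_close_matches(path, known_paths, n=1, cutoff=cutoff)
            if match:
                near_misses.add(match[0])

        report[ip] = {
            "count": len(hits),
            "avg_gap_seconds": avg_gap,
            "near_misses": sorted(near_misses),
        }
    return report


if __name__ == "__main__":
    for ip, info in summarize_404s(SAMPLE_404S, KNOWN_PATHS).items():
        print(ip, info)
```

Many IPs each hitting variants of the same real page would point toward a distributed source, while one IP with a very small average gap and sequentially named paths looks more like a scripted probe than a crawler.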