SharePoint Experts, Information Architects, Expert Witness

We provide consulting across a broad array of business and technology areas, from architecture to design to deployment of global systems, with a focus on surfacing data in the enterprise. Specialists in Microsoft, we are a premier provider of SharePoint expertise (including 2016 and Office 365). We also provide Expert Witness/Legal Expert services in eDiscovery, source discovery, patent infringement, piracy and more! We have also established SICG DLDS s.a. - our counterpart in Costa Rica that specializes in water systems (http://www.crwatersolutions.com) - Contact me directly: david_sterling@sterling-consulting.com or call 704-873-8846 x704.


Wednesday, February 18, 2015

SharePoint Search and Robots.txt - telling search not to crawl parts of SharePoint 2013

Much of SharePoint's inner workings should A) not be accessible to end users (i.e. Site Settings, etc.) and B) not be crawled for content purposes. For A, there are no canned solutions, but in the past I've created a master page that redirected unauthorized users; in SharePoint 2013 access can also be restricted using the Limited-access user permission lockdown mode feature:

 Limited-access user permission lockdown mode

For B, it is recommended to add a robots.txt file that tells search to skip areas of SharePoint, as follows:

1) Use Notepad (or Notepad++) to create a new text file and add the following:
User-Agent: *
Disallow: /_Layouts/
Disallow: /search/
Disallow: /searchcenter/
Disallow: /Documents/Forms/ 

WHERE:

Entry                            What it means
User-Agent: *                    Apply to all user agents (crawlers)
Disallow: /_Layouts/             Block the _layouts system pages
Disallow: /search/               Block the default search site
Disallow: /searchcenter/         Block the enterprise Search Center
Disallow: /Documents/Forms/      Block library forms (i.e. Edit.aspx, etc.)

Adjust the search entries to match the URL of your search site (i.e. Search, SC, SearchCenter, etc.). Save this file as robots.txt.
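Before deploying the file, the rules above can be sanity-checked with Python's standard-library robots.txt parser. This is just a sketch - the host name below is a placeholder, and note that robots.txt path matching is case-sensitive, so the entries must match the case your site actually serves:

```python
# Sanity-check the robots.txt rules with Python's standard-library parser.
# The host name is a placeholder; only the paths matter here.
from urllib import robotparser

rules = """\
User-Agent: *
Disallow: /_Layouts/
Disallow: /search/
Disallow: /searchcenter/
Disallow: /Documents/Forms/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Blocked areas: crawlers honoring robots.txt should skip these.
print(rp.can_fetch("*", "http://sharepoint.contoso.com/_Layouts/settings.aspx"))   # False
print(rp.can_fetch("*", "http://sharepoint.contoso.com/Documents/Forms/Edit.aspx"))  # False

# Normal content remains crawlable.
print(rp.can_fetch("*", "http://sharepoint.contoso.com/Pages/default.aspx"))  # True
```

If can_fetch returns True for a URL you expected to block, check the rule's case and trailing slash.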

2) Move/copy this file to the root folder of the SharePoint site (as in C:\inetpub\wwwroot\wss\VirtualDirectories\<port #>). Right-click on the file and grant "Everyone" read access. Alternatively, the file can be uploaded to SharePoint's root using SharePoint Designer.

3) Open Central Administration and click on Application Management > Manage Web Applications.

4) Click the Web Application to select it, then in the ribbon, click Managed Paths.

5) In the 'Add a new path' text box, enter /robots.txt, change the type to Explicit inclusion, and click the Add Path button.

6) Open the SharePoint Management Shell or a Command window using Run as administrator and run iisreset.

