Crawl a WordPress Blog with SharePoint 2013

At work we have a WordPress blog that we wanted to include in our public website’s search results. Yep, our public website is SharePoint 2013. We recently moved it to Azure, but that is a blog post for another day…

Anyway, I started down this path and ran into a few issues before I sorted it out.  Once I figured it out I thought it’d be a good idea to share.

The first thing I did was go to my search config and create a new content source. I added the blog’s URL to it and started trying several different Crawl Settings.

Turns out I just needed to set it to Only crawl within the server of each start address. I couldn’t tell that this worked, though, because I kept running into this warning in my crawl logs every time I did a full crawl…

Item not crawled due to one of the following reasons: Preventive crawl rule; Specified content source hops/depth exceeded; URL has query string parameter; Required protocol handler not found; Preventive robots directive. ( This item was deleted because it was excluded by a crawl rule. )

I tried to google the site and could only ever get a result if I googled the URL, blog.b2btech.com, which told me…

A description for this result is not available because of this site’s robots.txt

I went and checked the Reading settings on the blog. Turns out the Search Engine Visibility check box, Discourage search engines from indexing this site, was checked. I unchecked it and kicked off a crawl. At this point I didn’t have the proper Crawl Setting set and was just trying to crawl the sitemap.xml file, which SharePoint can’t do. I experimented with crawl rules for a while and then switched back to the URL of the blog in the content source.

This resulted in much more stuff coming into the index than I would ever want.
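Going back to the robots.txt message from Google: a quick way to confirm that a site is asking every crawler to stay away is to read its robots.txt and look for a blanket Disallow. The sketch below is a simplified check, not a full robots.txt parser, and the function name is made up for illustration. It assumes the file uses the common "User-agent: *" followed by "Disallow: /" form.

```javascript
// Hypothetical helper: returns true when the robots.txt text contains
// "Disallow: /" under a "User-agent: *" line. Simplified on purpose:
// it does not handle grouped User-agent lines or Allow overrides.
function blocksAllCrawlers(robotsTxt) {
    var appliesToAll = false;
    var lines = robotsTxt.split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        // Drop comments and surrounding whitespace.
        var line = lines[i].replace(/#.*$/, "").trim();
        if (line === "") { continue; }
        var sep = line.indexOf(":");
        if (sep === -1) { continue; }
        var field = line.slice(0, sep).trim().toLowerCase();
        var value = line.slice(sep + 1).trim();
        if (field === "user-agent") {
            appliesToAll = (value === "*");
        } else if (field === "disallow" && appliesToAll && value === "/") {
            return true;
        }
    }
    return false;
}
```

Paste the contents of the blog's robots.txt into the function; if it returns true, crawlers that honor robots.txt will skip the whole site.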

Eventually, what ended up working for me is the following:

  • Content source with the blog URL
  • Crawl Setting set to Only crawl within the server of each start address
  • Crawl rule set to
    • blogpath/*
    • Use regular expression syntax for matching this rule checkbox checked
    • Include all items in this path selected and all check boxes below left unchecked
    • Anonymous access

After getting my configuration sorted out as indicated above, the crawl worked as expected and I have blog entries showing up in search.
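To make the two settings a bit more concrete, here is a rough sketch of what they mean for any given link: the link has to live on the same server as the start address, and its path has to fall under the included path. This is only an illustration of the idea, not how SharePoint evaluates crawl rules internally. The function name and the example.com addresses are made up.

```javascript
// Hypothetical helper: would this link be in scope for a crawl limited to
// the start address's server and to one included path prefix?
function inCrawlScope(startAddress, pathPrefix, link) {
    var start = new URL(startAddress);
    var target = new URL(link, startAddress); // also resolves relative links
    // "Only crawl within the server of each start address"
    if (target.host !== start.host) { return false; }
    // Inclusion rule of the form <path>/*
    return target.pathname.indexOf(pathPrefix) === 0;
}

// Example with made-up addresses:
// inCrawlScope("http://blog.example.com/", "/", "http://www.example.com/page")
// is false because the link points at a different server.
```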

I specifically wrote this blog because this forum post didn’t provide a solution to the poster’s issues…

https://social.technet.microsoft.com/Forums/exchange/en-US/d0e50c07-662b-47c5-9347-b6fe44ec23ed/crawling-wordpress-blog?forum=sharepointsearch

SharePoint Search Driven Navigation Start to Finish Configuration.

If you have worked on a SharePoint Online site that has more than 20-30 links in the top nav, you have likely noticed that the response time is slower than it feels like it should be. As the number of sites in the navigation increases, so does the slowness, until it becomes painful.

The culprit is structured navigation and its lack of efficiency.

There are a couple of potential solutions to this. One is managed metadata navigation; the other, and the topic of this post, is Search Driven Navigation.

Put simply, search driven navigation uses a search query to create the navigation tree.  This option requires a developer and this blog post is being written because the documentation out there was not sufficient for me to deliver an acceptable top nav situation to a client.

Support.office.com offers an incomplete solution that leaves out what js libraries you need to add, where to get them, and provides js that conflicts with at least some ribbon buttons, one being the explorer view button.

https://support.office.com/en-us/article/Navigation-options-for-SharePoint-Online-adb92b80-b342-4ecb-99a1-da2a2b4782eb

Raymond Little’s blog was extremely helpful and addressed most of the issues in the support.office.com post.

https://raymondlittle.wordpress.com/2015/01/20/updates-needed-to-get-navigation-options-for-sharepoint-online-msdn-article-to-work/

I recommend giving both of these a read as they will be helpful in getting your head around exactly what we are doing here. That said, this post will serve as a step-by-step, start-to-finish solution.
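Before the steps, here is the core idea in miniature. The search query returns a flat list of sites, each with a title, a URL and the URL of its parent, and the navigation tree is built by attaching each site to its parent. The real script later in this post does this with knockout and linq.js; the sketch below uses plain JavaScript, and the function name and the title/url/parent field names are made up for illustration.

```javascript
// Hypothetical helper: turns flat { title, url, parent } records into a tree.
// URLs are compared case-insensitively, as the script in this post does.
function buildTree(rows) {
    var byUrl = {};
    rows.forEach(function (row) {
        byUrl[row.url.toUpperCase()] = { title: row.title, url: row.url, children: [] };
    });
    var roots = [];
    rows.forEach(function (row) {
        var node = byUrl[row.url.toUpperCase()];
        var parent = row.parent ? byUrl[row.parent.toUpperCase()] : undefined;
        if (parent && parent !== node) {
            parent.children.push(node); // known parent: attach underneath it
        } else {
            roots.push(node);           // no parent in the list: top level
        }
    });
    return roots;
}
```

Each top-level node becomes a top nav item, and its children become the flyouts.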

The very first thing you want to do is add the following to your masterpage.

  • You are going to need a custom master page for this, so if you don’t already have one, make a copy of one of the OOTB master pages and add this to its html file…
  • In my version the <![CDATA[ tags were added by designer, so they should be fine in yours, but if you run into any issues loading any of these scripts later on, you may want to remove them.
  • Notice that you will need a custom CSS file for this solution to work. Obviously you are going to get errors from the jump because you haven’t downloaded these libraries yet, but we will get there.
  • You will need to either create a SearchNav folder in the Style Library and put the required files there, or change the paths below to match where you put the files.
  • This all lives under this line in my master page. There is some weirdness with these showing up unless they are in the right place.
    • <!–SPM:–>
  • WordPress isn’t cool with script references, which makes sense, so the ones below look like links.  Those will need to be script references.
  • I highly recommend you use notepad++ or some other editor that lets you see what tags match etc when you are looking at the code in this post.  It will make it much easier to sort out.


https://ajax.googleapis.com/ajax/libs/jquery/2.2.2/jquery.min.js
/Style%20Library/searchNav/linq.js
/Style%20Library/SearchNav/SearchTopNavigation.js
/Style%20Library/SearchNav/knockout-3.4.0.js

Next, let’s download knockout and linq.js.

For linq.js you’ll need to download the zip from here and pull the linq.js file from it. UPDATE: I’m adding the linq.js file that I use to this post because CodePlex is going away and someone commented about an error in linq.js that they ran into, but I did not.

Click here to download linq in a pdf

Now you have all the files that you need to download and you are ready to create the SearchTopNavigation.js file

Open your favorite editor, copy the following into it, and put it in the appropriate folder. A few notes about this js:

  • I wrapped the bits under //Models and Namespaces in an IIFE to deal with the conflict with the ribbon buttons.
  • The addNav function is not needed unless your client wants to have a way to add additional links to a drop down on their top menu. The client I wrote this for wanted to be able to do that and basically had a quick links nav item that is tied to a list and allows them to add links via that list.
  • isIEorEDGE() deals with the fact that IE and Edge handle sorting the opposite way that FF and Chrome do. The rest of this is under //sorting stuff.
  • If you are using test accounts this is going to look weird to you as you switch between accounts, because the hierarchy is saved to local storage and pulled from there to make this more efficient. What that means for testing is that you’ll need to use different browsers or private browsing windows for different users, or clear local storage often, to see what that user will really see.
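As a rough sketch of that last point: the script keeps the hierarchy in local storage under the keys nodesCache and nodesCachedAt and treats it as good for about 3 hours. The two helpers below are made up for illustration and are not part of the script. Note that the script rounds the age to whole hours before comparing, while this sketch compares the exact age, so the cutoffs differ slightly.

```javascript
// Hypothetical helpers, not part of SearchTopNavigation.js.
var MAX_AGE_HOURS = 3;

// True while the cached hierarchy is younger than MAX_AGE_HOURS.
function isCacheFresh(cachedAtMs, nowMs) {
    var ageHours = (nowMs - cachedAtMs) / (1000 * 60 * 60);
    return ageHours < MAX_AGE_HOURS;
}

// Removes the cached hierarchy so the next page load queries search again.
// Pass in window.localStorage when running this in the browser.
function clearNavCache(storage) {
    storage.removeItem("nodesCache");
    storage.removeItem("nodesCachedAt");
}
```

When you are switching between test users in the same browser, running clearNavCache(localStorage) from the console and reloading the page should be enough to see what the current user really gets.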

function checkLength(childCount) {
    return childCount > 0;
}

function checkNoLength(childCount) {
    return childCount == 0;
}

//only needed if you want to give client a way to add links

function addNav() {
    $.ajax({
        url: "/_api/web/lists/getbytitle('added_Links')/items?$orderby=NavOrder asc",
        type: "GET",
        headers: { "Accept": "application/json;odata=verbose" },
        cache: false,
        success: function (data) {
            var items = [];
            var count = 0;
            var chCount;

            $(data.d.results).each(function () {
                // The markup built for each link was not preserved in this post.
                items.push('');
            });

            $("#showNav").html(items.join(''));
        }
    });
}

function isIEorEDGE() {
    if (navigator.appName == 'Microsoft Internet Explorer') {
        return true; // IE
    }
    else if (window.navigator.userAgent.indexOf("Edge") > -1) {
        return true; // EDGE
    }
    else if (!!navigator.userAgent.match(/Trident\/7\./)) {
        return true; // IE 11
    }

    return false;
}
//Models and Namespaces
(function () {
var SPO = SPO || {};
SPO.Models = SPO.Models || {};
SPO.Models.NavigationNode = function () {
    this.Url = ko.observable("");
    this.Title = ko.observable("");
    this.Parent = ko.observable("");
};
var isIE = isIEorEDGE();
var root = "https://yoursite.sharepoint.com";
var baseUrl = root + "/_api/search/query?querytext=";
var query = baseUrl + "'contentClass=\"STS_Web\"-WebTemplate:APP+path:" + root + "'&trimduplicates=false&rowlimit=300";

var baseRequest = {
    url: "",
    type: ""
};

//Parses a local object from JSON search result.
function getNavigationFromDto(dto) {
    var item;
    if (dto != undefined) {
        item = new SPO.Models.NavigationNode();
        item.Title(dto.Cells.results[3].Value);
        item.Url(dto.Cells.results[6].Value);
        item.Parent(dto.Cells.results[20].Value);
    }

    return item;
}

//Parse a local object from the serialized cache.
function getNavigationFromCache(dto) {
    var item = new SPO.Models.NavigationNode();

    if (dto != undefined) {
        item.Title(dto.Title);
        item.Url(dto.Url);
        item.Parent(dto.Parent);
    }

    return item;
}

/* create a new OData request for JSON response */
function getRequest(endpoint) {
    var request = baseRequest;
    request.type = "GET";
    request.url = endpoint;
    request.headers = { ACCEPT: "application/json;odata=verbose" };
    return request;
}
/* Navigation Module*/
function NavigationViewModel() {
    "use strict";
    var self = this;
    self.nodes = ko.observableArray([]);
    self.hierarchy = ko.observableArray([]);
    self.loadNavigatioNodes = function () {
        //Check local storage for cached navigation datasource.
        var fromStorage = localStorage["nodesCache"];
        if (fromStorage != null) {
            var cachedNodes = JSON.parse(localStorage["nodesCache"]);
            var timeStamp = localStorage["nodesCachedAt"];
            if (cachedNodes && timeStamp) {
                //Check for cache expiration. Currently set to 3 hrs.
                var now = new Date();
                var diff = now.getTime() - timeStamp;
                if (Math.round(diff / (1000 * 60 * 60)) < 3) {
                    //return from cache.
                    var cacheResults = [];
                    $.each(cachedNodes, function (i, item) {
                        var nodeitem = getNavigationFromCache(item, true);
                        cacheResults.push(nodeitem);
                    });
                    var sortedArray = cacheResults.sort(self.sortObjectsInArray);
                    self.buildHierarchy(sortedArray);
                    self.toggleView();
                    return;
                }
            }
        }
        //No cache hit, REST call required.
        self.queryRemoteInterface();
    };
    //Executes a REST call and builds the navigation hierarchy.
    self.queryRemoteInterface = function () {
        var oDataRequest = getRequest(query);
        $.ajax(oDataRequest).done(function (data) {
            var results = [];
            $.each(data.d.query.PrimaryQueryResult.RelevantResults.Table.Rows.results, function (i, item) {
                if (i == 0) {
                    //Add root element.
                    var rootItem = new SPO.Models.NavigationNode();
                    rootItem.Title("Quick Links");
                    rootItem.Url(root);
                    rootItem.Parent(null);
                    results.push(rootItem);
                }
                var navItem = getNavigationFromDto(item);
                results.push(navItem);
            });
            //Add to local cache
            localStorage["nodesCache"] = ko.toJSON(results);
            localStorage["nodesCachedAt"] = new Date().getTime();
            self.nodes(results);
            if (self.nodes().length > 0) {
                var unsortedArray = self.nodes();
                var sortedArray = unsortedArray.sort(self.sortObjectsInArray);
                self.buildHierarchy(sortedArray);
                self.toggleView();
                addEventsToElements();
            }
        }).fail(function () {
            //Handle error here!!
            $("#loading").hide();
            $("#error").show();
        });
    };
    self.toggleView = function () {
        var navContainer = document.getElementById("navContainer");
        ko.applyBindings(self, navContainer);
        $("#loading").hide();
        $("#navContainer").show();
    };
    //Uses linq.js to build the navigation tree.
    self.buildHierarchy = function (enumerable) {
        self.hierarchy(Enumerable.From(enumerable).ByHierarchy(function (d) {
            return d.Parent() == null;
        }, function (parent, child) {
            if (parent.Url() == null || child.Parent() == null)
                return false;
            return parent.Url().toUpperCase() == child.Parent().toUpperCase();
        }).ToArray());
    };
    self.sortObjectsInArray = function (a, b) {
        //sorting stuff
        if (isIE) {
            if (a.Title() > b.Title())
                return -1;
            if (a.Title() < b.Title())
                return 1;
            return 0;
        } else {
            if (a.Title() > b.Title())
                return 1;
            if (a.Title() < b.Title())
                return -1;
            return 0;
        }
    };
}
//Loads the navigation on load and binds the event handlers for mouse interaction.
$(document).ready(function () {
    "use strict";
    _spBodyOnLoadFunctionNames.push("addNav");
    addNav();
    var viewModel = new NavigationViewModel();
    viewModel.loadNavigatioNodes();
});
}());
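One fragile spot worth knowing about: getNavigationFromDto reads the title, URL and parent from fixed positions (3, 6 and 20) in each result row. Every cell in the verbose search response also carries a Key next to its Value, so a lookup by key is a possible alternative if those positions ever come back different for you. The helper below is a sketch and is not part of the script above. The property names you would pass in (for example Title, Path and ParentLink) are my assumption about what those three positions hold, so check them against your own response first.

```javascript
// Hypothetical helper: finds a cell in a search result row by its Key
// instead of by position. Returns null when the key is not present.
function getCellValue(row, key) {
    var cells = row.Cells.results;
    for (var i = 0; i < cells.length; i++) {
        if (cells[i].Key === key) { return cells[i].Value; }
    }
    return null;
}
```

With that in place, a call such as getCellValue(dto, "Title") could stand in for dto.Cells.results[3].Value in SearchTopNavigation.js.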

Save this file to wherever your reference points in the master page and let’s move on to CSS. This is all my preference because, frankly, positioning this stuff via JS was a nightmare, and if someone used their browser zoom it caused more problems than it was worth.

My CSS looks like this…

#navContainer{
padding: 0;
margin: 0;
}

#navContainer ul {
float:left;
}

#navContainer a{
display:block;
text-decoration:none;
padding: 5px 5px 0px 5px;
color: rgb(102, 102, 102);
}

#navContainer a:hover{
color:rgb(72, 35, 93);
}

#navContainer li {

position:relative;

list-style: none;
}

#navContainer ul ul {
position: absolute;
left: 0px;
top:100%;
visibility:hidden;
width: 200px;
}
#navContainer ul ul ul {
left: 100%;
top: 0;
width: 200px;
}

#navContainer li:hover > ul {
visibility: visible;
}

Save this to a CSS file and make sure to change your reference to point to it.

Finally, we are ready to replace the OOTB nav with ours.

Do a find on DeltaTopNavigation and make it look like the following.

Some notes on this

  • you may just want to leave the actual tag alone and replace the code in between it, starting with and ending with
  • This code assumes you want flyouts. If you don’t, adjust it accordingly by removing the second span that looks like this…
  • You may not want the quick links. If you don’t, just remove that section.


<!–SPM:–>






 

 


<!–SPM:–>

That should be it.  I have set this up in a production environment and it works great.  It is very speedy, and is free of conflicts.  If you run into issues I recommend checking the designer tools to make sure that everything is loading properly etc.
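A quick first check when something isn't working is whether all three libraries actually loaded. jQuery, knockout and linq.js expose the globals jQuery, ko and Enumerable, which is what the script relies on. The helper below is a small sketch, with a made-up name, that you could paste into the browser console.

```javascript
// Hypothetical helper: returns the names of the required globals that are
// not defined on the given scope (pass in window when running in a browser).
function missingLibraries(scope) {
    var required = ["jQuery", "ko", "Enumerable"];
    return required.filter(function (name) {
        return typeof scope[name] === "undefined";
    });
}
```

Running missingLibraries(window) should give back an empty array. Anything it lists points at a script reference in the master page whose path needs another look.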

If you have any questions or comments, please post them.

Office 365 Profile Picture

Today, in SharePoint Online, or anywhere in Office 365 where a user can see their picture (or an icon in place of the picture) in the upper right corner of the screen, they can click on it and select View Account.

viewAccount

 

When they do, it brings them to the My Account page, from where they can click on Profile Info and then hover over the picture, which gives them the “Change Photo” option.

personalinfo

Clicking Change Photo results in the following pop-up that allows the user to browse to a photo they’d like to upload from their computer. After clicking save, the photo will be saved to their profile and the propagation process will begin. The process is immediate in some places, but in others, specifically SharePoint, it is not. The photo will propagate throughout Office 365 within 72 hours.

addpic

Cloud Saturday Registration

Have you registered for Cloud Saturday Atlanta yet?

Did you know that it is sponsored by the biggest cloud players in the industry? IBM and AWS are sponsoring. Microsoft is providing the facility. There will be 25 sessions covering a wide variety of Cloud Computing topics.

Sounds like a pretty serious event, doesn’t it?

Source: Register Here

Cloud Saturday Atlanta is 9-26-2015

Cloud Saturday Atlanta is fast approaching.   http://atlanta.cloudsaturday.com

I will be presenting…

Buried In Unlimited OneDrive Storage? Map Your Way Out!
Tyler Bithell (B2B Technologies)

  • Microsoft has given you unlimited cloud storage! Now what?
  • How about throwing out those old network shares?
  • Have you considered ridding yourself of local storage?

If you are looking at these or other options, but are scratching your head as to the how of it all, then this is the session for you! This session will explain in detail how some school districts have used OneDrive mapping to drastically change their storage infrastructure without changing the end user experience.

Cloud Saturday Atlanta Registration is open!

Cloud Saturday Website

Be sure to register soon. The first 100 to register will receive Early Bird pricing.

Death Match: Azure vs. Amazon Web Services (AWS)

Check out my latest blog post on the B2B Website!

http://b2btech.com/whatsnew/blog/Pages/blogpost.aspx?PostID=114#sthash.i5xvgqBO.dpbs

SharePoint Saturday Atlanta 2015- This Saturday May 30th

I’ll be giving two separate presentations this Saturday: one about Yammer, and one about using PowerShell/CSOM to access all your OneDrives and change settings. I’m very much looking forward to SharePoint Saturday this year, and hope to see you there 🙂

http://www.spsevents.org/city/atlanta/atlanta2015
@sps_atl
#sps_atl

OneDrive for Business DOES NOT have a file limit of 20,000

Update – the new sync tool is now available. Really, it is the OneDrive Personal sync tool, now with the ability to add a OneDrive for Business account. Once you add the account you can do the following…

Right click the tray icon and select Choose folders under your OneDrive for Business account, choose your folders and click OK. That’s it, the sync tool will only sync your selected folders from there on out 🙂

 

Been noticing a lot of traffic on this post and thought this edit made sense…
(New portion start)

Just found this  Sync Tool Announcement… Looks like a new sync client is on its way 🙂

Thanks to this blog post… Overthinking Blog Post

For pointing this out 🙂

For now I still stand by this, but I’m hoping this sync tool gives people who really want to sync a viable tool.
In a lot of cases mapping is the way to go and provides more by way of what people are actually looking for. I helped write a script that automates this and I blogged about it… https://sharepointv15.wordpress.com/2014/05/12/office-365-drive-mapping-for-enterprise-desktops/

Keep in mind things are changing in the cloud on the regular, so tweaks need to be made to the script to account for that. I’ll be presenting on this at Cloud Saturday Atlanta on Sept 26th http://atlanta.cloudsaturday.com. If you want a lot more detail I recommend you attend.
(New portion end)

Today I ran across this posting about OneDrive for Business, and while reading through the colorful comments it occurred to me that most of these folks think that there is a 20K file limit in OneDrive for Business.

https://onedrive.uservoice.com/forums/262982-onedrive/suggestions/6392647-remove-the-limit-20-000-of-files-that-can-be-sto?page=1&per_page=20

I thought it warranted a post because A LOT of people seem to have misunderstood this limit. I found some other articles that sort of misunderstood as well, but won’t be posting links here.

Just to say it again, the 20,000 item limit is only for syncing, and let’s face it, you are never going to need to take 20,000 documents offline between now and the next time you can connect to the internet. You can absolutely store millions of documents in OneDrive (I believe 50 million is the limit), so don’t let the sync limit confuse you.

I’ll be the first one to admit there are plenty of issues with Sync and for the love of all that is holy, unless you want to test the limits of your anger and frustration, please do not attempt to use it as a migration tool.

If you want to migrate to OneDrive for Business I recommend using a tool like AvePoint Migrator. If you’d like to know more about OneDrive see the following post.

https://sharepointv15.wordpress.com/2014/12/05/what-onedrive-for-business-is-and-what-onedrive-for-business-is-not/

SharePoint Saturday Atlanta 2015

SharePoint Saturday Atlanta 2015 will be held on Saturday May 30th at GSU’s Alpharetta Campus.  If you are interested in attending or speaking, please follow the link below.

http://www.spsevents.org/city/Atlanta/Atlanta2015

This is the first year that I have helped plan the event, and I am very much looking forward to how it turns out.

Registration opens today.  Please be sure to register as it will help us as we are planning this event.

I am also helping plan Cloud Saturday that will be held later this year.