Showing posts with label sql server. Show all posts

Wednesday, November 07, 2007

Microsoft Announces Search Server Express

Today Microsoft announced that it will be releasing a free enterprise search tool based on SharePoint and SQL Server. Search Server Express can be installed on a standalone server and leverages Microsoft's SharePoint search engine.

The engine looks just like WSS and SharePoint Search. The administration console has many of the same options as the current MOSS Shared Service Provider search settings. These include content sources, best bets, and scopes. Alerts and RSS feeds are available on the results.

Some of the neat aspects of the new solution:

  • It supports the OpenSearch standard, which will allow Microsoft's search results to integrate with a wide variety of applications from other vendors.
  • Federated Security: Authentication against content sources can be done via NTLM, Basic authentication, Forms Based Authentication, Kerberos, and "cookies". Access control lists are applied to the indexes and security trimming is provided on queries.
  • Continuous Propagation Indexing: No need to wait for the whole corpus to be crawled before searching can be performed; the index will now be incrementally updated and searches can be performed on that increment immediately.
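For background on the first bullet: an OpenSearch 1.1 description document is a small XML file that advertises a search endpoint so other applications can federate queries to it. A minimal sketch (the URL and names are placeholders, not from the announcement):

```
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example Search</ShortName>
  <Description>Illustrative OpenSearch source</Description>
  <!-- {searchTerms} is substituted with the user's query -->
  <Url type="application/rss+xml"
       template="http://example.com/search.aspx?q={searchTerms}"/>
</OpenSearchDescription>
```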

The express version can only be installed on one server. Naturally there is an upgrade path to more expensive solutions for those organizations that require scalability.

In 2008 Microsoft will also release some free connectors for Documentum and FileNet.

Microsoft has adopted a similar approach in the past with the releases of Visual Studio 2005 Express and SQL Server Express. In this case Search Server Express will provide a useful, free solution to give customers a taste of some core platform functionality and get developers using it.

The Enterprise Search website is located at http://www.microsoft.com/enterprisesearch/. You can download the release candidate here.

Tuesday, August 14, 2007

SharePoint Performance and Capacity Planning

This is based on Mike Fitzmaurice's seminar at Tech.Ed Gold Coast.

Mike defines capacity planning as the

"art of evaluating a technology against the needs of an organization and making educated decisions on how to meet those needs"

He stresses that it is still more of an art than a science, as there are a lot of imponderables and value judgements.

In terms of organizing the planning, he suggests breaking the process up into four distinct phases. These are:

  1. Phase 1: Plan for software and hardware boundaries
  2. Phase 2: Estimate performance and capacity requirements
  3. Phase 3: Plan hardware and storage requirements
  4. Phase 4: Test, test, test your design

Phase 1: Plan for software and hardware boundaries

First you have to evaluate the environment and the existing and potential limitations of the solution.

In SharePoint, hardware scalability isn't a huge concern as you can scale up and out by adding more servers over time. The big impact is in the software; specifically how you configure SharePoint and how you use it.

The main SharePoint objects to care about at this stage are: Sites, People, Search, Logical Architecture objects, and Physical objects (search indexes, physical database file and transaction logs).

These objects can all grow quite large, and SharePoint is designed to handle that to a degree, but there are finite limits. After running through multiple load-balancing tests, Microsoft has come up with the following benchmarks, beyond which performance suffers.

SharePoint Element | Recommended Limit | Negative Impact On
Site collections in a single content database | 10,000 | General SharePoint performance
Site collections | 50,000 per web application | General SharePoint performance
Content databases | 100 per web application | General SharePoint performance
SharePoint sites | 250,000 per site collection | General SharePoint performance
Shared Service Providers | 3 per farm total | Asynchronous processing (Mike very seriously recommends no more than 3 SSPs per farm)
Indexed documents (in search catalog) | 50,000,000 per search index | Indexing and querying performance
Web servers | 8 per database server | General SharePoint performance
Documents in a flat document library (no folders or indexed views) | 4,000 | Searching, viewing, and navigating the library
Documents in a document library using folders or indexed views | 1,000,000 | Library performance
Documents per nested folder in a document library | 2,000 | Library performance

 

Some other helpful tips:

  • Don't create too many distinct web sites in IIS when you are installing SharePoint for the first time, as the application pools and websites use up CPU resources
  • Create usage profiles to try to track common users and estimate how their daily actions might consume resources or generate content
  • Mike really recommends that you turn on IIS compression.
  • BLOB Caching is a feature that isn't well known in SharePoint but serializes large objects to disk on the Web Front Ends to avoid database round-tripping.
  • You can delay loading the core.js file if your users will be anonymous or read-only...instructions are here.
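As an aside on the BLOB caching tip: in MOSS 2007 it is switched on per web application in that application's web.config. A sketch of what the entry looks like (the location, file-type pattern, and size here are illustrative; maxSize is in GB):

```
<BlobCache location="C:\blobCache"
           path="\.(gif|jpg|jpeg|png|css|js)$"
           maxSize="10"
           enabled="true" />
```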

Phase 2: Estimate performance and capacity requirements

The requirements are a compromise based on your desired performance, capabilities, and resources.

Remember to plan for peak concurrency since your farm could be humming along most of the time until peak demand brings it to a crawl.
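To make "peak concurrency" concrete, here's a minimal sketch of the usual back-of-the-envelope calculation (the formula shape and the numbers are my own illustration, not from the seminar):

```python
# Back-of-the-envelope peak load: requests per second at peak is roughly
# total users x fraction concurrent at peak x requests per user per hour / 3600.

def peak_rps(total_users, concurrent_fraction, requests_per_user_hour):
    """Estimate farm-wide requests per second at peak demand."""
    return total_users * concurrent_fraction * requests_per_user_hour / 3600.0

# e.g. 10,000 users, 10% concurrent at peak, 36 requests per user per hour
print(round(peak_rps(10_000, 0.10, 36), 1))  # -> 10.0
```

If that peak figure exceeds what one web front end can serve, that's your cue to plan for more servers before demand brings the farm to a crawl.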

Microsoft recommends 64-bit hardware to improve performance, but be aware that many applications and utilities have problems with this: PDF Filter is one example (although there is now a 64-bit version). MOSS supports mixed mode, where your database server runs on 64-bit hardware while the Web Front Ends are on 32-bit boxes.

Phase 3: Plan hardware and storage requirements

If you're going to be storing your file system in SharePoint, multiply its size by 1.2x to 1.5x to allow for the additional metadata and related storage overhead. Your content database will have to accommodate at least this much space.

Plan for the index to take an additional 30% of the size of all content indexed on a single server.

If you have a dedicated Query Server multiply the Index size by 2.5x.

If you're enabling BLOB caching in SharePoint, remember that it is serializing large objects to file, and you will have to have room for this.

Plan for future database growth!
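The sizing rules above can be sketched as a tiny calculator (an illustrative helper of my own, not an official tool; the function and parameter names are assumptions):

```python
# Rules of thumb from the seminar: content database = source content x 1.2-1.5;
# search index ~30% of indexed content on a single server; a dedicated query
# server holds roughly 2.5x the index size.

def storage_estimate_gb(content_gb, overhead=1.2, dedicated_query=False):
    """Rough content-database and index sizing for a MOSS farm, in GB."""
    content_db = content_gb * overhead   # metadata and storage overhead
    index = content_gb * 0.30            # single-server index size
    if dedicated_query:
        index *= 2.5                     # propagated index on the query server
    return {"content_db": content_db, "index": index}

# e.g. 100 GB of source content, worst-case overhead, dedicated query server
print(storage_estimate_gb(100, overhead=1.5, dedicated_query=True))
```

Remember these are floor figures: add headroom on top for BLOB cache files and future database growth.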

Phase 4: Test, test, test your design

You should definitely test the performance under a variety of scenarios.

To do this, first establish goals for your tests. Good goals might be to mimic the standard user profiles you created in Phase 1 at peak times and see how the SharePoint farm responds.

Create a test farm that closely mimics the production farm. I think it's ok to use virtualization as it will cut down on the number of physical servers you need. Populate the test farm with data that is representative of the real thing. You could try exporting existing content databases from a development or production farm into the test farm. There is also a data population tool available on (where else?) CodePlex.

You can use Perfmon counters to assess responsiveness. There is a tool called Fiddler that can help you analyze the requests and responses between SharePoint and the client which can help figure out what's going on.

SharePoint can be very complex because there are a lot of moving parts. One of the most important aspects of a proper MOSS deployment is planning for future capacity and performance, and these tips by Mike are a great help in that regard. The white paper that discusses many of these issues in greater depth is available here.

Thursday, July 26, 2007

Another day, another SharePoint learning experience for me...This time one of my SharePoint colleagues at work, Viraf, was trying to connect to an Oracle database using the BDC. I'd had some previous success using database credentials authentication to a SQL Server-backed CRM application so he and I put our heads together to get this BDC Oracle connection working.

Because we didn't have Oracle installed anywhere initially, we began by mocking up one of the client's tables in a temporary SQL Server 2005 database and using the steps in my previous post to connect to it. Since those instructions weren't using Integrated Windows Authentication and used plain old SQL command text, we figured this would be a pretty good head start. The only major change to the BDC Schema App file was the line:

<Property Name="DatabaseAccessProvider" Type="System.String">SqlServer</Property>

had to change to:

<Property Name="DatabaseAccessProvider" Type="System.String">Oracle</Property>

Any table names referenced in the SQL command text in the App schema had to be prefixed with the database schema name, e.g.:

SELECT Field1, Field2 FROM MySchemaName.TableName

Unfortunately although all was right with the BDC schema, there was an authentication error when trying this against the client's Oracle instance. The culprit was the Single Signon database which managed the connection.

In the SQL Server test database Viraf was using, the only fields required in the Single Signon Application Schema were User Id and Password. However, Oracle requires one more field (Field 3) to complete the Connection String: "Integrated Security".

After adding this, in the “Manage Single Signon” section, we went to “Manage Account Information for enterprise application definitions” and selected the application. We needed “\Domain Users” as the group that would authenticate using SSO (in other words, every SharePoint user would use this connection). Finally, we filled out the User Id field with the Oracle username, the Password field with the account password, and the Integrated Security field with the value "no".
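For clarity, the three SSO fields above map onto an Oracle connection string of roughly this shape (the values here are placeholders):

```
User Id=oracle_username;Password=oracle_password;Integrated Security=no
```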

Although we were able to get this running internally after Viraf set up an Oracle installation to prove this would work (it did), the deployment at the client site failed with the following error in the SharePoint log files:

System.Data.OracleClient requires Oracle client software version 8.1.7 or greater.

Viraf found the following article on Roy Tore Gurskevik's blog that seems to explain it: http://dotnetjunkies.com/WebLog/rtgurskevik/archive/2005/01/19/45958.aspx . The good news is that to reach this point a connection has to be happening and SSO reports that the connection string is being retrieved. We're awaiting official confirmation but it looks like this is an environmental issue.

Thursday, July 12, 2007

Windows Server 2008, Visual Studio 2008 ("Orcas") and Microsoft SQL Server 2008 will all launch on February 27, 2008! So said Microsoft Chief Operating Officer Kevin Turner at the Microsoft Worldwide Partner Conference 2007 in Denver, Colorado. The press release is here. Scott Guthrie later clarified in a comment on Scott Dorman's blog that release is often several weeks or months before the official launch.

So it looks like we'll be able to start working with the new products around late January - which is good timing because I'm sure by then we'll be completely adapted to the huge changes in SQL 2005, Office 2007, .NET 3.5, WF, WCF, WPF, WWII, etc and ready to grab the next wave!

Wednesday, June 20, 2007

Yesterday I attended the monthly Sydney SharePoint User Group. The presenter was Grant Paisley, who is the Chief Solutions Architect at Angry Koala. Grant demonstrated some of the ways Microsoft's Business Intelligence solution set has been improved in 2007. He covered a lot of ground and I thought I would write about the many great tips and insights he presented to us. This may hop around a bit as I have the memory span of a goldfish and the handwriting of a spider.

To begin with, Grant made a very interesting remark....he basically suggested that Business Intelligence had been brought out of the "back room" and was now what Microsoft would call a "first-class citizen".

We've all spent the last decade creating enormous quantities of data and we don't know how to manage it (or even find it!)...Now the major software vendors like Google and Microsoft are trying to get a handle on these problems, and this overriding need to unify all our scattered data is becoming central to their business models. The proof of how seriously Microsoft takes this is that they are in the process of turning SharePoint into the web platform that will underpin all of their applications. I bet within 5 years the only significant Microsoft product that does not leverage SharePoint services and integrate with it - and I'm including desktop applications here - will be SQL Server itself.

So Grant's analysis that Microsoft is assigning more importance to Business Intelligence makes perfect sense because Microsoft would be missing a big piece of the data management pie if BI was given junior status. More to the point, the individual BI tools are being rolled into the SharePoint stack.

To prove this, Grant showed a slide of the BI stack...At the bottom was what he called the Operational Systems (Oracle, SAP, PeopleSoft, SQL Server, etc). These were the initial data repositories. The next level of the stack was the BI Platform: Reporting Services, Analysis Services. These provided the data for the third level of the stack, the End User Tools & Performance. Here I saw the new PerformancePoint product mentioned. The final level of the stack was the Delivery layer, where the end users manipulated and displayed the data. Again, the key element (at least to my eyes) was the SharePoint platform (read: MOSS), because on it sits Excel Services, KPIs, and published reports.

Grant then gave a demonstration of manipulating an Excel spreadsheet attached to an OLAP cube via Analysis Services. Someone asked where the best place to put KPIs is, because at the moment you can do that in a variety of places (Analysis Services, Excel Services). He suggested that the best place is within Analysis Services as those KPIs then become available everywhere else.

Grant demonstrated how to organize an Excel spreadsheet; using Pie Charts and Heat Charts he helped visualize the data within the spreadsheet. Next he published the spreadsheet to a report library using Excel Services. He mentioned that you can publish individual items, worksheets, or the whole report. One great tip he showed us was how to rename the Excel objects (such as the worksheet) before he published the form. This is important because the web parts refer to the object names and so it's a good practice to give them meaningful names ahead of time.

He had another tip - if you "Convert to formulas" (it's an Excel option somewhere) you'll get full control over how the spreadsheet appears and can move the cells around. I didn't understand this requirement too well, not being much of an Excel user, but it is apparently helpful when your end users are, how to put it delicately...."particular" or "precise" about how they want the spreadsheet showing up!

Finally Grant showed us a neat little add-in for Excel allowing a user to data-mine within the client. The link for these tools is here. As an example of its use he conducted a little campaign analysis on the Excel spreadsheet. Using the addin he was able to quickly drill down on what influences prompted the users to want to purchase certain products.

Obviously Grant provided a great deal of information and demonstration for us. There's a lot of progress being made in the Business Intelligence space and I thought it was a great overview.
Incidentally, he is a SQL Server MVP and is also the president of the Sydney SQL Users Group (http://www.sqlserver.org.au/).