This is based on Mike Fitzmaurice's seminar at Tech.Ed Gold Coast.
Mike defines capacity planning as the
"art of evaluating a technology against the needs of an organization and making educated decisions on how to meet those needs".
He stresses that it is still more of an art than a science as there are a lot of imponderables and value judgements.
In terms of organizing the planning, he suggests breaking the process up into four distinct phases. These are:
- Phase 1: Plan for software and hardware boundaries.
- Phase 2: Estimate performance and capacity requirements
- Phase 3: Plan hardware and storage requirements
- Phase 4: Test, test, test your design
Phase 1: Plan for software and hardware boundaries
First, you have to evaluate the environment and the existing and potential limitations of the solution.
In SharePoint, hardware scalability isn't a huge concern as you can scale up and out by adding more servers over time. The big impact is in the software; specifically how you configure SharePoint and how you use it.
The main SharePoint objects to care about at this stage are: Sites, People, Search, Logical Architecture objects, and Physical objects (search indexes, physical database files, and transaction logs).
These objects can all grow quite large, and SharePoint is designed to handle that to a degree, but there are finite limits. After running through multiple load-balancing tests, Microsoft has come up with the following benchmarks, beyond which performance suffers.
SharePoint Element | Recommended Limit | Negative Impact On |
Site Collections | 10,000 per content database | General SharePoint performance |
Site Collections | 50,000 per web app | General SharePoint performance |
Content Databases | 100 per web app | General SharePoint performance |
SharePoint Sites | 250,000 per site collection | General SharePoint performance |
Shared Service Providers | 3 per farm TOTAL | Asynchronous processing (it is very strongly recommended that you do not use more than 3 SSPs per farm) |
Indexed Documents (in Search catalog) | 50,000,000 per search index | Indexing and querying performance |
Web Servers | 8 web servers per database server | General SharePoint performance |
Maximum # documents in flat document library not using folders or indexed views | 4,000 | Searching, viewing, and navigating document library |
Maximum # documents in document library using folders or indexed views | 1,000,000 | Library performance |
Maximum # documents per nested folder in document library | 2,000 | Library Performance |
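As a rough illustration, the limits in the table can be kept in a small lookup and compared against a planned design. This is only a sketch: the key names, the `check_limits` helper, and the 80% warning threshold are hypothetical choices made for this example, and the numbers are simply the ones quoted in the table.

```python
# Recommended limits as quoted in the table above.
# The key names are hypothetical labels chosen for this sketch.
RECOMMENDED_LIMITS = {
    "site_collections_per_content_db": 10_000,
    "site_collections_per_web_app": 50_000,
    "content_dbs_per_web_app": 100,
    "sites_per_site_collection": 250_000,
    "ssps_per_farm": 3,
    "indexed_docs_per_search_index": 50_000_000,
    "web_servers_per_db_server": 8,
    "docs_per_flat_library": 4_000,
    "docs_per_library_with_folders_or_indexed_views": 1_000_000,
    "docs_per_nested_folder": 2_000,
}


def check_limits(planned, warn_ratio=0.8):
    """Compare planned counts against the recommended limits.

    Returns a list of (name, status, planned_value, limit) tuples, where
    status is "over" (limit exceeded) or "near" (at or above warn_ratio of
    the limit). Items comfortably below their limit are omitted.
    An unknown key raises KeyError so that typos are not silently ignored.
    """
    findings = []
    for name, value in planned.items():
        limit = RECOMMENDED_LIMITS[name]
        if value > limit:
            findings.append((name, "over", value, limit))
        elif value >= warn_ratio * limit:
            findings.append((name, "near", value, limit))
    return findings


if __name__ == "__main__":
    # Hypothetical planned farm design.
    plan = {
        "ssps_per_farm": 4,
        "docs_per_nested_folder": 1_700,
        "content_dbs_per_web_app": 10,
    }
    for name, status, value, limit in check_limits(plan):
        print(f"{name}: {value} is {status} the recommended limit of {limit}")
```

The warning threshold is arbitrary; the point is only that a design drifting towards one of these numbers is worth flagging before performance suffers.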
Some other helpful tips:
- Don't create too many distinct web sites in IIS when you are installing SharePoint for the first time, as the application pools and web sites use up CPU resources.
- Create usage profiles to track common users and estimate how their daily actions might consume resources or generate content.
- Mike really recommends that you turn on IIS compression.
- BLOB caching is a little-known SharePoint feature that serializes large objects to disk on the Web Front Ends to avoid round trips to the database.
- You can delay loading the core.js file if your users will be anonymous or read-only; instructions are here.
Phase 2: Estimate performance and capacity requirements
The requirements are a compromise based on your desired performance, capabilities, and resources.
Remember to plan for peak concurrency since your farm could be humming along most of the time until peak demand brings it to a crawl.
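One way to put a number on peak demand is to turn the usage profiles into an estimated request rate. The sketch below is not from Mike's talk: the `UsageProfile` fields, the sample figures, and the simple formula (users × fraction active at peak × requests per active user per hour ÷ 3600) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class UsageProfile:
    """A hypothetical usage profile for one class of user."""
    name: str
    users: int                # total users in this profile
    peak_concurrency: float   # fraction of those users active at peak (0..1)
    requests_per_hour: float  # requests per active user per hour at peak


def peak_requests_per_second(profiles):
    """Estimate peak requests per second across all profiles.

    Assumed formula, per profile:
        users * peak_concurrency * requests_per_hour / 3600
    """
    return sum(
        p.users * p.peak_concurrency * p.requests_per_hour / 3600
        for p in profiles
    )


if __name__ == "__main__":
    # Made-up example figures; replace with your own measurements.
    profiles = [
        UsageProfile("readers", users=1000, peak_concurrency=0.1, requests_per_hour=36),
        UsageProfile("authors", users=100, peak_concurrency=0.5, requests_per_hour=72),
    ]
    print(f"Estimated peak load: {peak_requests_per_second(profiles):.1f} requests/sec")
```

An estimate like this gives the Phase 4 tests a concrete target to aim for, instead of testing against average load.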
Microsoft recommends 64-bit hardware to improve performance, but be aware that many applications and utilities have problems with this: the PDF Filter is one example (although there is now a 64-bit version). MOSS supports a mixed mode where your database server runs on 64-bit hardware while the Web Front Ends are on 32-bit boxes.
Phase 3: Plan hardware and storage requirements
If you're going to move the contents of a file system into SharePoint, multiply its size by 1.2 to 1.5 to account for the additional metadata and storage overhead. Your database will have to accommodate at least this much space.
On a single server, plan for the search index to take up an additional 30% of the size of all the content it indexes.
If you have a dedicated Query Server, multiply the index size by 2.5.
If you're enabling BLOB caching in SharePoint, remember that it is serializing large objects to file, and you will have to have room for this.
Plan for future database growth!
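The sizing rules in this phase are simple arithmetic, so they are easy to sketch in code. The function below is a rough illustration only: its name and parameters are hypothetical, it assumes the 30% index figure applies to the size of the content database, and it leaves out BLOB cache space and future growth, which you still need to add on top.

```python
def estimate_storage_gb(file_share_gb, overhead=1.5, index_ratio=0.30,
                        dedicated_query_server=False, query_multiplier=2.5):
    """Rough storage estimate based on the Phase 3 rules of thumb.

    file_share_gb          -- size of the file system content being moved in
    overhead               -- 1.2 to 1.5 multiplier for metadata and storage costs
    index_ratio            -- index size as a fraction of indexed content
                              (assumed here to mean the content database size)
    dedicated_query_server -- if True, multiply the index size by query_multiplier
    """
    content_db_gb = file_share_gb * overhead
    index_gb = content_db_gb * index_ratio
    if dedicated_query_server:
        index_gb *= query_multiplier
    return {
        "content_db_gb": content_db_gb,
        "index_gb": index_gb,
        "total_gb": content_db_gb + index_gb,
    }


if __name__ == "__main__":
    # Hypothetical 100 GB file share, worst-case overhead.
    for dedicated in (False, True):
        estimate = estimate_storage_gb(100, dedicated_query_server=dedicated)
        print(f"dedicated query server={dedicated}: "
              f"content DB {estimate['content_db_gb']:.1f} GB, "
              f"index {estimate['index_gb']:.1f} GB, "
              f"total {estimate['total_gb']:.1f} GB")
```

For a 100 GB file share at the 1.5 multiplier, this works out to roughly 150 GB of content database and about 45 GB of index on a single server.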
Phase 4: Test, test, test your design
You should definitely test the performance under a variety of scenarios.
To do this, first establish goals for your tests. Good goals might be to mimic the standard user profiles you created in Phase 1 at peak times and see how the SharePoint farm responds.
Create a test farm that closely mimics the production farm. I think it's ok to use virtualization as it will cut down on the number of physical servers you need. Populate the test farm with data that is representative of the real thing. You could try exporting existing content databases from a development or production farm into the test farm. There is also a data population tool available on (where else?) CodePlex.
You can use Perfmon counters to assess responsiveness. There is also a tool called Fiddler that lets you analyze the requests and responses between SharePoint and the client, which can help you figure out what's going on.
SharePoint can be very complex because there are a lot of moving parts. One of the most important aspects of a proper MOSS deployment is planning for future capacity and performance, and these tips by Mike are a great help in that regard. The white paper that discusses many of these issues in greater depth is available here.
Nice job! Thanks!
Thanks, David, glad it helped!
You helped me a lot, thanks!
I was wondering about flat Document Libraries with Indexed Views. What is the recommended capacity for this scenario?
Hi Chen, the indexed view and use of folders will allow your library to support up to 1 million documents in total. Also check out Microsoft's testing benchmarks (http://technet2.microsoft.com/windowsserver/WSS/en/library/2aa12954-2ea7-475c-9dce-663f543820811033.mspx?mfr=true). These numbers seem slightly different from what Mike mentioned during his talk; for instance, they say that up to 5 million documents can be supported in a single library. I doubt most organizations will ever bump into this upper limit.
Regardless of the maximum capacity, your view or folder should probably return no more than 2,000 documents; beyond this, performance degrades according to Microsoft's tests.
Personally I doubt users expect to see more than a small subset of items in a list at a time. Just set your view to "Display items in batches of the specified size" and pick a reasonable size to display.
I hope this answers your question?
Excellent information. Crystal clear! It just answered a question I was asked yesterday by one of my colleagues.
Thanks for a very informative article about capacity and planning.
Many SharePoint gurus are now asking developers to cache objects, sessions, and especially page output in order to give their SharePoint performance the boost it requires.
We have written an article on the subject, and hopefully it will help inform SharePoint users about the types of caching available that can make their applications faster, more reliable, and more scalable.
Team NCache
Another problem with performance is that most developers don't really know what is going on within the frameworks that they use. When a framework is used incorrectly, you end up with problems.
I spent some time analyzing the SharePoint object model to understand how best to develop against SharePoint lists and views.
Check out my findings at http://blog.dynatrace.com/category/net/sharepoint-net
Hope this helps
Andreas - thanks for that. On your blog, I like the way you show various SharePoint code calls and relate them to the database queries behind the scenes. That helps to clearly explain the different performance impacts.
We are new to SharePoint/MOSS.
Q. What are the guidelines when creating a new site?
Should each site be housed in a separate database, or can a few sites coexist in one database?
Thanks.
Hi Tahir, thanks for commenting. The guidelines for creating sites depend on each organization.
At a high level - SharePoint has something called a "Site Collection" which is a top level site. It contains many "Subsites" which could be used for teams, for projects, and so on. Some people use many Site Collections, one for each department. Others use one Site Collection and many subsites. Each approach has its own pros and cons.
If you can - make sure your site content databases are less than 100 GB in size. This might affect which approach you take.
If you are new to SharePoint I strongly recommend you work with a SharePoint consultant for at least a day or two to gather requirements. You really want to get your site architecture right.
Good luck!
Thanks, Nick.
Tahir
To be in a position to recover a SharePoint environment, what needs to be backed up at the SQL Server?
We are backing up data files and transaction logs only.
Thanks a lot.
Tahir
Hi Tahir, the backups you are doing are not currently enough. SharePoint has lots of backup options, including from STSAdm which is a command line tool, or from Central Administration. I recommend you pick up a copy of Bill English or Ben Curry's books on SharePoint 2007 Administration, as well as their Best Practices books. This will help you with the learning curve.
Your article was very helpful.
Many Thanks.
The tips given above are very useful. However, we should know that SharePoint uses SQL Server for storing BLOBs, and SQL Server is not optimized for BLOB handling, especially at peak load times. SQL Server can experience degraded performance under frequent read/write requests from multiple users. This issue can be overcome by externalizing BLOBs and using caching techniques.
Caching things other than BLOBs, like ViewState, ASP.NET sessions, and Lists, can further help in boosting SharePoint performance.
Third-party tools are available that externalize BLOBs and use different caching techniques to resolve SharePoint performance and scalability issues. These tools include Ncachepoint, avepoint and storagepoint.
Heavy SharePoint users need these to avoid potential performance issues.