Tuesday, March 16, 2010

An introduction to Windows Server AppFabric (caching feature codenamed "Velocity").

Windows Server AppFabric Caching (previously codenamed "Velocity")

Windows Server AppFabric is a set of integrated technologies that make it easier to build, scale and manage Web and composite applications that run on IIS.
Windows Server AppFabric has these core capabilities:
• For Web applications, AppFabric provides caching capabilities that deliver high-speed access to application data, along with scale and high availability. This feature was previously codenamed "Velocity."
• For composite applications, AppFabric makes it easier to build and manage services built using Windows Workflow Foundation and Windows Communication Foundation. This feature was previously codenamed "Dublin."

Windows Server AppFabric Caching Capabilities

Windows Server AppFabric includes a distributed in-memory cache that provides .NET applications with high-speed access, scale, and high availability to application data.



Figure 1: Windows Server AppFabric Caching Capabilities

Key Features

• Caches any serializable CLR object and provides access through simple cache APIs.
• Supports enterprise scale: tens to hundreds of computers.
• Configurable to run as a service accessed over the network.
• Supports dynamic scaling-out by adding new nodes.
• Keeps backup copies of data to provide high availability.
• Balances load automatically.
• Integration with administration and monitoring tools such as PowerShell, Event Tracing for Windows, System Center, etc.
• Integrates seamlessly with ASP.NET so that session data can be cached without having to write it to source databases. It can also serve as a cache for application data across the entire Web farm.
• Follows the cache-aside architecture (also known as Explicit Caching) for V1. That is, you must decide explicitly which objects to put/remove in your applications and the cache does not synchronize with any source database automatically.
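The cache-aside approach described in the last bullet can be sketched as follows. This is a minimal illustration, not AppFabric code: the `CacheAsideStore` class and its loader delegate are hypothetical names, and an in-process dictionary stands in for the distributed cache.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of cache-aside: the application, not the cache,
// decides when to read from the source and when to put or remove items.
public class CacheAsideStore
{
    private readonly Dictionary<string, string> cache = new Dictionary<string, string>();
    private readonly Func<string, string> loadFromSource;

    // Number of times the source (for example, a database) was read.
    public int SourceReads { get; private set; }

    public CacheAsideStore(Func<string, string> loadFromSource)
    {
        this.loadFromSource = loadFromSource;
    }

    public string Get(string key)
    {
        string value;
        if (cache.TryGetValue(key, out value))
        {
            return value;              // cache hit
        }
        value = loadFromSource(key);   // cache miss: read the source
        SourceReads++;
        cache[key] = value;            // explicit put by the application
        return value;
    }

    // The application must invalidate explicitly after it updates the source;
    // the cache does not synchronize with the source on its own.
    public void Invalidate(string key)
    {
        cache.Remove(key);
    }
}
```

Note that a value changed in the source stays stale in the cache until the application calls `Invalidate`, which is the trade-off the cache-aside model asks the application to manage.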

Scenarios
Consider a social networking site as a real-world scenario: the categories of data it deals with and its caching requirements.

Social Networking Site
Consider a social networking site that people use to keep up with friends. The site lets users upload an unlimited number of photos, share links and videos, and learn more about the people they meet. Such a site can improve performance by caching users and friend lists. User names and friend lists fall under reference data, which makes them attractive candidates for caching.


AppFabric Caching Concepts
In this section we look at caching concepts that will be useful when programming against AppFabric.
Logical Hierarchy
Named caches and regions are the two basic building blocks of caching in AppFabric. A named cache is used for storing a logical grouping of data together. Regions are logical groupings of objects within named caches.
A user can run multiple processes, called cache hosts, each of which hosts a cache instance. The cache hosts can access named caches, which can be stored across all the cluster nodes. A named cache consists of regions, and regions store cache items. The figure below shows the logical hierarchy.


Figure : Logical hierarchy

Named Cache
You can think of a named cache as equivalent to a database. A named cache is used for storing a logical grouping of data.
A cluster or node can contain one or more named caches. An application may use one or more named caches, depending on the policies configured for each cache.

Region
Regions are logical groupings of objects within a named cache. You can think of regions as equivalent to tables, although regions can also store arbitrary sets of key-value pairs. Items within a region are guaranteed to reside on a single node, and regions are the logical units for replication and node placement. A named cache consists of one or more regions.
An application is not required to use regions and can use the put/get/remove APIs using only the key to the object. In fact, the application will scale better when not using regions because the keys can be distributed across the named cache. If no region is specified, the system automatically partitions the keys into multiple implicitly created regions.

Cache Item
A cache item is the lowest level of the hierarchy. It contains the key, the object payload, tags, the time to live (TTL), the created timestamp, the version, and other internal bookkeeping information. A region contains one or more of these cache items.
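To make the fields listed above concrete, here is a rough model of a cache item. The class name, property names, and the rule that an update increments the version are assumptions made for illustration; they are not AppFabric's internal representation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a cache item; member names are illustrative only.
public class CacheItemSketch
{
    public string Key { get; private set; }
    public object Payload { get; private set; }
    public List<string> Tags { get; private set; }
    public TimeSpan TimeToLive { get; private set; }
    public DateTime CreatedUtc { get; private set; }
    public long Version { get; private set; }

    public CacheItemSketch(string key, object payload, TimeSpan timeToLive,
                           DateTime createdUtc, params string[] tags)
    {
        Key = key;
        Payload = payload;
        TimeToLive = timeToLive;
        CreatedUtc = createdUtc;
        Tags = new List<string>(tags);
        Version = 1;
    }

    // An item is expired once its time to live has elapsed since creation.
    public bool IsExpired(DateTime nowUtc)
    {
        return nowUtc - CreatedUtc >= TimeToLive;
    }

    // Assumption: replacing the payload increments the version.
    public void Update(object newPayload)
    {
        Payload = newPayload;
        Version++;
    }
}
```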


The following code example shows how to get a named cache and create a region.
C#
// DataCacheFactory provides methods that return cache objects
// Create an instance of DataCacheFactory (reads the application configuration file)
DataCacheFactory fac = new DataCacheFactory();
// Get a named cache from the factory
DataCache catalog = fac.GetCache("catalogcache");
//-------------------------------------------------------
// Simple Get/Put
catalog.Put("toy-101", new Toy("thomas", .,.));
// From the same or a different client
Toy toyObj = (Toy)catalog.Get("toy-101");
// ------------------------------------------------------
// Region based Get/Put
catalog.CreateRegion("toyRegion", true);
// Both toy and toyparts are put in the same region
catalog.Put("toy-101", new Toy( .,.), "toyRegion");
catalog.Put("toypart-100", new ToyParts(…), "toyRegion");
// Use a new variable name; toyObj is already declared above
Toy toyFromRegion = (Toy)catalog.Get("toy-101", "toyRegion");

Cache Concepts
We will be referring to primary and secondary nodes often when we talk about caching in AppFabric.
Primary Node
The node where a region is located is the primary node for that region. All access to this region will be routed to the primary node for that region.

Secondary Nodes
If the named cache is configured to have a “backup” for high availability, then another node is chosen to hold a copy of the region's data. This node is called the secondary node for that region. All changes made on the primary node are also applied on the secondary node. Thus, if the primary node for a region fails, the data can be retrieved from the secondary node without requiring logs to be written to disk.

Cache Types
AppFabric supports two common cache types – partitioned cache and local cache. Depending on the type of data, applications can choose the appropriate type of cache.

Partitioned Cache
In partitioned cache, regions are partitioned among all of the available nodes on which the named cache is defined. The combined memory of all the computers across the cluster can be used to cache data, thus increasing the amount of memory available to the cache. All reads and writes to a region are directed to the node that contains the primary copy of the region.
Scale
Partitioned cache can be used to achieve a desired scale.
The figure below shows how Put and Get operations work in the partitioned cache. Suppose the client sends Cache1 a Put that assigns the value “v2” to key “K2”. The routing layer component in Cache1 determines that the key “K2” actually belongs to Cache2 and routes the request to that cache host. Similarly, Get requests for the same key are routed to Cache2. AppFabric also supports a configuration in which the routing layer is part of the client.


Figure : Routing in a partitioned cache
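The routing step described above (key to partition to cache host) can be sketched roughly as below. The `PartitionRouter` class, the hash function, and the fixed partition table are assumptions chosen for illustration; the source does not say how AppFabric maps keys to hosts.

```csharp
using System;

// Hypothetical routing layer: key -> partition -> cache host.
public class PartitionRouter
{
    // hostOfPartition[i] names the host holding the primary copy of partition i.
    private readonly string[] hostOfPartition;

    public PartitionRouter(string[] hostOfPartition)
    {
        this.hostOfPartition = hostOfPartition;
    }

    // FNV-1a-style hash over UTF-16 code units. Used instead of
    // string.GetHashCode() so every process computes the same value.
    public static uint StableHash(string key)
    {
        unchecked
        {
            uint hash = 2166136261u;
            foreach (char c in key)
            {
                hash ^= c;
                hash *= 16777619u;
            }
            return hash;
        }
    }

    public int PartitionOf(string key)
    {
        return (int)(StableHash(key) % (uint)hostOfPartition.Length);
    }

    // Any host (or a routing client) can use this to find where a key lives.
    public string HostFor(string key)
    {
        return hostOfPartition[PartitionOf(key)];
    }
}
```

Because every participant computes the same partition for a given key, a Put and a later Get for “K2” always arrive at the same host, regardless of which host first received the request.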

Applications can increase scale in this partitioned cache model by adding more computers. Adding computers achieves two goals:
• Higher throughput. When new computers are added to the cluster, automatic load balancing occurs and some partitions on existing nodes are migrated to the new computers. Keys are then distributed across more computers, so access requests are routed to more computers.
• More capacity. When new computers are added, there is more memory to store more data. Applications can scale easily by increasing the amount of data being cached.
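As a rough picture of the partition migration that happens when a computer is added, the sketch below moves partitions from the busiest existing nodes to the new node until it holds a fair share. The `Rebalancer` class and the "take from the busiest node" policy are assumptions for illustration only; the source does not describe AppFabric's actual load-balancing algorithm.

```csharp
using System;
using System.Linq;

// Hypothetical rebalancing step; owner[p] is the index of the node
// that holds partition p.
public static class Rebalancer
{
    // Returns a new assignment in which newNode (assumed not to own any
    // partition yet) takes partitions from the most loaded nodes until
    // it holds its fair share.
    public static int[] AddNode(int[] owner, int newNode)
    {
        int[] result = (int[])owner.Clone();
        int nodeCount = result.Distinct().Count() + 1;
        int fairShare = result.Length / nodeCount;

        for (int moved = 0; moved < fairShare; moved++)
        {
            // Find the existing node that currently owns the most partitions.
            int busiest = result
                .Where(n => n != newNode)
                .GroupBy(n => n)
                .OrderByDescending(g => g.Count())
                .ThenBy(g => g.Key)
                .First().Key;

            // Migrate one of its partitions to the new node.
            int partition = Array.IndexOf(result, busiest);
            result[partition] = newNode;
        }
        return result;
    }

    public static int CountFor(int[] owner, int node)
    {
        return owner.Count(n => n == node);
    }
}
```

With eight partitions split evenly over two nodes, adding a third node this way leaves the nodes with three, three, and two partitions, so requests spread over three computers instead of two.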

Availability

When using a partitioned cache, it is possible to specify a number of nodes as secondary caches. This allows the same data to be stored on multiple computers. When a primary node fails, one of the secondary nodes becomes the primary node, enabling applications to continue accessing the data that was stored on the failed computer.
The following example shows how adding an object and retrieving it will work for a partitioned cache with two secondary caches configured. Take the case where the client is sending a request to put the value “v2” with key “K2” to Cache1. The routing layer in Cache1 determines that the key “K2” belongs to Cache2 and routes it appropriately. Cache2 performs the operation locally and, in addition, sends it to two other secondary nodes. It waits for an acknowledgement from the secondary nodes that they have received the message and then acknowledges the success of the operation back to Cache1. Cache1 then relays the success back to the client.
A Get operation behaves the same way as it does in a partitioned cache without secondary caches; the operation gets routed to the appropriate primary node.


Figure : Routing in a partitioned cache with secondary caches
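The acknowledged write path and the failover behavior described above can be sketched as follows. This is a simplified in-process model: `ReplicaNode` and `ReplicatedRegion` are hypothetical names, method calls stand in for network messages, and promoting the first secondary is an assumed policy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-process stand-in for a cache host.
public class ReplicaNode
{
    public string Name { get; private set; }
    public bool IsAlive { get; set; }
    private readonly Dictionary<string, string> data = new Dictionary<string, string>();

    public ReplicaNode(string name)
    {
        Name = name;
        IsAlive = true;
    }

    // Returns true as the acknowledgement that the write was applied.
    public bool Apply(string key, string value)
    {
        if (!IsAlive) return false;
        data[key] = value;
        return true;
    }

    public bool TryRead(string key, out string value)
    {
        value = null;
        return IsAlive && data.TryGetValue(key, out value);
    }
}

public class ReplicatedRegion
{
    private ReplicaNode primary;
    private readonly List<ReplicaNode> secondaries;

    public ReplicatedRegion(ReplicaNode primary, IEnumerable<ReplicaNode> secondaries)
    {
        this.primary = primary;
        this.secondaries = new List<ReplicaNode>(secondaries);
    }

    public string PrimaryName { get { return primary.Name; } }

    // The Put reports success only after the primary and every secondary acknowledge.
    public bool Put(string key, string value)
    {
        if (!primary.Apply(key, value)) return false;
        foreach (ReplicaNode secondary in secondaries)
        {
            if (!secondary.Apply(key, value)) return false;
        }
        return true;
    }

    // Reads go to the primary; if it has failed, a secondary is promoted first.
    public bool TryGet(string key, out string value)
    {
        if (!primary.IsAlive && secondaries.Count > 0)
        {
            primary = secondaries[0];
            secondaries.RemoveAt(0);
        }
        return primary.TryRead(key, out value);
    }
}
```

Because the write is acknowledged only after the secondaries hold it, a read that follows a primary failure can still return “v2” for “K2” from the promoted node.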

Local Cache
Applications can also maintain a local cache in the application process space for frequently accessed items. In the local cache, the payload is kept in object form. This saves applications both the deserialization cost and the network hop to the primary computer, resulting in increased performance.

Figure : Routing when using a local cache
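The saving from a local cache can be illustrated with the sketch below, in which a repeated Get is served from an in-process dictionary that holds already deserialized objects. `LocalCacheFront` and its two delegates are hypothetical stand-ins for the client, the network hop, and the deserializer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical client-side local cache placed in front of a remote cache.
public class LocalCacheFront
{
    private readonly Dictionary<string, object> local = new Dictionary<string, object>();
    private readonly Func<string, byte[]> fetchRemote;    // stands in for the network hop
    private readonly Func<byte[], object> deserialize;

    // Number of times the remote cache was contacted.
    public int RemoteFetches { get; private set; }

    public LocalCacheFront(Func<string, byte[]> fetchRemote, Func<byte[], object> deserialize)
    {
        this.fetchRemote = fetchRemote;
        this.deserialize = deserialize;
    }

    public object Get(string key)
    {
        object value;
        if (local.TryGetValue(key, out value))
        {
            return value;   // already in object form: no hop, no deserialization
        }
        byte[] payload = fetchRemote(key);
        RemoteFetches++;
        value = deserialize(payload);
        local[key] = value;   // keep the object itself, not the serialized bytes
        return value;
    }
}
```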

Cache Clients
There are two types of cache clients: simple and routing.
The simple client has no routing capabilities and does not track where each cached object is stored. A simple client is configured to contact only one cache host in the cluster. If the simple client requests an object that is not in the memory of that cache host, the cache host retrieves the object from the cluster and then returns it to the simple client.
The routing client contains a routing table to keep track of cached object placement across all cache hosts in the cluster. Because the routing client keeps track of where each cached object is, it can make requests directly to the cache host that stores the object in memory.
The diagram below shows a routing client used with a partitioned cache.




Figure : Routing client used with a partitioned cache

Expiration and Eviction
AppFabric ensures the validity of cache items by supporting expiration and eviction. When adding objects to the cache, it is possible to optionally specify a time to live (TTL). Expired objects are removed from the cache.
Applications can specify a “high watermark” that determines when eviction kicks in; once it is reached, items are evicted according to a least-recently-used algorithm.
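Expiration and least-recently-used eviction can be sketched together as below. This is a single-node toy model under stated assumptions: `ExpiringLruCache` is a hypothetical class, the high watermark is expressed as an item count for simplicity, and expired items are removed lazily when they are next read.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical single-node cache with TTL expiration and LRU eviction.
public class ExpiringLruCache
{
    private class Entry
    {
        public string Key;
        public string Value;
        public DateTime ExpiresUtc;
    }

    private readonly int highWatermark;   // assumed here to be a maximum item count
    private readonly LinkedList<Entry> recency = new LinkedList<Entry>();   // front = most recent
    private readonly Dictionary<string, LinkedListNode<Entry>> index =
        new Dictionary<string, LinkedListNode<Entry>>();

    public ExpiringLruCache(int highWatermark)
    {
        this.highWatermark = highWatermark;
    }

    public int Count { get { return index.Count; } }

    public void Put(string key, string value, TimeSpan timeToLive, DateTime nowUtc)
    {
        Remove(key);
        Entry entry = new Entry { Key = key, Value = value, ExpiresUtc = nowUtc + timeToLive };
        index[key] = recency.AddFirst(entry);
        while (index.Count > highWatermark)
        {
            Remove(recency.Last.Value.Key);   // evict the least recently used item
        }
    }

    public bool TryGet(string key, DateTime nowUtc, out string value)
    {
        value = null;
        LinkedListNode<Entry> node;
        if (!index.TryGetValue(key, out node)) return false;
        if (nowUtc >= node.Value.ExpiresUtc)
        {
            Remove(key);   // expired items are removed from the cache
            return false;
        }
        recency.Remove(node);
        recency.AddFirst(node);   // mark as most recently used
        value = node.Value.Value;
        return true;
    }

    private void Remove(string key)
    {
        LinkedListNode<Entry> node;
        if (index.TryGetValue(key, out node))
        {
            recency.Remove(node);
            index.Remove(key);
        }
    }
}
```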

Notifications
In a distributed architecture, because many clients work with the same data, it may be useful for a client to know when a cache item is changed by another client. Notifications can be set on a region or a cache item level. When set, a notification event is generated for any change (add, put, remove) to the region or cache items.
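The idea of region-level and item-level notifications can be sketched with plain callbacks, as below. This is an in-process, synchronous toy: `NotifyingCache`, `CacheOperation`, and `Subscribe` are hypothetical names, and the sketch does not show how AppFabric actually delivers notifications to remote clients.

```csharp
using System;
using System.Collections.Generic;

public enum CacheOperation { Add, Put, Remove }

// Hypothetical cache (standing in for one region) that raises a callback
// on every change.
public class NotifyingCache
{
    private readonly Dictionary<string, string> data = new Dictionary<string, string>();
    private readonly List<Action<CacheOperation, string>> regionCallbacks =
        new List<Action<CacheOperation, string>>();
    private readonly Dictionary<string, List<Action<CacheOperation, string>>> itemCallbacks =
        new Dictionary<string, List<Action<CacheOperation, string>>>();

    // Region-level subscription: fires for a change to any item.
    public void Subscribe(Action<CacheOperation, string> callback)
    {
        regionCallbacks.Add(callback);
    }

    // Item-level subscription: fires only for changes to the given key.
    public void Subscribe(string key, Action<CacheOperation, string> callback)
    {
        List<Action<CacheOperation, string>> list;
        if (!itemCallbacks.TryGetValue(key, out list))
        {
            list = new List<Action<CacheOperation, string>>();
            itemCallbacks[key] = list;
        }
        list.Add(callback);
    }

    public void Put(string key, string value)
    {
        CacheOperation op = data.ContainsKey(key) ? CacheOperation.Put : CacheOperation.Add;
        data[key] = value;
        Notify(op, key);
    }

    public void Remove(string key)
    {
        if (data.Remove(key)) Notify(CacheOperation.Remove, key);
    }

    private void Notify(CacheOperation op, string key)
    {
        foreach (Action<CacheOperation, string> callback in regionCallbacks) callback(op, key);
        List<Action<CacheOperation, string>> list;
        if (itemCallbacks.TryGetValue(key, out list))
        {
            foreach (Action<CacheOperation, string> callback in list) callback(op, key);
        }
    }
}
```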
Deployment Topology
The AppFabric cache runs as a service that can be accessed from client applications running on the same computer or on other client computers. Applications can access the cache through the .NET client APIs. The client API uses TCP/IP to communicate with the service.

Figure : Deployment Topology

ASP.NET Integration
AppFabric provides a SessionStoreProvider class that plugs into the ASP.NET session storage provider model and stores session state in a partitioned cache. This enables non-sticky routing and ensures that session information is available across the cluster. ASP.NET applications using AppFabric can scale out by adding more nodes, and can increase availability by configuring secondary caches for the session data.
