The FAN Vision
With recent studies suggesting that file data growth and file management have become top IT priorities, organizations are starting to look closely at their approach to storing and accessing data, especially in light of incessant storage demand that shows no sign of abating. What they are finding is a disturbing correlation: increasing storage demand is driving organizations to increasingly complex and costly storage infrastructure. Today's seemingly proven solutions - more and bigger SANs - only serve to increase the cost and complexity without effectively satisfying the demand.
Not all storage demand, however, requires the same storage infrastructure. The shift in storage demand from block-based data to file-based data - a shift leading industry analysts see as only accelerating and becoming more pronounced going forward - opens up new possibilities that extend far beyond storage as usual. These new possibilities give organizations for the first time a realistic hope that they can accommodate growth in storage demand without a corresponding increase in storage complexity and cost.
The new possibilities hinge on the fact that file-based data, accessed at the file level through file systems, can be managed by intelligent systems. These systems bridge the worlds of applications and file resources by making the resources transparent to the application through the use of services. Four components combine to deliver these possibilities: metadata, virtualization, intelligent file area networks and namespace.
- File area network (FAN) uses automated intelligence to apply business-level controls to file-based data through a federated global namespace. Although conceptually similar to the storage area network (SAN) applied to block-based data, the FAN is distinguished by its ability to tap the power of file-based metadata to deliver a level of network-based automation and control not possible with a SAN.
- Metadata consists of information about the file-based data and its usage. As a higher level of data abstraction, files make it possible to convey information about the data as well as the data itself. Through the use of metadata attached to files, intelligent systems can identify and manage the data based on business values, such as age of the data, frequency of use and ownership of the data.
- Virtualization enables the simplification of the storage infrastructure by masking the underlying complexity of the storage device and the specific location of the data. It makes it possible to move, access and manage data without regard to its actual physical storage. In the process, virtualization reduces the cost of owning and managing the data and the storage infrastructure.
- Namespace provides the ability to organize, present and store file-based data. With these abilities, especially organizing and presenting, the namespace, in effect, becomes the heart of the FAN, where its key functions are performed.
Organizations are adopting the FAN as a nondisruptive complement to the existing storage infrastructure. It allows them to massively scale and centrally manage their file-based storage. The FAN coexists easily with SANs, handling a data type (files) and metadata that are not part of the SAN design. Thus, the FAN becomes an essential enabler for information lifecycle management (ILM), enterprise content management (ECM) and content-addressable storage (CAS). Virtualization ensures that the actual physical location of the data and the specifics of the storage device are of no consequence.
The FAN, through the use of metadata, also enables the development of intelligent services for file-based data. Through the FAN, organizations can deploy, automate and manage policy-based services that provide access controls, move or replicate data, implement storage tiering and balance loads.
Ultimately, as the FAN vision is realized, the FAN's intelligent services will finally bridge the worlds of applications and file resources, bringing about long-awaited possibilities:
- Massive global scalability,
- Transparent and secure global access to data,
- Efficient, centralized global data and storage management, and
- Seamless, fully meshed FAN/SAN infrastructure.
Resolving the Storage Infrastructure Paradox
In a recent survey, the Taneja Group found that 62 percent of IT decision-makers identified file data growth and file management as two of their top priorities.1 File-based data is all the unstructured data that sits outside of databases - office documents, presentations, email, reports, records, audio and video content, images - yet is still stored and accessed by systems.
Previously, IT focused on structured data, the core transactional and operational data that sits at the heart of the enterprise. It has become apparent, however, that structured data isn't the problem now and will be even less of a problem going forward. The problem, as the respondents to Taneja's survey made clear, is unstructured, file-based data. As it turns out, 85 percent of all data is stored as unstructured data.2 Eighty percent of business is conducted with unstructured information.3
Furthermore, unstructured data, according to Gartner, is growing at an unprecedented rate, doubling every three months.4 This incessant growth leads to what is referred to as the storage paradox: the need to store increasing amounts of data only complicates the storage infrastructure, adding to its complexity and cost.
Fortunately, unstructured or file-based data is different from structured or block-based data. File data can be handled at a higher level of abstraction, which simplifies how the data is stored, accessed and managed. By taking advantage of the differences between file and block data, organizations can effectively resolve the storage paradox.
In effect, organizations can store increasingly greater amounts of file-based data without increasing the complexity of the storage infrastructure or the cost of managing that data, which is the largest cost associated with data storage. Resolving the storage paradox opens up a wealth of new possibilities.
This article introduces the concept of the intelligent file area network (FAN). The FAN is the key to resolving the storage paradox and opening those possibilities. The article explains how the FAN makes it possible to leverage those new file data possibilities. Specifically, it will:
- Explain the power of virtualization, namespaces and metadata;
- Position FAN in the existing storage/systems infrastructure;
- Describe what can be done with FAN today;
- Introduce intelligent FAN-based services; and
- Describe the future use of FAN.
The File Area Network
The FAN is a file-based approach to storing and managing file-based data as a single, logical pool of data. It provides heterogeneous intelligent file virtualization. Files are stored, accessed, shared and managed based on their unique names as if they all were stored in one place and one device, even though they actually may reside in different places on the network and in different devices. The intelligent virtualization in the FAN makes those differences transparent to the applications, users and administrators who use and manage the file-based data.
The FAN consists of the following elements:
- Storage devices, either SAN or NAS;
- File servers, able to manage data at the file level;
- Namespaces, which organize, present and store file data;
- File management, intelligent software that interacts with the namespace;
- Policy driven, real-time services, which act on the stored data;
- Client systems, which access the namespaces over the network; and
- Network connectivity, typically supporting NFS, CIFS, or other standard protocols.
At one level, the FAN resembles the SAN. Both provide a network-accessible, logical pool of shared storage. FANs, however, handle data at the file level, where each file has a unique name and carries business and application context that can be tapped for management purposes and service delivery. In contrast, SANs handle data at the block level; blocks are not sufficiently unique to be globalized and are too low-level to provide business and application context.
Tapping the Power of Virtualization, Namespaces, Metadata
The FAN includes three key elements that allow it to achieve its distinctive results: virtualization, namespaces and metadata.
Virtualization enables the simplification of the storage infrastructure by masking the underlying complexity of the storage device and the specific location of the data. Storage has been using virtualization for a long time. The FAN uses virtualization to separate the logical view of the data from the specifics of the storage device and its physical location. Virtualization makes it possible to move, access and manage data logically without regard to its actual physical storage. In the process, virtualization reduces the cost of owning and managing the data and the storage infrastructure.
Namespace provides the ability to organize, present and store file-based data. It serves the same function as the switching fabric in the SAN, but with one critical distinction: the namespace performs its functions on the logical file-based data as described in the metadata, not on the physical storage device. The namespace, in effect, becomes the heart of the FAN, performing its key functions. There are several kinds of namespaces: nonshared, shared and global or federated. Each type of namespace supports a different level of sharing. The global or federated namespace is central to achieving a FAN.
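The separation of logical names from physical locations can be illustrated with a minimal sketch. Everything here is hypothetical (the class names, device names and paths are invented for illustration), and a real FAN would copy the data before updating the mapping; only the name-to-location table is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    device: str  # hypothetical device name, e.g. a NAS filer
    path: str    # path on that device


class GlobalNamespace:
    """Maps stable logical file names to their current physical locations."""

    def __init__(self) -> None:
        self._table: dict[str, PhysicalLocation] = {}

    def publish(self, logical_path: str, location: PhysicalLocation) -> None:
        # Make a file visible under a logical name.
        self._table[logical_path] = location

    def resolve(self, logical_path: str) -> PhysicalLocation:
        # Clients ask by logical name and never see the device directly.
        return self._table[logical_path]

    def migrate(self, logical_path: str, new_location: PhysicalLocation) -> None:
        # Only the mapping changes; the logical name clients use stays the same.
        if logical_path not in self._table:
            raise KeyError(logical_path)
        self._table[logical_path] = new_location


ns = GlobalNamespace()
logical = "/corp/finance/q3-report.doc"
ns.publish(logical, PhysicalLocation("nas-01", "/vol2/fin/q3-report.doc"))
ns.migrate(logical, PhysicalLocation("ata-archive-07", "/cold/fin/q3-report.doc"))
print(ns.resolve(logical))
```

After the migration, the application still opens the same logical path, which is the sense in which the physical location becomes transparent.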
Metadata consists of information about the file-based data and its usage. This includes information about the file, where it resides, how to access it and the type of file it is. Metadata also can include information about when a file was last used, who created it and who used it. In the future, metadata will even include key words describing the contents of a file. As a higher level of data abstraction, files make it possible to capture information about the context of the data. Through the use of metadata attached to files, intelligent systems can identify and manage the data based on context and business values, such as age of the data, frequency of use, ownership and ultimately, the content or meaning of the data.
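As a rough illustration of the kind of file-level metadata described here, the sketch below reads size, owner and timestamps from an ordinary file system using Python's standard library. The function name and the returned field names are assumptions made for this example, not part of any FAN product; the temporary file and the backdated timestamps exist only to make the example self-contained.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400


def file_metadata(path: str, now: float) -> dict:
    """Collect basic metadata that a policy could act on."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,
        "owner_uid": st.st_uid,  # numeric owner id (0 on some platforms)
        "days_since_modified": (now - st.st_mtime) / SECONDS_PER_DAY,
        "days_since_accessed": (now - st.st_atime) / SECONDS_PER_DAY,
    }


# Create a throwaway file to inspect.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"quarterly report")
    sample_path = tmp.name

# Backdate both timestamps to simulate a file untouched for 200 days.
now = time.time()
backdated = now - 200 * SECONDS_PER_DAY
os.utime(sample_path, (backdated, backdated))

meta = file_metadata(sample_path, now)
os.remove(sample_path)
print(meta)
```

Block-level storage has no equivalent of these attributes, which is why age, usage and ownership can drive management decisions only at the file level.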
The intelligence built into the FAN uses virtualization, the capabilities of the namespace, and metadata to deliver the benefits of the FAN. Through the FAN, every node is aware of local and global resources and all other users. By acting on policies, the FAN can apply business-level controls to file-based data. Unlike the SAN, the FAN is able to tap the power of file-based metadata to deliver a level of network-based automation and control not possible with a SAN.
FAN in the Existing Storage/Systems Infrastructure
The FAN is not a replacement for the SAN. Each handles a different form of data with different properties: block-level for the SAN and file-level for the FAN. As such, the FAN is a complementary component. It can use the SAN for its physical storage. Its behavior is nonintrusive and nondisruptive. Block-level data can even be stored as files on the FAN, although database performance may suffer.
As organizations increasingly store different types of data - email, records, documents, images, rich media - and deploy enterprise content management (ECM) and content addressable storage (CAS) systems, the FAN will enable all this file-based data to be stored, accessed and managed in an efficient, scalable way.
Organizations also are increasingly interested in information lifecycle management (ILM). ILM refers to systems that store and move data to the appropriate platform in terms of performance and cost, based on information about the data. For example, files that have not been accessed for six months can be moved to a lower-cost, lower-performing storage platform. Files that are especially critical to business continuity can be kept on platforms that are highly protected through costly mirroring and replication.
The FAN, with its ability to understand metadata and act on policies related to that metadata, is an ideal platform for ILM. Through automation and intelligent services, the FAN can enable ILM to operate seamlessly, transparently moving data based on policies while keeping it accessible.
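The tiering decision described above can be sketched as a small function. The tier names are hypothetical, and six months is approximated as 180 days; both are assumptions for illustration rather than values from any particular ILM system.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    days_since_accessed: int
    business_critical: bool = False


def choose_tier(info: FileInfo, idle_threshold_days: int = 180) -> str:
    """Pick a storage tier from file metadata (tier names are illustrative)."""
    if info.business_critical:
        # Critical files stay on the costly, mirrored and replicated platform.
        return "tier1-mirrored"
    if info.days_since_accessed >= idle_threshold_days:
        # Files idle for roughly six months go to lower-cost storage.
        return "tier3-low-cost"
    return "tier2-standard"


for f in [
    FileInfo("orders.db.bak", days_since_accessed=400, business_critical=True),
    FileInfo("2004-tradeshow.ppt", days_since_accessed=365),
    FileInfo("budget-draft.xls", days_since_accessed=3),
]:
    print(f.name, "->", choose_tier(f))
```

Checking the business-critical flag first reflects one possible precedence; an organization could just as reasonably order its rules differently.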
The FAN Today
FANs have already been implemented based on file-level virtualization, federating the existing file storage into a unified global namespace that is centrally managed with real-time management policies to ensure existing resources are fully optimized. Each node in the FAN is aware of the other objects, federated together in a global namespace. The FAN knows where and how to find every resource and provides administration and management for file-based data that may be widely dispersed throughout the enterprise. It satisfies file requests, typically, via CIFS or NFS.
The FAN delivers for file-based data the same benefits a SAN provides for block-level data. These include:
- Consolidation of data (file server consolidation),
- Shared, logical pooling for management efficiency,
- Centralized administration for high productivity,
- Improved file server utilization for greater ROI,
- Transparent access to file-based data anywhere on the network, and
- Non-disruptive migration of files anywhere across the network.
These benefits combine to provide a lower total cost of ownership (TCO) for file-based data, increased user and administrator productivity, and ultimately increased ROI.
Early adopters report significant results from their FAN implementations. Savings typically come from reduced disk expenditures and from the streamlining of backup, migration and disaster recovery processes:
- A large consumer electronics marketer reported completing a multiterabyte data migration without disruption, with automation reducing the cost.
- A global equipment rental company used the FAN to switch to lower-cost ATA disk, resulting in a 50 percent savings while reducing backup and replication time.
- An entertainment company used the FAN to increase the amount of music it digitized by over 500 percent while reducing operating costs associated with managing the storage by 20 percent.
- An international trade show producer used the FAN to implement a tiered storage strategy that reduced disk spending 50 percent while streamlining both its Tier 1 and 2 backup processes.
Through FAN virtualization, files can be moved and accessed logically without regard to their actual physical address. Similarly, management can be centralized, with administrators working logically rather than having to deal with actual physical addresses and specific devices.
Intelligent FAN-Based Services
The FAN brings another key capability that differentiates it from a conventional SAN - intelligent services. Intelligent FAN services are policy-driven advanced file controls that perform specific tasks on the stored data. FAN services enforce policies, such as access controls, and perform sophisticated file routing functions, such as moving files to less costly storage based on aging or usage.
The combination of metadata and virtualization makes intelligent services possible. The FAN monitors activity, resource capacity and network conditions in real time and takes actions based on predefined rules (policies). The metadata provides the information upon which predefined rules can be applied. For example, an organization might have a rule to move every file that has not been accessed for 180 days to a lower-cost storage platform or to provide access to certain files only to people associated with a particular business unit. Virtualization ensures that the FAN can always find the data, even if it is moved.
The FAN also can provide high-availability capabilities like failover through the use of intelligent services. The FAN could, for instance, monitor storage devices in a NAS cluster and transparently fail over when it senses failure conditions, preserving the integrity of processes underway at all times.
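One way to picture predefined rules acting on metadata is as a list of condition and action pairs. The sketch below is an assumption-laden illustration: the `Policy` structure, the action strings and the business-unit check are invented for this example and do not describe any vendor's policy language.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FileMeta:
    path: str
    days_since_accessed: int
    business_unit: str


@dataclass
class Policy:
    name: str
    condition: Callable[[FileMeta], bool]  # evaluated against file metadata
    action: str                            # hypothetical action label


POLICIES = [
    # The 180-day rule from the text, expressed as a predicate.
    Policy("archive-idle", lambda f: f.days_since_accessed >= 180, "move:low-cost-tier"),
]


def actions_for(meta: FileMeta, policies: list[Policy]) -> list[str]:
    """Return the actions whose conditions match this file's metadata."""
    return [p.action for p in policies if p.condition(meta)]


def may_access(user_unit: str, meta: FileMeta) -> bool:
    """Access rule: only members of the owning business unit may read the file."""
    return user_unit == meta.business_unit


old_file = FileMeta("/corp/legal/contract-2003.doc", 420, "legal")
print(actions_for(old_file, POLICIES))
print(may_access("legal", old_file), may_access("sales", old_file))
```

Keeping conditions as data rather than hard-coded branches is what lets administrators add or change rules without touching the service that enforces them.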
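A failover service of this kind can be approximated as resolving a file to the first healthy device that holds a copy. This is only a sketch: the device names and the health map are hypothetical, and real failover must also handle in-flight operations, which this example does not attempt.

```python
def resolve_with_failover(replicas: list[str], healthy: dict[str, bool]) -> str:
    """Return the first replica device currently reported healthy.

    `replicas` is ordered by preference; `healthy` is a hypothetical
    monitoring snapshot mapping device name to status.
    """
    for device in replicas:
        if healthy.get(device, False):  # unknown devices count as unhealthy
            return device
    raise RuntimeError("no healthy replica available")


replicas = ["nas-a", "nas-b"]
print(resolve_with_failover(replicas, {"nas-a": True, "nas-b": True}))
print(resolve_with_failover(replicas, {"nas-a": False, "nas-b": True}))
```

Because clients address the file by its logical name, the switch from one device to the other is invisible to them.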
FAN services include:
- Migration services that move files nondisruptively through a shared or global namespace,
- Replication services that copy files nondisruptively between resources and geographies,
- Placement services that assign files to specific devices based on attributes of the given device,
- Classification services that allow content-level indexing of the data for use with policy-based controls,
- Access control services that determine which users and applications can access data based on policies, and
- Extension services that extend the namespace across the WAN to connect various geographies as a single federated namespace.
Other services, in the future, may provide network optimization of various sorts and application acceleration. Over time, third parties will create a robust market for FAN-based services.
FAN services make it possible for organizations to reduce the TCO of their file-based storage through automation, improve file storage performance and flexibility, and streamline the administration and management of their file-based data.
The FAN-Based Services-Driven Infrastructure of The Future
The FAN is a new enterprise-wide architectural and methodological approach to storage delivering both scale and scope. It leverages intelligent file virtualization, network-based policy enforcement and global access in an open file storage architecture. The FAN maximizes the use of metadata in guiding the enforcement of automated policies on the data. The policies direct everything about the data, from where it is stored, how long it is kept, and who accesses it to where and when it is moved or copied. These policies trigger FAN services that perform the desired work automatically.
The resulting FAN is heterogeneous, capable of functioning in real time, flexible, easily scalable across a global unified namespace, and centrally managed through automated FAN services. It will allow for the policy-driven transparent movement of files between different storage tiers, whether for the purpose of data protection or load balancing. And most importantly, it will do all this while taking advantage of the existing file infrastructure.
The FAN is the key to:
- Massive global scalability,
- Transparent secure global access,
- Efficient centralized global management, and
- Seamless, fully meshed FAN/SAN infrastructure.
In the process, the FAN will make it possible for organizations to finally resolve the paradox of having to build increasingly costly and complex storage infrastructures to meet demand for storage. Instead, organizations will be able to satisfy the growing demand for storage and improve storage performance while reducing the storage TCO.
Bridging the Worlds of Applications and File Resources
SANs, by functioning at the block level, only know the world of storage. That makes it difficult for the SAN to intrinsically recognize and handle the stored data based on its business context and value.
The FAN, by contrast, operates at the file level. It can take advantage of file-oriented metadata that conveys business context and manage the data accordingly. Through the combination of virtualization and metadata, the FAN works with data and relationships, not the actual physical device and network addresses. As such, the FAN can manage applications and their data independent of the storage infrastructure.
By adding policy-driven, real-time services, the FAN can automate file management by fully exploiting services that operate at the file level. Ultimately, in an increasingly file-based data environment, the FAN bridges the worlds of applications and files by leveraging business context to streamline file storage infrastructure and management, rein in cost, and scale without adding complexity.
References:
1. Taneja Group.
2. Butler Group.
3. Gartner, Inc.
4. Gartner, Inc.
Kirby Wadsworth is a skilled professional with more than 25 years of experience developing and implementing breakthrough strategies for emerging and established companies. He has played a key role in several significant data storage industry transformations. Wadsworth joined Acopia from Revivio where, as senior vice president, Marketing and Business Development, he was pivotal in creating the continuous data protection market. He is a frequent speaker at industry conferences and events, and a contributing author to numerous publications. He may be reached at kirbyw@acopia.com.