Houston-based LJA Engineering, Inc. has been in business since 1972 when John “Dutch” Lichliter founded The Lichliter Company. As the company grew, it became LJA Engineering and Surveying in 1997, and then evolved again in 2011 to become the present-day LJA Engineering, Inc. The employee-owned firm has scaled from 47 employees to more than 600, with multiple offices in Texas, Nebraska, and Florida. It is now a full-service firm that provides engineering, planning, and design services for private and government clients.
LJA Engineering manages more than 300 terabytes (TB) of data that it is required, by industry standard, to keep for a minimum of 10 years. The firm’s data is growing at approximately 20 percent annually. The engineering firm needed a solution that would allow it to identify legacy data and transparently move it to a live archive on a daily basis. Its network environment included several HP MSL 4048 tape libraries, but LJA was looking for a network-attached storage (NAS) product that would provide a live archive and incorporate easily into its existing infrastructure.
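To put the growth figure in perspective, the following is a rough projection only. It assumes the 20 percent annual growth compounds from a 300-TB starting point; the compounding assumption and the function name are illustrative and do not come from LJA.

```python
def projected_capacity_tb(start_tb: float, annual_growth: float, years: int) -> float:
    """Compound growth: start * (1 + rate) ** years."""
    return start_tb * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    # Assumed inputs: 300 TB today, 20 percent growth per year.
    for year in (1, 5, 10):
        print(f"year {year}: about {projected_capacity_tb(300, 0.20, year):.0f} TB")
```

Under these assumptions the data set would grow roughly sixfold over the 10-year retention window, which helps explain the interest in a lower-cost archive tier.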
A NAS device connected to a network allows storage and retrieval of data from a centralized location for authorized network users and heterogeneous clients. NAS devices are flexible and scale out: as users need additional storage, they can add on to the existing system.
The firm also wanted visibility across its storage. Many storage applications and infrastructure products only allow users to see the data located in a specific application (or type of hardware). Visibility across all storage provides a single pane of glass that allows users to manage all aspects of storage (primary storage, secondary storage, archive storage), and drastically reduces the time and effort needed to manage data.
Additionally, LJA wanted an automated approach to continuously identify “cold data” and transparently move it by policy without changes to user and application access. Cold data, sometimes referred to as “historically critical data,” is data that is not actively being accessed. Organizations often use data to produce results, run analytics, prove a concept, provide video evidence, and so on. As the data grows, the older data becomes cold data and is rarely accessed, said Brian Grainger, Spectra Logic CSO. When organizations move this data off expensive primary storage to a lower-cost second tier of storage, they reduce the total cost of storage and free up valuable space on their primary storage system.
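As a minimal sketch of what identifying cold data can mean in practice, the snippet below walks a directory tree and flags files whose last-access time is older than a threshold. The function name, the threshold, and the use of last-access time as the only criterion are assumptions for illustration; the article does not describe how Komprise classifies data, and access times can be unreliable on file systems mounted with options that suppress atime updates.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional


def find_cold_files(root: str, max_idle_days: float,
                    now: Optional[float] = None) -> Iterator[Path]:
    """Yield files under root whose last access is older than max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.stat().st_atime < cutoff:
                    yield path
            except OSError:
                continue  # file vanished or is unreadable; skip it
```

A policy engine could feed the resulting paths to a mover that relocates them to the archive tier. A hypothetical call would look like `list(find_cold_files("/projects", 365))`.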
The cost of cold data is often not only the cost of storing that data on expensive NAS storage, but also the cost of replication, backup, and data protection of that footprint, said Komprise President and COO Krishna Subramanian. Data management is 80 percent of the costs, and when a solution such as Komprise moves data to less expensive storage, it cuts not only the primary storage costs but also the costs of replication and backup.
LJA Engineering considered HPE, Tegile, and EqualLogic offerings, but ultimately decided on a Spectra Verde NAS Solution in combination with Komprise software (see Figure 1). “We chose Spectra Logic’s Verde NAS solution for its industry reputation, price point, and integration with Komprise, which provided us with the visibility and policy-based automation we need to create an active archive,” said David Kimball, LJA Engineering’s IT manager.
Active archive is a concept that has been crafted into a full-scale data management approach and is highlighted by the Active Archive Alliance (www.activearchive.com), Grainger said. The concept is based on the idea that users have access to all of their data all the time. This is important because users can move cold (or inactive) data to a lower-cost tier of storage without sacrificing the speed of access they have with primary storage. There is usually a software layer that acts as the director of data to ensure that when users request data, they receive the data they need in the most efficient way.
From traditional hierarchical storage management systems to data-mover software, active archives are designed to give superior access speed at an affordable price. Historically, this has been a difficult storage infrastructure to set up, and that was the reason for the Active Archive Alliance: to reduce the complexity and bring multiple vendors who specialize in this concept together into a single solution. Modern software packages such as Komprise are a perfect example of the software layer needed to create and manage an active archive solution. Paired with Spectra’s affordable storage solution, this creates a never-before-seen solution that provides affordable, fast, and responsive data storage for any organization, Grainger said.
LJA Engineering’s Verde NAS Solution holds 75 8-TB disk drives and currently stores 250 TB of data. According to Spectra Logic, the Verde NAS Solution is intuitive, easy to use, and offers the lowest cost-per-terabyte on the market, as low as 7.5 cents per gigabyte. The company describes the Spectra Verde NAS Solution as the optimal disk platform for the storage of mid-tier data, including primary storage offload, data staging, backup, and archiving.
Mid-tier data, another term for cold data, is data that has been offloaded from primary storage and moved to a lower-cost tier of storage. In the past, the storage pyramid had three to four tiers, with cost falling at each step down the pyramid, but the lower tiers also had slower access times or fewer features than primary storage, Grainger said.
A staging area, or landing zone, is an intermediate storage area used for data processing during the extract, transform, and load (ETL) process. The data staging area sits between the data source(s) and the data target(s), which are often data warehouses, data marts, or other data repositories. Staging areas can be designed to provide many benefits, but the primary motivations for their use are to increase efficiency of ETL processes, ensure data integrity, and support data quality operations. The functions of the staging area include the following:
- minimizing contention,
- scheduling independently,
- detecting changes,
- cleansing data,
- precalculating aggregates, and
- archiving data.
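Two of the functions listed above, cleansing and change detection, can be sketched in a few lines. This is an illustrative example only: the row layout, the `id` key, the whitespace-trimming rule, and the fingerprinting approach are all assumptions, not a description of any product mentioned in this article.

```python
import hashlib
import json


def row_fingerprint(row: dict) -> str:
    """Stable SHA-256 hash of a row, independent of key order."""
    payload = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def stage_changes(source_rows, seen, key="id"):
    """Cleanse rows, then return only those that are new or changed.

    seen maps each row key to the fingerprint staged last time and is
    updated in place, so repeated runs forward only the differences.
    """
    changed = []
    for row in source_rows:
        # Cleansing step (assumed rule): trim stray whitespace from strings.
        cleaned = {k: v.strip() if isinstance(v, str) else v
                   for k, v in row.items()}
        fingerprint = row_fingerprint(cleaned)
        if seen.get(cleaned[key]) != fingerprint:
            seen[cleaned[key]] = fingerprint
            changed.append(cleaned)
    return changed
```

Because only changed rows leave the staging area, the downstream load touches the target less often, which is one way a staging area reduces contention.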
The expandable Verde disk solution provides raw storage capacities from 48 TB to 7.1 petabytes. Designed for a variety of workloads, a single Verde NAS solution supports three disk drive types, including 4-TB, 8-TB, and 12-TB enterprise drives; 8-TB archive drives; and high-performance SSD drives.
Komprise, which partners with Spectra Logic, aims to help businesses handle the incoming flood of data by providing automated software that adapts to the customer’s environment and scales across their storage silos, both on premises and in the cloud. The Komprise data-aware management software empowers businesses to manage today’s massive scale of data growth while unlocking data value.
Komprise analyzes data across Network File System (NFS) and Server Message Block/Common Internet File System (SMB/CIFS) storage, and moves data transparently by policy across NFS, SMB/CIFS, and REST/S3 storage (also known as object or object-based storage). The modern scale-out architecture requires no agents, no dedicated hardware or storage, and no complex setup or proprietary integrations. It delivers native access to data in the cloud and is delivered as a hybrid cloud service.
NFS is a client/server application that lets a computer user view, and optionally store and update, files on a remote computer as though they were on the user’s own computer. The NFS protocol is one of several distributed file system standards for NAS. NFS is widely deployed to host VMware datastores or share network folders in a Linux/UNIX environment.
SMB/CIFS storage is the standard way that computer users share files across corporate intranets and the internet. An enhanced version of the Microsoft open, cross-platform SMB protocol, CIFS is a native file-sharing protocol in Windows 2000.
REST/S3 (object) storage is a computer data storage architecture that manages data as objects, as opposed to other storage architectures, such as file systems, which manage data as a file hierarchy; and block storage, which manages data as blocks within sectors and tracks. Each object typically includes the data itself, a variable amount of metadata, and a globally unique identifier.
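The three parts of an object described above (data, metadata, and a globally unique identifier) can be illustrated with a toy in-memory store. This is a teaching sketch under stated assumptions: the class and method names are hypothetical, and it does not model the S3 API or any vendor's implementation.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes                                   # the payload itself
    metadata: dict = field(default_factory=dict)  # variable amount of metadata


class FlatObjectStore:
    """Objects live in one flat namespace keyed by ID, not in a file hierarchy."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        object_id = str(uuid.uuid4())  # random UUID as the unique identifier
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> StoredObject:
        return self._objects[object_id]
```

A caller keeps only the returned identifier, for example `oid = store.put(b"...", project="drainage-study")`, and later retrieves both the payload and its metadata with `store.get(oid)`.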
Object storage can be implemented at multiple levels, including the device level (object storage device), the system level, and the interface level. In each case, object storage seeks to enable capabilities not addressed by other storage architectures, such as interfaces that can be directly programmable by the application, a namespace that can span multiple instances of physical hardware, and data management functions, like data replication and data distribution, at object-level granularity. Object-storage systems allow retention of massive amounts of unstructured data. Object storage is used for purposes such as storing photos on Facebook, songs on Spotify, or files in online collaboration services such as Dropbox.
LJA Engineering can now manage projects, user data, and files in a dual disk and tape network environment. The Verde NAS Solution, located in the firm’s Houston office, doubles as an ideal archive repository and a backup target with room to grow as data sets increase over time. Data is identified by Komprise and automatically archived to the Verde NAS Solution from various storage systems, including Windows and Linux file servers.
All of LJA’s offices are on a single network, which allows data housed in one office to be accessed by any of the other offices. This is true for data stored on primary storage or in the other tiers of storage, such as the Spectra Verde product. Each office runs a Komprise Observer virtual machine (VM) that analyzes the data across the storage located in that office and moves inactive data by policy to the Spectra Verde. Regardless of the location (in the same building or in another office), the data is sent over the network to the Verde, where it is stored.
With Komprise, the user experience remains unchanged, Subramanian said. Any user, accessing any file in any location, can simply click and open the file they need. The data is then pulled from either the primary storage, if that is where the data lives, or the Spectra Verde, if it has been moved to a lower-cost tier of storage.
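The idea that a request is served from whichever tier currently holds the file can be sketched as a simple ordered lookup. This is only a conceptual illustration: the article does not describe how Komprise redirects access, and the function name and the directory-per-tier layout are assumptions.

```python
from pathlib import Path
from typing import BinaryIO, Sequence


def open_from_tiers(relative_path: str, tiers: Sequence[str]) -> BinaryIO:
    """Open the file from the first tier that holds it.

    tiers is an ordered list of root directories, for example
    [primary_root, archive_root]; earlier entries take precedence.
    """
    for tier_root in tiers:
        candidate = Path(tier_root) / relative_path
        if candidate.is_file():
            return candidate.open("rb")
    raise FileNotFoundError(relative_path)
```

The caller uses the same relative path whether the file is still on primary storage or has been moved, which is the property the user-facing description above relies on.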
Komprise provides a single view of the organization’s data across its storage platform and makes it simple to reduce storage costs without impacting users. Primary storage offers high performance, but it is also expensive. Data stored on primary storage is often also replicated on primary storage and protected via backup software and storage, so every terabyte of data stored is actually 3 to 5 terabytes once managed, Subramanian said. More than 80 percent of this data is typically cold and not used within weeks of creation, and managing cold data on primary storage is expensive.
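The footprint arithmetic in this paragraph can be made concrete with a short calculation. It assumes that replication and backup multiply every logical terabyte by a fixed factor (the 3 to 5 quoted above) and that the cold fraction is moved off primary storage entirely; the function name and the 100-TB example are illustrative, and the sketch says nothing about actual prices.

```python
def primary_footprint_tb(logical_tb: float, copies_multiplier: float,
                         cold_fraction: float) -> tuple:
    """Return (before, after): managed primary footprint in TB.

    before: every logical TB is multiplied by replication and backup copies.
    after:  only the hot (non-cold) share remains on primary storage.
    """
    before = logical_tb * copies_multiplier
    after = logical_tb * (1.0 - cold_fraction) * copies_multiplier
    return before, after


if __name__ == "__main__":
    # Assumed example: 100 TB logical, 3x multiplier, 80 percent cold.
    before, after = primary_footprint_tb(100, 3, 0.80)
    print(f"managed footprint before: {before:.0f} TB, after: {after:.0f} TB")
```

With those assumed inputs, the managed primary footprint falls from about 300 TB to about 60 TB, while the 80 TB of cold data resides once on the lower-cost tier.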
With Komprise, organizations can easily locate the cold and inactive data and move it to the lower-cost tier of storage. This is the primary cost reduction for users, and by moving this inactive data out of the actively managed footprint, Komprise also eliminates the need to replicate and back up this footprint on expensive primary storage. Komprise, along with cost-efficient secondary storage such as Spectra Verde and BlackPearl products, eliminates more than 70 percent of storage costs, according to Subramanian.
One other major benefit of the solution is that the user experience is completely unchanged: no additional training is required, processes do not change, and the work being done is not interrupted. It is unusual for a storage solution not only to save money but also to leave the current workflow and user experience unchanged, Subramanian said.
“Together, the Spectra Logic and Komprise solution enables us to manage data growth efficiently,” Kimball said.