The market for document management has changed considerably in the last three years. What used to be strictly an image scanning and digital storage technology has blossomed, with the increased use of digitized and electronic documents, into a $1 billion industry that is presently re-branding itself as a "content management" industry. And, according to industry experts, this new segment is estimated to grow to $2.1 billion by 2009.
IBM Leads the Pack
Leading the industry in 2003 was IBM, with a 20% market share, followed by Documentum (14%) and Open Text (9%), according to research group Wintergreen Research. IBM, though not a newcomer to the content management business, has managed to significantly leverage its DB2 Content Manager solution, which blends WebSphere, DB2, Tivoli, and, of course, its eServer platforms and disk storage technologies, transforming the vision of document management into something vibrantly overwhelming. That something is called content management.
iSeries Is a Part of IBM's Solution...Sort Of
Like the other IBM eServers, a version of DB2 Content Manager is available on the iSeries platform and is used in some high-profile accounts, including FedEx. However, like many of IBM's solutions, DB2 Content Manager is scaled to meet enterprise-level computing needs. In this context, the iSeries version requires other IBM hardware platforms to deliver full functionality. Consequently, according to the specifications that IBM publishes, the best way to picture the iSeries version of DB2 Content Manager is as one of many "client" machines that use the WebSphere and Tivoli software in combination with a pSeries or zSeries or xSeries. These other machines are the muscle in this IBM scenario, allowing the iSeries to store and view the content, along with any PCs that are attached to the network.
This may disappoint iSeries aficionados, but that's the way Big Blue generally views the iSeries these days: the "little computer that won't go away." And, from this rather jaded perspective, it makes sense to IBM that a machine like the iSeries should not be used for document and/or content management when there are so many other "better" machines available.
SOX: Everybody Needs Document Management
But these days, with new document retention requirements coming out of the Sarbanes-Oxley laws, everybody needs document and content management capabilities.
The Sarbanes-Oxley Act (SOX) is the most sweeping legislation to affect public companies in over 60 years. Though designed as an accounting act with SEC oversight, the impact on IT departments is significant. SOX requires comprehensive internal controls to ensure the accuracy and integrity of financial reporting. It clearly states that financial reports must include internal documentation that addresses the process controls in place and their effectiveness.
So, though IBM may have 20% of the content management market share--focused on enterprise-level computing--there is a growing need for document and content management offerings that are specific to the small and medium-sized businesses that run iSeries hardware.
iSeries Products that Deliver on Document Management
In the past, iSeries document management was primarily focused upon getting reports out the door, using creative software that transformed spool files into fax transmissions, emails, Web pages, XML feeds, spreadsheet files, and PDF files. Today, there are many iSeries products that help to manage report output. These products include solutions from Help/Systems, like Robot/REPORTS, and a slew of RJS Software products too numerous to mention here. These solutions help by automatically bundling and distributing spool files, archiving reports, safeguarding confidential data, and providing retrieval services.
Other solutions streamline the process of scanning physical documents into the iSeries: products like RJS Software's Image Server/400 and WebDocs offerings. These solutions let customers store a digital representation of the document on the hard disk of the iSeries, convert it into text, catalog it, and/or transport the content to other recipients in the network.
Still other software products, like IBM's Lotus Domino Document Manager (formerly Domino.Doc), manage repositories of digital documents, tracking and controlling document versions, providing access security, and facilitating rapid transport and knowledge management through various workflow processes.
All these solutions and processes are good: They feed the network, increase productivity, and provide a stable environment that IT managers and their CFOs and CEOs can feel confident about. And all of them run on the iSeries.
The Future Challenges of Content Management
However, the challenges of IT document management in the future will be very different from those we see today. Today, document management mostly consists of converting physical documents into electronic documents, or transforming electronic documents into other forms of content. But the challenges of tomorrow will be with the content itself, the needs of collaboration, and the requirements of confidentiality.
Indicator Challenges: Enron/WorldCom and 9/11
Consider, for a moment, two significant earth-shattering events that occurred within the last four years.
The first event was the dual collapse of the Enron and WorldCom corporations, fiscal events that shattered the stock market and created the political drive for the present SOX legislation. Many of the SEC rules derived as a result of these fiscal debacles are focused upon tracking and auditing the financial claims that corporations make to their shareholders, ensuring that the content of financial claims is accurate.
SOX requires a new transparency in the record-keeping of corporations and the long-term retention of archives to support better accounting guidelines. By having better records available to auditors, SOX helps prevent the overstatement of profits based on unfounded internal processes. The Sarbanes-Oxley law places new legal strictures upon CEOs and CFOs, who in turn are requiring IT to maintain all important records in a readily accessible format. This translates into better ongoing collaboration of content between corporations, the firms that audit them, and the SEC itself.
But there was a second catastrophic event in the last four years, and one that is key to understanding the dichotomy of collaboration and confidentiality: the attacks on the United States by terrorists on 9/11. That catastrophe is the epitome of a document/content management nightmare.
Too Little Collaboration and Too Much?
Within these past few months, the Presidential Committee on 9/11 has quickly discovered that much of the information that the U.S. government needed to prevent the terrorist attacks was actually known to various departments within the FBI and the CIA. However, certain domestic laws and bureaucratic security procedures prevented those agencies from sharing the memoranda and files that contained the clues.
Had those clues been assembled, according to the committee, the chances for derailing the events of 9/11 would have been greatly increased. Better collaboration might have led to prevention.
However, there is a caveat and a warning in the committee's findings as well. This presidential committee also points out that the actual identities of several of the 9/11 hijackers were, in fact, stolen--with passport and driver's license documents forged from information obtained from insecure databases that the government itself shared with local officials.
In other words, the insecurity of the government's databases--used to collaborate with state officials--was also a key to the hijackers' abilities to shield themselves from government scrutiny. Though some of their identities were known to some officials, they merely selected someone else's identity from the databases to hide their actions from detection. As a result, hundreds of innocent law-abiding citizens were rounded up and detained by FBI officials, in many cases using the crudest of profiling data.
Collaboration vs. Confidentiality
So the challenge that faces document management is the transformation to real content management. And it will be a challenge for IT information management generally throughout the 21st century. How do we make information quickly accessible, deeply searchable, and easily collaborative, while simultaneously protecting the content itself from breaches in confidentiality?
Sound impossible? Well, IT scientists at IBM and other labs are now seeking to build comprehensive technologies that will bridge this gap between collaboration and confidentiality. Their models are not yet fully functioning, but their outlines are falling into place, consisting of multiple, stair-step processes.
The first process is document or content classification, wherein the content of documents and data is identified with ownership credentials and keyword search criteria. Birth certificates, Social Security statements, credit histories, telephone records, bank account transactions--these critical identity markers will be classified in tiers of importance by automated algorithms that mathematically profile the content.
This is followed by an encryption process, wherein the document's content is scrambled according to encryption algorithms. The de-scrambling formula is contained in a group of access-level digital certificates.
The owner or originator of the document will be provided with a digital certificate that allows read access to all elements contained within the content, enabling the owners to view the contents of records about them. However, once archived, the owner cannot change the content itself without going through a legal procedure.
Various other certificates, with different security levels, will grant access to specific areas of the document without compromising the identity or the confidentiality of the document's owner.
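The stair-step process described above (classify, scramble, then unlock by certificate level) can be roughly sketched in Python. This is only an illustration under stated assumptions, not how IBM's or any other lab's models work: the tier numbers, field names, and the Certificate class are hypothetical, and the XOR "cipher" is a toy stand-in for a vetted encryption algorithm that should never protect real data. The rule that an archived record cannot be changed without a legal procedure is not modeled here.

```python
import hashlib
import secrets
from dataclasses import dataclass

# Hypothetical classification tiers (higher = more sensitive); not from the article.
TIER_BY_FIELD = {
    "city": 1,
    "phone_records": 2,
    "bank_transactions": 3,
    "social_security_number": 4,
}
DEFAULT_TIER = 4  # unknown fields are treated as most sensitive


def classify(field_name: str) -> int:
    """Assign a field to a tier of importance."""
    return TIER_BY_FIELD.get(field_name, DEFAULT_TIER)


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream built from SHA-256 in counter mode. NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def scramble(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; applying it twice restores the data."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


@dataclass
class Certificate:
    """Hypothetical access certificate: holds the keys for the tiers it may read."""
    holder: str
    tier_keys: dict


def issue_certificate(holder: str, max_tier: int, all_keys: dict) -> Certificate:
    """Grant the keys for every tier up to and including max_tier."""
    granted = {tier: key for tier, key in all_keys.items() if tier <= max_tier}
    return Certificate(holder, granted)


def archive(record: dict, all_keys: dict) -> dict:
    """Classify each field, then scramble it under its tier's key."""
    sealed = {}
    for name, value in record.items():
        tier = classify(name)
        sealed[name] = (tier, scramble(all_keys[tier], value.encode("utf-8")))
    return sealed


def read(sealed: dict, cert: Certificate) -> dict:
    """Return only the fields whose tier key the certificate contains."""
    visible = {}
    for name, (tier, blob) in sealed.items():
        key = cert.tier_keys.get(tier)
        if key is not None:
            visible[name] = scramble(key, blob).decode("utf-8")
    return visible


# Usage: one random key per tier, an owner who sees everything, a clerk who sees tier 1.
keys = {tier: secrets.token_bytes(32) for tier in range(1, 5)}
record = {"city": "Boston", "social_security_number": "000-00-0000"}
sealed = archive(record, keys)
owner_view = read(sealed, issue_certificate("owner", 4, keys))
clerk_view = read(sealed, issue_certificate("clerk", 1, keys))
```

In this sketch the owner's certificate carries every tier key and so recovers the whole record, while the lower-level certificate recovers only the tier 1 field and never holds the key needed to unscramble the rest.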
Searching Securely with Confidentiality
Using this kind of security encryption/classification process, government employees with the appropriate access certificate would be allowed to progressively search and retrieve the confidential content contained within the documents at graduated levels of clearance, using keyword searches that are based upon their official access level. Implementing this technology, an agency could construct a database for correlation of specific types of data, collaborate with other agencies, and still shield the originators or owners of the content from identity theft.
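A graduated keyword search of this kind might look something like the following Python sketch. Everything in it is an assumption made for illustration: the sample documents, the tier numbers, the search function, and the idea of returning a keyed-hash pseudonym in place of the owner's identity so that results can be correlated without naming anyone. It also assumes the searchable text has already been unscrambled for the tiers the searcher is cleared to read.

```python
import hashlib
import hmac

# Hypothetical secret used to derive pseudonyms; a real key would be managed securely.
PSEUDONYM_KEY = b"demo-only-secret"

# Hypothetical archive: each field carries (tier, text); higher tier = more sensitive.
DOCUMENTS = [
    {
        "owner": "alice",
        "fields": {
            "city": (1, "Boston"),
            "bank_transactions": (3, "Wire transfer, 12 March"),
        },
    },
    {
        "owner": "bob",
        "fields": {
            "city": (1, "Boston"),
            "phone_records": (2, "Call to directory assistance"),
        },
    },
]


def pseudonym(owner_id: str) -> str:
    """Stable opaque token for an owner, so hits can be correlated without naming them."""
    digest = hmac.new(PSEUDONYM_KEY, owner_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def search(keyword: str, clearance: int) -> list:
    """Return (pseudonym, matching field names) for fields at or below the clearance."""
    needle = keyword.lower()
    hits = []
    for doc in DOCUMENTS:
        matched = sorted(
            name
            for name, (tier, text) in doc["fields"].items()
            if tier <= clearance and needle in text.lower()
        )
        if matched:
            hits.append((pseudonym(doc["owner"]), matched))
    return hits


# Usage: the same keyword yields more as the searcher's clearance rises.
low_clearance_hits = search("transfer", clearance=1)
high_clearance_hits = search("transfer", clearance=3)
```

The design choice worth noting is that the filter on tier runs before the keyword match, so a searcher with low clearance cannot even learn that a more sensitive field contains the keyword.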
Tomorrow from Today
Today, this kind of content management technology is still very rudimentary. Indeed, the very scope of classifying all the digital information that passes through our networks is a mind-boggling undertaking.
Yet the more of the vital information that runs our companies passes through the digital network--along with our own personal information--the more necessary it becomes for technology to thread the needle between demands for increased collaboration on the one hand and requirements for increased personal confidentiality on the other.
Laws like SOX help corporations verify that the claims made to shareholders are accurate and complete, and document management systems like those used on the iSeries are significantly assisting in that effort. Yet increased collaboration also brings the threat of theft of important information, and securing the well-being of individuals is also a requirement. So while better document management solutions help us to build the network for collaboration, in the future, content management technologies will help us to secure it from abuse.
Thomas M. Stockwell is Editor in Chief of MC Press, LP.
MC Press Online