by Darin Stewart | April 7, 2014 | 4 Comments
In my last post I discussed how enterprise search can bring big data within reach. To achieve this, however, crawling and indexing must move beyond traditional vertical scaling into a truly distributed model à la Hadoop and its cousins. The end product of integrating enterprise search and distributed computing is a scalable, flexible, responsive environment for information discovery and analytics. The system scales easily and efficiently in terms of both content size and query-handling capacity. It supports both batch-oriented content processing and near-real-time information access. The key-value orientation of MapReduce, along with the flexible schema and dynamic field support of the search engine, allows any form of content (structured, unstructured or semi-structured) to be fully leveraged. In short, the enterprise search infrastructure becomes a powerful NoSQL database.
There is currently no canonical definition of what constitutes a NoSQL database. According to Wikipedia, “NoSQL (Not only SQL) is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases. These data stores may not require fixed table schemas, usually avoid join operations and typically scale horizontally.” Even so, there are certain characteristics all NoSQL databases share. First and foremost, NoSQL supports schema-on-read, which means that any form of information can be written to the database, with a schema applied only when that information is retrieved for use. This reverses the schema-on-write approach of traditional relational databases, in which information must be made to conform to a particular schema before it can be stored. In addition, NoSQL databases prefer eventual consistency over ACID compliance and strict consistency. A search-driven approach to Big Data can facilitate each of the four common models for NoSQL information management: key-value stores, document databases, table-style databases and graph databases.
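To make schema-on-read concrete, here is a minimal Python sketch (the field names and sample records are invented for illustration): any record can be written as-is, and a schema is imposed only at retrieval time.

```python
import json

# Store: accept any record as-is -- no schema is enforced on write.
store = []

def write(record):
    store.append(json.dumps(record))  # persist raw, whatever its shape

def read(schema):
    """Apply a schema only at read time: project each stored record
    onto the requested fields, filling gaps with None rather than
    rejecting the record."""
    for raw in store:
        record = json.loads(raw)
        yield {field: record.get(field) for field in schema}

# Two very differently shaped records coexist in one store.
write({"title": "Q3 report", "author": "jsmith", "pages": 14})
write({"tweet": "loving the new release!", "geo": [45.5, -122.6]})

# One read-time schema spans both shapes.
results = list(read(["title", "author", "tweet"]))
```

A schema-on-write system would have rejected the second record; here the schema is just a question asked of the data at retrieval time.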
The search-driven approach I describe in the Gartner document The New NoSQL: How Enterprise Search and Distributed Computing Bring Big Data Within Reach offers several additional advantages that are not typically a standard part of more pure-play NoSQL solutions. An enterprise search platform offers text-oriented features that simplify the index generation process. Standard features include free-text search, faceting, spell checking, vocabulary management, similar-item search, hit highlighting, recommendation engines, visualizations, content rating and many other capabilities that augment and enhance an analytical index. The combination of NoSQL capabilities with the near-real-time information access afforded by enterprise search, and the ability to deploy both in the cloud, has the potential to unlock the full value of enterprise information assets and finally bring Big Data within reach.
Category: Big Content Uncategorized Tags: Big Content, Big Data, Enterprise Search, NoSQL
by Darin Stewart | April 1, 2014 | 5 Comments
Regardless of industry, sector or size, the promise of unlocking the full potential of enterprise information with big data technologies and big content techniques is tantalizing. Unfortunately, for most organizations, realizing the promise of big data remains out of reach. The perception is that there is just too much information to get under control, things change too quickly, and there are too many moving parts. The volume, velocity and variety conundrum stymies many potentially transformative undertakings before they ever make it off the whiteboard. The expense and disruption that are involved in expanding and retooling on-premises infrastructure remain insurmountable obstacles for many organizations that desire to undertake a big data initiative.
Providing big data functionality without overhauling the data center is achievable with a two-pronged approach that combines augmented enterprise search and distributed computing. Search is very good at finding interesting things buried under mounds of information. Enterprise search can provide near-real-time access across a wide variety of content types managed in many otherwise siloed systems. It can also provide a flexible and intuitive interface for exploring that information. The weakness of a search-driven approach is that a search application is only as good as the indexes upon which it is built. The traditional global search index, containing only the raw source content, is not sufficient to facilitate big data use cases. Multiple, purpose-built indexes derived from enriched content are necessary. Creating such indexes requires significant computational firepower and tremendous amounts of storage.
Distributed computing frameworks provide the environment necessary to create these indexes. They are particularly well-suited to efficiently collect extremely large volumes of unprocessed, individually low-value pieces of information and apply the complex analytics and operations that are necessary to transform them into a coherent and cohesive high-value collection. The ability to process numerous source records and apply multiple transformations in parallel dramatically reduces the time that is required to produce augmented indexes across large pools of information. Additionally, these operations and the infrastructure necessary to support them are open-source-oriented and very cloud-friendly. It is possible to establish a robust search-driven big data ecosystem without massive upfront investments in infrastructure.
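As a rough illustration of how such a framework turns raw, individually low-value records into a high-value index, here is a single-process Python sketch of the MapReduce pattern. In a real Hadoop deployment the map and reduce phases would run in parallel across a cluster; the documents here are invented, but the shape of the computation is the same.

```python
from collections import defaultdict

def map_record(doc_id, text):
    """Map phase: tokenize one raw record into (term, doc_id) pairs.
    Each record is processed independently, so this step parallelizes
    trivially across machines."""
    return [(term.lower(), doc_id) for term in text.split()]

def reduce_pairs(pairs):
    """Reduce phase: group the pairs by term into index postings."""
    index = defaultdict(set)
    for term, doc_id in pairs:
        index[term].add(doc_id)
    return index

docs = {1: "big data brings big promise",
        2: "enterprise search meets big data"}

pairs = [p for doc_id, text in docs.items() for p in map_record(doc_id, text)]
index = reduce_pairs(pairs)
# index["big"] -> {1, 2}
```

The output is exactly the kind of inverted index a search engine consumes, which is why the batch framework and the search platform fit together so naturally.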
Distributed computing and augmented enterprise search are two sides of the big data coin. Both are necessary, but neither is sufficient to facilitate many knowledge-intensive applications. Hadoop and its cousins are purely batch-oriented and so cannot provide the near-real-time access that is facilitated by search. Enterprise search provides rapid, easy access to information but cannot perform the complex analytics necessary to build the indexes supporting that access. Combining distributed computing and enterprise search provides a flexible, scalable and responsive architecture for many big data scenarios.
I explore this approach in depth in the Gartner document The New NoSQL: How Enterprise Search and Distributed Computing Bring Big Data Within Reach and will be speaking on it in London at the Catalyst Europe conference. I hope to see you there.
Category: Big Content search Uncategorized Tags: Big Content, Big Data, Distributed Computing, Enterprise Search, Hadoop
by Darin Stewart | May 24, 2013 | 2 Comments
Introducing the notion of Big Content has been an interesting study in reactions. To many, it has resonated and the possibility of more fully exploiting documents, social content and other unstructured resources has clicked. For others, it has been like fingernails on a chalkboard. A lot of us never liked the words “Big Data” in the first place, but like it or not we are stuck with the name. One of the reasons we didn’t like it (well…one of the reasons I’ve never liked it) is that “data” tends to emphasize structured resources like databases and logs to the neglect of more textual and free-form assets. We need them both. Big Content is simply shining a spotlight on the shadowy corners of the enterprise information ecosphere. Despite the idiosyncrasies of unstructured content and the unique demands of its management and analysis it remains fully a part of the Big Data world.
Nearly all steps and stages of preparing unstructured content for Big Data consumption have their analogue in the structured data world. Data must be cleaned, reconciled and modeled, just as documents must be processed and prepared. To a certain degree this is a case of performing the same task with different tools. While both types of information may be enriched, the nature of that enrichment will differ. Where a set of data points from a large array of sensors would be submitted to an inference engine to fill out a sparse data set, an equally large number of tweets would be analyzed for sentiment; both sets of information could then be geotagged.
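Here is a toy Python sketch of that enrichment step, with an invented keyword lexicon and place list standing in for the trained sentiment models and gazetteers a real pipeline would use:

```python
# Invented lexicon and gazetteer, purely for illustration.
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"hate", "awful", "broken"}
PLACES = {"portland": (45.52, -122.68), "london": (51.51, -0.13)}

def enrich(tweet):
    """Turn one raw tweet into an enriched record: a crude keyword
    sentiment score plus a geotag when a known place is mentioned."""
    words = [w.strip(".,!?").lower() for w in tweet.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    geo = next((PLACES[w] for w in words if w in PLACES), None)
    return {"text": tweet,
            "sentiment": "positive" if score > 0
                         else "negative" if score < 0 else "neutral",
            "geo": geo}

record = enrich("Love the new store in Portland!")
# record["sentiment"] -> "positive"; record["geo"] -> (45.52, -122.68)
```

The point is the shape of the operation, not its sophistication: a raw item goes in, an item augmented with analysis-ready fields comes out.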
Structured information resources have played a more prominent role in Big Data than unstructured resources, primarily because the enterprise is more comfortable managing databases than managing documents. Data hygiene and information quality are de rigueur for the data warehouse, but are often never considered in the ECM environment. Likewise, structured resources are more likely to be “hard-wired” into the Big Data pipeline with well-established connectors and regularly scheduled ETL windows. Unstructured content is often included almost as an afterthought, with extraction and enrichment applied on the fly, from scratch, on a case-by-case basis. This undermines the potential of Big Data in several ways. It raises the cost of incorporating unstructured content while increasing the opportunities for inconsistencies and errors, reducing the quality of the final product. Most importantly, the ad hoc approach also reduces the potential of Big Data by obscuring the extent of the available raw materials.
If unstructured content is difficult to find, reconcile and include in the analytical environment, it is unlikely that novel applications will even be conceived, much less acted upon. The idea of Big Content is simply to encapsulate and enumerate the steps necessary to avoid this unfortunate situation by extending the Big Data environment and infrastructure to incorporate unstructured content in a strategic and systematic manner.
I will be speaking and leading workshops on Big Content at several upcoming Gartner events including Catalyst, The London Portals, Content and Collaboration Summit and Symposium Orlando. I hope to see you there and to continue the conversation.
Follow Darin on Twitter: @darinlstewart
Category: Big Content Enterprise Content Managment Tags: Big Content, Big Data, catalyst, ECM, PCC, Symposium
by Darin Stewart | May 15, 2013 | 1 Comment
In recent posts I’ve introduced the notion of Big Content as shorthand for incorporating unstructured content into the Big Data world in a systematic and strategic way. Big Data changes the way we think about content and how we manage it. One of the most important areas requiring a fresh look is metadata. Big Content expands the definition of metadata beyond the traditional tombstone information normally associated with documents (title, author, creation date, archive date, etc.). While these elements are necessary and remain foundational to both effective content management and Big Data, more is required. Big Content metadata encompasses any additional illuminating information that can be extracted from or applied to the source content to facilitate its integration and analysis with other information from across the enterprise. This expanded definition results in a three-tiered metadata architecture for Big Content.
At the bottom level of the architecture, a core enterprise metadata framework provides a small set of metadata elements applicable to the majority of enterprise information assets under management. These elements are often drawn from a well-known standard such as the Dublin Core, but can include whatever common elements are useful to the enterprise. This common framework provides the unifying thread that facilitates locating content from across the enterprise, making an initial assessment of its relevance and submitting it to the content ingestion pipeline.
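A core framework of this kind can be sketched very simply; the element list below borrows Dublin Core element names, and the sample record is invented:

```python
# A minimal core framework drawn from Dublin Core element names.
CORE_ELEMENTS = ["title", "creator", "date", "subject", "type", "source"]

def to_core(asset):
    """Project any asset's metadata onto the shared core elements so
    content from different silos can be located and compared; fields
    the asset lacks come through as None, system-specific extras are
    simply left out of the core view."""
    return {element: asset.get(element) for element in CORE_ELEMENTS}

doc = {"title": "2013 Brand Survey", "creator": "Marketing",
       "date": "2013-05-01", "internal_id": "BR-7741"}

core = to_core(doc)
```

However thin this layer is, it is what makes a first-pass relevance assessment possible across every repository in the enterprise.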
The second layer of the Big Content metadata architecture consists of domain specific elements that are not necessarily applicable to all enterprise content, but are useful to a particular area such as a brand, product or department. At this level, common metadata often exists under different labels depending on where it is created and which department owns it. This increases its value and utility to that department but makes it more difficult to leverage for content integration and analysis. To reconcile domain metadata it is often necessary to create a metadata map that resolves naming and semantic conflicts.
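A metadata map of this sort might look like the following Python sketch; the department and field names are invented for illustration:

```python
# Each department labels the same concept differently; the map
# resolves those labels to the shared element names.
METADATA_MAP = {
    "marketing": {"campaign_owner": "creator", "launch_dt": "date"},
    "legal":     {"drafted_by": "creator", "filing_date": "date"},
}

def reconcile(department, record):
    """Rename a department's fields to the common vocabulary;
    fields with no mapping pass through unchanged."""
    mapping = METADATA_MAP[department]
    return {mapping.get(field, field): value for field, value in record.items()}

a = reconcile("marketing", {"campaign_owner": "jdoe", "launch_dt": "2013-04-01"})
b = reconcile("legal", {"drafted_by": "akumar", "filing_date": "2013-03-15"})
# Both records now expose the same "creator" and "date" elements.
```

The department keeps its own labels internally; the map is applied only when content crosses into the shared pipeline, so local utility and enterprise-wide comparability coexist.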
The top layer of the metadata architecture consists of application-specific metadata. This is additional information about content that is relevant only to the use case at hand and the application facilitating its execution. As such, it is not created or stored in the content management systems hosting the source content. It is created solely for the purpose of structuring and augmenting the content to be utilized within a vertical application in the Big Content environment.
Throughout the entire Big Content lifecycle, ensuring metadata quality and integrity is of the highest importance. Quality measures must go beyond simply reconciling field names. It is important that the steps taken to enrich and refine content are applied consistently. If some dates are not normalized, entity extraction is incomplete, or terminology is not reconciled, the accuracy of the data behind the insights comes into doubt. As a result, any analysis and its findings become questionable. Metadata represents a significant upfront investment and an ongoing requirement when large amounts of content are involved. Nevertheless, it is a critical factor in effective content management and the key enabler of the Big Content ecosystem.
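One small example of such a consistency check, sketched in Python: flag any record whose date field escaped normalization to ISO 8601 before it reaches the index (the records here are invented):

```python
import re

# Quality gate: dates should have been normalized to ISO 8601
# (YYYY-MM-DD) during enrichment.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def audit_dates(records, field="date"):
    """Return the ids of records that failed date normalization.
    Missing dates are a separate completeness question, so they
    are not flagged here."""
    return [r["id"] for r in records
            if r.get(field) is not None and not ISO_DATE.match(r[field])]

records = [{"id": 1, "date": "2013-05-24"},
           {"id": 2, "date": "May 24, 2013"},  # slipped through un-normalized
           {"id": 3, "date": None}]

bad = audit_dates(records)
# bad -> [2]
```

Simple gates like this, run routinely over the pipeline's output, are what keep inconsistency from silently eroding trust in the downstream analysis.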
Category: Big Content Enterprise Content Managment metadata Tags: Big Content, Big Data, ECM, Metadata
by Darin Stewart | May 13, 2013 | 1 Comment
In recent years large, data-centric vendors acquired smaller enterprise search companies at an astonishing rate. Oracle purchased Endeca. IBM purchased Vivisimo. Hewlett-Packard purchased Autonomy. Microsoft purchased FAST. There is a reason for this feeding frenzy of corporate acquisitions. Big Data is the current killer application and search provides a ready entry into the Big Data space for both vendors and practitioners. This is particularly true in the case of incorporating unstructured content into the Big Data world (or as I called it in a previous post Big Content). Search is one area of technology where the order of precedence between structured and unstructured content was reversed. Historically, structured data and its management was the primary focus and top priority of industry. As a result, database, data warehouse and business intelligence technologies initially focused on structured data, leaving unstructured content for later consideration and thus causing those capabilities to mature much more slowly. Enterprise search, by contrast, began with unstructured content and only recently brought structured data sources into the fold. From the outset, search has focused on discovery, access and exploration rather than reporting or transaction processing. This gives search several distinct advantages when dealing with unstructured content.
First and foremost, most enterprise search platforms have mature and in some cases quite sophisticated content ingestion and indexing pipelines. At the most basic level, this surfaces content and makes it both visible and accessible regardless of where it is stored. At a deeper level, the indexing process facilitates the content processing and enrichment that underpins Big Content capabilities. The importance of content preprocessing to Big Data and Big Content cannot be overstated. Content enrichment of this sort can bring structure and consistency to otherwise freeform content. Even with this preprocessing, unstructured content will always present itself inconsistently. Records will be only partially complete. Documents will be of varying length, and so forth. A search engine is very comfortable with such jagged records and can incorporate them into its indexes without difficulty.
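To illustrate why jagged records pose no problem, here is a minimal Python sketch of a field-aware inverted index; records with missing fields simply contribute fewer postings (the records themselves are invented):

```python
from collections import defaultdict

# field -> term -> set of document ids
index = defaultdict(lambda: defaultdict(set))

def index_record(doc_id, record):
    """Index whatever fields a record happens to have; absent or
    empty fields are skipped rather than treated as errors."""
    for field, value in record.items():
        if value is None:
            continue  # partial records are fine
        for term in str(value).lower().split():
            index[field][term].add(doc_id)

index_record(1, {"title": "annual report", "body": "revenue grew"})
index_record(2, {"title": "memo"})  # no body at all -- still indexable
# index["title"]["memo"] -> {2}; index["body"] holds only doc 1's terms
```

Contrast this with a relational table, where the second record would either force a NULL-riddled row or be rejected outright; the index simply records what is there.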
Business users have an intuitive grasp of what search does and how to make it work. If the enterprise search platform has been implemented well, finding relevant enterprise information assets is not much different from, or more difficult than, using Google. A few well-chosen keywords will usually at least put the user in the general vicinity of the information they are looking for. Search is also good at finding things that are “close enough” by managing spelling variants, synonyms, related content and other fuzzy matching mechanisms. This is extremely useful when attempting to uncover nuggets of information scattered across and hidden within large amounts of heterogeneous content.
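Here is a small Python sketch of “close enough” matching, combining an invented synonym table with the standard library's difflib for spelling variants; a real engine would use analyzers and managed vocabularies, but the idea is the same:

```python
import difflib

# Invented synonym table and vocabulary, for illustration only.
SYNONYMS = {"automobile": "car", "auto": "car"}
VOCABULARY = ["car", "cart", "card", "care", "carton"]

def normalize(term):
    """Resolve a query term to a known vocabulary term: first expand
    synonyms, then fall back to fuzzy matching for misspellings."""
    term = term.lower()
    term = SYNONYMS.get(term, term)
    close = difflib.get_close_matches(term, VOCABULARY, n=1, cutoff=0.75)
    return close[0] if close else term

# "auto" resolves via the synonym table; "caar" via fuzzy matching.
```

Layering cheap normalizations like these in front of the index is a large part of what makes search feel forgiving to business users.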
The concept of search is simple and straightforward. Everything that goes into making search effective, especially as a foundation for Big Content analytics, can get very complicated. Modern search has the ability to go far beyond simple retrieval. The indexes created by content ingestion pipelines can combine diverse information in novel ways that uncover relationships, trends and patterns that would not otherwise be apparent. This is accomplished primarily by determining the relevance of each indexed item to the query or question at hand. Big Content search takes a broader view of relevance than the one-size-fits-all approach of a public web search engine. In the world of Google’s PageRank and its peer algorithms, a single index is developed and replicated in an attempt to support all queries for all people. For basic information location and retrieval, this approach has been remarkably successful, but Big Content cannot take such a universal approach to indexing.
Search operates at two levels in a Big Content environment: Discovery and Analysis. At the discovery level search functions much as it does on the web or in traditional enterprise search. It provides a single, comprehensive index of available information assets against which queries are matched and relevant content is retrieved. Beyond simple information retrieval, this sort of discovery facilitates the deeper analysis at the heart of Big Content. Using search-based discovery, a user can identify a pool of information resources, both structured and unstructured, that may contain a desired insight or answer a particular question in a way that simply reading a document will not reveal. This pool of resources can be gathered from across the enterprise and processed through the indexing pipeline to be enriched, organized and indexed in an iterative manner specifically tailored to the situation at hand. Rather than attempting to envision all possible uses of content when designing the repository, the search index approach allows the user to model what they need when it is needed and only for the content involved.
Category: Big Content Enterprise Content Managment search Tags: Big Content, Big Data, Metadata, Search
by Darin Stewart | May 7, 2013 | 2 Comments
Most organizations recognize, at some level, that creating and sharing knowledge is a worthy endeavor. Unfortunately, efforts to promote knowledge sharing rarely go beyond mission and value statements displayed in the break room and silk-screened on coffee mugs. Organizations want to do it, they just don’t know how. It is ironic that while there is a large body of best practices demonstrating how to address this issue, we still haven’t quite figured out how to transfer knowledge about how to transfer knowledge. It can help to start with the basics.
The most common form of knowledge transfer is, more than anything, a means of preserving expertise and experience. When a team accomplishes a new task, usually learning dos and don’ts along the way, it is obviously desirable to preserve that knowledge within the team so that they don’t make the same mistakes the next time they must perform the same task in a different setting. This is serial knowledge transfer. The critical factor is that individual knowledge must be disseminated across the entire team. If Bob learns a new trick, he needs to share it with Sally, Umar and Greg. Likewise if Umar makes a mistake, he must warn his compatriots of the potential pitfall so they avoid the same error in the future.
The United States Army has formalized serial transfer in the form of After Action Reviews (AARs), which standardize three key questions: What was expected to happen? What actually happened? What accounts for the difference? In the course of the AAR, each of these questions is answered, but within certain ground rules. First, meetings are held regularly. A kickoff meeting and a post-mortem are not sufficient. Weekly reviews, or at least a review after each milestone or project phase, are necessary. Frequent meetings are also necessary in order to keep them brief. The more focused the meeting (limited scope and agenda), the more useful the knowledge produced tends to be. Everyone involved in the task under review participates in the meeting. If you don’t show up or are silent during the proceedings, the implication is that you did not contribute or, worse, do not care.
Interestingly for the Army, recriminations of any sort are not allowed. The Army has a very clear rule: Nothing said in an AAR can be used in any kind of personnel action. This “What happens in Vegas, stays in Vegas” dynamic is critical to effective knowledge transfer. If there is fear of adverse consequences arising from information shared in the meeting, the truth will be supplanted by blame shifting and overly optimistic distortions in the finest C.Y.A. tradition. Once team members are truly convinced that they will not be penalized for honesty, the quality of information will improve dramatically.
It is important to remember that the goal of serial transfer is to build and preserve knowledge within a unit or team. As a result, summary reports are not forwarded beyond the actual participants. In fact, in many of these meetings no written record is kept. If notes are taken, they are retained only for local use and distribution. In the case of the U.S. Army, AAR notes are not sent through reporting lines but through a “knowledge line.” Finally, it is essential that meetings be facilitated by a member of the team. This increases ownership and trust among the members. It also allows internal expertise and orientation to shape the form of the documentation in a way that is most appropriate to the team as its primary consumers.
While the military is generally not considered a particularly “safe place” to “honestly share one’s feelings and frustrations,” it has found an effective way to learn from mistakes (at least in some cases). It is essential for staff to be able to share concerns, observations and failures along a complete knowledge line, even if that knowledge line runs through your commanding officer…or the CIO.
Category: Collaboration Knowledge Management Tags: Knowledge Management, knowledge transfer
by Darin Stewart | May 1, 2013 | 4 Comments
The age of information overload is slowly drawing to a close. The enterprise is finally getting comfortable with managing massive amounts of data, content and information. The pace of information creation continues to accelerate, but the ability of infrastructure and information management to keep pace is coming within sight. Big data is now considered a blessing rather than a curse. Even so, managing information is not the same as fully exploiting information. While ‘Big Data’ technologies and techniques are unlocking secrets previously hidden in enterprise data, the largest source of potential insight remains largely untapped. Unstructured content represents as much as eighty percent of an organization’s total information assets. While Big Data technologies and techniques are well suited to exploring unstructured information, this ‘Big Content’ remains grossly underutilized and its potential largely unexplored.
Gartner defines unstructured data as content that does not conform to a specific, pre-defined data model. It tends to be the human-generated and people-oriented content that does not fit neatly into database tables. Within the enterprise unstructured content takes many forms, chief amongst which are business documents (reports, presentations, spreadsheets and the like), email and web content. Each of these content sources has mature disciplines supporting them. Business documents are shepherded through their lifecycle by ECM platforms. Email is managed, monitored and archived along with other text-based communication channels. Ever more sophisticated web content is matched by equally sophisticated Web Content Management tools. Each of these platforms is focused on management and retention rather than analysis and exploration. They are not intended to provide advanced analytical and exploration capabilities for the content they manage; nor are they capable of doing so. They can, however, provide a robust foundation supporting a Big Content infrastructure.
Enterprise owned and operated information is only part of the Big Content equation. The potential for insight and intelligence expands dramatically when enterprise information is augmented and enhanced with public information. Content from the social stream can be a direct line into the hearts and minds of customers. Blogs, tweets, comments and ratings are a reflection of the current state of public sentiment at any given point in time. More traditional web content such as news articles, product information and simple corporate informational web pages become an extension of internal research when tamed. More formal data sources are emerging in the public realm in the form of smart disclosure information from various areas of government in the US and Linked Open Data across the globe. All of these unstructured (and semi-structured) information sources become valuable extensions to enterprise information resources when approached in a Big Content manner.
Gartner is embarking on a new look at how Big Data technologies and techniques can be applied to unstructured information resources. We are calling this Big Content. I will be exploring this topic in a series of three documents that will appear over the course of the next few months.
Big Content: Unlocking the Unstructured Side of Big Data
Using Search to Discover Big Data
Building a Content Command Center
This an exciting and rapidly evolving part of the Big Data landscape. Big Data, ECM, Search and Semantics are converging to open up new possibilities emerging from our ever growing content stores. I’m looking forward to examining and discussing this new topic with the Gartner community.
Category: Big Content Enterprise Content Managment Open Data search Tags: Big Content, Big Data, ECM, Search
by Darin Stewart | September 26, 2012 | Comments Off
A few weeks ago, Barack Obama delivered a campaign speech in Roanoke, Va. in which he handed his political opponents sound bite gold. “If you’ve got a business—you didn’t build that. Somebody else made that happen.” Conservatives seized on this sentence as proof that the president doesn’t respect the efforts of entrepreneurs. The “We Did Build It” meme is now a rallying cry of the challenger’s campaign. Of course, the full quote could be taken a different way:
If you were successful, somebody along the line gave you some help. There was a great teacher somewhere in your life. Somebody helped to create this unbelievable American system that we have that allowed you to thrive. Somebody invested in roads and bridges. If you’ve got a business—you didn’t build that. Somebody else made that happen.
The point the president was trying make, as I read it anyway, is that we don’t have to do everything ourselves. Society jointly creates some things for the common good. We can and should leverage public infrastructure, like “roads and bridges”, to enable our businesses. However, while we all take advantage of some public infrastructure, other public resources are overlooked and underutilized. Case in point: Open Data initiatives.
A large and growing cloud of high quality, well structured data and information exists on the Web as a "public good" emerging from open government initiatives, publicly funded research and even commercial contributions. The available datasets cover topics ranging from arts and entertainment to finance and markets to pharmaceuticals and genomics. In the United States, a wealth of new data is being published under the Smart Disclosure initiative. According to the US Office of Management and Budget: “Smart disclosure refers to the timely release of complex information and data in standardized, machine readable formats in ways that enable consumers to make informed decisions.” In other words, Open Data. Entrepreneurs and other job creators are rushing to create new services and businesses powered by this public resource. A couple of popular examples are Castlight and LowerMyBills. These innovative services are useful to the consumer and lucrative to the entrepreneur.
Open data is information as infrastructure. It is like a public power grid of data. In the days before public power, companies would maintain their own power generators. As national power grids came online, this was no longer cost-effective or necessary. Many basic data needs can now be met in the same way. As an added bonus, the resource isn’t just from the local government. Open data provides a global power grid of information that is accessible to anyone from anywhere. In September 2011, 46 countries signed commitments to open government operating principles, including publishing data for public use. The tired trope of the “Information Superhighway” is starting to take on new life and new meaning. Now that the digital roadways have been built, we are shifting our attention to making those streets and interstates more useful and the original investment more valuable. Leveraging this emerging global information commons is a vast opportunity for the enterprise positioned and willing to respond.
Of course we could all go it alone, always generating our own proprietary information, reinventing the wheel every time we need some new data, but why would we?
(I’ve written at length about applying Open Data to commercial endeavors in the recently published paper Radical Openness: Profiting from Data You Didn’t Create, People You Don’t Employ and Ideas You Didn’t Have.)
Category: Innovation Open Data Semantic Web Tags: linked data, open data, open enterprise, smart disclosure
by Darin Stewart | September 21, 2012 | Comments Off
I have a deep and abiding love of craft beer. Fortunately, I live in Portland, Oregon, epicenter of the craft brewing movement. Our fair city boasts 55 craft breweries and counting. Where other cities offer sightseeing tours, we have brewery tours. These are not drunken pub crawls. They are educational vehicles for the aspiring beer geek. Every sample is preceded by a lecture and followed by an analysis. In the midst of a recent Brewvana tour, I was struck by how much the technology community could learn from the brewing community.
While listening to a Master Brewer hold forth on the merits of Northwest hops and English malts, three young gentlemen seemed to be paying closer attention than the rest of us. During the tasting break they introduced themselves as aspiring brewers from two new competing microbreweries that were just starting out here in Portland. Rather than expelling them from the room as industrial spies, the brewmaster began offering guidance and advice. As another sample was poured for the group, the discussion turned toward techniques, recipes and increasingly technical brewing arcana. This was not a one-way flow of sage advice from the wise master brewer to the precocious newcomers. The new guys were sharing their own innovations, ideas, successes and failures. On the bus to the next tour stop, the three young brewers chatted excitedly about what they had learned, which ideas had been validated and what they wanted to try next. The rest of us were happy to have been flies on the wall.
At the next stop, we were greeted by the owner of a small 10-barrel brewery ready to share his knowledge and promote his product. His t-shirt immediately caught my eye. It sported the logo of a brewery we had previously visited, a direct competitor to this gentleman. When I asked him about it he responded quizzically. “Why wouldn’t I want to promote his beer? We’re all friends. We share recipes. Sometimes we even share facilities and equipment. If we build up the community, we build up the market. Everybody wins.” That was an epiphany for me. I’ve discovered that craft brewing is not a zero-sum game. Technology shouldn’t be either. (His Kölsch was pretty good too.)
Sharing, openness and collaboration among competitors is a radical notion to many businesses, but it shouldn’t be. By sharing pre-competitive information, whether it is hop-to-malt ratios, field experiment results or design patterns, open companies are creating an ecosystem in which they, along with their competitors, can thrive. Rather than jeopardizing existing revenue streams, opening intellectual property and precompetitive data to external entities can expand and accelerate profits by cultivating new channels of revenue. Jeff Weedman, Procter & Gamble’s vice president of external business development and global licensing, sums this up nicely: “Competitive advantage used to mean ‘I’ve got it and you don’t.’ Now it can mean, ‘we both have it and can make money off of it.’” IBM’s early investment in open source, which eventually yielded a $2 billion professional services revenue stream, is a prime example of this ethos.
Evidence suggests such open strategies are already stimulating innovation, fostering creativity, creating new business relationships and facilitating multidisciplinary analysis and insight. Industries as competitive as pharmaceuticals and energy are investing in precompetitive forms of information commons, and are developing “collective competencies” to spur innovation and boost the productivity of downstream product development. A prime example of this dynamic is the Pistoia Alliance. Such public-private partnerships to rapidly share data and findings are producing innovative results in the treatment of disease.
This does not mean the end of competition, just a change in its character. Portland brewers are extremely competitive, but winning or losing a taste competition rests on the merits of the product and the skill that goes into crafting it, not on the secrecy of the recipes. Most recipes are posted publicly. Amateur brewers are encouraged to experiment on those recipes and share their own innovative variants. We’ve even got a U-Brew facility where ambitious home brewers can use (for a fee) top caliber equipment under the tutelage of more experienced mentors. It has served as an incubator for several new businesses and a lot of interesting beers. The current rage is “collaborative brews” in which two competing breweries team up to create a brew that plays to both of their strengths, but which neither could produce alone.
Technology innovators should take a similar approach. The boundaries of the enterprise need to become more porous. The inflow of external ideas, assets and information should be placed on an equal footing with the outflow of products and licenses. Sharing precompetitive data, facilities and expertise accelerates innovation, reduces costs, improves the final product and can expand the market. I’ve blogged previously about how most intellectual property is not used as actionable innovation. Too often it is simply filed away as a hedge against future litigation or used as a tool to instigate litigation. This protectionist posturing is pointless. Next time you mine your patent portfolio and identify a potential infringer, don’t serve them court papers; invite them out for a beer. Chances are you will discover a collaborative way that both of you can profit from the IP in question, and maybe come up with something really cool that neither of you could produce on your own.
Category: Collaboration Innovation Tags: beer, collaboration, information commons, open enterprise, Open Innovation
by Darin Stewart | September 7, 2012 | Comments Off
Content is moving into the cloud. This trend seems both undeniable and inevitable. As the amount of content we need to store and the length of time it must be retained both continue to grow, so do the options for managing and maintaining that content outside of the enterprise data center. Continually expanding the infrastructure to keep pace with an ever-rising tide of content is a losing battle. Add to this the mobility factor, with devices and users demanding seamless access to content on either side of the firewall. Continuing to keep all that content on-premises seems not only unnecessary, but unwise.
The cloud seems to present an appealing option for CIOs to reduce data center costs, but they are approaching it cautiously. Lingering concerns over security and compliance temper the enthusiasm IT leaders feel toward cloud-based solutions. In addition, the cost and complexity of integrating hosted solutions with legacy enterprise applications can quickly consume potential savings realized by eliminating servers and storage. Even so, the march to the cloud continues. It is just proceeding at a slower pace. Gartner projects that by 2016 most Global 1000 companies will have stored customer-sensitive data in the public cloud. Enterprise Content Management vendors are pivoting their product roadmaps to respond. Within three years, at least 80% of ECM vendors will provide cloud service alternatives to their on-premise solutions.
Unfortunately, most information workers aren’t willing to wait. They have grown accustomed to storing all of their personal content, including music, video, pictures and documents, in the cloud and being able to access it from anywhere, at any time, from any device. Not only is it cheaper to store their content in the cloud, but someone else takes care of system administration, including backup and recovery. Perhaps most importantly, it is much easier to share your content from the cloud than it is from a server sitting in the hallway closet. The desire for this flexibility and economy has followed the consumer from their private life into their professional work. Much like the early days of social computing, staff are going around the IT department and adopting public, consumer-oriented services and solutions in an ad hoc manner.
This presents several problems for the IT department. First, consumer-oriented solutions are not designed or intended for enterprise use. As a result, many of the concerns preventing full adoption of cloud-based content management are being introduced into the enterprise anyway. In addition, because these are unofficial solutions, the on-premises infrastructure hasn’t been reduced, so no cost savings are realized, other than perhaps some staff time that would otherwise be spent trying to access and share content.
Enterprise content is already moving to the cloud whether or not official policy and planning intend it to do so. While the initial motivation may have been cost savings, the actual driver is convenient and ubiquitous access to content. As a result, the conversations surrounding cloud-based content management need to shift. Rather than debating whether or not content should be moved to the cloud, the enterprise should be focused on what content should be moved to the cloud and how to manage the process of doing so.
I examine the ins and outs, do’s and don’ts, risks and rewards of cloud content management in the new Gartner for Technical Professionals document Content in the Cloud and will be speaking on the topic at the Gartner 2012 Symposium in Orlando. Hope to see you there.
Category: cloud Enterprise Content Management Tags: ccm, cloud, cloud content management, file sharing