by Doug Laney | November 13, 2013
Many vendors and pundits have attempted to augment Gartner’s original “3Vs” from the late 1990s with clever(?) “V”s of their own. However, the 3Vs were intended to define the proportional dimensions and challenges specific to big data. Other “V”s like veracity, validity, value, viability, etc. are aspirational qualities of all data, not definitional qualities of big data. Conflating inherent aspects with important objectives leads to poor prioritization and planning. For example, if you’re like many organizations, your terabytes of streamed sensor, log file or multimedia data may not have veracity (data quality) issues at all, but your megabytes of master data may be in total disarray.
As author and analytics strategy consultant Seth Grimes observes in his InformationWeek piece Big Data: Avoid ‘Wanna V’ Confusion, “When a concept resonates, as big data has, vendors, pundits and gurus — the revisionists — spin it for their own ends….In my opinion, the wanna-V backers and the contrarians mistake interpretive, derived qualities for essential attributes.”
Also follow Doug on Twitter @Doug_Laney
Category: Uncategorized Tags: 3Vs, batman, big data, bigdata, data, information, information management, variety, velocity, veracity, volume
by Doug Laney | November 12, 2013
Prior to Facebook’s IPO, I published a piece in the Wall Street Journal suggesting what the economic value of one of its active users was at the time: To Facebook You’re Worth $80.95. So why not reprise the concept by exploring the infonomics of Twitter?
Twitter’s S1 IPO filing reports that there are over 500 million tweets per day from 215 million active users. That’s roughly 850 tweets per user per year. (Tweeting more than that? Consider yourself above average!) Twitter’s S1 balance sheet identifies $964 million in assets, and as of this writing, TWTR’s market cap is $22.83 billion.
As I argued in the WSJ piece, since companies like Facebook and Twitter are nearly pure information-based businesses, the difference between their market cap and reported assets represents the value of their information assets. Or more precisely: current investor expectations of Twitter’s ability to monetize its data, expressed in net present dollars. This means the value of Twitter’s data is $21.86 billion, assuming a year-long valuable life expectancy of a tweet. True, tweets are not easily searchable after a few days on the wire, but this doesn’t mean they’re without value to Twitter.
Note: Due to arcane and archaic accounting practices dating back to the Great Depression, then reinforced in the aftermath of the 9/11 terrorist attacks, information assets are not considered corporate assets and therefore are nowhere to be found on balance sheets of any company. For more on this see my piece, Infonomics: The New Economics of Information in the Financial Times, or my Gartner research note, The Birth of Infonomics and the New Economics of Information.
So, with 215 million active users, this means that each one of us, as of this writing, is worth $101.70 to Twitter. In terms of revenue, however, we generate a scant $1.47 per year for Twitter. Each measly tweet itself is worth 12 cents and generates seventeen hundredths of a cent ($0.0017) in revenue.
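The back-of-the-envelope arithmetic above can be sketched in a few lines (the inputs are the S1 and market figures cited in the post; small penny-level differences are rounding):

```python
# Infonomics back-of-the-envelope math using the figures cited above.
MARKET_CAP = 22.83e9             # TWTR market cap (USD, as of writing)
BALANCE_SHEET_ASSETS = 0.964e9   # assets reported in the S1
ACTIVE_USERS = 215e6
TWEETS_PER_DAY = 500e6
ANNUAL_REVENUE_PER_USER = 1.47   # revenue per user cited in the post

info_value = MARKET_CAP - BALANCE_SHEET_ASSETS    # implied value of the data
value_per_user = info_value / ACTIVE_USERS
tweets_per_year = TWEETS_PER_DAY * 365
value_per_tweet = info_value / tweets_per_year    # assumes a one-year useful life
revenue_per_tweet = (ANNUAL_REVENUE_PER_USER * ACTIVE_USERS) / tweets_per_year

print(f"Value per user:    ${value_per_user:.2f}")     # ≈ $101.70
print(f"Value per tweet:   ${value_per_tweet:.2f}")    # ≈ $0.12
print(f"Revenue per tweet: ${revenue_per_tweet:.4f}")  # ≈ $0.0017
```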
How does Twitter monetize its data? Today mostly via advertising revenue (85-87% according to its S1). It delivers 2 billion tweets per day to desktops and mobile devices, so there’s plenty of room to slip in some ads. Twitter also has special deals with others to provide access to the Twitter Firehose (full data stream) and resell its content. As I suggested in my previous Gartner Blog Network piece, Twitter’s Secret Nest Egg is in Plain Sight, ultimately Twitter will shift to syndicating its data, over advertising, as a primary source of revenue.
Sure, Twitter and Facebook are extreme cases with extreme numbers to go along with them. Still, consider the vast amount of data your organization collects, that if sanitized, packaged and marketed effectively could introduce an entire new revenue stream for you—perhaps even self-funding your ongoing enterprise data warehouse or nascent big data initiative as some of our clients have done.
Yes, of course you can follow Doug on Twitter @Doug_Laney
Category: Uncategorized Tags: analytics, big data, bigdata, economics, facebook, finance, infonomics, monetization, social media, tweet, twitter, valuation, value
by Doug Laney | November 8, 2013
With all the chirping about Twitter’s ability or inability to generate sufficient revenue via advertising income, it is important to consider an alternative revenue potential even more significant: syndicating its content.
Twitter’s own Terms of Service make it perfectly clear who has unlimited distribution rights to the content you post. Them.
By submitting, posting or displaying Content on or through the Services, you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).
Yes, Twitter also claims your content is yours (for obvious liability reasons), and that you can “reproduce, modify, create derivative works, distribute, sell, transfer, publicly display, publicly perform, transmit, or otherwise use the Content.” But “you have to use the Twitter API” to do so, and Twitter’s increasingly restrictive revisions of its API have ruffled developers’ feathers by severely crippling applications that repurpose Twitter content, even putting some out of business. As to content, the API only provides access to a “collection of relevant Tweets” (i.e., a subset of only those indexed) from the past several days, and only for specified search parameters.
Therefore, you can get broad-spectrum, longitudinal access to tweets, let alone historical ones, only on a wing and a prayer. Or by special arrangement. Twitter’s big data is off-limits to those without a special firehose access licensing and reseller agreement that only a very few partners have (e.g. Gnip and DataSift). And Twitter has shown a propensity to flip the bird to enterprising developers by clamping down on API functionality. The point is that at any moment Twitter could migrate its strategy to become the sole syndicator of historical Twitter content and Twitter firehose access. While this may not seem consistent with Twitter’s culture, remember it’s now a public company beholden to its NYSE:TWTR flock, not the good people of the interwebs.
The value of Twitter’s content to understand and leverage trends and sentiment about markets, products and companies is greater than the value of any twadvertisement. Use cases for customer support, product development, marketing and sales, corporate strategy and development, etc. render Twitter content invaluable to nearly any organization in any industry and geography. Therefore, syndicating its content is likely to be the primary way Twitter ultimately soars to greater heights. Just as likely, Twitter will create a fee-based API or application for self-service analytics.
As we watch how Twitter and other social media companies hatch new ideas for monetizing their content, let this be a lesson about the potential of collecting, packaging and marketing your company’s increasing storehouse of information assets. We are just at the dawn of infonomics and monetizing enterprise data. The early birds will catch the worm, so get cracking.
Yes, of course you can follow Doug on Twitter @Doug_Laney
Category: Uncategorized Tags: big data, bigdata, content, enterprise content, infonomics, information assets, monetization, social media, tweet, twitter, twtr
by Doug Laney | August 19, 2013
As summer wanes and the kids are heading back to school, I got to thinking about what a Big Data university program might look like if taught by some of the top minds at Gartner. So if you are matriculating this year or just considering enrolling with Gartner, here are your syllabus and instructors for Big Data University (BDU), home of the Fighting Petabytes:
Big Data Hype 101, Professor Nick Huedecker
The Nexus of Forces: Information, Social, Mobile and Cloud 201, Professors Daryl Plummer, Chris Howard
Big Data Strategy Essentials 201, Professors Doug Laney, Frank Buytendijk
Enterprise Information Management 101, Professors Mark Beyer, Roxane Edjlali, Nick Huedecker
Big Data Architecture 201, Professors Mark Beyer, Marcus Collins
Data Governance and Quality 301, Professors Ted Friedman, Debra Logan
Data Science and Advanced Analytics 201, Professors Lisa Kart, Alexander Linden, Svetlana Sicular, Doug Laney
Big Data File Systems 301, Professors Merv Adrian, Donald Feinberg, Marcus Collins, Roxane Edjlali
Self-Service Business Intelligence 201, Professors Kurt Schlegel, Rita Sallam, Neil Chandler, Daniel Yuen
Big Data Privacy and Ethics 101, Professors Frank Buytendijk, Jay Heiser
Mobile Business Intelligence 301, Professors Joao Tapadinhas, Lyn Robison
Big Data Analytics Technologies Lab, Professors Carlie Idoine, Svetlana Sicular, Rita Sallam, Neil Chandler, Jamie Popkin
International Studies in Big Data, Professors Hideaki Horiuchi, Donald Feinberg, Bhavish Sood, Dan Sommer, Daniel Yuen, Alexander Linden, Frank Buytendijk, Eric Thoo
Social and Collaborative Analytics 201, Professors Carol Rozwell, Rita Sallam
Executive Education in Big Data 101, Professors Hung LeHong, Mark Raskino, Doug Laney
Innovating with Information 301, Professors Doug Laney, Frank Buytendijk, Lisa Kart
Business Intelligence Competency Centers 201, Professors Bill Hostmann, Kurt Schlegel
Data Integration Approaches and Technologies 201, Professors Colleen Graham, Mark Beyer, Roxane Edjlali
Of course there are several electives to choose from as well:
The Role of the Chief Data Officer, Professors Debra Logan, Mark Raskino, Joe Bugajski, Doug Laney
Infonomics and the Economics of Information, Professors Doug Laney, Andrew White
Sentiment Analysis, Professors Jamie Popkin, Gareth Herschel
Master Data Management, Professors Andrew White, Bill O’Kane
Big Data in Financial Services, Professor Mary Knox
Big Data in Telecommunications, Professor Mei Selvage
Big Data and Analytics Service Providers and Outsourcing, Professor Alex Soejarto
Big Data and Operational Technology, Professors Kristian Steenstrup, Doug Laney
Complex Event Processing, Professor Roy Schulte
Digital Marketing, Professors Yvonne Genovese, Gareth Herschel
Performance Management, Dr. Christopher Iervolino, Professor Nigel Raynor
Dean of the College of Business Intelligence, Analytics and Performance Management: Ian Bertram
Dean of the College of Enterprise Information Management: Regina Casonato
To see Gartner Big Data University “professor” biographies, visit: http://www.gartner.com/analysts/coverage.do
To schedule remote office hours with any professor, contact firstname.lastname@example.org
For your required Gartner BDU reading list, visit: http://www.gartner.com/technology/topics/big-data.jsp
To find out how to see and meet with your favorite Gartner “professors” at one of our upcoming global Symposia or Summits, visit: http://www.gartner.com/technology/symposium/orlando/ and http://www.gartner.com/technology/summits/na/business-intelligence/
We look forward to seeing you in class!
Also follow Doug on Twitter @Doug_Laney
Category: Uncategorized Tags: analytics, BI, big data, bigdata, business intelligence, data science, data scientist, eim, enterprise information management, infonomics, information management
by Doug Laney | May 24, 2013 | 1 Comment
As we watch America’s greatest auto racing spectacle this Memorial Day weekend, what we won’t see is even bigger than the event itself, faster than the cars themselves, and more varied than the driver personalities. Of course I’m talking about the data. Racing teams now eat Big Data for breakfast, lunch and dinner. And for snacks in-between.
Outside, Indy cars and their cousin Formula 1 cars may be covered with dozens of sponsor logos, but inside they’re smattered with nearly 200 sensors constantly measuring the performance of the engine, clutch, gearbox, differential, fuel system, oil, steering, tires, drag reduction system (DRS), and dozens of other components, as well as the drivers’ health. These sensors spew about 1GB of telemetry per race to engineers poring over it during the race and data scientists crunching it between races. According to McLaren, its computers run a thousand simulations during the race. After just a couple laps they can predict the performance of each subsystem with up to 90% accuracy. And since most of these subsystems can be tuned during the race, engineers, pit crews and drivers can proactively make minute adjustments throughout the race as the car and conditions change.
Throughout the season, based on this accumulated data warehouse of information on car performance, driver performance, tracks and conditions, racing teams will make 50 or more mods per day. And for each season, new cars are built from the ground up using 95% new parts designed using this data.
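The real race-team models are of course far more sophisticated, but the core idea of projecting a sensor channel from early laps and flagging drift can be sketched in a few lines. This is a toy illustration only; the readings and the tolerance are invented:

```python
# Toy sketch: fit a trend to a sensor channel over the first laps,
# then flag later laps that drift beyond a tolerance. All numbers
# here are invented; real telemetry models are far more elaborate.

def fit_trend(xs, ys):
    """Ordinary least-squares slope and intercept for one channel."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical tire-temperature readings (deg C) for laps 1-3.
slope, intercept = fit_trend([1, 2, 3], [82.0, 84.1, 86.0])

TOLERANCE = 3.0  # degrees of drift before the pit wall is alerted (invented)
for lap, temp in [(4, 88.2), (5, 89.9), (6, 97.5)]:
    predicted = slope * lap + intercept
    if abs(temp - predicted) > TOLERANCE:
        print(f"Lap {lap}: {temp} vs predicted {predicted:.1f} -> adjust!")
```

Only lap 6 trips the alert here; the earlier laps track the projection closely, which is the sense in which a couple of laps of data already "predict" a subsystem's behavior.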
Of course all these modifications need to adhere to fluctuating, fastidious and unforgiving racing league specifications. So analytics to ensure compliance is just as important.
Telemetry Tech on the Track
So what’s behind all this Big Data wizardry? Here’s a summary of some of what McLaren Electronics has built and baked into and around its team’s cars:
- Its latest data collection device, the TAG-320, features 4000MIPS of processing power, 512MB internal RAM, 8GB of logged data capacity, 13 buses, up to 100kHz analog sampling rate, internal accelerometer, 4000 logging channels, and a 1Gbps Ethernet link speed. Most of these characteristics are a 5-10x improvement over the previous 2008 TAG-310b model.
- The ATLAS (Advanced Telemetry & Linked Acquisition System) is a suite of analytics tools for real time storage, analysis, visualization and manipulation of data. It provides a customizable workbook, graphical timelines and other comparative visualization, heuristic car system checks, automated data alignment and sequencing, and a Microsoft SQL Server API. ATLAS offers analysis features called functions to combine parameters and develop sophisticated analytics, checks to automatically assess any car component, and markers to automatically or manually pinpoint the time when some anomaly happens.
- Accelerated data analytics is achieved using SAP’s HANA in-memory database
- Its Remote Data Server (RDS) enables live telemetry to be viewed simultaneously anywhere in the world by factory engineers, parts suppliers and data analysts
- Simulation capabilities using MATLAB (Simulink) can determine what might happen under different track or race situations, or if a driver behavior or car system were changed
- Special servers are used for collecting and integrating weather and other external data
Is Your Business on Track with Big Data?
All the excitement of auto racing aside, consider the key underlying components of what racing teams are doing to accelerate the performance of their cars and drivers and how these techniques can and should apply to your albeit relatively mundane business.
Use this checklist to see if your business will have a checkered future or get the checkered flag:
- Are you sufficiently monitoring key business processes, systems and personnel using available sensors and instrumentation?
- Are your data streams collected frequently enough for real-time process adjustments (i.e. complex event processing)?
- Do your business processes support real-time or near real-time inputs to adjust their operation or performance?
- Can you anticipate business process or system failures before they occur, or are you doing too much reactive maintenance?
- Do you centrally collect data about business function performance?
- Do you make use of advances in high-performance analytics such as in-memory databases, NoSQL databases, data warehouse appliances, etc.?
- Do you gather important external data (e.g. weather, economic) to supplement and integrate with your own data?
- Do you synchronize, align and integrate data that comes from different streams?
- Do you make your data available to key business partners, suppliers and customers to help them provide better products and services to you?
- Do you have a common, sophisticated analytics platform that includes the ability to establish new analytic functions, alerts, triggers, visualizations?
- Can you run simulations on business systems while they’re operating and also between events to adjust strategies?
- Does your architecture support multiple users around the world seeing real-time business performance simultaneously?
- Do you have teams of business experts, product/service experts and data scientists collaborating on making sense of the data?
- Do you modify your products or services as frequently as you could or should based on available data?
- Do you also use data you collect to develop new products or services as frequently as you could or should?
Racing teams are able to invest in advanced analytics because millions of dollars and euros are on the line from hundreds of sponsors. Hopefully your own big data project sponsors appreciate that big money is on the line for your business as well. Winning the race in your industry now probably depends on it.
Also follow Doug on Twitter @Doug_Laney
Category: Uncategorized Tags: analytics, auto racing, big data, business intelligence, indianapolis 500, indy 500, operational technology, performance management, racing, telemetry
by Doug Laney | April 3, 2013 | 3 Comments
Given all the hype over Big Data and concerns about data ownership, I thought it would be interesting to explore who actually owns Big Data. No, I mean who really owns “big data.” Yes, the trademark. Next stop: the United States Patent and Trademark Office online database.
Talk about Big Data. The database contains a treasure trove of over 8 million patents and 16 million filings dating back to Samuel Hopkins’s 1790 registered process of making potash, an ingredient used in fertilizer (signed by President George Washington, no less), and the oldest active trademark, SAMSON, registered for a brand of rope in 1884, among the nearly 3 million trademarks. And with almost 200,000 patent applications and 100,000 trademark applications a year and growing, the ranks of the examiners have grown too, almost doubling since 2005.
But back to “Big Data.” The term has been in use since at least the mid-1990s, seemingly coined by Silicon Graphics chief engineer John Mashey, who gave a seminar entitled “Big Data & the Next Wave of InfraStress.” However, since he never trademarked it, who did?
Those of you pioneers in data warehousing will remember a boutique consulting firm, often joined at the hip with Teradata, based in Chicago called Knightsbridge Solutions. Knightsbridge specialized in building large databases and data warehouses before it was absorbed into HP. On January 9, 2001, a Knightsbridge attorney filed the trademark and “big data” became a US citizen or whatever. However, they must have liked the term about as much as most of the industry does today (despite its popularity), as they abandoned the trademark less than a year later.
Nearly ten years passed before an enterprising man in Texas reclaimed it, only to abandon it again months later. Poor Big Data! It has been declared dead twice before even sliding into the Gartner® Hype Cycle™ Trough of Disillusionment™. Not to worry: a fledgling VC called Big Data Boston Ventures nabbed the mark last summer. Until they launch, it seems to be the only asset in their portfolio.
Good news for those of you feeling like you missed the boat, there are plenty of variants still available. The USPTO site lists only 44 related marks including clever ones such as “Bigdata”, “Making Big Data Small”, “Big Data for the Little Guy”, “Rocket Fuel for Big Data Apps”, “Dominating Big Data”, “Wala! Big Data Simplified”, and my personal favorite that integrates large information and lager libation: “Big Data on Tap.”
Here’s to you Big Data! You’ve made your mark.
Follow Doug on Twitter: @Doug_Laney
Category: Uncategorized Tags: big data, intellectual property, patent, trademark
by Doug Laney | December 26, 2012 | 2 Comments
2012 saw mainstream acknowledgement and awareness of the challenges of managing the burgeoning streams of information generated by and available to organizations, particularly big data. In 2013, I expect the focus to shift to the challenges of developing and implementing enterprise strategies for making use of all this data.
Opportunities abound for deploying information in transformative ways. Gartner’s 2013 research agenda will help IT and business leaders develop and execute strategies for achieving higher returns on their information assets. This includes leveraging big data, enhancing analytic capabilities, achieving more disciplined information asset management approaches, and incorporating new and expanded information-related roles:
The volume, velocity and variety of information sources available to organizations today are more than just an information management challenge. Rather, this phenomenon represents an incredible opportunity for organizations to significantly improve enterprise performance and even transform their businesses or industries. More than merely a means of reporting or basic decision support, information assets are an instrument for innovation. Making this strategic shift quickly enough to create competitive advantage is the real challenge for most businesses.
Key issues Gartner will be exploring throughout the coming year are also questions business and IT leaders should be asking themselves:
Business uses and sources of information
- What is the range of internal and external sources of data available, starting with our own underutilized “dark data”?
- How can information be used, not just for decision-making, but for greater business insights and process automation?
- How can information be used to foster relationships and improve collaboration with our employees, partners, customers and/or suppliers?
- How can information facilitate business transformation and innovation, beyond just incremental performance improvements?
- How can information be monetized by packaging, sharing and/or selling it?
- How can we evolve to a more information-centric culture?
- How can our IT and business groups organize for achieving higher levels of information performance?
- What emerging information-related skills and methods should be considered, planned for, used or acquired?
Value and economics of information (infonomics)
- Why should, and how can, we inventory, measure and quantify our information assets?
- How can information’s value be used to justify and gauge the ROI of information-related initiatives, as well as other IT and business initiatives?
- How can we manage information as an actual corporate asset?
So if you’re looking to make a corporate New Year’s resolution to do more with data for driving corporate value, consider developing answers to each of these questions. And keep an eye on Gartner’s Information Innovation research throughout 2013.
Follow Doug on Twitter: @doug_laney
Category: Uncategorized Tags: analytics, big data, bigdata, infonomics, information management, innovation, new year's, planning, strategy, vision
by Doug Laney | December 20, 2012 | 7 Comments
Going into the 2012 holiday season, North Pole Inc. (ticker: XMAS), the leading global distributor of presents to good girls and boys, called upon Gartner to assess and advise on its information related needs and opportunities.
STAMFORD, Conn., December 18, 2012—
Over the past quarter, Gartner was given exclusive access to the operations and information systems of North Pole Inc. (NPI), to help it set a strategic path for improved information management and analytic capabilities. For nearly two centuries NPI has struggled to support its growing operation and respond proactively to competitive pressures through the use of emerging technologies and best practices.
“We do a jolly good job year after year,” claims NPI’s Founder and CEO, Santa Claus, “but I have really put the pressure on my IT management team to achieve better efficiencies and creatively use information to innovate.”
As a long-time Gartner client, NPI has read about how other enterprises have selectively adopted information technologies, embraced new architectures and approaches, and acquired the necessary skills. “Now it’s our turn,” exclaimed NPI’s CIO Frederick Ellefsen. “We’ve heard a lot about the term ‘big data’ and the significant opportunities indicated by the confluence of mobile, cloud, social and information—Gartner’s Nexus of Forces—so we didn’t want to be left out in the cold, so to speak.”
“This is a unique opportunity for Gartner to be exposed to the inner workings of one of the world’s most secretive yet successful enterprises,” said Peter Sondergaard, Gartner SVP Research. “We were pleased to be able to offer our services and insights to NPI.”
Gartner’s review of NPI’s systems revealed an operation not too dissimilar to other distributors and some major retailers, but on a much larger scale. However due to NPI’s unique legal status it has no finance department, nor does it have a sales or marketing function.
Figure 1 - North Pole Inc. Operations
Santa’s Systems Portfolio
Key systems in NPI’s portfolio manage orders, inventory, quality testing, elfin performance and activities, along with tracking human behavior, correspondence, wish lists and contact information, and also environmental impact data. To achieve NPI’s objective of managing and leveraging information as an actual enterprise asset, Gartner first completed an inventory of the NPI’s extensive wealth of information assets:
- Toy Order Management System (“Tommy”) – Toy orders and order tracking of 5.5 billion orders; supplier and 2nd level supply chain and parts level visibility of 4.6 million suppliers
- Toy Inventory Management System (“Timmy”) – Receiving and inventory data on 6.9 billion toys
- Toy Assurance Management System (“Tammy”) – Test results and repairs/returns data on all toys received (average of three safety and quality tests per toy) totaling 21 billion tests annually
- Content system for Relations, Inbound Gift Request and Letters (CRINGLE) – Processing, scanning, content extraction and analysis of 6.5 million letters, emails and calls, and recording 19.5 million gifts requested
- Naughty or Nice Information Tracking System (NITS) – Processing and tagging of 16.8 trillion person-to-person interactions throughout the year
- Scheduling, Logistics and Expedited Distribution System (SLEDS) — Handling of 500,000 appearance requests and 280,000 actual mall and other appearances; the operation of 7700 gift express hubs and the logistics and maintenance of the half-million sleighs servicing them; and night-of-delivery (NOD) routing
- Kontact Information & Directory System (KIDS) – Basic contact, rooftop and chimney configuration information on 2.3 billion gift recipients and their 880 million households
- Helper Organization, Operations and Orchestration (HO-HO-HO) – Scheduling and coordination of elf workforce job responsibilities and activities; also coordinates elf housing and food service
- Job Information, Guidance, Learning & Elf Management System (JINGLES) – General elf resource (ER) system for tracking the performance, benefits and training activities of 230 million elves, along with ongoing recruiting activities
- Study for Negating the Outcome of Warming (SNOW) – A longitudinal study as part of NPI’s sustainability efforts. Millions of climate, atmospheric, emissions, deforestation, and animal and human population data points are collected annually to help NPI achieve its target of carbon neutrality by 2020
[See bottom of article for North Pole Inc. Core Data Requirements and Database Sizing]
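As a sanity check on the (admittedly fictional) sizing, the NITS interaction volume above can be reconciled with the 168-terabyte figure mentioned later in the post, assuming roughly 10 bytes per tagged interaction. That record size is my assumption for illustration, not anything from NPI's schema:

```python
# Fictional sizing sanity check for the NITS system described above.
# The 10-bytes-per-interaction record size is assumed here purely to
# reconcile the numbers; it is not stated in the post.
interactions_per_year = 16.8e12    # person-to-person interactions tagged
bytes_per_interaction = 10         # assumed compact naughty/nice tag record
total_bytes = interactions_per_year * bytes_per_interaction
terabytes = total_bytes / 1e12     # decimal terabytes
print(f"{terabytes:.0f} TB")       # matches the 168 TB cited for NITS
```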
Data Quality as Pure as the Driven Snow
Due to impeccable data governance and quality processes, a world-class master data management program, an impressive team of data elves, robust data quality technology, and unwavering executive-level commitment and involvement, NPI’s information assets show no signs of significant completeness, accuracy, integrity or other quality issues according to sample data profiling using Gartner’s data quality assessment toolkit.
Analytic Opportunities Beyond Just “Naughty or Nice”
From a business intelligence perspective, Gartner found that NPI is lagging others in the shipping and distribution industry. Its enterprise data warehouse, called “Chimneys”, is really a collection of stovepipe query and reporting systems, some still relying on first-generation BI tools like Red Brick. Gartner recommended evolving to a logical data warehouse architecture for most low-frequency queries to enable more insightful cross-functional, federated analytics.
Some predictive analytics is done to select appropriate toys based on NITS behavior modeling, demographics and prior-year presents. Gartner recommended that this system be enhanced to account for factors such as sibling response, damage/loss propensity, and social content analysis. NPI however is working on mobile-enabling Santa in the field during mall appearances so he can advise on toy availability and alternatives (as necessary) in real-time while a child is on his lap. This system is expected to be in place for the 2013 holiday season. Gartner analysts pointed out that this new capability would also require enhancing its “Tommy” toy order management system to capture full catalog and supply chain information from its suppliers. Today NPI only maintains this tracking data on actual orders.
Although NPI does a great job of social media participation, including a multi-channel Twitter strategy (i.e. @santa, @officialsanta, @santaclaus, @santa_claus, etc.), Gartner recommended that NPI begin tapping and analyzing social media streams. Social sentiment analysis will help NPI identify emerging “hot toys” for pre-ordering, and identify early warning signals of quality-related issues. NPI also considered integrating global economic data to better focus its gift giving on those in greatest need. However, NPI, like many organizations, is struggling to hire or train a team of data scientists. “Advanced analytics just isn’t a core elfin competency,” lamented Mr. Ellefsen. “We’re definitely going to have to fly up outside talent for a period of time.”
Operational Efficiency at Times Glacial
Gartner also advised NPI on how to consolidate its ordering process and information. Since the late 1970s, NPI has been consolidating inbound shipments using its gift express hubs scattered secretly in forests around the world. However, it still orders and inventories gifts from suppliers one by one. “Our ‘Tommy’ system is definitely outmoded,” admitted Mr. Ellefsen. With sophisticated demand analysis, order pattern matching and smart RFID-enabled inventory management, Gartner believes NPI could save 70-80% of its current TOM processing expense.
No More Cookie Cutter Approaches to Data Management
Regarding the human behavior tracking system (NITS), Gartner suggested that in today’s world perhaps both online interactions (text, email, social media) and human-to-animal interactions should also be captured and tagged as “naughty” or “nice”, and that a broader 5-point Likert scale or automated video/audio analysis might improve measurement precision. NPI is obviously concerned by the size and performance of this already 168 terabyte system, but will be looking into HDFS or other NoSQL alternatives to support expanded tracking ideas. “For obvious reasons, we got away from inverted tree data management structures years ago,” Mr. Ellefsen chuckled.
Gartner and NPI also discussed a long-term cloud strategy. But with over 200 terabytes of online operational data, strict personally identifiable information (PII) privacy and security requirements, and spotty connectivity at its arctic headquarters, Gartner recommended that at this time NPI consider hosted data solutions only for its 7700 gift express hubs.
A Big Sack of New Ideas for Big Data
During the “Workshop at the Workshop” session as it was called, Gartner and NPI generated many innovative ways to use information, including:
- selecting toys that would encourage naughtier kids to be nicer
- putting de-identified data online for suppliers to analyze
- realtime NOD (night of delivery) routing and navigation via integrated weather, GPS and air traffic data to optimize Santa’s 10,200 takeoffs, landings and deliveries per second.
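That per-second delivery figure is a simple back-of-envelope calculation. A quick check, assuming all of the roughly 884 million Christian households cited in the sizing footnote are served over a single 24-hour night (the one-night window is my assumption):

```python
# Sanity-check Santa's delivery rate from the article's own figures.
households = 884_000_000      # Christian households, per the sizing footnote
night_seconds = 24 * 3600     # one night, chased around the globe

rate = households / night_seconds
print(f"{rate:,.0f} deliveries per second")  # ≈ 10,231 per second
```

That lands within a hair of the article’s 10,200 takeoffs, landings and deliveries per second, so the figure checks out.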
However, the entire NPI management team was quick to quash the subject of transitioning to an outsourced, mobile-enabled parental workforce. “Elves have magical capabilities beyond those of most humans,” Mr. Claus interrupted, “Not to mention a tremendously strong union.”
Doug Laney, VP Analytics and Information Management
Gartner, Inc. (NYSE: IT) is the world’s leading information technology research and advisory company. Gartner delivers the technology-related insight necessary for its clients to make the right decisions, every day. From CIOs and senior IT leaders in corporations and government agencies, to business leaders in high-tech and telecom enterprises and professional services firms, to technology investors, Gartner is the valuable partner to clients in 12,000 distinct organizations. Through the resources of Gartner Research, Gartner Executive Programs, Gartner Consulting and Gartner Events, Gartner works with every client to research, analyze and interpret the business of IT within the context of their individual role. Founded in 1979, Gartner is headquartered in Stamford, Connecticut, U.S.A., and has 5,000 associates, including 1,280 research analysts and consultants, and clients in 85 countries. For more information, www.gartner.com.
North Pole Inc. Core Data Requirements and Database Sizing*
* For non-believers, these data sizings were derived from various sources: Population data used to determine the number of worldwide Christians (2.3B) and Christian households (884M) is from the US Census, the Catholic Education Resource Center, the Christian Post, and the Global Population Clock. The average number of presents from Santa (3, excluding stocking stuffers) is from Babycenter.com and CircleofMoms.com. The number of person-to-person interactions (20/day) for calculating the volume of “naughty/nice” data comes from the Tilted Forum Project on Humanity, Sexuality and Philosophy. The amount of correspondence Santa receives is from a Wired Magazine article (500K letters annually) and extrapolated to include emails and worldwide correspondence. The number of toy makers (1547 in US) is from toydirectory.com and is extrapolated to include worldwide toy makers, suppliers and parts. The number of shopping malls (105,000 in US) is from the International Council of Shopping Centers. And package delivery, transportation and personnel numbers are extrapolated from public FedEx data.
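These figures also let us plausibly reconstruct the 168-terabyte NITS sizing mentioned above. A sketch assuming a compact 10-byte record per tagged interaction (the record size is my assumption, not stated in the article):

```python
# Reconstruct the naughty/nice tracking-system sizing from the footnote figures.
christians = 2_300_000_000    # worldwide Christians (footnote)
interactions_per_day = 20     # person-to-person interactions per day (footnote)
bytes_per_record = 10         # assumed compact naughty/nice record (my guess)

records_per_year = christians * interactions_per_day * 365
terabytes = records_per_year * bytes_per_record / 1e12
print(f"{terabytes:.0f} TB")  # ≈ 168 TB, matching the NITS sizing
```

At 10 bytes per interaction the arithmetic lands almost exactly on 168 TB per year, which suggests this (or something close to it) was the derivation.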
Category: Uncategorized Tags: analytics, BI, big data, bigdata, business intelligence, christmas, cloud, data management, data warehouse, humor, information management, mobile, predictive analytics, santa, social media
by Doug Laney | December 18, 2012 | 3 Comments
To understand the significance of December 21, 2012 to the Mayans (and today’s mass media) it’s necessary to recognize and understand the Mayan numbering system, theology and astronomical prowess.
First, the Mayans had two numbering systems, more or less akin to our distinct decimal system for counting things and our Gregorian system for counting dates. However, their numerical system was base-20 (vigesimal), not base-10 (decimal). This owes to the fact that they felt perfectly comfortable using their toes for counting, and relished the ability to represent petabyte-scale numbers like faraway dates efficiently. The downside of this, and of some unfortunate anomalies they introduced, was that they were never able to master multiplication or division. Unlike the ancient Romans, though, Mayan data modelers did invent a symbol for the number zero, which turns out to be an important part of the story.
However, unlike most of our cultures, the Mayans also had two distinct calendar systems: the “Short Count” and the “Long Count”. The Short Count derives from a sacred count of 260 days known as the tzolkin, munged with Venus’s relatively protracted year. Although based in part upon astronomical observations, this calendar was purely for ritualistic purposes, is still used by Guatemalan highlanders today, and bears no relevance to our imminent ominous occasion. The Long Count calendar is also based on astronomical observations and cycles, and multiples thereof.
The longest of the five nested Long Count cycles is the Baktun, which is 144,000 days or about 400 years – interestingly the same as our present-day quadricentennial leap year cycle. The 13-Baktun “Great Cycle” spans 5125.36 years, completing (and iterating, I hope) on December 21, or 13.0.0.0.0 in Mayan nomenclature.
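The Baktun arithmetic is easy to verify from the Long Count’s mixed-radix place values (note the 18-uinal tun, one of the anomalies that breaks pure base-20):

```python
# Long Count place values in days: kin (1 day), uinal (20 kin),
# tun (18 uinal = 360 days), katun (20 tun), baktun (20 katun = 144,000 days).
DAYS = {"baktun": 144_000, "katun": 7_200, "tun": 360, "uinal": 20, "kin": 1}

def long_count_to_days(baktun, katun, tun, uinal, kin):
    """Convert a Long Count date to a total day count."""
    return (baktun * DAYS["baktun"] + katun * DAYS["katun"]
            + tun * DAYS["tun"] + uinal * DAYS["uinal"] + kin)

great_cycle = long_count_to_days(13, 0, 0, 0, 0)
print(great_cycle)                 # 1872000 days
print(great_cycle / 365.2425)      # ≈ 5125.36 years
```

Dividing the 13-Baktun total of 1,872,000 days by the mean Gregorian year of 365.2425 days reproduces the 5125.36-year Great Cycle cited above.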
But why December 21st? What happened 5125 years ago on 0.0.0.0.0? The answer that perplexed scholars until recently is: nothing. Nothing happened on that date, which happens to predate the Mayan civilization by some 3000 years. Unlike most modern-day cultures, whose ethnocentric calendars begin on an important date in their own history, the Mayans saw themselves as part of a much bigger and longer picture…one of astronomical scale. It wasn’t until scholars determined that the date 13.0.0.0.0 coincides with a confluence of Mayan theology and rare astronomical events (due to the precession caused by the slow wobbling of the Earth’s axis) that they realized the Mayan calendar was reverse-engineered.
After decades and centuries of data collection (i.e. ancient Big Data curating methods), the Mayans’ best data scientists projected that on December 21, 2012 the Sun’s ecliptic would pass through the center (“dark region” or “dark road”) of the Milky Way, not just on any old day, but on the winter solstice. It is on this day that the Mayans depict their sun god Pacal (no relation to Blaise) traveling into the underworld to do battle with the lords of Xibalba.
So if you want to really impress someone this holiday season, wish them a Happy 14th Baktun or “May you have a renewed Great Cycle!”
Follow Doug on Twitter: @Doug_Laney
Category: Uncategorized Tags: analytics, big data, data scientist, mayan
by Doug Laney | August 15, 2012 | 4 Comments
Tobin’s q is a simple ratio first posited by Nobel-winning American economist James Tobin in the 1960s to understand the relationship between a company’s market value and the replacement value of its assets. Analysis shows that this quotient has been growing since financial statements were standardized following the Great Depression. Smoothing economic boom and bust cycles via linear regression, Tobin’s q has more than doubled, from 0.4 in 1945 to roughly 1.1 today.
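The ratio itself is trivial to compute. A minimal sketch with illustrative figures (the dollar amounts are invented for the example, not drawn from any actual filing):

```python
def tobins_q(market_value, asset_replacement_cost):
    """Tobin's q: a firm's market value divided by the replacement
    cost of its assets. q > 1 implies the market prices in value
    beyond the tangible assets on the books."""
    return market_value / asset_replacement_cost

# 1945-style firm: market values it below its tangible assets.
print(tobins_q(0.4e9, 1.0e9))   # 0.4
# Present-day firm: intangibles push the market value past the assets.
print(tobins_q(1.1e9, 1.0e9))   # 1.1
```

The gap between the two illustrates the article’s point: whatever lifts q above 1 is, by definition, something not on the balance sheet.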
This means that in general markets now value companies more than the sum of their tangible assets. How can this be? Non-reportable intangible assets of course.
We know that, due to 75-year-old accounting standards, certain intangibles cannot be valued and reported. The unreportable intangibles most frequently cited are human capital and intellectual capital. Yet could these alone have doubled over seven decades? Do corporations of similar revenue have twice the number of employees they once did? No, quite the opposite, as we’ve become more efficient and more reliant on technology. Do humans have twice the knowledge capacity we had back in the day? More than just my teenager would fervently disagree with that.
Then what is it that companies have so much more of, that has been accumulating for over half a century, and that is hidden from balance sheets?
Ever since Arthur Andersen computerized payroll at a GE plant in 1953, companies have become better and better at amassing information assets (leading up to this age of Big Data) and at finding ways to leverage them. Yet the value of information isn’t quantified or reported in any way. Even today’s infocentric companies, whose business models revolve around collecting, buying and selling data (e.g. Facebook, Google, Experian, Nielsen), have balance sheets devoid of their most valuable asset.
Furthermore, a study by intellectual capital research firm Ocean Tomo shows that the portion of corporate market value attributable to intangibles grew from 17% in 1975 to a whopping 81% in 2010. Indeed, information accumulation has not only increased dramatically in businesses; the importance of information itself has supplanted traditional assets in generating revenue, and therefore in contributing to market value as well.
So what are CEOs to do knowing that information comprises a majority of their corporate value? First, forget what the accountants say, and listen to what the market is saying. Stop just talking about information as such an important asset and start valuing and managing it like one.
For further reading on the topic of infonomics:
Category: Uncategorized Tags: big data, data, infonomics, information, information assets, information management