Jonah Kowall

A member of the Gartner Blog Network

Jonah Kowall
Research Vice President
3.5 years with Gartner
20 years IT industry

Jonah Kowall is a research Vice President in Gartner's IT Operations Research group. He focuses on application performance monitoring (APM), Unified Monitoring, Network Performance Monitoring and Diagnostics (NPMD), Infrastructure Performance Monitoring (IPM), IT Operations Analytics (ITOA), and general application and infrastructure availability and performance monitoring technologies.

Ruxit is an impressive new SaaS-delivered APM offering

by Jonah Kowall  |  October 7, 2014  |  4 Comments

Compuware APM (now Dynatrace) has a new product and brand that they are treating internally like a startup. This new software product has been in the works for a while and takes a unique approach to APM. Although they would like it to be seen as separate from Compuware, that’s not quite the case.

They are clearly targeting the web-first APM companies (such as New Relic and AppNeta). Additionally, with its unique features and focus on ease of use, the product will appeal to enterprises, the same way New Relic has been making inroads in enterprise adoption. Come and hear more at the upcoming Gartner Data Center Conference in December, where we’ll have end user speakers talking about this!

The product is SaaS only, and they provide a free trial. Billing is usage based (by the hour); there is no “monthly” price aside from paying for the hours each monitored instance runs in a month. After signing up, the server side takes a little time to provision. Once it is provisioned you log in (below):

1

The main login allows provisioning and management of multiple environments (essentially AWS zones today, but in the future that could be a lot more). The problem is there isn’t really a wizard; you just get dropped in (the Ruxit team has provided feedback that they have already improved this since my testing; don’t you love the iterations with SaaS?):

2

After clicking the link for the proper environment, a wizard is presented to help with implementing the host agents:

3

Download the agent, which is already keyed for your environment; this makes the install very easy. On the Linux side it can be downloaded directly, or you can use the wget link that is provided. In this example agents were installed on Windows and Linux; this is the Windows download link:

4

Install completed on the server, and you have my tenant ID, but whatever :):

5

Ruxit is very easy to use and implement. Here is a description of what the product supports on Linux:

6

Once installed (for example on my Linux host), it discovered and started monitoring the Tomcat install running on the JIRA host. What makes Ruxit unique is that there was no editing of scripts, configuration files, or anything else. The product really does implement seamlessly. This is a new trend you will see among leading APM tools, but Ruxit is first to market with this feature:

7

On the Windows side it discovers multiple services. Ruxit doesn’t go as deep on .NET as it does on Java right now:

8

More screenshots on a SQL server:

9

Once hosts are monitored, here is the dashboard. The tiled interface is scrollable and all HTML5. The UI feels and acts like Windows 8:

10

The dashboard is customizable; sections can easily be added or organized via drag and drop:

11

When adding a tile, here are some of the options:

12

One of the unique elements of Ruxit is that it has some IPM capabilities, so it goes deeper into the infrastructure and topology of applications, beyond what is instrumented. The Smartscape view is one representation of this. Layers on the left hand side represent the applications, services, hosts, and data centers (VMware APIs are supported). You will see additional support for IaaS and virtualization providers in the future:

13

Clicking an entity provides details of the layers within the application (this one is SharePoint – .NET):

14

Here is the view for a Java App:

15

When drilling into a component, high level views are presented, including metrics, methods, problems, and events. Rich data is presented, especially for Java applications. Less detail is shown for .NET or other applications (the Ruxit team has clarified that PHP and .NET will be enhanced this year, and that Node.js is currently in beta):

16

Yet another view shows the services across tiers; you’ll notice the breakdown in the left hand column:

17

Here is the same view for the SharePoint app on IIS:

18

Similarly here is a database view to give you the depth via JDBC:

19

Drill into the database activity to see the breakdown of slow statements and where time is being spent:

20

Backing up to the Smartscape, there are interesting views from the database host perspective. Here you can see that both the .NET and Java applications (SharePoint and JIRA) are using the SQL backend:

21

Sticking to the host level view, system level metrics are visible when drilling down:

22

Clicking on a metric shows graphs and other data associated, in this case memory:

23

On the SQL server this is the breakdown of processor usage by process:

24

Here is the Disk IO graph:

25

The ever important disk latency broken down by reads and writes.

26

Process level view of the SQL processes and associated callers:

27

Here are some alerts, and a new take on alerting:

28

Some RUM screenshots:

29

My maps don’t look very good since my IPs are all internal, so I’m not showing those. You can configure how applications and URLs are classified, along with how sessions and IPs are tracked and detected. Additionally, the product supports many common JavaScript frameworks. The product includes 3rd party information capture capabilities if you enable them. The views include a world map view with drilldowns. Ruxit is the first vendor using the new Resource Timing APIs available in many modern browsers.

Ruxit is a unique and much needed entry into the APM market, with an easy to implement SaaS deployment model providing breadth and just enough depth for many buyers. The product has unique and interesting IPM capabilities which reach beyond the traditional APM market, where products tend to focus on visibility and instrumentation within the applications themselves. We will see more products like this begin to emerge, along with products combining unified monitoring and IPM to bridge availability and performance.

The pricing model is different from many other products: $0.15 per hour for application monitoring, and $0.15 per 1,000 user visits for RUM. For a host monitored around the clock, this works out to about $108 per month (24 hours x 30 days x $0.15), and RUM is priced based on volume. This is competitive pricing, but more depth is needed, especially around RUM (the Ruxit team says this is an area of focus for this year).
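The monthly figure is simple arithmetic on the hourly rate. Here is a minimal sketch, assuming a 30-day month and an agent that runs around the clock; the function names are hypothetical, and only the two $0.15 rates come from the pricing quoted in this post:

```python
# Rates quoted in the post; everything else here is an illustrative assumption.
HOURLY_RATE_USD = 0.15       # per monitored host, per hour
RUM_RATE_USD_PER_1K = 0.15   # per 1,000 user visits


def monthly_host_cost(hours_per_day=24, days=30, rate=HOURLY_RATE_USD):
    """Cost of monitoring one host for a month of continuous uptime."""
    return round(hours_per_day * days * rate, 2)


def monthly_rum_cost(visits, rate_per_1k=RUM_RATE_USD_PER_1K):
    """Cost of real user monitoring for a given number of visits."""
    return round(visits / 1000 * rate_per_1k, 2)


if __name__ == "__main__":
    print(monthly_host_cost())        # 108.0
    print(monthly_rum_cost(500_000))  # 75.0
```

Because billing is by the hour, a host that only runs part of the month costs proportionally less; for example, `monthly_host_cost(hours_per_day=8)` covers an instance that is up eight hours a day.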

It will be interesting to see how adoption of this new and fresh APM product goes.

Category: APM IT Operations Monitoring Pick of The Week SaaS

Gartner Infrastructure and Operations Management Team : SAS During Magic Quadrants

by Jonah Kowall  |  September 17, 2014  |  4 Comments

To assist our vendor clients we often do strategy days. These are typically full or half days where I work with clients to help them craft strategy, go to market plans, or provide other consultative help we cannot deliver during our 30 to 60 minute inquiry phone calls. Our vendor clients find these very helpful, so we have a good amount of demand for these sessions. The problem is that when analyst resources are tied up for extended periods, travel and other requirements cause longer wait times for phone calls.

Creating Magic Quadrants is a very lengthy process, which consumes a lot of analyst and vendor resources and time. We regularly include vendors who are not clients in our Magic Quadrants and research, and we provide the same level of analyst access to non-clients as to clients during Magic Quadrant processes. Additionally, we do not discuss Magic Quadrants with clients or non-clients once they are in process, aside from specific conversations which are focused on the Magic Quadrant.

The way we handle vendor interactions during Magic Quadrant development is just one example of Gartner maintaining independence. Gartner analysts also don’t deliver white papers or other vendor sponsored research notes, and analysts are not compensated for, or otherwise tied to, the sale of our products or consulting. These are reasons why Gartner is differentiated, and they are how we avoid bias.

Avoiding bias is always critical for us to deliver the best product possible. This can prevent analysts from doing SAS with vendors in a specific Magic Quadrant while it is being authored.

I always strive to be transparent, and thought this would be good to share with the public and with those I enjoy helping to build the best technology and make the right decisions.

9/22/14: A few edits as requested by David Black, who’s a VP in the Content and Methodologies area. Expect more news around this in the coming weeks from Gartner across the company.

 

Category: Uncategorized

Cool Vendor Pick of the Week: Renesys Reborn as “Dyn Internet Intelligence”

by Jonah Kowall  |  September 16, 2014  |  2 Comments

Former 2013 APM cool vendor Renesys was recently bought by Dyn http://dyn.com/blog/dyn-acquires-renesys-the-global-authority-on-internet-intelligence-2/. You would not be too far off in saying that Dyn, which is known for DNS services, traffic management (using DNS), and email marketing software (via acquisition), is making a move into a new market. The interesting thing, as we highlighted in the cool vendor report, is that Renesys has a unique data set. You can say the same thing about Dyn, which manages traffic, email, and DNS for a lot of companies out there. Here are some interesting stats (sorry, I like data too much):

  • 327k queries per second (monthly average)
  • Over 1 billion emails sent per month
  • 230+ countries & territories occupied by Dyn users
  • +65k Domains Registered (2013)

After buying Renesys, they began to integrate and analyze some of this vast amount of data and turn it into a more operationally focused product. The result is Internet Intelligence, which was released today: http://www.digitaljournal.com/pr/2190339. I was kind of upset they put me on a trial, since I wanted to keep this tool after using it for a bit. After the login, here are the highlights of the product:

image001

As you can see, the initial product is focused on diagnostics and measurement, and less so on monitoring, but some monitoring elements come out later. This next screenshot shows the visual connectivity between the AS numbers.

image002

image004

Additionally you can see them visually on a map and explore them in an easy manner:

image006

Here is a view for this specific AS number:

image008

You can also see the vantage from a point and the latency as you move from that point, this allows you to explore performance in a more ad-hoc manner:

image010

A path can be selected and explored as well:

image012

Taking these two views together you can look at path and latency data:

image014

Latency can be examined over time as well, which can provide an indication of what the network latency might look like to a customer on a specific ISP. You can see how this would be useful for those hosting services or running SaaS businesses:

image016

If you are looking to improve performance to specific customers or areas, you can also compare and understand providers and the options for connecting points:

image018

I found the measurements showing variability to be particularly interesting; the scatter plot shows you the actual measurement variance:

image020

Looking out over longer periods of time you can see more about path variability:

image022

This is about the only monitoring in the product today, and I expect this to improve a lot. You will notice the network events bulletin; they have an RSS feed for this data, which I actually have in my Feedly and love!

image024

In order to set up monitoring you must add the assets into your portfolio:

image026 image028

As you can see or at least imagine, this is an interesting but early product. It’s good to see this great data being exposed for operational purposes, and I expect it to become more relevant, as the internet is so critical for almost every business today.

Please leave comments and questions here, or via Twitter @jkowall

Category: Analytics APM Monitoring Pick of The Week SaaS

Pay Attention : How Performance Affects User Experience and Your Bottom Line, and What to Do About It

by Jonah Kowall  |  September 9, 2014  |  3 Comments

I’ve been eagerly awaiting the publication of this new research. I worked with colleagues from other coverage areas, including Magnus Revang, who focuses on application development (UI/UX). Magnus did most of the work on this note and really drove it. I helped with contributions, along with Ray Valdes, who covers web technologies and social platforms. The result is a really interesting research note in which we pull in a lot of research around performance and speed, and how they relate to your bottom line. Additionally, we start to dig into why performance is critical when it comes to user experience, and the relationship between efficiency, engagement, and performance. We then move on to recommendations for how to measure, improve, and iterate on performance, including a lot of free and commercial tools to help with these steps.

Clients can find the research here:

How Performance Affects User Experience and Your Bottom Line, and What to Do About It

You can follow and interact with the authors on Twitter:

Magnus Revang @MagnusRevang

Ray Valdes @rayval

Jonah Kowall @jkowall

 

Category: Analytics APM Monitoring

VMworld 2014 Recap – What about the applications?

by Jonah Kowall  |  September 4, 2014  |  5 Comments

Thanks to VMware for inviting me to attend VMworld this year. I was looking forward to learning more about the progress on vCenter Operations Manager (especially Log Insight and Hyperic). I was also eager to learn more about NSX and the management packs around NSX, which are critical to adoption. My colleagues who cover NSX closer to engineering and architecture get a lot more calls than we do on the management side of things. Similarly, when I compared sessions on NSX deep dives with those on NSX management using vCenter Operations Manager, there were about 12x as many people in the deep dive sessions, which shows that people are learning rather than implementing at this stage.

There were other interesting announcements around OpenStack, which I think will fundamentally cause issues for where parent company EMC makes most of its money… selling enterprise class storage hardware. The true question is: why not contribute to existing distributions rather than add fragmentation to an already complex project? The containerization announcements were also interesting, and caused many of us to think about the future of virtualization and the hypervisor as we become better able to abstract software and configuration into a more lightweight model.

As I attended sessions, spoke with attendees, and saw the vendors present in the exhibit halls, it struck me that VMware is lodged under the stuff which matters most to IT. Within my inquiry and coverage there is a major shift taking place, where IT is trying to move up from the infrastructure components towards end user experience and application level visibility. The growing importance of APM and application level visibility is what’s driving interest. VMware seems challenged to address this interest. Lydia’s blog is spot on: http://blogs.gartner.com/lydia_leong/2014/08/25/bimodal-it-vmworld-and-the-future-of-vmware/. Zimbra, SpringSource (now Pivotal), and other software acquisitions aimed at addressing developers and moving above the infrastructure have resulted in failure and divestiture. VMware can still innovate, but unfortunately today it seems to be occurring below the level of what matters, which is the applications.

Who will adapt and move, and which IT organizations will become largely irrelevant as people work around them?

Category: IT Operations Monitoring Trade Show

HP Pronq AppPulse Mobile (BETA)

by Jonah Kowall  |  September 2, 2014  |  3 Comments

HP has been making pretty significant efforts in building SaaS delivered products. In fact, SaaS delivery was called out in the recent earnings call as a highlight for an otherwise struggling HP Software business. Within the Pronq brand (I know the name is a bit odd) they include several SaaS delivered products with a lightweight sales motion, including try and buy with online credit card transactions. These offerings include Fortify on Demand (Security), Agile Manager (Dev), Vertica (Analytics), LoadRunner (Performance Testing), HP Anywhere (Mobile Dev/Distribution), StormRunner (Mobile performance testing), Virtualization Performance Viewer (for VMware vSphere and Microsoft Hyper-V), AppPulse (synthetic monitoring and full featured APM), and finally the focus of this posting, AppPulse Mobile (Beta). The product is very easy to sign up for and get a trial:

1

Within the Pronq products HP does share pricing until you get into larger deal sizes, in which case you do need to contact them. For AppPulse Mobile, the pricing will be released when the product is made generally available. The pricing, according to HP, will be competitive and based on the standard monthly active user (MAU) model we see most Mobile APM products using. The product works by downloading a wrapper which instruments an iOS or Android application, which can then be loaded onto the device directly or via an MDM app store. This is something other vendors also offer. In this screenshot you configure a new mobile app:

2

3

4

I took an apk file for Android and instrumented it with the wrapper below:

5

After installing the app on my device and using it a few times, data started flowing in 15 minutes. Since I did my testing, HP informed me this has been reduced to about 5 minutes, and it will be 1 minute when the product is released and generally available.

6

Once you drill into the app here is the dashboard. This is after using the app for a while and getting some good data.

7

I had some poor experiences, and this is how the dashboard looks:

8

This is the drill down into why the “Fundex” was dropping (not a fan of the name, but the concept of a performance index makes sense). The idea of the index is user experience focused, similar to what Apdex was intended to do for user experience measurement on the web. HP has taken this concept further by looking at slow user actions, crashes, and device level resource consumption. By simplifying APM to these concepts and focusing on ease of use, HP will better appeal to less technical users.
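For readers who have not worked with Apdex, the score is the share of satisfied samples plus half of the tolerating samples, where "satisfied" means a response time at or below a target threshold T and "tolerating" means between T and 4T. The sketch below shows that standard Apdex calculation only; it is not HP's Fundex, whose formula is not described here, and the sample values and threshold are made up for illustration:

```python
def apdex(response_times, threshold):
    """Standard Apdex score for a list of response times (same unit as threshold).

    satisfied:  t <= T
    tolerating: T < t <= 4T
    frustrated: t > 4T (counts as zero)
    """
    if not response_times:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


if __name__ == "__main__":
    # Hypothetical response times in seconds, with a 0.5 s target.
    samples = [0.2, 0.4, 0.9, 1.5, 3.0]
    print(apdex(samples, threshold=0.5))  # 2 satisfied, 2 tolerating, 1 frustrated -> 0.6
```

An index like the one HP describes would presumably fold in more signals (crashes, device resource consumption), but the idea of collapsing many measurements into a single 0 to 1 experience score is the same.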

9

In my case, even though the app was crashing, the crashes weren’t detected. After speaking with HP, I learned this was a bug which was fixed in the latest push. HP is releasing code to production every two weeks. HP’s crash detection includes the list of steps the user took before the crash, helping developers understand the path of actions needed to replicate it.

10

More details from within the app. HP records and extracts user actions from within the app. This is much closer to what we expect to see with end user experience monitoring within web applications, but applied to native mobile apps.

11

Overall this is a good product offering for mobile APM, but it could go much deeper on network, device, carrier, and location based analysis. This is a beta, so more features will be added, and the install base will grow once it is released. HP is moving from a traditional software delivery cadence towards a SaaS first delivery method including weekly hotfix releases, monthly content updates, and quarterly major releases. Additional features seen in some other products include mobile app store integration and a deeper overall visibility from the mobile operations perspective. Once again the focus here seems more around MDM versus developer centric Mobile APM.

Category: APM Mobile SaaS

Introducing the Criteria for the 2015 Network Performance Monitoring and Diagnostics (NPMD) Magic Quadrant

by Jonah Kowall  |  August 15, 2014  |  3 Comments

There are some changes in the upcoming Q1 2015 delivery of the NPMD Magic Quadrant. Vivek Bhalla (@vbhalla1) will be taking over as the lead author of the research, and I’ll be co-authoring along with Colin Fletcher and Gary Spivak. We will be sending vendor surveys out on Monday. If you are a vendor who believes you qualify based on the criteria below and you do not receive a survey on Monday, please reach out via Twitter or email (firstname.lastname@gartner.com). We regularly include non-clients in research and this research is no different; we strive to build the most relevant research for our large end user client base.

Thank you, and we look forward to this research deliverable.

Market Definition

NPMD tools allow for network engineers to understand the performance of applications and infrastructure components via network instrumentation. Additionally, these tools provide insight into the quality of the end user’s experience. The goal of NPMD products is not only to monitor the network components to facilitate outage and degradation resolution, but also to identify performance optimization opportunities. This is conducted via diagnostics, analytics and debugging capabilities to complement additional monitoring of today’s complex IT environments.

This market is a fast-growing segment of the larger network management space ($1.9 billion in 2013) and overlaps slightly with aspects of the application performance monitoring space ($2.4 billion in 2013). Gartner estimates the size of the NPMD tools market at $1.1 billion.

Inclusion Criteria

Vendors will be required to meet the following criteria to be considered for the 2015 NPMD Magic Quadrant:

  • The ability to monitor, diagnose and generate alerts for:
    • Network endpoints — Servers, virtual machines, storage systems or anything with an IP address by measuring these components directly in combination with a network perspective.
    • Network components — Such as routers, switches and other network devices. This includes SDN and NFV components.
    • Network links — Connectivity between network-attached infrastructure.
  • The ability to monitor, diagnose and generate alerts for dynamic end-to-end network service delivery as it relates to:
    • End-user experience — The capture of data about how end-to-end application availability, latency and quality appear to the end user from a network perspective. This is limited to the network traffic visibility and not within components such as what application performance monitoring is able to accomplish.
    • Business service delivery — The speed and overall quality of network service and/or application delivery to the user in support of key business activities, as defined by the operator of the NPMD product. These definitions may overlap as services and applications are recombined into new applications.
    • Infrastructure component interactions — The focus on infrastructure components as they interact via the network, as well as the network delivery of services or applications.
  • Support for analysis of:
    • Real-time performance and behaviors — Essential for troubleshooting in the current state of the environment. Analysis of data must be done within three minutes under normal network loads and conditions.
    • Historical performance and behaviors — To help understand what occurred or what is trending over time.
    • Predictive behaviors by leveraging IT operations analytics technologies — The ability to distill and create actionable advice from the large dataset collected across the fourth requirement.
  • Leverage the following data sources:
    • Network-device-generated data, including flow-based data sources inclusive of NetFlow and IPFIX.
    • Network device information collected via SNMP.
    • Network packet analysis to identify application types and performance characteristics.
  • The ability to support the following scalability and performance requirements:
    • Real-time monitoring of 10 gigabit (10G) Ethernet networks at full line rate via a single instance of the product.
    • Ingestion of sampled flow records at a rate of 75,000 flows per second via a single instance of the product.
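To give a feel for what these two scalability requirements mean in practice, here is a rough back-of-the-envelope sketch. It assumes standard Ethernet framing, where each frame carries 20 extra bytes on the wire (preamble, start-of-frame delimiter, and inter-frame gap), and it uses the usual 64-byte minimum and 1518-byte maximum untagged frame sizes. The function names are hypothetical; only the 10G and 75,000 flows per second figures come from the criteria above:

```python
LINK_BPS = 10_000_000_000   # 10 gigabit Ethernet
WIRE_OVERHEAD_BYTES = 20    # 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap
FLOWS_PER_SECOND = 75_000   # sampled flow ingestion requirement
SECONDS_PER_DAY = 86_400


def frames_per_second(frame_bytes, link_bps=LINK_BPS):
    """Maximum whole frames per second at full line rate for a given frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps // bits_on_wire


def flow_records_per_day(flows_per_second=FLOWS_PER_SECOND):
    """Flow records a single instance must absorb in a day at a sustained rate."""
    return flows_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    print(frames_per_second(64))     # 14880952 (worst case: minimum-size frames)
    print(frames_per_second(1518))   # 812743 (maximum-size untagged frames)
    print(flow_records_per_day())    # 6480000000
```

Real traffic is a mix of frame sizes, so actual packet rates fall somewhere between the two extremes.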

Non-product Related Criteria

  • A minimum of 10 NPMD customer references must be included at the time of survey submission.
  • Customer references must exclude security-oriented use cases and scenarios.
  • Customer references must be located in at least two of the following geographic locations: North America, South America, EMEA, and/or Asia/Pacific/Japan.
  • Total NPMD product revenue (including new licenses, updates, maintenance, subscriptions, SaaS, hosting and technical support) must have exceeded $7.5 million in 2013, excluding revenue derived from security-related buying centers.
  • The vendor should have at least 75 customers that use its NPMD product actively in a production environment.
  • The product, and the specific version, submitted for evaluation must be shipping to end-user clients for production deployment and designated with general availability by October 31, 2014.
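For vendors who want to sanity check themselves against the numeric parts of this list, the thresholds can be written down as a simple checklist. This is only an illustrative sketch: the `VendorSubmission` fields and the `unmet_criteria` function are hypothetical names, the references are assumed to already exclude security-oriented use cases, and the actual evaluation is done by the analysts from the survey, not by a script:

```python
from dataclasses import dataclass
from datetime import date

# Thresholds taken from the non-product criteria listed above.
REGIONS = {"North America", "South America", "EMEA", "Asia/Pacific/Japan"}
MIN_REFERENCES = 10
MIN_REGIONS = 2
MIN_REVENUE_USD = 7_500_000      # 2013 NPMD revenue must exceed this
MIN_PRODUCTION_CUSTOMERS = 75
GA_DEADLINE = date(2014, 10, 31)


@dataclass
class VendorSubmission:
    reference_regions: list       # region of each non-security customer reference
    npmd_revenue_2013_usd: float  # excluding security-related buying centers
    production_customers: int
    ga_date: date                 # general availability date of the submitted version


def unmet_criteria(v):
    """Return a list of the numeric criteria this submission fails (empty if none)."""
    problems = []
    if len(v.reference_regions) < MIN_REFERENCES:
        problems.append("fewer than 10 customer references")
    if len(set(v.reference_regions) & REGIONS) < MIN_REGIONS:
        problems.append("references cover fewer than two regions")
    if v.npmd_revenue_2013_usd <= MIN_REVENUE_USD:
        problems.append("2013 NPMD revenue did not exceed $7.5 million")
    if v.production_customers < MIN_PRODUCTION_CUSTOMERS:
        problems.append("fewer than 75 production customers")
    if v.ga_date > GA_DEADLINE:
        problems.append("not generally available by October 31, 2014")
    return problems


if __name__ == "__main__":
    example = VendorSubmission(
        reference_regions=["North America"] * 6 + ["EMEA"] * 4,
        npmd_revenue_2013_usd=9_000_000,
        production_customers=120,
        ga_date=date(2014, 9, 1),
    )
    print(unmet_criteria(example))  # []
```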

Critical Capabilities

The 2015 NPMD Critical Capabilities will be published subsequent to the 2015 NPMD Magic Quadrant and will be a complementary piece of research.

Your survey submission and demo briefing will also be used for the purposes of writing this document in addition to the Magic Quadrant.

The 2015 NPMD Critical Capabilities will be assessed upon the following criteria:

  1. End-point, Component and Link Monitoring
  2. Service Delivery Monitoring
  3. Diagnostics
  4. IT Operations Analytics
  5. Integration

In addition to the above criteria, we will be evaluating each vendor’s ability to cross multiple buying centers, as well as its ability to target specific verticals as validated by reference customers.

Category: IT Operations Monitoring NPM NPMD

Market Guide for Network Packet Brokers (NPBs)

by Jonah Kowall  |  August 14, 2014  |  2 Comments

A long running research project has been underway for many months collecting data from vendors and putting together this research note. Big kudos to my colleague Vivek Bhalla for doing most of the work on this note. It came out great, with lots of valuable insight and vendor analysis.

The network packet broker (NPB) space has been interesting for the last couple of years, with lots of shifts and market changes happening. We highlight many of these changes in the market guide, and provide write-ups with strengths and challenges for each vendor profiled. We did this for solutions from Apcon, Arista Networks, cPacket, Cubro, Gigamon, Interface Masters Technologies, IXIA, JDSU (Network Instruments), Netscout, and VSS Monitoring. Gartner clients can access the research here:

Market Guide for Network Packet Brokers

07 August 2014  G00263407
Analyst(s): Vivek Bhalla Jonah Kowall

Category: Monitoring NPB

Read Our Latest Research to Learn Why Monitoring Must Evolve to Meet Tomorrow’s Demand

by Jonah Kowall  |  July 29, 2014  |  5 Comments

I’ve been working on a pretty conceptual research project over the last few months. The research is finally out as of yesterday (7/28/14). The basic premise is that as environments and technologies continually evolve and become more abstract and complex, monitoring tools need to evolve in the same manner. The main issue is the use of traditional architectures versus big data and streaming architectures. Additionally, ease of deployment and ease of use are the new normal, and SaaS is a critical deployment model to facilitate fast time to value.

Additionally, we look at tool proliferation issues, and some data behind those problems. The other issue investigated is the general failure of ECA approaches to fix the complexity of the tools. By simplifying the tools with unified monitoring approaches, combined with ITOA, these issues can be more easily handled. On the ITOA front, we share data collected on Unstructured Text Search and Inference (UTSI), or what many call log analytics.

Monitoring tools are beginning to be used for multiple use cases outside of operational visibility, and more of this is investigated in this latest research note. Clients can read more here:

Monitoring Must Evolve to Meet Tomorrow’s Demands

28 July 2014  G00263511
Analyst(s): Jonah Kowall

Category: Analytics APM Big Data ECA IT Operations ITOA Logfile Monitoring OLM SaaS

OSCON 2014 Wrap-up

by Jonah Kowall  |  July 29, 2014  |  4 Comments

This was my first year attending the open source focused OSCON conference by O’Reilly. I’m a huge Velocity fan, and get a lot out of that conference, so I figured I would try another conference. Overall I found this conference far less valuable to me, for several reasons. While there was a bit of interesting content, the show lacked focus in general. The show floor exhibits ranged from do it yourself electronics, to non-profits, to commercial vendors. Much of the conference was about recruiting in Portland, where lots of startups trying to keep pace with growth are pulling talent. Here are some of the better sessions I attended and a little bit about them:

Tutorial Node.js 3 Ways
C. Aaron Cois (Carnegie Mellon University, Software Engineering Institute), Tim Palko (Carnegie Mellon University, Software Engineering Institute)

Since I’ve already done some programming in Node for my Google Glass App, I attended this one more to get a better tutorial than my hack/trial by fire. My programming skills are hackery at best :)

You can find the content here: http://cacois.github.io/nodejs-three-ways/#/

This was a good primer, but there were a couple of people in the room who needed a lot of tech support. Additionally, some pre-prep by attendees would have made this much smoother. We did some good Wi-Fi load testing, which showed the network couldn’t handle peak loads.

An ElasticSearch Crash Course – http://www.oscon.com/oscon2014/public/schedule/detail/33571
Andrew Cholakian (Found)

I’ve written a lot about ElasticSearch, and the internals of the engine were probably the most useful content I got from OSCON this year. I have a much better understanding of the technology with these fundamentals. Here are some notes:

  • Wikipedia is moving to it
  • GitHub code search is based on it
  • Netflix is using it for log data

Indexes can live anywhere, and are split across
Each Index has documents
Every field has an index

Docs are routed via hashing and sharded
Shards are Lucene indexes – they are replicated

Deleting and updating indexes are expensive.
Writes are slow
Cannot do transactional operations
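The "routed via hashing" note is easier to see in code. The general idea is that a document's routing value (by default its ID) is hashed, and the hash modulo the number of primary shards picks the shard. The sketch below uses CRC32 purely as a stand-in hash so it runs with the standard library; the real engine uses its own hash function, and `pick_shard` is a hypothetical name, not an ElasticSearch API:

```python
import zlib


def pick_shard(doc_id, num_primary_shards):
    """Map a document ID to a shard number in [0, num_primary_shards).

    CRC32 is an illustrative stand-in; it is not the hash the engine itself uses.
    """
    if num_primary_shards < 1:
        raise ValueError("need at least one primary shard")
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


if __name__ == "__main__":
    for doc_id in ["jira-1001", "jira-1002", "sharepoint-7"]:
        print(doc_id, "-> shard", pick_shard(doc_id, num_primary_shards=5))
```

Because the shard is derived from the hash and the shard count, the same ID always lands on the same shard, which is what lets any node find a document without a central lookup. It also means the same ID can map to a different shard if the number of primary shards changes.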

Docker – Is it Safe to Run Applications in Linux Containers?
Jerome Petazzoni (Docker Inc.)

This is a major issue with putting Docker into production: control and general security are lacking, so this presentation was interesting and much needed. Jerome was an excellent presenter and made some very good points. I especially liked the idea of running app instances read only, which avoids most of the security issues.

Tracing and Profiling Java (and Native) Applications in Production – Twitter
Kaushik Srenevasan (Twitter)

This was an interesting discussion of how Twitter, which is a heavy Java and Scala shop, handles instrumentation. They run their own OpenJDK JVM distribution with customizations, running on CentOS. In summary, this is a somewhat dated view of instrumentation, since commercial BCI and instrumentation on Java have come so far. If you don’t want to pay for something and want to build your own, this is somewhat interesting, but it has very limited capabilities compared with what modern APM can do today. Here are the other notes:

  • Java, Scala most popular
  • Some C++
  • Some Ruby (Kiji), Python

They bundle their own JVMTI agents in the code.
https://www.youtube.com/watch?v=szvHghWyuoQ

Why?

  • Low latency garbage collection on dedicated hardware and Mesos
  • Services are getting larger
  • Scala optimizations – functional programming language
  • Tools : Contrail, Twitter diagnostics runtime

Observability:

Diagnostics:

  • Wanted something like DTrace, but they don’t have it on Linux
  • Using perf for the Linux profiling

Please leave comments here or on Twitter @jkowall, thanks!

Category: APM DevOps IT Operations Monitoring Trade Show