How to Detect (and Resolve) IT Ops/APM Issues Before Your Users Do

As originally published by APMdigest.

Among the most embarrassing situations for application support teams is first hearing about a critical performance issue from their users. With technology getting increasingly complex and IT environments changing almost overnight, the reality is that even the most experienced support teams are bound to miss a major problem with a critical application or service. One of the contributing factors is their continued reliance on traditional monitoring approaches.

Traditional tools limit us to monitoring for a combination of key performance indicator thresholds and failure modes that have already been experienced. So when it comes to finding new problems, the best case is alerts that describe the symptom (slow response time, transaction fails, etc.). A very experienced IT professional will have seen many behaviors, and consequently can employ monitoring based on best practices and past experiences. But even the most experienced IT professional will have a hard time designing rules and thresholds that can monitor for new, unknown problems without generating a number of noisy false alerts. Anomaly detection goes beyond the limits of traditional approaches because it sees and learns everything in the data provided, whether it has happened before or not.
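To make the contrast concrete, here is a minimal, illustrative sketch (a toy, not Prelert's actual engine): a fixed threshold fires on every point above a hand-picked limit, while a baseline learned from recent history fires only on genuine deviations.

```python
import statistics

def static_alerts(series, threshold):
    """Traditional approach: flag every point above a fixed threshold."""
    return [i for i, v in enumerate(series) if v > threshold]

def anomaly_alerts(series, window=20, sigmas=3.0):
    """Learned approach: flag points that deviate sharply from a
    baseline derived from the recent past (rolling mean +/- N sigma)."""
    alerts = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mean = statistics.mean(history)
        stdev = statistics.pstdev(history) or 1e-9  # avoid divide-by-zero
        if abs(series[i] - mean) / stdev > sigmas:
            alerts.append(i)
    return alerts

# A metric whose "normal" level drifts slowly upward, then truly spikes:
series = [100 + i for i in range(50)] + [400]
print(static_alerts(series, 120))  # noisy: fires on every drifted point
print(anomaly_alerts(series))      # fires only on the genuine spike
```

The fixed threshold produces a stream of false positives as soon as normal behavior drifts past the hand-picked number; the learned baseline adapts to the drift and isolates the one abnormal point.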

Read More

Automated Anomaly Detection: A Connector for Amazon CloudWatch

Amazon CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS. You can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, and set alarms.

At the time of writing, CloudWatch is available to all AWS users, with the free tier providing basic monitoring metrics (at 5-minute frequency) and generous usage limits. You can also add up to 10 custom metrics and 10 alarms.
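For illustration, the helper below builds the kind of entry CloudWatch's PutMetricData API accepts for a custom metric (for example via boto3's `put_metric_data`); the function, metric name, and dimensions are hypothetical, just a sketch of the payload shape.

```python
from datetime import datetime, timezone

def build_metric_entry(metric_name, value, unit="Milliseconds",
                       dimensions=None):
    """Build one entry for the MetricData list accepted by
    CloudWatch's PutMetricData API, e.g.:
    cloudwatch.put_metric_data(Namespace="Custom/App",
                               MetricData=[entry])"""
    entry = {
        "MetricName": metric_name,
        "Value": value,
        "Unit": unit,
        "Timestamp": datetime.now(timezone.utc),
    }
    if dimensions:
        entry["Dimensions"] = [
            {"Name": k, "Value": v} for k, v in dimensions.items()
        ]
    return entry

entry = build_metric_entry("QueryLatency", 212.0,
                           dimensions={"InstanceId": "i-0123456789abcdef0"})
```

Custom metrics published this way sit alongside the built-in EC2 metrics and count toward the 10-custom-metric allowance of the free tier.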

In this blog I'll briefly explain why unsupervised machine learning is important for effectively managing your AWS environments. I'll then point you to the developer resources we have made available for configuring this with Amazon Elastic Compute Cloud (EC2), which include a free developer license for the Anomaly Detective® Engine & API.

The case for unsupervised machine learning in AWS

Read More

Rogue User Detection via Behavioral Analysis

Similar in concept to my last blog on data exfiltration, finding “rogue users” or “rogue systems” using behavioral analysis and automated anomaly detection takes a different approach than the traditional methods of manual data inspection, or the application of rules or signatures to identify specific behavioral violations.

Read More

Prelert Takes Home a Silver Stevie Award

Last Friday marked the twelfth annual American Business Awards and Prelert was honored with a Silver Stevie Award in the New Product or Service of the Year - Software - Big Data Solution category. The announcement was made at the organization’s first ever New Product & Tech Awards banquet at the Palace Hotel in (where else but the tech mecca) San Francisco.

The Stevies are the nation’s premier business awards program and any organization operating in the U.S. is eligible – big, small, for profit, non-profit, you name it. This year, more than 3,300 nominations were submitted, representing organizations of all sizes and in virtually every industry to be considered in a wide range of categories. Winners were selected by more than 240 executives nationwide who participated in the judging process.

Read More

It's Time to Democratize Data Science!

The biggest trend we’ve seen in the analytics industry is both the increase in understanding of data’s value and the desire of executives – who aren’t data scientists – to gain insights from it. This leads us to believe that it’s truly time to democratize data science.

In the early 1990s, HTML and the HTTP protocol were invented, the government opened the internet to private enterprise, and the first internet mail and shopping experiences came online. In that transition from government-funded research network to 'open to the public,' the internet was democratized.

It took the intentional development of tools like NCSA's Mosaic web browser (and its commercial successor, Netscape Navigator), Intel's Pentium processor and Sun Microsystems' Java to package that technology for widespread consumption. But that started the ball rolling, and today we can't imagine a world without the internet.

Read More

Why What You Don't Know May Hurt You, & How Security Analytics Can Help

As originally published by Infosec Island.

Managing security in today’s enterprise is far different than it was ten to fifteen years ago. In the past, companies were able to set up proxy agents, firewalls and strong virus protection software and feel pretty secure that their company’s information was safe.

However, in today’s world, things have changed. We are no longer dealing with teenage hackers or disgruntled young adults with a political or social ax to grind. The real threat to your security comes from advanced cybercriminal organizations. They are well versed in your typical defenses and spend all their time figuring out ways to bypass them. These are professionals with the skills, knowledge, talent, creativity and motivation to succeed.

If you consider your organization to be a likely target, then it’s a safe bet that your defenses have already been infiltrated – and that it’s only a matter of time until the real theft begins. This means your organization needs to immediately focus on detecting nefarious activities inside of your perimeter.

Read More

How Security Analytics Help Identify and Manage Breaches

As originally published by Help Net Security.

In this interview, Steve Dodson, CTO at Prelert, illustrates the importance of security analytics in today's complex security architectures, talks about the most significant challenges involved in getting usable information from massive data sets, and much more.

How important are security analytics in today's complex security architectures? What are the benefits?

It has become a near 'mission impossible' to totally prevent breaches because of the increasingly large and complex environments security professionals are tasked with protecting. Many organizations now assume they have already been successfully breached by advanced persistent attacks, and in this difficult state of affairs, security analytics are extremely important to help us learn everything we can about our environments and the threats they face.

Read More

Occupy Your Data. Anomaly Detection Stops the Top 1% from Ruling IT.

How much of your data do you actually pay attention to?  Would you be surprised to realize it is probably far less than 1%?  How about 1% of 1%?

This is the case in the vast majority of IT operations, performance management and security shops of any size anywhere in the world. Do you agree? If not, consider the data that actually describes the performance of any app or service your organization provides. A typical web app involves hundreds if not thousands of components including software, networks, middleware, app servers, databases, etc. Out of all the thousands of things we could look at to understand performance, we typically look at a handful to a few dozen key performance indicators (KPIs). That math works out to around 1% or less.
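The back-of-envelope math is easy to check; the figures below are illustrative, not measurements from any particular environment.

```python
# Illustrative: a web app whose components expose ~3,000 measurable
# metrics, of which a typical team actively watches ~30 KPIs.
total_metrics = 3000
watched_kpis = 30
fraction_watched = watched_kpis / total_metrics
print(f"{fraction_watched:.1%} of available data is actually watched")
```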

Read More

Data Exfiltration Detection via Behavioral Analysis

There are many possible ways that one can detect “data exfiltration” (data theft), but in many cases, this involves either manual raw data inspection or the application of rules or signatures for specific behavioral violations. An alternative approach is to detect data exfiltration using automated behavioral anomaly detection using data that you’re probably already collecting and storing, and without the use of a DLP-specific security tool.

The key thing to note when using behavioral anomaly detection to expose data exfiltration is that you will be using an approach of looking for deviations of behaviors amongst users or systems - that is, you’re assuming that users or systems that exfiltrate data act differently than the typical user or system. This would either be a deviation with respect to that item’s own history (temporal deviation) or with respect to others within a population (peer analysis).
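As a toy sketch of the peer-analysis idea (illustrative only, not a DLP product or Prelert's implementation), the function below flags users whose outbound byte volume deviates strongly from the rest of the population, using the robust median/MAD in place of mean/standard deviation:

```python
import statistics

def peer_outliers(bytes_out_by_user, sigmas=3.0):
    """Peer analysis: flag users whose outbound volume deviates
    strongly from the population, using median / MAD for robustness."""
    values = list(bytes_out_by_user.values())
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values) or 1.0
    # 1.4826 scales MAD to be comparable to a standard deviation
    return [user for user, v in bytes_out_by_user.items()
            if abs(v - median) / (1.4826 * mad) > sigmas]

# Daily outbound bytes per user; one account moves ~100x the norm:
traffic = {"alice": 52_000, "bob": 48_000, "carol": 61_000,
           "dave": 55_000, "mallory": 4_800_000}
print(peer_outliers(traffic))  # only the account unlike its peers
```

A temporal-deviation version would instead compare each user's volume today against that same user's own history, the other axis of deviation described above.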

Read More

The Secret to Fixing Problems Before Users Find Them (part 2)

In part 1 of this post, we talked about the failed paradigm of using thresholds and rules or 'eyeballs on timecharts' to monitor a critical app or service.

Thresholds are notorious for generating 'noise.' It is tremendously difficult to create a sufficiently accurate combination of thresholds and rules to monitor anything but the most egregious indicators of system failure. Some KPIs (key performance indicators), like response time for a standard query, may seem straightforward. One might suspect that this should never be more than, say, 1,000 milliseconds. But you can be pretty much guaranteed that the actual response time will vary widely depending on physical distances, other server workloads and network congestion. As a result, such alert conditions would generate a large percentage of false positives. Given the difficulty of defining accurate alert conditions for any KPI, the number chosen to be monitored this way is often exceedingly low.
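A small, hypothetical illustration of that false-positive problem: against a workload whose healthy response times routinely sit above 1,000 ms, a hand-picked limit flags everything, while a limit derived from the workload's own observed behavior isolates the one real outlier.

```python
import statistics

def fixed_threshold_alerts(latencies, limit_ms=1000):
    """The classic rule: alert whenever response time exceeds a
    hand-picked limit, regardless of context."""
    return [i for i, v in enumerate(latencies) if v > limit_ms]

def learned_threshold(history, sigmas=3.0):
    """Derive a limit from observed behavior instead of guesswork:
    mean plus N standard deviations over a training window."""
    return statistics.mean(history) + sigmas * statistics.pstdev(history)

# Nightly batch window: ~1,200 ms responses are routine and healthy.
night_history = [1150, 1230, 1180, 1210, 1190, 1220, 1175, 1205]
tonight = [1195, 1215, 1188, 2600]  # only the last point is abnormal

print(fixed_threshold_alerts(tonight))  # every point "violates" 1,000 ms
limit = learned_threshold(night_history)
print([i for i, v in enumerate(tonight) if v > limit])  # the real outlier
```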

Read More
