The Secrets to Successful Data Mining

If you were planning a cross-country road trip, would you spend numerous hours manually reviewing paper maps to find the best route and calculate how long the trip will take? Or would you use a GPS navigation system that can suggest the best route and accurately calculate the travel time within seconds? Moving from traditional IT monitoring to machine learning-based anomaly detection technology is similar to moving from paper maps to GPS.

Traditional IT monitoring requires you to write extensive rules and thresholds for about 1% of your meaningful data and then manually mine the rest when you have a problem to troubleshoot. It is an incredibly labor-intensive process, particularly as IT environments get larger and more complex. What if you could turn this around and have machine learning anomaly detection software monitor all your relevant data so you can reduce the time you spend troubleshooting by 60-70%? Would that be of interest?

Read More

Big Data Analytics to the Rescue: Why Security Teams Need Machine Learning

As originally published by Help Net Security.

In the battle against cyber criminals, the good guys have suffered some heavy losses. We’ve all heard the horror stories about major retailers losing tens of millions of credit card records or criminal organizations accumulating billions of passwords. As consumers, we can look at a handful of friends at a cocktail party and assume that most, if not all, of them have already been affected.

So how can an IT security organization ensure they are not the next target (excuse the pun)?

It turns out there are common characteristics of successful attacks. However, the evidence of intrusion is often hidden in the noise of IDS/IPS alerts; security teams have no visibility into the telltale signs of much of the discovery and capture activity; and exfiltration is cleverly designed to operate below alert thresholds, its traces hidden in huge volumes of data.

Read More

Data Mining: Don't Settle for Monitoring 1% of Your IT Operations Data

Automation vs. data mining

Do you have the whole automation vs. data mining thing backwards? Traditional IT monitoring approaches automatically analyze less than 1% of the available data, looking for ‘known bad’ behaviors. When a problem is found, an alert is raised that tells us what happened. Troubleshooting teams then have to manually ‘mine’ the other 99% of the data to find out why there was an alarm in the first place.

No wonder recent surveys on the state of IT operations verify that two of the biggest concerns are "time spent troubleshooting" and "problems reported by users before IT knows about them."

With corporate data growing rapidly, many companies are starting to look for solutions that enable early detection of anomalies, as well as faster troubleshooting, so they can investigate emerging issues before they become critical.

Read More

Ensure Compliance With IT Operations Analytics

As originally published by The ITOA Landscape.

Spam is a four-letter word in the marketing industry. The fine line between keeping consumers updated and engaged – and spamming them with unwanted emails – is one that every digital marketer has walked at one time or another.

The availability of new social media channels only compounds the problem. To give marketers the power of a coordinated and consistent message, several digital marketing companies have sprouted up, combining their ability to market successfully with a platform that lets businesses reach many different customers and prospects, across several different channels, all with a single click. Great technology for the companies using it – but unfortunately also very attractive to spammers.

Read More

Implementing StatsReduce in Anomaly Detective

One of the major additions to version 3.3 of Prelert Anomaly Detective® for Splunk was a feature called StatsReduce.  This feature enables Anomaly Detective to take advantage of Splunk’s distributed processing to analyse immense volumes of data quickly enough to deliver real-time insights.

Due to the way Anomaly Detective is designed, adding this feature wasn’t particularly difficult.  Tom’s last post explained how our anomaly detection algorithms are designed to work with aggregated data to solve the problem of data inertia.  One of the dimensions we aggregate over is time: we divide time into periods called buckets, calculate summary statistics of the data for each time bucket and do our analysis based on these statistics.
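
To make that concrete, here’s a minimal sketch of time bucketing with summary statistics.  It isn’t Prelert’s implementation – the statistics kept, the bucket span and every name below are illustrative assumptions – but it shows the basic shape: raw points collapse into per-bucket counts, means and extremes, and analysis then runs on those summaries rather than on the raw data.

    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <limits>
    #include <map>
    #include <utility>
    #include <vector>

    // Summary statistics for one time bucket: enough to analyse
    // without keeping every raw data point.
    struct BucketStats {
        std::uint64_t count = 0;
        double sum = 0.0;
        double min = std::numeric_limits<double>::max();
        double max = std::numeric_limits<double>::lowest();

        void add(double value) {
            ++count;
            sum += value;
            min = std::min(min, value);
            max = std::max(max, value);
        }

        double mean() const { return count > 0 ? sum / count : 0.0; }
    };

    // Assign each (timestamp, value) point to a fixed-width time bucket
    // and accumulate per-bucket statistics.
    std::map<std::int64_t, BucketStats> bucketise(
        const std::vector<std::pair<std::int64_t, double>>& points,
        std::int64_t bucketSpanSecs) {
        std::map<std::int64_t, BucketStats> buckets;
        for (const auto& point : points) {
            // Assumes non-negative epoch timestamps.
            std::int64_t bucketStart =
                (point.first / bucketSpanSecs) * bucketSpanSecs;
            buckets[bucketStart].add(point.second);
        }
        return buckets;
    }

    int main() {
        // Synthetic response times (ms) with one obvious outlier.
        std::vector<std::pair<std::int64_t, double>> points = {
            {1000, 12.0}, {1010, 14.5}, {1290, 13.0},
            {1300, 11.0}, {1400, 250.0}, {1550, 12.5}};

        for (const auto& entry : bucketise(points, 300)) {
            std::cout << "bucket " << entry.first
                      << ": count=" << entry.second.count
                      << " mean=" << entry.second.mean()
                      << " max=" << entry.second.max << '\n';
        }
    }

Note how the 250ms outlier dominates the max of its bucket: the summaries are compact, yet they still preserve the signal an anomaly detector needs.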

Read More

Anomaly Detection on Large Data Sets via Aggregation

In our last release we announced the “StatsReduce” feature. The functionality was a key ingredient in our original conception of how we’d get our anomaly detection algorithms to work on very large data sets. This post discusses the rationale behind our use of aggregation, which underpins this feature.

Read More

IoT Won’t Work Without Artificial Intelligence

As originally published by Wired.

As the Internet of Things (IoT) continues its run as one of the most popular technology buzzwords of the year, the discussion has turned from what it is, to how to drive value from it, to the tactical: how to make it work.

IoT will produce a treasure trove of big data – data that can help cities predict accidents and crimes, give doctors real-time insight into information from pacemakers or biochips, enable optimized productivity across industries through predictive maintenance on equipment and machinery, create truly smart homes with connected appliances and provide critical communication between self-driving cars. The possibilities that IoT brings to the table are endless.

Read More

How to Find Anomalies in Splunk's Internal Performance

We have had many discussions on this blog about different use cases for the Prelert Anomaly Detective App for Splunk – in IT Operations, Performance Management, and IT Security. But one area of applicability that shouldn’t be overlooked is using Anomaly Detective to find performance or operability problems in Splunk itself.

Read More

C++11 mutex implementations

C++11 brought concurrency to standard C++ for the first time.  Prior to this, the only choices for writing multi-threaded C++ programs were to use a separate C++ library, such as Boost Thread or Intel Threading Building Blocks, or to roll your own wrappers around the low-level operating system facilities, such as POSIX threads or Windows threads.

Whilst it’s great to have standard headers like <thread>, <mutex> and <atomic> available, the brutal truth is that under the covers all the implementations of the standard C++ classes use those same low-level operating system facilities.

Last year I looked into the performance of different types of locks on different platforms.  The variation in performance is surprisingly wide.  Prelert’s codebase pre-dates C++11, so we have our own wrappers around the low-level operating system facilities.
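
As a rough illustration of that kind of comparison – a toy harness, not the actual benchmark from that investigation – the sketch below times uncontended lock/unlock cycles for std::mutex against a naive spinlock built on std::atomic_flag.  All names here are made up, and a single-threaded loop like this measures only raw lock/unlock overhead, not behaviour under contention.

    #include <atomic>
    #include <chrono>
    #include <iostream>
    #include <mutex>

    // A deliberately naive spinlock built on std::atomic_flag, for
    // comparison against std::mutex.
    class SpinLock {
    public:
        void lock() {
            while (flag_.test_and_set(std::memory_order_acquire)) {
                // Busy-wait; a production spinlock would back off or yield.
            }
        }
        void unlock() { flag_.clear(std::memory_order_release); }

    private:
        std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
    };

    // Time N uncontended lock/unlock cycles for any lockable type.
    template <typename Lock>
    long long timeLockUnlock(Lock& lock, int iterations) {
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < iterations; ++i) {
            std::lock_guard<Lock> guard(lock);
        }
        auto end = std::chrono::steady_clock::now();
        return std::chrono::duration_cast<std::chrono::microseconds>(
                   end - start).count();
    }

    int main() {
        const int iterations = 1000000;

        std::mutex mutex;
        SpinLock spin;

        std::cout << "std::mutex: "
                  << timeLockUnlock(mutex, iterations) << " us\n";
        std::cout << "spinlock:   "
                  << timeLockUnlock(spin, iterations) << " us\n";
    }

Because std::lock_guard only requires lock() and unlock(), the same harness works for any lock type you care to plug in.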

Read More

Machine Data is Different

A couple of years ago one of Prelert's taglines was “Machine Learning for Machine Data”.  Whilst the marketing drive has moved on, this phrase is still very relevant to what Prelert's products do.  But why is machine data different?

To answer this question, let’s start by considering different perspectives on what constitutes unstructured data.

At the end of February Rich wrote a blog about anomaly detection in unstructured data.  Prelert Anomaly Detective® for Splunk has the ability to categorise log messages and then detect anomalies in the rates of the different message categories.
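
As a toy illustration of that first step – not Prelert’s categorisation algorithm, just a sketch of the idea – the snippet below collapses similar log messages into categories by masking their variable parts (here, simply runs of digits), then counts occurrences per category.  It’s the rates of those per-category counts that can then be modelled for anomalies.

    #include <iostream>
    #include <map>
    #include <regex>
    #include <string>
    #include <vector>

    // Reduce a raw log message to a category key by masking the parts
    // that vary between otherwise similar messages. Real categorisation
    // is cleverer; masking runs of digits is enough for a demonstration.
    std::string categorise(const std::string& message) {
        static const std::regex digits("[0-9]+");
        return std::regex_replace(message, digits, "<num>");
    }

    int main() {
        std::vector<std::string> messages = {
            "Connection from 10.1.2.3 port 51234 accepted",
            "Connection from 10.9.8.7 port 49152 accepted",
            "Disk usage at 91 percent on /var"};

        // The two connection messages fall into one category (count 2),
        // the disk message into another (count 1).
        std::map<std::string, int> counts;
        for (const auto& message : messages) {
            ++counts[categorise(message)];
        }

        for (const auto& entry : counts) {
            std::cout << entry.second << "  " << entry.first << '\n';
        }
    }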

Read More
